indexing runs out of memory for large projects #1219

Closed
martinlippert opened this issue Apr 3, 2024 · 4 comments
Labels
for: eclipse, for: vscode, theme: spring index & symbols, type: enhancement

Comments

@martinlippert
Member

The indexing infrastructure is running out of memory when indexing projects with a large number of source files (as reported in #1212).

We need to improve the implementation to reduce the overall memory consumption, especially to decouple the memory consumption from the size of the project or the number of projects being parsed.

Step 1: We need to split the set of source files into well-defined smaller chunks, so that the garbage collector can free up memory while indexing is still in progress.
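
A minimal sketch of the chunking idea in Step 1, not the actual implementation: the class `SourceChunker`, its methods `split` and `indexInChunks`, and the `indexChunk` callback are hypothetical names used for illustration only. The point is that only a small per-chunk result (here a symbol count) survives each iteration, so everything else allocated while handling a chunk becomes unreachable before the next chunk starts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical helper for illustration; not part of the Spring Tools code base.
final class SourceChunker {

    // Splits the input into consecutive lists of at most chunkSize elements.
    static <T> List<List<T>> split(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    // Processes one chunk at a time and keeps only the running symbol count.
    // Whatever indexChunk allocates internally (parse trees, bindings, ...)
    // is no longer referenced once applyAsInt returns.
    static <T> int indexInChunks(List<T> files, int chunkSize,
                                 ToIntFunction<List<T>> indexChunk) {
        int totalSymbols = 0;
        for (List<T> chunk : split(files, chunkSize)) {
            totalSymbols += indexChunk.applyAsInt(chunk);
        }
        return totalSymbols;
    }
}
```

With this shape, peak memory depends on the chunk size rather than on the total number of source files, assuming the callback does not retain references to its intermediate data.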

Step 2: We need to clean up the lookup environment of the parser after each parsing attempt, in order to avoid leaking memory or keeping zip files open.
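
A rough sketch of the cleanup in Step 2, under the assumption that the lookup environment can be wrapped in something closeable. The `LookupEnv` interface below is a hypothetical stand-in and not the parser's real API; `BulkParse` and `parseChunk` are likewise made-up names. Using try-with-resources guarantees the cleanup runs even when parsing throws.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch; the real parser environment has a different API.
final class BulkParse {

    // Stand-in for the parser's lookup environment. close() is expected to
    // release cached state and close any zip/jar files that were opened.
    interface LookupEnv extends AutoCloseable {
        int parse(String sourceFile); // returns the number of symbols found

        @Override
        void close(); // narrowed: no checked exception
    }

    // Parses one chunk with a fresh environment and always cleans it up,
    // whether parsing succeeds or fails.
    static int parseChunk(List<String> files, Supplier<LookupEnv> envFactory) {
        int symbols = 0;
        try (LookupEnv env = envFactory.get()) {
            for (String file : files) {
                symbols += env.parse(file);
            }
        }
        return symbols;
    }
}
```

Creating the environment per chunk, rather than once per project, is a design assumption of this sketch; it trades some repeated setup cost for a bounded lifetime of the cached state.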

martinlippert added the type: enhancement, theme: spring index & symbols, for: eclipse, and for: vscode labels on Apr 3, 2024
martinlippert added this to the 4.22.1.RELEASE milestone on Apr 3, 2024
martinlippert self-assigned this on Apr 3, 2024
@martinlippert
Member Author

Inviting @licam to this issue in order to provide additional feedback and test early builds, once available.

martinlippert added a commit that referenced this issue Apr 3, 2024
…o smaller chunks to reduce overall memory needs
martinlippert added a commit that referenced this issue Apr 4, 2024
… after bulk parsing to close zip files and free up memory
martinlippert added a commit that referenced this issue Apr 4, 2024
…s and arrays all the time + reusing common sets instead of creating new set objects all the time
@martinlippert
Member Author

@licam The latest pre-release builds for VSCode should already contain a few early optimizations. It would be interesting to hear whether they run any better in your environment and with your large projects. You can switch to the pre-release in VSCode directly: click on the Spring Boot Tools entry in the list of extensions, and then switch to the pre-release version.

@martinlippert
Member Author

martinlippert commented Apr 9, 2024

Here are some early rough results, measuring the progress here (using my sample project):

Version 1.53.0 is able to:

  • parse projects with 6,500 source code files
  • generate 100,000 symbols

Version 1.54.0 is able to:

  • parse projects with 65,000 source code files
  • generate 1,000,000 symbols

Both measurements used the default max heap setting of 512m for the language server process.
This is a 10x improvement, so quite a good step forward here, I think.

The exact numbers will vary quite a bit, depending on the size of the individual source code files and the number of symbols generated for the concrete project, of course.

If you have larger projects than this, you have to increase the heap space for the language server.
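
To double-check which heap limit a JVM process is actually running with, for example after raising it through the standard `-Xmx` JVM option, a small sketch like the following can help. The class name `HeapInfo` is made up for illustration, and how the option gets passed to the language server depends on the client and is not shown here.

```java
// Illustrative only: reports the maximum heap the current JVM will try to use.
final class HeapInfo {

    // Runtime.maxMemory() returns bytes; convert to MiB for readability.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

The reported value can be slightly lower than the configured `-Xmx` value, depending on the garbage collector in use.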

@licam

licam commented Apr 9, 2024

@martinlippert Sounds promising. We will test and adopt the new version once it is released. Thank you!
