LinkedHashMap memory leak #3260
@eduveks thanks for reporting the issue.
@eduveks thanks for the report! Are you reloading classes, e.g. using a custom host class loader or some application server? It looks to me as if our host member/method lookup cache is being missed a lot. If this is true, then the current behavior might not be ideal, but it is expected: we currently reference bound host methods strongly. A workaround is to recreate the Engine (or Context, if you don't use an explicit Engine) to refresh that cache. Note that this also refreshes all the compiled code. If this turns out to be the case, then I could have another look at improving this, if you provide a small example of your usage pattern. If you are not reloading classes in your host application, then it would be immensely helpful if you could provide a sample application that reproduces this leak reliably or, less preferably, a heap dump (hprof) file of the leak. Thanks!
Hello @chumer, I did another round of stress tests, with screenshots below, and here is the hprof file made a few minutes ago. Yes, it is an application server: the Netuno platform, with a Jetty server inside, running on GraalVM. You can check out Netuno; the install is straightforward, you only need to execute one command in the terminal. The installation will download and install GraalVM too, inside the netuno folder. Then here is an example of how to create services. Node.js inside GraalVM executes the JavaScript server-side (the services), with script code like … Each request executes multiple scripts (JavaScript) over the same class that initializes the HostAccess, the Context builder, and then the Context, but the same request may create more than one Context. All initialized Contexts are always closed at the end of the request with … Multiple threads do not use the same … If you need more details, just say. Thank you!
I downloaded the heap dump and computed the retained sizes, which look like this: So it looks as if all of the references can actually be freed.
Two remarks here:
[1] https://www.graalvm.org/reference-manual/embed-languages/#code-caching-across-multiple-contexts
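The code-caching pattern referenced in [1] can be sketched as follows. This is a minimal illustration, not code from the thread; it assumes the GraalVM SDK (`org.graalvm.polyglot`) and the JavaScript language are available, and all class and method names are mine:

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Engine;
import org.graalvm.polyglot.Source;

public class CodeCachingSketch {
    // Evaluates the same Source in several short-lived Contexts that all
    // share one Engine. Parsed/compiled code is cached on the Engine, so
    // it is reused instead of being recompiled per Context.
    public static int evalInFreshContexts(int times) {
        try (Engine engine = Engine.create()) {
            Source source = Source.create("js", "21 + 21");
            int last = 0;
            for (int i = 0; i < times; i++) {
                // Per-request Context; cheap because the code cache
                // lives in the shared Engine.
                try (Context context = Context.newBuilder("js")
                        .engine(engine)
                        .build()) {
                    last = context.eval(source).asInt();
                }
            }
            return last;
        }
    }

    public static void main(String[] args) {
        System.out.println(evalInFreshContexts(3));
    }
}
```

Without the shared Engine, each `Context` gets its own implicit Engine and nothing is cached across them, which matches the repeated-parsing symptoms described above.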
I think I already tested the implementation with code caching across multiple contexts in the past. But yesterday I ran some tests, changing the implementation to support both the same Engine across all threads and one Engine per thread. The first impression in both cases was that memory was consumed quickly and performance was very slow: scripts that previously executed in 300 milliseconds jumped to 20 seconds, with high CPU usage too. I will try to run more tests within the next few days to give more details. Yes, periodically we force the garbage collector with:
Sample of the first results with the Engine:
Any chance you can give me something runnable that reproduces this problem?
Yes, I think so. I don't have time today, but I will give you more news in the next few days.
I think I know what is going on here. I was investigating why there are so many instances of …
Small heap dump for investigation: GH-3260_heapdump.zip.
Steps to reproduce:
I have not run your code yet, but I think I found a problematic pattern in our code:
Switching that cyclic reference flag to false fixes the problem (tested on the latest JDK 8 on Mac). But this really should not happen, as such cyclic references should be resolvable by the GC. I will double-check and maybe file a JDK bug. We probably need to deploy a workaround for this. I need to do some more digging to find out what is going on.
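The claim that cyclic references should be resolvable by the GC is easy to sanity-check with plain JDK classes, no GraalVM needed. A self-contained demo (all names are mine; `System.gc()` is only a hint, so the loop retries a few times):

```java
import java.lang.ref.WeakReference;

public class CycleGcDemo {
    // Two nodes that point at each other: a reference cycle.
    static final class Node {
        Node other;
    }

    // Returns true if the cyclic pair became unreachable and was
    // collected once the external references were dropped.
    public static boolean cycleIsCollectable() throws InterruptedException {
        Node a = new Node();
        Node b = new Node();
        a.other = b;
        b.other = a;
        WeakReference<Node> witness = new WeakReference<>(a);
        // Drop the only external references; the cycle alone remains.
        a = null;
        b = null;
        // A tracing GC collects cycles (unlike pure reference counting);
        // hint the collector a few times and wait briefly.
        for (int i = 0; i < 10 && witness.get() != null; i++) {
            System.gc();
            Thread.sleep(10);
        }
        return witness.get() == null;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(cycleIsCollectable());
    }
}
```

So a leak of cyclically-linked `LinkedHashMap` entries points at something still strongly reachable from a GC root, not at the cycle itself.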
Turns out this is a known issue:
It looks like every page reload in the web app adds several instances of …
I could now reproduce it locally with a simple example.
The fix is in the pipeline. Will backport this to 21.1 and probably also 20.3 (need to double-check).
Thank you, @thurka, for your input, and I'm happy you could find the issue with Netuno. @chumer, the workaround of reusing the … To complement @thurka's tip on running it locally: if you install and start Netuno just with …
The first one, … This test with ApacheBench and the demo application is very different from all the tests I reported before, but I think it comes down to the same internal issue. My results with ApacheBench, executed over 5 times:
@eduveks This is really weird; in my local tests, using the same HostAccess instance for all contexts resolved the problem, as the cache is bound to and reused across HostAccess instances. So it should be impossible to have the same method loaded twice. Are you sure you are not just building an equal HostAccess instance? To make sure, can you read the HostAccess instance from a static final field? Maybe you can drop a snippet of how you are doing that at the moment, so I can plug it into my test?
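The "static final" suggestion can be sketched like this. This is a minimal illustration of the pattern, assuming the GraalVM SDK on the classpath; class and method names are mine, and the access configuration is only an example:

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.HostAccess;

public class SharedHostAccess {
    // One HostAccess for the whole JVM. The host method lookup cache is
    // bound to this instance, so every Context built with it shares the
    // cache. An equal-but-distinct HostAccess built per request would
    // get a fresh, empty cache each time.
    static final HostAccess HOST_ACCESS = HostAccess.newBuilder()
            .allowPublicAccess(true)
            .build();

    public static int evalWithSharedAccess() {
        try (Context context = Context.newBuilder("js")
                .allowHostAccess(HOST_ACCESS)
                .build()) {
            return context.eval("js", "6 * 7").asInt();
        }
    }

    public static void main(String[] args) {
        System.out.println(evalWithSharedAccess());
    }
}
```

Reading the instance from a `static final` field rules out accidentally rebuilding an equal-but-not-identical HostAccess per request, which is exactly the failure mode suspected here.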
@chumer I did another test to make sure, and the behavior is the same with … This is the current full code of the integration with GraalVM:
Any reason why you are doing:
With enter/leave? Any context operation should do an implicit enter/leave. In this case, doing the enter and leave explicitly is actually slower. Missing a leave on a particular thread can also potentially cause another leak. Is there any multi-threading happening for a GraalRunner instance?
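The contrast being described can be sketched as follows (my names; assumes the GraalVM SDK and the JavaScript language). Both methods produce the same result, but the first one pays for an extra explicit enter/leave around a single operation and risks a leak if `leave()` is ever skipped:

```java
import org.graalvm.polyglot.Context;

public class EnterLeaveSketch {
    // Explicit enter/leave around a single eval: the eval itself still
    // does its own implicit enter/leave, so this only adds overhead.
    public static int explicitEnterLeave() {
        try (Context context = Context.create("js")) {
            context.enter();
            try {
                return context.eval("js", "1 + 1").asInt();
            } finally {
                // Must always be paired with enter() on the same thread,
                // or the thread keeps the context entered (a leak risk).
                context.leave();
            }
        }
    }

    // Relying on the implicit enter/leave that every context operation
    // performs: simpler and, per the comment above, not slower here.
    public static int implicitEnterLeave() {
        try (Context context = Context.create("js")) {
            return context.eval("js", "1 + 1").asInt();
        }
    }

    public static void main(String[] args) {
        System.out.println(explicitEnterLeave() + implicitEnterLeave());
    }
}
```

Explicit enter/leave only pays off when many operations run between one enter and one leave, which is not the pattern in question here.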
@chumer, my apologies, I detected a Maven malfunction that did not update my code version in all of my tests. 🤦♂️ Finally, it is correct. Your workaround of sharing the HostAccess works very well! Sorry for my mistake. This is the result of many stress tests, my hardest test, rolling smoothly: 🥳 Thank you very much! 🍻
To clarify, I commented out all … Yes, just one instance of the … And again, thank you for your support.
With this, I remembered to run another test sharing the same Engine across all threads, as @chumer said. And the performance jumped to another level: much less memory and CPU consumption overall; even under a lot of stress, it is as if nothing is going on. So the key is to share the Engine and the HostAccess across all threads. 🚀 Finally, the killer result:
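Putting the two shared pieces together, the pattern the thread converges on might look like this. A sketch under the same assumptions as before (GraalVM SDK, JS language); thread-pool size, request count, and all names are illustrative only:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Engine;
import org.graalvm.polyglot.HostAccess;

public class SharedEngineAndHostAccess {
    // Shared across every thread: one Engine (compiled-code cache) and
    // one HostAccess (host method lookup cache).
    static final Engine ENGINE = Engine.create();
    static final HostAccess HOST_ACCESS = HostAccess.newBuilder()
            .allowPublicAccess(true)
            .build();

    // Per-request Context: cheap to create and close, because the heavy
    // caches live in the shared Engine and HostAccess.
    public static int handleRequest() {
        try (Context context = Context.newBuilder("js")
                .engine(ENGINE)
                .allowHostAccess(HOST_ACCESS)
                .build()) {
            return context.eval("js", "2 + 2").asInt();
        }
    }

    // Simulates concurrent requests, each thread using its own Context.
    public static int runConcurrently(int requests) throws InterruptedException {
        AtomicInteger sum = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < requests; i++) {
            pool.submit(() -> sum.addAndGet(handleRequest()));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return sum.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runConcurrently(8));
    }
}
```

Note that each Context stays confined to the thread that created it; only the Engine and HostAccess cross thread boundaries, which is what keeps this safe.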
Happy to hear it all works out now as intended! :-)
Hello,
I'm facing a memory leak of many LinkedHashMap instances with GraalVM.
This behaviour is not new: in the years I have been using GraalVM, I have always seen memory issues, but now it is more critical because we are using it in situations with much more stress overall.
Anyway, GraalVM is awesome, thank you a lot for all your work. 👏
I hope this report can help to improve it.
Describe GraalVM and your environment:
java -Xinternalversion