Memory leak or OutOfMemoryError: Physical memory usage is too high - when using ZGC #315

Closed
liufuyang opened this issue May 4, 2021 · 5 comments

Comments

@liufuyang

I have asked about this in bytedeco/javacpp#474, but I think it is worth mentioning in this repo as well, in case others are facing the same issue.

We have a Java backend service that uses TF to host a model for predictions. We just upgraded to TF 2.x, and javacpp was updated at the same time. Since then the service has been crashing with out-of-memory errors because we were using ZGC. Turning off ZGC and using the default GC policy resolves the issue for us.

This happens on macOS and on Linux (in a Docker container), with OpenJDK 11 or 16.

For a detailed description, please see the issue on javacpp.

If we are sure it is not related to javacpp but to this tf java repo then I can close down the other one and copy the issue description here. Thanks.
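For anyone applying the same workaround, it can help to confirm which collector the JVM actually picked up after changing the launch flags. The sketch below is not from the original report: it lists the garbage collector MXBeans and assumes that ZGC's bean names start with "ZGC", which is worth verifying on your own JDK build. ZGC is normally enabled with `-XX:+UseZGC` (together with `-XX:+UnlockExperimentalVMOptions` on JDK 11 to 14), so removing that flag should fall back to the default collector. The class name `GcCheck` is just an illustrative name.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcCheck {
    // Assumption: ZGC collector beans report a name beginning with "ZGC".
    static boolean isZgcName(String name) {
        return name != null && name.startsWith("ZGC");
    }

    // Names of the collectors registered in the running JVM.
    static List<String> activeCollectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    // True if any of the given collector names looks like ZGC.
    static boolean usesZgc(List<String> names) {
        return names.stream().anyMatch(GcCheck::isZgcName);
    }

    public static void main(String[] args) {
        List<String> names = activeCollectorNames();
        System.out.println("Collectors: " + names);
        System.out.println("ZGC active: " + usesZgc(names));
    }
}
```

Logging this once at service startup makes it easy to tell whether a container image or wrapper script is still passing the ZGC flag.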

@saudet
Contributor

saudet commented May 5, 2021

Like I keep telling you, if the only issue that you are facing is a memory leak, this means that you forgot to call close() somewhere. You'll need to make sure that you are calling close() everywhere that we need to.
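To illustrate the close() discipline described above, here is a minimal sketch using try-with-resources. It deliberately does not use any TF Java or JavaCPP classes: `NativeBuffer` is a hypothetical stand-in for an object that owns native memory, and the sketch assumes the real resources implement `AutoCloseable` in the same way.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a native-backed resource; not a TF Java class.
class NativeBuffer implements AutoCloseable {
    private static final AtomicInteger OPEN = new AtomicInteger();
    private boolean closed = false;

    NativeBuffer() {
        OPEN.incrementAndGet(); // pretend native memory is allocated here
    }

    static int openCount() {
        return OPEN.get();
    }

    @Override
    public void close() {
        if (!closed) {          // make close() safe to call more than once
            closed = true;
            OPEN.decrementAndGet(); // pretend native memory is released here
        }
    }
}

class CloseDemo {
    // Returns how many buffers are open while the "prediction" runs.
    static int predictOnce() {
        try (NativeBuffer input = new NativeBuffer();
             NativeBuffer output = new NativeBuffer()) {
            return NativeBuffer.openCount();
        } // both buffers are closed here, even if an exception is thrown
    }

    public static void main(String[] args) {
        System.out.println("Open during call: " + predictOnce());
        System.out.println("Open after call: " + NativeBuffer.openCount());
    }
}
```

A counter like `openCount()` is also a cheap way to spot a missing close() in a test: if it keeps growing across requests, something is not being released.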

@Craigacp
Collaborator

Craigacp commented May 5, 2021

@saudet No, this is because the RSS is miscounting the memory usage on Linux under ZGC. See my comment on the issue on JavaCPP - bytedeco/javacpp#474 (comment).

@saudet
Contributor

saudet commented May 5, 2021

@Craigacp No, as per bytedeco/javacpp#474 (comment), the same exact thing happens when "org.bytedeco.javacpp.nopointergc" is set so this is not related to JavaCPP. Like I told you many many times already, and I'll keep bringing this up until you start acknowledging that maybe there is something wrong in TF Java, the way you're using WeakReference is incorrect. Maybe that's the cause, maybe it's something else, but there is something wrong with TF Java and/or users forgetting to call close() when they should.

@Craigacp
Collaborator

Craigacp commented May 5, 2021

That doesn't account for the fact that when using G1GC the issue doesn't appear. Both this report and the other one in the gitter are specific to ZGC, and don't seem to have the issue when using G1GC. There are obviously many differences between the two GC algorithms, but one that impacts JavaCPP is that monitoring the process memory usage using RSS doesn't work properly with ZGC as the Linux kernel triple counts the JVM's memory in the RSS.
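One way to check the RSS explanation on a given machine is to compare the resident set size the kernel reports with the heap size the JVM reports. The sketch below is only an illustration under a few assumptions: it reads `VmRSS` from `/proc/self/status`, which exists on Linux only, and it does not reproduce JavaCPP's own memory check. The class name `RssCheck` and its methods are hypothetical. If the triple counting described above applies, RSS under ZGC should look far larger than the committed heap, while under G1 the two should be much closer.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.OptionalLong;

class RssCheck {
    // Extracts the VmRSS value (in kB) from the lines of /proc/<pid>/status.
    static OptionalLong parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                String[] parts = line.substring("VmRSS:".length()).trim().split("\\s+");
                return OptionalLong.of(Long.parseLong(parts[0]));
            }
        }
        return OptionalLong.empty();
    }

    // Reads the current process RSS; empty when /proc is unavailable (e.g. macOS).
    static OptionalLong currentRssKb() {
        Path status = Paths.get("/proc/self/status");
        if (!Files.isReadable(status)) {
            return OptionalLong.empty();
        }
        try {
            return parseVmRssKb(Files.readAllLines(status));
        } catch (IOException e) {
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        long heapCommittedKb = ManagementFactory.getMemoryMXBean()
                .getHeapMemoryUsage().getCommitted() / 1024;
        OptionalLong rssKb = currentRssKb();
        System.out.println("Heap committed (kB): " + heapCommittedKb);
        if (rssKb.isPresent()) {
            System.out.println("Kernel-reported RSS (kB): " + rssKb.getAsLong());
        } else {
            System.out.println("RSS not available on this platform");
        }
    }
}
```

Running the same program once with ZGC and once with G1 gives a rough side-by-side comparison of the two numbers.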

@liufuyang
Author

Thank you @Craigacp
Closing this, as it is not TF related and we have a solution and an explanation for it. See bytedeco/javacpp#474 (comment) and bytedeco/javacpp#474 (comment).
