Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.Sign up
Re-raising the cause of a Java exception can result in leaked memory and IllegalStateExceptions #4699
I only reproduced this using Guava, but I believe it is not a Guava specific bug. If you'd like me to reproduce without an external dependency I'm willing to try, but it was easier to just use Gauva.
Re-raising the cause of a Java exception should work without error or leaked memory.
This script reproduces part of the issue. It runs successfully on 1.7.27 but fails on 22.214.171.124.
To run, download the Guava jar https://repo1.maven.org/maven2/com/google/guava/guava/22.0/guava-22.0.jar, and place it this script:
Running the above script on 126.96.36.199 results in all threads except for one dying from IllegalStateExceptions. It does not fail if
Regarding the memory leak, I have not yet been able to reproduce outside of production but running code similar to what is above resulted in several threads that each had a single 1.1GB
that contained many
I changed our prod code from
and the memory leak went away. So I believe that re-raising a cause from inside the rescue in cases like this can also result in circular dependencies that leak memory.
The memory leak occurred on Linux with JRuby 188.8.131.52:
Let me know if there is any more useful detail that I should add!
We @swiftype have 184.108.40.206 in production and that's where we've observed the leak. We've rolled back, updated the code and deployed the changed version to fix the leak. 220.127.116.11 is mentioned because it was one of the versions tested locally while trying to reproduce the issue locally in development.
@kovyrin so one theory would be we are storing the result of cause in a temporary variable in that rescue and then raising out the value of that temp. Since we are not niling out temporary variable it would leak. HOWEVER, this is unwinding the stack because we are raising so that temporary variable should no longer be in scope and freed. If I was to keep with this theory this thread basically dies here and does not unwind which would then leak...but then I think you would have lots of threads laying around. So probably not that? hmm
Ok, so there seems to be two issues here.
The first problem is caused by two conditions in Throwable.initCause:
I will fix the double-init issue. @marshalium would you be able to test a master or 9.1.13 HEAD build?
@headius Thanks for looking into this! Unfortunately, I ran the script from from my original comment on the
I dug into this a bit, and I think I discovered more of what's actually going on, but I don't know how to fix it.
I added code to RubyKernel.maybeRaiseJavaException(), right before the ex.initCause call, to log thread and object ids:
And what it logged showed that multiple threads are trying to set the cause on the exact same exception object:
The check that got added didn't fix anything since this is race condition where multiple threads can see the cause as
The reason I was seeing this with Guava CacheLoader is that when multiple threads try to load the same key in parallel, threads after the first one block on a future. Once the loading thread sets an error on the future the waiting threads see it and raise an UncheckedExecutionException whose cause is set to the error instance from the future. When the JRuby code re-raises the cause of the UncheckedExecutionException multiple JRuby threads end up raising the exact same exception instance and so they get IllegalStateExceptions.
Here's a gist with a simpler method of reproducing the issue that does not depend on Guava: https://gist.github.com/marshalium/64e9ebb4bb4ac107eee1f5a629feee18
Ok, so this is a weird behavior of initCause that interacts badly with threads. Thanks for the extra investigation.
I think the options we have are:
I am leaning toward (1) since Java typically wants you to init the cause yourself, rather than doing it automatically. I will implement (2) or (3) for now.
@kares I think you added the logic to set cause. What would you think about just dropping it? Java does not automatically set a cause for you unless you pass one into the constructor, so us setting it later based on a Ruby raise -- which might happen many times for the same throwable -- seems like the wrong behavior.
right, this is logic added in 9.1 to go along with with Ruby's (Ruby) exception.cause handling b8933ee
not sure how much its used/spec-ed (probably very little if any). seemed like a good fit to match JI the way Ruby's cause works - a passed or found in error-info gets propagated.
JRuby already has some places where the RaiseException's cause is init-ed (pbly all local - single threaded), since its often hard to tell a failure not seeing the native Java cause - that was the motivation to further help reduce the pain around JI although it makes much less sense now that I think about it.
obviously did not think too far :) ... feel free to remove for 9.2 we can always re-add later if someone complains he/she is not getting its Java cause set on a throwable
I will remove it and we can discuss further whether we want to restore this or something like it for 9.2.
My remaining thoughts on it:
Java only allows setting cause at construction time, or exactly once after that. Ruby sets it to $! for you when raising an exception if that exception has no cause and $! is not that exception.
In Java, cause is never automatically set for you.
If someone is re-raising a Java exception in Ruby, they probably don't want to attach a cause (which might be the source of leaks, if those exceptions carry too much data and stick around).
If someone is raising a new Java exception, they might want to provide a cause. We can't easily do it automatically since we would have to modify the constructor call, and we probably wouldn't want to add something that magic. Unfortunately we also can't pass the exception, because the object in $! is a RubyException, not a subclass of Throwable.
I think we should open a new feature/bug to make it possible for a user to provide a cause to Java, but more and more I think doing it automatically is a mismatch.