Circular exception handling can cause infinite loop since 9.3.4.0 #7267
Comments
I think the latter would be the best fix, perhaps using existing logic for recursion detection. I will look into this.
I have a patch to use a Set, but no reproduction. Can you provide a reproduction or test this patch out?
diff --git a/core/src/main/java/org/jruby/RubyException.java b/core/src/main/java/org/jruby/RubyException.java
index 0c48097acd..8148adeff2 100644
--- a/core/src/main/java/org/jruby/RubyException.java
+++ b/core/src/main/java/org/jruby/RubyException.java
@@ -55,7 +55,9 @@ import org.jruby.runtime.marshal.UnmarshalStream;
 import java.io.IOException;
 import java.io.PrintStream;
+import java.util.HashSet;
 import java.util.List;
+import java.util.Set;
 import static org.jruby.runtime.Visibility.PRIVATE;
 import static org.jruby.util.RubyStringBuilder.str;
@@ -390,12 +392,15 @@ public class RubyException extends RubyObject {
     private void checkCircularCause(IRubyObject cause) {
         IRubyObject currentCause = cause;
+        Set causeSet = new HashSet();
+        causeSet.add(this);
         while (currentCause instanceof RubyException) {
-            if (currentCause == this) {
+            if (causeSet.contains(currentCause)) {
                 RaiseException runtimeError = getRuntime().newRuntimeError("circular causes");
                 runtimeError.getException().setCause(cause);
                 throw runtimeError;
             }
+            causeSet.add(currentCause);
             currentCause = ((RubyException) currentCause).cause;
         }
     }
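To make the difference between the two checks concrete, here is a minimal plain-Ruby sketch of the idea behind the patch. It is not JRuby's code: `ErrorNode`, `reaches_head?`, `circular_cause?` and the `max_steps` bound are hypothetical names invented for this illustration, and the old-style check is capped only so that the demo itself cannot spin forever.

```ruby
require "set"

# Hypothetical stand-in for an exception; only the cause link matters here.
class ErrorNode
  attr_reader :name
  attr_accessor :cause

  def initialize(name)
    @name = name
  end
end

# Old-style check: only notices a cycle that leads back to `head`.
# Capped at max_steps (an assumption of this demo) so it always returns.
def reaches_head?(head, cause, max_steps: 1_000)
  current = cause
  max_steps.times do
    return false if current.nil?
    return true if current.equal?(head)
    current = current.cause
  end
  :gave_up
end

# Patched-style check: remembers every node visited, so any cycle is caught.
def circular_cause?(head, cause)
  seen = Set.new.compare_by_identity
  seen.add(head)
  current = cause
  while current
    return true if seen.include?(current)
    seen.add(current)
    current = current.cause
  end
  false
end

head = ErrorNode.new("new error")
a = ErrorNode.new("a")
b = ErrorNode.new("b")
c = ErrorNode.new("c")
a.cause = b
b.cause = c
c.cause = b # cycle b -> c -> b that never passes through head

p reaches_head?(head, a)   # prints :gave_up
p circular_cause?(head, a) # prints true
```

The chain above is the case suspected in the report: a loop among the causes that does not include the exception being raised. The identity-only comparison never terminates on its own there, while the set-based walk stops on the second visit to `b`.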
@headius I tried to reproduce these circular cause references but couldn't. But then I started to investigate the original stack trace and noticed that checkCircularCause appeared only in the Finalizer thread stack trace. That made me think there is probably no endless loop at all; the chain of linked causes just gets deeper all the time. Here is a test script for this hypothesis:

obj = nil
10.times do |i|
  obj = Object.new
  ObjectSpace.define_finalizer(obj, Proc.new do
    STDOUT.puts ">>> finalizing #{i}"
    e = $!
    while e
      STDOUT.puts e.inspect
      e = e.cause
    end
    STDOUT.puts "<<< finalized #{i}"
    raise "error #{i}"
  end)
  obj = nil
  sleep 0.1
  java.lang.System.gc
  sleep 0.1
end

which produced the output
As you see, if the object finalizer raises an error then the next finalizer in the same Finalizer thread inherits the current error. So if there are badly defined object finalizers (as was the case in appsignal/appsignal-ruby#854, which is now fixed) and they are called many times, then the Finalizer thread will spend more and more time in the … It would be better to clear … I tested this on MRI 2.6 and 3.1 and checked that … As it seems that there was no infinite loop in …
Great discovery about the finalizer! It makes sense that the finalizer thread would keep accumulating causes, since nothing occurs to clear them. I will investigate CRuby a bit to see how they handle exceptions raised from a finalizer; I suspect they avoid setting …
Yup, good sleuthing... CRuby captures errinfo before running finalizers and restores it afterward. CRuby's finalizers have traditionally run on the same thread the GC runs on, which is often the thread that happens to be running at the time (so they want to restore any in-flight error before returning control to user code). With the JDK's separate finalizer thread, it may make more sense for us to just clear it after each invocation, since capturing and restoring it would have the same effect.
The relevant code in CRuby: https://github.com/ruby/ruby/blob/0c36ba53192c5a0d245c9b626e4346a32d7d144e/gc.c#L4197-L4233
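As a rough illustration of why clearing after each invocation is enough, the sketch below models the finalizer thread's pending-error slot in plain Ruby. It does not touch the real GC or JRuby internals: `FinalizerRunner`, `BAD_FINALIZER` and the `clear_after_each` flag are hypothetical names, and the `@errinfo` instance variable only stands in for the thread's `$!`.

```ruby
# Hypothetical model of a finalizer thread; @errinfo stands in for $!.
class FinalizerRunner
  attr_reader :depths

  def initialize(clear_after_each:)
    @clear_after_each = clear_after_each
    @errinfo = nil
    @depths = []
  end

  # Runs one finalizer and records how long the cause chain of its error is.
  def run(finalizer)
    finalizer.call(@errinfo)
  rescue StandardError => e
    @errinfo = e # the error stays pending for the next finalizer
    @depths << chain_depth(e)
  ensure
    @errinfo = nil if @clear_after_each
  end

  private

  def chain_depth(error)
    depth = 0
    while error
      depth += 1
      error = error.cause
    end
    depth
  end
end

# A badly behaved finalizer: it always raises, and whatever error is still
# pending becomes the cause of the new one.
BAD_FINALIZER = ->(pending) { raise RuntimeError, "finalizer failed", cause: pending }

leaky = FinalizerRunner.new(clear_after_each: false)
5.times { leaky.run(BAD_FINALIZER) }
p leaky.depths    # prints [1, 2, 3, 4, 5]

clearing = FinalizerRunner.new(clear_after_each: true)
5.times { clearing.run(BAD_FINALIZER) }
p clearing.depths # prints [1, 1, 1, 1, 1]
```

Without the reset, every failing finalizer adds one link, so any walk over the cause chain gets slower on each call even though no cycle exists. With the reset, each finalizer starts from a clean slate, which on a dedicated finalizer thread has the same effect as capturing and restoring the previous value.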
In jruby#7267 we had a report of endless exception cause processing that turned out to be triggered by a bad finalizer (that allowed an exception to bubble out) stacking up causes from previous calls of that finalizer. The fix here mimics what CRuby does: where they reset the errinfo to what it was prior to the finalizer running (because CRuby's GC often/usually runs on the current user thread), we simply clear it after each finalizer has run (because the JDK runs finalizers on a separate thread, as will our future non-JVM-finalizer version of this logic). No spec is provided yet due to the difficulty of testing GC-triggered events across VMs. See ruby/spec#935 for more details. Fixes jruby#7267
Related to #7316? Relevant CRuby code: ruby/ruby@1849288
@evaniainbrooks I don't think this is related to #7316, at least not directly, but I will look at your proposed fix for that today!
Environment Information
jruby 9.3.6.0 (2.6.8) 2022-06-27 7a2cbcd376 OpenJDK 64-Bit Server VM 11.0.12+7 on 11.0.12+7 +jit [x86_64-linux]
Linux production-app1 4.4.0-210-generic #242-Ubuntu SMP Fri Apr 16 09:57:56 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Expected Behavior
As a result of bd2595c, which solved #7035, the checkCircularCause method was introduced. Its purpose is to detect circular loops of exception causes.
Actual Behavior
After upgrading our Rails app from JRuby 9.3.3.0 to 9.3.6.0 we discovered an infinite loop in the Java Finalizer thread. In each thread dump we see that it is stuck in the checkCircularCause method:
It seems that it is caused by an invalid FFI::AutoPointer definition in the appsignal gem (I reported the issue to them as appsignal/appsignal-ruby#854), which causes org.jruby.runtime.Arity.checkArity to fail and raise an ArgumentError, which then causes an infinite loop in checkCircularCause.
I was not able to reproduce this infinite loop with a simple script. But as I was looking at the source of this method
I suspect that in this case there is a circular loop of causes that does not include the current exception (this). I think that it would be better to add prevention of a potential infinite loop by