Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multithreaded code with function calls fail in precompiled JRuby classes #6210

Closed
mikefkim opened this issue May 9, 2020 · 11 comments · Fixed by #6275
Closed

Multithreaded code with function calls fail in precompiled JRuby classes #6210

mikefkim opened this issue May 9, 2020 · 11 comments · Fixed by #6275

Comments

@mikefkim
Copy link

mikefkim commented May 9, 2020

On JRuby 9.2.9.0, we can successfully run code that calls functions within a Thread from pre-compiled JRuby classes.

The exact same situation (pre-compiled JRuby classes, multi-threaded code that calls a function) crashes on JRuby 9.2.10.0 and JRuby 9.2.11.0 (haven't tested 9.2.11.1 or 9.3).

We are running JDK1.8, and have been able to reproduce this on MacOS, Ubuntu, and Alpine. We have also been able to reproduce with both the default JRUBY_OPTS, and also with compile.invokedynamic=true.

FWIW, the same code with NUM_THREADS = 1 succeeds when run from a pre-compiled JRuby class (fails for NUM_THREADS > 1)

Here are the steps to reproduce the bug:

  1. Using JRuby 9.2.10.0+, compile the following code using jrubyc
# test.rb

NUM_THREADS = 16
SAMPLE_SIZE = 10_000

def sample
  a = Array.new(SAMPLE_SIZE) { |i| rand }
  a.inject(0.0, :+) / SAMPLE_SIZE.to_f
end

means = Array.new(NUM_THREADS) { 0 }

# Works, if not called within a Thread
# NUM_THREADS.times { |i| means[i] = sample }

threads = []
NUM_THREADS.times { |i|
  threads << Thread.new {
    # Crashes when `sample` is called with threads
    means[i] = sample

    # Works, if function body of `sample` is called in the thread directly
    # a = Array.new(SAMPLE_SIZE) { |i| rand }
    # means[i] = a.inject(0.0, :+) / SAMPLE_SIZE.to_f
  }
}
threads.map(&:join)

puts means.inject(0.0, :+) / NUM_THREADS.to_f
  1. Run the compiled class: jruby test.class

Expected Behavior

When compiled and run using JRuby 9.2.9.0, this test should return a number near 0.5.

Actual Behavior

When compiled and run using JRuby 9.2.10.0 and JRuby 9.2.11.0 (again, haven't tested beyond those), this same test crashes with the following stack trace:

warning: thread "Ruby-0-Thread-2: test.class:25" terminated with exception (report_on_exception is true):
warning: thread "Ruby-0-Thread-1: test.class:25" terminated with exception (report_on_exception is true):
java.lang.NullPointerException
	at org.jruby.ir.persistence.IRReaderStream.decodeEncoding(IRReaderStream.java:94)
	at org.jruby.ir.persistence.IRReaderStream.decodeSymbolFromConstantPool(IRReaderStream.java:126)
	at org.jruby.ir.persistence.IRReaderStream.decodeConstantPool(IRReaderStream.java:201)
	at org.jruby.ir.persistence.IRReaderStream.decodeInstructionsAt(IRReaderStream.java:208)
	at org.jruby.ir.persistence.IRReader.lambda$load$0(IRReader.java:47)
	at org.jruby.ir.interpreter.InterpreterContext.getEngine(InterpreterContext.java:74)
	at org.jruby.ir.interpreter.InterpreterContext.hasExplicitCallProtocol(InterpreterContext.java:206)
	at org.jruby.internal.runtime.methods.MixedModeIRMethod.INTERPRET_METHOD(MixedModeIRMethod.java:116)
	at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:108)
	at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:192)
	at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:141)
	at org.jruby.ir.interpreter.InterpreterEngine.processCall(InterpreterEngine.java:345)
	at org.jruby.ir.interpreter.StartupInterpreterEngine.interpret(StartupInterpreterEngine.java:72)
	at org.jruby.ir.interpreter.Interpreter.INTERPRET_BLOCK(Interpreter.java:116)
	at org.jruby.runtime.MixedModeIRBlockBody.commonYieldPath(MixedModeIRBlockBody.java:136)
	at org.jruby.runtime.IRBlockBody.call(IRBlockBody.java:60)
	at org.jruby.runtime.IRBlockBody.call(IRBlockBody.java:52)
	at org.jruby.runtime.Block.call(Block.java:139)
	at org.jruby.RubyProc.call(RubyProc.java:318)
	at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:105)
	at java.lang.Thread.run(Thread.java:745)
java.lang.NullPointerException
	at org.jruby.ir.persistence.IRReaderStream.decode(IRReaderStream.java:478)
	at org.jruby.ir.persistence.IRReaderStream.decodeOperand(IRReaderStream.java:371)
	at org.jruby.ir.persistence.IRReaderStream.decodeVariable(IRReaderStream.java:380)
	at org.jruby.ir.instructions.LoadImplicitClosureInstr.decode(LoadImplicitClosureInstr.java:33)
	at org.jruby.ir.persistence.IRReaderStream.decodeInstr(IRReaderStream.java:289)
	at org.jruby.ir.persistence.IRReaderStream.decodeInstructionsAt(IRReaderStream.java:218)
	at org.jruby.ir.persistence.IRReader.lambda$load$0(IRReader.java:47)
	at org.jruby.ir.interpreter.InterpreterContext.getEngine(InterpreterContext.java:74)
	at org.jruby.ir.interpreter.InterpreterContext.hasExplicitCallProtocol(InterpreterContext.java:206)
	at org.jruby.internal.runtime.methods.MixedModeIRMethod.INTERPRET_METHOD(MixedModeIRMethod.java:116)
	at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:108)
	at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:192)
	at org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:354)
	at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:143)
	at org.jruby.ir.interpreter.InterpreterEngine.processCall(InterpreterEngine.java:345)
	at org.jruby.ir.interpreter.StartupInterpreterEngine.interpret(StartupInterpreterEngine.java:72)
	at org.jruby.ir.interpreter.Interpreter.INTERPRET_BLOCK(Interpreter.java:116)
	at org.jruby.runtime.MixedModeIRBlockBody.commonYieldPath(MixedModeIRBlockBody.java:136)
	at org.jruby.runtime.IRBlockBody.call(IRBlockBody.java:60)
	at org.jruby.runtime.IRBlockBody.call(IRBlockBody.java:52)
	at org.jruby.runtime.Block.call(Block.java:139)
	at org.jruby.RubyProc.call(RubyProc.java:318)
	at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:105)
	at java.lang.Thread.run(Thread.java:745)
Unhandled Java exception: java.lang.NullPointerException
java.lang.NullPointerException: null
                   decode at org/jruby/ir/persistence/IRReaderStream.java:478
            decodeOperand at org/jruby/ir/persistence/IRReaderStream.java:371
           decodeVariable at org/jruby/ir/persistence/IRReaderStream.java:380
                   decode at org/jruby/ir/instructions/LoadImplicitClosureInstr.java:33
              decodeInstr at org/jruby/ir/persistence/IRReaderStream.java:289
     decodeInstructionsAt at org/jruby/ir/persistence/IRReaderStream.java:218
            lambda$load$0 at org/jruby/ir/persistence/IRReader.java:47
                getEngine at org/jruby/ir/interpreter/InterpreterContext.java:74
  hasExplicitCallProtocol at org/jruby/ir/interpreter/InterpreterContext.java:206
         INTERPRET_METHOD at org/jruby/internal/runtime/methods/MixedModeIRMethod.java:116
                     call at org/jruby/internal/runtime/methods/MixedModeIRMethod.java:108
                     call at org/jruby/internal/runtime/methods/DynamicMethod.java:192
             cacheAndCall at org/jruby/runtime/callsite/CachingCallSite.java:354
                     call at org/jruby/runtime/callsite/CachingCallSite.java:143
              processCall at org/jruby/ir/interpreter/InterpreterEngine.java:345
                interpret at org/jruby/ir/interpreter/StartupInterpreterEngine.java:72
          INTERPRET_BLOCK at org/jruby/ir/interpreter/Interpreter.java:116
          commonYieldPath at org/jruby/runtime/MixedModeIRBlockBody.java:136
                     call at org/jruby/runtime/IRBlockBody.java:60
                     call at org/jruby/runtime/IRBlockBody.java:52
                     call at org/jruby/runtime/Block.java:139
                     call at org/jruby/RubyProc.java:318
                      run at org/jruby/internal/runtime/RubyRunnable.java:105
                      run at java/lang/Thread.java:745
@mikefkim mikefkim changed the title Multithreaded function calls with symbol procs fail in precompiled JRuby classes Multithreaded code with function calls fail in precompiled JRuby classes May 9, 2020
@mikefkim
Copy link
Author

mikefkim commented May 9, 2020

At first it looked like the issue was with function calls with symbol procs, for examples:

def sample
  a = Array.new(SAMPLE_SIZE) { |i| rand }
  a.inject(0.0, :+) / SAMPLE_SIZE.to_f
end

But then, I could get it to fail with this version as well:

def sample
  sum = 0.0
  SAMPLE_SIZE.times { sum += rand }
  mean = sum / SAMPLE_SIZE
  return mean
end

Hence, I updated the issue.

@mikefkim
Copy link
Author

mikefkim commented May 9, 2020

Here's something interesting / disturbing about this issue:

The first version of sample in the comment above (with symbol procs) will crash (what appears to be) every time. But, the second version of sample above (without symbol procs) will succeed occasionally: i.e., If I run jrubyc test.class repeatedly, it will eventually succeed.

Even if I replace the sample method with:

def sample
  0.5
end

If I run jruby test.class often enough, I will eventually get a crash (about 1 in 20 times). So the "simpler" the code the more likely running the compiled class is to succeed, but however simple, it seems like it eventually does crash.

@enebo enebo added this to the JRuby 9.3.0.0 milestone May 11, 2020
@enebo
Copy link
Member

enebo commented May 11, 2020

Bleh...This one is aggravating but fairly mundane as to cause. We lazily decode scopes with this line:

            scope.allocateInterpreterContext(() -> file.decodeInstructionsAt(scope, poolOffset, instructionsOffset));

This java closure captures the IRReader instance it is in which has a field 'buf' which decodeEncoding() uses. Two or more threads using the same field at the same time without locks == boom.

Until we release (and apply a fix for this) the only work around would be to load this code once in a single thread. After that point it sould be good for multi-threaded use.

It is interesting that one example happens to hit it less than the other but I am guessing you are just getting lucky that two versions deserializing at the same time is out of sync enough to how happen to be doing encoding decoding at the same time.

@mikefkim
Copy link
Author

mikefkim commented May 11, 2020

Ah -- that totally makes sense! Thank you Thomas!

I would guess that what is happening with the differential impact is the speed at which the deserialization happens: the faster something deserializes (e.g., with just a constant in the function body) the less likely it is we'll have more than one thread attempting deserialization.

@JohnPhillips31416
Copy link

JohnPhillips31416 commented May 27, 2020

We've also hit this issue, it's a blocker for us moving from JRuby 9.2.9.0 and thus a blocker for us to get to JDK11. Would really help us out if this bug could get fixed on the 9.2.11.x line.

headius added a commit to headius/jruby that referenced this issue Jun 9, 2020
Instruction reading is done lazily to avoid the cost of
deserializing method bodies that will never be called. However the
buffer associated with the original file was being shared across
all of these lazy deserializations, causing concurrency issues.

This patch modifies the lazy deserialization lambda to operate
against a duplicate of the original reader, isolating its changes
in a separate buffer and avoiding the concurrency issues.

Fixes jruby#6210
@headius
Copy link
Member

headius commented Jun 9, 2020

I've pushed a fix for this in #6275 that duplicates the IRReaderStream before doing any lazy instruction deserialization. This avoids two threads sharing a buffer and stepping on each other.

We've also had a request for a shorter-term release with this fix, like a 9.2.11.2. We have no plans to do that right now, but with 9.3 growing into a "fix everything" release there may be justification for a 9.2.12.

@headius
Copy link
Member

headius commented Jun 9, 2020

@enebo We need to talk about getting this fix and a few others into a 9.2.12, since I think we have at least another month of work to get 9.3 out the door.

@JohnPhillips31416
Copy link

JohnPhillips31416 commented Jun 9, 2020

Awesome, thank you!

@enebo
Copy link
Member

enebo commented Jun 15, 2020

@headius agreed this should be in 9.2.12. Added the milestone since we have already decided it to be a good idea and targetted this. We need to still cp this over to jruby-9.2 branch.

enebo pushed a commit that referenced this issue Jun 15, 2020
Instruction reading is done lazily to avoid the cost of
deserializing method bodies that will never be called. However the
buffer associated with the original file was being shared across
all of these lazy deserializations, causing concurrency issues.

This patch modifies the lazy deserialization lambda to operate
against a duplicate of the original reader, isolating its changes
in a separate buffer and avoiding the concurrency issues.

Fixes #6210
@enebo
Copy link
Member

enebo commented Jun 15, 2020

Done.

@mikefkim
Copy link
Author

mikefkim commented Jun 15, 2020

Thank you @enebo and @headius!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants