Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jruby crashes intermittently right before coverage generation #5882

Closed
deivid-rodriguez opened this issue Sep 19, 2019 · 7 comments
Closed

Jruby crashes intermittently right before coverage generation #5882

deivid-rodriguez opened this issue Sep 19, 2019 · 7 comments
Milestone

Comments

@deivid-rodriguez
Copy link
Contributor

@deivid-rodriguez deivid-rodriguez commented Sep 19, 2019

Environment

The following is my local environment information, but CI information (where the crash happened should be the same).

$ jruby -v
jruby 9.2.8.0 (2.5.3) 2019-08-12 a1ac7ff OpenJDK 64-Bit Server VM 11.0.4+11-post-Ubuntu-1ubuntu218.04.3 on 11.0.4+11-post-Ubuntu-1ubuntu218.04.3 [linux-x86_64]

$ echo $JRUBY_OPTS 
--dev --debug -J-Xmx1024m

$ uname -a
Linux chaba 5.0.0-25-generic #26~18.04.1-Ubuntu SMP Thu Aug 1 13:51:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Crash happens in activeadmin's test suite, have only seen it once on activeadmin/activeadmin#5858, but without that PR, the behavior is that sometimes jruby hangs right before coverage report generation (instead of crashing).

This is the log of the crash: https://circleci.com/gh/activeadmin/activeadmin/22252, where a backtrace can be seen:

Unhandled Java exception: java.lang.NullPointerException
java.lang.NullPointerException: null
   commonYieldPath at org/jruby/runtime/InterpretedIRBlockBody.java:127
           doYield at org/jruby/runtime/IRBlockBody.java:187
             yield at org/jruby/runtime/BlockBody.java:116
             yield at org/jruby/runtime/Block.java:170
      detectCommon at org/jruby/RubyArray.java:4756
              find at org/jruby/RubyArray.java:4729
            detect at org/jruby/RubyEnumerable.java:634
              call at org/jruby/RubyEnumerable$INVOKER$s$detect_DBG.gen:-1
              call at org/jruby/internal/runtime/methods/JavaMethod.java:575
      cacheAndCall at org/jruby/runtime/callsite/CachingCallSite.java:332
              call at org/jruby/runtime/callsite/CachingCallSite.java:86
          callIter at org/jruby/runtime/callsite/CachingCallSite.java:93
         interpret at org/jruby/ir/instructions/CallBase.java:537
       processCall at org/jruby/ir/interpreter/InterpreterEngine.java:361
         interpret at org/jruby/ir/interpreter/StartupInterpreterEngine.java:72
         interpret at org/jruby/ir/interpreter/InterpreterEngine.java:80
  INTERPRET_METHOD at org/jruby/internal/runtime/methods/InterpretedIRMethod.java:153
              call at org/jruby/internal/runtime/methods/InterpretedIRMethod.java:140
              call at org/jruby/runtime/callsite/CachingCallSite.java:141
       processCall at org/jruby/ir/interpreter/InterpreterEngine.java:345
         interpret at org/jruby/ir/interpreter/StartupInterpreterEngine.java:72
  INTERPRET_METHOD at org/jruby/internal/runtime/methods/InterpretedIRMethod.java:115
              call at org/jruby/internal/runtime/methods/InterpretedIRMethod.java:94
      cacheAndCall at org/jruby/runtime/callsite/CachingCallSite.java:332
              call at org/jruby/runtime/callsite/CachingCallSite.java:86
         interpret at org/jruby/ir/instructions/CallBase.java:540
       processCall at org/jruby/ir/interpreter/InterpreterEngine.java:361
         interpret at org/jruby/ir/interpreter/StartupInterpreterEngine.java:72
  INTERPRET_METHOD at org/jruby/internal/runtime/methods/InterpretedIRMethod.java:115
              call at org/jruby/internal/runtime/methods/InterpretedIRMethod.java:94
      cacheAndCall at org/jruby/runtime/callsite/CachingCallSite.java:332
              call at org/jruby/runtime/callsite/CachingCallSite.java:86
         interpret at org/jruby/ir/instructions/CallBase.java:540
       processCall at org/jruby/ir/interpreter/InterpreterEngine.java:361
         interpret at org/jruby/ir/interpreter/StartupInterpreterEngine.java:72
   INTERPRET_BLOCK at org/jruby/ir/interpreter/Interpreter.java:114
   commonYieldPath at org/jruby/runtime/InterpretedIRBlockBody.java:137
           doYield at org/jruby/runtime/IRBlockBody.java:187
             yield at org/jruby/runtime/BlockBody.java:116
             yield at org/jruby/runtime/Block.java:170
             yield at org/jruby/ir/runtime/IRRuntimeHelpers.java:477
         interpret at org/jruby/ir/instructions/YieldInstr.java:85
    processOtherOp at org/jruby/ir/interpreter/StartupInterpreterEngine.java:178
         interpret at org/jruby/ir/interpreter/StartupInterpreterEngine.java:104
   INTERPRET_BLOCK at org/jruby/ir/interpreter/Interpreter.java:114
   commonYieldPath at org/jruby/runtime/InterpretedIRBlockBody.java:137
              call at org/jruby/runtime/IRBlockBody.java:79
              call at org/jruby/runtime/Block.java:129
              call at org/jruby/RubyProc.java:295
              call at org/jruby/RubyProc.java:274
              call at org/jruby/RubyProc.java:266
              call at org/jruby/RubyProc$INVOKER$i$0$0$call_DBG.gen:-1
      cacheAndCall at org/jruby/runtime/callsite/CachingCallSite.java:332
              call at org/jruby/runtime/callsite/CachingCallSite.java:86
         interpret at org/jruby/ir/instructions/CallBase.java:540
       processCall at org/jruby/ir/interpreter/InterpreterEngine.java:361
         interpret at org/jruby/ir/interpreter/StartupInterpreterEngine.java:72
  INTERPRET_METHOD at org/jruby/internal/runtime/methods/InterpretedIRMethod.java:115
              call at org/jruby/internal/runtime/methods/InterpretedIRMethod.java:94
      cacheAndCall at org/jruby/runtime/callsite/CachingCallSite.java:332
              call at org/jruby/runtime/callsite/CachingCallSite.java:86
         interpret at org/jruby/ir/instructions/CallBase.java:540
       processCall at org/jruby/ir/interpreter/InterpreterEngine.java:361
         interpret at org/jruby/ir/interpreter/StartupInterpreterEngine.java:72
   INTERPRET_BLOCK at org/jruby/ir/interpreter/Interpreter.java:114
   commonYieldPath at org/jruby/runtime/InterpretedIRBlockBody.java:137
     yieldSpecific at org/jruby/runtime/IRBlockBody.java:89
     yieldSpecific at org/jruby/runtime/Block.java:139
     yieldSpecific at org/jruby/ir/runtime/IRRuntimeHelpers.java:481
         interpret at org/jruby/ir/instructions/YieldInstr.java:76
    processOtherOp at org/jruby/ir/interpreter/StartupInterpreterEngine.java:178
         interpret at org/jruby/ir/interpreter/StartupInterpreterEngine.java:104
  INTERPRET_METHOD at org/jruby/internal/runtime/methods/InterpretedIRMethod.java:115
              call at org/jruby/internal/runtime/methods/InterpretedIRMethod.java:94
      cacheAndCall at org/jruby/runtime/callsite/CachingCallSite.java:332
              call at org/jruby/runtime/callsite/CachingCallSite.java:86
          callIter at org/jruby/runtime/callsite/CachingCallSite.java:93
         interpret at org/jruby/ir/instructions/CallBase.java:537
       processCall at org/jruby/ir/interpreter/InterpreterEngine.java:361
         interpret at org/jruby/ir/interpreter/StartupInterpreterEngine.java:72
   INTERPRET_BLOCK at org/jruby/ir/interpreter/Interpreter.java:114
   commonYieldPath at org/jruby/runtime/InterpretedIRBlockBody.java:137
           doYield at org/jruby/runtime/IRBlockBody.java:187
             yield at org/jruby/runtime/BlockBody.java:116
             yield at org/jruby/runtime/Block.java:170
             yield at org/jruby/ir/runtime/IRRuntimeHelpers.java:477
         interpret at org/jruby/ir/instructions/YieldInstr.java:85
    processOtherOp at org/jruby/ir/interpreter/StartupInterpreterEngine.java:178
         interpret at org/jruby/ir/interpreter/StartupInterpreterEngine.java:104
   INTERPRET_BLOCK at org/jruby/ir/interpreter/Interpreter.java:114
   commonYieldPath at org/jruby/runtime/InterpretedIRBlockBody.java:137
              call at org/jruby/runtime/IRBlockBody.java:79
              call at org/jruby/runtime/Block.java:129
              call at org/jruby/RubyProc.java:295
              call at org/jruby/RubyProc.java:274
              call at org/jruby/RubyProc.java:270
               run at org/jruby/internal/runtime/RubyRunnable.java:105
               run at java/lang/Thread.java:748

Expected Behavior

Jruby doesn't crash nor hangs.

Actual Behavior

Jruby hangs intermittently (with simplecov 0.17.0), or crashes intermittenly (with simplecov 0.17.1).

@deivid-rodriguez deivid-rodriguez changed the title Jruby crashes right before coverage generation Jruby crashes intermittently right before coverage generation Sep 19, 2019
@headius

This comment has been minimized.

Copy link
Member

@headius headius commented Sep 21, 2019

Hmm strange. What you show there is a stack trace from a thread, so I suspect this may have something to do with a background thread still running and accessing structures that the coverage output needs to process.

@headius

This comment has been minimized.

Copy link
Member

@headius headius commented Sep 21, 2019

If the lines correspond correctly to the 9.2.8.0 tag, then the NPE you show happens here:

It would be a peculiar place for it to happen, since no InterpretedIRBlockBody should ever get constructed with a null InterpreterContext. Maybe it's possible some of this trace was cut off or lost somehow? @enebo can you see any way that ic could be null here?

I have some minor changes to the coverage logic to better protect it from concurrent threads (both races and deadlocks) but I'm not sure my fixes correlate with this stack trace.

headius added a commit that referenced this issue Sep 21, 2019
This may address some NPE and hangs when requesting coverage
output in a multi-threaded system.

See #5882.
@headius headius added this to the JRuby 9.2.9.0 milestone Sep 21, 2019
@deivid-rodriguez

This comment has been minimized.

Copy link
Contributor Author

@deivid-rodriguez deivid-rodriguez commented Sep 22, 2019

Thanks for improving this! ❤️

Maybe it's possible some of this trace was cut off or lost somehow?

Mmmm, I copy-pasted it from the CircleCI log I linked to. Or you mean that jruby is printing an incorrect backtrace somehow?

@headius

This comment has been minimized.

Copy link
Member

@headius headius commented Sep 23, 2019

@deivid-rodriguez For various reasons sometimes the trace reflects something else, like it gets re-thrown from a different place or for some reason parts of it get cut off.

I'm guessing it's difficult to reproduce this, but if you're able to test with a JRuby nightly it might tell us if my changes fixed the right stuff. Otherwise I think we need to figure out a way to reliably reproduce this.

@enebo

This comment has been minimized.

Copy link
Member

@enebo enebo commented Sep 23, 2019

@headius @deivid-rodriguez yeah I have been trying to trace through with idea that ic is null in that source and although we have some paths which will set ic to null none of them involve closures.

InterpretedIRBlockBody ensureInstrs does potentially have a race. We check interpreterContext for null as a field than return the field. Two inflight executions of this block may be trying to promote to fullInterperterContext which will set the interpreterContext field. I still don't see how that makes it null but we can capture first read of interpreterContext in ensureInstrs to a local to eliminate unsoundness of the check (null followed by assuming it may not have changed state since the if check).

@deivid-rodriguez

This comment has been minimized.

Copy link
Contributor Author

@deivid-rodriguez deivid-rodriguez commented Sep 23, 2019

Thanks both for researching into this. I've only run into the crash once, I'll monitor to see if it reappears. The hang used to be pretty common, but it seems fixed with simplecov's 0.17.1 release.

@headius

This comment has been minimized.

Copy link
Member

@headius headius commented Sep 25, 2019

I think we'll call this fixed until we can get a new reproduction. Hopefully fixes here and in some of the event hook logic will have fixed the issue!

@headius headius closed this Sep 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants
You can’t perform that action at this time.