Jruby crashes intermittently right before coverage generation #5882
The following is my local environment information, but the CI environment (where the crash happened) should be the same.
The crash happens in activeadmin's test suite. I have only seen it once, on activeadmin/activeadmin#5858; without that PR, JRuby sometimes hangs right before coverage report generation instead of crashing.
This is the log of the crash: https://circleci.com/gh/activeadmin/activeadmin/22252, where a backtrace can be seen:
JRuby neither crashes nor hangs.
JRuby hangs intermittently (with simplecov 0.17.0) or crashes intermittently (with simplecov 0.17.1).
If the lines correspond correctly to the 22.214.171.124 tag, then the NPE you show happens here:
It would be a peculiar place for it to happen, since no InterpretedIRBlockBody should ever get constructed with a null InterpreterContext. Maybe it's possible some of this trace was cut off or lost somehow? @enebo can you see any way that ic could be null here?
I have some minor changes to the coverage logic to better protect it from concurrent threads (both races and deadlocks) but I'm not sure my fixes correlate with this stack trace.
@deivid-rodriguez For various reasons sometimes the trace reflects something else, like it gets re-thrown from a different place or for some reason parts of it get cut off.
I'm guessing it's difficult to reproduce this, but if you're able to test with a JRuby nightly it might tell us if my changes fixed the right stuff. Otherwise I think we need to figure out a way to reliably reproduce this.
InterpretedIRBlockBody ensureInstrs does potentially have a race. We check the interpreterContext field for null and then return the field. Two in-flight executions of this block may both be trying to promote to fullInterpreterContext, which will set the interpreterContext field. I still don't see how that makes it null, but we can capture the first read of interpreterContext in ensureInstrs into a local to eliminate the unsoundness of the check (a null check followed by the assumption that the field has not changed state since that check).
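To illustrate the "capture the first read in a local" idea, here is a minimal sketch. It is not the actual JRuby code: `Context`, `BlockBodySketch`, `ensureRacy`, `ensureWithLocal` and `promote` are hypothetical stand-ins, and the real ensureInstrs does more than this. It only shows the general shape of reading a shared field once and returning that same value.

```java
// Hypothetical stand-in for an interpreter context; not a JRuby class.
final class Context {
    final String name;
    Context(String name) { this.name = name; }
}

// Hypothetical stand-in for a block body holding a shared context field.
class BlockBodySketch {
    // Another thread may reassign this field at any time.
    private volatile Context context;

    BlockBodySketch(Context initial) { this.context = initial; }

    // Racy shape: the field is read for the null check and read again
    // for the return, so the returned value may not be the one checked.
    Context ensureRacy() {
        if (context == null) {
            context = new Context("startup");
        }
        return context; // second, independent read of the field
    }

    // Safer shape: read the field once into a local and return the local.
    // The value returned is always the value that passed the null check.
    Context ensureWithLocal() {
        Context local = context; // single read of the shared field
        if (local == null) {
            local = new Context("startup");
            context = local;
        }
        return local;
    }

    // Simulates another execution promoting to a "full" context.
    void promote(Context full) { context = full; }
}
```

With the local, a concurrent `promote` can still replace the field, but the caller keeps working with the non-null value it already checked. This does not by itself explain how the field could become null, which matches the doubt expressed above.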