New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Marshal.dump sometimes triggers ConcurrencyError for large structure #5301

Open
JasonLunn opened this Issue Aug 31, 2018 · 1 comment

Comments

Projects
None yet
1 participant
@JasonLunn

JasonLunn commented Aug 31, 2018

Environment

  • JRuby 9.1.17.0
  • macOS High Sierra 10.13.6 (17G65)
  • Java 1.8.0_121
  • rails 5.1.6
  • ice_nine 0.11.2
  • parallel 1.12.1

Background

A large configuration data structure (composed of nested ruby hashes, arrays, active record objects, and all their trimmings) is assembled in our application by a worker instance, Marshal#dump'ed, compressed and later shared with front-end servers.

The run time of the job that prepares this structure has been a source of frustration, and so we're in the process of parallelizing the compilation of the data. Updating several RubyHash literals to extend JRuby::Synchronized and a little refactoring cured all the application errors that we encountered getting the parallelized compilation process to complete without errors.

Issue

After having constructed the data structure, we call Marshal.dump as we prepare to export the computed state. Often it succeeds, but not always. A few repetitions of the job - sometimes as little as one, rarely more than 10 - are all that is needed to trigger a ConcurrencyError (see full backtrace below). We're not sure what part of the data structure is triggering the issue other than it must be an instance of RubyHash.

We assumed that we were doing something foolish like forgetting about a background thread that was in the process of mutating the data structure as we were dumping it. We were sure that if we used IceNine.deep_freeze! we'd catch the mutation in the act. No mutation has been detected so far, and yet dump continues to fail with the same ConcurrencyError even though it is now working on a recursively frozen structure.

Is there a way to identify what structure is invalid? Is our expectation that a frozen RubyHash should be immune from ConcurrencyErrors a sound one (because modification should no longer be possible), or is Marshal.dump exposing damage that was done prior to invoking it?

Actual Behavior

#<ConcurrencyError: Detected invalid hash contents due to unsynchronized modifications with concurrent users>
java/lang/Thread.java:1556:in `getStackTrace'
org/jruby/runtime/backtrace/TraceType.java:244:in `getBacktraceData'
org/jruby/runtime/backtrace/TraceType.java:47:in `getBacktrace'
org/jruby/RubyException.java:245:in `prepareBacktrace'
org/jruby/exceptions/RaiseException.java:216:in `preRaise'
org/jruby/exceptions/RaiseException.java:183:in `preRaise'
org/jruby/exceptions/RaiseException.java:111:in `<init>'
org/jruby/Ruby.java:4127:in `newRaiseException'
org/jruby/Ruby.java:3843:in `newConcurrencyError'
org/jruby/RubyHash.java:2124:in `concurrentModification'
org/jruby/RubyHash.java:676:in `visitLimited'
org/jruby/RubyHash.java:1993:in `marshalTo'
org/jruby/runtime/marshal/MarshalStream.java:280:in `writeObjectData'
org/jruby/runtime/marshal/MarshalStream.java:203:in `writeDirectly'
org/jruby/RubyClass.java:1956:in `dump'
org/jruby/RubyClass.java:2034:in `smartDump'
org/jruby/runtime/marshal/MarshalStream.java:153:in `writeAndRegister'
org/jruby/runtime/marshal/MarshalStream.java:107:in `dumpObject'
org/jruby/RubyArray.java:4339:in `marshalTo'
org/jruby/runtime/marshal/MarshalStream.java:238:in `writeObjectData'
org/jruby/runtime/marshal/MarshalStream.java:203:in `writeDirectly'
org/jruby/RubyClass.java:1956:in `dump'
org/jruby/RubyClass.java:2034:in `smartDump'
org/jruby/runtime/marshal/MarshalStream.java:153:in `writeAndRegister'
org/jruby/runtime/marshal/MarshalStream.java:107:in `dumpObject'
org/jruby/runtime/marshal/MarshalStream.java:342:in `userNewCommon'
org/jruby/runtime/marshal/MarshalStream.java:323:in `userNewMarshal'
org/jruby/RubyClass.java:1959:in `dump'
org/jruby/RubyClass.java:2034:in `smartDump'
org/jruby/runtime/marshal/MarshalStream.java:153:in `writeAndRegister'
org/jruby/runtime/marshal/MarshalStream.java:107:in `dumpObject'
org/jruby/runtime/marshal/MarshalStream.java:419:in `dumpVariablesShared'
org/jruby/runtime/marshal/MarshalStream.java:412:in `dumpVariables'
org/jruby/RubyClass.java:1322:in `marshalTo'
org/jruby/RubyClass.java:1296:in `marshal'
org/jruby/runtime/marshal/MarshalStream.java:293:in `writeObjectData'
org/jruby/runtime/marshal/MarshalStream.java:203:in `writeDirectly'
org/jruby/RubyClass.java:1956:in `dump'
org/jruby/RubyClass.java:2034:in `smartDump'
org/jruby/runtime/marshal/MarshalStream.java:153:in `writeAndRegister'
org/jruby/runtime/marshal/MarshalStream.java:107:in `dumpObject'
org/jruby/runtime/marshal/MarshalStream.java:419:in `dumpVariablesShared'
org/jruby/runtime/marshal/MarshalStream.java:412:in `dumpVariables'
org/jruby/RubyClass.java:1322:in `marshalTo'
org/jruby/RubyClass.java:1296:in `marshal'
org/jruby/runtime/marshal/MarshalStream.java:293:in `writeObjectData'
org/jruby/runtime/marshal/MarshalStream.java:203:in `writeDirectly'
org/jruby/RubyClass.java:1970:in `dump'
org/jruby/RubyClass.java:2034:in `smartDump'
org/jruby/runtime/marshal/MarshalStream.java:153:in `writeAndRegister'
org/jruby/runtime/marshal/MarshalStream.java:107:in `dumpObject'
org/jruby/RubyArray.java:4339:in `marshalTo'
org/jruby/runtime/marshal/MarshalStream.java:238:in `writeObjectData'
org/jruby/runtime/marshal/MarshalStream.java:203:in `writeDirectly'
org/jruby/RubyClass.java:1956:in `dump'
org/jruby/RubyClass.java:2034:in `smartDump'
org/jruby/runtime/marshal/MarshalStream.java:153:in `writeAndRegister'
org/jruby/runtime/marshal/MarshalStream.java:107:in `dumpObject'
org/jruby/runtime/marshal/MarshalStream.java:419:in `dumpVariablesShared'
org/jruby/runtime/marshal/MarshalStream.java:412:in `dumpVariables'
org/jruby/RubyClass.java:1322:in `marshalTo'
org/jruby/RubyClass.java:1296:in `marshal'
org/jruby/runtime/marshal/MarshalStream.java:293:in `writeObjectData'
org/jruby/runtime/marshal/MarshalStream.java:203:in `writeDirectly'
org/jruby/RubyClass.java:1970:in `dump'
org/jruby/RubyClass.java:2034:in `smartDump'
org/jruby/runtime/marshal/MarshalStream.java:153:in `writeAndRegister'
org/jruby/runtime/marshal/MarshalStream.java:107:in `dumpObject'
org/jruby/RubyArray.java:4339:in `marshalTo'
org/jruby/runtime/marshal/MarshalStream.java:238:in `writeObjectData'
org/jruby/runtime/marshal/MarshalStream.java:203:in `writeDirectly'
org/jruby/RubyClass.java:1956:in `dump'
org/jruby/RubyClass.java:2034:in `smartDump'
org/jruby/runtime/marshal/MarshalStream.java:153:in `writeAndRegister'
org/jruby/runtime/marshal/MarshalStream.java:107:in `dumpObject'
org/jruby/runtime/marshal/MarshalStream.java:419:in `dumpVariablesShared'
org/jruby/runtime/marshal/MarshalStream.java:412:in `dumpVariables'
org/jruby/RubyClass.java:1322:in `marshalTo'
org/jruby/RubyClass.java:1296:in `marshal'
org/jruby/runtime/marshal/MarshalStream.java:293:in `writeObjectData'
org/jruby/runtime/marshal/MarshalStream.java:203:in `writeDirectly'
org/jruby/RubyClass.java:1970:in `dump'
org/jruby/RubyClass.java:2034:in `smartDump'
org/jruby/runtime/marshal/MarshalStream.java:153:in `writeAndRegister'
org/jruby/runtime/marshal/MarshalStream.java:107:in `dumpObject'
org/jruby/RubyArray.java:4339:in `marshalTo'
org/jruby/runtime/marshal/MarshalStream.java:238:in `writeObjectData'
org/jruby/runtime/marshal/MarshalStream.java:203:in `writeDirectly'
org/jruby/RubyClass.java:1956:in `dump'
org/jruby/RubyClass.java:2034:in `smartDump'
org/jruby/runtime/marshal/MarshalStream.java:153:in `writeAndRegister'
org/jruby/runtime/marshal/MarshalStream.java:107:in `dumpObject'
org/jruby/runtime/marshal/MarshalStream.java:419:in `dumpVariablesShared'
org/jruby/runtime/marshal/MarshalStream.java:412:in `dumpVariables'
org/jruby/RubyClass.java:1322:in `marshalTo'
org/jruby/RubyClass.java:1296:in `marshal'
org/jruby/runtime/marshal/MarshalStream.java:293:in `writeObjectData'
org/jruby/runtime/marshal/MarshalStream.java:203:in `writeDirectly'
org/jruby/RubyClass.java:1956:in `dump'
org/jruby/RubyClass.java:2034:in `smartDump'
org/jruby/runtime/marshal/MarshalStream.java:153:in `writeAndRegister'
org/jruby/runtime/marshal/MarshalStream.java:107:in `dumpObject'
org/jruby/RubyHash.java:2006:in `visit'
org/jruby/RubyHash.java:2001:in `visit'
org/jruby/RubyHash.java:669:in `visitLimited'
org/jruby/RubyHash.java:1993:in `marshalTo'
org/jruby/runtime/marshal/MarshalStream.java:280:in `writeObjectData'
org/jruby/runtime/marshal/MarshalStream.java:203:in `writeDirectly'
org/jruby/RubyClass.java:1956:in `dump'
org/jruby/RubyClass.java:2034:in `smartDump'
org/jruby/runtime/marshal/MarshalStream.java:153:in `writeAndRegister'
org/jruby/runtime/marshal/MarshalStream.java:107:in `dumpObject'
org/jruby/RubyMarshal.java:163:in `dumpToStream'
org/jruby/RubyMarshal.java:102:in `dump'
org/jruby/RubyMarshal$INVOKER$s$0$2$dump_DBG.gen:-1:in `call'
org/jruby/internal/runtime/methods/DynamicMethod.java:206:in `call'
org/jruby/internal/runtime/methods/DynamicMethod.java:202:in `call'
org/jruby/runtime/callsite/CachingCallSite.java:153:in `call'
org/jruby/ir/interpreter/InterpreterEngine.java:315:in `processCall'
org/jruby/ir/interpreter/StartupInterpreterEngine.java:72:in `interpret'
org/jruby/internal/runtime/methods/MixedModeIRMethod.java:109:in `INTERPRET_METHOD'
org/jruby/internal/runtime/methods/MixedModeIRMethod.java:95:in `call'
org/jruby/internal/runtime/methods/AliasMethod.java:101:in `call'
org/jruby/runtime/callsite/CachingCallSite.java:77:in `call'
org/jruby/ir/instructions/CallBase.java:432:in `interpret'
org/jruby/ir/interpreter/InterpreterEngine.java:360:in `processCall'
org/jruby/ir/interpreter/StartupInterpreterEngine.java:72:in `interpret'
org/jruby/ir/interpreter/Interpreter.java:127:in `INTERPRET_BLOCK'
org/jruby/runtime/MixedModeIRBlockBody.java:149:in `commonYieldPath'
org/jruby/runtime/IRBlockBody.java:84:in `yieldSpecific'
org/jruby/runtime/Block.java:134:in `yieldSpecific'
org/jruby/ir/runtime/IRRuntimeHelpers.java:421:in `yieldSpecific'
org/jruby/ir/targets/YieldSite.java:119:in `yieldSpecific'
/Users/jason/.rvm/gems/jruby-9.1.17.0@player/gems/newrelic_rpm-5.2.0.345/lib/new_relic/agent/method_tracer_helpers.rb:30:in `trace_execution_scoped'
org/jruby/internal/runtime/methods/CompiledIRMethod.java:77:in `call'
org/jruby/internal/runtime/methods/MixedModeIRMethod.java:93:in `call'
org/jruby/runtime/callsite/CachingCallSite.java:77:in `call'
org/jruby/runtime/callsite/CachingCallSite.java:84:in `callIter'
org/jruby/ir/instructions/CallBase.java:429:in `interpret'
org/jruby/ir/interpreter/InterpreterEngine.java:360:in `processCall'
org/jruby/ir/interpreter/StartupInterpreterEngine.java:72:in `interpret'
org/jruby/ir/interpreter/InterpreterEngine.java:78:in `interpret'
org/jruby/internal/runtime/methods/MixedModeIRMethod.java:144:in `INTERPRET_METHOD'
org/jruby/internal/runtime/methods/MixedModeIRMethod.java:130:in `call'
org/jruby/internal/runtime/methods/DynamicMethod.java:194:in `call'
org/jruby/internal/runtime/methods/AliasMethod.java:56:in `call'
org/jruby/runtime/callsite/CachingCallSite.java:125:in `call'
org/jruby/ir/interpreter/InterpreterEngine.java:344:in `processCall'
org/jruby/ir/interpreter/StartupInterpreterEngine.java:72:in `interpret'
org/jruby/ir/interpreter/Interpreter.java:127:in `INTERPRET_BLOCK'
org/jruby/runtime/MixedModeIRBlockBody.java:149:in `commonYieldPath'
org/jruby/runtime/IRBlockBody.java:73:in `call'
org/jruby/runtime/Block.java:124:in `call'
org/jruby/RubyProc.java:289:in `call'
org/jruby/RubyProc.java:273:in `call19'
org/jruby/RubyProc$INVOKER$i$0$0$call19_DBG.gen:-1:in `call'
org/jruby/internal/runtime/methods/DynamicMethod.java:198:in `call'
org/jruby/internal/runtime/methods/DynamicMethod.java:194:in `call'
org/jruby/runtime/callsite/CachingCallSite.java:125:in `call'
org/jruby/ir/interpreter/InterpreterEngine.java:344:in `processCall'
org/jruby/ir/interpreter/StartupInterpreterEngine.java:72:in `interpret'
org/jruby/ir/interpreter/Interpreter.java:127:in `INTERPRET_BLOCK'
org/jruby/runtime/MixedModeIRBlockBody.java:149:in `commonYieldPath'
org/jruby/runtime/IRBlockBody.java:84:in `yieldSpecific'
org/jruby/runtime/Block.java:134:in `yieldSpecific'
org/jruby/ir/runtime/IRRuntimeHelpers.java:421:in `yieldSpecific'
org/jruby/ir/instructions/YieldInstr.java:74:in `interpret'
org/jruby/ir/interpreter/StartupInterpreterEngine.java:178:in `processOtherOp'
org/jruby/ir/interpreter/StartupInterpreterEngine.java:104:in `interpret'
org/jruby/internal/runtime/methods/MixedModeIRMethod.java:109:in `INTERPRET_METHOD'
org/jruby/internal/runtime/methods/MixedModeIRMethod.java:95:in `call'
org/jruby/runtime/callsite/CachingCallSite.java:77:in `call'
org/jruby/runtime/callsite/CachingCallSite.java:84:in `callIter'
org/jruby/ir/instructions/CallBase.java:429:in `interpret'
org/jruby/ir/interpreter/InterpreterEngine.java:360:in `processCall'
org/jruby/ir/interpreter/StartupInterpreterEngine.java:72:in `interpret'
org/jruby/internal/runtime/methods/MixedModeIRMethod.java:109:in `INTERPRET_METHOD'
org/jruby/internal/runtime/methods/MixedModeIRMethod.java:95:in `call'
org/jruby/internal/runtime/methods/AliasMethod.java:101:in `call'
org/jruby/runtime/callsite/CachingCallSite.java:77:in `call'
org/jruby/ir/instructions/CallBase.java:432:in `interpret'
org/jruby/ir/interpreter/InterpreterEngine.java:360:in `processCall'
org/jruby/ir/interpreter/StartupInterpreterEngine.java:72:in `interpret'
org/jruby/ir/interpreter/Interpreter.java:127:in `INTERPRET_BLOCK'
org/jruby/runtime/MixedModeIRBlockBody.java:149:in `commonYieldPath'
org/jruby/runtime/IRBlockBody.java:84:in `yieldSpecific'
org/jruby/runtime/Block.java:134:in `yieldSpecific'
org/jruby/ir/runtime/IRRuntimeHelpers.java:421:in `yieldSpecific'
org/jruby/ir/instructions/YieldInstr.java:74:in `interpret'
org/jruby/ir/interpreter/StartupInterpreterEngine.java:178:in `processOtherOp'
org/jruby/ir/interpreter/StartupInterpreterEngine.java:104:in `interpret'
org/jruby/ir/interpreter/InterpreterEngine.java:84:in `interpret'
org/jruby/internal/runtime/methods/MixedModeIRMethod.java:179:in `INTERPRET_METHOD'
org/jruby/internal/runtime/methods/MixedModeIRMethod.java:165:in `call'
org/jruby/runtime/callsite/CachingCallSite.java:163:in `call'
org/jruby/runtime/callsite/CachingCallSite.java:170:in `callIter'
org/jruby/ir/interpreter/InterpreterEngine.java:336:in `processCall'
org/jruby/ir/interpreter/StartupInterpreterEngine.java:72:in `interpret'
org/jruby/internal/runtime/methods/MixedModeIRMethod.java:109:in `INTERPRET_METHOD'
org/jruby/internal/runtime/methods/MixedModeIRMethod.java:95:in `call'
org/jruby/internal/runtime/methods/DynamicMethod.java:176:in `call'
org/jruby/internal/runtime/methods/AliasMethod.java:76:in `call'
org/jruby/javasupport/proxy/JavaProxyConstructor.java:255:in `invokeRuby'
org/jruby/javasupport/proxy/JavaProxyConstructor.java:238:in `invoke'
org/jruby/proxy/java/lang/Thread$Proxy6:-1:in `run'
@JasonLunn

This comment has been minimized.

Show comment
Hide comment
@JasonLunn

JasonLunn Sep 10, 2018

Update:

I built JRuby 9.1.17.0 locally with some additional diagnostics in the error message to capture the output of inspect and frozen? when the exception is thrown. This enabled me to pinpoint what hash was the source, and I can now confirm that it was not frozen at the time.

I altered the application code to make sure that the hash in question (which turned out to be the attributes hash of one of the ActiveRecord objects) was frozen, expecting to see an error raised where alteration of a frozen object was attempted. Not only do I not see such an error, I am no longer seeing the ConcurrencyError raised at all. I'm in the process of stress testing the changes to see if it stays solved or is simply harder to reproduce.

@headius - Does freeze'ing a hash have any expected impact on the conditions needed to cause aConcurrencyErrors? Are concurrent reads of a hash enough to put it in a bad state, or is modification required?

JasonLunn commented Sep 10, 2018

Update:

I built JRuby 9.1.17.0 locally with some additional diagnostics in the error message to capture the output of inspect and frozen? when the exception is thrown. This enabled me to pinpoint what hash was the source, and I can now confirm that it was not frozen at the time.

I altered the application code to make sure that the hash in question (which turned out to be the attributes hash of one of the ActiveRecord objects) was frozen, expecting to see an error raised where alteration of a frozen object was attempted. Not only do I not see such an error, I am no longer seeing the ConcurrencyError raised at all. I'm in the process of stress testing the changes to see if it stays solved or is simply harder to reproduce.

@headius - Does freeze'ing a hash have any expected impact on the conditions needed to cause aConcurrencyErrors? Are concurrent reads of a hash enough to put it in a bad state, or is modification required?

@JasonLunn JasonLunn changed the title from Marshal.dump sometimes triggers ConcurrencyError for large structure, even when frozen to Marshal.dump sometimes triggers ConcurrencyError for large structure Sep 10, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment