New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JRuby thread stuck on IntHashMap.rehash #3394

Closed
banker opened this Issue Oct 13, 2015 · 26 comments

Comments

Projects
None yet
5 participants
@banker

banker commented Oct 13, 2015

I have a multi-threaded application running on JRuby 9.0.1.0 (Java 1.8.0_45-b14). The application runs with 8 Ruby threads. One of those threads has been stuck on the IntHashMap.rehash method for the last 13 hours. The remaining threads are all functioning normally. Here is (most of) the stack trace for the thread in question as reported by jstack:

"Ruby-0-Thread-7: /data/apps/my_app/current/vendor/bundle/jruby/2.2.0/gems/my-gem-java/lib/java_executor_simple.rb:22" #21 daemon prio=5 os_prio=0 tid=0x00007f1f585bf800 nid=0x505e runnable [0x00007f1f0db99000]
   java.lang.Thread.State: RUNNABLE
        at org.jruby.util.collections.IntHashMap.rehash(IntHashMap.java:139)
        at org.jruby.util.collections.IntHashMap.put(IntHashMap.java:170)
        at org.jruby.java.dispatch.CallableSelector.matchingCallableArityOne(CallableSelector.java:118)
        at org.jruby.java.invokers.RubyToJavaInvoker.findCallableArityOne(RubyToJavaInvoker.java:282)
        at org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:98)
        at org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:139)
        at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:145)
        at data.apps.my_app.releases.my_app_minus_03751e0c_minus_1caa0727.lib.my_app.application.invokeOther31:time(/data/apps/my_app/releases/my_app-03751e0c-1caa0727/lib/my_app/application.rb)
        at data.apps.my_app.releases.my_app_minus_03751e0c_minus_1caa0727.lib.my_app.application.RUBY$method$md?$0(/data/apps/my_app/releases/my_app-03751e0c-1caa0727/lib/my_app/application.rb:149)
        at java.lang.invoke.LambdaForm$DMH/885284298.invokeStatic_L7_L(LambdaForm$DMH)
        at java.lang.invoke.LambdaForm$MH/2130192211.invokeExact_MT(LambdaForm$MH)
        at org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:131)
        at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:181)
        at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:197)
        at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:161)
        at data.apps.my_app.releases.my_app_minus_03751e0c_minus_1caa0727.lib.my_app.application.invokeOther3:md?(/data/apps/my_app/releases/my_app-03751e0c-1caa0727/lib/my_app/application.rb)
        at data.apps.my_app.releases.my_app_minus_03751e0c_minus_1caa0727.lib.my_app.application.RUBY$block$process$2(/data/apps/my_app/releases/my_app-03751e0c-1caa0727/lib/my_app/application.rb:182)
        at java.lang.invoke.LambdaForm$DMH/885284298.invokeStatic_L7_L(LambdaForm$DMH)
        at java.lang.invoke.LambdaForm$MH/2130192211.invokeExact_MT(LambdaForm$MH)
        at org.jruby.runtime.CompiledIRBlockBody.commonYieldPath(CompiledIRBlockBody.java:70)
        at org.jruby.runtime.IRBlockBody.yieldSpecific(IRBlockBody.java:67)
        at org.jruby.runtime.Block.yieldSpecific(Block.java:116)
        at org.jruby.ir.runtime.IRRuntimeHelpers.yieldSpecific(IRRuntimeHelpers.java:457)
        at data.apps.my_app.releases.my_app_minus_03751e0c_minus_1caa0727.lib.my_app.my_method.RUBY$method$cache$0(/data/apps/my_app/releases/my_app-03751e0c-1caa0727/lib/my_app/bdcc.rb:19)
        at java.lang.invoke.LambdaForm$DMH/664223387.invokeStatic_L8_L(LambdaForm$DMH)
        at java.lang.invoke.LambdaForm$MH/684822005.invokeExact_MT(LambdaForm$MH)
        at org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:146)
        at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:216)
        at org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:203)
        at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:209)
        at data.apps.my_app.releases.my_app_minus_03751e0c_minus_1caa0727.lib.my_app.application.invokeOther8:cache(/data/apps/my_app/releases/my_app-03751e0c-1caa0727/lib/my_app/application.rb)
        at data.apps.my_app.releases.my_app_minus_03751e0c_minus_1caa0727.lib.my_app.application.RUBY$block$process$1(/data/apps/my_app/releases/my_app-03751e0c-1caa0727/lib/my_app/application.rb:176)
        at java.lang.invoke.LambdaForm$DMH/885284298.invokeStatic_L7_L(LambdaForm$DMH)
        at java.lang.invoke.LambdaForm$MH/2130192211.invokeExact_MT(LambdaForm$MH)
        at org.jruby.runtime.CompiledIRBlockBody.commonYieldPath(CompiledIRBlockBody.java:70)
        at org.jruby.runtime.IRBlockBody.doYield(IRBlockBody.java:140)
        at org.jruby.runtime.BlockBody.yield(BlockBody.java:77)

I cannot yet reliably reproduce this, but it does reliably occur once the app has been running for a few hours.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Oct 14, 2015

Member

That's very strange. Nothing should block in that map, so it sounds like something has put it into an unstable state that loops forever. I'm not surprised you can't reproduce it.

Can you build JRuby? Perhaps you can stop by IRC tomorrow during the day US time and we'll help you look at it? It might be easiest to attach a debugger or console to see what that thread is actually doing.

Member

headius commented Oct 14, 2015

That's very strange. Nothing should block in that map, so it sounds like something has put it into an unstable state that loops forever. I'm not surprised you can't reproduce it.

Can you build JRuby? Perhaps you can stop by IRC tomorrow during the day US time and we'll help you look at it? It might be easiest to attach a debugger or console to see what that thread is actually doing.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Oct 14, 2015

Member

@kares I was just looking quickly, and I don't see any synchronization around our overload resolution logic in JI. Can you confirm?

Member

headius commented Oct 14, 2015

@kares I was just looking quickly, and I don't see any synchronization around our overload resolution logic in JI. Can you confirm?

@kares

This comment has been minimized.

Show comment
Hide comment
@kares

kares Oct 14, 2015

Member

@headius no sync indeed ... adding the IntHashMap did some improvement - e.g. eliminating some Integer allocations. there was a FIXME around the previous ConcurrentHashMap used as a signature cache which now seems wasn't valid ... some logic to handle concurrent cache updates needs to be done 805ccb8#diff-1284db960bb926e3532f0aa650c24916L39 ... as the map is currently used keys are never replaced in the map, so maybe tweaking or synchronizing the rehash logic will mostly do ?

Member

kares commented Oct 14, 2015

@headius no sync indeed ... adding the IntHashMap did some improvement - e.g. eliminating some Integer allocations. there was a FIXME around the previous ConcurrentHashMap used as a signature cache which now seems wasn't valid ... some logic to handle concurrent cache updates needs to be done 805ccb8#diff-1284db960bb926e3532f0aa650c24916L39 ... as the map is currently used keys are never replaced in the map, so maybe tweaking or synchronizing the rehash logic will mostly do ?

@kares

This comment has been minimized.

Show comment
Hide comment
@kares

kares Oct 14, 2015

Member

simplest thing would be synchronized (cache) on these puts

Member

kares commented Oct 14, 2015

simplest thing would be synchronized (cache) on these puts

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Oct 14, 2015

Member

Without an atomic solution like ConcurrentHashMap, we will need to synchronize both puts and gets, since gets could see an unstable state in the data structure if a put is active.

Member

headius commented Oct 14, 2015

Without an atomic solution like ConcurrentHashMap, we will need to synchronize both puts and gets, since gets could see an unstable state in the data structure if a put is active.

@headius headius added this to the JRuby 9.0.2.0 milestone Oct 14, 2015

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Oct 14, 2015

Member

This needs to be resolved before 9.0.2.0.

Member

headius commented Oct 14, 2015

This needs to be resolved before 9.0.2.0.

headius added a commit that referenced this issue Oct 14, 2015

Refactor Java invoker cache logic to allow encapsulation, sync.
The move to IntHashMap in RubyToJavaInvoker improved throughput
for multiple-arity calls, but since IntHashMap is not synchronized
this could cause corrupted data under concurrent load as seen in
CallableSelector to encapsulate the cache and synchronize access
to it. I believe this is a minimum effort to make the cache access
safe and fix #3394.

I have not measured the impact of this synchronization in a high
concurrency setting. It may indeed turn out to be more overhead
and a worse bottleneck than using ConcurrentHashMap with Integer
objects. Unfortunately there's no middle ground here unless we
build or adopt some concurrency-friendly int-based map.

Another heavy option would be to maintain two references to
IntHashMap in RubyToJavaInvoker, copying from a synchronized
writable copy to a read-only copy after every write. This would
double the size of the cache structures in memory but would provide
a fast, unsynchronized cache for read-mostly signature lookups.
This pattern is also easier to support with the refactoring in
this commit.
@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Oct 14, 2015

Member

@banker If you have the ability to do so, it would be a great help if you could test the code in PR #3395 and possibly also the additional patch I provided in a comment there.

See BUILDING.md for building your own JRuby or poke us on IRC today.

Member

headius commented Oct 14, 2015

@banker If you have the ability to do so, it would be a great help if you could test the code in PR #3395 and possibly also the additional patch I provided in a comment there.

See BUILDING.md for building your own JRuby or poke us on IRC today.

headius added a commit that referenced this issue Oct 14, 2015

@headius headius closed this in #3395 Oct 14, 2015

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Oct 14, 2015

Member

@banker The fixed IntHashMap usage has been pushed to master and will be in 9.0.2.

Member

headius commented Oct 14, 2015

@banker The fixed IntHashMap usage has been pushed to master and will be in 9.0.2.

@banker

This comment has been minimized.

Show comment
Hide comment
@banker

banker Oct 14, 2015

Thanks for looking into this, @headius. I'll build from master and re-run in my environment. Since the bug has taken up to 12 hours to appear, it will take some time for me to verify that the issue is resolved. I'll report back by Friday.

banker commented Oct 14, 2015

Thanks for looking into this, @headius. I'll build from master and re-run in my environment. Since the bug has taken up to 12 hours to appear, it will take some time for me to verify that the issue is resolved. I'll report back by Friday.

@banker

This comment has been minimized.

Show comment
Hide comment
@banker

banker Oct 15, 2015

Bad news, @headius. I ran one instance of my app overnight using the new code, and it's stuck again, but differently.

All 8 of my threads are BLOCKED on a call into a Java metrics library. All of the thread stacks look like this:

"Ruby-0-Thread-4: /data/apps/my_app/current/vendor/bundle/jruby/2.2.0/gems/mygem/lib/mylib/java_executor_simple.rb:22" #1
8 daemon prio=5 os_prio=0 tid=0x00007f3fa5333000 nid=0x3fa9 waiting for monitor entry [0x00007f3f7bdf9000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.jruby.java.invokers.RubyToJavaInvoker.putSignature(RubyToJavaInvoker.java:203)
        - waiting to lock <0x00000005c1ae89d0> (a org.jruby.util.collections.IntHashMap)
        at org.jruby.java.dispatch.CallableSelector.matchingCallableArityOne(CallableSelector.java:66)
        at org.jruby.java.invokers.RubyToJavaInvoker.findCallableArityOne(RubyToJavaInvoker.java:348)
        at org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:98)
        at org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:139)
        at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:145)
        at data.apps.my_app.releases.my_app_minus_03751e0c_minus_1caa0727.lib.my.app.application.invokeOther31:time(/data/apps/my_app/releases/my_app-03751e0c-1caa0727/lib/mylib/application.rb)

Let me know if I can provide any further information, and thanks again for such a quick response for far!

banker commented Oct 15, 2015

Bad news, @headius. I ran one instance of my app overnight using the new code, and it's stuck again, but differently.

All 8 of my threads are BLOCKED on a call into a Java metrics library. All of the thread stacks look like this:

"Ruby-0-Thread-4: /data/apps/my_app/current/vendor/bundle/jruby/2.2.0/gems/mygem/lib/mylib/java_executor_simple.rb:22" #1
8 daemon prio=5 os_prio=0 tid=0x00007f3fa5333000 nid=0x3fa9 waiting for monitor entry [0x00007f3f7bdf9000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.jruby.java.invokers.RubyToJavaInvoker.putSignature(RubyToJavaInvoker.java:203)
        - waiting to lock <0x00000005c1ae89d0> (a org.jruby.util.collections.IntHashMap)
        at org.jruby.java.dispatch.CallableSelector.matchingCallableArityOne(CallableSelector.java:66)
        at org.jruby.java.invokers.RubyToJavaInvoker.findCallableArityOne(RubyToJavaInvoker.java:348)
        at org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:98)
        at org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:139)
        at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:145)
        at data.apps.my_app.releases.my_app_minus_03751e0c_minus_1caa0727.lib.my.app.application.invokeOther31:time(/data/apps/my_app/releases/my_app-03751e0c-1caa0727/lib/mylib/application.rb)

Let me know if I can provide any further information, and thanks again for such a quick response for far!

@enebo enebo reopened this Oct 15, 2015

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Oct 15, 2015

Member

@banker I think this is the older code that had a lock involved, because the newest version does not have a lock (object monitor). Can you show me the -v for that JRuby?

Member

headius commented Oct 15, 2015

@banker I think this is the older code that had a lock involved, because the newest version does not have a lock (object monitor). Can you show me the -v for that JRuby?

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Oct 15, 2015

Member

Ahh actually I'm sorry...the new code does have a lock for writes, as in putSignature. But I'm very confused now why it would be hanging, since it's only a single lock, no other locks are involved, and reads should have no locking of their own.

Member

headius commented Oct 15, 2015

Ahh actually I'm sorry...the new code does have a lock for writes, as in putSignature. But I'm very confused now why it would be hanging, since it's only a single lock, no other locks are involved, and reads should have no locking of their own.

@banker

This comment has been minimized.

Show comment
Hide comment
@banker

banker Oct 15, 2015

Right. The only thing I can add is that, anecdotally, this new code is much more likely to trigger this deadlock.

banker commented Oct 15, 2015

Right. The only thing I can add is that, anecdotally, this new code is much more likely to trigger this deadlock.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Oct 15, 2015

Member

@banker Harumph. Very confusing. It sounds like there may simply be a bug in our IntHashMap at this point.

Member

headius commented Oct 15, 2015

@banker Harumph. Very confusing. It sounds like there may simply be a bug in our IntHashMap at this point.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Oct 15, 2015

Member

@banker Can you gist a full stack dump of your VM in this bad state? I would expect to see another thread inside IntHashMap that has acquired the lock. My theory at the moment is that IntHashMap just has a bug that causes it to endlessly loop, and the object monitor trace is just a side effect.

Member

headius commented Oct 15, 2015

@banker Can you gist a full stack dump of your VM in this bad state? I would expect to see another thread inside IntHashMap that has acquired the lock. My theory at the moment is that IntHashMap just has a bug that causes it to endlessly loop, and the object monitor trace is just a side effect.

@thedarkone

This comment has been minimized.

Show comment
Hide comment
@thedarkone

thedarkone Oct 15, 2015

Contributor

Yeah, I also don't see how all threads could be stuck waiting for that lock.

Contributor

thedarkone commented Oct 15, 2015

Yeah, I also don't see how all threads could be stuck waiting for that lock.

@banker

This comment has been minimized.

Show comment
Hide comment
@banker

banker Oct 15, 2015

@headius Okay, it turns out that one thread was runnable, which I've included below. Can I email you the gist? Not completely comfortable making it public.

"Ruby-0-Thread-9: /data/apps/my_app/current/vendor/bundle/jruby/2.2.0/gems/my-gem/lib/mylibjava_executor_simple.rb:22" #23 daemon prio=5 os_prio=0 tid=0x00007
f3fa533c000 nid=0x3fae runnable [0x00007f3f7b3f4000]
   java.lang.Thread.State: RUNNABLE
        at org.jruby.util.collections.IntHashMap.put(IntHashMap.java:160)
        at org.jruby.util.collections.IntHashMap.clone(IntHashMap.java:406)
        at org.jruby.java.invokers.RubyToJavaInvoker.putSignature(RubyToJavaInvoker.java:205)
        - locked <0x00000005c1ae89d0> (a org.jruby.util.collections.IntHashMap)
        at org.jruby.java.dispatch.CallableSelector.matchingCallableArityOne(CallableSelector.java:66)
        at org.jruby.java.invokers.RubyToJavaInvoker.findCallableArityOne(RubyToJavaInvoker.java:348)
        at org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:98)
        at org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:139)
        at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:145)
        at data.apps.my_app.releases.my_app_minus_03751e0c_minus_1caa0727.lib.foo.myapp.application.invokeOther31:time(/data/apps/my_app/releases/my_app-03751e0c-1caa0727/lib/mylibmyapp/application.rb)
        at data.apps.my_app.releases.my_app_minus_03751e0c_minus_1caa0727.lib.foo.myapp.application.RUBY$method$duplicate_content?$0(/data/apps/my_app/releases/my_app-03751e0c-1caa0727/lib/mylibmyapp/application.rb:149)

banker commented Oct 15, 2015

@headius Okay, it turns out that one thread was runnable, which I've included below. Can I email you the gist? Not completely comfortable making it public.

"Ruby-0-Thread-9: /data/apps/my_app/current/vendor/bundle/jruby/2.2.0/gems/my-gem/lib/mylibjava_executor_simple.rb:22" #23 daemon prio=5 os_prio=0 tid=0x00007
f3fa533c000 nid=0x3fae runnable [0x00007f3f7b3f4000]
   java.lang.Thread.State: RUNNABLE
        at org.jruby.util.collections.IntHashMap.put(IntHashMap.java:160)
        at org.jruby.util.collections.IntHashMap.clone(IntHashMap.java:406)
        at org.jruby.java.invokers.RubyToJavaInvoker.putSignature(RubyToJavaInvoker.java:205)
        - locked <0x00000005c1ae89d0> (a org.jruby.util.collections.IntHashMap)
        at org.jruby.java.dispatch.CallableSelector.matchingCallableArityOne(CallableSelector.java:66)
        at org.jruby.java.invokers.RubyToJavaInvoker.findCallableArityOne(RubyToJavaInvoker.java:348)
        at org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:98)
        at org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:139)
        at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:145)
        at data.apps.my_app.releases.my_app_minus_03751e0c_minus_1caa0727.lib.foo.myapp.application.invokeOther31:time(/data/apps/my_app/releases/my_app-03751e0c-1caa0727/lib/mylibmyapp/application.rb)
        at data.apps.my_app.releases.my_app_minus_03751e0c_minus_1caa0727.lib.foo.myapp.application.RUBY$method$duplicate_content?$0(/data/apps/my_app/releases/my_app-03751e0c-1caa0727/lib/mylibmyapp/application.rb:149)
@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Oct 16, 2015

Member

@banker Yeah go ahead, headius@headius.com. I'm also pushing a change to master to use a high-throughput, concurrency-friendly map implementation rather than using our own. It should be in nightly builds tomorrow morning.

Member

headius commented Oct 16, 2015

@banker Yeah go ahead, headius@headius.com. I'm also pushing a change to master to use a high-throughput, concurrency-friendly map implementation rather than using our own. It should be in nightly builds tomorrow morning.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Oct 16, 2015

Member

After a brief search (and some lazy Twittering) I was pointed at https://github.com/boundary/high-scale-lib, a maintained, maven-released fork of Cliff Click's high-throughput map implementations. They are fast, memory-efficient, concurrency-friendly, and non-blocking. The best part is that there's an implementation that takes a long key, so we can basically drop it in wherever we use IntHashMap.

Here's perf results.

Before (COW IntHashMap):

                                                    user     system      total        real
Control                                         0.590000   0.000000   0.590000 (  0.341363)
Math.abs with byte-ranged Fixnum                1.630000   0.010000   1.640000 (  1.379069)
Math.abs with short-ranged Fixnum               1.730000   0.080000   1.810000 (  1.533326)
Math.abs with int-ranged Fixnum                 1.700000   0.020000   1.720000 (  1.479799)
Math.abs with long-ranged Fixnum                1.660000   0.000000   1.660000 (  1.507284)
Math.abs with float-ranged Float                1.900000   0.020000   1.920000 (  1.671365)
Math.abs with double-ranged Float               1.650000   0.010000   1.660000 (  1.609894)
Contended Math.abs with double-ranged Float     3.950000   0.030000   3.980000 (  0.539143)
Control                                         0.400000   0.000000   0.400000 (  0.389647)
Math.abs with byte-ranged Fixnum                1.570000   0.000000   1.570000 (  1.560834)
Math.abs with short-ranged Fixnum               1.650000   0.010000   1.660000 (  1.617118)
Math.abs with int-ranged Fixnum                 1.660000   0.000000   1.660000 (  1.644870)
Math.abs with long-ranged Fixnum                1.680000   0.010000   1.690000 (  1.659800)
Math.abs with float-ranged Float                1.700000   0.010000   1.710000 (  1.675863)
Math.abs with double-ranged Float               1.690000   0.000000   1.690000 (  1.662213)
Contended Math.abs with double-ranged Float     3.120000   0.020000   3.140000 (  0.444798)

After (NonBlockingHashMapLong

                                                    user     system      total        real
Control                                         0.930000   0.020000   0.950000 (  0.416645)
Math.abs with byte-ranged Fixnum                1.570000   0.010000   1.580000 (  1.350059)
Math.abs with short-ranged Fixnum               1.710000   0.080000   1.790000 (  1.483133)
Math.abs with int-ranged Fixnum                 1.640000   0.020000   1.660000 (  1.422112)
Math.abs with long-ranged Fixnum                1.620000   0.010000   1.630000 (  1.445246)
Math.abs with float-ranged Float                1.950000   0.020000   1.970000 (  1.646687)
Math.abs with double-ranged Float               1.680000   0.010000   1.690000 (  1.591855)
Contended Math.abs with double-ranged Float     3.730000   0.030000   3.760000 (  0.511322)
Control                                         0.400000   0.000000   0.400000 (  0.402811)
Math.abs with byte-ranged Fixnum                1.530000   0.000000   1.530000 (  1.531421)
Math.abs with short-ranged Fixnum               1.620000   0.010000   1.630000 (  1.608175)
Math.abs with int-ranged Fixnum                 1.590000   0.010000   1.600000 (  1.574305)
Math.abs with long-ranged Fixnum                1.540000   0.000000   1.540000 (  1.518596)
Math.abs with float-ranged Float                1.640000   0.010000   1.650000 (  1.617539)
Math.abs with double-ranged Float               1.630000   0.010000   1.640000 (  1.614268)
Contended Math.abs with double-ranged Float     2.920000   0.010000   2.930000 (  0.416824)

Performance is essentially the same. I'm going to push the change, since the library only adds about 100kB to JRuby.

Member

headius commented Oct 16, 2015

After a brief search (and some lazy Twittering) I was pointed at https://github.com/boundary/high-scale-lib, a maintained, maven-released fork of Cliff Click's high-throughput map implementations. They are fast, memory-efficient, concurrency-friendly, and non-blocking. The best part is that there's an implementation that takes a long key, so we can basically drop it in wherever we use IntHashMap.

Here's perf results.

Before (COW IntHashMap):

                                                    user     system      total        real
Control                                         0.590000   0.000000   0.590000 (  0.341363)
Math.abs with byte-ranged Fixnum                1.630000   0.010000   1.640000 (  1.379069)
Math.abs with short-ranged Fixnum               1.730000   0.080000   1.810000 (  1.533326)
Math.abs with int-ranged Fixnum                 1.700000   0.020000   1.720000 (  1.479799)
Math.abs with long-ranged Fixnum                1.660000   0.000000   1.660000 (  1.507284)
Math.abs with float-ranged Float                1.900000   0.020000   1.920000 (  1.671365)
Math.abs with double-ranged Float               1.650000   0.010000   1.660000 (  1.609894)
Contended Math.abs with double-ranged Float     3.950000   0.030000   3.980000 (  0.539143)
Control                                         0.400000   0.000000   0.400000 (  0.389647)
Math.abs with byte-ranged Fixnum                1.570000   0.000000   1.570000 (  1.560834)
Math.abs with short-ranged Fixnum               1.650000   0.010000   1.660000 (  1.617118)
Math.abs with int-ranged Fixnum                 1.660000   0.000000   1.660000 (  1.644870)
Math.abs with long-ranged Fixnum                1.680000   0.010000   1.690000 (  1.659800)
Math.abs with float-ranged Float                1.700000   0.010000   1.710000 (  1.675863)
Math.abs with double-ranged Float               1.690000   0.000000   1.690000 (  1.662213)
Contended Math.abs with double-ranged Float     3.120000   0.020000   3.140000 (  0.444798)

After (NonBlockingHashMapLong

                                                    user     system      total        real
Control                                         0.930000   0.020000   0.950000 (  0.416645)
Math.abs with byte-ranged Fixnum                1.570000   0.010000   1.580000 (  1.350059)
Math.abs with short-ranged Fixnum               1.710000   0.080000   1.790000 (  1.483133)
Math.abs with int-ranged Fixnum                 1.640000   0.020000   1.660000 (  1.422112)
Math.abs with long-ranged Fixnum                1.620000   0.010000   1.630000 (  1.445246)
Math.abs with float-ranged Float                1.950000   0.020000   1.970000 (  1.646687)
Math.abs with double-ranged Float               1.680000   0.010000   1.690000 (  1.591855)
Contended Math.abs with double-ranged Float     3.730000   0.030000   3.760000 (  0.511322)
Control                                         0.400000   0.000000   0.400000 (  0.402811)
Math.abs with byte-ranged Fixnum                1.530000   0.000000   1.530000 (  1.531421)
Math.abs with short-ranged Fixnum               1.620000   0.010000   1.630000 (  1.608175)
Math.abs with int-ranged Fixnum                 1.590000   0.010000   1.600000 (  1.574305)
Math.abs with long-ranged Fixnum                1.540000   0.000000   1.540000 (  1.518596)
Math.abs with float-ranged Float                1.640000   0.010000   1.650000 (  1.617539)
Math.abs with double-ranged Float               1.630000   0.010000   1.640000 (  1.614268)
Contended Math.abs with double-ranged Float     2.920000   0.010000   2.930000 (  0.416824)

Performance is essentially the same. I'm going to push the change, since the library only adds about 100kB to JRuby.

@headius headius closed this in 22be3c6 Oct 16, 2015

@headius headius reopened this Oct 16, 2015

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Oct 16, 2015

Member

Reopening one last time. If the new collection works, then it would indicate there's a bug in IntHashMap. In that case, we should also reimpl IntHashMap in terms of NonBlockingHashMapLong in both JRuby 1.7.x and 9k.

Member

headius commented Oct 16, 2015

Reopening one last time. If the new collection works, then it would indicate there's a bug in IntHashMap. In that case, we should also reimpl IntHashMap in terms of NonBlockingHashMapLong in both JRuby 1.7.x and 9k.

@thedarkone

This comment has been minimized.

Show comment
Hide comment
@thedarkone

thedarkone Oct 16, 2015

Contributor

@headius just a note (in case you are unaware of it and end up using Cliff's hash map elsewhere in JRuby), it leaks deleted keys (only keys not values). Obviously, this might not matter depending on use case (no deletions) or, as is the case with NonBlockingHashMapLong, the keys are primitive longs/ints.

Contributor

thedarkone commented Oct 16, 2015

@headius just a note (in case you are unaware of it and end up using Cliff's hash map elsewhere in JRuby), it leaks deleted keys (only keys not values). Obviously, this might not matter depending on use case (no deletions) or, as is the case with NonBlockingHashMapLong, the keys are primitive longs/ints.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Oct 16, 2015

Member

@thedarkone That's good information, thank you. I think we'll probably still be using CHM for our Object:Object maps but this will be a good replacement for IntHashMap, which obviously seems to have some issues.

Member

headius commented Oct 16, 2015

@thedarkone That's good information, thank you. I think we'll probably still be using CHM for our Object:Object maps but this will be a good replacement for IntHashMap, which obviously seems to have some issues.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Oct 20, 2015

Member

@banker Hopefully this has fixed your issues. This change will go out in the release today. I'm going to optimistically close this.

Member

headius commented Oct 20, 2015

@banker Hopefully this has fixed your issues. This change will go out in the release today. I'm going to optimistically close this.

@headius headius closed this Oct 20, 2015

@banker

This comment has been minimized.

Show comment
Hide comment
@banker

banker Oct 20, 2015

@headius I'll build again, run, and get back to you. I hope it's fixed, too! Currently just restarting the app on a cron to preemptively avoid :-\

banker commented Oct 20, 2015

@headius I'll build again, run, and get back to you. I hope it's fixed, too! Currently just restarting the app on a cron to preemptively avoid :-\

@kares

This comment has been minimized.

Show comment
Hide comment
@kares

kares Oct 30, 2015

Member

@banker or anyone ... could you confirm its behaving properly - considering back-porting to jruby-1_7 if so

Member

kares commented Oct 30, 2015

@banker or anyone ... could you confirm its behaving properly - considering back-porting to jruby-1_7 if so

kares added a commit to kares/jruby that referenced this issue Oct 30, 2015

Use Cliff Click's high-scale collections instead of IntHashMap.
This should hopefully fix #3394.

Conflicts:
	core/pom.rb
	core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java

kares added a commit to kares/jruby that referenced this issue Nov 2, 2015

refactor (internal) Java.__jcreate! impl to use same cache as RubyToJ…
…avaInvoker-s

... introducing an internal CallableCache interface to loose coupling in CallableSelector

aligned with main fix for #3394

kares added a commit that referenced this issue Nov 2, 2015

refactor (internal) Java.__jcreate! impl to use same cache as RubyToJ…
…avaInvoker-s

... introducing an internal CallableCache interface to loose coupling in CallableSelector

aligned with main fix for #3394

kares added a commit to kares/jruby that referenced this issue Nov 11, 2015

Use Cliff Click's high-scale collections instead of IntHashMap.
This should hopefully fix #3394.

Conflicts:
	core/pom.rb
	core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java

kares added a commit to kares/jruby that referenced this issue Nov 11, 2015

refactor (internal) Java.__jcreate! impl to use same cache as RubyToJ…
…avaInvoker-s

... introducing an internal CallableCache interface to loose coupling in CallableSelector

aligned with main fix for #3394

this commit also includes pieces of da1b849 and so

Conflicts:
	core/src/main/java/org/jruby/java/dispatch/CallableSelector.java
	core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java
	core/src/main/java/org/jruby/javasupport/Java.java
@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Nov 24, 2015

Member

Also released in 1.7.23.

Member

headius commented Nov 24, 2015

Also released in 1.7.23.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment