New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid hash value for nested arrays #3884

Closed
flash-gordon opened this Issue May 12, 2016 · 21 comments

Comments

Projects
None yet
2 participants
@flash-gordon

Environment

It seems that every version of jruby starting from at least 9k-pre1 has the following behavior.

Expected Behavior

2.3.1 :001 > [1, [1, []]].hash
 => 2397343324588010773
2.3.1 :002 > [2, [2, []]].hash
 => 4238421788679554352

Actual Behavior

jruby-9.1.0.0 :086 > [1, [1, []]].hash
 => 0
jruby-9.1.0.0 :087 > [2, [2, []]].hash
 => 0

We expect things to work in dry-validation because we use caching based on a hash of AST which in order is represented by a deeply nested array. The issue leads to returning the same result based on different outputs which surprises a lot.

Original values that you can use for testing a fix:

jruby-9.1.0.0 :001 > [[:and, [[:val, [:predicate, [:key?, [:age]]]], [:key, [:age, [:predicate, [:lt?, [23]]]]]]]].hash
 => 8481884717
jruby-9.1.0.0 :002 > [[:and, [[:val, [:predicate, [:key?, [:foo]]]], [:key, [:foo, [:predicate, [:lt?, [23]]]]]]]].hash
 => 8481884717

I don't know what exactly is going wrong but I tried to provide as minimal failing case as I could and it would supercool to have this fixed because you guys are doing a great job and having a support for jruby really matters for us, thank you!

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 12, 2016

Member

That is very strange.

I will point out that just using a hash of the AST has potential to collide. In the past I have used SHA1 or similar to reduce that risk, but they're obviously much more expensive than a simple hash calculation.

Member

headius commented May 12, 2016

That is very strange.

I will point out that just using a hash of the AST has potential to collide. In the past I have used SHA1 or similar to reduce that risk, but they're obviously much more expensive than a simple hash calculation.

@flash-gordon

This comment has been minimized.

Show comment
Hide comment
@flash-gordon

flash-gordon May 12, 2016

Thanks! I will do the math but I guess a memory leak we introduce by caching is more likely than having an issue with hash collisions :) Anyway, I'll check the probability tomorrow, just need to recover some science from my mind places 🔬

Thanks! I will do the math but I guess a memory leak we introduce by caching is more likely than having an issue with hash collisions :) Anyway, I'll check the probability tomorrow, just need to recover some science from my mind places 🔬

@headius headius closed this in 1b92064 May 12, 2016

headius added a commit that referenced this issue May 12, 2016

Add a spec for hashing a double-nested array with inner zarray.
Relates to #3884, where JRuby
was always returning the same hash for certain combinations of
nested array with an innermost empty array.

@headius headius added this to the JRuby 9.1.1.0 milestone May 12, 2016

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 12, 2016

Member

It turns out that the main issue was that we always hashed an empty array as 0, which had cascading effects for the rest of the Array#hash calculation. I went ahead and re-ported MRI's logic with some JRuby-specific tweaks.

Member

headius commented May 12, 2016

It turns out that the main issue was that we always hashed an empty array as 0, which had cascading effects for the rest of the Array#hash calculation. I went ahead and re-ported MRI's logic with some JRuby-specific tweaks.

@flash-gordon

This comment has been minimized.

Show comment
Hide comment
@flash-gordon

flash-gordon May 12, 2016

@headius you're awesome, thank you! Will it be backported and which versions are affected? In other words could you tell me what versions of jruby we should test against (starting from 9k)?

@headius you're awesome, thank you! Will it be backported and which versions are affected? In other words could you tell me what versions of jruby we should test against (starting from 9k)?

@flash-gordon

This comment has been minimized.

Show comment
Hide comment
@flash-gordon

flash-gordon May 12, 2016

btw I tested the code and it works now!

btw I tested the code and it works now!

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 12, 2016

Member

It will be in 9.1.1.0. If 1.7 exhibits the same problem we could backport there, but we only maintain one 9.x release at a time (i.e. there will never be a 9.0.6.0 now that 9.1 is out).

Member

headius commented May 12, 2016

It will be in 9.1.1.0. If 1.7 exhibits the same problem we could backport there, but we only maintain one 9.x release at a time (i.e. there will never be a 9.0.6.0 now that 9.1 is out).

@flash-gordon

This comment has been minimized.

Show comment
Hide comment
@flash-gordon

flash-gordon May 13, 2016

@headius OK. We don't support jruby 1.7, only 9k, so it's not an issue for us, feel free to decide about backporting.

@headius OK. We don't support jruby 1.7, only 9k, so it's not an issue for us, feel free to decide about backporting.

@headius headius reopened this May 13, 2016

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 13, 2016

Member

My ported hash logic appears to have broken some recursive detection, so I'm backing off that change and trying to just do the [] hash fix. I think that will be enough, but we might need you to reverify.

Member

headius commented May 13, 2016

My ported hash logic appears to have broken some recursive detection, so I'm backing off that change and trying to just do the [] hash fix. I think that will be enough, but we might need you to reverify.

@flash-gordon

This comment has been minimized.

Show comment
Hide comment
@flash-gordon

flash-gordon May 13, 2016

@headius yeah, sure. A few minutes ago I found that something is not quite right too :)

@headius yeah, sure. A few minutes ago I found that something is not quite right too :)

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 13, 2016

Member

I have a patch to master that gets specs passing again (silly bit width mistake on my part):

diff --git a/core/src/main/java/org/jruby/RubyArray.java b/core/src/main/java/org/jruby/RubyArray.java
index 6a0bb4c..1dd924f 100644
--- a/core/src/main/java/org/jruby/RubyArray.java
+++ b/core/src/main/java/org/jruby/RubyArray.java
@@ -678,7 +678,7 @@ public class RubyArray extends RubyObject implements List, RandomAccess {
     public RubyFixnum hash(ThreadContext context) {
         final Ruby runtime = context.runtime;
         int begin = RubyArray.this.begin;
-        long h = (realLength << 32) & System.identityHashCode(RubyArray.class);
+        long h = (((long) realLength) << 32) & System.identityHashCode(RubyArray.class);
         for (int i = begin; i < begin + realLength; i++) {
             h = (h << 1) | (h < 0 ? 1 : 0);
             final IRubyObject value = safeArrayRef(runtime, values, i);

Doing further testing on it locally before pushing.

Member

headius commented May 13, 2016

I have a patch to master that gets specs passing again (silly bit width mistake on my part):

diff --git a/core/src/main/java/org/jruby/RubyArray.java b/core/src/main/java/org/jruby/RubyArray.java
index 6a0bb4c..1dd924f 100644
--- a/core/src/main/java/org/jruby/RubyArray.java
+++ b/core/src/main/java/org/jruby/RubyArray.java
@@ -678,7 +678,7 @@ public class RubyArray extends RubyObject implements List, RandomAccess {
     public RubyFixnum hash(ThreadContext context) {
         final Ruby runtime = context.runtime;
         int begin = RubyArray.this.begin;
-        long h = (realLength << 32) & System.identityHashCode(RubyArray.class);
+        long h = (((long) realLength) << 32) & System.identityHashCode(RubyArray.class);
         for (int i = begin; i < begin + realLength; i++) {
             h = (h << 1) | (h < 0 ? 1 : 0);
             final IRubyObject value = safeArrayRef(runtime, values, i);

Doing further testing on it locally before pushing.

@flash-gordon

This comment has been minimized.

Show comment
Hide comment
@flash-gordon

flash-gordon May 13, 2016

OK I'll be able to run the specs again In a couple of hours
On Пт, 13 мая 2016 г. at 20:54, Charles Oliver Nutter <
notifications@github.com> wrote:

I have a patch to master that gets specs passing again (silly bit width
mistake on my part):

diff --git a/core/src/main/java/org/jruby/RubyArray.java b/core/src/main/java/org/jruby/RubyArray.java
index 6a0bb4c..1dd924f 100644--- a/core/src/main/java/org/jruby/RubyArray.java+++ b/core/src/main/java/org/jruby/RubyArray.java@@ -678,7 +678,7 @@ public class RubyArray extends RubyObject implements List, RandomAccess {
public RubyFixnum hash(ThreadContext context) {
final Ruby runtime = context.runtime;
int begin = RubyArray.this.begin;- long h = (realLength << 32) & System.identityHashCode(RubyArray.class);+ long h = (((long) realLength) << 32) & System.identityHashCode(RubyArray.class);
for (int i = begin; i < begin + realLength; i++) {
h = (h << 1) | (h < 0 ? 1 : 0);
final IRubyObject value = safeArrayRef(runtime, values, i);

Doing further testing on it locally before pushing.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
#3884 (comment)

OK I'll be able to run the specs again In a couple of hours
On Пт, 13 мая 2016 г. at 20:54, Charles Oliver Nutter <
notifications@github.com> wrote:

I have a patch to master that gets specs passing again (silly bit width
mistake on my part):

diff --git a/core/src/main/java/org/jruby/RubyArray.java b/core/src/main/java/org/jruby/RubyArray.java
index 6a0bb4c..1dd924f 100644--- a/core/src/main/java/org/jruby/RubyArray.java+++ b/core/src/main/java/org/jruby/RubyArray.java@@ -678,7 +678,7 @@ public class RubyArray extends RubyObject implements List, RandomAccess {
public RubyFixnum hash(ThreadContext context) {
final Ruby runtime = context.runtime;
int begin = RubyArray.this.begin;- long h = (realLength << 32) & System.identityHashCode(RubyArray.class);+ long h = (((long) realLength) << 32) & System.identityHashCode(RubyArray.class);
for (int i = begin; i < begin + realLength; i++) {
h = (h << 1) | (h < 0 ? 1 : 0);
final IRubyObject value = safeArrayRef(runtime, values, i);

Doing further testing on it locally before pushing.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
#3884 (comment)

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 13, 2016

Member

Sorry...I was wrong, it doesn't pass specs. I'll report back here when I have everything passing.

Member

headius commented May 13, 2016

Sorry...I was wrong, it doesn't pass specs. I'll report back here when I have everything passing.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 13, 2016

Member

Yeah for various reasons I think this is going to require a completely new recursion detection. I have a prototype that appears to be passing specs.

The new code should address #3887 too. May be a little too risky for 9.1.1.0...@enebo?

Member

headius commented May 13, 2016

Yeah for various reasons I think this is going to require a completely new recursion detection. I have a prototype that appears to be passing specs.

The new code should address #3887 too. May be a little too risky for 9.1.1.0...@enebo?

headius added a commit that referenced this issue May 14, 2016

Rewrite recursive guard and wire it up.
The old recursion guard (ported from MRI years ago) had a number
of flaws:

* It forced an object ID for every object encountered.
* It created transient objects for every recursive stack.
* It was not using identity hashing (#3887).
* It used a RubyHash internally, which has much more overhead than
  a typical JDK Map.

The new implementation largely follows the pattern of the original
but fixes all the above items. It passes all untagged specs.

See also #3884, which started this whole thing.

@headius headius closed this in 89d8af7 May 14, 2016

headius added a commit that referenced this issue May 14, 2016

Smaller fix for #3884.
The fix here is as follows:

* Always provide a non-zero hash for any array, including empty.
* Better hash combination function with less collision.

An improved recursion guard will go into the next release.
@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 14, 2016

Member

Ok, I've merged in a simpler fix for 9.1.1.0, which you could verify. You might also be interested in trying out the test-new-recursion branch, which has the improved logic.

Member

headius commented May 14, 2016

Ok, I've merged in a simpler fix for 9.1.1.0, which you could verify. You might also be interested in trying out the test-new-recursion branch, which has the improved logic.

headius added a commit that referenced this issue May 14, 2016

headius added a commit that referenced this issue May 14, 2016

headius added a commit that referenced this issue May 14, 2016

Rewrite recursive guard and wire it up.
The old recursion guard (ported from MRI years ago) had a number
of flaws:

* It forced an object ID for every object encountered.
* It created transient objects for every recursive stack.
* It was not using identity hashing (#3887).
* It used a RubyHash internally, which has much more overhead than
  a typical JDK Map.

The new implementation largely follows the pattern of the original
but fixes all the above items. It passes all untagged specs.

See also #3884, which started this whole thing.
@flash-gordon

This comment has been minimized.

Show comment
Hide comment
@flash-gordon

flash-gordon May 14, 2016

Thanks! I tested both solutions and here is what I got:

  1. Smaller fix f1e056b works fine, all specs (2K+) are green with several runs & random order.
  2. test-new-recursion fails the following example:
irb(main):017:0> [{a: 1}].hash
=> 5019126971421186946
irb(main):018:0> [{b: 1}].hash
=> 5019126971421186946

Thanks! I tested both solutions and here is what I got:

  1. Smaller fix f1e056b works fine, all specs (2K+) are green with several runs & random order.
  2. test-new-recursion fails the following example:
irb(main):017:0> [{a: 1}].hash
=> 5019126971421186946
irb(main):018:0> [{b: 1}].hash
=> 5019126971421186946
@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 16, 2016

Member

@flash-gordon I do not see that on the current branch, but I believe I had fixes go in after you tested.

[] ~/projects/jruby $ jruby -e 'p [{a: 1}].hash'
1994458792184350789

[] ~/projects/jruby $ jruby -e 'p [{b: 1}].hash'
4863057760227230173

I do have one MRI test failure to investigate.

Member

headius commented May 16, 2016

@flash-gordon I do not see that on the current branch, but I believe I had fixes go in after you tested.

[] ~/projects/jruby $ jruby -e 'p [{a: 1}].hash'
1994458792184350789

[] ~/projects/jruby $ jruby -e 'p [{b: 1}].hash'
4863057760227230173

I do have one MRI test failure to investigate.

@flash-gordon

This comment has been minimized.

Show comment
Hide comment
@flash-gordon

flash-gordon May 16, 2016

@headius you should test it in the same process

gordon@mac ~/dev/jruby [test-new-recursion] $ ./bin/jruby -e 'p [{a: 1}].hash; p [{b: 1}].hash;'
6962820890589519712
6962820890589519712

@headius you should test it in the same process

gordon@mac ~/dev/jruby [test-new-recursion] $ ./bin/jruby -e 'p [{a: 1}].hash; p [{b: 1}].hash;'
6962820890589519712
6962820890589519712
@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 16, 2016

Member

Yeah I realized that afterward, and it turns out the MRI failure was the same issue. It's now fixed on the branch and will be merged into 9.1.2.0. Feel free to verify again if you like :-)

Member

headius commented May 16, 2016

Yeah I realized that afterward, and it turns out the MRI failure was the same issue. It's now fixed on the branch and will be merged into 9.1.2.0. Feel free to verify again if you like :-)

@flash-gordon

This comment has been minimized.

Show comment
Hide comment
@flash-gordon

flash-gordon May 16, 2016

Sure. I ran specs (again, 2.2K specs) ten times with random order and had no issues, so it seems to work fine and it's very awesome, thank you!

Sure. I ran specs (again, 2.2K specs) ten times with random order and had no issues, so it seems to work fine and it's very awesome, thank you!

@flash-gordon

This comment has been minimized.

Show comment
Hide comment
@flash-gordon

flash-gordon May 16, 2016

Btw, when do you plan to release 9.1.1.0. I mean rough estimation of course.

Btw, when do you plan to release 9.1.1.0. I mean rough estimation of course.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius May 16, 2016

Member

Within the next couple days. Blocked on #3867 and related issues right now (problems activating gems inside bundle exec.

Member

headius commented May 16, 2016

Within the next couple days. Blocked on #3867 and related issues right now (problems activating gems inside bundle exec.

flash-gordon added a commit to dry-rb/dry-validation that referenced this issue May 16, 2016

Do not run specs on jruby-9000 (closes #159)
* See jruby/jruby#3884 for retails
* I'll replace it with add 9.1.1.0 release when it is available

headius added a commit that referenced this issue May 17, 2016

Rewrite recursive guard and wire it up.
The old recursion guard (ported from MRI years ago) had a number
of flaws:

* It forced an object ID for every object encountered.
* It created transient objects for every recursive stack.
* It was not using identity hashing (#3887).
* It used a RubyHash internally, which has much more overhead than
  a typical JDK Map.

The new implementation largely follows the pattern of the original
but fixes all the above items. It passes all untagged specs.

See also #3884, which started this whole thing.

headius added a commit that referenced this issue May 18, 2016

Rewrite recursive guard and wire it up.
The old recursion guard (ported from MRI years ago) had a number
of flaws:

* It forced an object ID for every object encountered.
* It created transient objects for every recursive stack.
* It was not using identity hashing (#3887).
* It used a RubyHash internally, which has much more overhead than
  a typical JDK Map.

The new implementation largely follows the pattern of the original
but fixes all the above items. It passes all untagged specs.

See also #3884, which started this whole thing.

eregon added a commit to ruby/spec that referenced this issue May 29, 2016

Add a spec for hashing a double-nested array with inner zarray.
Relates to jruby/jruby#3884, where JRuby
was always returning the same hash for certain combinations of
nested array with an innermost empty array.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment