New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[2.4] Hash key randomization, universal hashing, new Hash impl #4708

Closed
headius opened this Issue Jul 7, 2017 · 5 comments

Comments

Projects
None yet
4 participants
@headius
Copy link
Member

headius commented Jul 7, 2017

MRI has implemented a new Hash that has some innovative features:

  • Open addressing for better cache locality
  • Automatically switching over to a secure hash when a "hash DOS" attack is detected.

We currently have the original "chained buckets" implementation of Hash with no generalized hash randomization except for strings. This is open work to be done for JRuby.

See https://bugs.ruby-lang.org/issues/12142
See https://bugs.ruby-lang.org/issues/13002

Part of TBD work for 2.4 support in #4293.

headius added a commit that referenced this issue Jul 7, 2017

Exclude tests for randomized hash calculation.
We do not randomize all hash calculation, even though MRI made
the decision to do that at some point. I have opened #4708 to
track this and a possible reimplementation of our Hash to compare
to the open-adressing version introduced in MRI 2.4.
@headius

This comment has been minimized.

Copy link
Member Author

headius commented Jul 7, 2017

Tag #4687 for other unimplemented features.

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Jun 11, 2018

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Jun 11, 2018

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Jun 11, 2018

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Jul 13, 2018

Refactor RubyHash to be more performant
This commit basically implements two approaches
to improve the performance of RubyHash:

Switching from closed addressing hashing (double linked list)
to open addressing hashing because of better cache locality.
Furthermore we removed almost all RubyHashEntry objects
for smaller memory allocation (which helps for better cache
locality as well).

Further more small hashes (less than 8 entires) are now
implemented via a linear search which reduces memory
allocation for a buckets. For fast bucket skips we maintain
in this case a hashes array to cache the hash values.

Implements jruby#4708 & jruby#2989
@eregon

This comment has been minimized.

Copy link
Member

eregon commented Jul 13, 2018

Automatically switching over to a secure hash when a "hash DOS" attack is detected.

AFAIK this was removed later, as it was discovered to be a user-visible change (2 different objects with same hash and eql? could no longer find the same key in a Hash IIRC, mentioned in https://bugs.ruby-lang.org/issues/13002).

@headius

This comment has been minimized.

Copy link
Member Author

headius commented Jul 13, 2018

@eregon Thanks, makes sense.

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Jul 14, 2018

Refactor RubyHash to be more performant
This commit basically implements two approaches
to improve the performance of RubyHash:

Switching from closed addressing hashing (double linked list)
to open addressing hashing because of better cache locality.
Furthermore we removed almost all RubyHashEntry objects
for smaller memory allocation (which helps for better cache
locality as well).

Further more small hashes (less than 8 entires) are now
implemented via a linear search which reduces memory
allocation for a buckets. For fast bucket skips we maintain
in this case a hashes array to cache the hash values.

Implements jruby#4708 & jruby#2989

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Jul 16, 2018

Refactor RubyHash to be more performant
This commit basically implements two approaches
to improve the performance of RubyHash:

Switching from closed addressing hashing (double linked list)
to open addressing hashing because of better cache locality.
Furthermore we removed almost all RubyHashEntry objects
for smaller memory allocation (which helps for better cache
locality as well).

Further more small hashes (less than 8 entires) are now
implemented via a linear search which reduces memory
allocation for a buckets. For fast bucket skips we maintain
in this case a hashes array to cache the hash values.

Implements jruby#4708 & jruby#2989

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Jul 16, 2018

Implement open addressing algorithm for RubyHash
to improve the performance by leverage better
cache locality.

Switching from closed addressing hash algorithm (linked list)
to open addressing hashing because of a better cache locality
on modern CPU architectures.
Furthermore we removed almost all RubyHashEntry objects
for smaller memory allocation.
This is already implemented in MRI since 2.4, see
https://bugs.ruby-lang.org/issues/12142

Small hashes (less than 8 entries) are now
implemented via a linear search which reduces memory
allocation in this case and has almost no performance
implication. For a fast bucket skip we maintain
in this case a hashes array to cache the hash values.

Implements jruby#4708 & jruby#2989

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Jul 16, 2018

Implement open addressing algorithm for RubyHash
to improve the performance by leverage better
cache locality.

Switching from closed addressing hash algorithm (linked list)
to open addressing hashing because of a better cache locality
on modern CPU architectures.
Furthermore we removed almost all RubyHashEntry objects
for smaller memory allocation.
This is already implemented in MRI since 2.4, see
https://bugs.ruby-lang.org/issues/12142

Small hashes (less than 8 entries) are now
implemented via a linear search which reduces memory
allocation in this case and has almost no performance
implication. For a fast bucket skip we maintain
in this case a hashes array to cache the hash values.

Implements jruby#4708 & jruby#2989

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Jul 16, 2018

Implement open addressing algorithm for RubyHash
to improve the performance by leverage better
cache locality.

Switching from closed addressing hash algorithm (linked list)
to open addressing hashing because of a better cache locality
on modern CPU architectures.
Furthermore we removed almost all RubyHashEntry objects
for smaller memory allocation.
This is already implemented in MRI since 2.4, see
https://bugs.ruby-lang.org/issues/12142

Small hashes (less than 8 entries) are now
implemented via a linear search which reduces memory
allocation in this case and has almost no performance
implication. For a fast bucket skip we maintain
in this case a hashes array to cache the hash values.

Implements jruby#4708 & jruby#2989

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Jul 17, 2018

Implement open addressing algorithm for RubyHash
to improve the performance by leverage better
cache locality.

Switching from closed addressing hash algorithm (linked list)
to open addressing hashing because of a better cache locality
on modern CPU architectures.
Furthermore we removed almost all RubyHashEntry objects
for smaller memory allocation.
This is already implemented in MRI since 2.4, see
https://bugs.ruby-lang.org/issues/12142

Small hashes (less than 8 entries) are now
implemented via a linear search which reduces memory
allocation in this case and has almost no performance
implication. For a fast bucket skip we maintain
in this case a hashes array to cache the hash values.

Implements jruby#4708 & jruby#2989

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Jul 23, 2018

Implement open addressing algorithm for RubyHash
to improve the performance by leverage better
cache locality.

Switching from closed addressing hash algorithm (linked list)
to open addressing hashing because of a better cache locality
on modern CPU architectures.
Furthermore we removed almost all RubyHashEntry objects
for smaller memory allocation.
This is already implemented in MRI since 2.4, see
https://bugs.ruby-lang.org/issues/12142

Small hashes (less than 8 entries) are now
implemented via a linear search which reduces memory
allocation in this case and has almost no performance
implication. For a fast bucket skip we maintain
in this case a hashes array to cache the hash values.

Implements jruby#4708 & jruby#2989

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Jul 27, 2018

Implement open addressing algorithm for RubyHash
to improve the performance by leverage better
cache locality.

Switching from closed addressing hash algorithm (linked list)
to open addressing hashing because of a better cache locality
on modern CPU architectures.
Furthermore we removed almost all RubyHashEntry objects
for smaller memory allocation.
This is already implemented in MRI since 2.4, see
https://bugs.ruby-lang.org/issues/12142

Small hashes (less than 8 entries) are now
implemented via a linear search which reduces memory
allocation in this case and has almost no performance
implication. For a fast bucket skip we maintain
in this case a hashes array to cache the hash values.

Implements jruby#4708 & jruby#2989

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Jul 29, 2018

Implement open addressing algorithm for RubyHash
to improve the performance by leverage better
cache locality.

Switching from closed addressing hash algorithm (linked list)
to open addressing hashing because of a better cache locality
on modern CPU architectures.
Furthermore we removed almost all RubyHashEntry objects
for smaller memory allocation.
This is already implemented in MRI since 2.4, see
https://bugs.ruby-lang.org/issues/12142

Small hashes (less than 8 entries) are now
implemented via a linear search which reduces memory
allocation in this case and has almost no performance
implication. For a fast bucket skip we maintain
in this case a hashes array to cache the hash values.

Implements jruby#4708 & jruby#2989

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Aug 2, 2018

Implement open addressing algorithm for RubyHash
to improve the performance by leverage better
cache locality.

Switching from closed addressing hash algorithm (linked list)
to open addressing hashing because of a better cache locality
on modern CPU architectures.
Furthermore we removed almost all RubyHashEntry objects
for smaller memory allocation.
This is already implemented in MRI since 2.4, see
https://bugs.ruby-lang.org/issues/12142

Small hashes (less than 8 entries) are now
implemented via a linear search which reduces memory
allocation in this case and has almost no performance
implication. For a fast bucket skip we maintain
in this case a hashes array to cache the hash values.

Implements jruby#4708 & jruby#2989

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Aug 3, 2018

Implement open addressing algorithm for RubyHash
to improve the performance by leverage better
cache locality.

Switching from closed addressing hash algorithm (linked list)
to open addressing hashing because of a better cache locality
on modern CPU architectures.
Furthermore we removed almost all RubyHashEntry objects
for smaller memory allocation.
This is already implemented in MRI since 2.4, see
https://bugs.ruby-lang.org/issues/12142

Small hashes (less than 8 entries) are now
implemented via a linear search which reduces memory
allocation in this case and has almost no performance
implication. For a fast bucket skip we maintain
in this case a hashes array to cache the hash values.

Implements jruby#4708 & jruby#2989

ChrisBr added a commit to ChrisBr/jruby that referenced this issue Aug 4, 2018

Implement open addressing algorithm for RubyHash
to improve the performance by leverage better
cache locality.

Switching from closed addressing hash algorithm (linked list)
to open addressing hashing because of a better cache locality
on modern CPU architectures.
Furthermore we removed almost all RubyHashEntry objects
for smaller memory allocation.
This is already implemented in MRI since 2.4, see
https://bugs.ruby-lang.org/issues/12142

Small hashes (less than 8 entries) are now
implemented via a linear search which reduces memory
allocation in this case and has almost no performance
implication. For a fast bucket skip we maintain
in this case a hashes array to cache the hash values.

Implements jruby#4708 & jruby#2989

@kares kares added this to the JRuby 9.2.1.0 milestone Sep 7, 2018

@kares

This comment has been minimized.

Copy link
Member

kares commented Sep 7, 2018

guess we can call this a day for 9.2.1 (with @ChrisBr excellent open-addressing work)
... although the description mentions some more things such as specialization?

@ChrisBr

This comment has been minimized.

Copy link
Contributor

ChrisBr commented Sep 7, 2018

According to @eregon this was removed from CRuby later on... So yep, should be fine to close

@kares kares closed this Sep 7, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment