Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace the hash function used in YJIT with a faster implementation #561

Closed
whtsht opened this issue Sep 29, 2024 · 2 comments
Closed

Replace the hash function used in YJIT with a faster implementation #561

whtsht opened this issue Sep 29, 2024 · 2 comments

Comments

@whtsht
Copy link

whtsht commented Sep 29, 2024

Issue description:

Replace the hash function used in YJIT from the default SipHash in the standard library to a faster alternative.

Motivation:

YJIT currently uses HashMap and HashSet from the standard library, which utilize SipHash by default. While SipHash is collision-resistant, it is relatively slow. Replacing it with a faster hash function, even at the cost of reduced collision resistance, may improve performance.

Task:

Concerns:

Switching to a faster, non-collision-resistant hash function may increase the risk of hash flooding attacks. To mitigate this, careful implementation and random seeding will be required.

For example, Node.js faced a hash flooding vulnerability in the past: https://v8.dev/blog/hash-flooding.

Conclusion:

I welcome feedback, especially regarding potential security concerns or implementation suggestions. Thank you.

@eregon
Copy link

eregon commented Sep 29, 2024

For reference, the 2 CRuby CVE about HashDoS:

And the move to SipHash-1-3: https://bugs.ruby-lang.org/issues/13017

The second CVE means just MurmurHash + a random seed wasn't enough, at least for hashing keys of user-facing Hash.
The link there seems no longer accessible but there is https://web.archive.org/web/20240401163550/2012.appsec-forum.ch/conferences/#c17

At 28c3, Klink and Waelde demonstrated that a number of technologies (PHP, .NET, Ruby, Java, etc.) remained vulnerable to the decade-old hash-flooding DoS attacks. These attacks work by enforcing worst-case insert time in hash tables by sending many inputs hashing to the same value (a “multicollision”). Many vendors fixed the issue by replacing the weak deterministic hash function with stronger and randomized hash functions. In this presentation, we will show examples of such stronger randomized hash functions that fail to protect against hash-flooding, by presenting “universal multicollision” attacks based on differential cryptanalysis techniques.
We will present demos showing how to exploit these attacks to DoS a Ruby on Rails application, as well as the latest Java OpenJDK; two technologies that chose to “fix” hash-flooding by using the MurmurHash hash functions. Finally, we will describe a reliable fix to hash-flooding with the SipHash family of pseudorandom functions: SipHash provides the adequate cryptographic strength to mitigate hash-flooding, yet is competitive in performance with the non-cryptographic hashes.

And I think that's why we still have quite expensive hash functions in CRuby these days and not just a random seed.

FWIW Java or at least HotSpot chose to fix differently: java.lang.String#hashCode is still fully deterministic to this day but the hashtables implementations like java.util.HashMap switch internally to a tree-like comparison Map when they detect too many hash collisions.

@whtsht
Copy link
Author

whtsht commented Sep 29, 2024

Using SipHash seems reasonable.

  • Many languages ​​use SipHash as a countermeasure against universal multicollisions attacks (Node.js also uses it)

  • It works fast for small inputs

My research was insufficient. Thank you @eregon for your opinion.

I will close this issue.

@whtsht whtsht closed this as completed Sep 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants