Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce string hash collisions #168

Closed
bungle opened this issue Apr 28, 2016 · 22 comments
Closed

Reduce string hash collisions #168

bungle opened this issue Apr 28, 2016 · 22 comments

Comments

@bungle
Copy link

bungle commented Apr 28, 2016

I started this discussion first on LuaJIT mailing list:
http://www.freelists.org/post/luajit/Slowness-due-to-memory-allocation-related-problems

This was also discussed on Hacker News a few years ago:
https://news.ycombinator.com/item?id=3401900

Since then it looks like many of the programming languages have fixed this issue using e.g. randomization. PUC Lua 5.3.2 also seems to have fix for this (don't know exactly when it was fixed there, though). LuaJIT has no fix for this, but there are at least a few unofficial fixes that DO work:

How to reproduce:

  1. Download this: https://github.com/bungle/lua-resty-template
  2. Run luajit -e 'require "resty.template.microbenchmark".run(1000)' (in lib dir)
  3. Run luajit -e 'require "resty.template.microbenchmark".run(10000)' (in lib dir)

Now compare the results with and without the patches mentioned above. Without patches the 3. will be about 10x slower than the patched run on the same test (you can try also luajit -e 'require "resty.template.microbenchmark".run(100000)' – 1000000 will start emitting out of memory when 2 GB limit is hit if you have not patched LuaJIT with some bigger memory space patches).

@DemiMarie
Copy link

The best strong hash function is SipHash. An optimized implementation that runs quite quickly can be found here. SipHash is considered to be cryptographically secure, so it avoids all hash collision attacks provided that the 128-bit key is chosen at random and kept secret.

If SipHash is too slow (it might be) another option is to switch the layout of tables to use a balanced binary tree to handle collisions. This mostly mitigates the problem since the degradation is only to O(log n).

@funny-falcon
Copy link

SipHash is too slow (for LuaJIT).

Look to the page you've linked: every 7byte string you push to Lua will consume 135 cycle with SSE2 on 32bit platform and 61 cycle on 64bit platform - it is huge amount of CPU time.

So it should be smarter. In #174 I'm proposing "smart hashing":

  • use current hash most of time
  • fallback to "harder" hash on collisions

With this scheme, SipHash could be used as fallback function. In fact, I've add an option to use 32bit cousin to SipHash, cause it is faster on 32bit platform, and I wish not use SSE.

But even with this "fallback" approach, it is measurable performance hit to use "strong cryptographic" function instead of "fast and dump" simple whole string hash. And larger function is, larger the hit is.

Perhaps, hardware assisted functions (using aesenc hardware instruction) may give good quality function with affordable performance. But, given Mike doesn't want to use platform dependent functions in this place, and even multiplication is forbidden, it will be hard to push it into code base.

And do not forget, when you will suggest SipHash as a function for hash table next time:

  • SipHash13 is just enough for this use case, Jean-Philippe Aumasson (author of SipHash) confirms that
  • SipHash24 is recommended for MAC - Message Authentication Code - ie when you sign a message, and attacker may see this sign.

@MikePall
Copy link
Member

MikePall commented May 6, 2016

The discussion mixes two different issues:

  1. Cryptographically secure hashing.
  2. Improving the string hash quality.

In detail:

  1. Cryptographically secure hashing is an explicit non-goal.
    1. It's simply unattainable for the core data structure of LuaJIT.
    2. A weakened variant is pointless, since it can still be exploited.
    3. Corollary: if you're operating in a context where maliciously crafted data might affect your service, then make sure this doesn't end up as an interned string.
  2. Improving the hash quality may be a potentially interesting goal.
    1. But it has to be weighed against the resulting performance impact. Changing a sparse, near constant-time hash to a dense, linear-time hash is a drastic step.
    2. NB: hashing has to be deterministic, due to a couple of internal shortcuts.
    3. To evaluate such a change, it's not helpful and likely misleading to use benchmarks that only employ synthetic strings. It's easy to construct collisions for any non-cryptographically secure hash. But that doesn't say anything about the impact on the average case or the likeliness of the worst case (still) occuring.
  3. There are more options:
    1. Don't intern long strings. This has wider reaching consequences for the whole VM and is a non-trivial change.
    2. Use a different data structure, such as a parallel binary search. Needs very careful evaluation.
    3. Combine both options.

@MikePall MikePall changed the title String hash collision Reduce string hash collisions May 6, 2016
@daurnimator
Copy link

Don't intern long strings. This has wider reaching consequences for the whole VM and is a non-trivial change.

This option is notable as PUC-Rio lua has taken this route.

@funny-falcon
Copy link

@MikePall look at #174 - it keeps using current sparse hash until long collision chain is detected. So that, usual workload is not affected.

When long collision chain is detected, only strings that are not covered by sparse hash (ie longer than 12bytes) and only for this collisioned chain are created with full hash (for default case).

"internal shortcuts" are handled by computed "sparse hash" for a string, that has "full hash" computed. It looks like, all "shortcuts" are for small strings, so that, their performance is not affected.

Benchmark shows no effect on benchmark with no "bad" strings (case 1: unpatched 0.196ms , patched 0.192ms - statistically same) (it could be even faster cause of first commit - compare hashsum while traversing collision chain).

As a bonus, last commit adds option "SMART_STRINGS=2": strong hash function used for all strings that fall in long collision chain - it will cover all paranoid needs. Used function is as resistant to "Seed Independent Collisions" as SipHash, but it is not crypto-safe, cause uses less state and produce just 32bit output. It is slower than SipHash on 64bit platform, but faster on 32bit platform.

@DemiMarie
Copy link

A couple suggestions:

  • Instead of a weakened SipHash, just use SipHash-1-3 and truncate the output if necessary. Crypto algorithms are brittle – you can't change them and expect any guarantees unless you are a cryptographer yourself.
  • The table I am most worried about is the string intern table. While it might be okay for LuaJIT's ordinary tables to not be resistant to HashDOS, if the intern table is vulnerable then all strings must be stored in FFI data structures. That means that – unless one is willing to risk bugs leading to arbitrary code execution due to violations of memory safety – one must pay an extra indirection (to keep the FFI data structures in the metatable hidden), and one loses compatibility with most libraries.
  • Judy arrays would be a good choice. Unfortunately the only implementation is LGPL and it doesn't allow strings with NUL bytes in them.

@funny-falcon
Copy link

SipHash has awful performance on 32bit platform. This 32bit version has all SipHash's properties valuable for hash table, and it has same perfornance either on 32bit or 64bit.

@DemiMarie
Copy link

DemiMarie commented Jun 26, 2016

@funny-falcon Yes, SipHash is slow on 32 bit. The solution is to write it using SIMD instructions, either using compiler intrinsics or just placing it in the assembler part of LuaJIT.

The reason why a 32-bit version of SipHash doesn't have SipHash's problems is that one does not get as much nonlinearity, which is critical for security. The best 32-bit crypto hash function is Chaskey, which is not really any faster than SipHash.

The best solution is to use SipHash-1-3, but to avoid computing hashes at run time as much as possible. If a table key is a literal in the source code, its hash should be computed during bytecode compilation, not after. Neither x.a = b, y = x.c, nor x['alpha'] = b, nor q = x['beta'] should ever cause a hash recomputation at runtime. String interning should be lazy: a string's hash code should not be computed until it is used as a hashtable key and/or compared to another string – and, in the latter case, the strings should be compared first; if they are equal, only one needs to be hashed.

Saying "don't let untrusted data in an interned string" is a very bad idea, since it means that the obvious code (that almost everyone writing Lua code will use) is vulnerable to a denial of service attack. In fact, it goes even further: it means that one must put all untrusted data (which, in a web app, is pretty much everything not in the source code itself) in FFI datastructures. That means manual memory management and no bounds checks. This is very bad. Security must always be the default.

One alternative is to switch to hash tables backed by trees or tries. This prevents the quadratic time DoS attack, since the degradation is only to O(n log n).

@MikePall If hash tables cannot be randomized, then it is game over anyway – finding collisions ahead of time is trivial. Unless you want to use 160-bit or beyond hash codes.

For a universal hash function, it is probably better to use a fresh key for every single table. The stream cipher ChaCha20 is a sufficiently fast PRNG.

Finally, note that none of this is needed for hashing pointers, at least ones that are not numbers. Since they represent machine addresses, they can be assumed to not be under an attacker's control. Therefore, a very simple hash function can be used that is very fast.

@funny-falcon
Copy link

funny-falcon commented Jun 27, 2016

The reason why a 32-bit version of SipHash doesn't have SipHash's problems is that one does not get as much nonlinearity, which is critical for security.

@DemiMarie But SipHash's author (Jean-Philippe Aumasson) doesn't agree with you. He says, 32bit version actually mixes bits faster than 64bit version so gives better non-linearity. The single reason why 32bit version is not "secure" is smaller state and 32bit result. That is what he said.

Don't mix crypto-security and hash tables, please. Hash tables doesn't need cryptographically secure functions. They need protection from hash flood (seed independent collisions). Cryptographically secure hash provides protection from hash flood, but it is really overkill.

All your other suggestions are huge modifications. You are free to make a pull request. Lets write meaningfull code, not meaningless words. Ok?

@DemiMarie
Copy link

@funny-falcon sorry!

@DemiMarie
Copy link

Here are two concrete proposals. Both keep the current data structure in the fast path, while falling back to a slower (but more robust) alternative in the slow path.

  1. Instead of using linear probing to back the hash table, use a tree. This solves most of the DoS problem – a slowdown to O(n log(n)) is likely to be unexploitable.
  2. Use a 2 level hash, with a (cryptographically) strong hash function as the backup hash. This solves the problem entirely.

@strrchr
Copy link

strrchr commented Aug 19, 2016

I found some hard coded hash values in lj_cparse.c and lib_ffi.c.
If I change the hash algorithm for string interning, I must change these hard coded hash values too.
Horrible!
I suggest use strcmp, simple!
Then the string hash algorithm will be simply configured.

@funny-falcon
Copy link

funny-falcon commented Aug 19, 2016

My patch has work around ( see #174 )

@funny-falcon
Copy link

@DemiMarie
One proposal has implementation and bechmark ( #174 ).
Other is just theoretical suggestion.

@DemiMarie
Copy link

I like your proposal and hope that it will be merged.

On Aug 19, 2016 7:00 AM, "Sokolov Yura" notifications@github.com wrote:

@DemiMarie https://github.com/DemiMarie
One proposal has implementation and bechmark ( #174
#174 ).
Other is just theoretical suggestion.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#168 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AGGWBwqZR_bFATLgc9rxcvpVEbmQ_cPWks5qhYzmgaJpZM4IRxfJ
.

@corsix
Copy link

corsix commented Aug 19, 2016

An alternative workaround would be to extend GG_State to include all of
these strings, and then identify said strings based on their offset
relative to g rather than based on their (fast) hash value. The same
inclusion-in-GG_State/offset trick could be used for lexically reserved
identifiers instead of the current GCstr::reserved field, which could be
beneficial in the context of the New GC (as abolishing the GCHeader would
leave no convenient space for the GCstr::reserved field, and inclusion in
the GG_State block would be a nice alternative to the LJ_GC_FIXED flag).

On Fri, Aug 19, 2016 at 11:58 AM, Sokolov Yura notifications@github.com
wrote:

My patch has work around.


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#168 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAAGhknUo4HKaIQtizAQMzcvJ3BgGYE1ks5qhYxxgaJpZM4IRxfJ
.

@DemiMarie
Copy link

Do you (or anyone else) plan on implementing the new GC?

On Aug 19, 2016 6:31 PM, "Peter Cawley" notifications@github.com wrote:

An alternative workaround would be to extend GG_State to include all of
these strings, and then identify said strings based on their offset
relative to g rather than based on their (fast) hash value. The same
inclusion-in-GG_State/offset trick could be used for lexically reserved
identifiers instead of the current GCstr::reserved field, which could be
beneficial in the context of the New GC (as abolishing the GCHeader would
leave no convenient space for the GCstr::reserved field, and inclusion in
the GG_State block would be a nice alternative to the LJ_GC_FIXED flag).

On Fri, Aug 19, 2016 at 11:58 AM, Sokolov Yura notifications@github.com
wrote:

My patch has work around.


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#168 (comment),
or mute
the thread
<https://github.com/notifications/unsubscribe-auth/
AAAGhknUo4HKaIQtizAQMzcvJ3BgGYE1ks5qhYxxgaJpZM4IRxfJ>
.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#168 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AGGWB75esqLDvTuabT_i_O3hd27YLtPfks5qhi6vgaJpZM4IRxfJ
.

@fsfod
Copy link

fsfod commented Aug 20, 2016

Maybe someone 😉 has already working partial implementation of the new GC. That also allocates the GG_State in the arena and was thinking about the a similar idea to Corsix's for built-in strings allocating them after it you could use the strings cellId in place of reserved id.
I'm not the biggest fan of losing the gct field from objects though because it will no longer be possible to enumerate the objects in an arena based on block bits and for fixed strings the ctype system also creates them you would have to keep them all in a big hash table instead of a dense cellId list per arena.

@DemiMarie
Copy link

@fsfod are you that someone 😉 ?

@thegrb93
Copy link

thegrb93 commented Jul 9, 2018

Apparently HashDOS is being used against luajit powered games now? Facepunch/garrysmod-issues#3526

@gonzalezjo
Copy link

Hey, that's me. Yeah, here was the code that I used to generate collisions: https://github.com/gonzalezjo/ljhashdos/blob/master/gen.c

@MikePall
Copy link
Member

Fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants