Reduce string hash collisions #168
The best strong hash function is SipHash. An optimized implementation that runs quite quickly can be found here. SipHash is considered cryptographically secure, so it avoids all hash-collision attacks, provided that the 128-bit key is chosen at random and kept secret. If SipHash is too slow (it might be), another option is to switch the layout of tables to use a balanced binary tree to handle collisions. This mostly mitigates the problem, since the degradation is only to O(log n).
SipHash is too slow (for LuaJIT). Look at the page you've linked: every 7-byte string you push to Lua will consume 135 cycles with SSE2 on a 32-bit platform and 61 cycles on a 64-bit platform, which is a huge amount of CPU time. So it should be smarter. In #174 I'm proposing "smart hashing":
With this scheme, SipHash could be used as the fallback function. In fact, I've added an option to use a 32-bit cousin of SipHash, because it is faster on 32-bit platforms, and I wish not to use SSE. But even with this "fallback" approach, there is a measurable performance hit from using a "strong cryptographic" function instead of a "fast and dumb" simple whole-string hash. And the larger the function is, the larger the hit is. Perhaps hardware-assisted functions (using And do not forget, when you suggest SipHash as a hash-table function next time:
The discussion mixes two different issues:
In detail:
This option is notable, as PUC-Rio Lua has taken this route.
@MikePall look at #174: it keeps using the current sparse hash until a long collision chain is detected, so the usual workload is not affected. When a long collision chain is detected, only strings that are not covered by the sparse hash (i.e. longer than 12 bytes), and only in that colliding chain, are created with the full hash (in the default case). "Internal shortcuts" are handled by computing the "sparse hash" for a string that has its "full hash" computed. It looks like all "shortcuts" are for small strings, so their performance is not affected. Benchmarking shows no effect on a benchmark with no "bad" strings (case 1: unpatched 0.196 ms, patched 0.192 ms, statistically the same); it could even be faster because of the first commit (comparing hash sums while traversing the collision chain). As a bonus, the last commit adds an option, "SMART_STRINGS=2": the strong hash function is used for all strings that fall into a long collision chain, which should cover all paranoid needs. The function used is as resistant to "seed-independent collisions" as SipHash, but it is not crypto-safe, because it uses less state and produces just a 32-bit output. It is slower than SipHash on 64-bit platforms, but faster on 32-bit platforms.
A couple suggestions:
SipHash has awful performance on 32-bit platforms. This 32-bit version has all of SipHash's properties that are valuable for a hash table, and it has the same performance on either 32-bit or 64-bit.
@funny-falcon Yes, SipHash is slow on 32-bit. The solution is to write it using SIMD instructions, either using compiler intrinsics or just placing it in the assembler part of LuaJIT. The reason a 32-bit version of SipHash doesn't have SipHash's security properties is that one does not get as much nonlinearity, which is critical for security. The best 32-bit crypto hash function is Chaskey, which is not really any faster than SipHash. The best solution is to use SipHash-1-3, but to avoid computing hashes at run time as much as possible. If a table key is a literal in the source code, its hash should be computed during bytecode compilation, not after.

Saying "don't let untrusted data into an interned string" is a very bad idea, since it means that the obvious code (which almost everyone writing Lua code will use) is vulnerable to a denial-of-service attack. In fact, it goes even further: it means that one must put all untrusted data (which, in a web app, is pretty much everything not in the source code itself) into FFI data structures. That means manual memory management and no bounds checks. This is very bad. Security must always be the default.

One alternative is to switch to hash tables backed by trees or tries. This prevents the quadratic-time DoS attack, since the degradation is only to O(n log n).

@MikePall If hash tables cannot be randomized, then it is game over anyway: finding collisions ahead of time is trivial, unless you want to use 160-bit or larger hash codes. For a universal hash function, it is probably better to use a fresh key for every single table. The stream cipher ChaCha20 is a sufficiently fast PRNG.

Finally, note that none of this is needed for hashing pointers, at least ones that are not numbers. Since they represent machine addresses, they can be assumed not to be under an attacker's control. Therefore, a very simple and very fast hash function can be used for them.
@DemiMarie But SipHash's author (Jean-Philippe Aumasson) doesn't agree with you. He says the 32-bit version actually mixes bits faster than the 64-bit version, so it gives better nonlinearity. The single reason the 32-bit version is not "secure" is its smaller state and 32-bit result. That is what he said. Don't mix up cryptographic security and hash tables, please. Hash tables don't need cryptographically secure functions; they need protection from hash flooding (seed-independent collisions). A cryptographically secure hash provides protection from hash flooding, but it is really overkill. All your other suggestions are huge modifications. You are free to make a pull request. Let's write meaningful code, not meaningless words. OK?
@funny-falcon sorry! |
Here are two concrete proposals. Both keep the current data structure in the fast path, while falling back to a slower (but more robust) alternative in the slow path.
I found some hard-coded hash values in lj_cparse.c and lib_ffi.c.
My patch has a workaround (see #174).
@DemiMarie |
I like your proposal and hope that it will be merged.
An alternative workaround would be to extend GG_State to include all of
Do you (or anyone else) plan on implementing the new GC?
Maybe someone 😉 already has a working partial implementation of the new GC that also allocates the GG_State in the arena. I was thinking about an idea similar to Corsix's for the built-in strings: allocating them right after it, you could use the string's cellId in place of the reserved id.
@fsfod are you that someone 😉 ? |
Apparently HashDOS is being used against LuaJIT-powered games now? Facepunch/garrysmod-issues#3526
Hey, that's me. Yeah, here was the code that I used to generate collisions: https://github.com/gonzalezjo/ljhashdos/blob/master/gen.c |
Fixed. |
I started this discussion first on the LuaJIT mailing list:
http://www.freelists.org/post/luajit/Slowness-due-to-memory-allocation-related-problems
This was also discussed on Hacker News a few years ago:
https://news.ycombinator.com/item?id=3401900
Since then, it looks like many programming languages have fixed this issue, using e.g. randomization. PUC Lua 5.3.2 also seems to have a fix for this (I don't know exactly when it was fixed there, though). LuaJIT has no fix for this, but there are at least a few unofficial fixes that DO work:
How to reproduce (in the `lib` dir):

1. `luajit -e 'require "resty.template.microbenchmark".run(1000)'`
2. `luajit -e 'require "resty.template.microbenchmark".run(10000)'`

Now compare the results with and without the patches mentioned above. Without the patches, the `run(10000)` case will be about 10x slower than the patched run on the same test. You can also try `luajit -e 'require "resty.template.microbenchmark".run(100000)'`; 1000000 will start emitting out-of-memory errors when the 2 GB limit is hit, if you have not patched LuaJIT with some bigger-memory-space patches.