runtime: Extend Go's map crypto hash guarantee to all platforms in 1.5 #9365
It would be much better to extend Go's map crypto hash guarantee to all platforms.
With all due respect I think the proposed new fallback hash function is a mistake. Seriously, I was about to make a "Merry Christmas" post on go-nuts thanking the Go Authors for using a crypto quality hash to implement maps on X86-64. As many other languages struggle with their hash functions being hacked with differential crypto, Go is in much better shape.
My proposal is SipHash.
On that page note "C++ program to find universal (key-independent) multicollisions for CityHash64"
I wonder how many users have Go programs and APIs that accept uncontrolled input over the web and pump that into a map? The Go team has an opportunity here to protect users from themselves starting with 1.5.
There is already an X86_64 optimized version of SipHash here and it's in the "public domain".
I'll try to post some numbers in a few days showing that SipHash could be a viable replacement for the fallback hash.
FWIW, I am working on a Go project to offer a large number of hash functions along with testing and benchmarking under one project here:
I took SMHasher and aeshash code from the Go runtime. There is also a proposal for a new, non-streaming hash interface, and more. It's still in pretty rough shape, as I've been hacking on it every day. The benchmarking that it does is currently kind of broken and suffers from using a generic adapter function. I am working on that.
I really hope Go 1.5 can extend map's crypto guarantee to all platforms.
I think he means (strong) collision resistant.
If the OP could demonstrate that it's possible for an adversary to generate
I'd rather implement AES hash for power64 and the upcoming arm64 port,
OP, could you be more specific about platforms that are exposed? You mention that amd64 is not affected, but then leave the loop open.
Do you mean arm, i386, or power64?
I am not a crypto expert. But I believe that the Go runtime is somewhat resistant to this kind of attack because every map uses an individual hash seed that is chosen randomly at run time. Since an attacker who is not on the local machine has very limited visibility into map lookup times, I think it would be quite difficult to run such an attack remotely.
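To make the per-map seed idea concrete, here is a minimal sketch. It assumes the standard library's `hash/maphash` package, which postdates this thread and is used here only as a stand-in for a seeded hash; it is not the runtime's map hash. The helper name `hashTwice` is hypothetical.

```go
package main

import (
	"fmt"
	"hash/maphash"
)

// hashWith hashes key under the given seed. hash/maphash is an assumption
// of this sketch: it illustrates seeding, not the runtime's map internals.
func hashWith(seed maphash.Seed, key string) uint64 {
	var h maphash.Hash
	h.SetSeed(seed)
	h.WriteString(key)
	return h.Sum64()
}

// hashTwice (hypothetical helper) hashes key twice under one random seed
// and once under a second, independently chosen seed.
func hashTwice(key string) (first, again, other uint64) {
	a, b := maphash.MakeSeed(), maphash.MakeSeed()
	return hashWith(a, key), hashWith(a, key), hashWith(b, key)
}

func main() {
	first, again, other := hashTwice("Content-Type")
	fmt.Printf("seed A: %#x\n", first)
	fmt.Printf("seed A: %#x (stable under the same seed)\n", again)
	fmt.Printf("seed B: %#x (a fresh seed gives an unrelated value)\n", other)
}
```

The values change on every run, which is the point: an attacker who cannot observe the seed cannot precompute which bucket a key lands in, provided the hash does not have seed-independent collisions.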
(Title changed from "Extend Go's map crypto hash guarantee to all platforms in 1.5" to "runtime: Extend Go's map crypto hash guarantee to all platforms in 1.5" on Dec 17, 2014.)
I am also not a crypto expert. The slides below show the hash flood attacks are independent of seed for CityHash and Murmur3.
Here is an example I just ran in HashLand that demonstrates seed independent collisions, what they call "multicollisions", for murmur3, 32 bit:
Just to be clear these keys are strings stored in a file in utf8/hex format and I converted them to
The source code that generated these keys is referenced above.
That's not what we did. AesHash is not a crypto quality hash. Despite its name, it has no crypto guarantees. Its only relation to AES is that it co-opts assembly instructions that were originally designed to make AES faster. AesHash's main features are that it is fast and that it passes SMHasher, a good (non-cryptographic) hash test suite. A nice-to-have additional feature is that it can fold in some process startup randomness to thwart DoS attacks (in addition to the per-map random seed that Ian mentioned).
I intend to add some startup randomness to the new hash functions once they are in. It should only require a judicious xor or two to add that in.
SipHash-2-4 is about 4x slower on small keys and 10x slower on large keys than the new functions for 1.5. (Comparing C versions of both)
I appreciate you clearing up the fact that aesHash is not a crypto quality hash. Unfortunately, "now matters are worse."
I still feel strongly that Go should seriously consider using a crypto quality hash for the map implementation in 1.5. As the code and slides referenced above demonstrate, using a seed with CityHash and Murmur3 is completely broken. Pick any seed you want, it doesn't make any difference. Yet, you still mention the "per-map random seed". It almost seems as if you are claiming you can thwart "hash-flooding DoS" attacks with your new hash code by doing a "judicious xor or two" combined with a random seed based on "some process startup randomness"? Is that your claim? If so, I suggest you write it up and present a paper for peer review. Why not send it to the SipHash authors for review? I am serious. Better to be proactively picked apart than to cower after you've been hacked.
One fine point about SipHash. If SipHash is parameterized (the benchmarked version below is not) without a performance loss and Go's map were ever hacked, binaries could be patched to increase c and/or d (# rounds) and the attacks could be instantly mitigated.
My AesHash vs SipHash benchmarks are very different from the numbers you mention. Mine are Go 1.4 w/ asm numbers. Only the hash function is called and the result is not saved. In my numbers SipHash starts off 1.5 times slower for 4-byte keys and ends up 2.1 times slower with 1k keys. My machine is an Intel i7-2860QM @ 2.5 GHz. Code is all in HashLand but you have to recompile to get these results. Results below. Assume for a second my numbers are correct.
I understand that everyone is motivated to increase the performance of Go. Me too, that's why I wrote #9337. Creating a new language is more than winning the benchmark wars. Go isn't going to win that one anyway.
Perhaps SipHash could be further optimized using AVX or AVX2 instructions. Perhaps the number of d rounds could be decreased. Maybe there's a faster crypto hash out there. Google should expend some of its considerable resources to develop the next great high performance crypto hashing algorithm, pay some others to do so, or hold a $10M contest to find one.
Go can mitigate "hash-flooding DoS" attacks with a 2X performance hit to map. I admit that a 2X speed decrease is a big deal. Users would complain. Let's face the facts. Many users will continue to pump uncontrolled keys directly from the web into maps and put themselves, their employers, and maybe even our country, at risk. Look at the headlines today ("Sony"). Look at the slides I pointed to above where the SipHash authors pick apart Python, Java, and others. It takes real leadership to slow the map implementation down 1.5-2.1x to increase Go's security. Security is more important than it ever was. Performance is less important than it ever was. Please adjust your priorities. I beg you to reconsider.
I couldn't get any useful information from the slides, but there is a good paper at https://131002.net/siphash/siphash.pdf . It basically argues that timing information is always available, even across the network. The basic attack on an HTTP server is to send many headers, attempting to get them to collide in such a way that hash lookups become less efficient. The Go net/http package is somewhat resistant to this kind of attack, because it limits the size of the headers in an incoming HTTP request. The default limit is 1MB. So an attack on the Go net/http package has only a limited number of strings to play with. And each new HTTP request gets a new map, with a different seed, so the earlier timing information is useless. So I don't see a serious attack on Go's HTTP server here.
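For readers who want to tighten the header limit mentioned above, a minimal sketch follows. `MaxHeaderBytes` and `DefaultMaxHeaderBytes` are real `net/http` names; the address, the chosen limit, and the helpers `newServer` and `effectiveLimit` are illustrative assumptions.

```go
package main

import (
	"fmt"
	"net/http"
)

// newServer returns a server whose request-header budget is set explicitly.
// The address is a placeholder chosen for this sketch.
func newServer(maxHeaderBytes int) *http.Server {
	return &http.Server{
		Addr:           "localhost:8080",
		MaxHeaderBytes: maxHeaderBytes,
	}
}

// effectiveLimit (hypothetical helper) reports the limit the server will
// apply: a zero MaxHeaderBytes means http.DefaultMaxHeaderBytes is used.
func effectiveLimit(srv *http.Server) int {
	if srv.MaxHeaderBytes == 0 {
		return http.DefaultMaxHeaderBytes
	}
	return srv.MaxHeaderBytes
}

func main() {
	fmt.Println("default:", effectiveLimit(newServer(0)))
	fmt.Println("tightened:", effectiveLimit(newServer(64<<10)))
}
```

A smaller limit shrinks the number of header keys a single request can contribute, which narrows the room for a flooding attempt against the per-request header map.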
An attack requires a single map stored across multiple requests, such that each new request can add entries to the map, with no limit on the number of entries, and where timing information is available for each new request. When that is the case, an attacker can cause the map to flood. In the case of Go's implementation this will mean a long series of overflow buckets for the same hash value, shifting map lookup time, and, perhaps more importantly, map insertion time, from O(1) to O(N).
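The O(1) to O(N) degradation can be seen in a toy model. The sketch below is not Go's runtime map: it is a hypothetical chained table with a deliberately weak hash (a byte sum), so that colliding keys are trivial to construct. All names are illustrative.

```go
package main

import "fmt"

const nBuckets = 64

// weakHash sums the key's bytes, so any two anagrams collide.
func weakHash(key string) uint32 {
	var h uint32
	for i := 0; i < len(key); i++ {
		h += uint32(key[i])
	}
	return h
}

// table is a toy chained hash table standing in for a real map.
type table struct {
	buckets [nBuckets][]string
}

// insert adds key if it is absent and returns how many key comparisons
// the insertion needed.
func (t *table) insert(key string) int {
	b := &t.buckets[weakHash(key)%nBuckets]
	probes := 0
	for _, k := range *b {
		probes++
		if k == key {
			return probes
		}
	}
	*b = append(*b, key)
	return probes
}

// collidingKey returns the i-th of n distinct keys that share one weakHash
// value: n bytes of 'a' with a single 'b' at position i.
func collidingKey(i, n int) string {
	b := make([]byte, n)
	for j := range b {
		b[j] = 'a'
	}
	b[i] = 'b'
	return string(b)
}

// floodCost inserts n colliding keys and returns the total comparisons.
func floodCost(n int) int {
	var t table
	total := 0
	for i := 0; i < n; i++ {
		total += t.insert(collidingKey(i, n))
	}
	return total
}

func main() {
	for _, n := range []int{100, 200, 400} {
		fmt.Printf("n=%d colliding keys: %d comparisons\n", n, floodCost(n))
	}
}
```

Because every key lands in the same chain, the i-th insertion scans i existing entries, so n insertions cost n(n-1)/2 comparisons in total: doubling the number of keys roughly quadruples the work.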
It's difficult for me to judge the risks here. In a modern server, a single map that retains entries across multiple requests does not seem to be a likely case. A single map that accepts new entries with no bounds also seems unlikely, and such a map would seem to be susceptible to much simpler attacks. But it is certainly possible. The question at hand is whether to slow down the map implementation for everybody to protect against this unlikely but possible case.
Another possible approach would be to change the hash seed each time the hashmap grows. That would require sometimes computing the hash key twice while map growth was in progress. I think we already rehash each key as we evacuate an old bucket, so I don't think it would cost anything there. I think this would significantly reduce the scope of any attack, as the seed would have to be recomputed each time the hash map grows.
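A rough sketch of the reseed-on-grow idea, under stated assumptions: this is a toy string set, not the runtime map, it rehashes everything at once rather than evacuating buckets incrementally, and it borrows `hash/maphash` (a later standard library package) as its seeded hash. The type and function names are hypothetical.

```go
package main

import (
	"fmt"
	"hash/maphash"
)

// reseedingSet is a toy chained set that picks a fresh seed on every grow.
type reseedingSet struct {
	seed    maphash.Seed
	buckets [][]string
	n       int // number of stored keys
	grows   int // number of times the table has grown (and reseeded)
}

func newReseedingSet() *reseedingSet {
	return &reseedingSet{seed: maphash.MakeSeed(), buckets: make([][]string, 8)}
}

// index maps key to a bucket under the current seed and table size.
func (s *reseedingSet) index(key string) int {
	var h maphash.Hash
	h.SetSeed(s.seed)
	h.WriteString(key)
	return int(h.Sum64() % uint64(len(s.buckets)))
}

// Has reports whether key is in the set.
func (s *reseedingSet) Has(key string) bool {
	for _, k := range s.buckets[s.index(key)] {
		if k == key {
			return true
		}
	}
	return false
}

// grow doubles the table and rehashes every key under a new seed, so bucket
// placements learned against the old seed no longer apply.
func (s *reseedingSet) grow() {
	old := s.buckets
	s.seed = maphash.MakeSeed()
	s.buckets = make([][]string, 2*len(old))
	for _, b := range old {
		for _, k := range b {
			i := s.index(k)
			s.buckets[i] = append(s.buckets[i], k)
		}
	}
	s.grows++
}

// Add inserts key, growing first when the average chain length reaches 4.
func (s *reseedingSet) Add(key string) {
	if s.Has(key) {
		return
	}
	if s.n >= 4*len(s.buckets) {
		s.grow()
	}
	i := s.index(key)
	s.buckets[i] = append(s.buckets[i], key)
	s.n++
}

// fill builds a set holding n distinct demo keys.
func fill(n int) *reseedingSet {
	s := newReseedingSet()
	for i := 0; i < n; i++ {
		s.Add(fmt.Sprintf("key-%d", i))
	}
	return s
}

func main() {
	s := fill(1000)
	fmt.Println("keys:", s.n, "reseeds:", s.grows, "has key-999:", s.Has("key-999"))
}
```

Since growth already rehashes each key, the new seed costs nothing extra in this toy version. An incremental implementation would need to track which seed applies to not-yet-evacuated buckets, which is the "computing the hash twice" cost mentioned above.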
Anyhow, khr can make a real decision here. I'm having a hard time translating from the theoretical attacks on things like caching DNS servers to practical attacks on real Go programs, but there could well be aspects of this that I am missing.
It is not a surprise that hashes not designed for DoS resistance have no DoS resistance. For instance, CityHash does f(g(msg), seed). Collisions in g cause collisions in CityHash independent of seed. My conclusion: don't fold in the seed like that if you want DoS resistance.
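The f(g(msg), seed) structure can be illustrated with a toy example. The sketch below is not CityHash: `g` is a deliberately weak, unseeded byte sum chosen so collisions are easy to see, and `f` is a finalizer in the style of Murmur3's 64-bit mixer. All function names are hypothetical.

```go
package main

import "fmt"

// g is an unseeded inner hash. It is intentionally weak (a byte sum), so
// any two anagrams collide in g.
func g(msg string) uint64 {
	var h uint64
	for i := 0; i < len(msg); i++ {
		h += uint64(msg[i])
	}
	return h
}

// fmix64 is a bijective 64-bit finalizer in the style of Murmur3's mixer.
func fmix64(x uint64) uint64 {
	x ^= x >> 33
	x *= 0xff51afd7ed558ccd
	x ^= x >> 33
	x *= 0xc4ceb9fe1a85ec53
	x ^= x >> 33
	return x
}

// outerSeeded folds the seed in only after g has consumed the message,
// i.e. it has the shape f(g(msg), seed).
func outerSeeded(msg string, seed uint64) uint64 {
	return fmix64(g(msg) ^ seed)
}

// collideForAllSeeds reports whether a and b hash identically under every
// one of the trial seeds.
func collideForAllSeeds(a, b string, trials int) bool {
	for s := 0; s < trials; s++ {
		seed := fmix64(uint64(s)) // spread the trial seeds over 64 bits
		if outerSeeded(a, seed) != outerSeeded(b, seed) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(collideForAllSeeds("listen", "silent", 1000))  // true
	fmt.Println(collideForAllSeeds("listen", "listens", 1000)) // false
}
```

However strong `f` is, it only ever sees `g(msg)`, so two messages that collide in `g` collide under every seed. An attacker can therefore compute colliding keys offline without knowing the seed.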
Yes, and there is an existence proof. SipHash does exactly that with 4 xors.
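For reference, the four XORs are the initialization step that mixes the 128-bit key into SipHash's state before any message byte is read. The following is an unoptimized sketch written from the published SipHash description, with the round counts c and d left as parameters; it is not the runtime's code or the optimized version benchmarked in this thread.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// sipRound is one SipRound over the four 64-bit state words.
func sipRound(v0, v1, v2, v3 uint64) (uint64, uint64, uint64, uint64) {
	v0 += v1
	v1 = bits.RotateLeft64(v1, 13)
	v1 ^= v0
	v0 = bits.RotateLeft64(v0, 32)
	v2 += v3
	v3 = bits.RotateLeft64(v3, 16)
	v3 ^= v2
	v0 += v3
	v3 = bits.RotateLeft64(v3, 21)
	v3 ^= v0
	v2 += v1
	v1 = bits.RotateLeft64(v1, 17)
	v1 ^= v2
	v2 = bits.RotateLeft64(v2, 32)
	return v0, v1, v2, v3
}

// sipHash computes SipHash-c-d of msg under the key (k0, k1).
func sipHash(c, d int, k0, k1 uint64, msg []byte) uint64 {
	// The four XORs: the key enters the state before any message data.
	v0 := k0 ^ 0x736f6d6570736575
	v1 := k1 ^ 0x646f72616e646f6d
	v2 := k0 ^ 0x6c7967656e657261
	v3 := k1 ^ 0x7465646279746573

	// compress absorbs one 64-bit word with c rounds.
	compress := func(m uint64) {
		v3 ^= m
		for i := 0; i < c; i++ {
			v0, v1, v2, v3 = sipRound(v0, v1, v2, v3)
		}
		v0 ^= m
	}

	n := len(msg)
	for ; len(msg) >= 8; msg = msg[8:] {
		compress(binary.LittleEndian.Uint64(msg))
	}
	// Final word: leftover bytes (little-endian) plus length mod 256 on top.
	last := uint64(n) << 56
	for i, b := range msg {
		last |= uint64(b) << (8 * uint(i))
	}
	compress(last)

	// Finalization: flip a byte of v2, then run d rounds.
	v2 ^= 0xff
	for i := 0; i < d; i++ {
		v0, v1, v2, v3 = sipRound(v0, v1, v2, v3)
	}
	return v0 ^ v1 ^ v2 ^ v3
}

// sipHash24 is the SipHash-2-4 parameter choice discussed in the thread.
func sipHash24(k0, k1 uint64, msg []byte) uint64 {
	return sipHash(2, 4, k0, k1, msg)
}

func main() {
	// Test vector from the SipHash paper: key bytes 00..0f, message 00..0e.
	msg := make([]byte, 15)
	for i := range msg {
		msg[i] = byte(i)
	}
	fmt.Printf("%#x\n", sipHash24(0x0706050403020100, 0x0f0e0d0c0b0a0908, msg))
}
```

Because the key is in the state from the first round onward, every message word is mixed with key-dependent material, unlike the f(g(msg), seed) shape. Keeping c and d as parameters also shows how the round counts could be raised without changing the structure.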
That's something we should figure out one way or another. I found a performance bug in my implementation (byte instead of word loads for siphash). After fixing it I get more like 2.6x/6.7x (short/long) compared to the new hash algorithm and 2.9x/12.6x compared to aeshash.
My testbed is inside the SMHasher test suite, instructions to grab it and try it yourself are below.
Equating DoS attacks with attacks like these doesn't help your argument.
As Ian said, no one is claiming that DoS attacks aren't worth defending against. The question is one of cost to every other user of maps. Also, DoS defenses don't have to be perfect. A low-cost solution that makes DoS attacks 100x harder may be better than a high-cost solution that eliminates them entirely.
Finally, note that the new fallback hash that started this whole conversation is 1e9x better than the old fallback hash. So, progress.
That does not sound like a good thing to do, IMHO. Maps should work best for the most generic cases, anything else can be implemented elsewhere. I am not a crypto expert but I think this issue is about finding a way to improve maps in generic cases -- not adding special ones.
@slimsag It's worth pointing out that what I am recommending is an optional argument that is analogous to the optional "cap" argument when using the make built-in to allocate a slice.
In other words, the feature isn't something developers have to interact with in general unless they explicitly want to use the feature.
In my opinion, I don't think every map in a user's program would always benefit more from the security added by a crypto hash than it would from the performance it must sacrifice in order to use a more computationally expensive hash.
Also, it's worth noting that allowing us to override the hash function could allow us to return a map value's hash alongside the actual value in a map lookup statement, (e.g
Modifying the language for optional security feature won't work well, because that ultimately
The author of a package might think that a certain map in this package is not going to hold
Even if we had such a feature of optionally using hardened hash functions for certain maps,
That said, I don't think there is any evidence that the current hash functions are insecure