Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change Siphash to use one of the faster variants of the algorithm (Siphash13, Highwayhash) #29754

Closed
funny-falcon opened this issue Nov 10, 2015 · 77 comments
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@funny-falcon
Copy link

Jean-Philippe Aumasson answered on "can Siphash13 used instead of Siphash24 for hash function of hash tables":

we proposed SipHash-2-4 as a (strong) PRF/MAC, and so far no attack whatsoever has been found,
although many competent people tried to break it. However, fewer rounds may be sufficient and I would
be very surprised if SipHash-1-3 introduced weaknesses for hash tables.

So that, there is no reason to burn more cpu power on redundant rounds.

@funny-falcon funny-falcon changed the title Use lightweighter Siphash variant: Siphash13 instead of Siphash24 Use light Siphash variant: Siphash13 instead of Siphash24 Nov 10, 2015
@bstrie
Copy link
Contributor

bstrie commented Nov 10, 2015

Sounds like it would only take a two-line change to test the perf impact of this change. @gankro, did you have some hashtable benchmark scripts lying around?

@arthurprs
Copy link
Contributor

That'd be interesting. Sub.

@Gankra
Copy link
Contributor

Gankra commented Nov 10, 2015

You can fork http://cglab.ca/~abeinges/blah/hash-rs/ if you want to do some ok-ish comparisons.

@bluss is my goto expert on siphash.

@Gankra
Copy link
Contributor

Gankra commented Nov 10, 2015

Note that direct consumers of SipHash may break if they decided to manually seed it and store the results. So this isn't a totally free change.

@bluss
Copy link
Member

bluss commented Nov 11, 2015

Aumasson is the cryptographer, maybe we can ask him what he thinks about Rust going for SipHash-1-3 specifically? I'm curious what people have learned about SipHash in general now that it's been out for several years.

If we weaken it so much that it doesn't prevent hash flooding anymore, then we lose the reason to use SipHash and could use a faster hash function.

I know I tried long ago and noted that SipHash-1-2 didn't even pass the Smhasher testsuite (which is probably bad). (SipHash-N-M is for N hash rounds per 8 bytes of input and M fixed rounds as finalization).

@funny-falcon
Copy link
Author

Siphash-1-3 passes Smhasher. It is smallest configuration, which has good avalanche.

Siphash is protected from "seed independent collisions" with any configuration with good avalanche.

Siphash24 is recomended for MAC cause when you do MAC you put result in public, so there are other vectors of attack: plain text, chosen plain text, differential, etc. As cryptographers, Siphash authors simply recommend more rounds than minimal required to protect against future investigations.

Within hash-table result of hash-function is not published, so there is no way to recover seed. Then protection from "seed independent collisions" is just enough.

Note that direct consumers of SipHash may break if they decided to manually seed it and store the results. So this isn't a totally free change.

Agree. There should be separate SipHash13 for internal/hash-table usage, and SipHash (24) remains for direct/external usage.

@funny-falcon
Copy link
Author

Aumasson is the cryptographer, maybe we can ask him what he thinks about Rust going for SipHash-1-3 specifically?

I asked him year ago about using SipHash13 for hash tables. Answer is in issue text.

I think, we can ask him again.

@tarcieri
Copy link
Contributor

If @veorq feels its fine then it should be fine. The worst case is there's a hashDoS against Siphash13, in which case you can upgrade to a stronger hash function.

@bstrie
Copy link
Contributor

bstrie commented Nov 12, 2015

@gankro, how likely do you think this is to break code in the wild? I was under the impression that we considered the precise underlying default hash function to be an implementation detail (with the caveat that you can count on it to be cryptographically secure).

@tarcieri
Copy link
Contributor

@bstrie from a security perspective, I'd say you definitely should not couple to the hash function in any way. Java made that mistake (the hashing algorithm was included in the Java Language Specification) and that made it so they were not able to adequately respond to hashDoS.

@bstrie
Copy link
Contributor

bstrie commented Nov 12, 2015

@tarcieri That's actually tied in with what I'm implying here. Supposing that we did switch to 1-3, and that an attack on 1-3 was later discovered, we'd want to reserve the right to bump it back up to 2-4 (or, you know, whatever the state of the art is by then) without risking massive breakage in the ecosystem. If changing the default hash function should prove to be disruptive at the current point in time then that's a separate problem that we need to consider, hence why I'm curious as to the true magnitude of such a change.

Furthermore, I get the impression that the breakage that Gankro's envisioning wouldn't be detectable by Crater.

@bstrie bstrie added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Nov 12, 2015
@bluss
Copy link
Member

bluss commented Nov 12, 2015

Using 1-3 doesn't fix the fundamental slowness of our hashing, but it will instead speed up the long data case. The short data case remains a problem, due to (I think):

  • Lack of one-shot hashing, means the code visits branches to read and or update the partially written tail
  • Strings and slices write both data and length / delimiter, which is slow since it enters SipHash::write twice, hitting the code in the first point.

Of course we should still go for it, if possible, but I don't think it's a substitute for allowing faster alternatives to be used (application may choose to sacrifice hash flooding defence, deciding it's not useful for them).

@tarcieri
Copy link
Contributor

@bstrie it seems when Java was last affected by hashDoS they ultimately ended up making a breaking change to the language specification:

http://mail.openjdk.java.net/pipermail/core-libs-dev/2012-May/010238.html

@ranma42
Copy link
Contributor

ranma42 commented Nov 12, 2015

Did anybody experiment adding padding to 8 bytes in SipHash?
In particular, if we padded whatever is input to 8 bytes, it would be possible to remove the ntail handling code here

if self.ntail != 0 {
needed = 8 - self.ntail;
if length < needed {
self.tail |= u8to64_le!(msg, 0, length) << 8 * self.ntail;
self.ntail += length;
return
}
let m = self.tail | u8to64_le!(msg, 0, needed) << 8 * self.ntail;
self.v3 ^= m;
compress!(self.v0, self.v1, self.v2, self.v3);
compress!(self.v0, self.v1, self.v2, self.v3);
self.v0 ^= m;
self.ntail = 0;
}

and
self.tail = u8to64_le!(msg, i, left);
self.ntail = left;

@veorq
Copy link

veorq commented Nov 12, 2015

SipHash designer here, haven't changed my opinion about SipHash-1-3 :-)

A warning, though: SipHash-1-3 leaves 4 rounds between the last attacker-controlled input and the output. There's a "distinguisher" on 4 rounds in https://eprint.iacr.org/2014/722.pdf, or in simplest terms a statistical bias that shows up given a specific difference pattern in the input of the 4-round sequence. But you can't inject that pattern in SipHash-1-3 because you don't control all the state. And even if you could inject that pattern the bias wouldn't be exploitable anyway.

@arthurprs
Copy link
Contributor

@bluss yeah, sooner of later that will need to be addressed as well.

@bstrie
Copy link
Contributor

bstrie commented Nov 12, 2015

@veorq, thanks for your input!

@bluss, indeed I don't believe anybody here is suggesting this would obviate the desired ability to provide an alternative hashing algorithm.

@bstrie
Copy link
Contributor

bstrie commented Nov 12, 2015

@ranma42, would you be so kind as to open an issue for that? It likely deserves its own discussion and shouldn't get tangled up with this one.

@tarcieri, do you know of any experience reports from the fallout of that change in the broader Java ecosystem?

@arthurprs
Copy link
Contributor

I benchmarked it with @gankro machinery, it definitely improves the speed. See https://i.imgur.com/5dKecOW.jpg

As we got confirmation it's secure enough it looks like a worthy improvement.

@Gankra
Copy link
Contributor

Gankra commented Nov 12, 2015

Yeah, looks good. Only worry is people "inappropriately" depending on the algorithm being stable.

@arthurprs
Copy link
Contributor

Can't we include this variant as SipHasher13 and set it as the default hasher? I don't see how this can break anybodys code. Am I missing something?

@bluss
Copy link
Member

bluss commented Nov 13, 2015

I don't think we should change std::hash::SipHash, since it's stable and documented to be SipHash-2-4. We can keep the new hasher private for now.

@bstrie
Copy link
Contributor

bstrie commented Nov 14, 2015

@bluss I can imagine support for an RFC to deprecate std::hash::SipHash and replace it with std::hash::SipHash24 and std::hash::SipHash13.

@bstrie
Copy link
Contributor

bstrie commented Nov 14, 2015

@gankro, I'm leaning towards saying that this warrants an RFC, unless you think breakage will be essentially nanoscule. (Did we really document that our SipHash variant is specifically 2-4, and did we make it sound like that's guaranteed? If so, that's worth changing too.)

@bluss
Copy link
Member

bluss commented Nov 14, 2015

I think our HashMap is completely free to pick its hash function. Changing it should not be a problem.

@Gankra
Copy link
Contributor

Gankra commented Nov 14, 2015

SipHash 2-4 is, as far as I can tell, "the" SipHash, so it's totally sound for it to be exposed as SipHash. Having WeakSipHash or whatevs should be fine.

@arthurprs
Copy link
Contributor

@briansmith sounds like a good idea to me.

@ticki
Copy link
Contributor

ticki commented May 27, 2016

I've been working on trhasher, which can be used to benchmark such an implementation.

@funny-falcon
Copy link
Author

funny-falcon commented May 27, 2016

@briansmith cause SipHash-1-3 is already secure enough for any usage in hash table.
More over, it is still not broken for any other usage, so at this moment of time, it is still cryptographically secure.
It is better to use "faster implementation of SipHash-1-3", than "faster implementation of SipHash-2-4".

bors added a commit that referenced this issue Jul 1, 2016
hashmap: use siphash-1-3 as default hasher

Also exposes `SipHash13` and `SipHash24` in `core::hash::sip`, for those that want to differentiate.

For motivation, see this quote from the original issue:

> we proposed SipHash-2-4 as a (strong) PRF/MAC, and so far no attack whatsoever has been found,
although many competent people tried to break it. However, fewer rounds may be sufficient and I would
be very surprised if SipHash-1-3 introduced weaknesses for hash tables.

This keeps a type alias of `SipHasher` to `SipHash24`, and since the internal default hasher of HashMap is specified as "not specified", changing it should not be a breaking change.

Closes #29754
funny-falcon added a commit to funny-falcon/ruby that referenced this issue Jul 9, 2016
…hash24

SipHash13 is secure enough to be used in hash-tables,
and SipHash's author confirms that.
Rust already considered switch to SipHash13:
  rust-lang/rust#29754 (comment)
Jean-Philippe Aumasson confirmation:
  rust-lang/rust#29754 (comment)
Merged pull request:
  rust-lang/rust#33940
funny-falcon added a commit to funny-falcon/ruby that referenced this issue Jul 10, 2016
…hash24

SipHash13 is secure enough to be used in hash-tables,
and SipHash's author confirms that.
Rust already considered switch to SipHash13:
  rust-lang/rust#29754 (comment)
Jean-Philippe Aumasson confirmation:
  rust-lang/rust#29754 (comment)
Merged pull request:
  rust-lang/rust#33940
funny-falcon added a commit to funny-falcon/ruby that referenced this issue Jul 10, 2016
…hash24

SipHash13 is secure enough to be used in hash-tables,
and SipHash's author confirms that.
Rust already considered switch to SipHash13:
  rust-lang/rust#29754 (comment)
Jean-Philippe Aumasson confirmation:
  rust-lang/rust#29754 (comment)
Merged pull request:
  rust-lang/rust#33940
funny-falcon added a commit to funny-falcon/ruby that referenced this issue Jul 10, 2016
…hash24

SipHash13 is secure enough to be used in hash-tables,
and SipHash's author confirms that.
Rust already considered switch to SipHash13:
  rust-lang/rust#29754 (comment)
Jean-Philippe Aumasson confirmation:
  rust-lang/rust#29754 (comment)
Merged pull request:
  rust-lang/rust#33940
funny-falcon added a commit to funny-falcon/ruby that referenced this issue Jul 12, 2016
…hash24

SipHash13 is secure enough to be used in hash-tables,
and SipHash's author confirms that.
Rust already considered switch to SipHash13:
  rust-lang/rust#29754 (comment)
Jean-Philippe Aumasson confirmation:
  rust-lang/rust#29754 (comment)
Merged pull request:
  rust-lang/rust#33940
funny-falcon added a commit to funny-falcon/ruby that referenced this issue Jul 12, 2016
…hash24

SipHash13 is secure enough to be used in hash-tables,
and SipHash's author confirms that.
Rust already considered switch to SipHash13:
  rust-lang/rust#29754 (comment)
Jean-Philippe Aumasson confirmation:
  rust-lang/rust#29754 (comment)
Merged pull request:
  rust-lang/rust#33940
funny-falcon added a commit to funny-falcon/ruby that referenced this issue Sep 5, 2016
…hash24

SipHash13 is secure enough to be used in hash-tables,
and SipHash's author confirms that.
Rust already considered switch to SipHash13:
  rust-lang/rust#29754 (comment)
Jean-Philippe Aumasson confirmation:
  rust-lang/rust#29754 (comment)
Merged pull request:
  rust-lang/rust#33940
funny-falcon added a commit to funny-falcon/ruby that referenced this issue Sep 26, 2016
…hash24

SipHash13 is secure enough to be used in hash-tables,
and SipHash's author confirms that.
Rust already considered switch to SipHash13:
  rust-lang/rust#29754 (comment)
Jean-Philippe Aumasson confirmation:
  rust-lang/rust#29754 (comment)
Merged pull request:
  rust-lang/rust#33940
funny-falcon added a commit to funny-falcon/ruby that referenced this issue Sep 26, 2016
…hash24

SipHash13 is secure enough to be used in hash-tables,
and SipHash's author confirms that.
Rust already considered switch to SipHash13:
  rust-lang/rust#29754 (comment)
Jean-Philippe Aumasson confirmation:
  rust-lang/rust#29754 (comment)
Merged pull request:
  rust-lang/rust#33940
funny-falcon added a commit to funny-falcon/ruby that referenced this issue Sep 27, 2016
…hash24

SipHash13 is secure enough to be used in hash-tables,
and SipHash's author confirms that.
Rust already considered switch to SipHash13:
  rust-lang/rust#29754 (comment)
Jean-Philippe Aumasson confirmation:
  rust-lang/rust#29754 (comment)
Merged pull request:
  rust-lang/rust#33940
funny-falcon added a commit to funny-falcon/ruby that referenced this issue Nov 2, 2016
…hash24

SipHash13 is secure enough to be used in hash-tables,
and SipHash's author confirms that.
Rust already considered switch to SipHash13:
  rust-lang/rust#29754 (comment)
Jean-Philippe Aumasson confirmation:
  rust-lang/rust#29754 (comment)
Merged pull request:
  rust-lang/rust#33940
funny-falcon added a commit to funny-falcon/ruby that referenced this issue Nov 4, 2016
…hash24

SipHash13 is secure enough to be used in hash-tables,
and SipHash's author confirms that.
Rust already considered switch to SipHash13:
  rust-lang/rust#29754 (comment)
Jean-Philippe Aumasson confirmation:
  rust-lang/rust#29754 (comment)
Merged pull request:
  rust-lang/rust#33940
funny-falcon added a commit to funny-falcon/ruby that referenced this issue Nov 5, 2016
…hash24

SipHash13 is secure enough to be used in hash-tables,
and SipHash's author confirms that.
Rust already considered switch to SipHash13:
  rust-lang/rust#29754 (comment)
Jean-Philippe Aumasson confirmation:
  rust-lang/rust#29754 (comment)
Merged pull request:
  rust-lang/rust#33940
funny-falcon added a commit to funny-falcon/ruby that referenced this issue Dec 8, 2016
SipHash13 is secure enough to be used in hash-tables,
and SipHash's author confirms that.
Rust already considered switch to SipHash13:
  rust-lang/rust#29754 (comment)
Jean-Philippe Aumasson confirmation:
  rust-lang/rust#29754 (comment)
Merged pull request:
  rust-lang/rust#33940
hsbt pushed a commit to ruby/ruby that referenced this issue Jan 20, 2017
SipHash13 is secure enough to be used in hash-tables,
and SipHash's author confirms that.
Rust already considered switch to SipHash13:
  rust-lang/rust#29754 (comment)
Jean-Philippe Aumasson confirmation:
  rust-lang/rust#29754 (comment)
Merged pull request:
  rust-lang/rust#33940

From: Sokolov Yura aka funny_falcon <funny.falcon@gmail.com>
Date: Thu, 8 Dec 2016 20:31:29 +0300
Signed-off-by: Urabe, Shyouhei <shyouhei@ruby-lang.org>
Fixes: [Feature #13017]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57382 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code.
Projects
None yet
Development

No branches or pull requests