Add (Half)SipHash PRF and integrate it into Crystal::Hasher #5104
Conversation
src/crystal/hasher.cr
Outdated
  end

  def slice(value)
    @siphash.update(value)
It's fine for Slice(UInt8), maybe okay-ish with other primitives (e.g. booleans, integers, floats), but most probably 💩 for anything else?
Yes, fine for Int::Primitive and Float::Primitive.
@akzhan, it is a debatable question:

Slice[1, 2, 3].hash.should eq(Slice[1_u64, 2_u64, 3_u64].hash)

or should_not?
We know siphash-1-3 is fine for a hash function, but do we know for sure that halfsiphash-1-3 is?

I guess this fits our need... unless someone starts using …

Ah, so halfsiphash's typical variant is 1-3? That makes it a lot clearer, thanks.

Also, this response by Jean-Philippe Aumasson on linux.kernel pretty much details the above warning:
Implements streaming versions of the siphash family of pseudo random functions, limited to a fixed result size. Digest::SipHash computes an UInt64 hash, whereas Digest::HalfSipHash computes an UInt32 hash.
Improves Crystal::Hasher to use the SipHash family of pseudo random functions for improved security of hashes; for example protecting against hash flooding DoS attacks. Since the hashes are never disclosed to a potential attacker, we rely on the faster 1-3 versions instead of the cryptographically verified 2-4 versions. Last but not least, this uses siphash-1-3 on 64-bit systems and halfsiphash-1-3 on 32-bit systems, for best overall performance.
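For orientation, here is a hypothetical usage sketch of such a streaming digest. The require path, constructor, key type and result accessor are assumptions for illustration only, not necessarily this PR's actual API (only update appears in the diffs below):

require "digest/siphash"                          # assumed require path

key = StaticArray(UInt8, 16).new { |i| i.to_u8 }  # 128-bit key, illustrative only
digest = Digest::SipHash.new(key)                 # assumed constructor
digest.update("some ".to_slice)                   # streaming input
digest.update("data".to_slice)
p digest.result                                   # assumed accessor; => UInt64 (UInt32 for Digest::HalfSipHash)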
Force-pushed from e3bb7ba to 4a4f447.
  end

  def string(value)
-   value.to_slice.hash(self)
+   @siphash.update(value)
Breaking change: "some data".hash and "some data".to_slice.hash no longer have the same hash!

String#hash will compress 8 bytes at a time ("some dat", "a"); Slice#hash will iterate each byte, cast it to UInt64, and compress each as 8 bytes ("s" => 115 => 0x0000_0000_0000_0073).

For performance of Hash(String, V) we definitely want a fast String#hash, but we also want 1_u8.hash == 1_u64.hash, yet we introduce a difference between String#hash and Bytes#hash. Maybe it's fine, maybe it's not. I don't know.
> Slice#hash will iterate each byte, casted to UInt64, and finally compressed (as 8 bytes).

Is there a performance penalty to do it byte by byte?
Yes. SipHash only compresses 8 plain bytes at once (4 bytes with halfsiphash), no more, no less, unless we finalize. If the input is smaller or not a multiple of 8 bytes (or 4 bytes with halfsiphash), we must buffer bytes until we have enough to compress, which adds to the overhead.
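To make the buffering concrete, here is a minimal self-contained sketch (not the PR's code; class and method names are made up for illustration): only full 8-byte blocks are handed to the compression step, and leftover bytes wait in a small buffer until more data arrives or we finalize.

# Illustrative block buffering for an 8-byte-block compressor.
class BlockBuffer
  BLOCK_SIZE = 8

  def initialize(&compress : Bytes -> Nil)
    @compress = compress
    @buffer = Bytes.new(BLOCK_SIZE)
    @buffered = 0
  end

  # Copy bytes into the pending buffer; compress whenever a full block is ready.
  def update(data : Bytes) : Nil
    data.each do |byte|
      @buffer[@buffered] = byte
      @buffered += 1
      if @buffered == BLOCK_SIZE
        @compress.call(@buffer) # one compression per full 8-byte block
        @buffered = 0
      end
    end
  end
end

compressed_blocks = 0
bb = BlockBuffer.new { |_block| compressed_blocks += 1 }
bb.update("some data".to_slice) # 9 bytes: "some dat" compressed, "a" stays buffered
p compressed_blocks # => 1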
Furthermore, encoding "some data" will result in compressing "some dat" then "a" (thus calling update once, then finalizing), whereas "some data".to_slice will call update 9 times (once for each byte, cast to UInt64).

Note that there is little we can do: Slice(T) may be any T, and we must properly call T#hash(hasher) for each entry.
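For reference, a rough sketch of what that generic element-wise fallback looks like (not the PR's exact code): the slice simply drives T#hash(hasher) for every element, whatever T happens to be. The optimizations discussed further down in this thread are about short-circuiting this loop for primitive element types.

struct Slice(T)
  # Generic fallback: hash every element through T#hash(hasher), whatever T is.
  def hash(hasher)
    each { |elem| hasher = elem.hash(hasher) }
    hasher
  end
end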
I don't quite think this is a "real" breaking change, as "some data" != "some data".to_slice. People can't depend on the output of hash even after 1.0. They can only depend on the invariants around ==.
Thanks for the explanation.

> Note that there is little we can do: Slice(T) may be any T, and we must properly call T#hash(hasher) for each entry.

You're right. Maybe we could (in another PR?) add an optimization for some T types, like Bytes (Slice(UInt8)). And/or, more generally, for any T that is a Number (and Bool)?
src/crystal/hasher.cr
Outdated
    alias HashType = UInt64
  {% else %}
    alias SipHash = Digest::HalfSipHash
    alias HashType = UInt32
I'm very unhappy about handling two different hash types on different architectures. Is it really unavoidable?
I should note that you must update StringPool accordingly in this case.
src/string_pool.cr
Outdated
@@ -178,6 +178,6 @@ class StringPool
    hasher = Crystal::Hasher.new
    hasher = str.to_slice(len).hash(hasher)
    # hash should be non-zero, so `or` it with high bit
-   hasher.result | 0x8000000000000000_u64
+   hasher.result.to_u64 | 0x8000_0000_0000_0000_u64
You should use 32/64-bit arithmetic here, or simply use 64-bit hashes everywhere.
This is exactly what I propose in the updated PR description.
I chose the easy solution here, but tweaking StringPool to take whatever the hash type is would be easy (except for the high bit or).
I strongly vote for the UInt64 type, just because I have unfinished work that heavily depends on it (number normalization).
Perhaps the hash type should always be a UInt64?
I modified halfsiphash to compute 64-bit hashes, which is slower (doubles finalizations), but still faster than siphash (on 32-bit systems). Some unscientific benchmarks (…)
CircleCI is broken: …
LGTM (but we need optimization for …)
Due to how Hasher#string was implemented (iterating each byte of the slice, cast to u64), the performance impact of siphash (hashing bytes directly) is contained: around 33.5µs (siphash) vs 32.5µs (current) on a 64-bit host.
So Crystal::Hasher#result always returns an UInt64 across systems. Generating 8-byte instead of 4-byte hashes doubles the finalization rounds, but is still significantly faster on 32-bit systems than Digest::SipHash.
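A hedged sketch of what an "8-byte hash" means for halfsiphash here (variable names and word order are illustrative assumptions, not the PR's code): the two 32-bit finalization words are packed into a single UInt64.

# Pack two 32-bit finalization words into one UInt64 (illustrative only).
def pack(first : UInt32, second : UInt32) : UInt64
  (first.to_u64 << 32) | second.to_u64
end

p pack(0x12345678_u32, 0x9ABCDEF0_u32) # => 0x123456789ABCDEF0 (1311768467463790320)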
Force-pushed from 735bac7 to a2af730.
For some reason the last commit didn't appear on GitHub. I amended and force-pushed it.
We need a way to optimize slices. Maybe Hasher.slice(slice)...

Either way, we should declare:

struct Hasher
  def slice(value) : self
    @siphash.update(value)
    self
  end

  def string(value)
    value.to_slice.hash(self)
  end
end

struct Slice(T)
  def hash(hasher)
    hasher.slice(self)
  end
end

and revert …
Tried already, but …

We could consider a Slice to be a binary blob, and hash it as is:

struct Slice(T)
  def hash(hasher)
    hasher.bytes Bytes.new(to_unsafe.as(UInt8*), size * sizeof(T))
  end
end

But I'm not sure; what if T is Time or LibC::SockaddrIn? Shall we iterate the slice and hash each item as …?

struct Slice(T)
  def hash(hasher)
    {% if [UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64, Bool, Float32, Float64].includes?(T) %}
      hasher.bytes Bytes.new(to_unsafe.as(UInt8*), size * sizeof(T))
    {% else %}
      super hasher
    {% end %}
  end
end
I prefer to think about Slice as a blob (maybe with a restriction to T.struct? || T.is_a?(Primitive)). Just because …
I had the wrong assumption that Slice(T) only accepted primitives and structs for T, but it accepts anything, so yeah, we should filter out references, but there doesn't seem to be …
I now think that …
I disagree. There are plenty of cases where C structs and other int primitives make sense. Please keep Slice generally generic. I'd at least need to hear some concrete arguments against it first.
Since Pointer(Struct) is possible, I suppose Slice(Struct) makes sense too, if we consider slices to be safe pointers with bounds checks. Maybe aliasing Bytes as Slice(UInt8) is the issue? It's basically the same, but conceptually maybe there is a tiny difference: a chunk of bytes vs a collection of ordered T?
I still don't see the problem, this is simply a performance enhancement, right? The typical method is simply to iterate the slice and call …
Btw, does the macro system allow us to know …
@akzhan Why? Float doesn't have inner pointers, but it still must use …
@RX14 A Slice on a T without any references can be interpreted as a raw byte array. It's usable for hashing too.
No. We can perform optimizations in some cases where we can read the slice in blocks which suit siphash, but that's all it is: an optimization.

class Test
  def ==(other : Test)
    true
  end

  def hash(hasher)
    hasher.bool(true)
  end
end

Slice(Test).new(1, Test.new).hash == Slice(Test).new(1, Test.new).hash

This should always be true, even if they're different pointers to different objects, because we defined them to be the same using ==. The simple optimization we would perform, which @ysbaddaden already wrote the code for (without float, though), preserves this property (all equal ints have one unique bit representation on a given machine), so it's valid.
Ok, let's get it merged.
@ysbaddaden you should rework the code piece at https://github.com/crystal-lang/crystal/pull/5000/files#diff-ecf42db578886535ae723622017d8e95R179 to keep …
I read the issue more closely. As I said, Slice(UInt8) is by far the most common type of slice. If we can optimize for that case then we should do it. Other slices can use a slower hash for now. That should also optimize String if we use str.to_slice.hash.
I've just profiled the compiler compiling itself: a full 10% of compiler time is spent in …
@RX14 What if you profile the compiler before we started making all these changes to …?
In my version I made a specialization only for … I don't see a reason why this shouldn't be done, at least until Crystal has a specialized primitive "string_slice". BTW, don't forget about hashing of HTTP headers.
  v1 ^= 0xdd

  DROUNDS.times { sipround }
There's no need for the whole second finalization. Just increase DROUNDS by 1, and use v1 and v3 separately, without the xor.
Given that the hash value is not going to be exposed, there is no need to be that strong.
I follow the official halfsiphash implementation to generate an UInt64. I'm not competent enough to change the algorithm.
Please, believe me at least once.
You've already mentioned that #hash should not be exposed. With this limitation my suggestion is absolutely safe. Just believe me.
"Please don't do it, it's not safe. Just believe me."

@funny-falcon Actually, I do believe you. But if two people come with contradictory arguments and each says "Please believe me", there's no way to tell who is right, and why we should believe one or the other.

Can you provide a proof that doing that is safe?
@asterite, it's your language, do what you want.
I have no degree in cryptography, but I still understand what I'm talking about.
You are just joking. Nobody uses 32-bit now, so this is irrelevant. Just add an empty loop here, and no one will complain.
I mean, no one will complain if this function is much slower than it needs to be.
I really have nothing against you! But I'm really not competent to judge.

Those changes also mean that Digest::HalfSipHash would no longer be compliant with the spec, right? Since I want to expose it to developers, the implementation should be compliant.

Since generating an UInt32 is okayish, maybe we can just return an UInt32 in Digest::HalfSipHash, and merely expand it to an UInt64 in Crystal::Hasher, so we don't have to deal with per-platform types, yet keep the algorithm compliant?
I think yes. Hash is used in the compiler, for example, and @RX14 benchmarked the compiler: with this change (I think) String#hash takes 10~15% of the time. So a fast hash function is important. But together with that we should probably optimize Hash entirely: avoid invoking #hash for small hashes (just do a linear lookup), and maybe even have a specialized Hash for the compiler where order of insertion doesn't matter.
There's just so much to do...
> and merely expand it to an UInt64 in Crystal::Hasher

I think it is a good option. A simple multiplication by a non-trivial odd number does the trick.
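A minimal sketch of that expansion (the constant is an arbitrary odd value picked for illustration, nothing from this PR):

# Widen a 32-bit halfsiphash result to 64 bits by multiplying with a
# non-trivial odd constant so the upper 32 bits get populated too.
def expand_to_u64(h32 : UInt32) : UInt64
  h32.to_u64 * 0x9E3779B1_u64 # 2654435761, an odd 32-bit constant (illustrative)
end

p expand_to_u64(0xDEADBEEF_u32) # => a UInt64 with bits spread over both halves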
I pushed a commit to optimize …

Int32.slice(1, 2, 3).hash == Int64.slice(1, 2, 3).hash
Int32.slice(1, 2, 3).hash != UInt8.slice(1, 2, 3).hash
@ysbaddaden what about this trick:

struct Crystal::Hasher
  def slice(value) : self
    return self unless value.bytesize
    if value.first.is_a?(Int::Primitive) || value.first.is_a?(Float::Primitive)
      p "fast"
      p64 = value.to_unsafe.as(UInt64*)
      size64 = value.bytesize / sizeof(UInt64)
      size64.times do |i|
        @result = @result * 31 + p64[i]
      end
      full_len = size64 * 4
      remainder = value.bytesize & 0x03
      remainder.times do |r|
        @result = @result * 31 + value.to_unsafe.as(UInt8*)[full_len + r]
      end
      self
    else
      self
    end
  end
end

p Crystal::Hasher.new.slice(Slice(UInt8).new(4)).result

Looks like the code piece that checks the Slice's T should be rewritten.
Wow. I have this benchmark:

words = ["one", "two", "hello world", "welcome", "good bye", "さようなら"]
hash = {} of String => Int32
words.each do |word|
  hash[word] = word.bytesize
end

time = Time.now
a = 0
10_000_000.times do
  words.each do |word|
    a += hash[word]
  end
end
puts Time.now - time

In Crystal 0.23.1 it takes 0.95s. In master (because we now randomize hashes) it takes 1.69s. And with this PR it takes 4.27s!

But then, the same benchmark in Ruby takes 12.98s, so it might be ok. However, in Go it takes 0.76s, so I guess we better optimize our Hash and #hash algorithms. I checked and Go doesn't seem to use siphash.
Try funny-falcon's hasher1 branch to compare :-)
On my laptop: …
All is compiled with …
With the hash function I initially crafted for Crystal.
Here are my results (Ubuntu 14.04, 64-bit), using benchmark.ips and 2165 random English words. Master is slightly slower than 0.23.1, while this branch is twice as slow: …

Note: we can most surely tweak / rework the algorithm to enable some LLVM optimizations.
Go uses as fast a function as its author could make. So, it is your choice what to use: …
I think I prefer a strong enough, fast function :-)
@funny-falcon can you send a PR with your fastest implementation? I can always get hasher1, of course. With no normalization, please.
hasher1 is almost the same as the current hasher, so it is just as dumb.
@asterite @funny-falcon I think that we …

Each step in its own PR.
I've made #5146 with a funny_hash64-like function. The fastest results above were achieved on a branch from an older master commit, with my hasher interface and a 2 * 32-bit state.
@ysbaddaden looks like this PR should be modified to not touch Crystal::Hasher directly. Just the alternative digests.
Now that the alternative hash function has been merged, I don't think there's a need for this in the standard library (it could be a useful shard, though).
It already is: https://github.com/ysbaddaden/siphash.cr
Implements streaming versions of the siphash and halfsiphash pseudo random functions as Digest::SipHash and Digest::HalfSipHash with chosen rounds.

Refactors Crystal::Hasher to use SipHash(1, 3) on 64-bit systems and HalfSipHash(1, 3) on 32-bit systems, with tweaks to optimize some performance, e.g. String#hash, but probably did 💩.

Update: the choice of siphash-1-3 and halfsiphash-1-3 stems from Linux kernel choices (see comments below).

Update: I noticed I broke the contract of Object#hash, which now returns either an UInt32 or an UInt64 depending on the platform. Maybe Digest::HalfSipHash should always compute an UInt64 to match Digest::SipHash (the algorithm allows it)?