Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upReplace FNV with a faster hash function. #37229
Conversation
rust-highfive
assigned
Aatch
Oct 17, 2016
This comment has been minimized.
This comment has been minimized.
|
r? @Aatch (rust_highfive has picked a reviewer for you, use r? to override) |
This comment has been minimized.
This comment has been minimized.
|
Do we have any backing data for this algorithm? Maybe from the Firefox development process/source? Smhasher run? |
This comment has been minimized.
This comment has been minimized.
|
I forgot to mention that there is something of an explanation about this hash function in the Firefox source: https://dxr.mozilla.org/mozilla-central/source/mfbt/HashFunctions.h#74-117. I modified it from 32-bits to 64-bits by changing the multiplication factor from 0x9E3779B9 (the golden ratio in fixed point) to 0x517cc1b727220a95 (pi in fixed point). I changed it from the golden ratio to pi because the golden ratio in 64-bit fixed point is even -- see http://stackoverflow.com/questions/5889238/why-is-xor-the-default-way-to-combine-hashes#comment54810251_27952689 This hash function was introduced into Firefox in https://bugzilla.mozilla.org/show_bug.cgi?id=729940. There's very little discussion in that bug report about how it was derived. I'm happy to try Smhasher on it. But the ultimate workload for the hash function used within rustc is rustc itself, and it's clearly working well there. |
This comment has been minimized.
This comment has been minimized.
|
I think it's worth discussing a couple more things while we're at it, so we nail this for good.
|
This comment has been minimized.
This comment has been minimized.
|
I think you miswrote your second dot point... but I did some ad hoc profiling and found that the vast majority of occurrences are |
This comment has been minimized.
This comment has been minimized.
|
Did I? I can't find it. I guess I need more more coffee. Interesting, but I'm almost sure it'll show up when we eventually move everything away from siphasher (string interner for example). Siphasher is still even higher in the profiles. |
This comment has been minimized.
This comment has been minimized.
|
Fnv and SipHasher both have the property that the stream of bytes to hash is "untyped": a u16 fed as u16 or its byte representation is hashed the same way. But I don't think that the But what I do think is that a well behaved hasher must hash a slice of bytes the same way, regardless of how you split it into subslices (as long as the order is the same). That means that any whole-word optimization for |
This comment has been minimized.
This comment has been minimized.
Yeah, luckily the Hash trait doesn't impose any special streaming requirement. |
This comment has been minimized.
This comment has been minimized.
|
I got curious so I ran smhasher on the 64bit (PR) and original 32bit hashes, I had to include 2 variants of each to be able to see how both modes of the hasher behave (integral and byte-byte...) see gist for results: https://gist.github.com/arthurprs/5e57cd59586acd8c52dbb02b55711096 A few comments considering the code in the PR. Hashing integral types (write_...) The quality is really bad but it's so cheap to calculate for integral types (what rustc seems to be using fnv for) that it's still a win for the combination of the workload + hashmap implementation. I'm fairly sure that the compiler sees the 0 seed and the hash boils down to a single IMUL instruction. Hashing slices (write_usize() + write()) The write_usize(slice.len()) will be faster and the write() slower compared to fnv. So it could potentially regress those cases. I think the right way forward is to have two hashers in the rustc codebase, one general purpose-ish and another for integral types. This PR has potential for the later. |
This comment has been minimized.
This comment has been minimized.
|
@arthurps: Thank you for running these! I was about to do it myself but you've saved me the trouble. Looking at the results... whelp, there are a lot of numbers there that I don't know how to interpret, though the "FAIL" results sound bad.
Why will |
This comment has been minimized.
This comment has been minimized.
It's a 15% difference in my Intel Skylake processor, 690MB/s vs 800MB/s. You can see some rough numbers in the gist. |
This comment has been minimized.
This comment has been minimized.
Are you sure? Where does that requirement come from? I was thinking about changing |
nnethercote
force-pushed the
nnethercote:FxHasher
branch
2 times, most recently
from
e8ac705
to
7be7488
Oct 25, 2016
This comment has been minimized.
This comment has been minimized.
|
New version. I've made the following changes.
r? @arthurps: what do you think? |
This comment has been minimized.
This comment has been minimized.
|
@nnethercote I'm not sure; it's something that needs to be discussed and put into the documentation. I think it's the logical rule by the construction of |
This comment has been minimized.
This comment has been minimized.
|
To make a concrete example, imagine |
nnethercote
force-pushed the
nnethercote:FxHasher
branch
from
7be7488
to
d92fdfb
Oct 25, 2016
This comment has been minimized.
This comment has been minimized.
|
(New version removes the |
This comment has been minimized.
This comment has been minimized.
|
Looks good to me. Somebody from the core team should weight about how to move this forward. I wouldn't be worried about the Hasher having the "strict streaming" characteristic as the Hash trait is "strongly typed" and will make the same writes to hasher every time. |
This comment has been minimized.
This comment has been minimized.
|
|
nnethercote
force-pushed the
nnethercote:FxHasher
branch
from
d92fdfb
to
b83ee24
Oct 25, 2016
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
|
This comment has been minimized.
This comment has been minimized.
|
With the notable exception of @arthurprs, this is being ignored. It's a big compile speed win, the biggest one I know of, but I fear that concerns about theoretical worst cases will overwhelm the benefit that's been demonstrated widely in practice. How can we move this forward? |
pnkfelix
added
I-nominated
T-compiler
labels
Oct 31, 2016
This comment has been minimized.
This comment has been minimized.
|
(Nominated for discussion amongst compiler team; hopefully that will help it move forward...) |
This comment has been minimized.
This comment has been minimized.
|
I think the problem is that @Aatch hasn't been too active of late, so the PR went unnoticed. I have no strong opinion about what hash function we use --- basically, if it's faster, I'm for it. I'm curious if anyone has any objections. |
This comment has been minimized.
This comment has been minimized.
|
@rfcbot fcp merge I'm not sure if this merits a FCP-style decision making process, but it seems harmless enough. Maybe if everyone is in favor we can avoid the need to discuss at the meeting. =) (In any case, I'd rather if we can conduct the discussion here on the PR in advance.) |
This comment has been minimized.
This comment has been minimized.
rfcbot
commented
Nov 1, 2016
•
|
Team member @nikomatsakis has proposed to merge this. The next step is review by the rest of the tagged teams:
No concerns currently listed. Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
This comment has been minimized.
This comment has been minimized.
|
Could you have the Fnv -> Fx global rename in its own commit? |
This comment has been minimized.
This comment has been minimized.
You mean this?
Sure. I'll wait until I get full approval from the compiler team, because I have some other conflicts that I need to fix and I might as well do them later to reduce the likelihood of more conflicts afterwards. |
This comment has been minimized.
This comment has been minimized.
|
How about defining type alias |
This comment has been minimized.
This comment has been minimized.
|
Although there's no one size fits all for hashers I think it's easier to opt-out of it if necessary than the other way around. So +1 for the DefaultMap/Set. |
This comment has been minimized.
This comment has been minimized.
|
@nnethercote everybody is in favor! |
nikomatsakis
removed
the
I-nominated
label
Nov 8, 2016
nnethercote
force-pushed the
nnethercote:FxHasher
branch
from
b83ee24
to
1459d03
Nov 8, 2016
This comment has been minimized.
This comment has been minimized.
|
I rebased and split the PR into two commits: one adding FxHasher, and one converting all FnvHash instances to FxHash instances. I also remeasured and the results are similar to before.
(reddit-stress and ostn15_phf are a couple of programs that aren't in rust-benchmarks that I've been measuring.) |
nnethercote
added some commits
Nov 8, 2016
nnethercote
force-pushed the
nnethercote:FxHasher
branch
from
1459d03
to
00e48af
Nov 8, 2016
This comment has been minimized.
This comment has been minimized.
|
Ugh, this PR is so conflict-prone. |
This comment has been minimized.
This comment has been minimized.
|
@bors r+ |
This comment has been minimized.
This comment has been minimized.
|
|
eddyb
added a commit
to eddyb/rust
that referenced
this pull request
Nov 9, 2016
bors
added a commit
that referenced
this pull request
Nov 9, 2016
bors
merged commit 00e48af
into
rust-lang:master
Nov 9, 2016
1 check passed
nnethercote
deleted the
nnethercote:FxHasher
branch
Nov 10, 2016
brson
added
the
relnotes
label
Nov 15, 2016
This comment has been minimized.
This comment has been minimized.
|
@nnethercote this hash is super fast on my dataset. Here are my tests for this hash one a personal round robin hashset implementation for about 4500 u32 (unicode):
The |
This comment has been minimized.
This comment has been minimized.
|
@cbreeden I'm happy if you want to make a crate out of it. Make sure you observe the rustc license (of course) and you should probably make it clear in the docs that it's not a "well-designed" hash and so may not be suitable in all situations. Thanks. |
This comment has been minimized.
This comment has been minimized.
|
sounds good. Yeah, I got pretty lucky there, I'd say. |
This comment has been minimized.
This comment has been minimized.
|
I went ahead and decided to modify the fn write(&mut self, bytes: &[u8]) {
let mut buf = bytes;
while buf.len() >= 4 {
let n = buf.read_u32::<NativeEndian>().unwrap();
self.write_u32(n);
}
for byte in buf {
let i = *byte;
self.add_to_hash(i as usize);
}
}Testing this with a few ascii byte slices yield these results: name old ns/iter chunks ns/iter diff ns/iter diff % speedup
bench_3chars 2 3 1 50.00% x 0.67
bench_4chars 3 2 -1 -33.33% x 1.50
bench_11chars 8 5 -3 -37.50% x 1.60
bench_12chars 9 3 -6 -66.67% x 3.00
bench_23chars 21 8 -13 -61.90% x 2.62
bench_24chars 24 6 -18 -75.00% x 4.00It appears that there is a clear win for hashing any byte slice with length > 3, which I believe is the common case. For some reason there is a regression when hashing in chunks of u64. (x64 Intel i7-6600U @ 2.6 GHz, Windows 10). @nnethercote I know that you said |
This comment has been minimized.
This comment has been minimized.
|
@nnethercote nevermind, sorry for the spam. I think you were using https://github.com/rust-lang-nursery/rustc-benchmarks. I'll try it out when I get back home on a computer that can compile rustc in a reasonable amount of time. |
nnethercote commentedOct 17, 2016
Hash table lookups are very hot in rustc profiles and the time taken within
FnvHashitself is a big part of that. Although FNV is a simple hash, it processes its input one byte at a time. In contrast, Firefox has a homespun hash function that is also simple but works on multiple bytes at a time. So I tried it out and the results are compelling:(That's a stage1 compiler doing debug builds. Results for a stage2 compiler are similar.)
The attached commit is not in a state suitable for landing because I changed the implementation of FnvHasher without changing its name (because that would have required touching many lines in the compiler). Nonetheless, it is a good place to start discussions.
Profiles show very clearly that this new hash function is a lot faster to compute than FNV. The quality of the new hash function is less clear -- it seems to do better in some cases and worse in others (judging by the number of instructions executed in
Hash{Map,Set}::get).CC @brson, @arthurprs