Improve `make_hash` function #20464

ranma42 · 2015-01-03T09:55:59Z

The make_hash function is used to prevent hashes of non-empty
buckets to collide with EMPTY_HASH = 0u64. Ideally this function
also preserve the uniform distribution of hashes and is cheap to
compute.

The new implementation reduces the input hash size by one bit, simply
by setting the most significant bit. This obviously prevent output
hashes to collide with EMPTY_HASH and guarantees that the uniform
distribution is preserved. Moreover, the new function is simpler (no
comparisons, just an OR) and (under the same assumptions as the old
function, i.e. only the least significant bit will contribute to the
bucket index) no additional collisions are caused.

The `make_hash` function is used to prevent hashes of non-empty buckets to collide with `EMPTY_HASH = 0u64`. Ideally this function also preserve the uniform distribution of hashes and is cheap to compute. The new implementation reduces the input hash size by one bit, simply by setting the most significant bit. This obviously prevent output hashes to collide with `EMPTY_HASH` and guarantees that the uniform distribution is preserved. Moreover, the new function is simpler (no comparisons, just an OR) and (under the same assumptions as the old function, i.e. only the least significant bit will contribute to the bucket index) no additional collisions are caused.

rust-highfive · 2015-01-03T09:56:09Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way Github handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see CONTRIBUTING.md for more information.

ranma42 · 2015-01-03T10:00:27Z

I did some benchmarking on x86-64 (by stripping the call to the hasher and comparing minimal make_hash functions that simply use the old vs new policy). Iterating over 0..2047u64 gives:

running 2 tests
test b_make_hash_2048     ... bench:      1129 ns/iter (+/- 45)
test b_make_hash_new_2048 ... bench:       900 ns/iter (+/- 58)

On 32-bits platforms I would expect a possibly bigger improvement, because LLVM should be able to notice that the OR operation on the low word is a no-op (while to compare with EMPTY_BUCKET both words had to be examined).

alexcrichton · 2015-01-03T19:34:18Z

cc @gankro, @pczarn, curious about your opinions on this.

Gankra · 2015-01-03T19:58:11Z

Do these performance changes actually make it into the actual HashMap benches?

Gankra · 2015-01-03T20:02:48Z

Functionality-wise this seems like a reasonable change. It should have no practical effect on truncated hash-collisions, but will double the likelihood of a "true" hash collision from "very unlikely" to "still very unlikely".

pczarn · 2015-01-03T23:27:55Z

The old implementation was basically a single 64-bit cmov.

Pros of the new approach:

Definitely better on 32-bit platforms.
A little simpler.

I agree with @gankro, there are no other pros or cons in practice.

ranma42 · 2015-01-04T04:54:45Z

@gankro I do not expect these performance changes (0.1 ns every time a hash is computed?) to be measurable in practical applications. The main reason I proposed this change is to simplify the function and to make it more obvious that the distribution remains uniform after making the hash "safe".

Improve `make_hash` function Reviewed-by: Gankro, Gankro

rust-highfive assigned alexcrichton Jan 3, 2015

bors added a commit that referenced this pull request Jan 4, 2015

Merge pull request #20464 from ranma42/improve-make-hash

05abc65

Improve `make_hash` function Reviewed-by: Gankro, Gankro

bors referenced this pull request Jan 4, 2015

rustc: allow paths in constants to refer to polymorphic items.

9a90d43

bors merged commit 28cca28 into rust-lang:master Jan 5, 2015

ranma42 deleted the improve-make-hash branch January 17, 2015 15:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Improve `make_hash` function #20464

Improve `make_hash` function #20464

Uh oh!

ranma42 commented Jan 3, 2015

Uh oh!

rust-highfive commented Jan 3, 2015

Uh oh!

ranma42 commented Jan 3, 2015

Uh oh!

alexcrichton commented Jan 3, 2015

Uh oh!

Gankra commented Jan 3, 2015

Uh oh!

Gankra commented Jan 3, 2015

Uh oh!

pczarn commented Jan 3, 2015

Uh oh!

ranma42 commented Jan 4, 2015

Uh oh!

Uh oh!

Improve make_hash function #20464

Improve make_hash function #20464

Uh oh!

Conversation

ranma42 commented Jan 3, 2015

Uh oh!

rust-highfive commented Jan 3, 2015

Uh oh!

ranma42 commented Jan 3, 2015

Uh oh!

alexcrichton commented Jan 3, 2015

Uh oh!

Gankra commented Jan 3, 2015

Uh oh!

Gankra commented Jan 3, 2015

Uh oh!

pczarn commented Jan 3, 2015

Uh oh!

ranma42 commented Jan 4, 2015

Uh oh!

Uh oh!

Improve `make_hash` function #20464

Improve `make_hash` function #20464