Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose the hasher algorithm and type aliases for HashMap, HashSet and BuildHasher #9

Closed
hcpl opened this issue Oct 30, 2018 · 3 comments

Comments

@hcpl
Copy link
Contributor

hcpl commented Oct 30, 2018

As I understand the purpose of this crate, the only thing that is different from std::collections::Hash{Map, Set} and other hash-based data structures from other crates is a different hasher. If that is not true, then what I'm saying below is unapplicable, so feel free to close.

The goal is to eliminate the need to maintain a separate Hash{Map, Set} and also reuse impls for Hash{Map, Set} defined in other crates (provided that they are generic over the hasher) so that we don't have to provide them either.

For example, that's how both fxhash and fnv do:

This is technically a breaking change since the semantics of hashbrown::HashMap and hashbrown::HashSet might be altered in some ways (that I don't know of, unfortunately).

Solving this issue will help solving https://github.com/Amanieu/hashbrown/issues/6 as well.

@hcpl hcpl mentioned this issue Oct 30, 2018
@Amanieu
Copy link
Member

Amanieu commented Oct 31, 2018

Actually hashbrown is a re-implementation of the underlying hash table based on Google's SwissTable design. It uses FxHash as the default hasher.

@Amanieu Amanieu closed this as completed Oct 31, 2018
@hcpl
Copy link
Contributor Author

hcpl commented Oct 31, 2018

Oh, so I got it backwards. How about generalizing hashbrown for different hashers then? I'm blind and dumb --- it already does it.

Also it might be useful to change the docs a little bit, because "Drop-in replacement for the standard library HashMap and HashSet types" made me think of "a new hasher for standard HashMap and HashSet", not "new HashMap and HashSet implementations altogether".

@Amanieu
Copy link
Member

Amanieu commented Oct 31, 2018

You can use different hashers the same way you do with the standard HashMap: just change the S generic parameter to a different BuildHasher.

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Nov 13, 2020
Doc change: Remove mention of `fnv` in HashMap

Disclaimer: I am the author of [aHash](https://github.com/tkaitchuck/aHash).

This changes the Rustdoc in `HashMap` from mentioning the `fnv` crate to mentioning the `aHash` crate, as an alternative `Hasher` implementation.

### Why

Fnv [has poor hash quality](https://github.com/rurban/smhasher), is [slow for larger keys](https://github.com/tkaitchuck/aHash/blob/master/compare/readme.md#speed), and does not provide dos resistance, because it is unkeyed (this can also cause [other problems](https://accidentallyquadratic.tumblr.com/post/153545455987/rust-hash-iteration-reinsertion)).

Fnv has acceptable performance for integers and has very poor performance with keys >32 bytes. This is the reason it was removed from the standard library in rust-lang#37229 .

Because regardless of which dimension you value, there are better alternatives, it does not make sense for anyone to consider using `fnv`.

The text mentioning `fnv` in the standard library continues to create confusion: rust-lang/hashbrown#153  rust-lang/hashbrown#9 . There are also a number of [crates using it](https://crates.io/crates/fnv/reverse_dependencies) a great many of which are hashing strings (Which is when Fnv is the [worst](https://github.com/cbreeden/fxhash#benchmarks), [possible](https://github.com/tkaitchuck/aHash#speed), [choice](http://cglab.ca/~abeinges/blah/hash-rs/).)

I think aHash makes the most sense to mention as an alternative because it is the most credible option (in my obviously biased opinion). It offers [good performance on numbers and strings](https://github.com/tkaitchuck/aHash/blob/master/compare/readme.md#speed), is [of high quality](https://github.com/tkaitchuck/aHash#hash-quality), and [provides dos resistance](https://github.com/tkaitchuck/aHash/wiki/How-aHash-is-resists-DOS-attacks). It is popular (see [stats](https://crates.io/crates/ahash)) and is the default hasher for [hashbrown](https://crates.io/crates/hashbrown) and [dashmap](https://crates.io/crates/dashmap) which are the most popular alternative hashmaps. Finally it does not have any of the [`gotcha` cases](https://github.com/tkaitchuck/aHash#fxhash) that `FxHash` suffers from. (Which is the other popular hashing option when DOS attacks are not a concern)

Signed-off-by: Tom Kaitchuck <tom.kaitchuck@emc.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants