Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std: Stabilize custom hasher support in HashMap #31081

Merged
merged 1 commit into from Jan 26, 2016

Conversation

Projects
None yet
8 participants
@alexcrichton
Copy link
Member

alexcrichton commented Jan 21, 2016

This commit implements the stabilization of the custom hasher support intended
for 1.7 but left out due to some last-minute questions that needed some
decisions. A summary of the actions done in this PR are:

Stable

  • std::hash::BuildHasher
  • BuildHasher::Hasher
  • BuildHasher::build_hasher
  • std::hash::BuildHasherDefault
  • HashMap::with_hasher
  • HashMap::with_capacity_and_hasher
  • HashSet::with_hasher
  • HashSet::with_capacity_and_hasher
  • std::collections::hash_map::RandomState
  • RandomState::new

Deprecated

  • std::collections::hash_state
  • std::collections::hash_state::HashState - this trait was also moved into
    std::hash with a reexport here to ensure that we can have a blanket impl to
    prevent immediate breakage on nightly. Note that this is unstable in both
    location.
  • HashMap::with_hash_state - renamed
  • HashMap::with_capacity_and_hash_state - renamed
  • HashSet::with_hash_state - renamed
  • HashSet::with_capacity_and_hash_state - renamed

Closes #27713

@rust-highfive

This comment has been minimized.

Copy link
Collaborator

rust-highfive commented Jan 21, 2016

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Jan 21, 2016

r? @aturon

cc @rust-lang/libs

nominating for a beta-backport

@Gankro

This comment has been minimized.

Copy link
Contributor

Gankro commented Jan 21, 2016

@shepmaster

This comment has been minimized.

Copy link
Member

shepmaster commented Jan 22, 2016

Everything here seems reasonable and as was discussed. 👍

pub fn with_hash_state(hash_state: S) -> HashMap<K, V, S> {
HashMap::with_hasher(hash_state)
}

/// Creates an empty HashMap with space for at least `capacity`

This comment has been minimized.

@aturon

aturon Jan 22, 2016

Member

Should we give the same warning about DoS here?

This comment has been minimized.

@aturon

aturon Jan 22, 2016

Member

Ooops, github just had it collapsed, nevermind.

#[allow(deprecated)]
impl<T: HashState> BuildHasher for T {
type Hasher = T::Hasher;
fn build_hasher(&self) -> T::Hasher { self.hasher() }

This comment has been minimized.

@aturon

aturon Jan 22, 2016

Member

Can you remind me why we take &self here, given that you're describing the trait in terms of being state that's used to generate new hashers? We seem to force you to interior mutability here, presumably for good reason?

This comment has been minimized.

@shepmaster

shepmaster Jan 22, 2016

Member

Each T::Hasher has to have the same initial state - otherwise hashing the same thing twice wouldn't have the same hash. When this function is called, it shouldn't be doing a whole lot.

For example a factory constructed with a random seed. You don't see the non-random seed in my code because it's piggybacking on the implementation provided by DefaultState

This comment has been minimized.

@aturon

aturon Jan 22, 2016

Member

OK, I think the problem is just that the docs here are very misleading:

  • "A trait representing stateful hashes ..."
  • "Creates a new hasher based on the given state of this object"

Both of those strongly suggested to me that mutation is expected here.

This comment has been minimized.

@alexcrichton

alexcrichton Jan 23, 2016

Author Member

Oh actually the HashState trait is basically dead/to be ignored, the relevant documentation on the BuildHasher trait (the real one now) is:

/// A trait representing an object which maintains state to create new instances
/// of the `Hasher` trait.                                                      
///                                                                             
/// A `BuildHasher` is typically used as a factory for instances of `Hasher`    
/// which a `HashMap` can then use to hash keys independently.                  

Does that sound ok?

I'll clear out the old documentation in favor of just saying it's deprecated.

This comment has been minimized.

@aturon

aturon Jan 23, 2016

Member

Those are the docs I was referencing :) Look above -- the BuildHasher trait method docs say "Creates a new hasher based on the given state of this object", which is what I was quoting. My point is just that these docs strongly suggest mutable state to me. Not a huge deal, but I found it pretty confusing.

This comment has been minimized.

@alexcrichton

alexcrichton Jan 23, 2016

Author Member

Ah ok I think I see what you're getting at now. Do you have a rewording in mind? I could try and wordsmith a bit as well and see what I can come up with

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Jan 22, 2016

LGTM -- though I did leave one question about the actual API. I suspect I'm just failing to remember one of the design constraints here.

@alexcrichton alexcrichton force-pushed the alexcrichton:stabilize-hasher branch from eaff046 to ad2f9af Jan 23, 2016

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Jan 23, 2016

@aturon I've updated with removing the old and misleading HashState documentation, and as @shepmaster mentioned the &self is because these are just hash factories as opposed to hashers themselves. The &self is required to work because hashers need to be created in the fn get(&self) function on a HashMap.

@alexcrichton alexcrichton force-pushed the alexcrichton:stabilize-hasher branch from ad2f9af to 306bd59 Jan 23, 2016

@sfackler

This comment has been minimized.

Copy link
Member

sfackler commented Jan 26, 2016

This looks good to go, right?

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Jan 26, 2016

@alexcrichton corrected the wrong docs problem :) I still find the docs super misleading, but we can fix in post if you prefer. That's my only complaint.

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Jan 26, 2016

I'm fine changing the docs, but I'm not really quite sure what to change them to? Do you have something in mind @aturon ?

@@ -190,6 +191,79 @@ pub trait Hasher {
}
}

/// A trait representing an object which maintains state to create new instances

This comment has been minimized.

@aturon

aturon Jan 26, 2016

Member

Suggestion: nix this sentence.

#[stable(since = "1.7.0", feature = "build_hasher")]
type Hasher: Hasher;

/// Creates a new hasher based on the given state of this object.

This comment has been minimized.

@aturon

aturon Jan 26, 2016

Member

Suggestion: shorten to Creates a new hasher

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Jan 26, 2016

@alexcrichton Left some minimal suggestions. With that r=me.

std: Stabilize custom hasher support in HashMap
This commit implements the stabilization of the custom hasher support intended
for 1.7 but left out due to some last-minute questions that needed some
decisions. A summary of the actions done in this PR are:

Stable

* `std:#️⃣:BuildHasher`
* `BuildHasher::Hasher`
* `BuildHasher::build_hasher`
* `std:#️⃣:BuildHasherDefault`
* `HashMap::with_hasher`
* `HashMap::with_capacity_and_hasher`
* `HashSet::with_hasher`
* `HashSet::with_capacity_and_hasher`
* `std::collections::hash_map::RandomState`
* `RandomState::new`

Deprecated

* `std::collections::hash_state`
* `std::collections::hash_state::HashState` - this trait was also moved into
  `std::hash` with a reexport here to ensure that we can have a blanket impl to
  prevent immediate breakage on nightly. Note that this is unstable in both
  location.
* `HashMap::with_hash_state` - renamed
* `HashMap::with_capacity_and_hash_state` - renamed
* `HashSet::with_hash_state` - renamed
* `HashSet::with_capacity_and_hash_state` - renamed

Closes #27713

@alexcrichton alexcrichton force-pushed the alexcrichton:stabilize-hasher branch from 306bd59 to 1fa0be2 Jan 26, 2016

@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Jan 26, 2016

@bors: r=aturon 1fa0be2

bors added a commit that referenced this pull request Jan 26, 2016

Auto merge of #31081 - alexcrichton:stabilize-hasher, r=aturon
This commit implements the stabilization of the custom hasher support intended
for 1.7 but left out due to some last-minute questions that needed some
decisions. A summary of the actions done in this PR are:

Stable

* `std:#️⃣:BuildHasher`
* `BuildHasher::Hasher`
* `BuildHasher::build_hasher`
* `std:#️⃣:BuildHasherDefault`
* `HashMap::with_hasher`
* `HashMap::with_capacity_and_hasher`
* `HashSet::with_hasher`
* `HashSet::with_capacity_and_hasher`
* `std::collections::hash_map::RandomState`
* `RandomState::new`

Deprecated

* `std::collections::hash_state`
* `std::collections::hash_state::HashState` - this trait was also moved into
  `std::hash` with a reexport here to ensure that we can have a blanket impl to
  prevent immediate breakage on nightly. Note that this is unstable in both
  location.
* `HashMap::with_hash_state` - renamed
* `HashMap::with_capacity_and_hash_state` - renamed
* `HashSet::with_hash_state` - renamed
* `HashSet::with_capacity_and_hash_state` - renamed

Closes #27713
@bors

This comment has been minimized.

Copy link
Contributor

bors commented Jan 26, 2016

⌛️ Testing commit 1fa0be2 with merge a9e139b...

@bors bors merged commit 1fa0be2 into rust-lang:master Jan 26, 2016

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details
@alexcrichton

This comment has been minimized.

Copy link
Member Author

alexcrichton commented Jan 27, 2016

Marking as accepted for beta as libs talked about this awhile ago

@alexcrichton alexcrichton deleted the alexcrichton:stabilize-hasher branch Jan 27, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.