Tracking issue for HashMap::raw_entry #56167
Comments
What is the motivation for having separate Could we consider merging these methods into a single one? Or is there some use case where the difference in behavior is useful? |
I am also extremely confused by this distinction, as my original designs didn't include them (I think?) and the documentation that was written is very unclear. |
cc @fintelia |
The reason I added it was random removal. Rather than cloning a key just to remove it:

```rust
// `rand()` stands in for any random-number source.
let key = map.iter().nth(rand() % map.len()).unwrap().0.clone();
map.remove(&key);
```

I wanted to just be able to pick a random "bucket" and then get an entry/raw entry to the first element in it, if any:

```rust
loop {
    if let Occupied(o) = map.raw_entry_mut().search_bucket(rand(), || true) {
        o.remove();
        break;
    }
}
```

(the probabilities aren't uniform in the second version, but close enough for my purposes) |
I continue to not want to support the "random deletion" usecase in std's HashMap. You really, really, really, should be using a linked hashmap or otherwise ordered map for that. |
It doesn't work in hashbrown anyways (see rust-lang#56167)
I have removed this method in the hashbrown PR (#56241). Your code snippet for random deletion won't work with hashbrown anyways since it always checks the hash as part of the search process. |
I can avoid unnecessary clones inherent to the original

```rust
#![feature(hash_raw_entry)]
use std::collections::HashMap;

let mut map = HashMap::new();
map.raw_entry_mut()
    .from_key("poneyland")
    .or_insert("poneyland", 3);
```

Currently I use the following function to hash once and automatically provide an owned key if necessary (somewhat similar to what was discussed in rust-lang/rfcs#1769):

```rust
use std::borrow::Borrow;
use std::collections::hash_map::RawEntryMut;
use std::hash::{BuildHasher, Hash, Hasher};

fn get_mut_or_insert_with<'a, K, V, Q, F>(
    map: &'a mut HashMap<K, V>,
    key: &Q,
    default: F,
) -> &'a mut V
where
    K: Eq + Hash + Borrow<Q>,
    Q: Eq + Hash + ToOwned<Owned = K>,
    F: FnOnce() -> V,
{
    // Hash the borrowed key once, with the map's own hasher.
    let mut hasher = map.hasher().build_hasher();
    key.hash(&mut hasher);
    let hash = hasher.finish();

    match map.raw_entry_mut().from_key_hashed_nocheck(hash, key) {
        RawEntryMut::Occupied(entry) => entry.into_mut(),
        RawEntryMut::Vacant(entry) => {
            // Only now is an owned key created; the hash is reused.
            entry
                .insert_hashed_nocheck(hash, key.to_owned(), default())
                .1
        }
    }
}
```

Given If there isn't, why not saving the hash in |
I'm not yet very familiar with this API, but what @gdouezangrard suggested seems like a great idea to me. What even happens currently if the two hashes don't match, is the key-value pair then inserted into the wrong bucket? It's not clear to me from (quickly) reading the source code. |
I submitted rust-lang/hashbrown#54 to support using a If so, I'd be happy to submit a PR! |
This is a really great API, it's also what keeps crates ( What could be next steps here towards stabilization? |
Just gonna add another ping here -- what's blocking this right now? |
I see a few things that need to be resolved:
I would recommend prototyping in the hashbrown crate first, which can then be ported back into the std HashMap. |
I find I also would like to point out that

```rust
#![feature(hash_raw_entry)]
use std::collections::HashMap;

fn main() {
    let mut map = HashMap::new();
    // Looks up 42, but inserts under the key 1.
    map.raw_entry_mut().from_key(&42).or_insert(1, 2);
    println!("{}", map[&1]);
}
```

This is a bit like calling

```rust
#![feature(hash_raw_entry)]
use std::collections::hash_map::{HashMap, RawEntryMut};

fn main() {
    let mut map = HashMap::new();
    if let RawEntryMut::Vacant(_) = map.raw_entry_mut().from_key(&42) {
        map.insert(1, 2);
    }
    println!("{}", map[&1]);
}
```

I think raw entry API is useful, but I don't think its API should be conflated with entry API. |
As discussed here: rust-lang/hashbrown#232
If the feature of a user specified hash is needed, it may be useful to instead provide a method on the raw entry to hash a key. That way the hashmap can implement this however it sees fit and the application code is less error prone because there is an unambiguous way to obtain the hash value if it is not known in advance. |
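For reference, stable Rust already has one unambiguous way to get the hash a given map would compute for a key: ask the map's own `BuildHasher`. The sketch below is not the raw-entry method proposed above; the helper names `hash_manually` and `hash_with_map` are made up for illustration.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasher, Hash, Hasher};

/// Hypothetical helper: hash `key` with the map's hasher, step by step.
fn hash_manually<K, V, S, Q>(map: &HashMap<K, V, S>, key: &Q) -> u64
where
    S: BuildHasher,
    Q: Hash + ?Sized,
{
    let mut hasher = map.hasher().build_hasher();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical helper: the same computation via `BuildHasher::hash_one`.
fn hash_with_map<K, V, S, Q>(map: &HashMap<K, V, S>, key: &Q) -> u64
where
    S: BuildHasher,
    Q: Hash + ?Sized,
{
    map.hasher().hash_one(key)
}
```

Both helpers go through the same `BuildHasher` instance, so for one map they agree on every key. A hash computed from a different map's hasher generally will not match.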
There is a very simple and common use case that, as far as I can tell, is not currently possible with the standard Description of the use case: Imagine I have (1) check if some key exists, There are two options I can see, each of which has unnecessary overhead. First option:
Second option:
The defect of the first option is that it requires a clone of the key even when no insertion is necessary. The defect of the second option is that it requires two hash computations and lookups when an insertion is necessary. I find myself wanting to do this practically every time I use hash maps, so it's possible this is an already supported basic API and I'm just missing it in the docs. |
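A sketch of what the two options could look like on stable Rust. The function names and the `HashMap<String, u32>` counter are assumptions made up for illustration, not taken from the comment above.

```rust
use std::collections::HashMap;

/// Option 1 (assumed shape): `entry` wants an owned key up front,
/// so the key is cloned even when it is already in the map.
fn bump_with_entry(map: &mut HashMap<String, u32>, key: &str) {
    *map.entry(key.to_owned()).or_insert(0) += 1;
}

/// Option 2 (assumed shape): clone only on a miss, but a miss then
/// hashes the key and probes the table a second time in `insert`.
fn bump_with_double_lookup(map: &mut HashMap<String, u32>, key: &str) {
    if let Some(count) = map.get_mut(key) {
        *count += 1; // hit: one hash, one lookup, no clone
    } else {
        map.insert(key.to_owned(), 1); // miss: second hash and lookup
    }
}
```

Both functions leave the map in the same state; they differ only in where the redundant work happens.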
No, this requires raw_entry_mut. |
Thanks for confirming. Has the idea of adding an API for this specific use case (e.g., |
Yes, there was a great deal of discussion about this in the past. Note that |
Do you know where to find the past discussion? I'm interested in learning whether/why that idea was rejected. After that I will stop spamming this issue; sorry! |
Actually, there are way more discussions that this doesn't bring up, but this is a start. |
IMO
```rust
// - Just a single T instead of (K, V).
// - No bounds on T.
// - Hashing and equality checking is completely user-controlled.
// - A hasher is provided by the caller for functions that might cause a table resize.
pub fn get_mut(
    &mut self,
    hash: u64,
    eq: impl FnMut(&T) -> bool
) -> Option<&mut T>;

pub fn insert_entry(
    &mut self,
    hash: u64,
    value: T,
    hasher: impl Fn(&T) -> u64
) -> &mut T;
```
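To make the shape of those two signatures concrete, here is a toy chained table that implements them. It is only a sketch: `ToyTable`, its bucket layout, and its growth policy are invented for illustration and have nothing to do with how hashbrown is implemented.

```rust
/// Toy table of bare `T` values. It never hashes or compares on its own;
/// the caller supplies the hash, the equality check, and (for resizes) a hasher.
pub struct ToyTable<T> {
    buckets: Vec<Vec<T>>,
    len: usize,
}

impl<T> ToyTable<T> {
    pub fn new() -> Self {
        ToyTable { buckets: (0..4).map(|_| Vec::new()).collect(), len: 0 }
    }

    fn index(&self, hash: u64) -> usize {
        (hash % self.buckets.len() as u64) as usize
    }

    /// Find the first value in the bucket for `hash` that satisfies `eq`.
    pub fn get_mut(&mut self, hash: u64, mut eq: impl FnMut(&T) -> bool) -> Option<&mut T> {
        let i = self.index(hash);
        self.buckets[i].iter_mut().find(|v| eq(&**v))
    }

    /// Insert `value` under `hash`. `hasher` is only called if the table grows,
    /// because existing values must then be redistributed.
    pub fn insert_entry(&mut self, hash: u64, value: T, hasher: impl Fn(&T) -> u64) -> &mut T {
        if self.len >= self.buckets.len() * 2 {
            self.grow(&hasher);
        }
        let i = self.index(hash);
        self.len += 1;
        let bucket = &mut self.buckets[i];
        bucket.push(value);
        bucket.last_mut().expect("bucket is non-empty after push")
    }

    /// Double the bucket count and re-place every value using the caller's hasher.
    fn grow<H: Fn(&T) -> u64>(&mut self, hasher: &H) {
        let new_cap = self.buckets.len() * 2;
        let old = std::mem::replace(
            &mut self.buckets,
            (0..new_cap).map(|_| Vec::new()).collect(),
        );
        for value in old.into_iter().flatten() {
            let i = self.index(hasher(&value));
            self.buckets[i].push(value);
        }
    }
}
```

A map-like type would store `(K, V)` tuples as `T` and pass closures that hash and compare only the key part. The toy table trusts the caller to keep `hash` and `hasher` consistent.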
Reading the past discussions posted by Thom, I didn't find where It has been five years - perhaps it's time to land those so people can start using HashMaps now without performance overhead, rather than waiting for |
This applies to |
We reviewed this in today's @rust-lang/libs-api meeting and agreed with the summary in #56167 (comment). We'd like to see an implementation of We feel that any more advanced use cases are better off using @rfcbot fcp close |
Team member @Amanieu has proposed to close this. The next step is review by the rest of the tagged team members: Concerns:
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
I'm fine with the conclusion, but I would like us to have the alternative in nightly before removing |
I agree with #56167 (comment), don't think there is any rush to delete this. Keeping it in front of people will help people form an opinion of how to serve each of the two use cases better. @rfcbot concern keep unstable for now |
I created an initial implementation of the |
@rfcbot cancel We've discussed this in the libs api meeting. We'll keep this around as unstable until we have something better. |
@m-ou-se proposal cancelled. |
One thing I wanted to record for consideration in whatever this becomes: It's certainly true that for unknown generic types, one cannot trust the But it might be nice to be able to trust, say, a For example, imagine a |
I’ve replaced a use of
Perhaps a middle ground would be to change |
Added in #54043.
As of 6ecad33 / 2019-01-09, this feature covers:
… as well as Debug impls for each of the 5 new types, and their inherent methods.