Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

basic drain and into_iter implementations #33

Closed
wants to merge 13 commits into from

Conversation

soruh
Copy link
Contributor

@soruh soruh commented Jan 25, 2020

I just took a quick look at implementing drain and into_iter.
The tests seem to pass, but that might just be luck.

This code should probably not be merged, since it seems horrendously inefficient, but maybe it can be used to work towards an actual implementation.

@codecov
Copy link

codecov bot commented Jan 25, 2020

Codecov Report

Merging #33 into master will increase coverage by 0.73%.
The diff coverage is 92.46%.

Impacted Files Coverage Δ
src/iter/iter.rs 97.01% <100%> (-0.13%) ⬇️
src/raw/mod.rs 91.11% <100%> (+0.41%) ⬆️
src/node.rs 45.45% <100%> (ø) ⬆️
src/map.rs 89.94% <92.02%> (+1.23%) ⬆️

@jonhoo
Copy link
Owner

jonhoo commented Jan 26, 2020

I'm a little strapped for time at the moment, but one thing to keep in mind is what exactly the contract is for drain. For example, what happens if I write the following code:

std::mem::forget(map.drain());

Is map now empty or full? What happens if I read one element and then drop the Drain? Take a look at hashbrown::RawDrain for some inspiration. I'm genuinely not sure what the best way to express this is a concurrent map, which is why I marked drain as hard in the issue :p

@soruh
Copy link
Contributor Author

soruh commented Jan 26, 2020

I assumed this would get hard, but I thought I'd just experiment with it for a bit and see how far I get...

I think there are two ways we can go about this:

  • have the Drain iterate over the map bin by bin, try to remove Elements as it goes, and return them if they weren't removed already.
  • take a snapshot of the Map's keys first and then remove them one by one.

The main difference is that the first option will also remove elements added during the drain while the second one is guaranteed to stop when it iterated over all the keys in it's snapshot, but has to store all of those keys somewhere...

Regarding the return Value there are also two options:

  • return an Option<V> for each V we tried to remove
  • ignore removes that returned None and only return V

But the first option will result in returning an Option<Option<V>> but will let the user know that an entry they tried to drain was removed. (I don't really know what they would do with that information, but it could be useful)

I personally tend towards not Snapshotting the map, just returning a V, and ignoring entries that were already removed.

I'd be interested in hearing your opinion on this, but I probably also won't have too much free time until next weekend.

Copy link
Collaborator

@cuviper cuviper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ISTM that since you clone the keys, you should also be cloning the values.

Alternatively, for Drain you could just return references (&'g K, &'g V) which the user could choose to clone. Then the difference from plain Iter is just that they are marked for removal along the way.

IntoIter should have exclusive ownership of the map, no outstanding references, so it just needs a way for the map to avoid re-dropping items that the iterator already returned, like a true immediate removal.

src/iter/iter.rs Outdated

if let Some(value) = self.map.remove(&node.key, self.guard) {
return Some((node.key.clone(), unsafe {
std::ptr::read(value as *const V)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this lead to a double-drop of V? Once for the internal guard.defer_destroy(val), and again whenever the user is done with this read copy of the value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I added this when playing around with giving the user a V instead of a &V, but we probably should just always give out &Vs and let them clone when they need to...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be nice to have the same signature as std::HashMap, but since we're not implementing a trait I guess we can just have the user deal with it...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fixed now. I'll write some tests to make sure there actually aren't any double-frees tomorrow.


let key = node.key.clone();
let value = node.value.load(Ordering::SeqCst, self.guard);
let value = unsafe { std::ptr::read(value.as_ref().unwrap() as *const V) };
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also seems like a double-drop of the value, between the returned value and the whole map dropping.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

@soruh
Copy link
Contributor Author

soruh commented Jan 29, 2020

IntoIter should have exclusive ownership of the map, no outstanding references

Are we sure that's true? I think, since we load our values under the guard, the user can still have references to them even if they don't have access to the map anymore. I could be wrong about that though.

Thank you for taking a look at this

@cuviper
Copy link
Collaborator

cuviper commented Jan 29, 2020

IntoIter should have exclusive ownership of the map, no outstanding references

Are we sure that's true? I think, since we load our values under the guard, the user can still have references to them even if they don't have access to the map anymore.

AFAIK all user references have double-borrowed lifetimes, from both &'g self and &'g guard, which means they can't outlive their reference to the map. IntoIterator consumes self by value, so it should be clear of all borrows at that point. You'd still need to make sure your IntoIter doesn't keep internal references too long, but I guess that's the unsafety of epoch::unprotected().

@soruh
Copy link
Contributor Author

soruh commented Jan 29, 2020

IntoIter should have exclusive ownership of the map, no outstanding references

Are we sure that's true? I think, since we load our values under the guard, the user can still have references to them even if they don't have access to the map anymore.

AFAIK all user references have double-borrowed lifetimes, from both &'g self and &'g guard, which means they can't outlive their reference to the map. IntoIterator consumes self by value, so it should be clear of all borrows at that point. You'd still need to make sure your IntoIter doesn't keep internal references too long, but I guess that's the unsafety of epoch::unprotected().

Nice, we're good then, I wasn't sure if all Vs were necessarily also bound to the map.

@cuviper
Copy link
Collaborator

cuviper commented Jan 29, 2020

FWIW, there's similar reasoning about exclusive access in Drop for HashMap:

flurry/src/map.rs

Lines 1389 to 1396 in 9544337

// safety: we have &mut self _and_ all references we have returned are bound to the
// lifetime of their borrow of self, so there cannot be any outstanding references to
// anything in the map.
//
// NOTE: we _could_ relax the bounds in all the methods that return `&'g ...` to not also
// bound `&self` by `'g`, but if we did that, we would need to use a regular `epoch::Guard`
// here rather than an unprotected one.
let guard = unsafe { crossbeam_epoch::unprotected() };

@jonhoo
Copy link
Owner

jonhoo commented Jan 29, 2020

I added that comment in Drop after having basically the same realization as is going on here in a9c6890

@jonhoo jonhoo added this to the 1.0 milestone Jan 31, 2020
@jonhoo
Copy link
Owner

jonhoo commented Feb 17, 2020

I'm going to close this for the time being, and then we can re-open if you find time to pick it up again @soruh :)

@jonhoo jonhoo closed this Feb 17, 2020
@soruh
Copy link
Contributor Author

soruh commented Feb 17, 2020

@jonhoo Great, thank you. I'll probably still be quite busy for a while but I should be able to contribute to discussions regarding this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants