collections: Make BinaryHeap panic safe in sift_up / sift_down #25856

Merged
merged 1 commit into from May 28, 2015

7 participants
@bluss
Contributor

bluss commented May 28, 2015

collections: Make BinaryHeap panic safe in sift_up / sift_down

Use a struct called Hole that keeps track of an invalid location
in the vector and fills the hole on drop.

I include a run-pass test that the current BinaryHeap fails, and the new
one passes.

NOTE: The BinaryHeap will still be inconsistent after a comparison panics: it will
not have the heap property. What this fixes is only that every element is still a
valid value.

This is actually a performance win: the new code does not bother to write `zeroed()`
values into the holes, it just leaves them as they were.

The net result is something like a 5% decrease in runtime for `BinaryHeap::from_vec`. This
can be improved further by using unchecked indexing (I confirmed it makes a difference,
which is no surprise given the non-sequential access going on), but let's leave that for another PR.
Safety first 😉

Fixes #25842
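
As an illustration of the approach described above, here is a minimal sketch of a drop-guarded hole, not the code merged in this PR. The names `Hole`, `data`, `elt`, `pos()` and `move_to` follow the snippets quoted in this thread; `element`, `get` and the body of `sift_up` are assumptions made for the example.

```rust
use std::ptr;

/// The slot at `pos` has been moved out into `elt`; it is written back
/// when the `Hole` is dropped, including during unwinding from a panic.
struct Hole<'a, T> {
    data: &'a mut [T],
    elt: Option<T>, // always `Some` until `drop` runs
    pos: usize,
}

impl<'a, T> Hole<'a, T> {
    fn new(data: &'a mut [T], pos: usize) -> Self {
        // Bounds-checked read; the slot is now logically uninitialized.
        let elt = unsafe { ptr::read(&data[pos]) };
        Hole { data, elt: Some(elt), pos }
    }

    fn pos(&self) -> usize {
        self.pos
    }

    /// The element that was removed to create the hole.
    fn element(&self) -> &T {
        self.elt.as_ref().unwrap()
    }

    /// Any slot other than the hole itself.
    fn get(&self, index: usize) -> &T {
        debug_assert!(index != self.pos);
        &self.data[index]
    }

    /// Move the element at `index` into the hole; the hole moves to `index`.
    fn move_to(&mut self, index: usize) {
        debug_assert!(index != self.pos);
        let old_pos = self.pos;
        let x = unsafe { ptr::read(&self.data[index]) };
        unsafe { ptr::write(&mut self.data[old_pos], x) };
        self.pos = index;
    }
}

impl<'a, T> Drop for Hole<'a, T> {
    fn drop(&mut self) {
        // Fill the hole so every slot holds a valid value again.
        let pos = self.pos;
        if let Some(elt) = self.elt.take() {
            unsafe { ptr::write(&mut self.data[pos], elt) };
        }
    }
}

/// Max-heap sift-up written against the sketch above.
fn sift_up<T: Ord>(data: &mut [T], start: usize, pos: usize) {
    let mut hole = Hole::new(data, pos);
    while hole.pos() > start {
        let parent = (hole.pos() - 1) / 2;
        // If this comparison panics, `hole` is dropped and the slot is refilled.
        if hole.element() <= hole.get(parent) {
            break;
        }
        hole.move_to(parent);
    }
}
```

If the comparison panics midway, the slice may no longer satisfy the heap property, but no slot is left holding a moved-out value.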

@rust-highfive

Collaborator

rust-highfive commented May 28, 2015

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @brson (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@bluss

Contributor Author

bluss commented May 28, 2015

I prototyped this using a general scope guard, but this Hole representation turned out to be nice.

LLVM needs to be on our side here and see that the removed element's Option<T>
is always Some, etc. If you wonder why I use hole.pos() in one location and hole.pos
in another, it's because those choices benchmarked best in each location. Optimizing
compilers are chaotic.

Edited: Moved useful info from this comment into the merge message itself (above)

cc @alexcrichton @Gankro

@bluss bluss force-pushed the bluss:binary-heap-hole branch from 5f44f5b to b221853 May 28, 2015

@bluss

Contributor Author

bluss commented May 28, 2015

Using only unchecked indexing in Hole removes the benchmark's dependence on hole.pos vs hole.pos(), which just shows how panicking can be an impediment to optimization.
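
For readers unfamiliar with the distinction, the sketch below contrasts the two access styles. Both helper names are hypothetical and are not taken from the PR; `get_unchecked` is the standard slice method.

```rust
/// Hypothetical helper: bounds-checked access. An out-of-range `index`
/// panics, so the compiler must keep a panic path at every call site.
fn slot_checked<T>(data: &[T], index: usize) -> &T {
    &data[index]
}

/// Hypothetical helper: unchecked access with no panic path.
///
/// # Safety
/// The caller must guarantee `index < data.len()`.
unsafe fn slot_unchecked<T>(data: &[T], index: usize) -> &T {
    debug_assert!(index < data.len());
    // SAFETY: the bound is guaranteed by the caller.
    unsafe { data.get_unchecked(index) }
}
```

The unchecked form trades the runtime check for an obligation on the caller, which is why it was left for a separate PR.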


impl<'a, T> Hole<'a, T> {
    /// Create a new Hole at index `pos`.
    pub fn new(data: &'a mut [T], pos: usize) -> Self {

@alexcrichton

alexcrichton May 28, 2015

Member

Could the pub be dropped from these functions? (Hole isn't exposed so they shouldn't need to be exposed either)

@Gankro

Gankro May 28, 2015

Contributor

It makes the code more portable, though. e.g. we can happily move this to a module without worrying.

@bluss

bluss May 28, 2015

Author Contributor

It came out with pub. I think pub is natural as soon as the methods are intended to be called from outside the struct itself. I know our privacy doesn't work that way, but it matters once the code grows and, yes, you move it out to a module. I'm fine either way, but I prefer pub.

@alexcrichton

alexcrichton May 28, 2015

Member

Idiomatically, code in the rest of the standard library does not do this, so let's stick to existing conventions. We can add pub if necessary at a later date, but code should be as conservative as possible in exports today.

@bluss

bluss May 28, 2015

Author Contributor

ok that's fine

@alexcrichton

Member

alexcrichton commented May 28, 2015

Looks good to me, thanks @bluss! Perhaps the test could also be modified to ensure that the destructor for each element in the heap isn't run more than once?

Other than that though r=me, we can always tweak the performance at a later date.

cc @Gankro

@bluss

Contributor Author

bluss commented May 28, 2015

Thanks. I'll add a test that makes sure no destructors were called right after catching the panic, before I inspect the data further. That should cover it (and it fails on the old version of BinaryHeap).
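
A test along those lines could look roughly like the sketch below. It is not the run-pass test from this PR: the `Elem` type, the `DROPS` counter and the function name are hypothetical, and it uses `std::panic::catch_unwind`, which was stabilized after this PR was merged.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicUsize, Ordering as AtomicOrdering};

/// Counts every `Elem` destructor call (hypothetical test instrumentation).
static DROPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical element: any comparison involving a value with
/// `panic_on_cmp` set panics.
struct Elem {
    key: u32,
    panic_on_cmp: bool,
}

impl Drop for Elem {
    fn drop(&mut self) {
        DROPS.fetch_add(1, AtomicOrdering::SeqCst);
    }
}

impl PartialEq for Elem {
    fn eq(&self, other: &Self) -> bool {
        self.key == other.key
    }
}
impl Eq for Elem {}
impl PartialOrd for Elem {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for Elem {
    fn cmp(&self, other: &Self) -> Ordering {
        if self.panic_on_cmp || other.panic_on_cmp {
            panic!("comparison failed");
        }
        self.key.cmp(&other.key)
    }
}

/// Returns (drops seen right after the panic, drops after the heap is dropped).
fn drops_after_panicking_push() -> (usize, usize) {
    let mut heap = BinaryHeap::new();
    for key in 0..4 {
        heap.push(Elem { key, panic_on_cmp: false });
    }
    // The pushed element is compared during sift_up, which panics.
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        heap.push(Elem { key: 99, panic_on_cmp: true });
    }));
    assert!(result.is_err());
    // No destructor may have run yet: all five elements are still owned by the heap.
    let after_panic = DROPS.load(AtomicOrdering::SeqCst);
    drop(heap);
    (after_panic, DROPS.load(AtomicOrdering::SeqCst))
}
```

With a panic-safe heap, each element is dropped exactly once, and only when the heap itself is dropped.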

ptr::write(&mut self.data[pos], x);
pos = parent;
while hole.pos() > start {
    let parent = (hole.pos() - 1) >> 1;

@Gankro

Gankro May 28, 2015

Contributor

Surely LLVM can convert the much more semantically clear /2 to this?

@bluss

bluss May 28, 2015

Author Contributor

I'll fix it. Haven't really looked at changing anything besides what's needed, though.
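
For unsigned integers the two spellings compute the same value, so the choice is purely about readability. A small sketch with hypothetical function names:

```rust
/// Parent index in an implicit binary heap, written with division.
/// Requires `pos > 0`.
fn parent_div(pos: usize) -> usize {
    (pos - 1) / 2
}

/// The same computation written with a shift. Requires `pos > 0`.
fn parent_shift(pos: usize) -> usize {
    (pos - 1) >> 1
}

/// Checks that both forms agree for every position in `1..=n`.
fn agree_up_to(n: usize) -> bool {
    (1..=n).all(|pos| parent_div(pos) == parent_shift(pos))
}
```

Optimizing compilers routinely lower an unsigned division by a power of two to a shift, so no performance difference is expected in optimized builds.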

/// position with the value that was originally removed.
struct Hole<'a, T: 'a> {
    data: &'a mut [T],
    elt: Option<T>,

@Gankro

Gankro May 28, 2015

Contributor

Perhaps worth noting that this field is Some until the Hole is dropped

@bluss

bluss May 28, 2015

Author Contributor

Good idea.

debug_assert!(index != self.pos);
let old_pos = self.pos;
let x = ptr::read(&mut self.data[index]);
ptr::write(&mut self.data[old_pos], x);

@Gankro

Gankro May 28, 2015

Contributor

Should this not just be a copy_nonoverlapping?

@bluss

bluss May 28, 2015

Author Contributor

That's a good idea for here. Haven't really looked at changing anything besides what's needed.

@bluss

bluss May 28, 2015

Author Contributor

Can we leave it for now? It requires switching to unchecked indexing.

@Gankro

Gankro May 28, 2015

Contributor

You don't need unchecked indexing, you just need to cache &self.data[index] in a *const for a line. Right?

(also, at the very least, you shouldn't be reading from an &mut :) )

@bluss

bluss May 28, 2015

Author Contributor

I had a feeling something was off. I promise, a feeling.
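
The suggestion in this thread can be sketched as follows, assuming a free function rather than the `Hole::move_to` method under review. The function name and signature are hypothetical; `ptr::copy_nonoverlapping` is the standard library function.

```rust
use std::ptr;

/// Hypothetical free-function version of the move under review: copies the
/// element at `index` into the slot at `hole_pos` with checked indexing.
///
/// # Safety
/// The slot at `hole_pos` must be logically uninitialized (its old value is
/// overwritten without being dropped), and afterwards the caller must treat
/// the slot at `index` as the new hole, since its value now exists twice.
unsafe fn move_into_hole<T>(data: &mut [T], hole_pos: usize, index: usize) {
    assert!(index != hole_pos);
    // Bounds-checked shared borrow, cached as a raw pointer for one line.
    let src: *const T = &data[index];
    // Bounds-checked mutable borrow of the destination slot.
    let dst: *mut T = &mut data[hole_pos];
    // SAFETY: both pointers are in bounds and refer to distinct elements.
    unsafe { ptr::copy_nonoverlapping(src, dst, 1) };
}
```

Both index operations stay bounds-checked, and the source is read through a shared borrow rather than an `&mut`.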

@Gankro

Contributor

Gankro commented May 28, 2015

This looks great!

@bluss bluss force-pushed the bluss:binary-heap-hole branch from b221853 to c41a5cc May 28, 2015

@bluss

Contributor Author

bluss commented May 28, 2015

PR updated. Everything is addressed, and pub is removed. The test now checks the number of drop calls (the old binary heap used to drop one element on panic), though it could track drops in even more detail.

Ulrik Sverdrup
collections: Make BinaryHeap panic safe in sift_up / sift_down
Use a struct called Hole that keeps track of an invalid location
in the vector and fills the hole on drop.

I include a run-pass test that the current BinaryHeap fails, and the new
one passes.

Fixes #25842

@bluss bluss force-pushed the bluss:binary-heap-hole branch from c41a5cc to 5249cbb May 28, 2015

@bluss

Contributor Author

bluss commented May 28, 2015

Wait, I thought I had spotted a bug and wondered how it could have slipped past my tests, but no, it's fine. Pushed an update with one scope fewer, which I didn't need, so the diff for sift_down is easier to read.

@Gankro

Contributor

Gankro commented May 28, 2015

@bors r+

Sweet!

@bors

Contributor

bors commented May 28, 2015

📌 Commit 5249cbb has been approved by Gankro

@alexcrichton alexcrichton added the T-libs label May 28, 2015

@bors

Contributor

bors commented May 28, 2015

⌛️ Testing commit 5249cbb with merge efebe45...

bors added a commit that referenced this pull request May 28, 2015

Auto merge of #25856 - bluss:binary-heap-hole, r=Gankro
collections: Make BinaryHeap panic safe in sift_up / sift_down


@bors bors merged commit 5249cbb into rust-lang:master May 28, 2015

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
homu Test successful

@bluss bluss deleted the bluss:binary-heap-hole branch May 28, 2015

@huonw

Member

huonw commented May 29, 2015

Nice work @bluss!

@alexcrichton

Member

alexcrichton commented Jun 9, 2015

triage: beta-accepted
