
Move garbage collection to seize #102

Merged (29 commits, Feb 26, 2022)
Conversation

@ibraheemdev (Collaborator) commented Feb 9, 2022

Background

This PR moves garbage collection from crossbeam-epoch to seize. As shown in #50, crossbeam-epoch has serious performance issues that make flurry much slower than other concurrent hashmaps. #95 attempted to switch to flize, but both that pull request and the library itself have since stalled.

Design

seize uses a special type of reference counting instead of tracking global epochs. When an object is retired, it is first appended to a thread-local batch. This operation does not allocate. When the batch size hits a configurable limit, the batch is added to a separate list for each thread that is currently active (holds a guard). The reference count of the batch is set to the number of threads that are active. After every thread drops its guard, it decrements the reference count. If it holds the last reference, it frees the entire batch.
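The batch/reference-count mechanism described above can be sketched roughly as follows. This is a simplified, single-batch illustration under stated assumptions, not seize's actual implementation; `Batch`, `retire`, and `release` are hypothetical names, and a `Vec` behind a `Mutex` stands in for seize's allocation-free intrusive lists:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

// Hypothetical simplification of the batching scheme: retired objects
// accumulate in a batch; when the batch is flushed, its reference count
// is set to the number of currently active guards, and the last guard
// to drop frees the whole batch at once.
struct Batch {
    retired: Mutex<Vec<Box<dyn Send>>>, // objects waiting to be freed
    refs: AtomicUsize,                  // active guards still holding the batch
}

impl Batch {
    fn new(active_guards: usize) -> Self {
        Batch {
            retired: Mutex::new(Vec::new()),
            refs: AtomicUsize::new(active_guards),
        }
    }

    // Append a retired object. In seize this step does not allocate;
    // here a Vec push stands in for the intrusive list.
    fn retire(&self, obj: Box<dyn Send>) {
        self.retired.lock().unwrap().push(obj);
    }

    // Called when a guard drops; the holder of the last reference
    // frees the entire batch.
    fn release(&self) -> bool {
        if self.refs.fetch_sub(1, Ordering::AcqRel) == 1 {
            self.retired.lock().unwrap().clear(); // drop everything at once
            return true;
        }
        false
    }
}

fn main() {
    let batch = Batch::new(2); // two threads currently hold guards
    batch.retire(Box::new(vec![1u8, 2, 3]));
    assert!(!batch.release()); // first guard drops: batch still referenced
    assert!(batch.release()); // last guard drops: batch freed
}
```

Because the count is taken only over threads that were active at flush time, reclamation work is naturally spread across whichever thread happens to drop the last guard.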

Implementation

The API provided by seize differs quite a bit from that of crossbeam-epoch. I tried to ease migration by adding a reclaim module that mimics some of the previously used APIs, but here are the major changes:

  • seize requires metadata to be stored with every allocated object, so T -> Linked<T> has propagated around the codebase.
  • seize has no concept of a global collector, which means that:
    • The free-function pin has been removed.
    • with_collector is now public as the footgun of using epoch::pin() no longer exists.
    • Guard has a lifetime. Public functions now take &'g Guard<'_> instead of &'g Guard.
    • Many internal functions now need access to the &Collector in order to allocate objects.
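To illustrate the first bullet, here is a hedged sketch of the wrapper pattern behind `T -> Linked<T>`. The real `seize::Linked` layout is internal to seize and considerably richer; `Meta` and its field are stand-ins:

```rust
use std::ops::Deref;

// Stand-in for the reclamation metadata attached to each allocation;
// the real layout is internal to seize.
struct Meta {
    _reservation: usize, // placeholder for batch/epoch bookkeeping
}

// Hypothetical approximation of `Linked<T>`: the user's value plus metadata.
struct Linked<T> {
    _meta: Meta,
    value: T,
}

impl<T> Linked<T> {
    fn new(value: T) -> Self {
        Linked {
            _meta: Meta { _reservation: 0 },
            value,
        }
    }
}

// Deref lets most call sites keep treating a `Linked<T>` like a `T`,
// which limits how far the `T -> Linked<T>` change has to propagate.
impl<T> Deref for Linked<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.value
    }
}

fn main() {
    let node = Linked::new(String::from("key"));
    assert_eq!(node.len(), 3); // method calls pass through to the inner value
}
```

Deref keeps read paths unchanged, but any code that constructs or frees nodes still has to go through the collector, which is why `&Collector` now threads through the internal functions.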

Performance

The algorithm used by seize has been shown to perform on par with or better than optimized epoch-based schemes (such as flize). It is also much more efficient in terms of memory usage, as reclamation is more targeted. Reclamation is naturally parallelized and balanced across threads due to the nature of reference counting. Using seize significantly improves flurry's performance in a read-heavy benchmark run on an 8-core machine, pinning before every operation (the benchmark chart is not reproduced here).

Remaining Items

  • Determine if the &'g Guard<'g> stored by iterators is problematic.
  • Potentially loosen K/V: 'static bounds now that there is no global collector.
  • Fix check_guard. This is difficult because seize::Collector contains no allocations that can be easily accessed to check pointer equality. Perhaps flurry should expose its own guard type that also contains an &HashMap, or a dummy allocation?
  • Update some of the documentation around guards.
  • Release new seize version to crates.io with required changes.

cc @jonhoo @domenicquirl



@codecov bot commented Feb 9, 2022

Codecov Report

Merging #102 (fe8a171) into master (9fe7d21) will increase coverage by 0.55%.
The diff coverage is 91.26%.

| Impacted Files | Coverage | Δ |
| --- | --- | --- |
| src/map.rs | 80.58% <ø> | -0.02% ⬇️ |
| src/set_ref.rs | 53.65% <0.00%> | ø |
| src/reclaim.rs | 86.00% <86.00%> | ø |
| src/node.rs | 78.53% <90.00%> | +1.62% ⬆️ |
| src/set.rs | 95.31% <90.90%> | +1.46% ⬆️ |
| src/iter/traverser.rs | 89.84% <94.11%> | +0.24% ⬆️ |
| src/raw/mod.rs | 79.72% <95.00%> | -2.22% ⬇️ |
| src/iter/mod.rs | 100.00% <100.00%> | ø |
| src/map_ref.rs | 86.53% <100.00%> | ø |
| src/rayon_impls.rs | 98.66% <100.00%> | +0.04% ⬆️ |

... and 4 more

@jonhoo (Owner) commented Feb 12, 2022

This is so cool, thanks for taking it on!

@ibraheemdev (Collaborator Author)

The "Disallowed attributes" CI job seems to be failing with a weird error, any chance you know why that is?

@jonhoo (Owner) commented Feb 12, 2022

#103 should take care of that 👍

@jonhoo (Owner) commented Feb 12, 2022

There — a rebase should do it

@ibraheemdev (Collaborator Author)

Can you approve the workflow run?

@ibraheemdev (Collaborator Author)

Looks like you need to approve after every push...

@jonhoo (Owner) commented Feb 13, 2022

Hmm, I didn't think that should be necessary... Let's try that again.

@ibraheemdev (Collaborator Author)

Everything is passing except Miri (need to disable isolation because seize calls num_cpus::get) and MSRV (which I can fix) 🎉

Looks like you need to approve again. I think you can disable the approval requirement in the settings, although GitHub should recognize that you already approved my branch...

@ibraheemdev (Collaborator Author) commented Feb 15, 2022

Can you approve the workflow again? I believe I fixed the Miri issue.

@jonhoo (Owner) left a comment

This looks great! I had a few comments, but I don't think any of them should be very hard to address 👍

    pub(crate) fn boxed(value: T, collector: &Collector) -> Self {
        Shared::from(collector.link_boxed(value))
    }
@jonhoo (Owner):
The fact that a ref to the Collector is needed to construct a Shared has me a little concerned — does every seize-protected value hold a pointer to the collector? If so, that'll pretty significantly increase the size of a map where the key and value types are small. Say they're u32, but now we're sticking in a full usize pointer twice in every entry as well!

@ibraheemdev (Collaborator Author):
The map holds a pointer to the Linked<T>, not the value directly, so it shouldn't increase the size?

@jonhoo (Owner):
It still increases the overall heap memory use of the map significantly. Previously we'd store (for each item):

sizeof(usize) + sizeof(K) + sizeof(usize) + sizeof(V)

whereas if each Linked also holds a pointer to the container, and every K and V is a Linked<_>, then we're now storing

2 * sizeof(usize) + sizeof(K) + 2 * sizeof(usize) + sizeof(V)

which is a fairly significant increase in heap memory for a map when K and V are small, and there are many elements, such as maps to or from integers (which are very common in practice).

@ibraheemdev (Collaborator Author) commented Feb 19, 2022:

Linked actually stores a lot more than that. Linked<()> is 32 bytes, and packs a ton of metadata in there (2 unions). It stores a link to the head of the batch, the recorded epoch value when it was allocated, a link to the next node in the reservation list, the reference count and minimum recorded epoch of the batch, a link to the next node in the batch, and the reservation function 😅 All of this is pretty fundamental to the reclamation algorithm. Some of the data is inlined from what would instead be a thread-local list in crossbeam-epoch/hazard pointers.

How big of an issue do you think this is?

@jonhoo (Owner):

Oooooof, that's pretty painful. For every value? That's definitely worrying. Consider a map with, say, 100k entries, all of which are usizes. Its heap memory use would increase three-fold (from 100k * (2*(8+8)) = 3.2MB to 100k * (2*(8+32+8)) = 9.6MB). Now, whether it's a deal-breaker is harder to say... In its current state, flurry isn't going to be used since its performance is so poor; at least with this change its performance is pretty good, even if its memory footprint is kind of lousy. That said, it would be in Java too, so maybe that's just what we signed up for :p

I think let's still go ahead with this, but maybe add a note about performance and memory characteristics to the docs, and highlight the expected memory use of the map given N, sizeof(K), and sizeof(V).
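The per-entry arithmetic from this thread can be checked mechanically. This sketch assumes a 64-bit target and uses the 32-byte metadata size reported above; `LINKED_META` is a stand-in constant for illustration, not a seize API:

```rust
use std::mem::size_of;

// Reported overhead of seize's per-allocation metadata
// (Linked<()> was measured at 32 bytes in the discussion).
const LINKED_META: usize = 32;

fn main() {
    let ptr = size_of::<usize>(); // 8 on 64-bit targets
    let (k, v) = (size_of::<u64>(), size_of::<u64>()); // small keys/values

    // Before: a pointer plus the boxed value, for both key and value.
    let before = (ptr + k) + (ptr + v);
    // After: each allocation additionally carries the metadata.
    let after = (ptr + LINKED_META + k) + (ptr + LINKED_META + v);

    assert_eq!(before, 32); // 100k entries: 3.2 MB
    assert_eq!(after, 96); // 100k entries: 9.6 MB
    println!(
        "100k entries: {:.1} MB -> {:.1} MB",
        100_000.0 * before as f64 / 1e6,
        100_000.0 * after as f64 / 1e6
    );
}
```

This is where the three-fold figure comes from: 32 bytes per entry before, 96 after, dominated by the metadata when keys and values are machine-word sized.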

@ibraheemdev (Collaborator Author) commented Feb 19, 2022:

Hmm... it might be possible to initially store only the epoch value, and then allocate the 32 bytes needed to track its reservation status only when a node is retired. That would mean we incur an extra pointer indirection any time reference counts need to be decremented.

Do you think that would be worthwhile?

@jonhoo (Owner):

Does that mean an extra pointer indirection on every read? If so, I'd be curious to see the performance impact. But could very well be worth it, yeah!

@ibraheemdev (Collaborator Author):

No not on every read, just in retire and Guard::drop.

@jonhoo (Owner):

That definitely sounds promising then!

@ibraheemdev (Collaborator Author)

> Potentially loosen K/V: 'static bounds now that there is no global collector.

All 'static bounds have been removed. Now that the collector is owned by the map, all retired values will be reclaimed when the map is dropped.

@ibraheemdev (Collaborator Author)

I think the only blocker now is documentation. I'll open an issue about memory usage optimizations.

@ibraheemdev (Collaborator Author)

Alright, I updated the public documentation, but there are still 100+ safety comments that talk about "thread X must be pinned to a lesser epoch" 😅 Do you think it's important that all these comments are updated? They are all correct, of course; the terminology is just wrong (s/pinned to an epoch/marked as active/).

@ibraheemdev ibraheemdev marked this pull request as ready for review February 20, 2022 22:42
@jonhoo (Owner) commented Feb 20, 2022

> Do you think it's important all these comments are updated? They are all correct of course, the terminology is just wrong (s/pinned to an epoch/marked as active).

Unfortunately, yes, I do think it's important to update them, even if the terminology is just slightly off. Contributors will read the safety comments and try to reason from them as to whether their modifications are correct, and if the comment doesn't match the semantics of the protection, it will be very confusing. Hopefully it'll be mostly search/replace though!

@ibraheemdev (Collaborator Author) commented Feb 25, 2022

Updated all the safety comments. I think this can be merged now, provided I didn't make any typos :)

@jonhoo (Owner) left a comment

Mostly minor nits for the updated comments. I think we're almost good to go!

@ibraheemdev
Copy link
Collaborator Author

I think I've addressed all your comments.

@jonhoo (Owner) left a comment

Thanks for sticking with it!

@jonhoo (Owner) commented Feb 26, 2022

Also, I would love to have you as a co-maintainer of this crate if you're up for it! You certainly know the code-base well enough at this point 😅

@jonhoo jonhoo merged commit 669b41f into jonhoo:master Feb 26, 2022
@jonhoo jonhoo mentioned this pull request Feb 26, 2022
@ibraheemdev (Collaborator Author)

Sure! 😄

@jonhoo (Owner) commented Feb 26, 2022

Just sent you a collaborator invite!
