Implement a Concurrent Generational Typed Arena (and AtomicRef{,Mut}::map) #13777
MozReview-Commit-ID: 8iOALQylOuK
|
r? @emilio |
|
@bors-servo try |
Implement a Concurrent Generational Typed Arena (and AtomicRef{,Mut}::map)
This is a core part of the new incremental restyle architecture.
CC @Manishearth @SimonSapin @heycam @emilio
|
A few comments from skimming this only. |
| let result = Arc::new(AtomicRefCell::new(Arena { | ||
| generation: 1, | ||
| current_index: AtomicIsize::new(0), | ||
| chunk_list: ChunkList::<Item>::new(), |
emilio
Oct 14, 2016
Member
nit: I think ChunkList::new would work here.
| /// operations for callers that don't need to dereference the result. | ||
| pub fn allocate_raw(&self) -> TokenData { | ||
| let idx_signed = self.current_index.fetch_add(1, Ordering::Relaxed); | ||
| if idx_signed < 0 { |
emilio
Oct 14, 2016
Member
Wasn't this undefined behavior? Could we instead use add_fetch and check current_index == 0? Also, you're using isize, which means that on 64-bit you will overflow in the integer conversion below before this point.
emilio
Oct 14, 2016
Member
I mean using add_fetch with an AtomicUsize here, ofc.
|
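For reference, a minimal sketch of the unsigned variant suggested above. Rust's standard library spells the operation `fetch_add` (it returns the value held before the increment) and documents it as wrapping on overflow, so the check can be an explicit bound on the returned value. `IndexCounter`, `allocate_index` and `MAX_INDEX` are hypothetical names for illustration, not identifiers from the patch.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical cap: keep the top bit of a u32 index free so the index
/// can later be shifted left by one bit.
const MAX_INDEX: usize = (1 << 31) - 1;

/// Hypothetical stand-in for the arena's index counter.
struct IndexCounter {
    next: AtomicUsize,
}

impl IndexCounter {
    fn new() -> Self {
        IndexCounter { next: AtomicUsize::new(0) }
    }

    /// Reserves the next index, panicking once the cap is exceeded.
    fn allocate_index(&self) -> u32 {
        // fetch_add returns the previous value; the counter itself wraps
        // on overflow rather than invoking undefined behavior.
        let idx = self.next.fetch_add(1, Ordering::Relaxed);
        assert!(idx <= MAX_INDEX, "arena index space exhausted");
        idx as u32
    }
}
```

On a 32-bit target the counter could in principle wrap after enough panicking calls, so a real implementation would probably want to saturate or abort instead.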
|
||
| self.chunk_list.allocate(idx as usize); | ||
| TokenData { | ||
| generation: self.generation, |
emilio
Oct 14, 2016
Member
nit: Spacing is off here.
|
As another unrelated note, probably a benchmark against concurrent allocations of |
| use std::sync::{Arc, Weak}; | ||
| use std::sync::atomic::{AtomicPtr, AtomicIsize, Ordering}; | ||
|
|
||
| ///! A Concurrent Generational Typed Arena. |
emilio
Oct 14, 2016
Member
nit: I think the preferred style is either /// or //!, but not this one.
Just curious - is there any information available about what this is referring to? |
|
This is the design document: https://docs.google.com/a/mozilla.com/document/d/1TcTAQMm-jIiNgzkDgpO04Pja_aQNRNkrD0CNCCYfbo8/edit?usp=sharing The plan was designed around stylo, but with some tweaks it should work for regular Servo as well. |
|
And the tl;dr is that Servo's incremental restyle is pretty basic, and we believe the new architecture will make it world-class. |
|
|
|
(will retry once I push a few fixups) |
|
Thank you for the quick response! It'll take me a little while to read through the document, but it definitely sounds interesting. Cheers! |
MozReview-Commit-ID: JyTV5VqKG8U
|
@bors-servo try |
|
Benchmark numbers on my machine: sequential_allocations_arena : 216 ns/iter (+/- 126) This suggests there's some room for optimization, particularly around the handles (which appear to be the lion's share of the overhead). I'll look into it more next time, but I think it's probably reasonable to land this and address those issues as a followup. |
|
|
|
Could we get a comparison with the usual allocation stuff from Rust itself? |
|
I have some questions. |
| } | ||
|
|
||
| /// Not publicly exposed, invoked internally by `Token`. The caller must | ||
| /// ensure that the `TokenData` does not represent an expired `Token`. |
nox
Oct 16, 2016
Member
What happens if the caller does not ensure that?
bholley
Oct 16, 2016
Author
Contributor
Token ensures it, and it's an internal API, so it doesn't particularly matter.
|
|
||
| // Shift the index left one bit to leave the bottom bit unused. This | ||
| // allows the callers to use tokens as tagged pointers. | ||
| debug_assert!(self.index & (1_u32 << 31) == 0); |
nox
Oct 16, 2016
Member
Shouldn't that be assert!? It can happen in practice that self.index becomes big enough, right?
bholley
Oct 16, 2016
Author
Contributor
Nope, because allocate_raw panics if we overflow.
|
|
||
| /// Deserializes a TokenData from a `u64`. | ||
| pub fn deserialize(x: u64) -> Self { | ||
| debug_assert!(x & 1 == 0, "Caller should strip the tag bit"); |
nox
Oct 16, 2016
Member
Shouldn't this be assert!? What happens if this is false in release mode?
bholley
Oct 16, 2016
Author
Contributor
It will just get shifted out. But anyway, none of this violates safety because Token::create is unsafe.
That said, it looks like there's a bug right now where deserialize doesn't unshift. I'll fix that and add a test.
|
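To make the shift/unshift point concrete, here is a hedged sketch of a round-tripping pair. The bit layout (generation in the high 32 bits, index shifted left by one in the low 32 bits) and the field widths are assumptions made for illustration; the patch's actual layout is not shown in this thread.

```rust
/// Hypothetical reconstruction of the token payload; field widths and
/// bit layout are assumptions, not taken from the patch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TokenData {
    generation: u32,
    index: u32,
}

impl TokenData {
    /// Packs the token into a u64, leaving bit 0 clear for a caller's tag.
    fn serialize(self) -> u64 {
        // The top bit of the index must be unused, or the shift would
        // spill into the generation bits.
        debug_assert!(self.index & (1_u32 << 31) == 0);
        ((self.generation as u64) << 32) | ((self.index as u64) << 1)
    }

    /// Unpacks a token, undoing the one-bit shift applied by serialize.
    fn deserialize(x: u64) -> Self {
        debug_assert!(x & 1 == 0, "Caller should strip the tag bit");
        TokenData {
            generation: (x >> 32) as u32,
            // Truncate to the low 32 bits, then shift the index back down.
            index: (x as u32) >> 1,
        }
    }
}
```

A round-trip test of the form `deserialize(t.serialize()) == t` would catch the missing unshift mentioned above.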
I've done some interesting analysis that suggests that this arena may be the wrong approach for what we need. I'll write more later this evening or tomorrow. |
|
So, the original motivation for this was that we want to create tons and tons of short-lived TransientStyleData objects during re-styling (on various threads), store them across FFI boundaries, and free them all at once after styling completes. There were a few reasons that an arena was attractive:

(1) is the core architectural reason to use an arena. This originally seemed like a big plus that would potentially reduce our traversal requirements, but given further thought and recent architectural changes I think we're probably going to end up traversing any node with a TSD anyway.

(2) turned out to be a premature optimization, at least with the performance characteristics on my local machine with simple benchmarks. I took the following reference measurements:

Arc clone + release: 13ns (just clone is 6ns)

The upshot here is that heap allocation is almost certainly not a bottleneck for our uses here. It's about equivalent to the memory traffic of memmove initialization of the allocated memory on a cache miss, and is on the same order as any of the various atomic operations that we perform all over the place. Using uninitialized memory, I was able to get the raw allocation performance of the arena down to 10ns (pretty much the cost of the atomic index bump, with the actual allocations amortized). So there's some win to be had, but it mostly disappears once you start interacting with the allocated memory.

As for (3), I think we'll probably be ok, and I don't think fragmentation itself is enough to justify the complexity here.

I think this arena would still be very useful for somebody needing the GC-like characteristics of (1), but given that we don't need them for stylo I can't justify spending any more time on it. I'll re-open a PR if something changes. |
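For readers unfamiliar with the "generational" part, a toy single-threaded sketch of the idea: handles record the generation they were issued in, and resetting the arena frees everything at once and expires every outstanding handle. This is only an illustration under simplifying assumptions; `ToyArena`, `Handle` and `reset` are hypothetical names, and the patch itself is concurrent and chunked.

```rust
/// Hypothetical handle: an index plus the generation it was issued in.
#[derive(Clone, Copy, Debug)]
struct Handle {
    generation: u32,
    index: usize,
}

/// Toy single-threaded arena backed by a Vec (an assumption for brevity).
struct ToyArena<T> {
    generation: u32,
    items: Vec<T>,
}

impl<T> ToyArena<T> {
    fn new() -> Self {
        ToyArena { generation: 1, items: Vec::new() }
    }

    /// Stores a value and returns a handle tagged with the current generation.
    fn allocate(&mut self, value: T) -> Handle {
        self.items.push(value);
        Handle { generation: self.generation, index: self.items.len() - 1 }
    }

    /// Returns None for handles issued in an earlier generation.
    fn get(&self, h: Handle) -> Option<&T> {
        if h.generation != self.generation {
            return None;
        }
        self.items.get(h.index)
    }

    /// Drops every item at once and expires all outstanding handles.
    fn reset(&mut self) {
        self.items.clear();
        self.generation += 1;
    }
}
```

The generation check is what lets stale handles be detected cheaply instead of tracking each allocation's lifetime individually.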
|
For posterity, here are some fixes that should be made if this code ever gets used:
|
FWIW jemalloc already does some arena-ing so arenas aren't that effective in improving allocation performance. |
Implement AtomicRef{,Mut}::map
I was originally bundling this with #13777 but am splitting it out since that's been deprioritized.
r? @SimonSapin
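As an illustration of what a `map` on a borrow guard provides, here is a sketch using the standard library's `std::cell::Ref::map`, which `AtomicRef::map` is presumably modelled on (that correspondence is an assumption; the AtomicRefCell code is not shown in this thread). `Style` and `borrow_color` are hypothetical names.

```rust
use std::cell::{Ref, RefCell};

/// Hypothetical payload type for the example.
struct Style {
    font_size: u32,
    color: String,
}

/// Borrows the cell and narrows the guard to a single field. The returned
/// guard keeps the whole cell borrowed until it is dropped.
fn borrow_color(cell: &RefCell<Style>) -> Ref<'_, str> {
    Ref::map(cell.borrow(), |s| s.color.as_str())
}
```

The caller gets a guard that dereferences to just the `color` string, while the borrow on the enclosing `Style` remains tracked for as long as the guard lives.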
<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/13797)
<!-- Reviewable:end -->