Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the "efficient "pivot pointer" scheme" for Rc<Trait> #54898

Open
Centril opened this issue Oct 7, 2018 · 14 comments
Open

Use the "efficient "pivot pointer" scheme" for Rc<Trait> #54898

Centril opened this issue Oct 7, 2018 · 14 comments
Assignees
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@Centril
Copy link
Contributor

Centril commented Oct 7, 2018

Refiling what remains of rust-lang/rfcs#981 here...

This might already be done... but @eddyb wrote:

[...] it doesn't use the efficient "pivot pointer" scheme, though.
Maybe a sepparate issue should be opened for that.

@Centril Centril added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Oct 7, 2018
@eddyb
Copy link
Member

eddyb commented Nov 3, 2018

cc @nikomatsakis @RalfJung @rkruppe @gankro @nagisa

This is about having e.g. Rc<T> (more specifically, *mut RcBox<T>) always point to the T on the heap, similar to how Rc::{into,from}_raw work.

EDIT: similar, but not identical!
With a pivot, all fields are at constant offsets from the pointer (see #54898 (comment)).

@hanna-kruppe
Copy link
Contributor

I'm not sure what the expected benefit is. For Rc::{into,from}_raw it allows people to use the T without knowing about (or without us exposing) RcBox. While you have an Rc, Deref takes care of that, and there the only benefit I can think of is not having to add the offset to go from the start of RcBox to the T.

That might be nice for Rc to avoid the "fetch align from vtable and align rcbox_ptr.offset(2 usizes) to that alignment" dance, but it instead requires a similarly complex calculation to get at the reference counts (so it penalizes clone/drop in favor of Deref).

Otherwise it doesn't seem worthwhile since it's just a constant offset -- literally one ALU instruction in the worst case, and very often that offset can be folded into subsequent offset calculations or memory accesses (and on x86, even into arithmetic).

@eddyb
Copy link
Member

eddyb commented Nov 3, 2018

One thing to note is that we do this manually if we wanted to test it!
Basically, keep the into_raw representation in Rc instead of the original pointer.

EDIT: see #54898 (comment), it's subtly different: all fields are at constant offsets from the pointer.

I'd be curious to see some numbers. There were some claims that this sort of scheme has its benefits, years ago, and I'm not sure how much backing they have (other than other languages doing it).

@Gankra
Copy link
Contributor

Gankra commented Nov 3, 2018

Penalizing clone/drop in favour of deref sounds like a pretty decent tradeoff to me? Reference count traffic is much lower in Rust than any other language thanks to the borrowchecker.

Also I would expect this change to be a net win on codesize, which is nice.

@eddyb
Copy link
Member

eddyb commented Nov 4, 2018

Another place where pivots could be useful: Thin<T> that keeps the metadata for T just before the T value, in memory, instead of next to the pointer.

We can make Thin<T> point to the T, meaning turning a &Thin<T> into &T doesn't involve anything more than reading the metadata (at a negative offset).

@eddyb
Copy link
Member

eddyb commented Nov 4, 2018

@rkruppe Ahhh I just realized there's some confusion! You wrote:

That might be nice for Rc to avoid the "fetch align from vtable and align rcbox_ptr.offset(2 usizes) to that alignment" dance, but it instead requires a similarly complex calculation to get at the reference counts (so it penalizes clone/drop in favor of Deref).

But there's no need for complex calculation anywhere!
The fields before the pivot are right-aligned instead of left-aligned, meaning, it's always (ptr as *const u8).offset(-(2*size_of::<usize>() as isize)) to get at the first field of RcBox.

I think you can even prove that no matter which field is the "pivot", the total size of the struct can't.

@hanna-kruppe
Copy link
Contributor

With trait objects (or more generally DSTs other than slices), there currently has to be a variable amount of padding between the the refcounts and the T. If you put offset 0 before that padding, the refcount / metadata can be accessed at a constant offset, but then you need an "round up to runtime alignment" calculation to get at the T. If you put offset 0 at the start of the T, the variable padding is before offset 0 and you need a "round down to usize align" step to skip over that padding and get at the usizes.

I guess you're trying to say we should put that padding at the start of the layout so we get [padding, refcounts/metadata, data] instead of [refcounts/metadata, padding, data]? That would require deeper changes to layout than just storing the Rc::into_raw pointer.

@eddyb
Copy link
Member

eddyb commented Nov 4, 2018

@rkruppe The changes are pretty straight-forward, the pivot opt-in would mean the fields in the struct keep their definition order, so all you have to do is compute offsets twice, once for fields before the pivot, and once for fields after (including the pivot).

I mean, yes, there's some effort involved in allowing signed offsets, but they would still be constant.

We already have to deal with weird layouts because of enum variants, I don't think this would be much more of a hassle. In terms of LLVM, the struct type we give it will not include fields before the pivot, which will have to be accessed in a different way.
miri, OTOH, should trivially support negative offsets.

EDIT: hmm, I haven't considered pivoted structs nested in other structs. So it's more like there are two field directions, each with their own extent (but common alignment).

What should size_of::<RcBox<T>>() say? Some generic (unsafe) code getting an RcBox<T> cannot assume its bytes are laid from offset 0 to its size.

EDIT: What about arrays of RcBox<T>? That + unsafe code, has been the poster child of "stride" vs "used size" complications, and pivots are even scarier than that.

I guess we limit the problems by limiting it to unstable code and requiring pivoted structs to be private (meaning they're never exposed in a way that stable code can see them).

@hanna-kruppe
Copy link
Contributor

I'm deferring to you on implementation effort, I was mostly reacting to "oh we can test this by storing Rc::into_raw instead of the RcBox pointer" and assuming we wouldn't change layout at all.

@eddyb
Copy link
Member

eddyb commented Nov 4, 2018

@rkruppe Right, that was a mistake, my bad - we can still test it, it's just a bit more effort than reusing code from Rc::{into,from}_raw.

Also, I added a few edits to the previous post because I overlooked a few aspects, regarding the conceptual consequences, which might be worse than implementation.

@RalfJung
Copy link
Member

RalfJung commented Nov 4, 2018

@eddyb Why would layout have to be involved? This could all be done by Rc itself, right? Ideally with some wrapper for convenience, but with no reason to make this a generic composeable concept.

@RalfJung
Copy link
Member

RalfJung commented Nov 4, 2018

I am imagining a library that takes two types (the part Prefix before the pivot, and the main type T), and that computes the size and alignment needed for the allocation, and then returns a pointer such that the pointer points to the pivot, and subtracting the size of Prefix yields the base address for the part before the pivot. (In particular, the offset from that pointer to the beginning of where Prefix is stored is constant, so accessing any Prefix field is just a constant offset.)

The alignment of the allocation must be the max of the two alignments of Prefix and T; the offset of the T part must be at least size_of::<Prefix> but also be a multiple of the max-alignment; the size of the allocation must be size_of::<T>() + offset. Then adding offset should yield an aligned pointer that we can use for T such that subtracting size_of::<Prefix> is sufficiently aligned for Prefix and hence can be used for that.

@eddyb
Copy link
Member

eddyb commented Nov 4, 2018

@RalfJung Ah, doing this with a library pointer type makes a lot of sense!
Similar to Pin, it avoids having complications built into the language (movability for Pin).

(I don't know why @nikomatsakis and I didn't consider this - maybe it was before we had a solid story for DSTs, and it was a possible way to implement DST fields without dynamic alignment)

@Gankra
Copy link
Contributor

Gankra commented Nov 5, 2018

Just for historical records: I asked the swift devs if they ever messed around with this stuff (they have super optimized their (A)Rc since all classes are ref-counted), but they hadn't because they are limited by objc compatibility.

@Enselic Enselic added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Nov 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants