Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When DST struct's last field is misaligned, memory access generated are for the wrong address #26403

Closed
mukilan opened this issue Jun 18, 2015 · 32 comments

Comments

Projects
None yet
@mukilan
Copy link
Contributor

commented Jun 18, 2015

Example

struct Foo<T: ?Sized + Bar> {
    a: u8,
    b: T
}

trait Bar {
    fn get(&self) -> usize;
}

struct BarImpl {
    i: usize
}

impl Bar for BarImpl {
    fn get(&self) -> usize {
        self.i
    }
}


fn main() {
    let f = Foo { a:1, b: BarImpl { i: 5 } };
    assert!(f.b.get() == 5); // succeeds
    let f = &f as &Foo<Bar>;
    assert!(f.b.get() == 5); // assert fails
}

in playpen

@eddyb

This comment has been minimized.

Copy link
Member

commented Jun 18, 2015

@Aatch

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2015

Hmm, this seems tricky to solve properly, I see three options.

1.Force word alignment on unsized fields. This is the easiest, but won't work for types with higer alignment due to SIMD or similar.
2. Force maybe-unsized types to be packed, but that doesn't seem very appealing.
3. Calculate the appropriate offset based on the alignment information in the vtable. This is the most robust method, but makes extracting the field in an unsized type require a runtime calculation.

@eddyb

This comment has been minimized.

Copy link
Member

commented Jun 19, 2015

@Aatch the preferred solution involves right-aligning fields before the unsized one (as opposed to left-aligning them) and making pointers to the structure point to the field directly.
But it's more work and may not play nicely with LLVM.

That and your 3. are the only two correct solutions I know of.

@Gankro

This comment has been minimized.

Copy link
Contributor

commented Jun 29, 2015

Worth noting that tuples have their last field ?Sized, so any decision here may affect them too.

@eddyb

This comment has been minimized.

Copy link
Member

commented Jun 29, 2015

@Gankro you sure? I'd be very surprised if that were the case, as I've never seen any compiler code supporting unsized tuples (or enums, for that matter).

@Aatch

This comment has been minimized.

Copy link
Contributor

commented Jun 29, 2015

@Gankro

This comment has been minimized.

Copy link
Contributor

commented Jun 29, 2015

https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md#coercions specified this (last bullet of "coerce_inner").

https://github.com/rust-lang/rfcs/blob/master/text/0982-dst-coercion.md does not appear to overturn this decision.

If it doesn't work, it's a bug.

@eddyb

This comment has been minimized.

Copy link
Member

commented Jun 29, 2015

@Gankro 401 is obsolete wrt unsizing. 982 specifies struct only, and that's it. Nothing else is specified atm. If there is a bug, it's that 982 doesn't make it clear that it obsoleted 401's unsizing, which was overreaching and remained unimplemented.

@Gankro

This comment has been minimized.

Copy link
Contributor

commented Jun 29, 2015

Works for me 👍

@pnkfelix

This comment has been minimized.

Copy link
Member

commented Jul 24, 2015

@Aatch wrote (the third of three options):

  1. [...]
  2. [...]
  3. Calculate the appropriate offset based on the alignment information in the vtable. This is the most robust method, but makes extracting the field in an unsized type require a runtime calculation.

Just to be clear: am I correct that the runtime calculation is only necessary when the type T (for the field of type T: ?Sized) has been instantiated with a trait type? I.e. if we instantiated T with the concrete types S or [S; k] (where S: Sized), then there is no need for runtime calculation to determine the field offset, right?

I ask because that scenario does not sound so dire to me.

@eddyb

This comment has been minimized.

Copy link
Member

commented Jul 24, 2015

@pnkfelix Yes, it's only a performance regression for P<Trait> with P being Rc or Arc, but not Box, &_ or any other prefix-less pointer.
Also P<[T]> is not affected, as it has a constant alignment (that of T).

The cheaper solution involves pointing, e.g. Rc<Trait> directly at the Trait and having the ref-counts right-aligned so they are at constant negative offsets from the Trait lvalue.

But since struct layout is not specified (and #[repr(C)] structs must always be Sized, although I'm not sure we enforce it), switching from runtime calculations to the pointer-to-unsized-field scheme should be backwards-compatible.

That said, even if an implementation of @Aatch's 3rd option is likely to be merged, it's still somewhat low-priority, with Rc<Trait> and Arc<Trait> behaving correctly for types of pointer alignment or less.

I expect the 16-byte SIMD alignments to mess things up, but on x64 that only can happen if the allocation itself is not aligned to 16 bytes, as the RcBox prefix is also 16 bytes there.

@pnkfelix

This comment has been minimized.

Copy link
Member

commented Jul 24, 2015

@eddyb yes I would normally be inclined to agree that it is low-priority; I came and looked at this based on @eefriedman 's linking #27023 here ... but I'm not really sure I agree these are fundamentally the same bug. (That is, it seems like a fix for #27023 can happen independently, and sooner, than the fix for #26403 ...)

@pnkfelix

This comment has been minimized.

Copy link
Member

commented Jul 24, 2015

(on further reflection, I think I can see why @eefriedman made the connection between the two bugs -- it seems like the most robust approach to resolving bugs like #27023 may require a fix to this bug as well. Having said that, I'm pretty sure that #27023 can be isolated and fixed on its own.)

@Aatch

This comment has been minimized.

Copy link
Contributor

commented Jul 27, 2015

@pnkfelix just to show that a runtime calculation would only affect things with data before the unsized field:

alignment padding = (alignment - (initial offset % alignment)) % alignment

If offset is 0 then:

alignment padding = (alignment - (0 % alignment)) % alignment
                  = (alignment - 0) % alignment
                  = alignment % alignment
                  = 0

So while it's easy enough to just special-case the codegen for the 0-offset case, in theory, LLVM would be able to constant-fold and inst-simplify it away anyway.

One thing idea I did come up with to try and reduce the performance hit would be to take advantage of branch prediction and do something like this:

let alignment = get_alignment(val);
let offset = if likely(alignment == word_align) {
    calc_offset(initial_offset, word_align)
} else {
    calc_offset(initial_offset, alignment)
};

Where word_align and initial_offset are constants. This should allow the CPU to speculate based on alignment == word_align (which I think is a common case) while waiting for the load for the alignment (which is the real performance hit here). Having thought about it more, the performance hit, while non-zero, probably isn't going to be an issue. Trait objects already represent a sacrifice in performance and the alignment calculation is just a few bitwise operations (since we can assume a power-of-two alignment)

@Diggsey

This comment has been minimized.

Copy link
Contributor

commented Jul 27, 2015

There's another option I think: the compiler could generate a separate vtable for each incorrect alignment of a trait object, whose entries point to thunks which correctly align the this pointer before forwarding to the original method.

The downside is that there would potentially be multiple vtables per trait/type combination (up to one per trait/type/alignment combination). On the other hand, it's simple and fast, and it should be a rare enough occurence to not be a problem in terms of memory usage.

@eddyb

This comment has been minimized.

Copy link
Member

commented Jul 27, 2015

@Diggsey I think you mean "up to one per trait/type/alignment combination".

@Diggsey

This comment has been minimized.

Copy link
Contributor

commented Jul 27, 2015

@eddyb Yes

Actually, it's possible to improve on that idea: you only need up to two vtables per trait/type combination. One for when the alignment is correct, one for when it's incorrect. Also, you don't need separate thunks: prefix each trait method implementation with an instruction sequence which aligns the this pointer. The vtable for a correct alignment skips that sequence, while the vtable for an incorrect alignment does not.

@briansmith

This comment has been minimized.

Copy link

commented Nov 4, 2015

1.Force word alignment on unsized fields. This is the easiest, but won't work for types with higer alignment due to SIMD or similar.

As a temporary measure, could we force maximal alignment on unsized fields? That is, for each platform define a constant that holds the largest alignment required for a type (SIMD or otherwise), and then force all unsized fields to have that alignment. We could then improve the efficiency of the alignment with more complicated things in future versions.

In particular, I am hoping that #[repr(simd)] can be made stable (very) soon and so a very simple and easy solution sounds like a great way to help make that happen.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Nov 5, 2015

I think long term, I favor either dynamic computation or right-justification (the former, dynamic computation, is what @nrc and I settled on when he was doing the initial implementation, but I guess the work never got done.)

Using "maximal" alignment (really: SIMD alignment) on (potentially) DST-ified fields may be plausible, but it is a bit tricky. I guess the question of tuples is an important one. That is, we have to see how much we can identify the set of fields that we would have to align this way, because when you are creating a struct initially you do not necessarily know whether it will become DST-ified after the fact. So you have to conservatively bump up the alignment for the fields all the time.

I am curious to see how hard it would be to implement the dynamic computation: in principle it seems fairly straightforward, possibly easier than "maximal" alignment even. Though introducing the notion that alignment may or may not be dynamically computed is sure to be something of a pain (brings back bad memories of the pre-monomorphization days, though obviously this is a significantly more limited change).

@Gankro

This comment has been minimized.

Copy link
Contributor

commented Nov 5, 2015

There's no such thing as maximal alignment with repr(SIMD). Even if there was, we have simd types that go as high as 512-bit alignment today. afaict you could get away with 128-bit align for a 99% solution, but it potentially penalizes some ops that "work better" at higher alignment, and would be broken for a few ops.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Nov 5, 2015

However, it was pointed out that this may, indeed, be fixed -- or at least
partially so. e.g. there was #27351

On Thu, Nov 5, 2015 at 12:47 PM, Alexis Beingessner <
notifications@github.com> wrote:

There's no such thing as maximal alignment with repr(SIMD). Even if there
was, we have simd types that go as high as 512-bit alignment today. afaict
you could get away with 128-bit align for a 99% solution, but it
potentially penalizes some ops that "work better" at higher alignment, and
would be broken for a few ops.


Reply to this email directly or view it on GitHub
#26403 (comment).

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Nov 5, 2015

OK, just tested --- not completely fixed. But some of the pieces are there!

On Thu, Nov 5, 2015 at 1:56 PM, Nicholas Matsakis nmatsakis@mozilla.com
wrote:

However, it was pointed out that this may, indeed, be fixed -- or at least
partially so. e.g. there was #27351

On Thu, Nov 5, 2015 at 12:47 PM, Alexis Beingessner <
notifications@github.com> wrote:

There's no such thing as maximal alignment with repr(SIMD). Even if
there was, we have simd types that go as high as 512-bit alignment today.
afaict you could get away with 128-bit align for a 99% solution, but it
potentially penalizes some ops that "work better" at higher alignment, and
would be broken for a few ops.


Reply to this email directly or view it on GitHub
#26403 (comment).

@alexcrichton

This comment has been minimized.

Copy link
Member

commented Nov 25, 2015

triage: I-nominated

(seems like something good to at least have categorized)

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Dec 3, 2015

triage: P-medium

@arielb1

This comment has been minimized.

Copy link
Contributor

commented Dec 3, 2015

We decided that implementing dynamic offset calculations is the best option.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Dec 3, 2015

I think we should implement the dynamic solution now. It seems to be the only solution that works for all existing code. @eddyb's idea of adding a pivot field (and right-justifying the items before it) is a nifty optimization, but it requires an annotation or opt-in on the Rc struct -- which means that if there are users who have similar structs, their code would no longer be supported. (Unless we made the choice of pivot field implicit, I suppose.)

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Dec 3, 2015

(Making pivot field implicit would probably not handle #[repr(C)] either)

@Gankro

This comment has been minimized.

Copy link
Contributor

commented Dec 4, 2015

Can you clarify why it would be an opt in? I don't see what the breaking change is, unless people are blindly transmuting Rc's, which is blatant UB.

@Aatch

This comment has been minimized.

Copy link
Contributor

commented Dec 6, 2015

I have implemented the dynamic solution, wasn't too hard to do. I just need to write some tests before making a PR.

@Aatch Aatch self-assigned this Dec 6, 2015

Aatch added a commit to Aatch/rust that referenced this issue Dec 7, 2015

Align pointers to DST fields properly
DST fields, being of an unknown type, are not automatically aligned
properly, so a pointer to the field needs to be aligned using the
information in the vtable.

Fixes rust-lang#26403 and a number of other DST-related bugs discovered while
implementing this.

bors added a commit that referenced this issue Dec 9, 2015

Auto merge of #30245 - Aatch:dynamic-align-dst, r=pnkfelix
Fixes #26403

This adjusts the pointer, if needed, to the correct alignment by using the alignment information in the vtable.

Handling zero might not be necessary, as it shouldn't actually occur. I've left it as it's own commit so it can be removed fairly easily if people don't think it's worth doing. The way it's handled though means that there shouldn't be much impact on performance.

@bors bors closed this in #30245 Dec 9, 2015

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

commented Dec 15, 2015

@Gankro the Rc type would have to opt-in, not its consumers -- if we just made this change for all structs, it would affect what the value of &Foo is in unexpected and surprised ways, is the point.

@DavidMikeSimon

This comment has been minimized.

Copy link

commented Mar 28, 2017

The PR referenced above was merged, so this issue can probably be closed

@DavidMikeSimon

This comment has been minimized.

Copy link

commented Mar 28, 2017

Oh, whoops, it is already closed. Carry on :-)

peterjoel added a commit to peterjoel/nomicon that referenced this issue Oct 9, 2017

Update exotic-sizes.md
[This issue](rust-lang/rust#26403) was fixed quite some time ago. The warning should no longer be necessary.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.