Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When DST struct's last field is misaligned, memory access generated are for the wrong address #26403

Closed
mukilan opened this issue Jun 18, 2015 · 32 comments
Assignees
Labels
A-dst Area: Dynamically Sized Types I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@mukilan
Copy link
Contributor

mukilan commented Jun 18, 2015

Example

struct Foo<T: ?Sized + Bar> {
    a: u8,
    b: T
}

trait Bar {
    fn get(&self) -> usize;
}

struct BarImpl {
    i: usize
}

impl Bar for BarImpl {
    fn get(&self) -> usize {
        self.i
    }
}


fn main() {
    let f = Foo { a:1, b: BarImpl { i: 5 } };
    assert!(f.b.get() == 5); // succeeds
    let f = &f as &Foo<Bar>;
    assert!(f.b.get() == 5); // assert fails
}

in playpen

@eddyb
Copy link
Member

eddyb commented Jun 18, 2015

cc @rust-lang/compiler

@arielb1 arielb1 added I-wrong A-dst Area: Dynamically Sized Types labels Jun 18, 2015
@Aatch
Copy link
Contributor

Aatch commented Jun 19, 2015

Hmm, this seems tricky to solve properly, I see three options.

1.Force word alignment on unsized fields. This is the easiest, but won't work for types with higer alignment due to SIMD or similar.
2. Force maybe-unsized types to be packed, but that doesn't seem very appealing.
3. Calculate the appropriate offset based on the alignment information in the vtable. This is the most robust method, but makes extracting the field in an unsized type require a runtime calculation.

@eddyb
Copy link
Member

eddyb commented Jun 19, 2015

@Aatch the preferred solution involves right-aligning fields before the unsized one (as opposed to left-aligning them) and making pointers to the structure point to the field directly.
But it's more work and may not play nicely with LLVM.

That and your 3. are the only two correct solutions I know of.

@Gankra
Copy link
Contributor

Gankra commented Jun 29, 2015

Worth noting that tuples have their last field ?Sized, so any decision here may affect them too.

@eddyb
Copy link
Member

eddyb commented Jun 29, 2015

@gankro you sure? I'd be very surprised if that were the case, as I've never seen any compiler code supporting unsized tuples (or enums, for that matter).

@Aatch
Copy link
Contributor

Aatch commented Jun 29, 2015

@Gankra
Copy link
Contributor

Gankra commented Jun 29, 2015

https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md#coercions specified this (last bullet of "coerce_inner").

https://github.com/rust-lang/rfcs/blob/master/text/0982-dst-coercion.md does not appear to overturn this decision.

If it doesn't work, it's a bug.

@eddyb
Copy link
Member

eddyb commented Jun 29, 2015

@gankro 401 is obsolete wrt unsizing. 982 specifies struct only, and that's it. Nothing else is specified atm. If there is a bug, it's that 982 doesn't make it clear that it obsoleted 401's unsizing, which was overreaching and remained unimplemented.

@Gankra
Copy link
Contributor

Gankra commented Jun 29, 2015

Works for me 👍

@pnkfelix
Copy link
Member

@Aatch wrote (the third of three options):

  1. [...]
  2. [...]
  3. Calculate the appropriate offset based on the alignment information in the vtable. This is the most robust method, but makes extracting the field in an unsized type require a runtime calculation.

Just to be clear: am I correct that the runtime calculation is only necessary when the type T (for the field of type T: ?Sized) has been instantiated with a trait type? I.e. if we instantiated T with the concrete types S or [S; k] (where S: Sized), then there is no need for runtime calculation to determine the field offset, right?

I ask because that scenario does not sound so dire to me.

@eddyb
Copy link
Member

eddyb commented Jul 24, 2015

@pnkfelix Yes, it's only a performance regression for P<Trait> with P being Rc or Arc, but not Box, &_ or any other prefix-less pointer.
Also P<[T]> is not affected, as it has a constant alignment (that of T).

The cheaper solution involves pointing, e.g. Rc<Trait> directly at the Trait and having the ref-counts right-aligned so they are at constant negative offsets from the Trait lvalue.

But since struct layout is not specified (and #[repr(C)] structs must always be Sized, although I'm not sure we enforce it), switching from runtime calculations to the pointer-to-unsized-field scheme should be backwards-compatible.

That said, even if an implementation of @Aatch's 3rd option is likely to be merged, it's still somewhat low-priority, with Rc<Trait> and Arc<Trait> behaving correctly for types of pointer alignment or less.

I expect the 16-byte SIMD alignments to mess things up, but on x64 that only can happen if the allocation itself is not aligned to 16 bytes, as the RcBox prefix is also 16 bytes there.

@pnkfelix
Copy link
Member

@eddyb yes I would normally be inclined to agree that it is low-priority; I came and looked at this based on @eefriedman 's linking #27023 here ... but I'm not really sure I agree these are fundamentally the same bug. (That is, it seems like a fix for #27023 can happen independently, and sooner, than the fix for #26403 ...)

@pnkfelix
Copy link
Member

(on further reflection, I think I can see why @eefriedman made the connection between the two bugs -- it seems like the most robust approach to resolving bugs like #27023 may require a fix to this bug as well. Having said that, I'm pretty sure that #27023 can be isolated and fixed on its own.)

@Aatch
Copy link
Contributor

Aatch commented Jul 27, 2015

@pnkfelix just to show that a runtime calculation would only affect things with data before the unsized field:

alignment padding = (alignment - (initial offset % alignment)) % alignment

If offset is 0 then:

alignment padding = (alignment - (0 % alignment)) % alignment
                  = (alignment - 0) % alignment
                  = alignment % alignment
                  = 0

So while it's easy enough to just special-case the codegen for the 0-offset case, in theory, LLVM would be able to constant-fold and inst-simplify it away anyway.

One thing idea I did come up with to try and reduce the performance hit would be to take advantage of branch prediction and do something like this:

let alignment = get_alignment(val);
let offset = if likely(alignment == word_align) {
    calc_offset(initial_offset, word_align)
} else {
    calc_offset(initial_offset, alignment)
};

Where word_align and initial_offset are constants. This should allow the CPU to speculate based on alignment == word_align (which I think is a common case) while waiting for the load for the alignment (which is the real performance hit here). Having thought about it more, the performance hit, while non-zero, probably isn't going to be an issue. Trait objects already represent a sacrifice in performance and the alignment calculation is just a few bitwise operations (since we can assume a power-of-two alignment)

@Diggsey
Copy link
Contributor

Diggsey commented Jul 27, 2015

There's another option I think: the compiler could generate a separate vtable for each incorrect alignment of a trait object, whose entries point to thunks which correctly align the this pointer before forwarding to the original method.

The downside is that there would potentially be multiple vtables per trait/type combination (up to one per trait/type/alignment combination). On the other hand, it's simple and fast, and it should be a rare enough occurence to not be a problem in terms of memory usage.

@eddyb
Copy link
Member

eddyb commented Jul 27, 2015

@Diggsey I think you mean "up to one per trait/type/alignment combination".

@Diggsey
Copy link
Contributor

Diggsey commented Jul 27, 2015

@eddyb Yes

Actually, it's possible to improve on that idea: you only need up to two vtables per trait/type combination. One for when the alignment is correct, one for when it's incorrect. Also, you don't need separate thunks: prefix each trait method implementation with an instruction sequence which aligns the this pointer. The vtable for a correct alignment skips that sequence, while the vtable for an incorrect alignment does not.

@briansmith
Copy link
Contributor

1.Force word alignment on unsized fields. This is the easiest, but won't work for types with higer alignment due to SIMD or similar.

As a temporary measure, could we force maximal alignment on unsized fields? That is, for each platform define a constant that holds the largest alignment required for a type (SIMD or otherwise), and then force all unsized fields to have that alignment. We could then improve the efficiency of the alignment with more complicated things in future versions.

In particular, I am hoping that #[repr(simd)] can be made stable (very) soon and so a very simple and easy solution sounds like a great way to help make that happen.

@nikomatsakis
Copy link
Contributor

I think long term, I favor either dynamic computation or right-justification (the former, dynamic computation, is what @nrc and I settled on when he was doing the initial implementation, but I guess the work never got done.)

Using "maximal" alignment (really: SIMD alignment) on (potentially) DST-ified fields may be plausible, but it is a bit tricky. I guess the question of tuples is an important one. That is, we have to see how much we can identify the set of fields that we would have to align this way, because when you are creating a struct initially you do not necessarily know whether it will become DST-ified after the fact. So you have to conservatively bump up the alignment for the fields all the time.

I am curious to see how hard it would be to implement the dynamic computation: in principle it seems fairly straightforward, possibly easier than "maximal" alignment even. Though introducing the notion that alignment may or may not be dynamically computed is sure to be something of a pain (brings back bad memories of the pre-monomorphization days, though obviously this is a significantly more limited change).

@Gankra
Copy link
Contributor

Gankra commented Nov 5, 2015

There's no such thing as maximal alignment with repr(SIMD). Even if there was, we have simd types that go as high as 512-bit alignment today. afaict you could get away with 128-bit align for a 99% solution, but it potentially penalizes some ops that "work better" at higher alignment, and would be broken for a few ops.

@nikomatsakis
Copy link
Contributor

However, it was pointed out that this may, indeed, be fixed -- or at least
partially so. e.g. there was #27351

On Thu, Nov 5, 2015 at 12:47 PM, Alexis Beingessner <
notifications@github.com> wrote:

There's no such thing as maximal alignment with repr(SIMD). Even if there
was, we have simd types that go as high as 512-bit alignment today. afaict
you could get away with 128-bit align for a 99% solution, but it
potentially penalizes some ops that "work better" at higher alignment, and
would be broken for a few ops.


Reply to this email directly or view it on GitHub
#26403 (comment).

@nikomatsakis
Copy link
Contributor

OK, just tested --- not completely fixed. But some of the pieces are there!

On Thu, Nov 5, 2015 at 1:56 PM, Nicholas Matsakis nmatsakis@mozilla.com
wrote:

However, it was pointed out that this may, indeed, be fixed -- or at least
partially so. e.g. there was #27351

On Thu, Nov 5, 2015 at 12:47 PM, Alexis Beingessner <
notifications@github.com> wrote:

There's no such thing as maximal alignment with repr(SIMD). Even if
there was, we have simd types that go as high as 512-bit alignment today.
afaict you could get away with 128-bit align for a 99% solution, but it
potentially penalizes some ops that "work better" at higher alignment, and
would be broken for a few ops.


Reply to this email directly or view it on GitHub
#26403 (comment).

@alexcrichton
Copy link
Member

triage: I-nominated

(seems like something good to at least have categorized)

@alexcrichton alexcrichton added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Nov 25, 2015
@Gankra Gankra added the I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness label Nov 25, 2015
@nikomatsakis
Copy link
Contributor

triage: P-medium

@arielb1 arielb1 added the I-needs-decision Issue: In need of a decision. label Dec 3, 2015
@arielb1
Copy link
Contributor

arielb1 commented Dec 3, 2015

We decided that implementing dynamic offset calculations is the best option.

@arielb1 arielb1 removed the I-needs-decision Issue: In need of a decision. label Dec 3, 2015
@nikomatsakis
Copy link
Contributor

I think we should implement the dynamic solution now. It seems to be the only solution that works for all existing code. @eddyb's idea of adding a pivot field (and right-justifying the items before it) is a nifty optimization, but it requires an annotation or opt-in on the Rc struct -- which means that if there are users who have similar structs, their code would no longer be supported. (Unless we made the choice of pivot field implicit, I suppose.)

@nikomatsakis
Copy link
Contributor

(Making pivot field implicit would probably not handle #[repr(C)] either)

@Gankra
Copy link
Contributor

Gankra commented Dec 4, 2015

Can you clarify why it would be an opt in? I don't see what the breaking change is, unless people are blindly transmuting Rc's, which is blatant UB.

@Aatch
Copy link
Contributor

Aatch commented Dec 6, 2015

I have implemented the dynamic solution, wasn't too hard to do. I just need to write some tests before making a PR.

@Aatch Aatch self-assigned this Dec 6, 2015
Aatch added a commit to Aatch/rust that referenced this issue Dec 7, 2015
DST fields, being of an unknown type, are not automatically aligned
properly, so a pointer to the field needs to be aligned using the
information in the vtable.

Fixes rust-lang#26403 and a number of other DST-related bugs discovered while
implementing this.
bors added a commit that referenced this issue Dec 9, 2015
Fixes #26403

This adjusts the pointer, if needed, to the correct alignment by using the alignment information in the vtable.

Handling zero might not be necessary, as it shouldn't actually occur. I've left it as it's own commit so it can be removed fairly easily if people don't think it's worth doing. The way it's handled though means that there shouldn't be much impact on performance.
@nikomatsakis
Copy link
Contributor

@gankro the Rc type would have to opt-in, not its consumers -- if we just made this change for all structs, it would affect what the value of &Foo is in unexpected and surprised ways, is the point.

@DavidMikeSimon
Copy link

The PR referenced above was merged, so this issue can probably be closed

@DavidMikeSimon
Copy link

Oh, whoops, it is already closed. Carry on :-)

peterjoel added a commit to peterjoel/nomicon that referenced this issue Oct 9, 2017
[This issue](rust-lang/rust#26403) was fixed quite some time ago. The warning should no longer be necessary.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-dst Area: Dynamically Sized Types I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests