repr(packed) allows invalid unaligned loads #27060

Open
huonw opened this Issue Jul 16, 2015 · 98 comments

Comments

@huonw
Member

huonw commented Jul 16, 2015

This is now a tracking issue for RFC 1240, "Taking a reference into a struct marked repr(packed) should become unsafe", but originally it was a bug report that led to the development of that RFC.


Original Issue Description

#![feature(simd, test)]

extern crate test;

// simd types require high alignment or the CPU faults
#[simd]
#[derive(Debug, Copy, Clone)]
struct f32x4(f32, f32, f32, f32);

#[repr(packed)]
#[derive(Copy, Clone)]
struct Unalign<T>(T);

struct Breakit {
    x: u8,
    y: Unalign<f32x4>
}

fn main() {
    let val = Breakit { x: 0, y: Unalign(f32x4(0.0, 0.0, 0.0, 0.0)) };

    test::black_box(&val);

    println!("before");

    let ok = val.y;
    test::black_box(ok.0);

    println!("middle");

    let bad = val.y.0;
    test::black_box(bad);

    println!("after");
}

Will print, on playpen:

before
middle
playpen: application terminated abnormally with signal 4 (Illegal instruction)

The assembly for the ok load is:

    movups  49(%rsp), %xmm0
    movaps  %xmm0, (%rsp)
    #APP
    #NO_APP

But for the bad one, it is

    movaps  49(%rsp), %xmm0
    movaps  %xmm0, (%rsp)
    #APP
    #NO_APP

Specifically, the movups (unaligned) became a movaps (aligned), but the pointer isn't actually aligned, hence the CPU faults.
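The reproduction above relies on the old `#[simd]` and `test` features, so it is not expected to build on a current stable compiler. The layout that causes the fault can still be inspected, though. The following is only a sketch under stated assumptions: `[f32; 4]` stands in for `f32x4`, `#[repr(C)]` is added to `Breakit` to pin the field order, and `field_offset` is a hypothetical helper that is not part of the original report.

```rust
use std::mem::{align_of, size_of};
use std::ptr::addr_of;

// Same wrapper as in the report: packing lowers the alignment to 1.
#[repr(packed)]
struct Unalign<T>(T);

// Assumption: #[repr(C)] is added here so the field order is fixed.
#[repr(C)]
struct Breakit {
    x: u8,
    y: Unalign<[f32; 4]>, // stand-in for f32x4
}

// Hypothetical helper: byte offset of `y.0` inside `Breakit`.
fn field_offset() -> usize {
    let val = Breakit { x: 0, y: Unalign([0.0; 4]) };
    // addr_of! yields raw pointers, so no reference to the packed field is created.
    let base = addr_of!(val) as usize;
    let field = addr_of!(val.y.0) as usize;
    field - base
}

fn main() {
    let val = Breakit { x: 0, y: Unalign([1.0, 2.0, 3.0, 4.0]) };
    // Reading the field by value is fine: the compiler knows this place
    // is only 1-aligned and must not assume anything stronger.
    let copied: [f32; 4] = val.y.0;
    println!("x = {}, copied = {:?}", val.x, copied);
    println!("align of [f32; 4]          = {}", align_of::<[f32; 4]>());
    println!("align of Unalign<[f32; 4]> = {}", align_of::<Unalign<[f32; 4]>>());
    println!("size of Breakit            = {}", size_of::<Breakit>());
    println!("offset of y.0              = {}", field_offset());
}
```

The payload starts one byte into the struct, so nothing guarantees that its address is a multiple of the payload type's own alignment. That is the situation in which an aligned load instruction is not safe to emit.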

Gankro Jul 16, 2015

Contributor

I... honestly thought that was expected behaviour. What can possibly be done about this in general, other than making references into packed fields a different type (or forbidding them)?

eefriedman Jul 16, 2015

Contributor

This is UB, so it's clearly not expected... but yes, it's more of a design flaw than an implementation issue.

Note that it's possible to make code like this misbehave even without SIMD, although it's a bit trickier; LLVM performs optimizations based on the alignment of loads.

huonw Jul 16, 2015

Member

Yeah, I only used SIMD because it was the simplest way to demonstrate the problem on x86. I believe platforms like ARM are generally stricter about load alignments, so even, say, u16 may crash in simple cases like the above.

retep998 Jul 21, 2015

Member

Is there a way to tell LLVM that the value could be unaligned, and so LLVM should emit code that tries to read it in a safe way for the given architecture, even if it results in slower code?

vadimcn Jul 21, 2015

Contributor

@retep998: That's exactly what LLVM should have done for a packed struct. This looks like an LLVM codegen bug to me.

dotdash Jul 21, 2015

Contributor

@vadimcn this is entirely our (or rather: my) fault. We used to not emit alignment data at all, which caused misaligned accesses for small aggregates (see #23431). But now we unconditionally emit alignment data purely based on the type that we're loading, ignoring where that value is stored. In fact at the point where we create the load, we currently don't even know where that pointer comes from and can't properly handle packed structs.

retep998 Jul 21, 2015

Member

Perhaps we need some sort of alignment attribute that we can attach to pointers?

huonw Jul 21, 2015

Member

In general we have no idea where a reference comes from, e.g. fn foo(x: &f32x4) { let y = *x; ... in some external crate can't know if we happen to pass in &val.y.0 in the code above. The only way for codegen to handle unaligned references properly is to assume pointers are never aligned, which would be extremely unfortunate. We don't currently have type-attributes, so it would be somewhat strange to introduce one here, instead of using types (for example).
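To make the point about external callees concrete, here is a small sketch of the pattern that sidesteps the problem: copy the packed field into an ordinary local and borrow the local instead. `Record`, `double`, and `double_packed` are hypothetical names chosen for illustration, and `double` merely stands in for a function from another crate.

```rust
// Hypothetical packed record: `value` sits at byte offset 1.
#[repr(C, packed)]
struct Record {
    tag: u8,
    value: u32,
}

// Stand-in for a function in another crate: it receives `&u32` and is
// compiled under the assumption that the pointer is suitably aligned.
fn double(x: &u32) -> u32 {
    *x * 2
}

// Copy the packed field into an aligned local first, then borrow the local.
fn double_packed(r: &Record) -> u32 {
    let local: u32 = r.value; // by-value read of a possibly unaligned place
    double(&local)
}

fn main() {
    let r = Record { tag: 1, value: 21 };
    // Passing `&r.value` directly would hand `double` a reference that
    // may be misaligned, which is the case discussed in this thread.
    println!("tag = {}, doubled = {}", { r.tag }, double_packed(&r));
}
```

The callee never sees the packed location, so it can keep assuming aligned references without any change to its signature.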

vadimcn Jul 22, 2015

Contributor

Can we make creation of references into packed structs unsafe?

nikomatsakis Jul 23, 2015

Contributor

triage: P-high

nikomatsakis Jul 23, 2015

Contributor

Seems clear there's a bug here. First step is to evaluate how widely used packed is, so huon is going to do a crater run with #[repr(packed)] feature gated.

huonw added a commit to huonw/rust that referenced this issue Jul 23, 2015

Feature gate repr(packed).
There are some correctness issues due to unaligned internal fields and
references. cc rust-lang#27060.

huonw added a commit to huonw/image that referenced this issue Jul 24, 2015

Remove unnecessary use of `#[repr(packed)]`.
There's some correctness issues with this, so there may be breaking
changes in future. See rust-lang/rust#27060.

huonw added a commit to huonw/X11Cap that referenced this issue Jul 24, 2015

Remove unnecessary use of `#[repr(packed)]`.
This struct is laid out the same way with or without `packed`, since
it's just a few bytes.

The removal is good because there's some correctness issues with it, so
there may be breaking changes to it in future and removing it now will
avoid them all together. See
rust-lang/rust#27060.

huonw added a commit to huonw/image that referenced this issue Jul 24, 2015

Remove unnecessary use of `#[repr(packed)]`.
This struct never seems to be used in a way that requires being packed.

The removal is good because there's some correctness issues with it, so
there may be breaking changes to it in future and removing it now will
avoid them all together. See
rust-lang/rust#27060.

huonw added a commit to huonw/stemmer-rs that referenced this issue Jul 24, 2015

Remove unnecessary use of `#[repr(packed)]`.
This struct is laid out the same way with or without `packed`, since
it is empty.

The removal is good because there's some correctness issues with it, so
there may be breaking changes to it in future and removing it now will
avoid them all together. See
rust-lang/rust#27060.

huonw added a commit to huonw/rust-openal that referenced this issue Jul 24, 2015

Remove unnecessary uses of `#[repr(packed)]`.
These structs are laid out the same way with or without `packed`, since
they're just repeats of a single element type. That is, they're always
going to be three contiguous `f32`s.

The removal is good because there's some correctness issues with it, so
there may be breaking changes to it in future and removing it now will
avoid them all together. See
rust-lang/rust#27060.

arielb1 added a commit to arielb1/rust that referenced this issue Nov 20, 2017

make accesses to fields of packed structs unsafe
To handle packed structs with destructors (which you'll think are a rare
case, but the `#[repr(packed)] struct Packed<T>(T);` pattern is
ever-popular, which requires handling packed structs with destructors to
avoid monomorphization-time errors), drops of subfields of packed
structs should drop a local move of the field instead of the original
one.

cc rust-lang#27060 - this should deal with that issue after codegen of drop glue
is updated.

The new errors need to be changed to future-compatibility warnings, but
I'll rather do a crater run first with them as errors to assess the
impact.

bors added a commit that referenced this issue Nov 22, 2017

Auto merge of #44884 - arielb1:pack-safe, r=nikomatsakis
Make accesses to fields of packed structs unsafe

To handle packed structs with destructors (which you'll think are a rare
case, but the `#[repr(packed)] struct Packed<T>(T);` pattern is
ever-popular, which requires handling packed structs with destructors to
avoid monomorphization-time errors), drops of subfields of packed
structs should drop a local move of the field instead of the original
one.

That's it, I think I'll use a strategy suggested by @Zoxc, where this mir
```
drop(packed_struct.field)
```

is replaced by
```
tmp0 = packed_struct.field;
drop tmp0
```

cc #27060 - this should deal with that issue after codegen of drop glue
is updated.

The new errors need to be changed to future-compatibility warnings, but
I'll rather do a crater run first with them as errors to assess the
impact.

cc @eddyb

Things which still need to be done for this:
 - [ ] - handle `repr(packed)` structs in `derive` the same way I did in `Span`, and use derive there again
 - [ ] - implement the "fix packed drops" pass and call it in both the MIR shim and validated MIR pipelines
 - [ ] - do a crater run
 - [ ] - convert the errors to compatibility warnings
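The effect of the drop strategy described in this commit message can be observed from user code. The sketch below is an illustration under assumptions, not the compiler's own test: `Noisy`, `Outer`, the two counters, and `drop_one` are hypothetical names, and the check only inspects the address that the destructor receives.

```rust
use std::mem::align_of;
use std::sync::atomic::{AtomicUsize, Ordering};

static DROPS: AtomicUsize = AtomicUsize::new(0);
static MISALIGNED_DROPS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical type with a destructor and an alignment greater than 1.
#[allow(dead_code)]
struct Noisy(u64);

impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
        // `&mut self` is a reference, so it is required to be aligned.
        if (self as *mut Noisy as usize) % align_of::<Noisy>() != 0 {
            MISALIGNED_DROPS.fetch_add(1, Ordering::SeqCst);
        }
    }
}

// The pattern named in the commit message.
#[repr(packed)]
struct Packed<T>(T);

// Assumption: #[repr(C)] pins `inner` at byte offset 1.
#[allow(dead_code)]
#[repr(C)]
struct Outer {
    pad: u8,
    inner: Packed<Noisy>,
}

// Drops one `Outer` and returns (destructor calls, misaligned calls)
// observed during that drop.
fn drop_one() -> (usize, usize) {
    let drops_before = DROPS.load(Ordering::SeqCst);
    let bad_before = MISALIGNED_DROPS.load(Ordering::SeqCst);
    drop(Outer { pad: 0, inner: Packed(Noisy(7)) });
    (
        DROPS.load(Ordering::SeqCst) - drops_before,
        MISALIGNED_DROPS.load(Ordering::SeqCst) - bad_before,
    )
}

fn main() {
    let (drops, misaligned) = drop_one();
    println!("drops = {}, misaligned drops = {}", drops, misaligned);
}
```

If the field is first moved into a local as the message describes, the destructor always runs on an aligned value, regardless of where the packed struct itself happens to live.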

bors added a commit that referenced this issue Nov 27, 2017

Auto merge of #44884 - arielb1:pack-safe, r=nikomatsakis,eddyb
Make accesses to fields of packed structs unsafe

spastorino added a commit to spastorino/rust that referenced this issue Dec 5, 2017

make accesses to fields of packed structs unsafe

RalfJung Dec 29, 2017

Member

This is fixed by #44884, isn't it?

glaubitz Dec 29, 2017

Contributor

@RalfJung Last time I tried, rustc itself was no longer crashing with SIGBUS. But code compiled natively on sparc64 still crashed (rustc being cross-compiled for sparc64) with SIGBUS.

I have to perform more tests on sparc64 first to be able to give a qualified answer.

RalfJung Dec 29, 2017

Member

Ah, this issue now seems to be about two things:

  • Tracking issue for RFC 1240. That RFC is now implemented in nightly, but it is a warning currently, not an error: https://play.rust-lang.org/?gist=7ebeed4f9cc64fb970a4ba2e604a5697&version=nightly. Probably the tracking issue should remain open until the RFC is fully implemented, i.e., this is a hard error (in nightly at least)?

  • The code in the first post compiling incorrectly. The code doesn't actually take a reference to an unsafe field, so it seems rather orthogonal to the RFC -- this is purely a codegen issue. I am a little puzzled that these are in the same issue.

gnzlbg Apr 16, 2018

Contributor

Could somebody summarize exactly what remains to be done here?

It is currently hard, although not impossible, to write C FFI code without repr(packed) and repr(packed(N)) that does not invoke undefined behavior. Because of how hard this is, libc, mach, and other low-level OS API libraries prefer to just silently invoke undefined behavior. AFAICT only rust-bindgen consistently does the right thing here, and the results are pretty horrible.

I only really need this for C FFI so I would be fine with completely forbidding all references to packed structs if that could lead to a minimally usable subset of this being stabilized quicker.

Also, the current RFC2366: portable packed SIMD vector types does not mention the interaction between SIMD vector types and packed, it just assumes that all vector types are always stored at a multiple of their alignment. I don't think that implicitly doing unaligned loads would be a sane default, and I'd rather avoid having to assert! on every vector method that &self is properly aligned. I would be fine with RFC2366 requiring that vector types must always be stored at a multiple of their alignment, which would prevent them from being stored in repr(packed) structs.

RalfJung Apr 16, 2018

Member

> Could somebody summarize exactly what remains to be done here?
>
> I only really need this for C FFI so I would be fine with completely forbidding all references to packed structs if that could lead to a minimally usable subset of this being stabilized quicker.

AFAIK everything is stable already (and has been since 1.0), but the problem is that some code is accepted without unsafe. #46043 tracks turning the warning this generates into a hard error.

I don't know about the codegen error this issue was originally about.

> It is currently hard, although not impossible, to write C FFI code without repr(packed) and repr(packed(N)) that does not invoke undefined behavior.

Is that because the C side is packed, or because repr(C) somehow doesn't do the right thing?

> I'd rather avoid having to assert! on every vector method that &self is properly aligned.

You do not have to. References are always aligned; only raw pointers may be unaligned and only if they are used with write_unaligned and read_unaligned. We already tell LLVM that references are aligned, so vector types shouldn't be any different here.
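As a sketch of the raw-pointer route mentioned here, the following reads and writes a packed field through `read_unaligned` and `write_unaligned` without ever forming a reference to it. `Packed` and `bump` are hypothetical names used only for this illustration.

```rust
use std::ptr::addr_of_mut;

// Hypothetical packed struct: `value` sits at byte offset 1.
#[repr(C, packed)]
struct Packed {
    tag: u8,
    value: u64,
}

// Increments `value` through a raw pointer and returns the old value.
fn bump(p: &mut Packed) -> u64 {
    // addr_of_mut! produces a raw pointer directly; no `&mut u64` exists
    // at any point, so no alignment promise is made about the field.
    let field: *mut u64 = addr_of_mut!(p.value);
    // SAFETY: `field` points into a live `Packed` that we borrow mutably.
    // It may be misaligned, so only the unaligned accessors are used.
    unsafe {
        let old = field.read_unaligned();
        field.write_unaligned(old + 1);
        old
    }
}

fn main() {
    let mut p = Packed { tag: 7, value: 41 };
    let old = bump(&mut p);
    let new = p.value; // copy out by value before formatting
    println!("tag = {}, old = {}, new = {}", { p.tag }, old, new);
}
```

The reference `&mut Packed` is itself fine because the packed struct has alignment 1; only the `u64` inside it needs the unaligned treatment.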

gnzlbg Apr 16, 2018

Contributor

> Is that because the C side is packed, or because repr(C) somehow doesn't do the right thing?

Because the C side is packed with #pragma pack N (and/or a similar packed attribute). In many cases repr(packed) isn't enough and one needs repr(packed(N)) which is not stable yet. This issue is one of its unresolved issues.
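For comparison, this is roughly how the different packing levels change a struct's layout. It is a sketch that assumes a current stable compiler, where `repr(packed(N))` is available, and a typical target where `u32` has alignment 4; the struct names and the `layout` helper are hypothetical.

```rust
use std::mem::{align_of, size_of};

// Natural C layout: `b` is padded up to its own alignment.
#[allow(dead_code)]
#[repr(C)]
struct Natural {
    a: u8,
    b: u32,
    c: u16,
}

// Counterpart of a C struct declared under a pack value of 2:
// no field is aligned to more than 2 bytes.
#[allow(dead_code)]
#[repr(C, packed(2))]
struct Pack2 {
    a: u8,
    b: u32,
    c: u16,
}

// Plain `packed` means `packed(1)`: no padding at all.
#[allow(dead_code)]
#[repr(C, packed)]
struct Pack1 {
    a: u8,
    b: u32,
    c: u16,
}

// Hypothetical helper returning (size, alignment) of a type.
fn layout<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

fn main() {
    println!("Natural: {:?}", layout::<Natural>());
    println!("Pack2:   {:?}", layout::<Pack2>());
    println!("Pack1:   {:?}", layout::<Pack1>());
}
```

A binding that uses plain `repr(packed)` where the C header uses a pack value of 2 would get a different size and different field offsets, which is why the `N` form matters for FFI.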

DemiMarie Apr 16, 2018

Contributor

glaubitz Apr 16, 2018

Contributor

@DemiMarie A compiler shouldn't generate code which produces unaligned access. Depending on the target platform, unaligned accesses can either result in performance penalties or even a crash with SIGBUS.

Rust supports more than just x86 where unaligned access doesn't have such a huge impact. I don't know about the performance impact on arm64 though.

@DemiMarie

DemiMarie Apr 16, 2018

Contributor

@glaubitz

glaubitz Apr 16, 2018

Contributor

The user expects the compiler to generate aligned accesses by default, since aligned accesses are normally the safest and fastest option.

@whitequark

whitequark Apr 17, 2018

Contributor

On some platforms, generating an unaligned load or store might be the best option. Can LLVM do that when it should?

Yes, LLVM of course does that. It compares the alignment of the data with the alignment required by the selected instruction and adjusts the pointer or shifts the data if necessary.

@RalfJung

RalfJung Apr 17, 2018

Member

@DemiMarie Notice that alignment doesn't just affect loads. Once we tell LLVM about the alignment (which we do), it is UB for the pointer value to not have the given alignment even if we do not load/store. For example, if we do bit operations on the least significant bits of the pointer, LLVM will optimize them (or so I am told) based on the assumption that these bits all have to be 0 due to alignment. Also see the example of using alignment for layout optimizations that I mentioned at the end of this post.
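
As a rough illustration of the invariant in question (not of what LLVM actually emits), the sketch below masks the low bits of a reference's address. The helper name `low_bits` is hypothetical. For any valid reference the result is zero, which is exactly the fact an optimizer is entitled to assume once it has been told the pointer is aligned:

```rust
use std::mem::align_of;

/// Returns the address bits of `r` that lie below the alignment of `T`.
/// For a valid (aligned) reference this is always 0, so an optimizer
/// that knows the alignment may fold the whole expression to a constant.
fn low_bits<T>(r: &T) -> usize {
    (r as *const T as usize) & (align_of::<T>() - 1)
}

fn main() {
    let x: u64 = 5;
    let y: u16 = 7;
    assert_eq!(low_bits(&x), 0);
    assert_eq!(low_bits(&y), 0);
    println!("low bits are zero for both references");
}
```

If a misaligned reference were ever created, code like this could silently observe a "wrong" zero, which is why merely having such a reference is a problem even without a load or store through it.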

@DemiMarie

DemiMarie Apr 18, 2018

Contributor

@gnzlbg

gnzlbg Apr 18, 2018

Contributor

Does it actually help?

Does it matter? The question is: what do we do on those platforms on which unaligned loads are illegal? Should we have two different programming languages?

@ibkevg

ibkevg Jul 19, 2018

Food for thought, another use case that may flavour the solution here is device drivers.

Having control over how a structure lays out in memory is important to systems programmers who work with board support packages and drivers. A common technique in writing C device drivers, where the device registers are memory mapped, is to create a struct that perfectly overlays the registers. gcc and other compilers often require #pragmas to ensure packing and/or alignment are as required, and it's not uncommon to see C structs created as fully packed, with the alignment dealt with explicitly by the programmer adding explicit padding fields. (This makes it easy to verify by code inspection that the code matches what the device's data book describes.)

You then create a pointer to that struct with the base address of the device's registers. The device registers are read and written by reading and writing to the fields of the struct. You also have to ensure that you use volatile so that the compiler doesn't optimize out register reads not knowing that the device is changing them.

I've seen similar methods used for heterogeneous single board computers that communicated via shared memory and also for messages sent over communications channels where the on-the-wire layout was important to match with whatever was on the other end.

@SimonSapin

SimonSapin Jul 19, 2018

Contributor

@ibkevg You can do this today, with #[repr(C)] structs (for defined layout) and a wrapper type to force accesses to individual registers to be volatile. For example: https://docs.rs/cortex-m/0.5.2/cortex_m/peripheral/cpuid/struct.RegisterBlock.html

However these device registers are typically designed so that none of them is at a misaligned address, aren’t they? So #[repr(C)] is enough and you probably don’t need #[repr(packed)] for this.
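
A minimal sketch of that pattern, written from scratch rather than copied from the cortex-m crate. `VolatileCell`, `RegisterBlock`, and the register names and offsets are all hypothetical, and the block lives in ordinary memory here so the example can run anywhere; on real hardware one would instead obtain a pointer to the block from the device's base address:

```rust
use std::cell::UnsafeCell;
use std::ptr;

/// Hypothetical wrapper that makes every access to the inner value volatile.
#[repr(transparent)]
struct VolatileCell<T: Copy>(UnsafeCell<T>);

impl<T: Copy> VolatileCell<T> {
    fn new(v: T) -> Self {
        VolatileCell(UnsafeCell::new(v))
    }

    fn read(&self) -> T {
        // SAFETY: the pointer comes from a live, properly aligned cell.
        unsafe { ptr::read_volatile(self.0.get()) }
    }

    fn write(&self, v: T) {
        // SAFETY: as above; `UnsafeCell` permits mutation through `&self`.
        unsafe { ptr::write_volatile(self.0.get(), v) }
    }
}

/// Hypothetical register map. `repr(C)` fixes field order and offsets,
/// and the explicit reserved field documents the gap in the map.
#[repr(C)]
struct RegisterBlock {
    status: VolatileCell<u32>,  // offset 0x00
    control: VolatileCell<u32>, // offset 0x04
    data: VolatileCell<u16>,    // offset 0x08
    _reserved: u16,             // offset 0x0A
}

fn main() {
    // Stand-in for a memory-mapped device, placed in normal memory.
    let regs = RegisterBlock {
        status: VolatileCell::new(0),
        control: VolatileCell::new(0),
        data: VolatileCell::new(0),
        _reserved: 0,
    };
    regs.control.write(0x1);
    regs.data.write(0xBEEF);
    assert_eq!(regs.control.read(), 0x1);
    assert_eq!(regs.status.read(), 0);
    assert_eq!(std::mem::size_of::<RegisterBlock>(), 12);
}
```

Because every field is naturally aligned, no `repr(packed)` is involved and references to the individual registers remain well-defined.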

@ibkevg

ibkevg Jul 19, 2018

Great - I might have been thrown off the trail by the name #[repr(C)] because this is needed even if there is no C code present at all and a driver is being written in 100% Rust. I’d suggest renaming or adding an alias that isn’t FFI specific.

Another thing that led me to write the comment above was skimming this thread and noticing, at one point, some musing about fixing the issue in software by copying values into aligned memory and then reading from there, which would lead to extra reads. I didn’t read it carefully, but it brought to mind that if the compiler for whatever reason can’t decide what the right thing to do is and plays it safe in this way, it could cause trouble for a driver. I’ve worked with devices that have “read clear” registers: reading the register clears one or more bits in it, and when you read it next, they are gone. Now imagine edge cases of a solution that copies values into memory and reads extra addresses, inadvertently triggering read clears.

So I just wanted to offer that caution - but yes it would be strange for hw to be designed that would require unaligned accesses. Even if a part was used that might have been designed for an older memory architecture, it would typically be interfaced to via an FPGA that could present a “normal” set of registers to software, that or tricks using address lines, etc.

(Barely related, mostly off topic, war story: not every device is going to be memory mapped :) I did have to interface to an Ethernet switch chip once where we only had an SPI (serial) interface into it that was wired to the CPU’s general purpose I/O registers. My code had direct control of a data line and a clock line by writing to the I/O register and was literally twiddling bits to feed in cmds and read data out. Lowest level code I’ve ever written.)
