repr(packed) allows invalid unaligned loads #27060
Comments
huonw added I-crash, I-nominated, T-lang, T-compiler labels Jul 16, 2015
I... honestly thought that was expected behaviour. What can possibly be done about this in general, other than making references into packed fields a different type (or forbidding them)?
This is UB, so it's clearly not expected... but yes, it's more of a design flaw than an implementation issue. Note that it's possible to make code like this misbehave even without SIMD, although it's a bit trickier; LLVM performs optimizations based on the alignment of loads.
Yeah, I only used SIMD because it was the simplest way to demonstrate the problem on x86. I believe platforms like ARM are generally stricter about load alignments, so even, say, |
Is there a way to tell LLVM that the value could be unaligned, and so LLVM should emit code that tries to read it in a safe way for the given architecture, even if it results in slower code? |
@retep998: That's exactly what LLVM should have done for a packed struct. This looks like an LLVM codegen bug to me.
@vadimcn this is entirely our (or rather: my) fault. We used to not emit alignment data at all, which caused misaligned accesses for small aggregates (see #23431). But now we unconditionally emit alignment data purely based on the type that we're loading, ignoring where that value is stored. In fact at the point where we create the load, we currently don't even know where that pointer comes from and can't properly handle packed structs. |
Perhaps we need some sort of alignment attribute that we can attach to pointers? |
In general we have no idea where a reference comes from, e.g. |
Can we make creation of references into packed structs unsafe? |
triage: P-high |
rust-highfive added P-high and removed I-nominated labels Jul 23, 2015
nikomatsakis assigned huonw Jul 23, 2015
Seems clear there's a bug here. First step is to evaluate how widely used packed is, so huon is going to do a crater run with |
huonw added a commit to huonw/rust that referenced this issue Jul 23, 2015
huonw added a commit to huonw/rust that referenced this issue Jul 24, 2015
huonw added a commit to huonw/rust that referenced this issue Jul 24, 2015
huonw added a commit to huonw/rust that referenced this issue Jul 24, 2015
huonw added a commit to huonw/image that referenced this issue Jul 24, 2015
huonw added a commit to huonw/X11Cap that referenced this issue Jul 24, 2015
huonw added a commit to huonw/image that referenced this issue Jul 24, 2015
huonw added a commit to huonw/stemmer-rs that referenced this issue Jul 24, 2015
huonw added a commit to huonw/rust-openal that referenced this issue Jul 24, 2015
bors added a commit that referenced this issue Nov 24, 2017
arielb1 added a commit to arielb1/rust that referenced this issue Nov 25, 2017
bors added a commit that referenced this issue Nov 25, 2017
arielb1 added a commit to arielb1/rust that referenced this issue Nov 26, 2017
bors added a commit that referenced this issue Nov 27, 2017
spastorino added a commit to spastorino/rust that referenced this issue Dec 5, 2017
This is fixed by #44884, isn't it? |
@RalfJung Last time I tried, rustc itself was no longer crashing with SIGBUS. But code compiled natively on sparc64 still crashed (rustc being cross-compiled for sparc64) with SIGBUS. I have to perform more tests on sparc64 first to be able to give a qualified answer. |
Ah, this issue now seems to be about two things:
Could somebody summarize exactly what remains to be done here? It is currently hard, although not impossible, to write C FFI code without repr(packed) and repr(packed(N)) that does not invoke undefined behavior. I only really need this for C FFI, so I would be fine with completely forbidding all references to packed structs if that could lead to a minimally usable subset of this being stabilized quicker. Also, the current RFC2366: portable packed SIMD vector types does not mention the interaction between SIMD vector types and packed; it just assumes that all vector types are always stored at a multiple of their alignment. I don't think that implicitly doing unaligned loads would be a sane default, and I'd rather avoid having to assert! on every vector method that &self is properly aligned.
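AFAIK everything is stable already (and has been since 1.0), but the problem is that some code is accepted without unsafe. #46043 tracks turning the warning this generates into a hard error. I don't know about the codegen error this issue was originally about.
Is that because the C side is packed, or because repr(C) somehow doesn't do the right thing?
You do not have to. References are always aligned; only raw pointers may be unaligned and only if they are used with write_unaligned and read_unaligned. We already tell LLVM that references are aligned, so vector types shouldn't be any different here.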
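To make the point about raw pointers concrete, here is a minimal sketch of reading and writing a u32 at an arbitrary byte offset of a buffer, which is the typical FFI or wire-format situation. The helper names read_u32_at and write_u32_at are invented for this illustration; the only real APIs used are std::ptr::read_unaligned and std::ptr::write_unaligned.

```rust
use std::ptr;

/// Hypothetical helper: reads a native-endian u32 at any byte offset of `buf`.
/// The address is usually not 4-byte aligned, so no `&u32` is ever created.
fn read_u32_at(buf: &[u8], offset: usize) -> u32 {
    assert!(offset <= buf.len() && buf.len() - offset >= 4, "read out of bounds");
    // SAFETY: the check above keeps all 4 bytes inside `buf`, and
    // read_unaligned places no alignment requirement on the pointer.
    unsafe { ptr::read_unaligned(buf.as_ptr().add(offset).cast::<u32>()) }
}

/// Hypothetical helper: writes a native-endian u32 at any byte offset of `buf`.
fn write_u32_at(buf: &mut [u8], offset: usize, value: u32) {
    assert!(offset <= buf.len() && buf.len() - offset >= 4, "write out of bounds");
    // SAFETY: same bounds argument as above; write_unaligned tolerates
    // a misaligned destination.
    unsafe { ptr::write_unaligned(buf.as_mut_ptr().add(offset).cast::<u32>(), value) }
}
```

Calling write_u32_at(&mut buf, 1, x) followed by read_u32_at(&buf, 1) round-trips the value even though offset 1 is misaligned for u32, and at no point does a reference to the misaligned location exist.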
Because the C side is packed with |
Is there some reason that everything needs to be aligned?
On at least x86, amd64, and ARM64 I cannot see the harm in telling LLVM
things may not be aligned.
…On Mon, Apr 16, 2018, 5:31 AM Ralf Jung ***@***.***> wrote:
Could somebody summarize exactly what remains to be done here ?
I only really need this for C FFI so I would be fine with completely
forbidding all references to packed structs if that could lead to a
minimally usable subset of this being stabilized quicker.
AFAIK everything is stable already (and has been since 1.0), but the
problem is that some code is accepted without unsafe. #46043
<#46043> tracks turning the
warning this generates into a hard error.
I don't know about the codegen error this issue was originally about.
It is currently hard, although not impossible, to write C FFI code without
repr(packed) and repr(packed(N)) that does not invoke undefined behavior.
Is that because the C side is packed, or because repr(C) somehow doesn't
do the right thing?
I'd rather avoid having to assert! on every vector method that &self is
properly aligned.
You do not have to. References are *always* aligned; only raw pointers
may be unaligned and only if they are used with write_unaligned
<https://doc.rust-lang.org/beta/std/ptr/fn.write_unaligned.html> and
read_unaligned
<https://doc.rust-lang.org/beta/std/ptr/fn.read_unaligned.html>. We
already tell LLVM that references are aligned, so vector types shouldn't be
any different here.
@DemiMarie A compiler shouldn't generate code which produces unaligned access. Depending on the target platform, unaligned accesses can either result in performance penalties or even a crash with SIGBUS. Rust supports more than just x86 where unaligned access doesn't have such a huge impact. I don't know about the performance impact on arm64 though.
That depends on the platform, though. On some platforms, generating an
unaligned load or store might be the best option. Can LLVM do that when it
should?
The user expects the compiler to generate aligned accesses by default since aligned accesses are normally the safest and fastest option.
Yes, LLVM of course does that. It compares the alignment of data with alignment required by the selected instruction and adjusts the pointer/shifts the data if necessary. |
@DemiMarie Notice that alignment doesn't just affect loads. Once we tell LLVM about the alignment (which we do), it is UB for the pointer value to not have the given alignment even if we do not load/store. For example, if we do bit operations on the least significant bits of the pointer, LLVM will optimize them (or so I am told) based on the assumption that these bits all have to be 0 due to alignment. Also see the example of using alignment for layout optimizations that I mentioned at the end of this post.
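As a small illustration of the "low bits are zero" assumption, the following sketch checks the address behind a reference against the alignment of its type. The function names is_aligned_for and reference_is_aligned are hypothetical, and the code only observes the property; it does not show what LLVM actually does with it.

```rust
use std::mem::align_of;

/// Hypothetical helper: true if `addr` is a multiple of T's alignment.
/// Alignments are powers of two, so this is a mask on the low bits.
fn is_aligned_for<T>(addr: usize) -> bool {
    addr & (align_of::<T>() - 1) == 0
}

/// For any valid reference this must return true: the compiler is allowed
/// to assume it, which is why a misaligned `&T` is UB even if never loaded.
fn reference_is_aligned<T>(r: &T) -> bool {
    is_aligned_for::<T>(r as *const T as usize)
}
```

Because the optimizer may treat reference_is_aligned as always true, a check like this written in terms of a reference can be folded away, which is exactly why it cannot be used to detect a misaligned reference after the fact.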
I see. My thought was that on most platforms, telling LLVM pointers are
aligned does not help optimization, so there is no need to do it. Does it
actually help?
Does it matter? The question is: what do we do on those platforms on which unaligned loads are illegal? Should we have two different programming languages?
ibkevg commented Jul 19, 2018
Food for thought, another use case that may flavour the solution here is device drivers. Having control over how a structure lays out in memory is important to systems programmers who work with board support packages and drivers. A common technique in writing C device drivers, where the device registers are memory mapped, is to create a struct that perfectly overlays the registers. gcc and other compilers often require #pragmas to ensure packing and/or alignment are as required and it's not uncommon to see C structs created as fully packed and have the alignment dealt with explicitly by the programmer adding explicit padding fields. (This makes it easy to verify by code inspection that the code matches what the device's data book describes.) You then create a pointer to that struct with the base address of the device's registers. The device registers are read and written by reading and writing to the fields of the struct. You also have to ensure that you use volatile so that the compiler doesn't optimize out register reads not knowing that the device is changing them. I've seen similar methods used for heterogeneous single board computers that communicated via shared memory and also for messages sent over communications channels where the on-the-wire layout was important to match with whatever was on the other end.
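A rough Rust sketch of the register-overlay technique described above, under loud assumptions: the UartRegs layout, its field names, and both accessor functions are invented for illustration and do not describe any real device. Padding is spelled out by hand so every register sits at an aligned offset, and all accesses go through volatile reads and writes on a raw pointer.

```rust
use std::ptr;

/// Hypothetical register block. repr(C) fixes field order, and the explicit
/// padding makes the layout match a (made-up) data sheet: 12 bytes total.
#[repr(C)]
struct UartRegs {
    data: u8,
    _pad0: [u8; 3],
    status: u32,
    control: u32,
}

/// Volatile read of the status register.
/// SAFETY contract: `regs` must point to a live, aligned UartRegs.
unsafe fn read_status(regs: *const UartRegs) -> u32 {
    unsafe { ptr::read_volatile(ptr::addr_of!((*regs).status)) }
}

/// Volatile write of the control register.
/// SAFETY contract: `regs` must point to a live, aligned UartRegs.
unsafe fn set_control(regs: *mut UartRegs, value: u32) {
    unsafe { ptr::write_volatile(ptr::addr_of_mut!((*regs).control), value) }
}
```

In a real driver the pointer would come from the device's base address; for experimenting on a desktop, a pointer to an ordinary UartRegs value in memory works as a stand-in. Because the padding is explicit, no field is misaligned and nothing here needs repr(packed).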
@ibkevg You can do this today, with However these device registers are typically designed so that none of them is at a misaligned address, aren’t they? So |
ibkevg commented Jul 19, 2018
Great - I might have been thrown off the trail by the name #[repr(C)] because this is needed even if there is no C code present at all and a driver is being written in 100% Rust. I’d suggest renaming or adding an alias that isn’t FFI specific. Another thing that led me to write the comment above was skimming this thread and noticing, at one point, some musing about fixing the issue in software by copying values into aligned memory and then reading that, which would lead to extra reads. I didn’t read it carefully but it brought to mind that in the event the compiler for whatever reason can’t decide what’s the right thing to do and decides to play it safe in this way, it could cause trouble for a driver. I’ve worked with devices that have “read clear” registers. So reading the register clears one or more bits in it and when you read it next, it’s gone. Now imagine edge cases of a solution that copies values into memory, reading extra addresses that inadvertently read clears. So I just wanted to offer that caution - but yes it would be strange for hw to be designed that would require unaligned accesses. Even if a part was used that might have been designed for an older memory architecture, it would typically be interfaced to via an FPGA that could present a “normal” set of registers to software, that or tricks using address lines, etc. (Barely related, mostly off topic, war story: not every device is going to be memory mapped :) I did have to interface to an Ethernet switch chip once where we only had an SPI (serial) interface into it that was wired to the CPU’s general purpose I/O registers. My code had direct control of a data line and a clock line by writing to the I/O register and was literally twiddling bits to feed in cmds and read data out. Lowest level code I’ve ever written.)
sfackler referenced this issue Sep 12, 2018: No documentation on why packed types + Drop aren't UB #54148 (Closed)
We have been warning about borrows to packed struct fields requiring unsafe for a long time. How should this issue progress? It is blocking the stabilization of The only unresolved question I can find here is whether we want to make this unsafe in all cases or only in some (e.g. when the specified packing is less than the field's natural alignment). However, there is also the option of making this always safe, by requiring that the reference is directly cast into a raw pointer (e.g. So we could progress this by either turning the current warning (requiring unsafe code) into an error, or by changing the warning to warn about references to fields of packed structs that are not directly turned into pointers, so that we can turn that into an error in the future.
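For readers on a current toolchain, one way to get a raw pointer to a packed field without any intermediate reference is the std::ptr::addr_of_mut! macro, which postdates most of this thread. The sketch below is not what the comment above proposed verbatim; the Header type and double_length function are hypothetical and only show the shape of the "go straight to a raw pointer" idea.

```rust
use std::ptr;

/// Hypothetical packed header: `length` sits at offset 1, so it is
/// misaligned for u32 and must never be borrowed as `&u32`.
#[repr(C, packed)]
struct Header {
    kind: u8,
    length: u32,
}

/// Doubles `length` in place using only a raw pointer to the field.
fn double_length(h: &mut Header) {
    // addr_of_mut! yields *mut u32 directly; no `&mut u32` is created.
    let field: *mut u32 = ptr::addr_of_mut!(h.length);
    // SAFETY: `field` points into `*h`, which is live and exclusively
    // borrowed; the unaligned accessors do not require alignment.
    unsafe {
        let old = field.read_unaligned();
        field.write_unaligned(old.wrapping_mul(2));
    }
}
```

Note that even reading the result back should copy the field by value first (let len = h.length;) rather than passing h.length to something that takes a reference.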
@gnzlbg I'm fine with making this a hard error. There's just one thing: I have some ideas for how to make it even harder in the future to create unaligned references. Namely:
I am a bit worried about having people change the same code twice, that's my only potential reservation with proceeding to make safe-ref-to-packed-field an error. But I don't think that should actually stop us, just bringing it up in case anyone disagrees. |
I'm worried too, which is why I would prefer to change the warning to require
huonw commented Jul 16, 2015 (edited by pnkfelix)
This is now a tracking issue for RFC 1240, "Taking a reference into a struct marked repr(packed) should become unsafe", but originally it was a bug report that led to the development of that RFC.
Original Issue Description
Will print, on playpen:
The assembly for the ok load is:
But for the bad one, it is
Specifically, the movups (unaligned) became a movaps (aligned), but the pointer isn't actually aligned, hence the CPU faults.
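The code from the original report did not survive in this copy of the thread, so the following is not that code. It is a small stand-in sketch, with a hypothetical Packed type, showing why a field of a packed struct can end up at a misaligned address and how it can still be read without creating a reference to it.

```rust
use std::ptr;

/// Hypothetical packed struct: no padding is inserted after `tag`,
/// so `value` starts at byte offset 1 and the whole struct is 5 bytes.
#[repr(C, packed)]
struct Packed {
    tag: u8,
    value: u32,
}

/// Byte offset of `value` inside the struct, computed from raw addresses.
fn value_offset(p: &Packed) -> usize {
    ptr::addr_of!(p.value) as usize - p as *const Packed as usize
}

/// Copying a Copy field out by value is fine: the compiler knows the
/// struct is packed at this point and emits an unaligned-safe load.
fn read_value(p: &Packed) -> u32 {
    p.value
}
```

The problem discussed in this issue starts once such a field is turned into an ordinary reference (for a u32, or for a SIMD vector as in the report): code receiving that reference has no way to know it came from a packed struct and assumes full alignment.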