Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

transmute documentation does not explicitly mention alignment requirement #82493

Closed
Lonami opened this issue Feb 24, 2021 · 8 comments · Fixed by #82592
Closed

transmute documentation does not explicitly mention alignment requirement #82493

Lonami opened this issue Feb 24, 2021 · 8 comments · Fixed by #82592
Labels
A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools C-enhancement Category: An issue proposing an enhancement or a PR with one.

Comments

@Lonami
Copy link
Contributor

Lonami commented Feb 24, 2021

The documentation for std::mem::transmute reads:

Both types must have the same size. Neither the original, nor the result, may be an invalid value.

However, it does not mention that they must also have the same alignment. You would only learn about this if you try to dig into the code examples further down the page:

...
// Therefore, the new inner type must have the
// exact same size, *and the same alignment*, as the old type.
// The same caveats exist for this method as transmute, for
...

Must the types T and U have the same alignment, or is that a particular quirk of the example?

In any case, I think the documentation should explicitly mention alignment, because the compiler succeeds to compile the following with no warnings (playground), and I believe this is undefined behaviour:

#[repr(align(2))]
struct Foo([u8; 8]);

#[repr(align(8))]
struct Bar([u8; 8]);

fn main()  {
    let _x: Bar = unsafe { std::mem::transmute(Foo([0; 8])) };
}

@rustbot modify labels: +T-doc +C-enhancement

@rustbot rustbot added C-enhancement Category: An issue proposing an enhancement or a PR with one. A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools labels Feb 24, 2021
@HeroicKatora
Copy link
Contributor

That looks like well-defined behavior to me. The alignment of T is a relevant property if a transmute produces a reference to T but for values it is the compiler's job to make them addressable at a properly aligned stack location (it doesn't strictly have to if no one takes a pointer to it).

Regarding the quote paragraph, this references a conversion of Vec<&i32> to Vec<Option<&i32>>. The presented method does in fact cast pointers, but pointers to the value types inside the vector: *mut &i32 to *mut Option<&i32>. It is the Vec which requires the pointee types to have the exact same size and alignment as a precondition for Vec::from_raw_parts—and not transmutation.

@Lonami
Copy link
Contributor Author

Lonami commented Feb 24, 2021

Thanks for the clarification! So the documentation could still be updated to clarify when alignment matters, right?

@RalfJung
Copy link
Member

RalfJung commented Feb 27, 2021

I do not see a feasible way for the transmute documentation to explain for all possible types which guarantees matter. Alignment is just one of many examples. Whether you can transmute T to U depends on the details of U (and, less so, T), and I don't think transmute can cover this for all possible U.

You example (the text you quoted) is specific to Vec, and here it is alignment that matters. Another time, you might transmute 3u8 to bool; this is also UB but this time alignment has nothing to do with it. And the list goes on forever.

@RalfJung
Copy link
Member

However, maybe the comment in the example could be worded a bit more carefully to explain that alignment matters in this specific case because we are transmuting a pointer (implicitly as part of the Vec). Do you have any suggestions for how that could be worded?

Honestly, that entire example is a bit weird. It says "still using transmute" but the code this comments doesn't even use transmute.

@Lonami
Copy link
Contributor Author

Lonami commented Feb 27, 2021

I do not see a feasible way for the transmute documentation to explain for all possible types which guarantees matter.

If alignment is not a concern when transmuting values (because the compiler will move the memory to a stack location which is properly aligned), I think it would be worth explicitly saying this. Unless I was explicitly told (like Heroic explained), it would not have occured to me that it's moving into a proper aligned location.

With regards to "explain all possible types", I don't think this is necessary. Saying "The resulting type must be a valid value" might be enough. Or maybe the:

The nomicon has additional documentation.

…should be changed to encourage reading that page more strongly ("Make sure you understand all the dangers of using transmute before using it"), because that page does mention the invalid values thing (but it doesn't mention alignment at all, which is what raised this issue).

Honestly, that entire example is a bit weird

It's not even use transmute directly, it's casting pointers (which, sure, can be seen as a different way of transmuting values), where different rules apply. Maybe the example could be completely removed or moved elsewhere. Or framed differently, something like:

// To "transmute" the contents of a heap-allocated container to something else,
// you must make sure that both the size *and alignment* of the items match,
// so that the internal representation of the container is not affected:
let v_from_raw = unsafe {
    // Ensure the original vector is not dropped.
    let mut v_clone = std::mem::ManuallyDrop::new(v_clone);
    Vec::from_raw_parts(v_clone.as_mut_ptr() as *mut Option<&i32>,
                        v_clone.len(),
                        v_clone.capacity())
};

@RalfJung
Copy link
Member

If alignment is not a concern when transmuting values (because the compiler will move the memory to a stack location which is properly aligned), I think it would be worth explicitly saying this. Unless I was explicitly told (like Heroic explained), it would not have occured to me that it's moving into a proper aligned location.

I'm not a fan of listing all the things that are not a concern. That's a long list. However, you are not the first person to suggest this, so I concede. I guess I will never understand why people expect alignment to be an issue here, but it seems they do, and I don't have to understand everything. ;)

Saying "The resulting type must be a valid value" might be enough.

It already says that (in double-negated form): "Neither the original, nor the result, may be an invalid value."

It's not even use transmute directly, it's casting pointers (which, sure, can be seen as a different way of transmuting values), where different rules apply. Maybe the example could be completely removed or moved elsewhere. Or framed differently, something like:

There are people (myself included) that use the term "to transmute" for any kind of "type-changing reinterpretation of raw bytes", whether it is via transmute, via a pointer cast, or via union field accesses. The rules are the same in each case. However, since this is the docs of the transmute function specifically, I think we should be more precise here, and I like your new framing.

@Lonami
Copy link
Contributor Author

Lonami commented Feb 27, 2021

I guess I will never understand why people expect alignment to be an issue here, but it seems they do

In my case, it's because I'm not sure if it was undefined behaviour to read a type from memory with the wrong alignment. IIRC, there are some CPU architectures that fault when a misaligned memory access occurs? Or maybe I'm misremembering. Basically, I wasn't sure if "interpret this bunch of bytes as this other type" is something that is okay to do when "the bunch of bytes" is misaligned.

It already says that (in double-negated form): "Neither the original, nor the result, may be an invalid value."

My bad, somehow missed it.

@RalfJung
Copy link
Member

It is UB to perform an unaligned read or write. But transmute is a by-value operation, so you anyway have no control over the location at which T or U end up.

Anyway, you made some good suggestions -- do you plan to turn these into a PR, or should we mark this as an E-easy issue for someone else to write the PR?

m-ou-se added a commit to m-ou-se/rust that referenced this issue Mar 6, 2021
Improve transmute docs with further clarifications

Closes rust-lang#82493.

Please let me know if any of the new wording sounds off, English is not my mother tongue.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Mar 6, 2021
Improve transmute docs with further clarifications

Closes rust-lang#82493.

Please let me know if any of the new wording sounds off, English is not my mother tongue.
JohnTitor added a commit to JohnTitor/rust that referenced this issue Mar 7, 2021
Improve transmute docs with further clarifications

Closes rust-lang#82493.

Please let me know if any of the new wording sounds off, English is not my mother tongue.
@bors bors closed this as completed in fbc1741 Mar 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools C-enhancement Category: An issue proposing an enhancement or a PR with one.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants