Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warn and eventually forbid transmute::<T, U> for T or U with unspecified (Rust) layout #50842

Open
nagisa opened this issue May 17, 2018 · 15 comments
Assignees
Labels
A-lint Area: Lints (warnings about flaws in source code) such as unused_mut. C-feature-request Category: A feature request, i.e: not implemented / a PR. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@nagisa
Copy link
Member

nagisa commented May 17, 2018

It is a fairly commonplace mistake to do a transmute::<T, U> for T and U which are not necessarily compatible, but happen to work at that some particular point in time. These transmutes either change in behaviour when the compiler is updated or stop compiling altogether (because the size of T and size of U are not the same anymore (see e.g. #50830)).

The same way we error for mismatching sizes, we can also raise such an error (possibly disable-able by a #[feature(very_fast_such_dangerous)]) for types for which sizes may change over time with future compiler releases. Namely, this would prevent using transmute on stable for anything that

  1. Does not have one of the whitelisted #[repr()] attribute on top of the type declaration;
  2. Is not a FFI-safe type in the first place.

If we did that already, the issue linked before would’ve failed, because none of the enums have a #[repr] attribute on top of them, making their layout unspecified and therefore not transmutable.

@nagisa
Copy link
Member Author

nagisa commented May 17, 2018

cc @nikomatsakis


I’ll be picking at the implementation of this during the impl days.

@nagisa nagisa self-assigned this May 17, 2018
@kennytm kennytm added A-lint Area: Lints (warnings about flaws in source code) such as unused_mut. C-feature-request Category: A feature request, i.e: not implemented / a PR. labels May 17, 2018
@the8472
Copy link
Member

the8472 commented May 17, 2018

Couldn't code assert the layout compatibility of all fields with offset_from to make such transmutes safe and have a fallback codepath in case future changes alter the layout?

@est31
Copy link
Member

est31 commented May 17, 2018

There is prior art for this lint in the improper_ctypes lint which only fires for ffi interfaces though. Can the two be unified or share infrastructure?

Also, what about pointer casts?

Last question, as monomorphization time errors are ugly: We could add a Cast<U> builtin trait that is implemented for T iff it is castable to U. I guess the ship has sailed on adding that bound to transmute?

@hanna-kruppe
Copy link
Contributor

There is prior art for this lint in the improper_ctypes lint which only fires for ffi interfaces though. Can the two be unified or share infrastructure?

I don't think they have a lot of common ground, neither in concept (non-C types can still have defined layout) nor in implementation.

Also, what about pointer casts?

What about it? Do you want to lint for *const T as *const U for T, U with unspecified layout?

Last question, as monomorphization time errors are ugly: We could add a Cast builtin trait that is implemented for T iff it is castable to U. I guess the ship has sailed on adding that bound to transmute?

Precisely to avoid monomorphization time errors, transmute already doesn't work on type parameters (or more precisely, on types whose size depends on a type parameter).

@est31
Copy link
Member

est31 commented May 17, 2018

Thanks for the answers this clarified some stuff.

Do you want to lint for *const T as *const U for T, U with unspecified layout?

How is *const T as *const U different from a transmute::<T,U> invocation?

@hanna-kruppe
Copy link
Contributor

How is *const T as *const U different from a transmute::<T,U> invocation?

For starters, the pointer cast itself doesn't do anything of note and is always well-defined (assuming T and U are Sized, at least). It can be very useful even without ever dereferencing the pointer (e.g., smuggling a Rust type through a C API that deals in void *).

It's true that you can build your own transmute-like horror out of pointer casts and other primitives, but pointer casts alone are much more tame and general and more frequently necessary -- in fact many uses of transmute are best replaced with simpler pointer casts!

@est31
Copy link
Member

est31 commented May 18, 2018

@rkruppe makes sense, you convinced me.

@nagisa
Copy link
Member Author

nagisa commented Jul 12, 2018

Experiments have been done in #51294 and conclusions drawn.


Sorted by impact, these are the "unspecified" layout types that are transmuted the most from or to in the crates ecosystem:

  • &[T], &str, &variant_type::VariantTy (from glib, a repr(Rust) newtype over str);
  • Box<T> (for both T: Sized and T: ?Sized; is it specified to be same as a pointer to T?);
  • *mut dyn T, *const dyn T, &mut dyn T, &dyn T;
  • std::sync::Arc<T> (mostly by futures?);
  • std::net::SocketAddrV4, std::net::SocketAddrV6 (crate socket2);
  • Result;
  • utf8_char::Utf8Char (crate encode_unicode);
  • *mut Self (crate unsafe_any);
  • repr(Rust) structs, tuples;
  • Identity transmutes that only transmute the lifetime;
  • Transmutation to change the mutability of reference from immutable to mutable (even if it only occurs in one crate… what?);

In the compiler/libstd the following transmutes are common:

  • Identity transmutes that only "transmute" the lifetime;
  • Transmutes to/from raw::TraitObject;
  • Transmutes from Box<Trait + Trait2> to Box<AnotherSetOfTraits>;

Many of these are genuine mistakes users make when transmuting, some others are "technically" safe from the standpoint of the layout (e.g. identity transmutes or lifetime transmutes) but questionable in other ways.

Note, that implementation of such check broke some 7k of the crates transitively. If it was a proper lint, the impact would be lesser, but still very widespread.

@nagisa nagisa added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jul 12, 2018
@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 31, 2018

Note, that implementation of such check broke some 7k of the crates transitively. If it was a proper lint, the impact would be lesser, but still very widespread.

Is there a single dependency breaking all those crates, or are those 7k crates using transmute incorrectly ?

Also, what's the plan here? I'd rather have this as a warning that can be turned off, than have nothing. Ideally, maybe one day some of those crates will be fixed, and we can enable this as a hard error.

@RalfJung
Copy link
Member

RalfJung commented Aug 31, 2018

We can also start by linting the less frequent cases. I am much more worried about people transmuting Result (any repr(Rust) enum, really), or a repr(Rust) struct, than about transmuting Box or slices.

@nagisa
Copy link
Member Author

nagisa commented Aug 31, 2018

Is there a single dependency breaking all those crates, or are those 7k crates using transmute incorrectly ?

There is a considerable number of crates using transmute incorrectly. Majority of the 7k broken crates is, of course, due to them depending on crates that use transmute incorrectly.

@est31
Copy link
Member

est31 commented Aug 31, 2018

Once there is an allow by default compatibility lint, there could be an effort/wg/call for participation where people send pull requests to the crates with issues.

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 31, 2018

So I have gone through dozens of the failures here and there, and AFAICT for most crates there is nothing to fix. The only reason they fail is because they are still stuck on some older version of log, rand, ... which used to use a transmute, but where the newer versions don't do this anymore.

It is pretty hard to interpret how much breakage this will cause from the performed crater runs because all builds fail once a dependency fails to build. It would be way more informative to disable the transmute lint while compiling dependencies.

@DemiMarie
Copy link
Contributor

So here are some transmutes that are always safe:

  • Transmuting from any type to itself is never UB:

    /// A safe (though overly complex) identity function.
    fn overcomplicated_identity<T>(a: T) -> T {
        unsafe { std::mem::transmute(a) }
    }
  • Transmutes that are dead code are never UB:

    /// Check that the two arguments are the same size.  Otherwise, fails at compile-time.
    macro_rules! static_assert_same_size {
        ($a:ty, $b:ty) => {
            (if false {
                let _: $a = unsafe { std::mem::transmute(std::mem::zeroed::<$b>()) };
            })
        }
    }

@RalfJung
Copy link
Member

RalfJung commented Sep 6, 2023

One trouble with this lint is that there's really 2 reasons a transmute could be justified:

  • source and target type both have sufficiently well-defined layout that one can entirely predict what happens
  • source and target type are "sufficiently equal" such that even though their layout is not well-defined, it's known to be the same for source and target

The first question overlaps with #72774, so code could be shared with improper_ctypes. We have a a bunch of types that definitely have well-defined layout (most primitives [but not all, e.g. wide pointers/references], arrays, repr(C), repr(transparent), things explicitly documented in the standard library), and then a large grey area beyond that.

The second question is mostly independent and captures cases like "types only differ in lifetime". Here we have very little guidance, even for seemingly simple things like replacing i32 by u32 or vice versa.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-lint Area: Lints (warnings about flaws in source code) such as unused_mut. C-feature-request Category: A feature request, i.e: not implemented / a PR. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

8 participants