Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Derive support for pod structs with enums in them? #84

Open
repi opened this issue Jan 2, 2022 · 10 comments
Open

Derive support for pod structs with enums in them? #84

repi opened this issue Jan 2, 2022 · 10 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@repi
Copy link

repi commented Jan 2, 2022

Currently it looks like pod/zeroable structs do not support having enum fields in them, is this correct? As the derive macros for Pod and Zeroable doesn't support enums.

Such as:

#[derive(Copy, Clone, Pod, Zeroable)]
#[repr(C)]
struct TestWithEnum {
  a: u16,
  b: u16,
  e: ContiguousEnum,
}

#[repr(u8)]
#[derive(Clone, Copy, Contiguous)]
enum ContiguousEnum {
  A = 0,
  B = 1,
  C = 2,
}

Though enums that use a primitive representation could be considered pod types (even without the Contiguous additional requirement), and if such an enum implements a variant with value 0 it could support implement Zeroable trait as well right?

One can implement the traits manually for an enum:

unsafe impl bytemuck::Pod for ContiguousEnum {}
unsafe impl bytemuck::Zeroable for ContiguousEnum {}

That said, the crate does specify this as a restriction for what it calls a pod type which could rule out enums, but is a pretty big constraint:

The type must allow any bit pattern (eg: no bool or char, which have illegal bit patterns).

Is this intended as the restriction of why enums are not supported also, or has it simply not been implemented yet in the derive macros? So one could do this instead of manual implementation:

#[repr(u8)]
#[derive(Clone, Copy, Pod, Zeroable, Contiguous)]
enum ContiguousEnum {
  A = 0,
  B = 1,
  C = 2,
}

We have a bunch of use cases in native and shader code where we have pod structs (that do have enums in) that we want to use bytemuck on for the additional compile-time verification that they use the right representation and some limitations, but we are just casting between typed slices and byte slices so maybe what we are after is just a more relaxed pod type that could handle enums and also do not require types to be zeroable (#85)

@repi repi changed the title Support for pod structs with enums in them? Derive support for pod structs with enums in them? Jan 2, 2022
@Lokathor
Copy link
Owner

Lokathor commented Jan 3, 2022

Is this intended as the restriction of why enums are not supported also

Yes.

Here's how things could go wrong:

enum MyEnum {
  Whatever = 0,
  AnotherTag = 1,
}
// this is unsound
unsafe impl Pod for MyEnum {}

let mut my_enum = MyEnum::Whatever;
let x: &mut u8 = cast_mut(&mut my_enum);
*x = u8::MAX; // now `my_enum` holds an illegal bit pattern
match my_enum {
  // this match expression is gonna go sideways
}

Still, many enums can correctly be Zeroable.

@Lokathor
Copy link
Owner

Lokathor commented Jan 3, 2022

As a side note: your TestWithEnum struct will actually have a padding byte as the 6th byte, which I hope the Pod derive would detect and crash on, even if it did otherwise accept the enum.

@CryZe
Copy link
Contributor

CryZe commented Jan 3, 2022

Almost sounds like it might make sense to have a read only variant of Pod (maybe called NoPadding ?) that enums, char, bool and co. could implement.

@Lokathor
Copy link
Owner

Lokathor commented Jan 3, 2022

You're not the first to propose such a thing.

When I was first developing bytemuck I asked in the Discord and everyone considered the ability to convert &T to &[u8] a "not very useful" thing on its own, so I didn't make a separate trait for it.

At this point, it would be a breaking change to insert into the existing Pod: Zeroable trait tree, but just as a totally separate trait would be fine. We could even have it blanket impl for all Pod types.

However, no one has cared enough about it to write the PR. NoPadding would probably be a good name.

@Vrixyz
Copy link

Vrixyz commented Nov 13, 2022

The description of this issue is starting to get obsolete, as current state has improved a lot since its creation. The PR mentioned above #91 then #94 creating NoUninit, AnyBitPattern, and CheckedBitPattern ; included in release 1.9.0.

Coming from gschup/ggrs#40 ; we're "mucking" player input, and it could be a good ergonomic boost to muck that as an enum.

My current blocker as noted on gschup/ggrs#41 (comment) ; is bytes_of (https://docs.rs/bytemuck/latest/bytemuck/fn.bytes_of.html) ; which has a type restriction on NoUninit. I feel like this type restriction could be relaxed to CheckedBitPattern ? But as NoUninit is full of unsafety warnings, I wanted to start the discussion with you before diving in.

@Lokathor
Copy link
Owner

Hey @fu5ha The docs of the CheckedBitPattern derive claim that it's a subtrait of NoUninit, but the code doesn't actually... have that bound. Didn't you add a lot of that stuff?

@fu5ha
Copy link
Collaborator

fu5ha commented Nov 14, 2022

Hey @fu5ha The docs of the CheckedBitPattern derive claim that it's a subtrait of NoUninit, but the code doesn't actually... have that bound. Didn't you add a lot of that stuff?

I think this is just a docs issue for the derive that didn't get updated as the design evolved throughout the PR.

@Vrixyz the short answer is that bytes_of does indeed require NoUninit -- CheckedBitPattern is irrelevant here, because if we already have a safe value that we can view the bytes_of, then that value must be valid, therefore it must have a valid bit pattern, and since bytes_of takes an immutable ref, we can't modify that bit pattern so there's no safety or validity concern there. However, bytes_of does create a reference to all the bytes representing the backing value, and thus in order to satisfy validity constraints, all of those bytes must be initialized, which is not generizably-true for enums with fields.

I don't believe it's worth it to implement support in the derive macro for detecting cases where this is safe, because they're incredibly specific and checking them sufficiently is probably very hard. However, we could/should add this to docs somewhere. This is only possible because rust's fielded #[repr(C)] enums are well-defined as being c-style tagged-enums with an integer discriminant followed by a union of variant field-types

Therefore should be safe to implement NoUninit for an enum with fields if and only if:

  • The enum is marked #[repr(C)]
  • Every variant's field is the same size in bytes
  • Every variant's field is itself NoUninit (i.e. has no padding bytes)

The best way to guarantee the second two would be to make every variant of the enum contain a single tuple-struct. Each variant's field struct could then derive NoUninit and CheckedBitPattern. However, you still have the problem of making each of these variant structs have the same size.

Internally at embark we do a similar thing but making our own tagged unions with raw tag and union, and I developed a helper struct for this purpose, which looks like the following:

/// This type adds some `const PAD` number of "explicit" or "manual" padding
/// bytes to the end of a struct.
///
/// This is useful to make a type not have *real* padding bytes,
/// and therefore be able to be marked as [`bytemuck::NoUninit`]. Specifically,
/// it's used in the `ffi_union` macro to equalize the size of all
/// fields of a union and therefore remove any "real" padding bytes from the union, making
/// it safe to store in WASM memory and pass through the ark module host memory utility functions.
/// It may also be useful in other places.
#[derive(Copy, Clone)]
#[repr(C)]
pub struct TransparentPad<T, const PAD: usize>(pub T, [u8; PAD]);

// SAFETY: Since `[u8; N]` is always Zeroable, this is safe
unsafe impl<T: Zeroable, const PAD: usize> Zeroable for TransparentPad<T, PAD> {}

// SAFETY: Since `[u8; N]` is always AnyBitPattern, this is safe
unsafe impl<T: AnyBitPattern, const PAD: usize> AnyBitPattern for TransparentPad<T, PAD> {}

#[doc(hidden)] // it is only for us in this crate
#[macro_export]
macro_rules! impl_checked_bit_pattern_for_transparent_pad {
    ($inner:ident) => {
        // SAFETY: The extra padding is always AnyBitPattern, so implies CheckedBitPattern, and this just passes
        // down the necessary safety checks to the inner type.
        unsafe impl<const PAD: usize> bytemuck::CheckedBitPattern
            for $crate::TransparentPad<$inner, PAD>
        {
            type Bits = $crate::TransparentPad<<$inner as bytemuck::CheckedBitPattern>::Bits, PAD>;

            fn is_valid_bit_pattern(bits: &Self::Bits) -> bool {
                <$inner as bytemuck::CheckedBitPattern>::is_valid_bit_pattern(&bits.0)
            }
        }
    };
}

impl<T, const PAD: usize> TransparentPad<T, PAD> {
    pub fn new(inner: T) -> Self {
        Self(inner, [0u8; PAD])
    }
}

impl<T, const PAD: usize> AsRef<T> for TransparentPad<T, PAD> {
    fn as_ref(&self) -> &T {
        &self.0
    }
}

impl<T, const PAD: usize> AsMut<T> for TransparentPad<T, PAD> {
    fn as_mut(&mut self) -> &mut T {
        &mut self.0
    }
}

impl<T, const PAD: usize> core::ops::Deref for TransparentPad<T, PAD> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

impl<T, const PAD: usize> core::ops::DerefMut for TransparentPad<T, PAD> {
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.0
    }
}

We use the mentioned ffi_union macro to then auto-wrap the union's fields as TransparentPad<VariantStruct, AUTO_DERIVED_PADDING> so that every field has the same size and explicit padding bytes. You could do the same, though it would probably be easier to just do it manually for a single occurrence. Unfortunately, doing so is definitely a bit error-prone and manual, so care would need to be taken. You'd just need to figure out the proper number of padding bytes for each field to add so that they all have the same size as whatever the largest variant is.

@fu5ha
Copy link
Collaborator

fu5ha commented Nov 14, 2022

Actually now that I think about it, if this is something that has real demand, it might be possible to port the functionality of the ffi_union macro we have to handle this for rerp-c enums under defined circumstances (i.e. the ones defined above)

@Vrixyz
Copy link

Vrixyz commented Nov 14, 2022

Thank you for chiming in ! Very enlightening information 😄

I made progress on my understanding, and I documented my findings there https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=b1e536d5e1a25b4719390dc14a27168a.

Summary: I think your described approach can work 👍, BUT Repr(C) discriminant can get padding applied 😱, leading to uninitialized memory, and that's undefined behaviour from what I understand 🙅.

So the minor correction to your approach would be that the enum has to be marked #[repr(u*)] (where * is adapted depending on final alignment).

@fu5ha
Copy link
Collaborator

fu5ha commented Nov 14, 2022

Ah, good point! Yeah, if it auto assigns a u8 discriminant and your payload is align(>1) then you'll get padding between discriminant and payload. So, you need to make sure align(discriminant) >= align(payload).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

5 participants