Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suboptimal assembly for Contiguous derive #175

Closed
e00E opened this issue Feb 19, 2023 · 9 comments · Fixed by #200
Closed

Suboptimal assembly for Contiguous derive #175

e00E opened this issue Feb 19, 2023 · 9 comments · Fixed by #200

Comments

@e00E
Copy link
Contributor

e00E commented Feb 19, 2023

The derived implementation of Contiguous::into_integer prevents some optimizations from happening. See the following example in which I have commented the resulting assembly as retrieved through cargo-show-asm on my stable-x86_64-unknown-linux-gnu machine.

use bytemuck::Contiguous;

#[derive(Copy, Clone, Contiguous)]
#[repr(u8)]
pub enum E {
    E0,
    E1,
    E2,
}

impl E {
    // movzx eax, byte ptr [rdi]
    pub fn ref_as(&self) -> Option<Self> {
        Self::from_integer(*self as u8)
    }

    // movzx eax, byte ptr [rdi]
    pub fn ref_into(&self) -> Option<Self> {
        Self::from_integer(self.into_integer())
    }

    // mov eax, edi
    pub fn val_as(self) -> Option<Self> {
        Self::from_integer(self as u8)
    }

    // cmp dil, 3
    // mov eax, 3
    // cmovb eax, edi
    pub fn val_into(self) -> Option<Self> {
        Self::from_integer(self.into_integer())
    }
}

The assembly for E::val_into should be the same as for E::val_as.

This could be an issue with the default forbidden-to-override implementation of Contiguous::into_integer or it could be an issue with rustc. As a workaround the derive macro could emit as Int since it knows what the repr type is. If you feel this is a rustc issue then I am happy to report it there.

@Lokathor
Copy link
Owner

I'd certainly be happy to accept a PR which improves the generated assembly of any of the proc-macros.

@Jarcho
Copy link

Jarcho commented Apr 3, 2023

The issue is cause by the transmute in into_integer. Casting to an integer rather than using transmute does allow the optimizer to preserve the range from the enum.

@scottmcm
Copy link

Hopefully fixed for Rust 1.71 -- try it out on nightly and let me know if it helped.

@e00E
Copy link
Contributor Author

e00E commented Apr 20, 2023

Thanks. Will try to remember to test this again when 1.71 releases.

@e00E
Copy link
Contributor Author

e00E commented Jul 13, 2023

Hopefully fixed for Rust 1.71 -- try it out on nightly and let me know if it helped.

Was not fixed by Rust 1.71. Tried with the stable release today.

@scottmcm
Copy link

scottmcm commented Jul 14, 2023

Oh, I see the problem. When I read

The issue is cause by the transmute in into_integer. Casting to an integer rather than using transmute does allow the optimizer to preserve the range from the enum.

I heard mem::transmute, not transmute!.

With true transmute they now do the same thing (https://rust.godbolt.org/z/Y6PrYbqaW -- so the same that they get merged), but the transmute_copy(&ManuallyDrop dance doesn't get the extra assumes.

So I guess the way forward here is either

  1. Update the derive to override into_integer to an implementation using a cast rather than using the provided impl, as was mentioned in Suboptimal assembly for Contiguous derive #175 (comment), or
  2. Convince libs-api to expose a stable API wrapper for https://doc.rust-lang.org/nightly/std/intrinsics/fn.transmute_unchecked.html

@e00E
Copy link
Contributor Author

e00E commented Jul 14, 2023

Thanks for your analysis. I might look into the derive change.

@Jarcho
Copy link

Jarcho commented Jul 19, 2023

That's my fault since I only reported mem::transmute and not mem::transmute_copy. I'll open an issue for that since it should also get the same treatment.

@Lokathor
Copy link
Owner

aside: now that ptr reading is const fn i hope transmute_copy is const fn soon too

e00E added a commit to e00E/bytemuck that referenced this issue Jul 19, 2023
`from_integer` and `into_integer` are usually provided by the trait's
default implementation. We override this implementation because it goes
through `transmute_copy`, which can lead to inefficient assembly as seen
in Lokathor#175 .
e00E added a commit to e00E/bytemuck that referenced this issue Jul 19, 2023
`from_integer` and `into_integer` are usually provided by the trait's
default implementation. We override this implementation because it goes
through `transmute_copy`, which can lead to inefficient assembly as seen
in Lokathor#175 .
e00E added a commit to e00E/bytemuck that referenced this issue Jul 19, 2023
`from_integer` and `into_integer` are usually provided by the trait's
default implementation. We override this implementation because it goes
through `transmute_copy`, which can lead to inefficient assembly as seen
in Lokathor#175 .
Lokathor pushed a commit that referenced this issue Sep 5, 2023
`from_integer` and `into_integer` are usually provided by the trait's
default implementation. We override this implementation because it goes
through `transmute_copy`, which can lead to inefficient assembly as seen
in #175 .
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants