Missed optimization on array construction from fixed size byte array #67484

Closed
tesuji opened this issue Dec 21, 2019 · 1 comment
Labels
A-codegen Area: Code generation C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@tesuji
Contributor

tesuji commented Dec 21, 2019

godbolt link: https://rust.godbolt.org/z/MNuq-K

I have two snippets that I expect to compile to the same assembly:

use std::convert::TryInto;
pub fn foo(bytes: [u8; 16]) -> u32 {
    let mut out = [0u32; 4];
    for (a, b) in out.iter_mut().zip(bytes.chunks_exact(4)) {
        *a = u32::from_ne_bytes(b.try_into().unwrap());
    }
    out[3]
}

and

pub fn foo(bytes: [u8; 16]) -> u32 {
    let out: [u32; 4] = unsafe {
        (bytes.as_ptr() as *const u32 as *const [u32; 4]).read_unaligned()
    };
    out[3]
}

The non-optimal one produces this output:

example::foo:
        sub     rsp, 24
        movups  xmm0, xmmword ptr [rdi]
        movaps  xmmword ptr [rsp], xmm0
        mov     eax, dword ptr [rsp + 12]
        add     rsp, 24
        ret

It should instead be:

example::foo:
        mov     eax, dword ptr [rdi + 12]
        ret
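
For anyone who wants to check that the two snippets really are interchangeable before comparing their assembly, here is a minimal sketch. The names `foo_safe`, `foo_unsafe`, and `foo_direct` are hypothetical (both functions in the issue are called `foo`), and `foo_direct` is an added illustration of what the expected two-instruction output computes, not code from the original report.

```rust
use std::convert::TryInto;

/// First snippet from the issue (renamed): decode each 4-byte chunk
/// as a native-endian u32, then return the last element.
pub fn foo_safe(bytes: [u8; 16]) -> u32 {
    let mut out = [0u32; 4];
    for (a, b) in out.iter_mut().zip(bytes.chunks_exact(4)) {
        *a = u32::from_ne_bytes(b.try_into().unwrap());
    }
    out[3]
}

/// Second snippet from the issue (renamed): reinterpret the 16 bytes
/// as [u32; 4] with an unaligned read, then return the last element.
pub fn foo_unsafe(bytes: [u8; 16]) -> u32 {
    let out: [u32; 4] = unsafe {
        // read_unaligned is required because a [u8; 16] is only 1-byte aligned.
        (bytes.as_ptr() as *const u32 as *const [u32; 4]).read_unaligned()
    };
    out[3]
}

/// Hypothetical reference form (an assumption for illustration):
/// a single native-endian load of bytes 12..16, which is what
/// `mov eax, dword ptr [rdi + 12]` amounts to.
pub fn foo_direct(bytes: [u8; 16]) -> u32 {
    u32::from_ne_bytes([bytes[12], bytes[13], bytes[14], bytes[15]])
}
```

All three functions should return the same value for any input, so any difference in the generated assembly is purely a codegen matter. Pasting them into Compiler Explorer with `-O` under different rustc versions is one way to see when the stack round-trip disappears.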

@Centril added the I-slow, A-codegen, and T-compiler labels on Dec 21, 2019
@JohnTitor added the C-enhancement label on Jan 12, 2020
@workingjubilee added the C-optimization label on Oct 8, 2023

@tesuji
Contributor Author

tesuji commented May 20, 2024

Optimized since Rust 1.52

@tesuji tesuji closed this as completed May 20, 2024