Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ptr::read() is slower than directly dereferencing #73258

Closed
jrmuizel opened this issue Jun 11, 2020 · 5 comments · Fixed by #109035
Closed

ptr::read() is slower than directly dereferencing #73258

jrmuizel opened this issue Jun 11, 2020 · 5 comments · Fixed by #109035
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@jrmuizel
Copy link
Contributor

#[derive(Clone, Copy)]
pub enum Foo {
    A, B, C, D,
}

pub fn foo(v: *const Foo) -> Foo {
    let k: Option<Foo> = unsafe { Some(*v) };
    return k.unwrap();
}

compiles to

example::foo:
        mov     al, byte ptr [rdi]
        ret

but replacing *v with v.read()

#[derive(Clone, Copy)]
pub enum Foo {
    A, B, C, D,
}

pub fn foo(v: *const Foo) -> Foo {
    let k: Option<Foo> = unsafe { Some(v.read()) };
    return k.unwrap();
}

compiles to:

example::foo:
        push    rax
        mov     al, byte ptr [rdi]
        cmp     al, 4
        je      .LBB0_1
        pop     rcx
        ret
.LBB0_1:
        lea     rdi, [rip + .L__unnamed_1]
        lea     rdx, [rip + .L__unnamed_2]
        mov     esi, 43
        call    qword ptr [rip + core::panicking::panic@GOTPCREL]
        ud2

This came up from #71257

@jrmuizel jrmuizel added the C-bug Category: This is a bug. label Jun 11, 2020
@alex
Copy link
Member

alex commented Jun 11, 2020

godbolt for the bad version here: https://rust.godbolt.org/z/CWboVo

took me a second, but the panic is from the unwrap, suggesting that the discriminant range isn't propagated correctly

@jonas-schievink jonas-schievink added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. A-codegen Area: Code generation labels Jun 11, 2020
@jrmuizel
Copy link
Contributor Author

This is presumably because the copy_nonoverlapping in the implementation of ptr::read() does not emit range metadata.

@hanna-kruppe
Copy link
Contributor

This matches my impression from skimming over -print-after-all output, though I'll add that the call to ptr::read() does have the range metadata, so if that metadata got propagated before or during inlining, we'd probably be in a better position. But a simpler solution in the short term might be to implement ptr::read with a compiler intrinsic instead of the current MaybeUninit+copy_nonoverlapping contraption.

@scottmcm
Copy link
Member

LLVM-IR diff: https://rust.godbolt.org/z/vMex9ooxq

Looks like the root cause is that read doesn't get !range metadata:

-  %_3 = load i8, ptr %v, align 1, !dbg !11, !range !13, !noundef !10
+  %tmp.0.copyload = load i8, ptr %v, align 1, !dbg !11

Will hopefully be fixed by #109035

@scottmcm
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants