Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dyn closures shouldn't lose range analysis information of the environment #55145

Open
RReverser opened this issue Oct 17, 2018 · 0 comments
Open
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@RReverser
Copy link
Contributor

Minimal reproducible example:

pub fn test(b: bool) -> Box<dyn Fn() -> bool> {
    assert!(b);
    Box::new(move || b)
}

currently compiles into:

example::test:
        push    rax
        test    edi, edi
        je      .LBB7_3
        mov     edi, 1
        mov     esi, 1
        call    __rust_alloc@PLT
        test    rax, rax
        je      .LBB7_4
        mov     byte ptr [rax], 1
        lea     rdx, [rip + .Lvtable.7]
        pop     rcx
        ret
.LBB7_3:
        call    std::panicking::begin_panic
        ud2
.LBB7_4:
        mov     edi, 1
        mov     esi, 1
        call    alloc::alloc::handle_alloc_error@PLT
        ud2

example::test::{{closure}}:
        mov     al, byte ptr [rdi]
        ret

But given that the b is guaranteed to be true, I'd expect it to be equivalent to inlining a constant.

The only way to propagate such invariants currently seems to be by using the LLVM assume intrinsic which is available only on nightly. E.g. for example above:

#![feature(core_intrinsics)]

use std::intrinsics::assume;

pub fn test(b: bool) -> Box<dyn Fn() -> bool> {
    assert!(b);
    Box::new(move || {
        unsafe { assume(b); }
        b
    })
}

compiles to

example::test:
        push    rax
        test    edi, edi
        je      .LBB7_3
        mov     edi, 1
        mov     esi, 1
        call    qword ptr [rip + __rust_alloc@GOTPCREL]
        test    rax, rax
        je      .LBB7_4
        mov     byte ptr [rax], 1
        lea     rdx, [rip + .L__unnamed_4]
        pop     rcx
        ret
.LBB7_3:
        call    std::panicking::begin_panic
        ud2
.LBB7_4:
        mov     edi, 1
        mov     esi, 1
        call    qword ptr [rip + _ZN5alloc5alloc18handle_alloc_error17hfd3c7484b550d419E@GOTPCREL]
        ud2

example::test::{{closure}}:
        mov     al, 1
        ret

which is much closer to the expected result - function is using a constant as it should. However, this still allocates data for a variable even though it's no longer necessary - it has only one value and isn't used in the optimised function body anyway.

Note that the range analysis issue applies specifically to dynamically dispatched closures (which are still necessary sometimes and are used for generic callbacks), however the second part of the issue (unnecessary allocation) applies to statically dispatched generic closures too. For example:

pub fn test(b: bool) -> Box<impl Fn() -> bool> {
    assert!(b);
    Box::new(move || {
        b
    })
}

compiles to:

example::test:
        push    rax
        test    edi, edi
        je      .LBB6_3
        mov     edi, 1
        mov     esi, 1
        call    qword ptr [rip + __rust_alloc@GOTPCREL]
        test    rax, rax
        je      .LBB6_4
        mov     byte ptr [rax], 1
        pop     rcx
        ret
.LBB6_3:
        call    std::panicking::begin_panic
        ud2
.LBB6_4:
        mov     edi, 1
        mov     esi, 1
        call    qword ptr [rip + _ZN5alloc5alloc18handle_alloc_error17hfd3c7484b550d419E@GOTPCREL]
        ud2

I understand these examples might seem superficial and can be easily optimised by hand, but I hope they do showcase a more generic issue, which also applies to e.g. captures of enum in match branches where captured enum is guaranteed to be within the already matched variants, but this information is not properly propagated and lost to the closure, preventing further optimisations.

@jonas-schievink jonas-schievink added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Mar 1, 2019
@Enselic Enselic added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Nov 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

3 participants