Non-literal constant objects are not well optimized comparing to literal constant objects #118557

Open
EFanZh opened this issue Dec 3, 2023 · 2 comments
Labels
C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@EFanZh
Contributor

EFanZh commented Dec 3, 2023

This problem originated from rust-lang/log#599.

Sometimes, multiple function calls can have the same constant arguments:

pub fn test(f: fn(&[u32; 10])) {
    f(&[7; 10]);
    f(&[7; 10]);
    f(&[7; 10]);
    f(&[7; 10]);
}

The compiler recognizes that these arguments have the same value, so it creates a single constant and passes its address to each call:

example::test:
        push    r14
        push    rbx
        push    rax
        mov     r14, rdi
        lea     rbx, [rip + .L__unnamed_1]
        mov     rdi, rbx
        call    r14
        mov     rdi, rbx
        call    r14
        mov     rdi, rbx
        call    r14
        mov     rdi, rbx
        mov     rax, r14
        add     rsp, 8
        pop     rbx
        pop     r14
        jmp     rax

.L__unnamed_1:
        .asciz  "\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000\000\007\000\000"

But sometimes the constant arguments have to be computed by calling additional functions:

pub fn test(f: fn(&[u32; 10])) {
    f(&[std::convert::identity(7); 10]);
    f(&[std::convert::identity(7); 10]);
    f(&[std::convert::identity(7); 10]);
    f(&[std::convert::identity(7); 10]);
}

In that case the compiler does not optimize these constant objects as well as in the first example; it generates code that copies the array onto the stack again before every call:

.LCPI0_0:
        .long   7
        .long   7
        .long   7
        .long   7
example::test:
        push    r14
        push    rbx
        sub     rsp, 40
        mov     rbx, rdi
        movaps  xmm0, xmmword ptr [rip + .LCPI0_0]
        movaps  xmmword ptr [rsp], xmm0
        movaps  xmmword ptr [rsp + 16], xmm0
        movabs  r14, 30064771079
        mov     qword ptr [rsp + 32], r14
        mov     rdi, rsp
        call    rbx
        movaps  xmm0, xmmword ptr [rip + .LCPI0_0]
        movaps  xmmword ptr [rsp], xmm0
        movaps  xmmword ptr [rsp + 16], xmm0
        mov     qword ptr [rsp + 32], r14
        mov     rdi, rsp
        call    rbx
        movaps  xmm0, xmmword ptr [rip + .LCPI0_0]
        movaps  xmmword ptr [rsp], xmm0
        movaps  xmmword ptr [rsp + 16], xmm0
        mov     qword ptr [rsp + 32], r14
        mov     rdi, rsp
        call    rbx
        movaps  xmm0, xmmword ptr [rip + .LCPI0_0]
        movaps  xmmword ptr [rsp], xmm0
        movaps  xmmword ptr [rsp + 16], xmm0
        mov     qword ptr [rsp + 32], r14
        mov     rdi, rsp
        call    rbx
        add     rsp, 40
        pop     rbx
        pop     r14
        ret
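
Reading the assembly above: the array takes 40 bytes of stack, and each call site rewrites it with two 16-byte `movaps` stores plus one 8-byte store of the immediate held in `r14`. The small check below (the helper name `packed_sevens` is hypothetical) shows that this immediate is simply two adjacent `u32` values of 7.

```rust
/// Hypothetical helper: packs two `u32` sevens into one `u64`, which is
/// the same bit pattern as the last two array elements stored together.
pub fn packed_sevens() -> u64 {
    let seven = 7u32 as u64;
    (seven << 32) | seven
}
```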

You can see the comparison here: https://godbolt.org/z/frj9a8TG6.

Additionally, using a const value as a proxy helps:

pub fn test(f: fn(&[u32; 10])) {
    const SEVEN: u32 = std::convert::identity(7);

    f(&[SEVEN; 10]);
    f(&[SEVEN; 10]);
    f(&[SEVEN; 10]);
    f(&[SEVEN; 10]);
}
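
A minimal sketch of why the proxy helps, assuming the usual constant promotion rules: once the call result is bound to a `const` item, the repeat expression mentions only constants, so the borrow can be promoted just like the literal version. The function name `sevens_via_const` is hypothetical.

```rust
// The call is evaluated once, at compile time, when the const is defined.
const SEVEN: u32 = std::convert::identity(7);

/// Hypothetical helper: the array is built only from a const item, so the
/// borrow is eligible for promotion to a `'static` constant.
pub fn sevens_via_const() -> &'static [u32; 10] {
    &[SEVEN; 10]
}
```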

But some functions can’t be used to compute a const value, such as std::panic::Location::caller, so the method above does not always work.
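
For context, a minimal sketch of how `std::panic::Location::caller` is typically used; the helper name `call_site` is hypothetical. The returned location describes the caller's source position, which is fixed for any given call site, yet it is obtained through a function call rather than written as a literal.

```rust
use std::panic::Location;

/// Hypothetical helper: with `#[track_caller]`, `Location::caller()`
/// reports the position of the code that called `call_site`.
#[track_caller]
pub fn call_site() -> &'static Location<'static> {
    Location::caller()
}
```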

@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Dec 3, 2023
@EFanZh EFanZh changed the title Non-literal constant objects are not well optimized as literal constants objects Non-literal constant objects are not well optimized as literal constant objects Dec 3, 2023
@EFanZh EFanZh changed the title Non-literal constant objects are not well optimized as literal constant objects Non-literal constant objects are not well optimized comparing to literal constant objects Dec 3, 2023
@the8472
Member

the8472 commented Dec 3, 2023

Array repeat expressions are not guaranteed to be consts, e.g. this is valid:

use core::sync::atomic::{AtomicU32, Ordering};
static CNT: AtomicU32 = AtomicU32::new(0);

fn foo() -> u32 {
    CNT.fetch_add(1, Ordering::Relaxed)
}

fn main() {
    let _a = &[foo(); 32];
}
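
To make the point concrete, here is a small self-contained sketch (the names `next` and `two_arrays` are hypothetical). A non-const repeat operand is evaluated once at run time and then copied, so two textually identical repeat expressions can produce different arrays.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

static CNT: AtomicU32 = AtomicU32::new(0);

/// Returns a different value on every call, so it cannot be a constant.
fn next() -> u32 {
    CNT.fetch_add(1, Ordering::Relaxed)
}

/// Builds two arrays from the same-looking expression `[next(); 4]`.
/// Each repeat expression calls `next()` once and copies the result.
pub fn two_arrays() -> ([u32; 4], [u32; 4]) {
    let first = [next(); 4];
    let second = [next(); 4];
    (first, second)
}
```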

You can either use a separate const, as in your last example, or (on nightly) use inline consts:

#![feature(inline_const)]

pub fn test(f: fn(&[u32; 10])) {
    f(&[const { std::convert::identity(7) }; 10]);
    f(&[const { std::convert::identity(7) }; 10]);
    f(&[const { std::convert::identity(7) }; 10]);
    f(&[const { std::convert::identity(7) }; 10]);
}
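
On newer toolchains the feature gate should no longer be needed (inline const expressions were stabilized in Rust 1.79, as far as I am aware). A minimal sketch under that assumption; the function name `sevens` is hypothetical:

```rust
/// Hypothetical helper: assumes a toolchain with stable inline const.
pub fn sevens() -> [u32; 10] {
    // The const block is evaluated at compile time, so the repeat operand
    // is a constant rather than a runtime call.
    [const { std::convert::identity(7u32) }; 10]
}
```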

@EFanZh
Contributor Author

EFanZh commented Dec 3, 2023

> Array repeat expressions are not guaranteed to be consts

If a value can’t be determined at compile time, it is understandable that the compiler can’t do the optimization. The problem is that when the value can indeed be determined at compile time, there is an optimization opportunity that the compiler fails to use.

> you can use inline consts

Even if inline consts were stabilized, there are expressions that can’t be enclosed in const blocks but still generate compile-time constant values (like std::panic::Location::caller).

@jieyouxu jieyouxu added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Feb 18, 2024
4 participants