Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Worse codegen for a get(a..a+N) slice range check on beta #112509

Closed
adrian17 opened this issue Jun 10, 2023 · 7 comments · Fixed by #125347
Closed

Worse codegen for a get(a..a+N) slice range check on beta #112509

adrian17 opened this issue Jun 10, 2023 · 7 comments · Fixed by #125347
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. I-slow Issue: Problems and improvements with respect to performance of generated code. P-medium Medium priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Milestone

Comments

@adrian17
Copy link

adrian17 commented Jun 10, 2023

Code

Godbolt repro: https://godbolt.org/z/Yv3qYqq4W

Variant A: check a.., then check ..a+N.

pub fn write_u8(
    bytes: &mut [u8],
    buf: u8,
    offset: usize,
) -> Result<(), Box<dyn std::error::Error>> {
    let buf = buf.to_le_bytes();
    bytes
        .get_mut(offset..).and_then(|bytes| bytes.get_mut(..buf.len()))
        .ok_or_else(|| "RangeError: The specified range is invalid")?
        .copy_from_slice(&buf);
    Ok(())
}

Variant B: check a..a+N in one call.

pub fn write_u8(
    bytes: &mut [u8],
    buf: u8,
    offset: usize,
) -> Result<(), Box<dyn std::error::Error>> {
    let buf = buf.to_le_bytes();
    bytes
        .get_mut(offset..offset+buf.len())         // <- changed line
        .ok_or_else(|| "RangeError: The specified range is invalid")?
        .copy_from_slice(&buf);
    Ok(())
}

Note that N is a constant (1), as it's a size of an u8. Also note that that this example can be extended to other types of buf: u16, u32, [u8; 17] etc.

Results

  • Variant A on stable 1.70: generates a single bytes.len() <= offset check, then sets the byte and returns Ok - perfect. (I wish the push/pop were only on the error path, but that's off topic).
  • Variant B on stable 1.70: generates two branches. (I wish it was the same as A, but I guess there could be some overflow related differences.)
  • Variant A on beta 1.71: same as on stable.
  • Variant B on beta 1.71: seemingly, nonsense? The "happy path" is much longer and contains several conditional moves; also, the lea rdi, [rip + .L__unnamed_1] looks like it moved address calculation of the error string to the happy path.

This issue concerns the last one.

I did an ad-hoc microbenchmark that shows Variant B on Beta (and nightly) to become 2-3x slower, but I'm not sure if that benchmark needs attaching - to me it's clear from the disassembly that the happy path codegen is now worse.

@adrian17 adrian17 added C-bug Category: This is a bug. regression-untriaged Untriaged performance or correctness regression. labels Jun 10, 2023
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jun 10, 2023
@saethlin
Copy link
Member

saethlin commented Jun 10, 2023

I think the change is that the MIR inliner now inlines Option::ok_or_else: https://godbolt.org/z/9Gz43cdvW If you add --emit=mir and look at the calls, that's the only difference from the increment of the inlining threshold that I can see. (yes I could do godbolt diff view but then I'd want 6 panes and that's a lot)

The follow-up question is why LLVM can't still optimize the MIR.

@apiraino apiraino added the regression-from-stable-to-beta Performance or correctness regression from stable to beta. label Jun 13, 2023
@Jules-Bertholet
Copy link
Contributor

@rustbot label -regression-untriaged

@rustbot rustbot removed the regression-untriaged Untriaged performance or correctness regression. label Jun 15, 2023
@nikic nikic added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Jun 15, 2023
@Mark-Simulacrum Mark-Simulacrum added this to the 1.71.0 milestone Jul 7, 2023
@nikic nikic added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Jul 7, 2023
@nikic
Copy link
Contributor

nikic commented Jul 7, 2023

llvm/llvm-project#63749

@adrian17
Copy link
Author

adrian17 commented Jul 8, 2023

Just to be sure, would this missing fold also solve the same issue with range sizes bigger than 1? Like: https://godbolt.org/z/KxbE97faE

@nikic
Copy link
Contributor

nikic commented Jul 8, 2023

No, that does not generalize. The previous issue was about the difference on 1.70.

The regression on 1.71 would be llvm/llvm-project#63756.

@nikic nikic self-assigned this Jul 12, 2023
@apiraino
Copy link
Contributor

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-medium

@rustbot rustbot added P-medium Medium priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jul 19, 2023
@apiraino apiraino added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jul 19, 2023
@nikic nikic added the E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. label Aug 14, 2023
@nikic
Copy link
Contributor

nikic commented Aug 14, 2023

Fixed by #114048, needs codegen test.

@nikic nikic removed their assignment Aug 14, 2023
@Mark-Simulacrum Mark-Simulacrum added regression-from-stable-to-stable Performance or correctness regression from one stable version to another. and removed regression-from-stable-to-beta Performance or correctness regression from stable to beta. labels Aug 18, 2023
tesuji added a commit to tesuji/rustc that referenced this issue May 20, 2024
tesuji added a commit to tesuji/rustc that referenced this issue Jun 8, 2024
tesuji added a commit to tesuji/rustc that referenced this issue Jun 9, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Jun 9, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Jun 10, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Jun 11, 2024
bors added a commit to rust-lang-ci/rust that referenced this issue Jun 13, 2024
@bors bors closed this as completed in 7ac6c2f Jun 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. I-slow Issue: Problems and improvements with respect to performance of generated code. P-medium Medium priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants