Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invert comparison in uN::checked_sub #125038

Merged
merged 1 commit into from May 15, 2024
Merged

Conversation

ivan-shrimp
Copy link
Contributor

After #124114, LLVM no longer combines the comparison and subtraction in uN::checked_sub when either operand is a constant (demo: https://rust.godbolt.org/z/MaeoYbsP1). The difference is more pronounced when the expression is slightly more complex (https://rust.godbolt.org/z/4rPavsYdc).

This is due to the use of >= here:

pub const fn checked_sub(self, rhs: Self) -> Option<Self> {
// Per PR#103299, there's no advantage to the `overflowing` intrinsic
// for *unsigned* subtraction and we just emit the manual check anyway.
// Thus, rather than using `overflowing_sub` that produces a wrapping
// subtraction, check it ourself so we can use an unchecked one.
if self >= rhs {
// SAFETY: just checked this can't overflow
Some(unsafe { intrinsics::unchecked_sub(self, rhs) })
} else {
None
}
}

For constant C, LLVM eagerly converts a >= C into a > C - 1, but the backend can only combine a < C with a - C, not C - 1 < a and a - C: https://github.com/llvm/llvm-project/blob/e586556e375fc5c4f7e76b5c299cb981f2016108/llvm/lib/CodeGen/CodeGenPrepare.cpp#L1697-L1742

This PR1 simply inverts the >= into < to restore the LLVM magic, and somewhat align this with the implementation of uN::overflowing_sub from #103299.

When the result is stored as an Option (rather than being branched/cmoved on), the discriminant is self >= rhs. This PR doesn't affect the codegen (and relevant tests) of that since LLVM will negate self < rhs to self >= rhs when necessary.

Footnotes

  1. Note to self: My very first contribution to publicly-used code. Hopefully like what I should learn to always be, tiny and humble.

@rustbot
Copy link
Collaborator

rustbot commented May 12, 2024

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @joboet (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels May 12, 2024
@workingjubilee
Copy link
Contributor

There is some possibly-hot-looking compiler code that uses checked_sub, so I think it is reasonable to consider this to be performance-sensitive enough, and the compiler may give useful feedback, to justify giving perf a whirl:

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 12, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request May 12, 2024
Invert comparison in `uN::checked_sub`

After rust-lang#124114, LLVM no longer combines the comparison and subtraction in `uN::checked_sub` when either operand is a constant (demo: https://rust.godbolt.org/z/MaeoYbsP1). The difference is more pronounced when the expression is slightly more complex (https://rust.godbolt.org/z/4rPavsYdc).

This is due to the use of `>=` here:

https://github.com/rust-lang/rust/blob/ee97564e3a9f9ac8c65103abb37c6aa48d95bfa2/library/core/src/num/uint_macros.rs#L581-L593

For constant `C`, LLVM eagerly converts `a >= C` into `a > C - 1`, but the backend can only combine `a < C` with `a - C`, not `C - 1 < a` and `a - C`: https://github.com/llvm/llvm-project/blob/e586556e375fc5c4f7e76b5c299cb981f2016108/llvm/lib/CodeGen/CodeGenPrepare.cpp#L1697-L1742

This PR[^1] simply inverts the `>=` into `<` to restore the LLVM magic, and somewhat align this with the implementation of `uN::overflowing_sub` from rust-lang#103299.

When the result is stored as an `Option` (rather than being branched/cmoved on), the discriminant is `self >= rhs`. This PR doesn't affect the codegen (and relevant tests) of that since LLVM will negate `self < rhs` to `self >= rhs` when necessary.

[^1]: Note to `self`: My very first contribution to publicly-used code. Hopefully like what I should learn to always be, tiny and humble.
@bors
Copy link
Contributor

bors commented May 12, 2024

⌛ Trying commit 7fde730 with merge 637dea2...

@bors
Copy link
Contributor

bors commented May 12, 2024

☀️ Try build successful - checks-actions
Build commit: 637dea2 (637dea2577d204b1bbc746cf68a3488d5e4e42d5)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (637dea2): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.2% [1.2%, 1.2%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-2.3% [-2.3%, -2.3%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.6% [-2.3%, 1.2%] 2

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-1.2% [-1.2%, -1.2%] 1
All ❌✅ (primary) - - 0

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.2% [0.2%, 0.2%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.2% [0.2%, 0.2%] 1

Bootstrap: 676.904s -> 675.76s (-0.17%)
Artifact size: 316.08 MiB -> 315.95 MiB (-0.04%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 12, 2024
@joboet
Copy link
Contributor

joboet commented May 15, 2024

I'd say this should be considered an LLVM bug, but working around it on our side seems fine, especially since the fix is equally readable.

Thank you!
@bors r+ rollup=maybe

@bors
Copy link
Contributor

bors commented May 15, 2024

📌 Commit 7fde730 has been approved by joboet

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 15, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request May 15, 2024
Rollup of 6 pull requests

Successful merges:

 - rust-lang#124307 (Optimize character escaping.)
 - rust-lang#124975 (Use an helper to move the files)
 - rust-lang#125027 (Migrate `run-make/c-link-to-rust-staticlib` to `rmake`)
 - rust-lang#125038 (Invert comparison in `uN::checked_sub`)
 - rust-lang#125104 (Migrate `run-make/no-cdylib-as-rdylib` to `rmake`)
 - rust-lang#125137 (MIR operators: clarify Shl/Shr handling of negative offsets)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 4f7d9d4 into rust-lang:master May 15, 2024
7 checks passed
@rustbot rustbot added this to the 1.80.0 milestone May 15, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request May 15, 2024
Rollup merge of rust-lang#125038 - ivan-shrimp:checked_sub, r=joboet

Invert comparison in `uN::checked_sub`

After rust-lang#124114, LLVM no longer combines the comparison and subtraction in `uN::checked_sub` when either operand is a constant (demo: https://rust.godbolt.org/z/MaeoYbsP1). The difference is more pronounced when the expression is slightly more complex (https://rust.godbolt.org/z/4rPavsYdc).

This is due to the use of `>=` here:

https://github.com/rust-lang/rust/blob/ee97564e3a9f9ac8c65103abb37c6aa48d95bfa2/library/core/src/num/uint_macros.rs#L581-L593

For constant `C`, LLVM eagerly converts `a >= C` into `a > C - 1`, but the backend can only combine `a < C` with `a - C`, not `C - 1 < a` and `a - C`: https://github.com/llvm/llvm-project/blob/e586556e375fc5c4f7e76b5c299cb981f2016108/llvm/lib/CodeGen/CodeGenPrepare.cpp#L1697-L1742

This PR[^1] simply inverts the `>=` into `<` to restore the LLVM magic, and somewhat align this with the implementation of `uN::overflowing_sub` from rust-lang#103299.

When the result is stored as an `Option` (rather than being branched/cmoved on), the discriminant is `self >= rhs`. This PR doesn't affect the codegen (and relevant tests) of that since LLVM will negate `self < rhs` to `self >= rhs` when necessary.

[^1]: Note to `self`: My very first contribution to publicly-used code. Hopefully like what I should learn to always be, tiny and humble.
@klensy
Copy link
Contributor

klensy commented May 15, 2024

Need test?

@ivan-shrimp
Copy link
Contributor Author

I'm not sure how we should test this. LLVM seems to generate the desired llvm.usub.with.overflow intrinsic in a pass that happens later than --emit=llvm-ir, so adding it near existing tests might not be very useful. We can test the sequence in assembly (e.g. look for sub+cmov in x86-64), but that seems a bit too far.

(Feel free to pick this up; I might not have time for this in the coming week or so.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants