Invert comparison in `uN::checked_sub` #125038

ivan-shrimp · 2024-05-12T06:57:01Z

After #124114, LLVM no longer combines the comparison and subtraction in `uN::checked_sub` when either operand is a constant (demo: https://rust.godbolt.org/z/MaeoYbsP1). The difference is more pronounced when the expression is slightly more complex (https://rust.godbolt.org/z/4rPavsYdc).

This is due to the use of `>=` here:

rust/library/core/src/num/uint_macros.rs

Lines 581 to 593 in ee97564

    pub const fn checked_sub(self, rhs: Self) -> Option<Self> {
        // Per PR#103299, there's no advantage to the `overflowing` intrinsic
        // for *unsigned* subtraction and we just emit the manual check anyway.
        // Thus, rather than using `overflowing_sub` that produces a wrapping
        // subtraction, check it ourself so we can use an unchecked one.
        if self >= rhs {
            // SAFETY: just checked this can't overflow
            Some(unsafe { intrinsics::unchecked_sub(self, rhs) })
        } else {
            None
        }
    }

For constant `C`, LLVM eagerly converts `a >= C` into `a > C - 1`, but the backend can only combine `a < C` with `a - C`, not `C - 1 < a` and `a - C`: https://github.com/llvm/llvm-project/blob/e586556e375fc5c4f7e76b5c299cb981f2016108/llvm/lib/CodeGen/CodeGenPrepare.cpp#L1697-L1742
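To make the rewrite concrete, here is a small sketch (not taken from LLVM or from this PR) that checks, exhaustively over `u8`, that `a >= C` and `a > C - 1` agree for every nonzero `C`. The function names are made up for illustration. It only shows that the two predicates are equivalent, not how the backend matches them.

```rust
/// The predicate as written in the source: `a >= c`.
fn ge_form(a: u8, c: u8) -> bool {
    a >= c
}

/// The form the description says LLVM rewrites to for a constant `c`: `a > c - 1`.
/// Only call this with `c != 0`; `c - 1` would underflow for `c == 0`.
fn gt_form(a: u8, c: u8) -> bool {
    a > c - 1
}

/// Compares both forms over every `u8` pair with a nonzero `c`.
fn forms_agree_for_all_u8() -> bool {
    (0..=u8::MAX).all(|a| (1..=u8::MAX).all(|c| ge_form(a, c) == gt_form(a, c)))
}
```

The two forms compute the same result; the difference described above is only in which shape the later pass recognizes next to the subtraction.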

This PR¹ simply inverts the `>=` into `<` to restore the LLVM magic and to somewhat align the code with the implementation of `uN::overflowing_sub` from #103299.
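A minimal sketch of what the inverted check might look like, written as a free function over `u32` so that it compiles outside `core`. The name `checked_sub_inverted` is hypothetical, and `wrapping_sub` stands in for `intrinsics::unchecked_sub`, which is internal to the standard library. The actual diff is not reproduced here.

```rust
/// Hypothetical stand-alone version of the inverted check, specialized to `u32`.
fn checked_sub_inverted(lhs: u32, rhs: u32) -> Option<u32> {
    if lhs < rhs {
        // The subtraction would underflow.
        None
    } else {
        // `lhs >= rhs` holds on this branch, so the subtraction cannot wrap.
        // Assumption: `wrapping_sub` is used here in place of the internal
        // `intrinsics::unchecked_sub`.
        Some(lhs.wrapping_sub(rhs))
    }
}
```

The branches are swapped relative to the quoted snippet, but the function returns the same value for every input.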

When the result is stored as an `Option` (rather than being branched/cmoved on), the discriminant is `self >= rhs`. This PR doesn't affect the codegen (or the relevant tests) for that case, since LLVM will negate `self < rhs` to `self >= rhs` when necessary.
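As a small illustration of the point about the discriminant, the sketch below (with a hypothetical helper name) observes only whether the result is `Some`. For unsigned operands that is exactly `a >= b`, which is also the negation of `a < b`.

```rust
/// Hypothetical helper: reports only the discriminant of the `Option`
/// returned by the standard `checked_sub`.
fn sub_is_some(a: u32, b: u32) -> bool {
    a.checked_sub(b).is_some()
}
```

Because `!(a < b)` and `a >= b` are the same predicate, the source can be written either way without changing which discriminant ends up stored.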

¹ Note to `self`: My very first contribution to publicly-used code. Hopefully like what I should learn to always be, tiny and humble.

rustbot · 2024-05-12T06:57:09Z

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @joboet (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, to keep review lag to a minimum, PR authors and assigned reviewers should keep the review labels (S-waiting-on-review and S-waiting-on-author) up to date, invoking these commands when appropriate:

@rustbot author: the review is finished, PR author should check the comments and take action accordingly
@rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

workingjubilee · 2024-05-12T18:53:32Z

Some possibly hot-looking compiler code uses `checked_sub`, so I think it is reasonable to consider this performance-sensitive enough, and the compiler likely enough to give useful feedback, to justify giving perf a whirl:

@bors try @rust-timer queue

bors · 2024-05-12T18:54:42Z

⌛ Trying commit 7fde730 with merge 637dea2...

bors · 2024-05-12T20:32:47Z

☀️ Try build successful - checks-actions
Build commit: 637dea2 (637dea2577d204b1bbc746cf68a3488d5e4e42d5)

rust-timer · 2024-05-12T22:47:04Z

Finished benchmarking commit (637dea2): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.2% | [1.2%, 1.2%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -2.3% | [-2.3%, -2.3%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -0.6% | [-2.3%, 1.2%] | 2 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -1.2% | [-1.2%, -1.2%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.2%, 0.2%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.2% | [0.2%, 0.2%] | 1 |

Bootstrap: 676.904s -> 675.76s (-0.17%)
Artifact size: 316.08 MiB -> 315.95 MiB (-0.04%)

joboet · 2024-05-15T10:43:39Z

I'd say this should be considered an LLVM bug, but working around it on our side seems fine, especially since the fix is equally readable.

Thank you!
@bors r+ rollup=maybe

bors · 2024-05-15T10:43:41Z

📌 Commit 7fde730 has been approved by joboet

It is now in the queue for this repository.

Rollup of 6 pull requests

Successful merges:

- rust-lang#124307 (Optimize character escaping.)
- rust-lang#124975 (Use an helper to move the files)
- rust-lang#125027 (Migrate `run-make/c-link-to-rust-staticlib` to `rmake`)
- rust-lang#125038 (Invert comparison in `uN::checked_sub`)
- rust-lang#125104 (Migrate `run-make/no-cdylib-as-rdylib` to `rmake`)
- rust-lang#125137 (MIR operators: clarify Shl/Shr handling of negative offsets)

r? `@ghost`
`@rustbot` modify labels: rollup

Rollup merge of rust-lang#125038 - ivan-shrimp:checked_sub, r=joboet

klensy · 2024-05-15T18:22:34Z

Need test?

ivan-shrimp · 2024-05-16T07:22:45Z

I'm not sure how we should test this. LLVM seems to generate the desired `llvm.usub.with.overflow` intrinsic in a pass that happens later than `--emit=llvm-ir`, so adding it near existing tests might not be very useful. We can test the sequence in assembly (e.g. look for `sub`+`cmov` in x86-64), but that seems a bit too far.
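For anyone picking this up, here is a rough sketch of the kind of probe functions whose assembly could be inspected by hand, for example with `rustc -O --crate-type=lib --emit=asm`. The function names and the constant `10` are made up for illustration and are not taken from the linked demos. Nothing here asserts what the generated code looks like; the functions only give the optimizer a `checked_sub` with one constant operand whose result is selected on rather than stored.

```rust
/// Hypothetical probe: constant right-hand operand, result selected on
/// via `unwrap_or`.
pub fn sub_const_rhs(a: u32) -> u32 {
    a.checked_sub(10).unwrap_or(0)
}

/// Hypothetical probe: constant left-hand operand.
pub fn sub_const_lhs(a: u32) -> u32 {
    10u32.checked_sub(a).unwrap_or(0)
}
```

A real regression test would still have to decide which instruction sequence to pin down, which is the part described above as going a bit too far.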

(Feel free to pick this up; I might not have time for this in the coming week or so.)

reverse condition in uN::checked_sub

7fde730

rustbot assigned joboet May 12, 2024

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels May 12, 2024

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 12, 2024

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label May 12, 2024

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 15, 2024

fmease mentioned this pull request May 15, 2024

Rollup of 6 pull requests #125144

Merged

bors merged commit 4f7d9d4 into rust-lang:master May 15, 2024
7 checks passed

rustbot added this to the 1.80.0 milestone May 15, 2024
