-
Notifications
You must be signed in to change notification settings - Fork 15.3k
Description
Description
I am proposing an optimization to merge two range checks into a single comparison.
Specifically, the pattern (X < C) || (X - C < C2) can be folded into X < C + C2 under certain conditions (positive constants and no signed wrap for the sum).
However, the current InstCombine pipeline transforms the original pattern (umin + sub nsw) into a form (equivalent to usub.sat) that drops the implicit nsw constraint. Consequently, the final optimization becomes invalid because it lacks the necessary range information for %x.
- The Goal
We want to optimize the range check logic into a single comparison.
Preconditions: C >= 0, C2 >= 0, C + C2 does not overflow (signed), and the predicate is slt.
Original Source (Optimization Valid):
In this original IR, the nsw flag on sub implies that the operation does not overflow. This implicitly restricts the range of %x (e.g., preventing negative values that would cause underflow), making the transformation to @tgt valid.
define i1 @src_origin(i8 noundef %x, i8 noundef %C, i8 %C2) {
%assume = icmp slt i8 0, %C2
call void @llvm.assume(i1 %assume)
%C_is_pos = icmp sge i8 %C, 0
call void @llvm.assume(i1 %C_is_pos)
%sum = add i8 %C, %C2
%sum_is_valid = icmp sge i8 %sum, 0
call void @llvm.assume(i1 %sum_is_valid)
%v0 = tail call i8 @llvm.umin.i8(i8 %x, i8 %C)
; The 'nsw' here is crucial. It makes UB if x causes underflow.
%v1 = sub nsw i8 %x, %v0
%v2 = icmp slt i8 %v1, %C2
ret i1 %v2
}
define i1 @tgt_origin(i8 noundef %x, i8 noundef %C, i8 %C2) {
; Same assumptions as above...
%sum = add i8 %C, %C2
; This optimization is valid because 'nsw' in src refined the domain of %x.
%cmp = icmp slt i8 %x, %sum
ret i1 %cmp
}
- The Problem: Information Loss in Current Transformation
Currently, LLVM canonicalizes umin + sub patterns into usub.sat (or an equivalent OR-pattern: (X u< C) || (X - C < C2)).
During this step, the nsw information is discarded because usub.sat is well-defined for all inputs.
Once converted to usub.sat, the final optimization is no longer valid without explicit constraints on %x.
Current IR State (Optimization Invalid):
define i1 @src_current_transformed(i8 noundef %x, i8 noundef %C, i8 %C2) {
; Assumptions on C and C2 exist, BUT NO assumption on %x.
%assume = icmp slt i8 0, %C2
call void @llvm.assume(i1 %assume)
%C_is_pos = icmp sge i8 %C, 0
call void @llvm.assume(i1 %C_is_pos)
%sum = add i8 %C, %C2
%sum_is_valid = icmp sge i8 %sum, 0
call void @llvm.assume(i1 %sum_is_valid)
; 'nsw' is gone. usub.sat is defined even if x is negative (large unsigned).
%sub = call i8 @llvm.usub.sat.i8(i8 %x, i8 %C)
%cmp = icmp slt i8 %sub, %C2
ret i1 %cmp
}
define i1 @tgt_desired(i8 noundef %x, i8 noundef %C, i8 %C2) {
; ... assumptions ...
%sum = add i8 %C, %C2
%cmp = icmp slt i8 %x, %sum
ret i1 %cmp
}
Reason for failure: If %x is a negative value (e.g., 0x80 / -128), @src_current_transformed evaluates to false (because usub.sat returns a positive result, and slt checks it against C2), whereas @tgt_desired evaluates to true (because -128 is less than Sum).
The original @src_origin allowed this because %x = -128 would result in poison due to sub nsw, making the refinement valid.
- Proposed Solution
To enable this optimization, we must preserve the information provided by nsw before it is lost during the umin + sub nsw
usub.sat transformation.
I suggest that when InstCombine transforms sub nsw patterns into usub.sat, it should materialize the implicit nsw constraint into an explicit @llvm.assume (e.g., asserting %x >= 0 or relevant range checks).
By keeping the range hint for %x, we can safely fold the logic into the efficient icmp slt %x, %sum form in subsequent steps.
Alive2 Proof : https://alive2.llvm.org/ce/z/BAVye9
associated with : #167055