Missed optimization: constant range analysis

In the following testcase, ideally, the jump to the trap can de elided, but the current optimization pipeline misses this.

Running CorrelatedValuePropagation (`-passes=correlated-propagation`) misses this due to merging the ranges for the incoming predecessors into a block. Currently LVI does "union(predecessors)", followed by "computation based on instruction opcode". If the ranges were stored per block predecessor (another cache...so added cost...), LVI could theoretically do first "computation based on instruction opcode for each predecessor", followed by "union(predecessors)").

For the example below, the range for`%lshr` is currently the result of `lshr` [7, 277632] [3, 8], which is [0, 34704]. The intervals used for the `lshr` are the results of previous union operations from the phi nodes.
But if the intervals for the operands were computed per predecessor (for the add and phis, store the result of `getEdgeValue` before doing the union in `mergeIn`), then the range for`%lshr` could become the union of ((`lshr` [7, 1032] , 3), (`lshr` [15487, 262145], 7))=union of ([0, 129], [129, 2169])=[0, 2169]. With this, `%icmp6` could be replaced with `true` in the `br`.

Is it worth extending LVI to support this for constant ranges only, when all predecessor values are known (i.e. clear the predecessor cache on early exit for overdefined), with compile times evaluated and the extended cache populated/used under a flag?

Are there other passes or ways this range analysis is done and could currently catch this? 

```
@global = external local_unnamed_addr global [4338 x i32], align 16

define dso_local noundef zeroext i1 @foo(i64 noundef %arg, ptr noundef writeonly captures(none) %arg1) local_unnamed_addr {
bb:
  %icmp = icmp ult i64 %arg, 1025
  br i1 %icmp, label %bb4, label %bb2

bb2:                                              ; preds = %bb
  %icmp3 = icmp ult i64 %arg, 262145
  br i1 %icmp3, label %bb4, label %bb9

bb4:                                              ; preds = %bb2, %bb
  %phi = phi i64 [ 7, %bb ], [ 15487, %bb2 ]
  %phi5 = phi i64 [ 3, %bb ], [ 7, %bb2 ]
  %add = add nuw nsw i64 %phi, %arg
  %lshr = lshr i64 %add, %phi5
  %icmp6 = icmp samesign ult i64 %lshr, 4338
  br i1 %icmp6, label %bb8, label %bb7

bb7:                                              ; preds = %bb4
  tail call void @llvm.ubsantrap(i8 18)
  unreachable

bb8:                                              ; preds = %bb4
  %getelementptr = getelementptr inbounds nuw [4338 x i32], ptr @global, i64 0, i64 %lshr
  %load = load i32, ptr %getelementptr, align 4
  %sext = sext i32 %load to i64
  store i64 %sext, ptr %arg1, align 8
  br label %bb9

bb9:                                              ; preds = %bb8, %bb2
  %phi10 = phi i1 [ true, %bb8 ], [ false, %bb2 ]
  ret i1 %phi10
}

; Function Attrs: cold noreturn nounwind
declare void @llvm.ubsantrap(i8 immarg) #0

attributes #0 = { cold noreturn nounwind }
```

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Missed optimization: constant range analysis #158139

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Missed optimization: constant range analysis #158139

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions