Skip to content

Missed optimization: constant range analysis #158139

@alinas

Description

@alinas

In the following testcase, ideally, the jump to the trap can de elided, but the current optimization pipeline misses this.

Running CorrelatedValuePropagation (-passes=correlated-propagation) misses this due to merging the ranges for the incoming predecessors into a block. Currently LVI does "union(predecessors)", followed by "computation based on instruction opcode". If the ranges were stored per block predecessor (another cache...so added cost...), LVI could theoretically do first "computation based on instruction opcode for each predecessor", followed by "union(predecessors)").

For the example below, the range for%lshr is currently the result of lshr [7, 277632] [3, 8], which is [0, 34704]. The intervals used for the lshr are the results of previous union operations from the phi nodes.
But if the intervals for the operands were computed per predecessor (for the add and phis, store the result of getEdgeValue before doing the union in mergeIn), then the range for%lshr could become the union of ((lshr [7, 1032] , 3), (lshr [15487, 262145], 7))=union of ([0, 129], [129, 2169])=[0, 2169]. With this, %icmp6 could be replaced with true in the br.

Is it worth extending LVI to support this for constant ranges only, when all predecessor values are known (i.e. clear the predecessor cache on early exit for overdefined), with compile times evaluated and the extended cache populated/used under a flag?

Are there other passes or ways this range analysis is done and could currently catch this?

@global = external local_unnamed_addr global [4338 x i32], align 16

define dso_local noundef zeroext i1 @foo(i64 noundef %arg, ptr noundef writeonly captures(none) %arg1) local_unnamed_addr {
bb:
  %icmp = icmp ult i64 %arg, 1025
  br i1 %icmp, label %bb4, label %bb2

bb2:                                              ; preds = %bb
  %icmp3 = icmp ult i64 %arg, 262145
  br i1 %icmp3, label %bb4, label %bb9

bb4:                                              ; preds = %bb2, %bb
  %phi = phi i64 [ 7, %bb ], [ 15487, %bb2 ]
  %phi5 = phi i64 [ 3, %bb ], [ 7, %bb2 ]
  %add = add nuw nsw i64 %phi, %arg
  %lshr = lshr i64 %add, %phi5
  %icmp6 = icmp samesign ult i64 %lshr, 4338
  br i1 %icmp6, label %bb8, label %bb7

bb7:                                              ; preds = %bb4
  tail call void @llvm.ubsantrap(i8 18)
  unreachable

bb8:                                              ; preds = %bb4
  %getelementptr = getelementptr inbounds nuw [4338 x i32], ptr @global, i64 0, i64 %lshr
  %load = load i32, ptr %getelementptr, align 4
  %sext = sext i32 %load to i64
  store i64 %sext, ptr %arg1, align 8
  br label %bb9

bb9:                                              ; preds = %bb8, %bb2
  %phi10 = phi i1 [ true, %bb8 ], [ false, %bb2 ]
  ret i1 %phi10
}

; Function Attrs: cold noreturn nounwind
declare void @llvm.ubsantrap(i8 immarg) #0

attributes #0 = { cold noreturn nounwind }

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions