Description
| | |
| --- | --- |
| Bugzilla Link | 22266 |
| Version | trunk |
| OS | Linux |
| CC | @hfinkel |
Extended Description
Given this test case, our current LoadPRE does not hoist the loop-invariant load from %q:
; RUN: opt -S -basicaa -gvn %s
; Function Attrs: readonly
declare void @hold(i32) #0
define i32 @test1(i1 %cnd, i32* %p, i32* dereferenceable(8) %q) {
entry:
  br label %header
header: ; preds = %inner, %entry
  br i1 %cnd, label %exit, label %inner
inner: ; preds = %header
  %v2 = load i32* %p
  %v3 = load i32* %q
  %sub = sub i32 %v3, %v2
  call void @hold(i32 %sub)
  br label %header
exit: ; preds = %header
  ret i32 0
}
attributes #0 = { readonly }
This test case was chosen mainly to highlight a weakness in how PRE currently establishes profitability. The LICM pass does handle this case, so the test itself is interesting only because the pattern sometimes arises due to pass ordering.
LoadPRE gives up while trying to find a reasonable merge point because it encounters a block with multiple successors (here, header). As this case illustrates, that is clearly not always the best choice.
If we know that the pointer in question is dereferenceable, we can relax the profitability part of the anticipation notion. Here are a few heuristics that might make sense:
- Skip over all loop exits
- Skip over any split where the other paths are 'cold enough'
- Skip over any split if the load is trivially available along that path (this happens a lot with multiple latch loops)
Note that this is only legal for dereferenceable pointers. Lifting the load of %p without first recognizing that the condition is itself loop invariant would be illegal: it could introduce a fault that did not occur in the original program.
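For reference, here is a hand-written sketch of the result one would hope for on the test case above. It is not output from any pass. The name %v3.pre is made up for illustration, the load syntax follows the test case, and the sketch assumes %q is also suitably aligned for an i32 load. The load of %q can move to entry because %q is marked dereferenceable and @hold is readonly, so nothing in the loop writes to it. The load of %p stays where it was.

```llvm
; Function Attrs: readonly
declare void @hold(i32) #0

define i32 @test1(i1 %cnd, i32* %p, i32* dereferenceable(8) %q) {
entry:
  ; Hoisted: %q is known dereferenceable, so this load cannot fault
  ; even on executions where %cnd is true and the loop body never runs.
  %v3.pre = load i32* %q
  br label %header

header:                                           ; preds = %inner, %entry
  br i1 %cnd, label %exit, label %inner

inner:                                            ; preds = %header
  ; Not hoisted: %p carries no dereferenceable guarantee, so loading it
  ; on the path that goes straight to %exit could introduce a fault.
  %v2 = load i32* %p
  %sub = sub i32 %v3.pre, %v2
  call void @hold(i32 %sub)
  br label %header

exit:                                             ; preds = %header
  ret i32 0
}

attributes #0 = { readonly }
```

Here the only split between the preheader and the load is the loop exit test in header, which is the situation the first heuristic (skip over all loop exits) is meant to cover.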