-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[LoopPeel] Fix BFI when peeling last iteration without guard #168250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1183,8 +1183,34 @@ bool llvm::peelLoop(Loop *L, unsigned PeelCount, bool PeelLast, LoopInfo *LI, | |
|
|
||
| // If the original loop may only execute a single iteration we need to | ||
| // insert a trip count check and skip the original loop with the last | ||
| // iteration peeled off if necessary. | ||
| if (!SE->isKnownNonZero(BTC)) { | ||
| // iteration peeled off if necessary. Either way, we must update branch | ||
| // weights to maintain the loop body frequency. | ||
| if (SE->isKnownNonZero(BTC)) { | ||
| // We have just proven that, when reached, the original loop always | ||
| // executes at least two iterations. Thus, we unconditionally execute | ||
| // both the remaining loop's initial iteration and the peeled iteration. | ||
| // But that increases the latter's frequency above its frequency in the | ||
| // original loop. To maintain the total frequency, we compensate by | ||
| // decreasing the remaining loop body's frequency to indicate one less | ||
| // iteration. | ||
| // | ||
| // We use this formula to convert probability to/from frequency: | ||
| // Sum(i=0..inf)(P^i) = 1/(1-P) = Freq. | ||
| if (BranchProbability P = getLoopProbability(L); !P.isUnknown()) { | ||
| // Trying to subtract one from an infinite loop is pointless, and our | ||
| // formulas then produce division by zero, so skip that case. | ||
| if (BranchProbability ExitP = P.getCompl(); !ExitP.isZero()) { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why would exitP be 0? oh, maybe message pumps...
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
An always infinite loop has P=1 and so ExitP=0. The new test case gives an example.
I'm not sure what this means.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A message pump is something like this: Server side worker threads, for example, do this.
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah, I see your point now. Thanks. Issue #165998 is what led me to be concerned about such infinite loops, but I'm not entirely sure how that arises in practice. Does profiling ever record an infinite loop? Would the profiled app have to run forever, or at least for a very long time to round off to a probability of 1? Or maybe it's some pass or declaration that somehow produces such branch weights? |
||
| double Freq = 1 / ExitP.toDouble(); | ||
| // No branch weights can produce a frequency of less than one given | ||
| // the initial iteration, and our formulas produce a negative | ||
| // probability if we try. | ||
| assert(Freq >= 1.0 && "expected freq >= 1 due to initial iteration"); | ||
| double NewFreq = std::max(Freq - 1, 1.0); | ||
jdenny-ornl marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| setLoopProbability( | ||
| L, BranchProbability::getBranchProbability(1 - 1 / NewFreq)); | ||
| } | ||
| } | ||
| } else { | ||
| NewPreHeader = SplitEdge(PreHeader, Header, &DT, LI); | ||
| SCEVExpander Expander(*SE, Latch->getDataLayout(), "loop-peel"); | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,87 @@ | ||
| ; Check that the loop body frequency is maintained when LoopPeel both: | ||
| ; - Peels one unconditional iteration after the loop. | ||
| ; - Does not add a guard to sometimes skip the remaining loop because it has | ||
| ; proven the original loop always executes at least two iterations, which | ||
| ; become the initial iteration and the peeled iteration. | ||
|
|
||
| ; DEFINE: %{exitWeight} = | ||
| ; DEFINE: %{loopWeight} = | ||
| ; DEFINE: %{loopFreqOld} = | ||
| ; DEFINE: %{loopFreqNew} = | ||
|
|
||
| ; DEFINE: %{run} = \ | ||
| ; DEFINE: cp %s %t.ll && chmod +w %t.ll && \ | ||
| ; DEFINE: echo '!0 = !{!"branch_weights", i32 %{exitWeight}, ' \ | ||
| ; DEFINE: 'i32 %{loopWeight}}' >> %t.ll && \ | ||
| ; DEFINE: opt -p "print<block-freq>,loop-unroll,print<block-freq>" \ | ||
| ; DEFINE: -unroll-full-max-count=0 -S %t.ll 2>&1 | \ | ||
| ; DEFINE: FileCheck -DLOOP_FREQ_OLD='%{loopFreqOld}' \ | ||
| ; DEFINE: -DLOOP_FREQ_NEW='%{loopFreqNew}' %s | ||
|
|
||
| ; Branch weights give the original loop 10 iterations. We expect that | ||
| ; loopFreqOld = loopFreqNew + 1. | ||
| ; REDEFINE: %{exitWeight} = 1 | ||
| ; REDEFINE: %{loopWeight} = 9 | ||
| ; REDEFINE: %{loopFreqOld} = 10.0 | ||
| ; REDEFINE: %{loopFreqNew} = 9.0 | ||
| ; RUN: %{run} | ||
|
|
||
| ; Branch weights give the original loop 2 iterations. We expect that | ||
| ; loopFreqOld = loopFreqNew + 1. | ||
| ; REDEFINE: %{exitWeight} = 1 | ||
| ; REDEFINE: %{loopWeight} = 1 | ||
| ; REDEFINE: %{loopFreqOld} = 2.0 | ||
| ; REDEFINE: %{loopFreqNew} = 1.0 | ||
| ; RUN: %{run} | ||
|
|
||
| ; Branch weights give the original loop 1 iteration, but LoopPeel proved it has | ||
| ; at least 2. There is no loop probability that produces a frequency below 1, | ||
| ; so the original total frequency cannot be maintained. | ||
| ; REDEFINE: %{exitWeight} = 1 | ||
| ; REDEFINE: %{loopWeight} = 0 | ||
| ; REDEFINE: %{loopFreqOld} = 1.0 | ||
| ; REDEFINE: %{loopFreqNew} = 1.0 | ||
| ; RUN: %{run} | ||
|
|
||
| ; Branch weights say the original loop is infinite, maximizing the frequency, | ||
| ; so LoopPeel does not try to decrement it. | ||
| ; REDEFINE: %{exitWeight} = 0 | ||
| ; REDEFINE: %{loopWeight} = 1 | ||
| ; REDEFINE: %{loopFreqOld} = 2147483647.8 | ||
| ; REDEFINE: %{loopFreqNew} = 2147483647.8 | ||
| ; RUN: %{run} | ||
|
|
||
| ; Everything other than loop should be 1.0 because it is reached once. | ||
| ; | ||
| ; CHECK: block-frequency-info: test | ||
| ; CHECK-NEXT: - entry: float = 1.0, | ||
| ; CHECK-NEXT: - loop: float = [[LOOP_FREQ_OLD]], | ||
| ; CHECK-NEXT: - exit: float = 1.0, | ||
| ; | ||
| ; CHECK: block-frequency-info: test | ||
| ; CHECK-NEXT: - entry: float = 1.0, | ||
| ; CHECK-NEXT: - loop: float = [[LOOP_FREQ_NEW]], | ||
| ; CHECK-NEXT: - exit.peel.begin: float = 1.0, | ||
| ; CHECK-NEXT: - loop.peel: float = 1.0, | ||
| ; CHECK-NEXT: - exit.peel.next: float = 1.0, | ||
| ; CHECK-NEXT: - loop.peel.next: float = 1.0, | ||
| ; CHECK-NEXT: - exit: float = 1.0, | ||
|
|
||
| declare void @f(i32) | ||
|
|
||
| define void @test() { | ||
| entry: | ||
| br label %loop | ||
|
|
||
| loop: | ||
| %i = phi i32 [ 0, %entry ], [ %inc, %loop ] | ||
| %isLast = icmp eq i32 %i, 20 | ||
| %sel = select i1 %isLast, i32 1, i32 0 | ||
| call void @f(i32 %sel) | ||
| %inc = add i32 %i, 1 | ||
| %isLast1 = icmp eq i32 %i, 20 | ||
| br i1 %isLast1, label %exit, label %loop, !prof !0 | ||
|
|
||
| exit: | ||
| ret void | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you rebase? Easier then to understand how it sits with #166858.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFK today. But that is already included.