[LoopRotate] Set loop back edge weight to not less than exit weight #86496

HaohaiWen · 2024-03-25T12:37:04Z

Branch weight from sample-based PGO may be not inaccurate due to
sampling. If the loop body must be executed, then origin loop back
edge weight must be not less than exit weight.

Branch weight from sample-based PGO may be not inaccurate due to sampling. This test tracks such case where updateBranchWeights wraps unsigned.

Branch weight from sample-based PGO may be not inaccurate due to sampling. If the loop body must be executed, then origin loop back edge weight must be not less than exit weight.

llvmbot · 2024-03-25T12:37:37Z

@llvm/pr-subscribers-llvm-transforms

Author: Haohai Wen (HaohaiWen)

Changes

Branch weight from sample-based PGO may be not inaccurate due to
sampling. If the loop body must be executed, then origin loop back
edge weight must be not less than exit weight.

Full diff: https://github.com/llvm/llvm-project/pull/86496.diff

2 Files Affected:

(modified) llvm/lib/Transforms/Utils/LoopRotationUtils.cpp (+10)
(modified) llvm/test/Transforms/LoopRotate/update-branch-weights.ll (+42)

diff --git a/llvm/lib/Transforms/Utils/LoopRotationUtils.cpp b/llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
index bc671171137199..dd31cbd8376c4b 100644
--- a/llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
+++ b/llvm/lib/Transforms/Utils/LoopRotationUtils.cpp
@@ -347,9 +347,19 @@ static void updateBranchWeights(BranchInst &PreHeaderBI, BranchInst &LoopBI,
         // probabilities as if there are only 0-trip and 1-trip cases.
         ExitWeight0 = OrigLoopExitWeight - OrigLoopBackedgeWeight;
       }
+    } else {
+      if (OrigLoopExitWeight > OrigLoopBackedgeWeight) {
+        LLVM_DEBUG(
+            dbgs() << "WARNING: Bad loop back edge weight. Adjust it from "
+                   << OrigLoopBackedgeWeight << " to " << OrigLoopExitWeight
+                   << "\n");
+        OrigLoopBackedgeWeight = OrigLoopExitWeight;
+      }
     }
+    assert(OrigLoopExitWeight >= ExitWeight0 && "Bad branch weight");
     ExitWeight1 = OrigLoopExitWeight - ExitWeight0;
     EnterWeight = ExitWeight1;
+    assert(OrigLoopBackedgeWeight >= EnterWeight && "Bad branch weight");
     LoopBackWeight = OrigLoopBackedgeWeight - EnterWeight;
   } else if (OrigLoopExitWeight == 0) {
     if (OrigLoopBackedgeWeight == 0) {
diff --git a/llvm/test/Transforms/LoopRotate/update-branch-weights.ll b/llvm/test/Transforms/LoopRotate/update-branch-weights.ll
index 5d742b64e0adbf..9a1f36ec5ff2be 100644
--- a/llvm/test/Transforms/LoopRotate/update-branch-weights.ll
+++ b/llvm/test/Transforms/LoopRotate/update-branch-weights.ll
@@ -232,6 +232,46 @@ loop_exit:
   ret void
 }
 
+; BFI_BEFORE-LABEL: block-frequency-info: func6_inaccurate_branch_weight
+; BFI_BEFORE: - entry: {{.*}} count = 1024
+; BFI_BEFORE: - loop_header: {{.*}} count = 2047
+; BFI_BEFORE: - loop_body: {{.*}} count = 1023
+; BFI_BEFORE: - loop_exit: {{.*}} count = 1024
+
+; BFI_AFTER-LABEL: block-frequency-info: func6_inaccurate_branch_weight
+; BFI_AFTER: - entry: {{.*}} count = 1024
+; BFI_AFTER: - loop_body: {{.*}} count = 1024
+; BFI_AFTER: - loop_exit: {{.*}} count = 1024
+
+; IR-LABEL: define void @func6_inaccurate_branch_weight(
+; IR: entry:
+; IR:   br label %loop_body
+; IR: loop_body:
+; IR:   br i1 %cmp, label %loop_body, label %loop_exit, !prof [[PROF_FUNC6_0:![0-9]+]]
+; IR: loop_exit:
+; IR:   ret void
+
+; Branch weight from sample-based PGO may be inaccurate due to sampling.
+; Count for loop_body in following case should be not less than loop_exit.
+; However this may not hold for Sample-based PGO.
+define void @func6_inaccurate_branch_weight() !prof !3 {
+entry:
+  br label %loop_header
+
+loop_header:
+  %i = phi i32 [0, %entry], [%i_inc, %loop_body]
+  %cmp = icmp slt i32 %i, 2
+  br i1 %cmp, label %loop_body, label %loop_exit, !prof !9
+
+loop_body:
+  store volatile i32 %i, ptr @g, align 4
+  %i_inc = add i32 %i, 1
+  br label %loop_header
+
+loop_exit:
+  ret void
+}
+
 !0 = !{!"function_entry_count", i64 1}
 !1 = !{!"branch_weights", i32 1000, i32 1}
 !2 = !{!"branch_weights", i32 3000, i32 1000}
@@ -241,6 +281,7 @@ loop_exit:
 !6 = !{!"branch_weights", i32 0, i32 1}
 !7 = !{!"branch_weights", i32 1, i32 0}
 !8 = !{!"branch_weights", i32 0, i32 0}
+!9 = !{!"branch_weights", i32 1023, i32 1024}
 
 ; IR: [[PROF_FUNC0_0]] = !{!"branch_weights", i32 2000, i32 1000}
 ; IR: [[PROF_FUNC0_1]] = !{!"branch_weights", i32 999, i32 1}
@@ -251,3 +292,4 @@ loop_exit:
 ; IR: [[PROF_FUNC3_0]] = !{!"branch_weights", i32 0, i32 1}
 ; IR: [[PROF_FUNC4_0]] = !{!"branch_weights", i32 1, i32 0}
 ; IR: [[PROF_FUNC5_0]] = !{!"branch_weights", i32 0, i32 0}
+; IR: [[PROF_FUNC6_0]] = !{!"branch_weights", i32 0, i32 1024}

github-actions · 2024-03-25T12:39:44Z

✅ With the latest revision this PR passed the C/C++ code formatter.

github-actions · 2024-03-25T12:39:44Z

✅ With the latest revision this PR passed the Python code formatter.

HaohaiWen · 2024-03-27T13:35:46Z

Any more comments?

WenleiHe

Thanks for the fix. I suggest you wait for @MatzeB to take a look.

WenleiHe · 2024-03-28T06:00:23Z

llvm/lib/Transforms/Utils/LoopRotationUtils.cpp

+        LLVM_DEBUG(
+            dbgs() << "WARNING: Bad loop back edge weight. Adjust it from "
+                   << OrigLoopBackedgeWeight << " to " << OrigLoopExitWeight
+                   << "\n");


I don't think we need debug print warning for this. There is a lot of guessing and fixing already in the heuristic above, and it'd be inconsistent to just print debug warning for this case.

The heuristic wasn't based on guessing. I think those three expressions were bullet proofed (@MatzeB , please correct me if I am wrong). It just have a pre request the profile data is accurate. What I'm trying is to fix inaccurate profile data.

// - x == x0 + x1 # counts to "exit" must stay the same. // - y0 == x - x0 == x1 # how often loop was entered at all. // - y1 == y - y0 # How often loop was repeated (after first iter.).

The heuristic wasn't based on guessing. I think those three expressions were bullet proofed

I was referring to the existing heuristic. With loop rotate, you cannot recover ground truth counts. And you have to make assumptions like "as if 0-trip count nearly never happens"/ "as if there are only 0-trip and 1-trip cases".

WenleiHe

Actually I'm a bit confused looking at the test case..

You can have zero-trip loop, in which case loop_exit can be reached without going into loop_body, why loop back edge needs to be not less than exit weight?
In principle, profile inference for sample PGO should make input counts consistent. If there's input count inconsistency, we need track down the source of inconsistency. Fixing the input count inconsistency in a transformation isn't the right approach IMO. (The existing heuristic are making guesses for loop rotation which is different and is fine)

MatzeB · 2024-03-28T06:12:07Z

I don't follow why this would be a situation that needs fixup. While it is less common, it is perfectly fine for a loop header to exit more often than entering the body and using the loop backedge (when 0-trip counts happen often enough for a given loop to push the average trip count below 1).

MatzeB · 2024-03-28T06:21:09Z

Oh I see, this is for the case where the condition can be constant folded. I guess this makes sense in that case and you want to avoid an underflow? (may be good to mention in a comment/commit message).

MatzeB · 2024-03-28T06:28:20Z

I suspect some amount of noise in the sample / mapping samples to actual blocks etc. is unavoidable and I am not sure this situation (that only becomes "bad" when you consider the fact that the particular loop condition cannot be false in the first iteration) will be considered by smoothing algos. So seems fine to me, but leaving final call to @WenleiHe about whether some inaccuracies like this are considered normal (and hence must be worked around).

HaohaiWen · 2024-03-28T06:46:28Z

Actually I'm a bit confused looking at the test case..

You can have zero-trip loop, in which case loop_exit can be reached without going into loop_body, why loop back edge needs to be not less than exit weight?

Entering line 351 means HasConditionalPreHeader is false. That means this loop body must be executed at least once. Each execution of loop body must have a back edge to latch. Then it either go back to loop body or exit.

In principle, profile inference for sample PGO should make input counts consistent. If there's input count inconsistency, we need track down the source of inconsistency. Fixing the input count inconsistency in a transformation isn't the right approach IMO. (The existing heuristic are making guesses for loop rotation which is different and is fine)

Ideally, It should be fixed when loading profile in SampleProfileLoader. For this case, unless SampleProfilerLoader detecting the loop body must be executed at least (I think it's not easy for SampleProfileLoader), it have no idea whether this profile data is good.

WenleiHe · 2024-03-28T06:49:28Z

Ok, thanks for clarification. Please add a comment in the code explaining this case. Also clean up the debug prints.

WenleiHe

The latest revision LGTM, thanks.

alexfh · 2024-04-04T11:29:19Z

We see regressions of up to 50% (depending on microarchitecture) in llvm-test-suite microbenchmarks, when compiled with PGO+thinlto:

I don't think we consider these important, but I thought you might be interested in this observation.

HaohaiWen added 2 commits March 25, 2024 20:29

[LoopRotate] Add test to track update for inaccurate branch weight

31d4ce6

Branch weight from sample-based PGO may be not inaccurate due to sampling. This test tracks such case where updateBranchWeights wraps unsigned.

[LoopRotate] Set loop back edge weight to not less than exit weight

2b3c4a2

Branch weight from sample-based PGO may be not inaccurate due to sampling. If the loop body must be executed, then origin loop back edge weight must be not less than exit weight.

HaohaiWen requested review from MatzeB and williamweixiao March 25, 2024 12:37

llvmbot added the llvm:transforms label Mar 25, 2024

williamweixiao approved these changes Mar 26, 2024

View reviewed changes

HaohaiWen mentioned this pull request Mar 28, 2024

[InlineCost] Disable cost-benefit when sample based PGO is used #86626

Merged

WenleiHe reviewed Mar 28, 2024

View reviewed changes

WenleiHe requested changes Mar 28, 2024

View reviewed changes

Address comment

db9a078

WenleiHe approved these changes Mar 28, 2024

View reviewed changes

Merge remote-tracking branch 'origin/main' into looprotate

a146e92

HaohaiWen merged commit 896037c into llvm:main Mar 28, 2024
4 checks passed

HaohaiWen deleted the looprotate branch March 28, 2024 12:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[LoopRotate] Set loop back edge weight to not less than exit weight #86496

[LoopRotate] Set loop back edge weight to not less than exit weight #86496

HaohaiWen commented Mar 25, 2024

llvmbot commented Mar 25, 2024

github-actions bot commented Mar 25, 2024

github-actions bot commented Mar 25, 2024

HaohaiWen commented Mar 27, 2024

WenleiHe left a comment

WenleiHe Mar 28, 2024

HaohaiWen Mar 28, 2024

WenleiHe Mar 28, 2024

WenleiHe left a comment •

edited

MatzeB commented Mar 28, 2024 •

edited

MatzeB commented Mar 28, 2024

MatzeB commented Mar 28, 2024

HaohaiWen commented Mar 28, 2024

WenleiHe commented Mar 28, 2024

WenleiHe left a comment

alexfh commented Apr 4, 2024

[LoopRotate] Set loop back edge weight to not less than exit weight #86496

[LoopRotate] Set loop back edge weight to not less than exit weight #86496

Conversation

HaohaiWen commented Mar 25, 2024

llvmbot commented Mar 25, 2024

github-actions bot commented Mar 25, 2024

github-actions bot commented Mar 25, 2024

HaohaiWen commented Mar 27, 2024

WenleiHe left a comment

Choose a reason for hiding this comment

WenleiHe Mar 28, 2024

Choose a reason for hiding this comment

HaohaiWen Mar 28, 2024

Choose a reason for hiding this comment

WenleiHe Mar 28, 2024

Choose a reason for hiding this comment

WenleiHe left a comment • edited

Choose a reason for hiding this comment

MatzeB commented Mar 28, 2024 • edited

MatzeB commented Mar 28, 2024

MatzeB commented Mar 28, 2024

HaohaiWen commented Mar 28, 2024

WenleiHe commented Mar 28, 2024

WenleiHe left a comment

Choose a reason for hiding this comment

alexfh commented Apr 4, 2024

WenleiHe left a comment •

edited

MatzeB commented Mar 28, 2024 •

edited