[profcheck] Add unknown branch weights to expanded cmpxchg loop. #165841
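@llvm/pr-subscribers-backend-powerpc @llvm/pr-subscribers-backend-risc-v

Author: Jin Huang (jinhuang1102)

Changes

The AtomicExpandPass is responsible for lowering high-level atomic operations (like `atomicrmw fadd`) that are unsupported by the target hardware into a cmpxchg retry loop. Because precise branch weights for this loop cannot be determined empirically, the pass uses the `setExplicitlyUnknownBranchWeightsIfProfiled` function to explicitly add "unknown" (50/50) branch weights to the loop's conditional branch.

Patch is 2.90 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/165841.diff

23 Files Affected: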
diff --git a/llvm/lib/CodeGen/AtomicExpandPass.cpp b/llvm/lib/CodeGen/AtomicExpandPass.cpp
index 53f1cfe24a68d..dffb69425bb31 100644
--- a/llvm/lib/CodeGen/AtomicExpandPass.cpp
+++ b/llvm/lib/CodeGen/AtomicExpandPass.cpp
@@ -38,6 +38,7 @@
#include "llvm/IR/MDBuilder.h"
#include "llvm/IR/MemoryModelRelaxationAnnotations.h"
#include "llvm/IR/Module.h"
+#include "llvm/IR/ProfDataUtils.h"
#include "llvm/IR/Type.h"
#include "llvm/IR/User.h"
#include "llvm/IR/Value.h"
@@ -1259,8 +1260,7 @@ Value *AtomicExpandImpl::insertRMWLLSCLoop(
BasicBlock *BB = Builder.GetInsertBlock();
Function *F = BB->getParent();
- assert(AddrAlign >=
- F->getDataLayout().getTypeStoreSize(ResultTy) &&
+ assert(AddrAlign >= F->getDataLayout().getTypeStoreSize(ResultTy) &&
"Expected at least natural alignment at this point.");
// Given: atomicrmw some_op iN* %addr, iN %incr ordering
@@ -1680,7 +1680,12 @@ Value *AtomicExpandImpl::insertRMWCmpXchgLoop(
Loaded->addIncoming(NewLoaded, LoopBB);
- Builder.CreateCondBr(Success, ExitBB, LoopBB);
+ Instruction *CondBr = Builder.CreateCondBr(Success, ExitBB, LoopBB);
+
+ // Atomic RMW expands to a cmpxchg loop, Since precise branch weights
+ // cannot be easily determined here, we mark the branch as "unknown" (50/50)
+ // to prevent misleading optimizations.
+ setExplicitlyUnknownBranchWeightsIfProfiled(*CondBr, *F, DEBUG_TYPE);
Builder.SetInsertPoint(ExitBB, ExitBB->begin());
return NewLoaded;
diff --git a/llvm/test/Transforms/AtomicExpand/AArch64/atomicrmw-fp.ll b/llvm/test/Transforms/AtomicExpand/AArch64/atomicrmw-fp.ll
index 8ffacb9bdd5f6..fe42a5439857c 100644
--- a/llvm/test/Transforms/AtomicExpand/AArch64/atomicrmw-fp.ll
+++ b/llvm/test/Transforms/AtomicExpand/AArch64/atomicrmw-fp.ll
@@ -14,7 +14,7 @@ define float @test_atomicrmw_fadd_f32(ptr %ptr, float %value) {
; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP4]], 1
; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i32, i1 } [[TMP4]], 0
; CHECK-NEXT: [[TMP5]] = bitcast i32 [[NEWLOADED]] to float
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1:![0-9]+]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret float [[TMP5]]
;
@@ -35,7 +35,7 @@ define float @test_atomicrmw_fsub_f32(ptr %ptr, float %value) {
; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP4]], 1
; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i32, i1 } [[TMP4]], 0
; CHECK-NEXT: [[TMP5]] = bitcast i32 [[NEWLOADED]] to float
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret float [[TMP5]]
;
@@ -56,7 +56,7 @@ define float @atomicrmw_fmin_float(ptr %ptr, float %value) {
; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP5]], 1
; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i32, i1 } [[TMP5]], 0
; CHECK-NEXT: [[TMP6]] = bitcast i32 [[NEWLOADED]] to float
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret float [[TMP6]]
;
@@ -77,7 +77,7 @@ define float @atomicrmw_fmax_float(ptr %ptr, float %value) {
; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP5]], 1
; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i32, i1 } [[TMP5]], 0
; CHECK-NEXT: [[TMP6]] = bitcast i32 [[NEWLOADED]] to float
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret float [[TMP6]]
;
@@ -98,7 +98,7 @@ define double @atomicrmw_fmin_double(ptr %ptr, double %value) {
; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i64, i1 } [[TMP5]], 1
; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i64, i1 } [[TMP5]], 0
; CHECK-NEXT: [[TMP6]] = bitcast i64 [[NEWLOADED]] to double
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret double [[TMP6]]
;
@@ -119,7 +119,7 @@ define double @atomicrmw_fmax_double(ptr %ptr, double %value) {
; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i64, i1 } [[TMP5]], 1
; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i64, i1 } [[TMP5]], 0
; CHECK-NEXT: [[TMP6]] = bitcast i64 [[NEWLOADED]] to double
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret double [[TMP6]]
;
@@ -140,7 +140,7 @@ define float @atomicrmw_fminimum_float(ptr %ptr, float %value) {
; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP5]], 1
; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i32, i1 } [[TMP5]], 0
; CHECK-NEXT: [[TMP6]] = bitcast i32 [[NEWLOADED]] to float
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret float [[TMP6]]
;
@@ -161,7 +161,7 @@ define float @atomicrmw_fmaximum_float(ptr %ptr, float %value) {
; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP5]], 1
; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i32, i1 } [[TMP5]], 0
; CHECK-NEXT: [[TMP6]] = bitcast i32 [[NEWLOADED]] to float
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret float [[TMP6]]
;
@@ -182,7 +182,7 @@ define double @atomicrmw_fminimum_double(ptr %ptr, double %value) {
; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i64, i1 } [[TMP5]], 1
; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i64, i1 } [[TMP5]], 0
; CHECK-NEXT: [[TMP6]] = bitcast i64 [[NEWLOADED]] to double
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret double [[TMP6]]
;
@@ -203,7 +203,7 @@ define double @atomicrmw_fmaximum_double(ptr %ptr, double %value) {
; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i64, i1 } [[TMP5]], 1
; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i64, i1 } [[TMP5]], 0
; CHECK-NEXT: [[TMP6]] = bitcast i64 [[NEWLOADED]] to double
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret double [[TMP6]]
;
@@ -224,7 +224,7 @@ define bfloat @atomicrmw_fmaximum_bfloat(ptr %ptr, bfloat %val) {
; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i16, i1 } [[TMP5]], 1
; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i16, i1 } [[TMP5]], 0
; CHECK-NEXT: [[TMP6]] = bitcast i16 [[NEWLOADED]] to bfloat
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret bfloat [[TMP6]]
;
@@ -245,7 +245,7 @@ define half @atomicrmw_fmaximum_half(ptr %ptr, half %val) {
; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i16, i1 } [[TMP5]], 1
; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i16, i1 } [[TMP5]], 0
; CHECK-NEXT: [[TMP6]] = bitcast i16 [[NEWLOADED]] to half
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret half [[TMP6]]
;
@@ -266,7 +266,7 @@ define <2 x half> @atomicrmw_fmaximum_2_x_half(ptr %ptr, <2 x half> %val) {
; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP5]], 1
; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i32, i1 } [[TMP5]], 0
; CHECK-NEXT: [[TMP6]] = bitcast i32 [[NEWLOADED]] to <2 x half>
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret <2 x half> [[TMP6]]
;
@@ -287,7 +287,7 @@ define bfloat @atomicrmw_fminimum_bfloat(ptr %ptr, bfloat %val) {
; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i16, i1 } [[TMP5]], 1
; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i16, i1 } [[TMP5]], 0
; CHECK-NEXT: [[TMP6]] = bitcast i16 [[NEWLOADED]] to bfloat
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret bfloat [[TMP6]]
;
@@ -308,7 +308,7 @@ define half @atomicrmw_fminimum_half(ptr %ptr, half %val) {
; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i16, i1 } [[TMP5]], 1
; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i16, i1 } [[TMP5]], 0
; CHECK-NEXT: [[TMP6]] = bitcast i16 [[NEWLOADED]] to half
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret half [[TMP6]]
;
@@ -329,7 +329,7 @@ define <2 x half> @atomicrmw_fminimum_2_x_half(ptr %ptr, <2 x half> %val) {
; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i32, i1 } [[TMP5]], 1
; CHECK-NEXT: [[NEWLOADED:%.*]] = extractvalue { i32, i1 } [[TMP5]], 0
; CHECK-NEXT: [[TMP6]] = bitcast i32 [[NEWLOADED]] to <2 x half>
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret <2 x half> [[TMP6]]
;
diff --git a/llvm/test/Transforms/AtomicExpand/AArch64/pcsections.ll b/llvm/test/Transforms/AtomicExpand/AArch64/pcsections.ll
index c5c890559152d..5c60f21c207b9 100644
--- a/llvm/test/Transforms/AtomicExpand/AArch64/pcsections.ll
+++ b/llvm/test/Transforms/AtomicExpand/AArch64/pcsections.ll
@@ -4,7 +4,7 @@
define i8 @atomic8_load_unordered(ptr %a) nounwind uwtable {
; CHECK-LABEL: @atomic8_load_unordered(
; CHECK-NEXT: entry:
-; CHECK-NEXT: [[TMP0:%.*]] = load atomic i8, ptr [[A:%.*]] unordered, align 1, !pcsections [[META0:![0-9]+]]
+; CHECK-NEXT: [[TMP0:%.*]] = load atomic i8, ptr [[A:%.*]] unordered, align 1, !pcsections [[META1:![0-9]+]]
; CHECK-NEXT: ret i8 [[TMP0]]
;
entry:
@@ -15,7 +15,7 @@ entry:
define i8 @atomic8_load_monotonic(ptr %a) nounwind uwtable {
; CHECK-LABEL: @atomic8_load_monotonic(
; CHECK-NEXT: entry:
-; CHECK-NEXT: [[TMP0:%.*]] = load atomic i8, ptr [[A:%.*]] monotonic, align 1, !pcsections [[META0]]
+; CHECK-NEXT: [[TMP0:%.*]] = load atomic i8, ptr [[A:%.*]] monotonic, align 1, !pcsections [[META1]]
; CHECK-NEXT: ret i8 [[TMP0]]
;
entry:
@@ -26,7 +26,7 @@ entry:
define i8 @atomic8_load_acquire(ptr %a) nounwind uwtable {
; CHECK-LABEL: @atomic8_load_acquire(
; CHECK-NEXT: entry:
-; CHECK-NEXT: [[TMP0:%.*]] = load atomic i8, ptr [[A:%.*]] acquire, align 1, !pcsections [[META0]]
+; CHECK-NEXT: [[TMP0:%.*]] = load atomic i8, ptr [[A:%.*]] acquire, align 1, !pcsections [[META1]]
; CHECK-NEXT: ret i8 [[TMP0]]
;
entry:
@@ -37,7 +37,7 @@ entry:
define i8 @atomic8_load_seq_cst(ptr %a) nounwind uwtable {
; CHECK-LABEL: @atomic8_load_seq_cst(
; CHECK-NEXT: entry:
-; CHECK-NEXT: [[TMP0:%.*]] = load atomic i8, ptr [[A:%.*]] seq_cst, align 1, !pcsections [[META0]]
+; CHECK-NEXT: [[TMP0:%.*]] = load atomic i8, ptr [[A:%.*]] seq_cst, align 1, !pcsections [[META1]]
; CHECK-NEXT: ret i8 [[TMP0]]
;
entry:
@@ -48,7 +48,7 @@ entry:
define void @atomic8_store_unordered(ptr %a) nounwind uwtable {
; CHECK-LABEL: @atomic8_store_unordered(
; CHECK-NEXT: entry:
-; CHECK-NEXT: store atomic i8 0, ptr [[A:%.*]] unordered, align 1, !pcsections [[META0]]
+; CHECK-NEXT: store atomic i8 0, ptr [[A:%.*]] unordered, align 1, !pcsections [[META1]]
; CHECK-NEXT: ret void
;
entry:
@@ -59,7 +59,7 @@ entry:
define void @atomic8_store_monotonic(ptr %a) nounwind uwtable {
; CHECK-LABEL: @atomic8_store_monotonic(
; CHECK-NEXT: entry:
-; CHECK-NEXT: store atomic i8 0, ptr [[A:%.*]] monotonic, align 1, !pcsections [[META0]]
+; CHECK-NEXT: store atomic i8 0, ptr [[A:%.*]] monotonic, align 1, !pcsections [[META1]]
; CHECK-NEXT: ret void
;
entry:
@@ -70,7 +70,7 @@ entry:
define void @atomic8_store_release(ptr %a) nounwind uwtable {
; CHECK-LABEL: @atomic8_store_release(
; CHECK-NEXT: entry:
-; CHECK-NEXT: store atomic i8 0, ptr [[A:%.*]] release, align 1, !pcsections [[META0]]
+; CHECK-NEXT: store atomic i8 0, ptr [[A:%.*]] release, align 1, !pcsections [[META1]]
; CHECK-NEXT: ret void
;
entry:
@@ -81,7 +81,7 @@ entry:
define void @atomic8_store_seq_cst(ptr %a) nounwind uwtable {
; CHECK-LABEL: @atomic8_store_seq_cst(
; CHECK-NEXT: entry:
-; CHECK-NEXT: store atomic i8 0, ptr [[A:%.*]] seq_cst, align 1, !pcsections [[META0]]
+; CHECK-NEXT: store atomic i8 0, ptr [[A:%.*]] seq_cst, align 1, !pcsections [[META1]]
; CHECK-NEXT: ret void
;
entry:
@@ -92,14 +92,14 @@ entry:
define void @atomic8_xchg_monotonic(ptr %a) nounwind uwtable {
; CHECK-LABEL: @atomic8_xchg_monotonic(
; CHECK-NEXT: entry:
-; CHECK-NEXT: [[TMP0:%.*]] = load i8, ptr [[A:%.*]], align 1, !pcsections [[META0]]
-; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]], !pcsections [[META0]]
+; CHECK-NEXT: [[TMP0:%.*]] = load i8, ptr [[A:%.*]], align 1, !pcsections [[META1]]
+; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]], !pcsections [[META1]]
; CHECK: atomicrmw.start:
-; CHECK-NEXT: [[LOADED:%.*]] = phi i8 [ [[TMP0]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ], !pcsections [[META0]]
-; CHECK-NEXT: [[TMP1:%.*]] = cmpxchg ptr [[A]], i8 [[LOADED]], i8 0 monotonic monotonic, align 1, !pcsections [[META0]]
-; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i8, i1 } [[TMP1]], 1, !pcsections [[META0]]
-; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i8, i1 } [[TMP1]], 0, !pcsections [[META0]]
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !pcsections [[META0]]
+; CHECK-NEXT: [[LOADED:%.*]] = phi i8 [ [[TMP0]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ], !pcsections [[META1]]
+; CHECK-NEXT: [[TMP1:%.*]] = cmpxchg ptr [[A]], i8 [[LOADED]], i8 0 monotonic monotonic, align 1, !pcsections [[META1]]
+; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i8, i1 } [[TMP1]], 1, !pcsections [[META1]]
+; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i8, i1 } [[TMP1]], 0, !pcsections [[META1]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF2:![0-9]+]], !pcsections [[META1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret void
;
@@ -111,14 +111,14 @@ entry:
define void @atomic8_add_monotonic(ptr %a) nounwind uwtable {
; CHECK-LABEL: @atomic8_add_monotonic(
; CHECK-NEXT: entry:
-; CHECK-NEXT: [[TMP0:%.*]] = load i8, ptr [[A:%.*]], align 1, !pcsections [[META0]]
-; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]], !pcsections [[META0]]
+; CHECK-NEXT: [[TMP0:%.*]] = load i8, ptr [[A:%.*]], align 1, !pcsections [[META1]]
+; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]], !pcsections [[META1]]
; CHECK: atomicrmw.start:
-; CHECK-NEXT: [[LOADED:%.*]] = phi i8 [ [[TMP0]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ], !pcsections [[META0]]
-; CHECK-NEXT: [[TMP1:%.*]] = cmpxchg ptr [[A]], i8 [[LOADED]], i8 [[LOADED]] monotonic monotonic, align 1, !pcsections [[META0]]
-; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i8, i1 } [[TMP1]], 1, !pcsections [[META0]]
-; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i8, i1 } [[TMP1]], 0, !pcsections [[META0]]
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !pcsections [[META0]]
+; CHECK-NEXT: [[LOADED:%.*]] = phi i8 [ [[TMP0]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ], !pcsections [[META1]]
+; CHECK-NEXT: [[TMP1:%.*]] = cmpxchg ptr [[A]], i8 [[LOADED]], i8 [[LOADED]] monotonic monotonic, align 1, !pcsections [[META1]]
+; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i8, i1 } [[TMP1]], 1, !pcsections [[META1]]
+; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i8, i1 } [[TMP1]], 0, !pcsections [[META1]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF2]], !pcsections [[META1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret void
;
@@ -130,14 +130,14 @@ entry:
define void @atomic8_sub_monotonic(ptr %a) nounwind uwtable {
; CHECK-LABEL: @atomic8_sub_monotonic(
; CHECK-NEXT: entry:
-; CHECK-NEXT: [[TMP0:%.*]] = load i8, ptr [[A:%.*]], align 1, !pcsections [[META0]]
-; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]], !pcsections [[META0]]
+; CHECK-NEXT: [[TMP0:%.*]] = load i8, ptr [[A:%.*]], align 1, !pcsections [[META1]]
+; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]], !pcsections [[META1]]
; CHECK: atomicrmw.start:
-; CHECK-NEXT: [[LOADED:%.*]] = phi i8 [ [[TMP0]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ], !pcsections [[META0]]
-; CHECK-NEXT: [[TMP1:%.*]] = cmpxchg ptr [[A]], i8 [[LOADED]], i8 [[LOADED]] monotonic monotonic, align 1, !pcsections [[META0]]
-; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i8, i1 } [[TMP1]], 1, !pcsections [[META0]]
-; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i8, i1 } [[TMP1]], 0, !pcsections [[META0]]
-; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !pcsections [[META0]]
+; CHECK-NEXT: [[LOADED:%.*]] = phi i8 [ [[TMP0]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ], !pcsections [[META1]]
+; CHECK-NEXT: [[TMP1:%.*]] = cmpxchg ptr [[A]], i8 [[LOADED]], i8 [[LOADED]] monotonic monotonic, align 1, !pcsections [[META1]]
+; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i8, i1 } [[TMP1]], 1, !pcsections [[META1]]
+; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i8, i1 } [[TMP1]], 0, !pcsections [[META1]]
+; CHECK-NEXT: br i1 [[SUCCESS]], label [[ATOMICRMW_END:%.*]], label [[ATOMICRMW_START]], !prof [[PROF2]], !pcsections [[META1]]
; CHECK: atomicrmw.end:
; CHECK-NEXT: ret void
;
@@ -149,14 +149,14 @@ entry:
define void @atomic8_and_monotonic(ptr %a) nounwind uwtable {
; CHECK-LABEL: @atomic8_and_monotonic(
; CHECK-NEXT: entry:
-; CHECK-NEXT: [[TMP0:%.*]] = load i8, ptr [[A:%.*]], align 1, !pcsections [[META0]]
-; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]], !pcsections [[META0]]
+; CHECK-NEXT: [[TMP0:%.*]] = load i8, ptr [[A:%.*]], align 1, !pcsections [[META1]]
+; CHECK-NEXT: br label [[ATOMICRMW_START:%.*]], !pcsections [[META1]]
; CHECK: atomicrmw.start:
-; CHECK-NEXT: [[LOADED:%.*]] = phi i8 [ [[TMP0]], [[ENTRY:%.*]] ], [ [[NEWLOADED:%.*]], [[ATOMICRMW_START]] ], !pcsections [[META0]]
-; CHECK-NEXT: [[TMP1:%.*]] = cmpxchg ptr [[A]], i8 [[LOADED]], i8 0 monotonic monotonic, align 1, !pcsections [[META0]]
-; CHECK-NEXT: [[SUCCESS:%.*]] = extractvalue { i8, i1 } [[TMP1]], 1, !pcsections [[META0]]
-; CHECK-NEXT: [[NEWLOADED]] = extractvalue { i8, i1 } [[TMP1]], 0, !pcse...
[truncated]
arsenm left a comment:
Not sure I understand this; if you're going to treat it as unknown 50/50, wouldn't that be the default without adding the metadata? In this context 50/50 is a poor guess; since 36cb33b we set likely branch weights.
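For readers following the discussion, the loop whose exit branch is being annotated can be pictured with a small standalone C++ analogue. This is only a sketch built on `std::atomic`, not LLVM code: `atomic_fadd` is a hypothetical helper name, and the memory orderings are illustrative assumptions. It mirrors the shape of the expanded IR in the tests above: an initial load, a compare-exchange attempt, and a back edge taken on failure.

```cpp
#include <atomic>

// Hypothetical helper (not part of LLVM): emulates `atomicrmw fadd` with a
// compare-exchange retry loop, similar in shape to the expanded IR.
float atomic_fadd(std::atomic<float> &addr, float incr) {
  // Entry block: plain initial load of the current value.
  float loaded = addr.load(std::memory_order_relaxed);

  // atomicrmw.start: try to publish `loaded + incr`.
  // On failure, compare_exchange_strong stores the value it observed into
  // `loaded` (the NEWLOADED input of the phi) and the loop retries.
  while (!addr.compare_exchange_strong(loaded, loaded + incr,
                                       std::memory_order_seq_cst,
                                       std::memory_order_relaxed)) {
    // Back edge: the conditional branch closing this loop is the one
    // that receives the !prof annotation in the patch.
  }

  // atomicrmw.end: `loaded` is the value that was in memory before the add.
  return loaded;
}
```

In the uncontended case the compare-exchange succeeds on the first attempt and the back edge is never taken, which is consistent with the reviewer's concern that 50/50 may be a pessimistic estimate for this branch.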
- assert(AddrAlign >=
- F->getDataLayout().getTypeStoreSize(ResultTy) &&
+ assert(AddrAlign >= F->getDataLayout().getTypeStoreSize(ResultTy) &&
Unrelated change
restore the changes. Thx
See https://discourse.llvm.org/t/rfc-profile-information-propagation-unittesting/73595
boomanaiden154 left a comment:
This should probably remove the test case(s) it fixed from the profcheck-xfail.txt list?
Thanks for letting me know! Yes, I will remove all of these expected-failure tests after the PR is merged!
Is there a reason you can't update it as part of this PR?
As a follow-up to PR #165841, this change addresses `prof_md` metadata loss in AtomicExpandPass when lowering `atomicrmw xchg` to a Load-Linked/Store-Exclusive (LL/SC) loop.

This path is distinct from the LSE path addressed previously:

- PR #165841 (and its tests) used `-mtriple=aarch64-linux-gnu`, which targets a modern **ARMv8.1+** architecture. This architecture supports **Large System Extensions (LSE)**, allowing `atomicrmw` to be lowered directly to a more efficient hardware instruction.
- This PR (and its tests) uses `-mtriple=aarch64--` or `-mtriple=armv8-linux-gnueabihf`. This indicates an **ARMv8.0 or lower** architecture that does not support LSE. On these targets, the pass must fall back to synthesizing a manual LL/SC loop using the `ldaxr`/`stxr` instruction pair.

Similar to the previous issue, the new conditional branch was failing to inherit the `prof_md` metadata. This PR attaches branch weights to the newly created branch within the LL/SC loop, ensuring profile information is preserved.

Co-authored-by: Jin Huang <jingold@google.com>
Because not all 45 failing tests are fixed by this single PR. I will update them after all the PRs are merged.
!0 = !{!"function_entry_count", i64 1000}
;.
; CHECK: attributes #[[ATTR0:[0-9]+]] = { nocallback nocreateundeforpoison nofree nosync nounwind speculatable willreturn memory(none) }
Manually add nocreateundeforpoison to pass the upstream test checks.
I'm guessing something else was changed to infer the new attribute and your local opt binary was too old.
Can you not add the ones that do get fixed?
Sure! Added the fixed tests in the description.
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/159/builds/34516
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/177/builds/23762
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/30375
// Atomic RMW expands to a cmpxchg loop, Since precise branch weights
// cannot be easily determined here, we mark the branch as "unknown" (50/50)
// to prevent misleading optimizations.
setExplicitlyUnknownBranchWeightsIfProfiled(*CondBr, *F, DEBUG_TYPE);

Suggested change:
- setExplicitlyUnknownBranchWeightsIfProfiled(*CondBr, *F, DEBUG_TYPE);
+ setExplicitlyUnknownBranchWeightsIfProfiled(*CondBr, DEBUG_TYPE, *F);
The argument order seems wrong (it was changed 2h ago in #166032).
Thank you Tim!
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/204/builds/27054
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/203/builds/28242
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/205/builds/27033
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/2/builds/37654
#165841 (review) (earlier on this PR) links to what fixed the build failures.
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/144/builds/39438
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/137/builds/28490
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/33/builds/25998
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/185/builds/28312
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/129/builds/32530
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/160/builds/27787
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/180/builds/27927
The AtomicExpandPass is responsible for lowering high-level atomic operations (like `atomicrmw fadd`) that are unsupported by the target hardware into a cmpxchg retry loop.

Because precise branch weights for this loop cannot be determined empirically, the pass uses the `setExplicitlyUnknownBranchWeightsIfProfiled` function to explicitly add "unknown" (50/50) branch weights to the loop's conditional branch.

This PR includes fixes for the following tests: