-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TBAA] Add verifier for tbaa.struct metadata #86709
Conversation
@llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-llvm-ir Author: Julian Nagele (juliannagele) ChangesAdds logic to the IR verifier that checks whether !tbaa.struct nodes are well-formed. That is, it checks that the operands of !tbaa.struct nodes are in groups of three, that each group of three operands consists of two integers and a valid tbaa node, and that the regions described by the offset and size operands are non-overlapping. Full diff: https://github.com/llvm/llvm-project/pull/86709.diff 9 Files Affected:
diff --git a/llvm/include/llvm/IR/Verifier.h b/llvm/include/llvm/IR/Verifier.h
index b25f8eb77ee38b..b7db6e0bbfb71c 100644
--- a/llvm/include/llvm/IR/Verifier.h
+++ b/llvm/include/llvm/IR/Verifier.h
@@ -77,6 +77,7 @@ class TBAAVerifier {
/// Visit an instruction and return true if it is valid, return false if an
/// invalid TBAA is attached.
bool visitTBAAMetadata(Instruction &I, const MDNode *MD);
+ bool visitTBAAStructMetadata(Instruction &I, const MDNode *MD);
};
/// Check a function for errors, useful for use when debugging a
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 33f358440a312d..3035c6f53ad0d4 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -5096,6 +5096,9 @@ void Verifier::visitInstruction(Instruction &I) {
if (MDNode *TBAA = I.getMetadata(LLVMContext::MD_tbaa))
TBAAVerifyHelper.visitTBAAMetadata(I, TBAA);
+ if (MDNode *TBAA = I.getMetadata(LLVMContext::MD_tbaa_struct))
+ TBAAVerifyHelper.visitTBAAStructMetadata(I, TBAA);
+
if (MDNode *MD = I.getMetadata(LLVMContext::MD_noalias))
visitAliasScopeListMetadata(MD);
if (MDNode *MD = I.getMetadata(LLVMContext::MD_alias_scope))
@@ -7419,6 +7422,35 @@ bool TBAAVerifier::visitTBAAMetadata(Instruction &I, const MDNode *MD) {
return true;
}
+bool TBAAVerifier::visitTBAAStructMetadata(Instruction &I, const MDNode *MD) {
+ CheckTBAA(MD->getNumOperands() % 3 == 0,
+ "tbaa.struct operands must occur in groups of three", &I, MD);
+
+ // Each group of three operands must consist of two integers and a
+ // tbaa node. Moreover, the regions described by the offset and size
+ // operands must be non-overlapping.
+ std::optional<APInt> NextFree;
+ for (unsigned int Idx = 0; Idx < MD->getNumOperands(); Idx += 3) {
+ auto *OffsetCI =
+ mdconst::dyn_extract_or_null<ConstantInt>(MD->getOperand(Idx));
+ CheckTBAA(OffsetCI, "Offset must be a constant integer", &I, MD);
+
+ auto *SizeCI =
+ mdconst::dyn_extract_or_null<ConstantInt>(MD->getOperand(Idx + 1));
+ CheckTBAA(SizeCI, "Size must be a constant integer", &I, MD);
+
+ MDNode *TBAA = dyn_cast_or_null<MDNode>(MD->getOperand(Idx + 2));
+ CheckTBAA(TBAA, "TBAA tag missing", &I, MD);
+ visitTBAAMetadata(I, TBAA);
+
+ bool NonOverlapping = !NextFree || NextFree->ule(OffsetCI->getValue());
+ CheckTBAA(NonOverlapping, "Overlapping tbaa.struct regions", &I, MD);
+
+ NextFree = OffsetCI->getValue() + SizeCI->getValue();;
+ }
+ return true;
+}
+
char VerifierLegacyPass::ID = 0;
INITIALIZE_PASS(VerifierLegacyPass, "verify", "Module Verifier", false, false)
diff --git a/llvm/test/CodeGen/AArch64/arm64-abi_align.ll b/llvm/test/CodeGen/AArch64/arm64-abi_align.ll
index 089e171e5a4a79..c9fd2d38e27acd 100644
--- a/llvm/test/CodeGen/AArch64/arm64-abi_align.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-abi_align.ll
@@ -518,4 +518,6 @@ attributes #5 = { nobuiltin }
!1 = !{!"omnipotent char", !2}
!2 = !{!"Simple C/C++ TBAA"}
!3 = !{!"short", !1}
-!4 = !{i64 0, i64 4, !0, i64 4, i64 2, !3, i64 8, i64 4, !0, i64 12, i64 2, !3, i64 16, i64 4, !0, i64 20, i64 2, !3}
+!4 = !{i64 0, i64 4, !5, i64 4, i64 2, !6, i64 8, i64 4, !5, i64 12, i64 2, !6, i64 16, i64 4, !5, i64 20, i64 2, !6}
+!5 = !{!0, !0, i64 0}
+!6 = !{!3, !3, i64 0}
diff --git a/llvm/test/Transforms/InferAddressSpaces/AMDGPU/mem-intrinsics.ll b/llvm/test/Transforms/InferAddressSpaces/AMDGPU/mem-intrinsics.ll
index 50b0e7a0f5471b..2f264a2432fc3d 100644
--- a/llvm/test/Transforms/InferAddressSpaces/AMDGPU/mem-intrinsics.ll
+++ b/llvm/test/Transforms/InferAddressSpaces/AMDGPU/mem-intrinsics.ll
@@ -141,4 +141,4 @@ attributes #1 = { argmemonly nounwind }
!5 = distinct !{!5, !"some domain"}
!6 = !{!7}
!7 = distinct !{!7, !5, !"some scope 2"}
-!8 = !{i64 0, i64 8, null}
+!8 = !{i64 0, i64 8, !0}
diff --git a/llvm/test/Transforms/InstCombine/struct-assign-tbaa.ll b/llvm/test/Transforms/InstCombine/struct-assign-tbaa.ll
index 996d2c0e67e165..d079c03f1dcb93 100644
--- a/llvm/test/Transforms/InstCombine/struct-assign-tbaa.ll
+++ b/llvm/test/Transforms/InstCombine/struct-assign-tbaa.ll
@@ -75,7 +75,7 @@ entry:
!1 = !{!"omnipotent char", !0}
!2 = !{!5, !5, i64 0}
!3 = !{i64 0, i64 4, !2}
-!4 = !{i64 0, i64 8, null}
+!4 = !{i64 0, i64 8, !2}
!5 = !{!"float", !0}
!6 = !{i64 0, i64 4, !2, i64 4, i64 4, !2}
!7 = !{i64 0, i64 2, !2, i64 4, i64 6, !2}
diff --git a/llvm/test/Transforms/SROA/tbaa-struct3.ll b/llvm/test/Transforms/SROA/tbaa-struct3.ll
index 0fcd787fef9769..61034de81e4b27 100644
--- a/llvm/test/Transforms/SROA/tbaa-struct3.ll
+++ b/llvm/test/Transforms/SROA/tbaa-struct3.ll
@@ -539,7 +539,7 @@ declare void @llvm.memcpy.p0.p0.i64(ptr noalias nocapture writeonly, ptr noalias
!6 = !{!5, !5, i64 0}
!7 = !{i64 0, i64 8, !6, i64 8, i64 4, !1}
!8 = !{i64 0, i64 4, !1, i64 4, i64 8, !6}
-!9 = !{i64 0, i64 8, !6, i64 4, i64 8, !1}
+!9 = !{i64 0, i64 8, !6, i64 8, i64 8, !1}
!10 = !{i64 0, i64 2, !1, i64 2, i64 2, !1}
!11 = !{i64 0, i64 1, !1, i64 1, i64 3, !1}
!12 = !{i64 0, i64 2, !1, i64 2, i64 6, !1}
diff --git a/llvm/test/Transforms/Scalarizer/basic-inseltpoison.ll b/llvm/test/Transforms/Scalarizer/basic-inseltpoison.ll
index bbcdcb6f586742..73ae66dd76c66e 100644
--- a/llvm/test/Transforms/Scalarizer/basic-inseltpoison.ll
+++ b/llvm/test/Transforms/Scalarizer/basic-inseltpoison.ll
@@ -836,5 +836,6 @@ define <2 x i32> @f23_crash(<2 x i32> %srcvec, i32 %v1) {
!2 = !{ !"set2", !0 }
!3 = !{ !3, !{!"llvm.loop.parallel_accesses", !13} }
!4 = !{ float 4.0 }
-!5 = !{ i64 0, i64 8, null }
+!5 = !{ i64 0, i64 8, !6 }
+!6 = !{ !1, !1, i64 0 }
!13 = distinct !{}
diff --git a/llvm/test/Transforms/Scalarizer/basic.ll b/llvm/test/Transforms/Scalarizer/basic.ll
index db7c5f535f7e9d..87a70ccd3fc7c5 100644
--- a/llvm/test/Transforms/Scalarizer/basic.ll
+++ b/llvm/test/Transforms/Scalarizer/basic.ll
@@ -870,5 +870,6 @@ define <2 x float> @f25(<2 x float> %src) {
!2 = !{ !"set2", !0 }
!3 = !{ !3, !{!"llvm.loop.parallel_accesses", !13} }
!4 = !{ float 4.0 }
-!5 = !{ i64 0, i64 8, null }
+!5 = !{ i64 0, i64 8, !6 }
+!6 = !{ !1, !1, i64 0 }
!13 = distinct !{}
diff --git a/llvm/test/Verifier/tbaa-struct.ll b/llvm/test/Verifier/tbaa-struct.ll
index b8ddc7cee496a9..14c19a19d5ae89 100644
--- a/llvm/test/Verifier/tbaa-struct.ll
+++ b/llvm/test/Verifier/tbaa-struct.ll
@@ -1,28 +1,36 @@
-; RUN: llvm-as < %s 2>&1
-
-; FIXME: The verifer should reject the invalid !tbaa.struct nodes below.
+; RUN: not llvm-as < %s 2>&1 | FileCheck %s
define void @test_overlapping_regions(ptr %a1) {
+; CHECK: Overlapping tbaa.struct regions
+; CHECK-NEXT: %ld = load i8, ptr %a1, align 1, !tbaa.struct !0
%ld = load i8, ptr %a1, align 1, !tbaa.struct !0
ret void
}
define void @test_size_not_integer(ptr %a1) {
+; CHECK: Size must be a constant integer
+; CHECK-NEXT: store i8 1, ptr %a1, align 1, !tbaa.struct !5
store i8 1, ptr %a1, align 1, !tbaa.struct !5
ret void
}
define void @test_offset_not_integer(ptr %a1, ptr %a2) {
+; CHECK: Offset must be a constant integer
+; CHECK-NEXT: tail call void @llvm.memcpy.p0.p0.i64(ptr align 8 %a1, ptr align 8 %a2, i64 16, i1 false), !tbaa.struct !6
tail call void @llvm.memcpy.p0.p0.i64(ptr align 8 %a1, ptr align 8 %a2, i64 16, i1 false), !tbaa.struct !6
ret void
}
define void @test_tbaa_missing(ptr %a1, ptr %a2) {
+; CHECK: TBAA tag missing
+; CHECK-NEXT: tail call void @llvm.memcpy.p0.p0.i64(ptr align 8 %a1, ptr align 8 %a2, i64 16, i1 false), !tbaa.struct !7
tail call void @llvm.memcpy.p0.p0.i64(ptr align 8 %a1, ptr align 8 %a2, i64 16, i1 false), !tbaa.struct !7
ret void
}
define void @test_tbaa_invalid(ptr %a1) {
+; CHECK: Old-style TBAA is no longer allowed, use struct-path TBAA instead
+; CHECK-NEXT: store i8 1, ptr %a1, align 1, !tbaa.struct !8
store i8 1, ptr %a1, align 1, !tbaa.struct !8
ret void
}
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
6f07803
to
a490f1d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
This is breaking the openmp-s390x-linux build bot. I was able to reduce the problem to the following test case:
Building this with
The first offset of 2 seems wrong to me - not sure where this came from. This seems to be an actual bug uncovered by this test? |
I believe this patch is still causing failures on the PPC64BE bots in the recent builds: https://lab.llvm.org/buildbot/#/builders/93/builds/19428/steps/10/logs/stdio The errors are similar to what @uweigand reported. |
Hmm, if it fails on big-endian platforms, maybe that indicates an endian bug in (bit)field offset computation somewhere? |
This reverts commit df75183. Revert for now as this appears to cause failures on some buildbots, e.g.: https://lab.llvm.org/buildbot/#/builders/93/builds/19428/steps/10/logs/stdio
Thanks for the report, reverted for now |
This reverts commit b9cd48f. ------------------------------------------------------------- Original commit message: Adds logic to the IR verifier that checks whether !tbaa.struct nodes are well-formed. That is, it checks that the operands of !tbaa.struct nodes are in groups of three, that each group of three operands consists of two integers and a valid tbaa node, and that the regions described by the offset and size operands are non-overlapping. PR: #86709
This reverts commit b9cd48f. ------------------------------------------------------------- Original commit message: Adds logic to the IR verifier that checks whether !tbaa.struct nodes are well-formed. That is, it checks that the operands of !tbaa.struct nodes are in groups of three, that each group of three operands consists of two integers and a valid tbaa node, and that the regions described by the offset and size operands are non-overlapping. PR: llvm#86709
This reverts commit b9cd48f. ------------------------------------------------------------- Original commit message: Adds logic to the IR verifier that checks whether !tbaa.struct nodes are well-formed. That is, it checks that the operands of !tbaa.struct nodes are in groups of three, that each group of three operands consists of two integers and a valid tbaa node, and that the regions described by the offset and size operands are non-overlapping. PR: llvm#86709
Looks like it fails clang/llvm compiler build with ThinLTO with openmp
|
Thanks for the report, do you have the full clang invocation? If possible can you check if the IR coming out of clang is already invalid? |
Looks like it happens at linking stage, not at compile time:
|
hmm, struggling to reproduce this one I'm afraid, any chance you could share the bitcode input for ThinLTO (--save-temps), and maybe the full cmake invocation |
Here is the config of clang/LLVM building:
|
Can confirm that this also breaks on our end on a multitude of cases (all of them involving LTO). |
Ok thanks, still trying to get this to happen on my end, if you have any more info on how to reproduce that'd be great |
I can sadly only reproduce this on a very large codebase that mostly consists of internal source code that I'm not allowed to share. @alexey-bataev do you have anything you can share that allows reproducing this issue? |
I shared already, for me it crashes on on clang/LLVM (+ openmp runtime) build with ThinLTO, shared the options how to build it |
Note: In my case some of the broken memcpy intrinsics seem to stem from IR produced from libc++'s basic_string files. |
This change is causing our builds of LLVM to fail in Fedora. Here is how you can reproduce:
|
Given the number of issues reported with this change, can we revert it while it is investigated? |
Sure, I've finally managed to reproduce it, but will probably need a bit of time to figure out what's going wrong |
Will you open a PR for this? If you need an approve/merge, let me know |
This reverts commit 7dbba39. Revert as there are reports this triggers during ThinLTO in some configurations.
Reverted again for now c8e5ad4 |
Adds logic to the IR verifier that checks whether !tbaa.struct nodes are well-formed. That is, it checks that the operands of !tbaa.struct nodes are in groups of three, that each group of three operands consists of two integers and a valid tbaa node, and that the regions described by the offset and size operands are non-overlapping.