[TySan] User-friendly (C style) pointer type names for error reports #166381
@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-compiler-rt-sanitizer

Author: None (BStott6)

Changes

Changed the TypeSanitizer instrumentation pass to parse TBAA's `pN T` (e.g. `p2 int`) pointer type names and rewrite them in a more user-familiar `T*` notation.

Full diff: https://github.com/llvm/llvm-project/pull/166381.diff

4 Files Affected:
diff --git a/clang/docs/TypeSanitizer.rst b/clang/docs/TypeSanitizer.rst
index 3c683a6c24bb4..c2f628cb231db 100644
--- a/clang/docs/TypeSanitizer.rst
+++ b/clang/docs/TypeSanitizer.rst
@@ -119,8 +119,6 @@ brief dictionary of these terms.
 * ``omnipotent char``: This is a special type which can alias with anything. Its name comes from the C/C++
   type ``char``.
-* ``type p[x]``: This signifies pointers to the type. ``x`` is the number of indirections to reach the final value.
-  As an example, a pointer to a pointer to an integer would be ``type p2 int``.
 TypeSanitizer is still experimental. User-facing error messages should be improved in the future to remove
 references to LLVM IR specific terms.
diff --git a/compiler-rt/test/tysan/print_stacktrace.c b/compiler-rt/test/tysan/print_stacktrace.c
index 3ffb6063377d9..831be5e4afed9 100644
--- a/compiler-rt/test/tysan/print_stacktrace.c
+++ b/compiler-rt/test/tysan/print_stacktrace.c
@@ -10,7 +10,7 @@ void zero_array() {
   for (i = 0; i < 1; ++i)
     P[i] = 0.0f;
   // CHECK: ERROR: TypeSanitizer: type-aliasing-violation
-  // CHECK: WRITE of size 4 at {{.*}} with type float accesses an existing object of type p1 float
+  // CHECK: WRITE of size 4 at {{.*}} with type float accesses an existing object of type float*
   // CHECK: {{#0 0x.* in zero_array .*print_stacktrace.c:}}[[@LINE-3]]
   // CHECK-SHORT-NOT: {{#1 0x.* in main .*print_stacktrace.c}}
   // CHECK-LONG-NEXT: {{#1 0x.* in main .*print_stacktrace.c}}
diff --git a/compiler-rt/test/tysan/ptr-float.c b/compiler-rt/test/tysan/ptr-float.c
index aaa9895986988..145d5d8f289ea 100644
--- a/compiler-rt/test/tysan/ptr-float.c
+++ b/compiler-rt/test/tysan/ptr-float.c
@@ -7,7 +7,7 @@ void zero_array() {
   for (i = 0; i < 1; ++i)
     P[i] = 0.0f;
   // CHECK: ERROR: TypeSanitizer: type-aliasing-violation
-  // CHECK: WRITE of size 4 at {{.*}} with type float accesses an existing object of type p1 float
+  // CHECK: WRITE of size 4 at {{.*}} with type float accesses an existing object of type float*
   // CHECK: {{#0 0x.* in zero_array .*ptr-float.c:}}[[@LINE-3]]
 }
diff --git a/llvm/lib/Transforms/Instrumentation/TypeSanitizer.cpp b/llvm/lib/Transforms/Instrumentation/TypeSanitizer.cpp
index 87eba5f2c5242..e5109c047584e 100644
--- a/llvm/lib/Transforms/Instrumentation/TypeSanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/TypeSanitizer.cpp
@@ -70,6 +70,12 @@ static cl::opt<bool> ClVerifyOutlinedInstrumentation(
              "function calls. This verifies that they behave the same."),
     cl::Hidden, cl::init(false));
+static cl::opt<bool> ClUseTBAATypeNames(
+    "tysan-use-tbaa-type-names",
+    cl::desc("Print TBAA-style type names for pointers rather than C-style "
+             "names (e.g. 'p2 int' rather than 'int**')"),
+    cl::Hidden, cl::init(false));
+
 STATISTIC(NumInstrumentedAccesses, "Number of instrumented accesses");
 namespace {
@@ -260,6 +266,29 @@ static std::string encodeName(StringRef Name) {
   return Output;
 }
+/// Converts pointer type names from TBAA "p2 int" style to C style ("int**").
+/// Currently leaves "omnipotent char" unchanged - not sure of a user-friendly name for this type.
+/// If the type name was changed, returns true and stores the new type name in `Dest`.
+/// Otherwise, returns false (`Dest` is unchanged).
+static bool convertTBAAStyleTypeNamesToCStyle(StringRef TypeName, std::string &Dest) {
+  if (!TypeName.consume_front("p"))
+    return false;
+
+  int Indirection;
+  if (TypeName.consumeInteger(10, Indirection))
+    return false;
+
+  if (!TypeName.consume_front(" "))
+    return false;
+
+  Dest.clear();
+  Dest.reserve(TypeName.size() + Indirection); // One * per indirection
+  Dest.append(TypeName);
+  Dest.append(Indirection, '*');
+
+  return true;
+}
+
 std::string
 TypeSanitizer::getAnonymousStructIdentifier(const MDNode *MD,
                                             TypeNameMapTy &TypeNames) {
@@ -355,7 +384,16 @@ bool TypeSanitizer::generateBaseTypeDescriptor(
   // [2, member count, [type pointer, offset]..., name]
   LLVMContext &C = MD->getContext();
-  Constant *NameData = ConstantDataArray::getString(C, NameNode->getString());
+  StringRef TypeName = NameNode->getString();
+
+  // Convert LLVM-internal TBAA-style type names to C-style type names
+  // (more user-friendly)
+  std::string CStyleTypeName;
+  if (!ClUseTBAATypeNames)
+    if (convertTBAAStyleTypeNamesToCStyle(TypeName, CStyleTypeName))
+      TypeName = CStyleTypeName;
+
+  Constant *NameData = ConstantDataArray::getString(C, TypeName);
   SmallVector<Type *> TDSubTys;
   SmallVector<Constant *> TDSubData;
* ``omnipotent char``: This is a special type which can alias with anything. Its name comes from the C/C++
  type ``char``.
* ``type p[x]``: This signifies pointers to the type. ``x`` is the number of indirections to reach the final value.
If we use the "print TBAA name" flag, this terminology is still used, so maybe we want to keep it in the docs? Adding a note before or after it saying that this is no longer the default behaviour might be better.
I'm not sure we need the flag. Given that we print the C type name for other cases, I think it would make sense to always print the pointer in C style and remove the flag.
If we are always doing C style, maybe it would then make sense to change "omnipotent char" to char as well? Either that, or change the docs to state that the name is permanent.
✅ With the latest revision this PR passed the C/C++ code formatter.
- Changed the TypeSanitizer instrumentation pass to parse TBAA's `pN T` (e.g. `p2 int`) pointer type names and rewrite them in a more user-familiar `T*` notation.
- Updated TySan docs to remove the explanation for the strange pointer type names.
- Updated TySan regression tests which refer to the pointer type formatting to match the new formatting.
- Added a command line option in `TypeSanitizer.cpp` to use the old TBAA-style type names instead (`tysan-use-tbaa-type-names`).
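
For readers who want to try the renaming rule outside of LLVM, here is a minimal standalone sketch of the same idea using only the C++17 standard library. It is not the code from this PR: the function name `tbaaPointerNameToCStyle` and the `std::optional` return type are hypothetical choices for illustration, and `std::from_chars` stands in for `StringRef::consumeInteger`. It assumes a TBAA pointer name has exactly the form `p<N> <type>`, with a decimal count and a single separating space.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical standalone analogue of the PR's conversion helper.
// Returns the C-style name when `name` looks like "p<N> <type>",
// and std::nullopt for anything else (e.g. "omnipotent char").
std::optional<std::string> tbaaPointerNameToCStyle(std::string_view name) {
  // The name must start with the 'p' marker.
  if (name.empty() || name.front() != 'p')
    return std::nullopt;
  name.remove_prefix(1);

  // Parse the decimal indirection count that follows the 'p'.
  unsigned indirection = 0;
  const char *begin = name.data();
  const char *end = begin + name.size();
  auto [ptr, ec] = std::from_chars(begin, end, indirection);
  if (ec != std::errc() || ptr == begin)
    return std::nullopt;
  name.remove_prefix(static_cast<std::size_t>(ptr - begin));

  // A single space separates the count from the pointee type name.
  if (name.empty() || name.front() != ' ')
    return std::nullopt;
  name.remove_prefix(1);

  // Emit the pointee name followed by one '*' per level of indirection.
  std::string result;
  result.reserve(name.size() + indirection);
  result.append(name);
  result.append(indirection, '*');
  return result;
}
```

With this sketch, `tbaaPointerNameToCStyle("p1 float")` yields `float*`, matching the updated `CHECK` lines in the tests, while names that do not fit the pattern (such as `omnipotent char`) come back as `std::nullopt` and would be printed unchanged by a caller.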