-
Notifications
You must be signed in to change notification settings - Fork 13.9k
[UBSan] Support src:*=sanitize for multiple ignorelists. #141640
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[UBSan] Support src:*=sanitize for multiple ignorelists. #141640
Conversation
Created using spr 1.3.6 [skip ci]
Created using spr 1.3.6
@llvm/pr-subscribers-llvm-support @llvm/pr-subscribers-clang Author: Qinkun Bao (qinkunbao) ChangesSee: #139128 for the background. Full diff: https://github.com/llvm/llvm-project/pull/141640.diff 4 Files Affected:
diff --git a/clang/include/clang/Basic/SanitizerSpecialCaseList.h b/clang/include/clang/Basic/SanitizerSpecialCaseList.h
index 25d518e7128cf..a19392399bfd7 100644
--- a/clang/include/clang/Basic/SanitizerSpecialCaseList.h
+++ b/clang/include/clang/Basic/SanitizerSpecialCaseList.h
@@ -19,6 +19,7 @@
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/SpecialCaseList.h"
#include <memory>
+#include <utility>
#include <vector>
namespace llvm {
@@ -44,20 +45,23 @@ class SanitizerSpecialCaseList : public llvm::SpecialCaseList {
StringRef Category = StringRef()) const;
// Query ignorelisted entries if any bit in Mask matches the entry's section.
- // Return 0 if not found. If found, return the line number (starts with 1).
- unsigned inSectionBlame(SanitizerMask Mask, StringRef Prefix, StringRef Query,
- StringRef Category = StringRef()) const;
+ // Return (0,0 if not found. If found, return the file index number and the
+ // line number (FileIdx, LineNo) (all start with 1).
+ std::pair<unsigned, unsigned>
+ inSectionBlame(SanitizerMask Mask, StringRef Prefix, StringRef Query,
+ StringRef Category = StringRef()) const;
protected:
// Initialize SanitizerSections.
void createSanitizerSections();
struct SanitizerSection {
- SanitizerSection(SanitizerMask SM, SectionEntries &E)
- : Mask(SM), Entries(E) {};
+ SanitizerSection(SanitizerMask SM, SectionEntries &E, unsigned idx)
+ : Mask(SM), Entries(E), FileIdx(idx) {};
SanitizerMask Mask;
SectionEntries &Entries;
+ unsigned FileIdx;
};
std::vector<SanitizerSection> SanitizerSections;
diff --git a/clang/lib/Basic/NoSanitizeList.cpp b/clang/lib/Basic/NoSanitizeList.cpp
index 4feef5c6ea052..c58a67971dfb6 100644
--- a/clang/lib/Basic/NoSanitizeList.cpp
+++ b/clang/lib/Basic/NoSanitizeList.cpp
@@ -44,14 +44,17 @@ bool NoSanitizeList::containsFunction(SanitizerMask Mask,
bool NoSanitizeList::containsFile(SanitizerMask Mask, StringRef FileName,
StringRef Category) const {
- unsigned NoSanLine = SSCL->inSectionBlame(Mask, "src", FileName, Category);
+ auto [NoSanFileIdx, NoSanLine] =
+ SSCL->inSectionBlame(Mask, "src", FileName, Category);
if (NoSanLine == 0)
return false;
- unsigned SanLine = SSCL->inSectionBlame(Mask, "src", FileName, "sanitize");
+ auto [SanFileIdx, SanLine] =
+ SSCL->inSectionBlame(Mask, "src", FileName, "sanitize");
// If we have two cases such as `src:a.cpp=sanitize` and `src:a.cpp`, the
// current entry override the previous entry.
if (SanLine > 0)
- return NoSanLine > SanLine;
+ return std::make_pair(NoSanFileIdx, NoSanLine) >
+ std::make_pair(SanFileIdx, SanLine);
return true;
}
diff --git a/clang/lib/Basic/SanitizerSpecialCaseList.cpp b/clang/lib/Basic/SanitizerSpecialCaseList.cpp
index 4508d705eb43a..0b2c0f9c4478a 100644
--- a/clang/lib/Basic/SanitizerSpecialCaseList.cpp
+++ b/clang/lib/Basic/SanitizerSpecialCaseList.cpp
@@ -49,28 +49,29 @@ void SanitizerSpecialCaseList::createSanitizerSections() {
#undef SANITIZER
#undef SANITIZER_GROUP
- SanitizerSections.emplace_back(Mask, S.Entries);
+ SanitizerSections.emplace_back(Mask, S.Entries, S.FileIdx);
}
}
bool SanitizerSpecialCaseList::inSection(SanitizerMask Mask, StringRef Prefix,
StringRef Query,
StringRef Category) const {
- return inSectionBlame(Mask, Prefix, Query, Category);
+ auto [FileIdx, LineNo] = inSectionBlame(Mask, Prefix, Query, Category);
+ return FileIdx;
}
-unsigned SanitizerSpecialCaseList::inSectionBlame(SanitizerMask Mask,
- StringRef Prefix,
- StringRef Query,
- StringRef Category) const {
+std::pair<unsigned, unsigned>
+SanitizerSpecialCaseList::inSectionBlame(SanitizerMask Mask, StringRef Prefix,
+ StringRef Query,
+ StringRef Category) const {
for (auto it = SanitizerSections.crbegin(); it != SanitizerSections.crend();
++it) {
if (it->Mask & Mask) {
unsigned lineNum =
SpecialCaseList::inSectionBlame(it->Entries, Prefix, Query, Category);
if (lineNum > 0)
- return lineNum;
+ return {it->FileIdx, lineNum};
}
}
- return 0;
+ return {0, 0};
}
diff --git a/clang/test/CodeGen/ubsan-src-ignorelist-category.test b/clang/test/CodeGen/ubsan-src-ignorelist-category.test
index 55967ec77c836..0ff2ea5c456a1 100644
--- a/clang/test/CodeGen/ubsan-src-ignorelist-category.test
+++ b/clang/test/CodeGen/ubsan-src-ignorelist-category.test
@@ -12,6 +12,9 @@
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict6 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,SANITIZE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict7 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,IGNORE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict8 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,SANITIZE
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict8 -fsanitize-ignorelist=%t/src.ignorelist.contradict9 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,IGNORE
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict9 -fsanitize-ignorelist=%t/src.ignorelist.contradict8 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,SANITIZE
+
// Verify ubsan only emits checks for files in the allowlist
@@ -77,6 +80,14 @@ src:*/tes*1.c=sanitize
src:*/te*t1.c
src:*/t*st1.c=sanitize
+//--- src.ignorelist.contradict9
+src:*
+src:*/test1.c=sanitize
+src:*/test1.c
+src:*/test1.c=sanitize
+src:*/te*t1.c
+src:*/test*.c
+
//--- test1.c
// CHECK1-LABEL: define dso_local i32 @add
|
Created using spr 1.3.6
Created using spr 1.3.6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with comments
…e-for-multiple-ignorelists
✅ With the latest revision this PR passed the C/C++ code formatter. |
Looks like conflicts resolving got wrong. I will fix them and address the comments tomorrow |
Created using spr 1.3.6
Created using spr 1.3.6
Created using spr 1.3.6
Created using spr 1.3.6
Created using spr 1.3.6
…e-for-multiple-ignorelists
// Return NotFound (0,0) if not found. If found, return the file index number | ||
// and the line number (FileIdx, LineNo) (FileIdx starts with 1 and LineNo | ||
// starts with 0). | ||
std::pair<unsigned, unsigned> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this is almost an anti-pattern, using std::pair like this. We have two opaque values that we now have to go back to the function to understand what they mean.
If you had a type (class) with two members (FileIdx
and LineNo
) then it would make sense at each use as opposed to first
and second
which is not informative.
I actually don't see the values explicitly used but I am assuming they will be in a later change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this is almost an anti-pattern
LLVM code is full of them, and here they should not be exposed beyond NoSanitizeList, so I ignore on review.
But I will not argue if you are asking to improve.
It's used for comparison "return NoSan > San" in NoSanitizeList::containsFile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I won't insist, but if we won't actually (maybe I am misunderstanding) use the value stored in the pair then a flag makes way more sense. Otherwise, having to use first
and second
is just prone to error.
I do get that this is local and that we (llvm) is not free of anti-patterns but we should strive to do better in new code. One because future new users might use this to learn and outside developers (even LLMs) will learn this is a good pattern and propagate its use.
See: llvm#139128 and llvm#140529 for the background. The introduction of these new tests (ubsan-src-ignorelist-category.test) `-fsanitize-ignorelist=%t/src.ignorelist -fsanitize-ignorelist=%t/src.ignorelist.contradict9` in this PR will not lead to failures in the previous implementation (without this PR). This is because the existing logic distinguishes between Sections in different ignorelists, even if their names are identical. The order of these Sections is preserved using a `vector`.
See: #139128 and #140529 for the background.
The introduction of these new tests (ubsan-src-ignorelist-category.test)
-fsanitize-ignorelist=%t/src.ignorelist -fsanitize-ignorelist=%t/src.ignorelist.contradict9
in this PR will not lead to failures in the previous implementation (without this PR). This is because the existing logic distinguishes between Sections in different ignorelists, even if their names are identical. The order of these Sections is preserved using avector
.