Skip to content

[UBSan] Support src:*=sanitize for multiple ignorelists. #141640

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
14 changes: 9 additions & 5 deletions clang/include/clang/Basic/SanitizerSpecialCaseList.h
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/SpecialCaseList.h"
#include <memory>
#include <utility>
#include <vector>

namespace llvm {
Expand All @@ -44,20 +45,23 @@ class SanitizerSpecialCaseList : public llvm::SpecialCaseList {
StringRef Category = StringRef()) const;

// Query ignorelisted entries if any bit in Mask matches the entry's section.
// Return 0 if not found. If found, return the line number (starts with 1).
unsigned inSectionBlame(SanitizerMask Mask, StringRef Prefix, StringRef Query,
StringRef Category = StringRef()) const;
// Return (0,0 if not found. If found, return the file index number and the
// line number (FileIdx, LineNo) (all start with 1).
std::pair<unsigned, unsigned>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this is almost an anti-pattern, using std::pair like this. We have two opaque values that we now have to go back to the function to understand what they mean.

If you had a type (class) with two members (FileIdx and LineNo) then it would make sense at each use as opposed to first and second which is not informative.

I actually don't see the values explicitly used but I am assuming they will be in a later change.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this is almost an anti-pattern

LLVM code is full of them, and here they should not be exposed beyond NoSanitizeList, so I ignore on review.
But I will not argue if you are asking to improve.

It's used for comparison "return NoSan > San" in NoSanitizeList::containsFile.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't insist, but if we won't actually (maybe I am misunderstanding) use the value stored in the pair then a flag makes way more sense. Otherwise, having to use first and second is just prone to error.

I do get that this is local and that we (llvm) is not free of anti-patterns but we should strive to do better in new code. One because future new users might use this to learn and outside developers (even LLMs) will learn this is a good pattern and propagate its use.

inSectionBlame(SanitizerMask Mask, StringRef Prefix, StringRef Query,
StringRef Category = StringRef()) const;

protected:
// Initialize SanitizerSections.
void createSanitizerSections();

struct SanitizerSection {
SanitizerSection(SanitizerMask SM, SectionEntries &E)
: Mask(SM), Entries(E) {};
SanitizerSection(SanitizerMask SM, SectionEntries &E, unsigned idx)
: Mask(SM), Entries(E), FileIdx(idx) {};

SanitizerMask Mask;
SectionEntries &Entries;
unsigned FileIdx;
};

std::vector<SanitizerSection> SanitizerSections;
Expand Down
12 changes: 6 additions & 6 deletions clang/lib/Basic/NoSanitizeList.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -44,13 +44,13 @@ bool NoSanitizeList::containsFunction(SanitizerMask Mask,

bool NoSanitizeList::containsFile(SanitizerMask Mask, StringRef FileName,
StringRef Category) const {
unsigned NoSanLine = SSCL->inSectionBlame(Mask, "src", FileName, Category);
if (NoSanLine == 0)
auto NoSan = SSCL->inSectionBlame(Mask, "src", FileName, Category);
if (NoSan == NotFound)
return false;
unsigned SanLine = SSCL->inSectionBlame(Mask, "src", FileName, "sanitize");
// If we have two cases such as `src:a.cpp=sanitize` and `src:a.cpp`, the
// current entry override the previous entry.
return !SanLine || NoSanLine > SanLine;
auto San =
SSCL->inSectionBlame(Mask, "src", FileName, "sanitize");

return San == NotFound || NoSan > San;
}

bool NoSanitizeList::containsMainFile(SanitizerMask Mask, StringRef FileName,
Expand Down
24 changes: 13 additions & 11 deletions clang/lib/Basic/SanitizerSpecialCaseList.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -50,27 +50,29 @@ void SanitizerSpecialCaseList::createSanitizerSections() {
#undef SANITIZER
#undef SANITIZER_GROUP

SanitizerSections.emplace_back(Mask, S.Entries);
SanitizerSections.emplace_back(Mask, S.Entries, S.FileIdx);
}
}

bool SanitizerSpecialCaseList::inSection(SanitizerMask Mask, StringRef Prefix,
StringRef Query,
StringRef Category) const {
return inSectionBlame(Mask, Prefix, Query, Category);
auto [FileIdx, LineNo] = inSectionBlame(Mask, Prefix, Query, Category);
return FileIdx;
}

unsigned SanitizerSpecialCaseList::inSectionBlame(SanitizerMask Mask,
StringRef Prefix,
StringRef Query,
StringRef Category) const {
for (const auto &S : llvm::reverse(SanitizerSections)) {
if (S.Mask & Mask) {
std::pair<unsigned, unsigned>
SanitizerSpecialCaseList::inSectionBlame(SanitizerMask Mask, StringRef Prefix,
StringRef Query,
StringRef Category) const {
for (auto it = SanitizerSections.crbegin(); it != SanitizerSections.crend();
++it) {
if (it->Mask & Mask) {
unsigned lineNum =
SpecialCaseList::inSectionBlame(S.Entries, Prefix, Query, Category);
SpecialCaseList::inSectionBlame(it->Entries, Prefix, Query, Category);
if (lineNum > 0)
return lineNum;
return {it->FileIdx, lineNum};
}
}
return 0;
return NotFound;
}
11 changes: 11 additions & 0 deletions clang/test/CodeGen/ubsan-src-ignorelist-category.test
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,9 @@
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict6 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,SANITIZE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict7 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,IGNORE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict8 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,SANITIZE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist -fsanitize-ignorelist=%t/src.ignorelist.contradict9 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,IGNORE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict9 -fsanitize-ignorelist=%t/src.ignorelist -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,SANITIZE


// Verify ubsan only emits checks for files in the allowlist

Expand Down Expand Up @@ -77,6 +80,14 @@ src:*/tes*1.c=sanitize
src:*/te*t1.c
src:*/t*st1.c=sanitize

//--- src.ignorelist.contradict9
src:*
src:*/test1.c=sanitize
src:*/test1.c
src:*/test1.c=sanitize
src:*/te*t1.c
src:*/test*.c


//--- test1.c
// CHECK1-LABEL: define dso_local i32 @add
Expand Down
3 changes: 2 additions & 1 deletion llvm/include/llvm/Support/SpecialCaseList.h
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,7 @@ class SpecialCaseList {
LLVM_ABI bool inSection(StringRef Section, StringRef Prefix, StringRef Query,
StringRef Category = StringRef()) const;

/// Returns the file index and the line number <FileIdx, LineNo> corresponding
/// Returns the file index and the line numebr <FileIdx, LineNo> corresponding
/// to the special case list entry if the special case list contains a line
/// \code
/// @Prefix:<E>=@Category
Expand Down Expand Up @@ -154,6 +154,7 @@ class SpecialCaseList {
};

std::vector<Section> Sections;
unsigned currFileIdx;

LLVM_ABI Expected<Section *> addSection(StringRef SectionStr,
unsigned FileIdx, unsigned LineNo,
Expand Down
Loading