Skip to content

[UBSan] Support src:*=sanitize for multiple ignorelists. #141640

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions clang/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1009,6 +1009,7 @@ Sanitizers
----------

- ``-fsanitize=vptr`` is no longer a part of ``-fsanitize=undefined``.
- Sanitizer ignorelists now support the syntax ``src:*=sanitize``.

Python Binding Changes
----------------------
Expand Down
26 changes: 21 additions & 5 deletions clang/docs/SanitizerSpecialCaseList.rst
Original file line number Diff line number Diff line change
Expand Up @@ -58,7 +58,7 @@ Usage with UndefinedBehaviorSanitizer
ability to adjust instrumentation based on type.

By default, supported sanitizers will have their instrumentation disabled for
types specified within an ignorelist.
entries specified within an ignorelist.

.. code-block:: bash

Expand All @@ -77,10 +77,9 @@ For example, supplying the above ``ignorelist.txt`` to
``-fsanitize-ignorelist=ignorelist.txt`` disables overflow sanitizer
instrumentation for arithmetic operations containing values of type ``int``.

The ``=sanitize`` category is also supported. Any types assigned to the
``sanitize`` category will have their sanitizer instrumentation remain. If the
same type appears within or across ignorelists with different categories the
``sanitize`` category takes precedence -- regardless of order.
The ``=sanitize`` category is also supported. Any ``=sanitize`` category
entries enable sanitizer instrumentation, even if it was ignored by entries
before.

With this, one may disable instrumentation for some or all types and
specifically allow instrumentation for one or many types -- including types
Expand All @@ -103,6 +102,23 @@ supported sanitizers.
char c = toobig; // also not instrumented
}

If multiple entries match the source, than the latest entry takes the
precedence.

.. code-block:: bash

$ cat ignorelist1.txt
# test.cc will be instrumented.
src:*
src:*/mylib/*=sanitize
src:*/mylib/test.cc

$ cat ignorelist2.txt
# test.cc will not be instrumented.
src:*
src:*/mylib/test.cc
src:*/mylib/*=sanitize

Format
======

Expand Down
13 changes: 11 additions & 2 deletions clang/include/clang/Basic/SanitizerSpecialCaseList.h
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/SpecialCaseList.h"
#include <memory>
#include <utility>
#include <vector>

namespace llvm {
Expand All @@ -43,16 +44,24 @@ class SanitizerSpecialCaseList : public llvm::SpecialCaseList {
bool inSection(SanitizerMask Mask, StringRef Prefix, StringRef Query,
StringRef Category = StringRef()) const;

// Query ignorelisted entries if any bit in Mask matches the entry's section.
// Return (0,0 if not found. If found, return the file index number and the
// line number (FileIdx, LineNo) (all start with 1).
std::pair<unsigned, unsigned>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this is almost an anti-pattern, using std::pair like this. We have two opaque values that we now have to go back to the function to understand what they mean.

If you had a type (class) with two members (FileIdx and LineNo) then it would make sense at each use as opposed to first and second which is not informative.

I actually don't see the values explicitly used but I am assuming they will be in a later change.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this is almost an anti-pattern

LLVM code is full of them, and here they should not be exposed beyond NoSanitizeList, so I ignore on review.
But I will not argue if you are asking to improve.

It's used for comparison "return NoSan > San" in NoSanitizeList::containsFile.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't insist, but if we won't actually (maybe I am misunderstanding) use the value stored in the pair then a flag makes way more sense. Otherwise, having to use first and second is just prone to error.

I do get that this is local and that we (llvm) is not free of anti-patterns but we should strive to do better in new code. One because future new users might use this to learn and outside developers (even LLMs) will learn this is a good pattern and propagate its use.

inSectionBlame(SanitizerMask Mask, StringRef Prefix, StringRef Query,
StringRef Category = StringRef()) const;

protected:
// Initialize SanitizerSections.
void createSanitizerSections();

struct SanitizerSection {
SanitizerSection(SanitizerMask SM, SectionEntries &E)
: Mask(SM), Entries(E){};
SanitizerSection(SanitizerMask SM, SectionEntries &E, unsigned idx)
: Mask(SM), Entries(E), FileIdx(idx) {};

SanitizerMask Mask;
SectionEntries &Entries;
unsigned FileIdx;
};

std::vector<SanitizerSection> SanitizerSections;
Expand Down
15 changes: 14 additions & 1 deletion clang/lib/Basic/NoSanitizeList.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,20 @@ bool NoSanitizeList::containsFunction(SanitizerMask Mask,

bool NoSanitizeList::containsFile(SanitizerMask Mask, StringRef FileName,
StringRef Category) const {
return SSCL->inSection(Mask, "src", FileName, Category);
auto [NoSanFileIdx, NoSanLine] =
SSCL->inSectionBlame(Mask, "src", FileName, Category);
if (NoSanLine == 0)
return false;
auto [SanFileIdx, SanLine] =
SSCL->inSectionBlame(Mask, "src", FileName, "sanitize");
// If we have two cases such as `src:a.cpp=sanitize` and `src:a.cpp`, the
// current entry override the previous entry.
if (SanLine > 0)
// std::pair uses lexicographic comparison. It will compare the file index
// first and then comapre the line number.
return std::make_pair(NoSanFileIdx, NoSanLine) >
std::make_pair(SanFileIdx, SanLine);
return true;
}

bool NoSanitizeList::containsMainFile(SanitizerMask Mask, StringRef FileName,
Expand Down
24 changes: 18 additions & 6 deletions clang/lib/Basic/SanitizerSpecialCaseList.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -49,17 +49,29 @@ void SanitizerSpecialCaseList::createSanitizerSections() {
#undef SANITIZER
#undef SANITIZER_GROUP

SanitizerSections.emplace_back(Mask, S.Entries);
SanitizerSections.emplace_back(Mask, S.Entries, S.FileIdx);
}
}

bool SanitizerSpecialCaseList::inSection(SanitizerMask Mask, StringRef Prefix,
StringRef Query,
StringRef Category) const {
for (auto &S : SanitizerSections)
if ((S.Mask & Mask) &&
SpecialCaseList::inSectionBlame(S.Entries, Prefix, Query, Category))
return true;
auto [FileIdx, LineNo] = inSectionBlame(Mask, Prefix, Query, Category);
return FileIdx;
}

return false;
std::pair<unsigned, unsigned>
SanitizerSpecialCaseList::inSectionBlame(SanitizerMask Mask, StringRef Prefix,
StringRef Query,
StringRef Category) const {
for (auto it = SanitizerSections.crbegin(); it != SanitizerSections.crend();
++it) {
if (it->Mask & Mask) {
unsigned lineNum =
SpecialCaseList::inSectionBlame(it->Entries, Prefix, Query, Category);
if (lineNum > 0)
return {it->FileIdx, lineNum};
}
}
return {0, 0};
}
45 changes: 41 additions & 4 deletions clang/test/CodeGen/ubsan-src-ignorelist-category.test
Original file line number Diff line number Diff line change
Expand Up @@ -3,13 +3,17 @@

// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist -emit-llvm %t/test2.c -o - | FileCheck %s --check-prefixes=CHECK2

// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,IGNORE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,SANITIZE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict1 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,IGNORE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict2 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,IGNORE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict2 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,SANITIZE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict3 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,IGNORE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict4 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,IGNORE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict4 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,SANITIZE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict5 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,IGNORE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict6 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,IGNORE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict6 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,SANITIZE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict7 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,IGNORE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict8 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,SANITIZE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist -fsanitize-ignorelist=%t/src.ignorelist.contradict9 -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,IGNORE
// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist.contradict9 -fsanitize-ignorelist=%t/src.ignorelist -emit-llvm %t/test1.c -o - | FileCheck %s --check-prefixes=CHECK1,SANITIZE


// Verify ubsan only emits checks for files in the allowlist
Expand Down Expand Up @@ -52,6 +56,39 @@ src:*/tes*1.c=sanitize
src:*/te*t1.c
src:*/t*st1.c=sanitize

//--- src.ignorelist.contradict7
[{unsigned-integer-overflow,signed-integer-overflow}]
src:*
src:*/tes*1.c=sanitize
src:*/te*t1.c
src:*/t*st1.c=sanitize
[{unsigned-integer-overflow,signed-integer-overflow}]
src:*
src:*/te*t1.c
src:*/tes*1.c=sanitize
src:*/test1.c

//--- src.ignorelist.contradict8
[{unsigned-integer-overflow,signed-integer-overflow}]
src:*
src:*/te*t1.c
src:*/tes*1.c=sanitize
src:*/test1.c
[{unsigned-integer-overflow,signed-integer-overflow}]
src:*
src:*/tes*1.c=sanitize
src:*/te*t1.c
src:*/t*st1.c=sanitize

//--- src.ignorelist.contradict9
src:*
src:*/test1.c=sanitize
src:*/test1.c
src:*/test1.c=sanitize
src:*/te*t1.c
src:*/test*.c


//--- test1.c
// CHECK1-LABEL: define dso_local i32 @add
int add(int a, int b) {
Expand Down
15 changes: 9 additions & 6 deletions llvm/include/llvm/Support/SpecialCaseList.h
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
#include "llvm/Support/Regex.h"
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace llvm {
Expand Down Expand Up @@ -93,17 +94,17 @@ class SpecialCaseList {
LLVM_ABI bool inSection(StringRef Section, StringRef Prefix, StringRef Query,
StringRef Category = StringRef()) const;

/// Returns the line number corresponding to the special case list entry if
/// the special case list contains a line
/// Returns the file index and the line numebr <FileIdx, LineNo> corresponding
/// to the special case list entry if the special case list contains a line
/// \code
/// @Prefix:<E>=@Category
/// \endcode
/// where @Query satisfies the glob <E> in a given @Section.
/// Returns zero if there is no exclusion entry corresponding to this
/// Returns (zero, zero) if there is no exclusion entry corresponding to this
/// expression.
LLVM_ABI unsigned inSectionBlame(StringRef Section, StringRef Prefix,
StringRef Query,
StringRef Category = StringRef()) const;
LLVM_ABI std::pair<unsigned, unsigned>
inSectionBlame(StringRef Section, StringRef Prefix, StringRef Query,
StringRef Category = StringRef()) const;

protected:
// Implementations of the create*() functions that can also be used by derived
Expand Down Expand Up @@ -145,12 +146,14 @@ class SpecialCaseList {
Section(std::unique_ptr<Matcher> M) : SectionMatcher(std::move(M)) {};
Section() : Section(std::make_unique<Matcher>()) {};

unsigned FileIdx;
std::unique_ptr<Matcher> SectionMatcher;
SectionEntries Entries;
std::string SectionStr;
};

std::vector<Section> Sections;
unsigned currFileIdx;

LLVM_ABI Expected<Section *> addSection(StringRef SectionStr, unsigned LineNo,
bool UseGlobs = true);
Expand Down
34 changes: 19 additions & 15 deletions llvm/lib/Support/SpecialCaseList.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -66,12 +66,12 @@ Error SpecialCaseList::Matcher::insert(StringRef Pattern, unsigned LineNumber,
}

unsigned SpecialCaseList::Matcher::match(StringRef Query) const {
for (const auto &Glob : Globs)
if (Glob->Pattern.match(Query))
return Glob->LineNo;
for (const auto &[Regex, LineNumber] : RegExes)
if (Regex->match(Query))
return LineNumber;
for (auto it = Globs.crbegin(); it != Globs.crend(); ++it)
if ((*it)->Pattern.match(Query))
return (*it)->LineNo;
for (auto it = RegExes.crbegin(); it != RegExes.crend(); ++it)
if (it->first->match(Query))
return it->second;
return 0;
}

Expand Down Expand Up @@ -112,6 +112,7 @@ bool SpecialCaseList::createInternal(const std::vector<std::string> &Paths,
return false;
}
std::string ParseError;
++currFileIdx;
if (!parse(FileOrErr.get().get(), ParseError)) {
Error = (Twine("error parsing file '") + Path + "': " + ParseError).str();
return false;
Expand All @@ -122,6 +123,7 @@ bool SpecialCaseList::createInternal(const std::vector<std::string> &Paths,

bool SpecialCaseList::createInternal(const MemoryBuffer *MB,
std::string &Error) {
++currFileIdx;
if (!parse(MB, Error))
return false;
return true;
Expand All @@ -133,6 +135,7 @@ SpecialCaseList::addSection(StringRef SectionStr, unsigned LineNo,
Sections.emplace_back();
auto &Section = Sections.back();
Section.SectionStr = SectionStr;
Section.FileIdx = currFileIdx;

if (auto Err = Section.SectionMatcher->insert(SectionStr, LineNo, UseGlobs)) {
return createStringError(errc::invalid_argument,
Expand Down Expand Up @@ -207,20 +210,21 @@ SpecialCaseList::~SpecialCaseList() = default;

bool SpecialCaseList::inSection(StringRef Section, StringRef Prefix,
StringRef Query, StringRef Category) const {
return inSectionBlame(Section, Prefix, Query, Category);
auto [FileIdx, LineNo] = inSectionBlame(Section, Prefix, Query, Category);
return LineNo;
}

unsigned SpecialCaseList::inSectionBlame(StringRef Section, StringRef Prefix,
StringRef Query,
StringRef Category) const {
for (const auto &S : Sections) {
if (S.SectionMatcher->match(Section)) {
unsigned Blame = inSectionBlame(S.Entries, Prefix, Query, Category);
std::pair<unsigned, unsigned>
SpecialCaseList::inSectionBlame(StringRef Section, StringRef Prefix,
StringRef Query, StringRef Category) const {
for (auto it = Sections.crbegin(); it != Sections.crend(); ++it) {
if (it->SectionMatcher->match(Section)) {
unsigned Blame = inSectionBlame(it->Entries, Prefix, Query, Category);
if (Blame)
return Blame;
return {it->FileIdx, Blame};
}
}
return 0;
return {0, 0};
}

unsigned SpecialCaseList::inSectionBlame(const SectionEntries &Entries,
Expand Down
15 changes: 9 additions & 6 deletions llvm/tools/llvm-cfi-verify/llvm-cfi-verify.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -78,8 +78,7 @@ static void printBlameContext(const DILineInfo &LineInfo, unsigned Context) {
File->getBuffer().split(Lines, '\n');

for (unsigned i = std::max<size_t>(1, LineInfo.Line - Context);
i <
std::min<size_t>(Lines.size() + 1, LineInfo.Line + Context + 1);
i < std::min<size_t>(Lines.size() + 1, LineInfo.Line + Context + 1);
++i) {
if (i == LineInfo.Line)
outs() << ">";
Expand Down Expand Up @@ -193,12 +192,16 @@ printIndirectCFInstructions(FileAnalysis &Analysis,

unsigned BlameLine = 0;
for (auto &K : {"cfi-icall", "cfi-vcall"}) {
if (!BlameLine)
BlameLine =
if (!BlameLine) {
auto [FileIdx, Line] =
SpecialCaseList->inSectionBlame(K, "src", LineInfo.FileName);
if (!BlameLine)
BlameLine =
BlameLine = Line;
}
if (!BlameLine) {
auto [FileIdx, Line] =
SpecialCaseList->inSectionBlame(K, "fun", LineInfo.FunctionName);
BlameLine = Line;
}
}

if (BlameLine) {
Expand Down
Loading