Skip to content

Commit

Permalink
[analyzer][ctu] Fix wrong 'multiple definitions' errors caused by spa…
Browse files Browse the repository at this point in the history
…ce characters in lookup names when parsing the ctu index file

This error was found when analyzing MySQL with CTU enabled.

When there are space characters in the lookup name, the current
delimiter searching strategy will make the file path wrongly parsed.
And when two lookup names have the same prefix before their first space
characters, a 'multiple definitions' error will be wrongly reported.

e.g. The lookup names for the two lambda exprs in the test case are
`c:@s@G@F@G#@sa@F@operator int (*)(char)#1` and
`c:@s@G@F@G#@sa@F@operator bool (*)(char)#1` respectively. And their
prefixes are both `c:@s@G@F@G#@sa@F@operator` when using the first space
character as the delimiter.

Solving the problem by adding a length for the lookup name, making the
index items in the format of `USR-Length:USR File-Path`.

Reviewed By: steakhal

Differential Revision: https://reviews.llvm.org/D102669
  • Loading branch information
Snape3058 authored and Balazs Benics committed Dec 16, 2021
1 parent 3a1eb1c commit 333d66b
Show file tree
Hide file tree
Showing 12 changed files with 154 additions and 75 deletions.
8 changes: 4 additions & 4 deletions clang/docs/analyzer/user-docs/CrossTranslationUnit.rst
Expand Up @@ -81,12 +81,12 @@ of `foo.cpp`:
$
The next step is to create a CTU index file which holds the `USR` name and location of external definitions in the
source files:
source files in format `<USR-Length>:<USR> <File-Path>`:

.. code-block:: bash
$ clang-extdef-mapping -p . foo.cpp
c:@F@foo# /path/to/your/project/foo.cpp
9:c:@F@foo# /path/to/your/project/foo.cpp
$ clang-extdef-mapping -p . foo.cpp > externalDefMap.txt
We have to modify `externalDefMap.txt` to contain the name of the `.ast` files instead of the source files:
Expand Down Expand Up @@ -278,12 +278,12 @@ The `invocation list`:
We'd like to analyze `main.cpp` and discover the division by zero bug.
As we are using On-demand mode, we only need to create a CTU index file which holds the `USR` name and location of
external definitions in the source files:
external definitions in the source files in format `<USR-Length>:<USR> <File-Path>`:

.. code-block:: bash
$ clang-extdef-mapping -p . foo.cpp
c:@F@foo# /path/to/your/project/foo.cpp
9:c:@F@foo# /path/to/your/project/foo.cpp
$ clang-extdef-mapping -p . foo.cpp > externalDefMap.txt
Now everything is available for the CTU analysis.
Expand Down
4 changes: 2 additions & 2 deletions clang/include/clang/Basic/DiagnosticCrossTUKinds.td
Expand Up @@ -12,8 +12,8 @@ def err_ctu_error_opening : Error<
"error opening '%0': required by the CrossTU functionality">;

def err_extdefmap_parsing : Error<
"error parsing index file: '%0' line: %1 'UniqueID filename' format "
"expected">;
"error parsing index file: '%0' line: %1 '<USR-Length>:<USR> <File-Path>' "
"format expected">;

def err_multiple_def_index : Error<
"multiple definitions are found for the same key in index ">;
Expand Down
65 changes: 47 additions & 18 deletions clang/lib/CrossTU/CrossTranslationUnit.cpp
Expand Up @@ -149,6 +149,35 @@ std::error_code IndexError::convertToErrorCode() const {
return std::error_code(static_cast<int>(Code), *Category);
}

/// Parse one line of the input CTU index file.
///
/// @param[in] LineRef The input CTU index item in format
/// "<USR-Length>:<USR> <File-Path>".
/// @param[out] LookupName The lookup name in format "<USR-Length>:<USR>".
/// @param[out] FilePath The file path "<File-Path>".
static bool parseCrossTUIndexItem(StringRef LineRef, StringRef &LookupName,
StringRef &FilePath) {
// `LineRef` is "<USR-Length>:<USR> <File-Path>" now.

size_t USRLength = 0;
if (LineRef.consumeInteger(10, USRLength))
return false;
assert(USRLength && "USRLength should be greater than zero.");

if (!LineRef.consume_front(":"))
return false;

// `LineRef` is now just "<USR> <File-Path>".

// Check LookupName length out of bound and incorrect delimiter.
if (USRLength >= LineRef.size() || ' ' != LineRef[USRLength])
return false;

LookupName = LineRef.substr(0, USRLength);
FilePath = LineRef.substr(USRLength + 1);
return true;
}

llvm::Expected<llvm::StringMap<std::string>>
parseCrossTUIndex(StringRef IndexPath) {
std::ifstream ExternalMapFile{std::string(IndexPath)};
Expand All @@ -160,24 +189,23 @@ parseCrossTUIndex(StringRef IndexPath) {
std::string Line;
unsigned LineNo = 1;
while (std::getline(ExternalMapFile, Line)) {
StringRef LineRef{Line};
const size_t Delimiter = LineRef.find(' ');
if (Delimiter > 0 && Delimiter != std::string::npos) {
StringRef LookupName = LineRef.substr(0, Delimiter);

// Store paths with posix-style directory separator.
SmallString<32> FilePath(LineRef.substr(Delimiter + 1));
llvm::sys::path::native(FilePath, llvm::sys::path::Style::posix);

bool InsertionOccured;
std::tie(std::ignore, InsertionOccured) =
Result.try_emplace(LookupName, FilePath.begin(), FilePath.end());
if (!InsertionOccured)
return llvm::make_error<IndexError>(
index_error_code::multiple_definitions, IndexPath.str(), LineNo);
} else
// Split lookup name and file path
StringRef LookupName, FilePathInIndex;
if (!parseCrossTUIndexItem(Line, LookupName, FilePathInIndex))
return llvm::make_error<IndexError>(
index_error_code::invalid_index_format, IndexPath.str(), LineNo);

// Store paths with posix-style directory separator.
SmallString<32> FilePath(FilePathInIndex);
llvm::sys::path::native(FilePath, llvm::sys::path::Style::posix);

bool InsertionOccured;
std::tie(std::ignore, InsertionOccured) =
Result.try_emplace(LookupName, FilePath.begin(), FilePath.end());
if (!InsertionOccured)
return llvm::make_error<IndexError>(
index_error_code::multiple_definitions, IndexPath.str(), LineNo);

++LineNo;
}
return Result;
Expand All @@ -187,7 +215,8 @@ std::string
createCrossTUIndexString(const llvm::StringMap<std::string> &Index) {
std::ostringstream Result;
for (const auto &E : Index)
Result << E.getKey().str() << " " << E.getValue() << '\n';
Result << E.getKey().size() << ':' << E.getKey().str() << ' '
<< E.getValue() << '\n';
return Result.str();
}

Expand Down Expand Up @@ -428,7 +457,7 @@ CrossTranslationUnitContext::ASTUnitStorage::getASTUnitForFunction(
ensureCTUIndexLoaded(CrossTUDir, IndexName))
return std::move(IndexLoadError);

// Check if there is and entry in the index for the function.
// Check if there is an entry in the index for the function.
if (!NameFileMap.count(FunctionName)) {
++NumNotInOtherTU;
return llvm::make_error<IndexError>(index_error_code::missing_definition);
Expand Down
@@ -1 +1 @@
c:@F@testStaticImplicit ctu-import.c.ast
23:c:@F@testStaticImplicit ctu-import.c.ast
17 changes: 17 additions & 0 deletions clang/test/Analysis/Inputs/ctu-lookup-name-with-space.cpp
@@ -0,0 +1,17 @@
void f(void (*)());
void f(void (*)(int));

struct G {
G() {
// multiple definitions are found for the same key in index
f([]() -> void {}); // USR: c:@S@G@F@G#@Sa@F@operator void (*)()#1
f([](int) -> void {}); // USR: c:@S@G@F@G#@Sa@F@operator void (*)(int)#1

// As both lambda exprs have the same prefix, if the CTU index parser uses
// the first space character as the delimiter between USR and file path, a
// "multiple definitions are found for the same key in index" error will
// be reported.
}
};

void importee() {}
@@ -1,7 +1,7 @@
c:@F@inlineAsm ctu-other.c.ast
c:@F@g ctu-other.c.ast
c:@F@f ctu-other.c.ast
c:@F@enumCheck ctu-other.c.ast
c:@F@identImplicit ctu-other.c.ast
c:@F@structInProto ctu-other.c.ast
c:@F@switchWithoutCases ctu-other.c.ast
14:c:@F@inlineAsm ctu-other.c.ast
6:c:@F@g ctu-other.c.ast
6:c:@F@f ctu-other.c.ast
14:c:@F@enumCheck ctu-other.c.ast
18:c:@F@identImplicit ctu-other.c.ast
18:c:@F@structInProto ctu-other.c.ast
23:c:@F@switchWithoutCases ctu-other.c.ast
@@ -1,30 +1,30 @@
c:@N@chns@F@chf1#I# ctu-other.cpp.ast
c:@N@myns@N@embed_ns@F@fens#I# ctu-other.cpp.ast
c:@F@g#I# ctu-other.cpp.ast
c:@S@mycls@F@fscl#I#S ctu-other.cpp.ast
c:@S@mycls@F@fcl#I# ctu-other.cpp.ast
c:@S@mycls@F@fvcl#I# ctu-other.cpp.ast
c:@N@myns@S@embed_cls@F@fecl#I# ctu-other.cpp.ast
c:@S@mycls@S@embed_cls2@F@fecl2#I# ctu-other.cpp.ast
c:@S@derived@F@fvcl#I# ctu-other.cpp.ast
c:@F@f#I# ctu-other.cpp.ast
c:@N@myns@F@fns#I# ctu-other.cpp.ast
c:@F@h#I# ctu-other.cpp.ast
c:@F@h_chain#I# ctu-chain.cpp.ast
c:@N@chns@S@chcls@F@chf4#I# ctu-chain.cpp.ast
c:@N@chns@F@chf2#I# ctu-chain.cpp.ast
c:@F@fun_using_anon_struct#I# ctu-other.cpp.ast
c:@F@other_macro_diag#I# ctu-other.cpp.ast
c:@extInt ctu-other.cpp.ast
c:@N@intns@extInt ctu-other.cpp.ast
c:@extS ctu-other.cpp.ast
c:@S@A@a ctu-other.cpp.ast
c:@extSC ctu-other.cpp.ast
c:@S@ST@sc ctu-other.cpp.ast
c:@extSCN ctu-other.cpp.ast
c:@extSubSCN ctu-other.cpp.ast
c:@extSCC ctu-other.cpp.ast
c:@extU ctu-other.cpp.ast
c:@S@TestAnonUnionUSR@Test ctu-other.cpp.ast
c:@F@testImportOfIncompleteDefaultParmDuringImport#I# ctu-other.cpp.ast
c:@F@testImportOfDelegateConstructor#I# ctu-other.cpp.ast
19:c:@N@chns@F@chf1#I# ctu-other.cpp.ast
30:c:@N@myns@N@embed_ns@F@fens#I# ctu-other.cpp.ast
9:c:@F@g#I# ctu-other.cpp.ast
21:c:@S@mycls@F@fscl#I#S ctu-other.cpp.ast
19:c:@S@mycls@F@fcl#I# ctu-other.cpp.ast
20:c:@S@mycls@F@fvcl#I# ctu-other.cpp.ast
31:c:@N@myns@S@embed_cls@F@fecl#I# ctu-other.cpp.ast
34:c:@S@mycls@S@embed_cls2@F@fecl2#I# ctu-other.cpp.ast
22:c:@S@derived@F@fvcl#I# ctu-other.cpp.ast
9:c:@F@f#I# ctu-other.cpp.ast
18:c:@N@myns@F@fns#I# ctu-other.cpp.ast
9:c:@F@h#I# ctu-other.cpp.ast
15:c:@F@h_chain#I# ctu-chain.cpp.ast
27:c:@N@chns@S@chcls@F@chf4#I# ctu-chain.cpp.ast
19:c:@N@chns@F@chf2#I# ctu-chain.cpp.ast
29:c:@F@fun_using_anon_struct#I# ctu-other.cpp.ast
24:c:@F@other_macro_diag#I# ctu-other.cpp.ast
9:c:@extInt ctu-other.cpp.ast
17:c:@N@intns@extInt ctu-other.cpp.ast
7:c:@extS ctu-other.cpp.ast
8:c:@S@A@a ctu-other.cpp.ast
8:c:@extSC ctu-other.cpp.ast
10:c:@S@ST@sc ctu-other.cpp.ast
9:c:@extSCN ctu-other.cpp.ast
12:c:@extSubSCN ctu-other.cpp.ast
9:c:@extSCC ctu-other.cpp.ast
7:c:@extU ctu-other.cpp.ast
26:c:@S@TestAnonUnionUSR@Test ctu-other.cpp.ast
53:c:@F@testImportOfIncompleteDefaultParmDuringImport#I# ctu-other.cpp.ast
39:c:@F@testImportOfDelegateConstructor#I# ctu-other.cpp.ast
@@ -1,4 +1,4 @@
c:@F@F1 plist-macros-ctu.c.ast
c:@F@F2 plist-macros-ctu.c.ast
c:@F@F3 plist-macros-ctu.c.ast
c:@F@F_H plist-macros-ctu.c.ast
7:c:@F@F1 plist-macros-ctu.c.ast
7:c:@F@F2 plist-macros-ctu.c.ast
7:c:@F@F3 plist-macros-ctu.c.ast
8:c:@F@F_H plist-macros-ctu.c.ast
2 changes: 1 addition & 1 deletion clang/test/Analysis/ctu-inherited-default-ctor.cpp
Expand Up @@ -4,7 +4,7 @@
// RUN: %clang_cc1 -std=c++14 -triple x86_64-pc-linux-gnu \
// RUN: -emit-pch -o %t/ctudir/ctu-inherited-default-ctor-other.cpp.ast \
// RUN: %S/Inputs/ctu-inherited-default-ctor-other.cpp
// RUN: echo "c:@N@clang@S@DeclContextLookupResult@SingleElementDummyList ctu-inherited-default-ctor-other.cpp.ast" \
// RUN: echo "59:c:@N@clang@S@DeclContextLookupResult@SingleElementDummyList ctu-inherited-default-ctor-other.cpp.ast" \
// RUN: > %t/ctudir/externalDefMap.txt
//
// RUN: %clang_analyze_cc1 -std=c++14 -triple x86_64-pc-linux-gnu \
Expand Down
21 changes: 21 additions & 0 deletions clang/test/Analysis/ctu-lookup-name-with-space.cpp
@@ -0,0 +1,21 @@
// RUN: rm -rf %t
// RUN: mkdir %t

// RUN: echo '"%/S/Inputs/ctu-lookup-name-with-space.cpp" : ["g++", "-c", "%/S/Inputs/ctu-lookup-name-with-space.cpp"]' > %t/invocations.yaml
// RUN: %clang_extdef_map %S/Inputs/ctu-lookup-name-with-space.cpp -- > %t/externalDefMap.txt

// RUN: cd %t && %clang_cc1 -fsyntax-only -analyze \
// RUN: -analyzer-checker=core,debug.ExprInspection \
// RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \
// RUN: -analyzer-config ctu-dir=. \
// RUN: -analyzer-config ctu-invocation-list=invocations.yaml \
// RUN: -verify %s

void importee();

void trigger() {
// Call an external function to trigger the parsing process of CTU index.
// Refer to file Inputs/ctu-lookup-name-with-space.cpp for more details.

importee(); // expected-no-diagnostics
}
26 changes: 19 additions & 7 deletions clang/test/Analysis/func-mapping-test.cpp
Expand Up @@ -3,10 +3,10 @@
int f(int) {
return 0;
}
// CHECK-DAG: c:@F@f#I#
// CHECK-DAG: 9:c:@F@f#I#

extern const int x = 5;
// CHECK-DAG: c:@x
// CHECK-DAG: 4:c:@x

// Non-const variables should not be collected.
int y = 5;
Expand All @@ -18,33 +18,45 @@ struct S {
int a;
};
extern S const s = {.a = 2};
// CHECK-DAG: c:@s
// CHECK-DAG: 4:c:@s

struct SF {
const int a;
};
SF sf = {.a = 2};
// CHECK-DAG: c:@sf
// CHECK-DAG: 5:c:@sf

struct SStatic {
static const int a = 4;
};
const int SStatic::a;
// CHECK-DAG: c:@S@SStatic@a
// CHECK-DAG: 14:c:@S@SStatic@a

extern int const arr[5] = { 0, 1 };
// CHECK-DAG: c:@arr
// CHECK-DAG: 6:c:@arr

union U {
const int a;
const unsigned int b;
};
U u = {.a = 6};
// CHECK-DAG: c:@u
// CHECK-DAG: 4:c:@u

// No USR can be generated for this.
// Check for no crash in this case.
static union {
float uf;
const int ui;
};

void f(int (*)(char));
void f(bool (*)(char));

struct G {
G() {
f([](char) -> int { return 42; });
// CHECK-DAG: 41:c:@S@G@F@G#@Sa@F@operator int (*)(char)#1
f([](char) -> bool { return true; });
// CHECK-DAG: 42:c:@S@G@F@G#@Sa@F@operator bool (*)(char)#1
}
};
2 changes: 1 addition & 1 deletion clang/unittests/CrossTU/CrossTranslationUnitTest.cpp
Expand Up @@ -61,7 +61,7 @@ class CTUASTConsumer : public clang::ASTConsumer {
ASSERT_FALSE(llvm::sys::fs::createTemporaryFile("index", "txt", IndexFD,
IndexFileName));
llvm::ToolOutputFile IndexFile(IndexFileName, IndexFD);
IndexFile.os() << "c:@F@f#I# " << ASTFileName << "\n";
IndexFile.os() << "9:c:@F@f#I# " << ASTFileName << "\n";
IndexFile.os().flush();
EXPECT_TRUE(llvm::sys::fs::exists(IndexFileName));

Expand Down

0 comments on commit 333d66b

Please sign in to comment.