Skip to content

[NFCI][Sanitizer] Convert Matcher::Globs from StringMap to vector. #140964

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

qinkunbao
Copy link
Member

@qinkunbao qinkunbao commented May 21, 2025

As discussed in #139772 and #140529, Matcher::Globs can keep the order when parsing the case list.

Created using spr 1.3.6
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" llvm:support labels May 21, 2025
@llvmbot
Copy link
Member

llvmbot commented May 21, 2025

@llvm/pr-subscribers-llvm-support

@llvm/pr-subscribers-clang

Author: Qinkun Bao (qinkunbao)

Changes

As discussed in #139772, Matcher::Globs can keep the order when parsing the case list.


Full diff: https://github.com/llvm/llvm-project/pull/140964.diff

3 Files Affected:

  • (modified) clang/lib/Basic/Diagnostic.cpp (+4-1)
  • (modified) llvm/include/llvm/Support/SpecialCaseList.h (+1-1)
  • (modified) llvm/lib/Support/SpecialCaseList.cpp (+10-11)
diff --git a/clang/lib/Basic/Diagnostic.cpp b/clang/lib/Basic/Diagnostic.cpp
index b48eed8650672..8168c36d9ccce 100644
--- a/clang/lib/Basic/Diagnostic.cpp
+++ b/clang/lib/Basic/Diagnostic.cpp
@@ -553,7 +553,10 @@ void WarningsSpecialCaseList::processSections(DiagnosticsEngine &Diags) {
     // Each section has a matcher with that section's name, attached to that
     // line.
     const auto &DiagSectionMatcher = Entry.SectionMatcher;
-    unsigned DiagLine = DiagSectionMatcher->Globs.at(DiagName).second;
+    unsigned DiagLine = 0;
+    for (const auto &[Pattern, Pair] : DiagSectionMatcher->Globs)
+      if (Pattern == DiagName)
+        DiagLine = Pair.second;
     LineAndSectionEntry.emplace_back(DiagLine, &Entry);
   }
   llvm::sort(LineAndSectionEntry);
diff --git a/llvm/include/llvm/Support/SpecialCaseList.h b/llvm/include/llvm/Support/SpecialCaseList.h
index fc6dc93651f38..14d83e64ec28c 100644
--- a/llvm/include/llvm/Support/SpecialCaseList.h
+++ b/llvm/include/llvm/Support/SpecialCaseList.h
@@ -125,7 +125,7 @@ class SpecialCaseList {
     // Returns zero if no match is found.
     LLVM_ABI unsigned match(StringRef Query) const;
 
-    StringMap<std::pair<GlobPattern, unsigned>> Globs;
+    std::vector<std::pair<std::string, std::pair<GlobPattern, unsigned>>> Globs;
     std::vector<std::pair<std::unique_ptr<Regex>, unsigned>> RegExes;
   };
 
diff --git a/llvm/lib/Support/SpecialCaseList.cpp b/llvm/lib/Support/SpecialCaseList.cpp
index dddf84cbb1ced..3e43f7bc93eff 100644
--- a/llvm/lib/Support/SpecialCaseList.cpp
+++ b/llvm/lib/Support/SpecialCaseList.cpp
@@ -53,17 +53,16 @@ Error SpecialCaseList::Matcher::insert(StringRef Pattern, unsigned LineNumber,
     return Error::success();
   }
 
-  auto [It, DidEmplace] = Globs.try_emplace(Pattern);
-  if (DidEmplace) {
-    // We must be sure to use the string in the map rather than the provided
-    // reference which could be destroyed before match() is called
-    Pattern = It->getKey();
-    auto &Pair = It->getValue();
-    if (auto Err = GlobPattern::create(Pattern, /*MaxSubPatterns=*/1024)
-                       .moveInto(Pair.first))
-      return Err;
-    Pair.second = LineNumber;
-  }
+  Globs.emplace_back();
+  auto &Glob = Globs.back();
+  Glob.first = Pattern;
+  auto &Pair = Glob.second;
+  // We must be sure to use the string in the map rather than the provided
+  // reference which could be destroyed before match() is called
+  if (auto Err = GlobPattern::create(Glob.first, /*MaxSubPatterns=*/1024)
+                     .moveInto(Pair.first))
+    return Err;
+  Pair.second = LineNumber;
   return Error::success();
 }
 

@qinkunbao qinkunbao marked this pull request as draft May 21, 2025 22:20
Created using spr 1.3.6
@qinkunbao qinkunbao requested a review from vitalybuka May 22, 2025 20:46
Copy link

github-actions bot commented May 22, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

qinkunbao added 4 commits May 22, 2025 21:21
Created using spr 1.3.6
Created using spr 1.3.6
Created using spr 1.3.6
Created using spr 1.3.6
@qinkunbao
Copy link
Member Author

#141270 can solve the issue but I don't know what is the reason.

@qinkunbao
Copy link
Member Author

qinkunbao commented May 23, 2025

This PR change converts Matcher::Globs from StringMap to vector. It is supposed to be a very simple change.

However, after the change, some unit tests are broken.
For example,

// build/unittests/Support/./SupportTests --gtest_filter=SpecialCaseListTest.Basic
TEST_F(SpecialCaseListTest, Basic) {
  std::unique_ptr<SpecialCaseList> SCL =
      makeSpecialCaseList("# This is a comment.\n"
                          "\n"
                          "src:hello\n"
                          "src:bye\n"
                          "src:hi=category\n"
                          "src:z*=category\n");
  EXPECT_TRUE(SCL->inSection("", "src", "hello"));
}

Investigation revealed that bool GlobPattern::match(StringRef S) fails to match the input "hello". Logs indicate the content of Prefix within GlobPattern changes from "hello" to "hilo" at some points. GlobPattern::Prefix is a StringRef but Glob.first is still hello. We can address the issue with #141270 but I don't know why Prefix was changed.

Logs

build/unittests/Support/./SupportTests --gtest_filter=SpecialCaseListTest.Basic
Note: Google Test filter = SpecialCaseListTest.Basic
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from SpecialCaseListTest
[ RUN      ] SpecialCaseListTest.Basic

### Insert the entry

insert GlobPattern::create: *
GlobPattern::create:  Prefix:
insert GlobPattern::create: hello
GlobPattern::create:  Prefix: hello
insert GlobPattern::create: bye
GlobPattern::create:  Prefix: bye
insert GlobPattern::create: hi
GlobPattern::create:  Prefix: hi
insert GlobPattern::create: z*
GlobPattern::create:  Prefix: z
Inside match: * Line number: 1

### Call SpecialCaseList::inSectionBlame

Input Arguments. Prefix:  src Query: hello Category:
Inside match: hello Line number: 3
Prefix: hilo
consume_front: hilo
Inside match: bye Line number: 4
Prefix: bye
consume_front: bye
/usr/local/google/home/qinkun/github/sanitizer/llvm-project/llvm/unittests/Support/SpecialCaseListTest.cpp:65: Failure
Value of: SCL->inSection("", "src", "hello")
  Actual: false
Expected: true

[  FAILED  ] SpecialCaseListTest.Basic (0 ms)
[----------] 1 test from SpecialCaseListTest (0 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (0 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] SpecialCaseListTest.Basic

 1 FAILED TEST

qinkunbao added 5 commits May 23, 2025 20:55
Created using spr 1.3.6
Created using spr 1.3.6
Created using spr 1.3.6
Created using spr 1.3.6
Created using spr 1.3.6
@qinkunbao qinkunbao marked this pull request as ready for review May 23, 2025 21:16
@qinkunbao qinkunbao requested a review from vitalybuka May 23, 2025 21:17
Created using spr 1.3.6
@qinkunbao qinkunbao merged commit e9dbf31 into main May 24, 2025
10 of 11 checks passed
@qinkunbao qinkunbao deleted the users/qinkunbao/spr/nfcisanitizer-convert-matcherglobs-from-stringmap-to-vector branch May 24, 2025 00:23
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
…lvm#140964)

As discussed in llvm#139772 and
llvm#140529, Matcher::Globs can
keep the order when parsing the case list.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category llvm:support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants