Skip to content

[LLVM][Coverage][Unittest] Fix dangling reference in unittest #147118

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kikairoya
Copy link
Contributor

In loop of writeAndReadCoverageRegions, OutputFunctions[I].Filenames references to contents of Filenames after returning from readCoverageRegions but Filenames will be cleared in next call of readCoverageRegions, causes dangling reference.
The lifetime of the contents of Filenames must be equal or longer than OutputFunctions[I], thus it has been moved into OutputFunctions[I] (typed OutputFunctionCoverageData).

In loop of `writeAndReadCoverageRegions`, `OutputFunctions[I].Filenames` references to
contents of `Filenames` after returning from `readCoverageRegions` but `Filenames` will be
cleared in next call of `readCoverageRegions`, causes dangling reference.
The lifetime of the contents of `Filenames` must be equal or longer than `OutputFunctions[I]`,
thus it has been moved into `OutputFunctions[I]` (typed `OutputFunctionCoverageData`).
@llvmbot llvmbot added the PGO Profile Guided Optimizations label Jul 4, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 4, 2025

@llvm/pr-subscribers-pgo

Author: Tomohiro Kashiwada (kikairoya)

Changes

In loop of writeAndReadCoverageRegions, OutputFunctions[I].Filenames references to contents of Filenames after returning from readCoverageRegions but Filenames will be cleared in next call of readCoverageRegions, causes dangling reference.
The lifetime of the contents of Filenames must be equal or longer than OutputFunctions[I], thus it has been moved into OutputFunctions[I] (typed OutputFunctionCoverageData).


Full diff: https://github.com/llvm/llvm-project/pull/147118.diff

1 Files Affected:

  • (modified) llvm/unittests/ProfileData/CoverageMappingTest.cpp (+8-9)
diff --git a/llvm/unittests/ProfileData/CoverageMappingTest.cpp b/llvm/unittests/ProfileData/CoverageMappingTest.cpp
index ec81e5f274efa..59c381a1b4e82 100644
--- a/llvm/unittests/ProfileData/CoverageMappingTest.cpp
+++ b/llvm/unittests/ProfileData/CoverageMappingTest.cpp
@@ -64,6 +64,7 @@ namespace {
 struct OutputFunctionCoverageData {
   StringRef Name;
   uint64_t Hash;
+  std::vector<std::string> FilenamesStorage;
   std::vector<StringRef> Filenames;
   std::vector<CounterMappingRegion> Regions;
   std::vector<CounterExpression> Expressions;
@@ -71,8 +72,10 @@ struct OutputFunctionCoverageData {
   OutputFunctionCoverageData() : Hash(0) {}
 
   OutputFunctionCoverageData(OutputFunctionCoverageData &&OFCD)
-      : Name(OFCD.Name), Hash(OFCD.Hash), Filenames(std::move(OFCD.Filenames)),
-        Regions(std::move(OFCD.Regions)) {}
+      : Name(OFCD.Name), Hash(OFCD.Hash),
+        FilenamesStorage(std::move(OFCD.FilenamesStorage)),
+        Filenames(std::move(OFCD.Filenames)), Regions(std::move(OFCD.Regions)) {
+  }
 
   OutputFunctionCoverageData(const OutputFunctionCoverageData &) = delete;
   OutputFunctionCoverageData &
@@ -135,7 +138,6 @@ struct InputFunctionCoverageData {
 struct CoverageMappingTest : ::testing::TestWithParam<std::tuple<bool, bool>> {
   bool UseMultipleReaders;
   StringMap<unsigned> Files;
-  std::vector<std::string> Filenames;
   std::vector<InputFunctionCoverageData> InputFunctions;
   std::vector<OutputFunctionCoverageData> OutputFunctions;
 
@@ -233,13 +235,10 @@ struct CoverageMappingTest : ::testing::TestWithParam<std::tuple<bool, bool>> {
 
   void readCoverageRegions(const std::string &Coverage,
                            OutputFunctionCoverageData &Data) {
-    // We will re-use the StringRef in duplicate tests, clear it to avoid
-    // clobber previous ones.
-    Filenames.clear();
-    Filenames.resize(Files.size() + 1);
+    Data.FilenamesStorage.resize(Files.size() + 1);
     for (const auto &E : Files)
-      Filenames[E.getValue()] = E.getKey().str();
-    ArrayRef<std::string> FilenameRefs = llvm::ArrayRef(Filenames);
+      Data.FilenamesStorage[E.getValue()] = E.getKey().str();
+    ArrayRef<std::string> FilenameRefs = llvm::ArrayRef(Data.FilenamesStorage);
     RawCoverageMappingReader Reader(Coverage, FilenameRefs, Data.Filenames,
                                     Data.Expressions, Data.Regions);
     EXPECT_THAT_ERROR(Reader.read(), Succeeded());

@kikairoya
Copy link
Contributor Author

Gentle ping.
This causes the test to fail unexpectedly on Cygwin, which uses old CoW std::string.

@jeremyd2019
Copy link
Contributor

Oh, is this the cause of the failures of the ProfileData test failures on Cygwin?

@jeremyd2019 jeremyd2019 requested a review from mstorsjo August 11, 2025 03:32
@kikairoya
Copy link
Contributor Author

Yes. I got an expectation error, 3 vs 5.

Copy link
Contributor

@jeremyd2019 jeremyd2019 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I'd appreciate another review from someone else because I don't have a firm handle on what's going on here

@mstorsjo
Copy link
Member

Gentle ping. This causes the test to fail unexpectedly on Cygwin, which uses old CoW std::string.

The fix looks reasonable, but I wonder why this issue hasn't been observed elsewhere. If I understand it correctly, this issue should be observable anywhere by building and running this unit test with address sanitizer, no? Or does it specifically require having the old libstdc++ CoW std::string implementation in order to observe the issue?

If the issue is having an ArrayRef() pointing to an std::vector<std::string> which has been cleared, it sounds to me like it indeed would be observeable anywhere.

@mstorsjo mstorsjo requested a review from MaskRay August 11, 2025 10:49
@kikairoya
Copy link
Contributor Author

If I understand it correctly, this issue should be observable anywhere by building and running this unit test with address sanitizer, no?

Ideally, yes.
However, in my understanding by reading implementation of libstdc++ and libc++, std::vector::clear doesn't deallocate memory, but only tells the address sanitizer that the valid area has been shrunk.
Therefore, asan can't catch use-after-free if resize (it tells asan valid area will be expanded) immediately follows clear because there is no one accessing the freed memory.

If there were a sanitizer capable of detecting access to destroyed objects via a pseudo-destructor call, this issue should be detected.

@@ -64,15 +64,18 @@ namespace {
struct OutputFunctionCoverageData {
StringRef Name;
uint64_t Hash;
std::vector<std::string> FilenamesStorage;
std::vector<StringRef> Filenames;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not store the strings directly in Filenames rather than having a separate storage?

Suggested change
std::vector<StringRef> Filenames;
std::vector<std::string> Filenames;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filenames will be referenced by CoverageMappingRecord::Filenames, which is ArrayRef<StringRef>.

// clobber previous ones.
Filenames.clear();
Filenames.resize(Files.size() + 1);
Data.FilenamesStorage.resize(Files.size() + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why + 1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Files (filename to index map) uses 1-based index. I don't know why it does so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants