Skip to content

Conversation

teresajohnson
Copy link
Contributor

A cycle profile of a thin link showed a lot of time spent in sort called
from the BitcodeWriter, which was being used to compute the unique
references to stack ids in the summaries emitted for each backend in a
distributed thinlto build. We were also frequently invoking lower_bound
to locate stack id indices in the resulting vector when writing out the
referencing memprof records.

Change this to use a map to uniquify the references, and to hold the
index of the corresponding stack id in the StackIds vector, which is
now populated at the same time.

This reduced the time of a large thin link by about 10%.

A cycle profile of a thin link showed a lot of time spent in sort called
from the BitcodeWriter, which was being used to compute the unique
references to stack ids in the summaries emitted for each backend in a
distributed thinlto build. We were also frequently invoking lower_bound
to locate stack id indices in the resulting vector when writing out the
referencing memprof records.

Change this to use a map to uniquify the references, and to hold the
index of the corresponding stack id in the StackIds vector, which is
now populated at the same time.

This reduced the time of a large thin link by about 10%.
Copy link

github-actions bot commented Jun 3, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 13f6797826faedb910c7e8b1dfc1f134f3cef342 eba69aa102319ab63382ece23fe146036d9dc361 -- llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
index a2007a268b..682a3fe54a 100644
--- a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -4668,7 +4668,8 @@ void IndexBitcodeWriter::writeCombinedGlobalValueSummary() {
     writeFunctionHeapProfileRecords(
         Stream, FS, CallsiteAbbrev, AllocAbbrev,
         /*PerModule*/ false,
-        /*GetValueId*/ [&](const ValueInfo &VI) -> unsigned {
+        /*GetValueId*/
+        [&](const ValueInfo &VI) -> unsigned {
           std::optional<unsigned> ValueID = GetValueId(VI);
           // This can happen in shared index files for distributed ThinLTO if
           // the callee function summary is not included. Record 0 which we
@@ -4678,7 +4679,8 @@ void IndexBitcodeWriter::writeCombinedGlobalValueSummary() {
             return 0;
           return *ValueID;
         },
-        /*GetStackIndex*/ [&](unsigned I) {
+        /*GetStackIndex*/
+        [&](unsigned I) {
           // Get the corresponding index into the list of StackIds actually
           // being written for this combined index (which may be a subset in
           // the case of distributed indexes).

@teresajohnson teresajohnson merged commit f4d7058 into llvm:main Jun 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants