Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm-exegesis] Make duplicate snippet repetitor produce whole snippets #77224

Conversation

boomanaiden154
Copy link
Contributor

@boomanaiden154 boomanaiden154 commented Jan 7, 2024

Currently, the duplicate snippet repetitor will truncate snippets that do not exactly divide the minimum number of instructions. This patch corrects that behavior by making the duplicate snippet repetitor duplicate the snippet in its entirety until the minimum number of instructions has been reached.

This makes the behavior consistent with the loop snippet repetitor, which will execute at least --num-repetitions (soon to be renamed --min-instructions) instructions.

Currently, the duplicate snippet repetitor will truncate snippets that
do not exactly divide the minimum number of instructions. This patch
corrects that behavior by making the duplicate snippet repetitor
duplicate the snippet in its entirety until the minimum number of
instructions has been reached.
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 7, 2024

@llvm/pr-subscribers-tools-llvm-exegesis

Author: Aiden Grossman (boomanaiden154)

Changes

Currently, the duplicate snippet repetitor will truncate snippets that do not exactly divide the minimum number of instructions. This patch corrects that behavior by making the duplicate snippet repetitor duplicate the snippet in its entirety until the minimum number of instructions has been reached.


Full diff: https://github.com/llvm/llvm-project/pull/77224.diff

2 Files Affected:

  • (modified) llvm/tools/llvm-exegesis/lib/SnippetRepetitor.cpp (+5-1)
  • (modified) llvm/unittests/tools/llvm-exegesis/X86/SnippetRepetitorTest.cpp (+16-2)
diff --git a/llvm/tools/llvm-exegesis/lib/SnippetRepetitor.cpp b/llvm/tools/llvm-exegesis/lib/SnippetRepetitor.cpp
index cc5a045a8be5dd..5861d56c61cf6d 100644
--- a/llvm/tools/llvm-exegesis/lib/SnippetRepetitor.cpp
+++ b/llvm/tools/llvm-exegesis/lib/SnippetRepetitor.cpp
@@ -31,7 +31,11 @@ class DuplicateSnippetRepetitor : public SnippetRepetitor {
       if (!Instructions.empty()) {
         // Add the whole snippet at least once.
         Entry.addInstructions(Instructions);
-        for (unsigned I = Instructions.size(); I < MinInstructions; ++I) {
+        unsigned FullInstructionCount = MinInstructions;
+        if (FullInstructionCount % Instructions.size() != 0)
+          FullInstructionCount +=
+              Instructions.size() - (MinInstructions % Instructions.size());
+        for (unsigned I = Instructions.size(); I < FullInstructionCount; ++I) {
           Entry.addInstruction(Instructions[I % Instructions.size()]);
         }
       }
diff --git a/llvm/unittests/tools/llvm-exegesis/X86/SnippetRepetitorTest.cpp b/llvm/unittests/tools/llvm-exegesis/X86/SnippetRepetitorTest.cpp
index d2382ec0cddc49..25e8836087c15d 100644
--- a/llvm/unittests/tools/llvm-exegesis/X86/SnippetRepetitorTest.cpp
+++ b/llvm/unittests/tools/llvm-exegesis/X86/SnippetRepetitorTest.cpp
@@ -38,9 +38,11 @@ class X86SnippetRepetitorTest : public X86TestBase {
     MF = &createVoidVoidPtrMachineFunction("TestFn", Mod.get(), MMI.get());
   }
 
-  void TestCommon(Benchmark::RepetitionModeE RepetitionMode) {
+  void TestCommon(Benchmark::RepetitionModeE RepetitionMode,
+                  unsigned SnippetInstructions = 1) {
     const auto Repetitor = SnippetRepetitor::Create(RepetitionMode, State);
-    const std::vector<MCInst> Instructions = {MCInstBuilder(X86::NOOP)};
+    const std::vector<MCInst> Instructions(SnippetInstructions,
+                                           MCInstBuilder(X86::NOOP));
     FunctionFiller Sink(*MF, {X86::EAX});
     const auto Fill =
         Repetitor->Repeat(Instructions, kMinInstructions, kLoopBodySize, false);
@@ -74,6 +76,18 @@ TEST_F(X86SnippetRepetitorTest, Duplicate) {
                           HasOpcode(X86::NOOP), HasOpcode(X86::RET64)));
 }
 
+TEST_F(X86SnippetRepetitorTest, DuplicateSnippetInstructionCount) {
+  TestCommon(Benchmark::Duplicate, 2);
+  // Duplicating a snippet of two instructions with the minimum number of
+  // instructions set to three duplicates the snippet twice for a total of
+  // four instructions.
+  ASSERT_EQ(MF->getNumBlockIDs(), 1u);
+  EXPECT_THAT(MF->getBlockNumbered(0)->instrs(),
+              ElementsAre(HasOpcode(X86::NOOP), HasOpcode(X86::NOOP),
+                          HasOpcode(X86::NOOP), HasOpcode(X86::NOOP),
+                          HasOpcode(X86::RET64)));
+}
+
 TEST_F(X86SnippetRepetitorTest, Loop) {
   TestCommon(Benchmark::Loop);
   // Duplicating creates an entry block, a loop body and a ret block.

boomanaiden154 added a commit to boomanaiden154/llvm-project that referenced this pull request Jan 7, 2024
Currently, BenchmarkRunner scales the per snippet counters by
multiplying the raw counter values by the number of instructions (casted
to a double) divided by the minimum number of instructions. This is
incorrect for the loop repetition mode for snippets that don't fit a
whole number of times into the minimum instruction count. For example,
with 3 instructions in the snippet and the minimum number of
instructions set to 4, the loop repetitor will execute a total of six
instructions, but BenchmarkRunner will scale the raw count by 3/4
instead of 3/6=1/2. This will also be incorrect for the duplicate
snippet repetitor after llvm#77224.

This patch fixes this behavior by dividing the raw count by the ceiling
of the number of repetitions divided by the instruction count.
boomanaiden154 added a commit that referenced this pull request Jan 16, 2024
Currently, BenchmarkRunner scales the per snippet counters by
multiplying the raw counter values by the number of instructions (casted
to a double) divided by the minimum number of instructions. This is
incorrect for the loop repetition mode for snippets that don't fit a
whole number of times into the minimum instruction count. For example,
with 3 instructions in the snippet and the minimum number of
instructions set to 4, the loop repetitor will execute a total of six
instructions, but BenchmarkRunner will scale the raw count by 3/4
instead of 3/6=1/2. This will also be incorrect for the duplicate
snippet repetitor after #77224.

This patch fixes this behavior by dividing the raw count by the ceiling
of the number of repetitions divided by the instruction count.
Copy link
Contributor

@legrosbuffle legrosbuffle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@boomanaiden154 boomanaiden154 merged commit 2b31a67 into llvm:main Jan 19, 2024
3 of 4 checks passed
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
Currently, BenchmarkRunner scales the per snippet counters by
multiplying the raw counter values by the number of instructions (casted
to a double) divided by the minimum number of instructions. This is
incorrect for the loop repetition mode for snippets that don't fit a
whole number of times into the minimum instruction count. For example,
with 3 instructions in the snippet and the minimum number of
instructions set to 4, the loop repetitor will execute a total of six
instructions, but BenchmarkRunner will scale the raw count by 3/4
instead of 3/6=1/2. This will also be incorrect for the duplicate
snippet repetitor after llvm#77224.

This patch fixes this behavior by dividing the raw count by the ceiling
of the number of repetitions divided by the instruction count.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants