
Conversation

chinmaydd
Contributor

@chinmaydd chinmaydd commented Jun 6, 2024

Latency for load operations could be modeled better by using the LoadLatency field in the provided scheduling model for the target architecture. This PR adds a new command line flag use-load-latency that enables this behavior. The flag is off by default.

This attempts to address the comment at LSUnit.h#403. I would be happy to know if there is a better way to approach this.

@llvmbot
Member

llvmbot commented Jun 6, 2024

@llvm/pr-subscribers-tools-llvm-mca

@llvm/pr-subscribers-backend-x86

Author: Chinmay Deshpande (chinmaydd)


Full diff: https://github.com/llvm/llvm-project/pull/94566.diff

6 Files Affected:

  • (modified) llvm/include/llvm/MCA/InstrBuilder.h (+3-1)
  • (modified) llvm/lib/MCA/InstrBuilder.cpp (+13-5)
  • (added) llvm/test/tools/llvm-mca/X86/use-load-latency.s (+58)
  • (modified) llvm/tools/llvm-mca/llvm-mca.cpp (+7-1)
  • (modified) llvm/unittests/tools/llvm-mca/MCATestBase.cpp (+2-1)
  • (modified) llvm/unittests/tools/llvm-mca/X86/TestIncrementalMCA.cpp (+4-2)
diff --git a/llvm/include/llvm/MCA/InstrBuilder.h b/llvm/include/llvm/MCA/InstrBuilder.h
index 00c7942e4fa16..6880a213d48b5 100644
--- a/llvm/include/llvm/MCA/InstrBuilder.h
+++ b/llvm/include/llvm/MCA/InstrBuilder.h
@@ -79,6 +79,7 @@ class InstrBuilder {
   bool FirstCallInst;
   bool FirstReturnInst;
   unsigned CallLatency;
+  bool UseLoadLatency;
 
   using InstRecycleCallback = std::function<Instruction *(const InstrDesc &)>;
   InstRecycleCallback InstRecycleCB;
@@ -99,7 +100,8 @@ class InstrBuilder {
 public:
   InstrBuilder(const MCSubtargetInfo &STI, const MCInstrInfo &MCII,
                const MCRegisterInfo &RI, const MCInstrAnalysis *IA,
-               const InstrumentManager &IM, unsigned CallLatency);
+               const InstrumentManager &IM, unsigned CallLatency,
+               bool UseLoadLatency);
 
   void clear() {
     Descriptors.clear();
diff --git a/llvm/lib/MCA/InstrBuilder.cpp b/llvm/lib/MCA/InstrBuilder.cpp
index d5cbdc5de0b84..db4eda40de9d0 100644
--- a/llvm/lib/MCA/InstrBuilder.cpp
+++ b/llvm/lib/MCA/InstrBuilder.cpp
@@ -31,9 +31,10 @@ InstrBuilder::InstrBuilder(const llvm::MCSubtargetInfo &sti,
                            const llvm::MCInstrInfo &mcii,
                            const llvm::MCRegisterInfo &mri,
                            const llvm::MCInstrAnalysis *mcia,
-                           const mca::InstrumentManager &im, unsigned cl)
+                           const mca::InstrumentManager &im, unsigned cl,
+                           bool ull)
     : STI(sti), MCII(mcii), MRI(mri), MCIA(mcia), IM(im), FirstCallInst(true),
-      FirstReturnInst(true), CallLatency(cl) {
+      FirstReturnInst(true), CallLatency(cl), UseLoadLatency(ull) {
   const MCSchedModel &SM = STI.getSchedModel();
   ProcResourceMasks.resize(SM.getNumProcResourceKinds());
   computeProcResourceMasks(STI.getSchedModel(), ProcResourceMasks);
@@ -220,8 +221,8 @@ static void initializeUsedResources(InstrDesc &ID,
 
 static void computeMaxLatency(InstrDesc &ID, const MCInstrDesc &MCDesc,
                               const MCSchedClassDesc &SCDesc,
-                              const MCSubtargetInfo &STI,
-                              unsigned CallLatency) {
+                              const MCSubtargetInfo &STI, unsigned CallLatency,
+                              bool UseLoadLatency) {
   if (MCDesc.isCall()) {
     // We cannot estimate how long this call will take.
     // Artificially set an arbitrarily high latency.
@@ -230,6 +231,13 @@ static void computeMaxLatency(InstrDesc &ID, const MCInstrDesc &MCDesc,
   }
 
   int Latency = MCSchedModel::computeInstrLatency(STI, SCDesc);
+
+  // If `UseLoadLatency` is set, we use the value in `MCSchedModel::LoadLatency`
+  // for load instructions.
+  if (MCDesc.mayLoad() && UseLoadLatency) {
+    const auto &SM = STI.getSchedModel();
+    Latency = std::max(int(SM.LoadLatency), Latency);
+  }
   // If latency is unknown, then conservatively assume the MaxLatency set for
   // calls.
   ID.MaxLatency = Latency < 0 ? CallLatency : static_cast<unsigned>(Latency);
@@ -582,7 +590,7 @@ InstrBuilder::createInstrDescImpl(const MCInst &MCI,
   }
 
   initializeUsedResources(*ID, SCDesc, STI, ProcResourceMasks);
-  computeMaxLatency(*ID, MCDesc, SCDesc, STI, CallLatency);
+  computeMaxLatency(*ID, MCDesc, SCDesc, STI, CallLatency, UseLoadLatency);
 
   if (Error Err = verifyOperands(MCDesc, MCI))
     return std::move(Err);
diff --git a/llvm/test/tools/llvm-mca/X86/use-load-latency.s b/llvm/test/tools/llvm-mca/X86/use-load-latency.s
new file mode 100644
index 0000000000000..54c37c5c8699f
--- /dev/null
+++ b/llvm/test/tools/llvm-mca/X86/use-load-latency.s
@@ -0,0 +1,58 @@
+# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
+# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=btver2                   -iterations=1 %s | FileCheck --check-prefixes=ALL,DEFAULT %s
+# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=btver2 -use-load-latency -iterations=1 %s | FileCheck --check-prefixes=ALL,CUSTOM %s
+
+movq      (%rsp), %rdi
+
+# ALL:          Iterations:        1
+# ALL-NEXT:     Instructions:      1
+
+# CUSTOM-NEXT:  Total Cycles:      8
+# DEFAULT-NEXT: Total Cycles:      6
+
+# ALL-NEXT:     Total uOps:        1
+
+# ALL:          Dispatch Width:    2
+
+# CUSTOM-NEXT:  uOps Per Cycle:    0.13
+# CUSTOM-NEXT:  IPC:               0.13
+
+# DEFAULT-NEXT: uOps Per Cycle:    0.17
+# DEFAULT-NEXT: IPC:               0.17
+
+# ALL-NEXT:     Block RThroughput: 1.0
+
+# ALL:          Instruction Info:
+# ALL-NEXT:     [1]: #uOps
+# ALL-NEXT:     [2]: Latency
+# ALL-NEXT:     [3]: RThroughput
+# ALL-NEXT:     [4]: MayLoad
+# ALL-NEXT:     [5]: MayStore
+# ALL-NEXT:     [6]: HasSideEffects (U)
+
+# ALL:          [1]    [2]    [3]    [4]    [5]    [6]    Instructions:
+# ALL-NEXT:      1      3     1.00    *                   movq	(%rsp), %rdi
+
+# ALL:          Resources:
+# ALL-NEXT:     [0]   - JALU0
+# ALL-NEXT:     [1]   - JALU1
+# ALL-NEXT:     [2]   - JDiv
+# ALL-NEXT:     [3]   - JFPA
+# ALL-NEXT:     [4]   - JFPM
+# ALL-NEXT:     [5]   - JFPU0
+# ALL-NEXT:     [6]   - JFPU1
+# ALL-NEXT:     [7]   - JLAGU
+# ALL-NEXT:     [8]   - JMul
+# ALL-NEXT:     [9]   - JSAGU
+# ALL-NEXT:     [10]  - JSTC
+# ALL-NEXT:     [11]  - JVALU0
+# ALL-NEXT:     [12]  - JVALU1
+# ALL-NEXT:     [13]  - JVIMUL
+
+# ALL:          Resource pressure per iteration:
+# ALL-NEXT:     [0]    [1]    [2]    [3]    [4]    [5]    [6]    [7]    [8]    [9]    [10]   [11]   [12]   [13]
+# ALL-NEXT:      -      -      -      -      -      -      -     1.00    -      -      -      -      -      -
+
+# ALL:          Resource pressure by instruction:
+# ALL-NEXT:     [0]    [1]    [2]    [3]    [4]    [5]    [6]    [7]    [8]    [9]    [10]   [11]   [12]   [13]   Instructions:
+# ALL-NEXT:      -      -      -      -      -      -      -     1.00    -      -      -      -      -      -     movq	(%rsp), %rdi
diff --git a/llvm/tools/llvm-mca/llvm-mca.cpp b/llvm/tools/llvm-mca/llvm-mca.cpp
index cc5d4f5fa05de..927b2b97c4d68 100644
--- a/llvm/tools/llvm-mca/llvm-mca.cpp
+++ b/llvm/tools/llvm-mca/llvm-mca.cpp
@@ -140,6 +140,11 @@ static cl::opt<unsigned>
                 cl::desc("Number of cycles to assume for a call instruction"),
                 cl::cat(ToolOptions), cl::init(100U));
 
+static cl::opt<bool> UseLoadLatency(
+    "use-load-latency", cl::Hidden,
+    cl::desc("Use target specific latency for load instructions"),
+    cl::cat(ToolOptions), cl::init(false));
+
 enum class SkipType { NONE, LACK_SCHED, PARSE_FAILURE, ANY_FAILURE };
 
 static cl::opt<enum SkipType> SkipUnsupportedInstructions(
@@ -573,7 +578,8 @@ int main(int argc, char **argv) {
   }
 
   // Create an instruction builder.
-  mca::InstrBuilder IB(*STI, *MCII, *MRI, MCIA.get(), *IM, CallLatency);
+  mca::InstrBuilder IB(*STI, *MCII, *MRI, MCIA.get(), *IM, CallLatency,
+                       UseLoadLatency);
 
   // Create a context to control ownership of the pipeline hardware.
   mca::Context MCA(*MRI, *STI);
diff --git a/llvm/unittests/tools/llvm-mca/MCATestBase.cpp b/llvm/unittests/tools/llvm-mca/MCATestBase.cpp
index 4a39f5e663f23..515fbe5bd6b69 100644
--- a/llvm/unittests/tools/llvm-mca/MCATestBase.cpp
+++ b/llvm/unittests/tools/llvm-mca/MCATestBase.cpp
@@ -66,7 +66,8 @@ Error MCATestBase::runBaselineMCA(json::Object &Result, ArrayRef<MCInst> Insts,
 
   // Default InstrumentManager
   auto IM = std::make_unique<mca::InstrumentManager>(*STI, *MCII);
-  mca::InstrBuilder IB(*STI, *MCII, *MRI, MCIA.get(), *IM, /*CallLatency=*/100);
+  mca::InstrBuilder IB(*STI, *MCII, *MRI, MCIA.get(), *IM, /*CallLatency=*/100,
+                       /*UseLoadLatency*/ false);
 
   const SmallVector<mca::Instrument *> Instruments;
   SmallVector<std::unique_ptr<mca::Instruction>> LoweredInsts;
diff --git a/llvm/unittests/tools/llvm-mca/X86/TestIncrementalMCA.cpp b/llvm/unittests/tools/llvm-mca/X86/TestIncrementalMCA.cpp
index ac35dce522ae1..80a2f49ca4aa4 100644
--- a/llvm/unittests/tools/llvm-mca/X86/TestIncrementalMCA.cpp
+++ b/llvm/unittests/tools/llvm-mca/X86/TestIncrementalMCA.cpp
@@ -33,7 +33,8 @@ TEST_F(X86TestBase, TestResumablePipeline) {
   P->addEventListener(SV.get());
 
   auto IM = std::make_unique<mca::InstrumentManager>(*STI, *MCII);
-  mca::InstrBuilder IB(*STI, *MCII, *MRI, MCIA.get(), *IM, /*CallLatency=*/100);
+  mca::InstrBuilder IB(*STI, *MCII, *MRI, MCIA.get(), *IM, /*CallLatency=*/100,
+                       /*UseLoadLatency*/ false);
 
   const SmallVector<mca::Instrument *> Instruments;
   // Tile size = 7
@@ -124,7 +125,8 @@ TEST_F(X86TestBase, TestInstructionRecycling) {
   // Default InstrumentManager
   auto IM = std::make_unique<mca::InstrumentManager>(*STI, *MCII);
 
-  mca::InstrBuilder IB(*STI, *MCII, *MRI, MCIA.get(), *IM, /*CallLatency=*/100);
+  mca::InstrBuilder IB(*STI, *MCII, *MRI, MCIA.get(), *IM, /*CallLatency=*/100,
+                       /*UseLoadLatency*/ false);
   IB.setInstRecycleCallback(GetRecycledInst);
 
   const SmallVector<mca::Instrument *> Instruments;

@chinmaydd
Contributor Author

@mshockwave @michaelmaitland - tagging both of you since you helped review my last PR.

// If `UseLoadLatency` is set, we use the value in `MCSchedModel::LoadLatency`
// for load instructions.
if (MCDesc.mayLoad() && UseLoadLatency) {
const auto &SM = STI.getSchedModel();
Member


Please spell out the type (rather than using auto) here.

auto IM = std::make_unique<mca::InstrumentManager>(*STI, *MCII);
mca::InstrBuilder IB(*STI, *MCII, *MRI, MCIA.get(), *IM, /*CallLatency=*/100);
mca::InstrBuilder IB(*STI, *MCII, *MRI, MCIA.get(), *IM, /*CallLatency=*/100,
/*UseLoadLatency*/ false);
Member


Per https://www.llvm.org/docs/CodingStandards.html#comment-formatting :

Suggested change
/*UseLoadLatency*/ false);
/*UseLoadLatency=*/false);

auto IM = std::make_unique<mca::InstrumentManager>(*STI, *MCII);
mca::InstrBuilder IB(*STI, *MCII, *MRI, MCIA.get(), *IM, /*CallLatency=*/100);
mca::InstrBuilder IB(*STI, *MCII, *MRI, MCIA.get(), *IM, /*CallLatency=*/100,
/*UseLoadLatency*/ false);
Member


ditto


mca::InstrBuilder IB(*STI, *MCII, *MRI, MCIA.get(), *IM, /*CallLatency=*/100);
mca::InstrBuilder IB(*STI, *MCII, *MRI, MCIA.get(), *IM, /*CallLatency=*/100,
/*UseLoadLatency*/ false);
Member


ditto

chinmaydd added 2 commits June 7, 2024 15:26
Latency for load operations could be modeled better by
using the `LoadLatency` field in the provided scheduling
model for the target architecture.
Comment on lines +237 to +240
if (MCDesc.mayLoad() && UseLoadLatency) {
const MCSchedModel &SM = STI.getSchedModel();
Latency = std::max(int(SM.LoadLatency), Latency);
}
Collaborator


Sorry but this doesn't work.

You cannot simply override Latency that way. The latency of instructions with folded memory operands isn't just equivalent to the load latency.

To further complicate matters, it is wrong to assume a single load latency value for all memory operations.
For example, on AMD processors, the load-to-use latency changes depending on whether the opcode is INT or FPU.
That is because there is an extra delay in the data path when loaded values are forwarded to the floating point unit (often implemented as a coprocessor).


// If `UseLoadLatency` is set, we use the value in `MCSchedModel::LoadLatency`
// for load instructions.
if (MCDesc.mayLoad() && UseLoadLatency) {
Contributor


What if LoadLatency is not set (i.e. it is -1) but -use-load-latency is passed? Should a warning be printed if -use-load-latency is set?

return;
}

int Latency = MCSchedModel::computeInstrLatency(STI, SCDesc);
Contributor


IIUC, there are two ways we use LoadLatency in LLVM today:

  1. In TargetSchedModel::computeInstrLatency under the condition !hasInstrSchedModel && !SCDesc.isValid && MI.mayLoad() by making a call to defaultDefLatency. It seems like LoadLatency is really used as a backup in case something is misconfigured.

  2. Scheduler models will use it as a helper variable in computing instruction latencies. I am not 100% sure because I am not an expert on these models, but I think this is modeling the worst-case scenario of checking the cache, having a miss, and having to do the load. Or it is possible that these models are misusing LoadLatency as a BaseLatency and adjusting the latency accordingly. For example:

llvm/lib/Target/X86/X86ScheduleZnver2.td:252:  let Latency = !add(Zn2WriteIMulH.Latency, Znver2Model.LoadLatency);

It seems like you are not trying to use it under either interpretation (1) or (2) of this variable. And here is why we shouldn't use it in those circumstances:

  • In the first case, if we are using llvm-mca with !hasInstrSchedModel, then something is probably wrong and a warning or error should have occurred sooner. This would mean that the entire report is not meaningful. If we are using llvm-mca where !SCDesc.isValid, then again there is a bigger issue here. Using LoadLatency probably isn't the solution. So if either of these conditions is true, it does not seem right to use LoadLatency in the llvm-mca reports. It makes sense to error or warn earlier.
  • In the second case, the LoadLatency should be baked into int Latency = MCSchedModel::computeInstrLatency(STI, SCDesc);. I think this makes the assignment Latency = std::max(int(SM.LoadLatency), Latency); incorrect.

Prior to this patch, it does not seem like we are using LoadLatency to model the case you describe: the number of cycles a load takes to access the cache, assuming there is always a cache hit. It seems to be specified in such a way that it would make sense to use it like that, so I think it is reasonable to have llvm-mca use it as such.

As @adibiagio points out, I don't think this is what the code added in this patch models. We would need to determine the depth of the load pipeline. I think the current interpretation of this patch is to use whichever latency is higher: the latency assuming a cache hit (but ignoring load pipeline depth) or the latency assigned to the instruction. I think in most reasonable models, the second one should always be higher, so I'm not sure how useful this would be.

As @adibiagio pointed out, load-to-use latency changes depending on the opcode type (int, fp, vector). I wonder if LoadLatency is too simple/naive a value to be useful on processors that have multiple types of loads. One solution is to move the LoadLatency field into ProcWriteResources so that it can be set per scheduler class. I would not be opposed if the solution proposed here came first and this improvement was added as a follow-up. This would allow models that do have a single LoadLatency to benefit from this kind of reporting.

As a side note, I wonder if the scheduler could or should make more use of this interpretation in scheduling.

Collaborator

@adibiagio adibiagio Jun 10, 2024


  1. Scheduler models will use it as a helper variable in computing instruction latencies. I am not 100% sure because I am not an expert on these models, but I think this is modeling the worst-case scenario of checking the cache, having a miss, and having to do the load. Or it is possible that these models are misusing LoadLatency as a BaseLatency and adjusting the latency accordingly. For example:
llvm/lib/Target/X86/X86ScheduleZnver2.td:252:  let Latency = !add(Zn2WriteIMulH.Latency, Znver2Model.LoadLatency);

X86 scheduling models (optimistically) define LoadLatency as the L1D load-to-use latency.

Things are a bit more complicated for AMD processors, where load-to-use latency varies depending on whether loaded values are used by the FPU or INT.

According to the official AMD SoG for family 17h: "The L1 data cache has a 4- or 5-cycle integer load-to-use latency, and a 7- or 8-cycle FPU load-to-use latency". That is the reason why the Znver2Model optimistically defines it as 4.

If you look at the scheduling model for ZnVer2, you can see two multiclasses:

  • multiclass Zn2WriteResPair is used by integer instructions with a folded load operand
  • multiclass Zn2WriteResFpuPair is used by floating point/vector instructions with a folded load.

multiclass Zn2WriteResPair sets the default value for param loadLat to 4 (i.e. optimistic L1 data load-to-use INT latency according to the AMD docs).
On the other hand, multiclass Zn2WriteResFpuPair sets that param to 7 (cycles).

As far as I am aware, Intel processors don't need to model multiple values for the load-to-use latency. On X86, only AMD processors change the load latency depending on whether the user is INT or FPU.

@chinmaydd
Contributor Author

@michaelmaitland and @adibiagio - thank you very much for your time and useful comments. I think I understand the fundamental problems with this patch.

It may make sense for a consumer of llvm-mca (looking to more precisely model latency counts for load instructions) to define custom HW units, such as the LSUnit and something like a CacheManager, tuned to their specific processor.

Closing the PR. Thanks again!

@chinmaydd chinmaydd closed this Jun 11, 2024