[mlir][amdgpu] `memory_counter_wait` tensor counter support #171153

Hardcode84 · 2025-12-08T16:17:34Z

No description provided.

llvmbot · 2025-12-08T16:18:06Z

@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-mlir-amdgpu

Author: Ivan Butygin (Hardcode84)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/171153.diff

7 Files Affected:

(modified) mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td (+3-2)
(modified) mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp (+6)
(modified) mlir/lib/Dialect/AMDGPU/IR/AMDGPUDialect.cpp (+5-3)
(modified) mlir/test/Conversion/AMDGPUToROCDL/memory_counter_wait.mlir (+4-4)
(added) mlir/test/Conversion/AMDGPUToROCDL/memory_counter_wait_tensor.mlir (+9)
(added) mlir/test/Conversion/AMDGPUToROCDL/memory_counter_wait_unsupported.mlir (+11)
(modified) mlir/test/Dialect/AMDGPU/canonicalize.mlir (+3-3)

diff --git a/mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td b/mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td
index ba078f52d24f6..56160d3e8fe85 100644
--- a/mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td
+++ b/mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td
@@ -906,7 +906,8 @@ def AMDGPU_MemoryCounterWaitOp :
       OptionalAttr<I32Attr>:$load,
       OptionalAttr<I32Attr>:$store,
       OptionalAttr<I32Attr>:$ds,
-      OptionalAttr<I32Attr>:$exp
+      OptionalAttr<I32Attr>:$exp,
+      OptionalAttr<I32Attr>:$tensor
     )>
   {
   let summary = "Wait for specified hardware counters";
@@ -919,7 +920,7 @@ def AMDGPU_MemoryCounterWaitOp :
     counters into one.
   }];
   let assemblyFormat = [{
-    oilist( `load` `(` $load `)` | `store` `(` $store `)` | `ds` `(` $ds `)` | `exp` `(` $exp `)` ) attr-dict
+    oilist( `load` `(` $load `)` | `store` `(` $store `)` | `ds` `(` $ds `)` | `exp` `(` $exp `)` | `tensor` `(` $tensor `)` ) attr-dict
   }];
 
   let hasCanonicalizer = 1;
diff --git a/mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp b/mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
index f3b0da0120998..7584b17075225 100644
--- a/mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+++ b/mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
@@ -506,10 +506,16 @@ struct MemoryCounterWaitOpLowering
       if (std::optional<int> exp = adaptor.getExp())
         ROCDL::WaitExpcntOp::create(rewriter, loc, *exp);
 
+      if (std::optional<int> tensor = adaptor.getTensor())
+        ROCDL::WaitTensorcntOp::create(rewriter, loc, *tensor);
+
       rewriter.eraseOp(op);
       return success();
     }
 
+    if (adaptor.getTensor())
+      return op.emitOpError("unsupported chipset");
+
     auto getVal = [](Attribute attr) -> unsigned {
       if (attr)
         return cast<IntegerAttr>(attr).getInt();
diff --git a/mlir/lib/Dialect/AMDGPU/IR/AMDGPUDialect.cpp b/mlir/lib/Dialect/AMDGPU/IR/AMDGPUDialect.cpp
index 4a85db3ecf6f8..b7a665b0f5367 100644
--- a/mlir/lib/Dialect/AMDGPU/IR/AMDGPUDialect.cpp
+++ b/mlir/lib/Dialect/AMDGPU/IR/AMDGPUDialect.cpp
@@ -614,10 +614,12 @@ struct FuseMemoryCounterWaitOp final : OpRewritePattern<MemoryCounterWaitOp> {
 
     auto setters = {&MemoryCounterWaitOp::setLoad,
                     &MemoryCounterWaitOp::setStore, &MemoryCounterWaitOp::setDs,
-                    &MemoryCounterWaitOp::setExp};
-    auto lhsVals = {op.getLoad(), op.getStore(), op.getDs(), op.getExp()};
+                    &MemoryCounterWaitOp::setExp,
+                    &MemoryCounterWaitOp::setTensor};
+    auto lhsVals = {op.getLoad(), op.getStore(), op.getDs(), op.getExp(),
+                    op.getTensor()};
     auto rhsVals = {next.getLoad(), next.getStore(), next.getDs(),
-                    next.getExp()};
+                    next.getExp(), next.getTensor()};
     rewriter.modifyOpInPlace(op, [&] {
       for (auto [setter, lhs, rhs] :
            llvm::zip_equal(setters, lhsVals, rhsVals)) {
diff --git a/mlir/test/Conversion/AMDGPUToROCDL/memory_counter_wait.mlir b/mlir/test/Conversion/AMDGPUToROCDL/memory_counter_wait.mlir
index 1016ee859e462..537ef59b503a6 100644
--- a/mlir/test/Conversion/AMDGPUToROCDL/memory_counter_wait.mlir
+++ b/mlir/test/Conversion/AMDGPUToROCDL/memory_counter_wait.mlir
@@ -1,7 +1,7 @@
-// RUN: mlir-opt %s -convert-amdgpu-to-rocdl=chipset=gfx942 | FileCheck %s --check-prefixes=CHECK,GFX9
-// RUN: mlir-opt %s -convert-amdgpu-to-rocdl=chipset=gfx1030 | FileCheck %s --check-prefixes=CHECK,GFX10
-// RUN: mlir-opt %s -convert-amdgpu-to-rocdl=chipset=gfx1100 | FileCheck %s --check-prefixes=CHECK,GFX11
-// RUN: mlir-opt %s -convert-amdgpu-to-rocdl=chipset=gfx1201 | FileCheck %s --check-prefixes=CHECK,GFX12
+// RUN: mlir-opt %s --convert-amdgpu-to-rocdl=chipset=gfx942 | FileCheck %s --check-prefixes=CHECK,GFX9
+// RUN: mlir-opt %s --convert-amdgpu-to-rocdl=chipset=gfx1030 | FileCheck %s --check-prefixes=CHECK,GFX10
+// RUN: mlir-opt %s --convert-amdgpu-to-rocdl=chipset=gfx1100 | FileCheck %s --check-prefixes=CHECK,GFX11
+// RUN: mlir-opt %s --convert-amdgpu-to-rocdl=chipset=gfx1201 | FileCheck %s --check-prefixes=CHECK,GFX12
 
 // CHECK-LABEL: func @memory_counter_wait
 func.func @memory_counter_wait() {
diff --git a/mlir/test/Conversion/AMDGPUToROCDL/memory_counter_wait_tensor.mlir b/mlir/test/Conversion/AMDGPUToROCDL/memory_counter_wait_tensor.mlir
new file mode 100644
index 0000000000000..5b29e01abebdb
--- /dev/null
+++ b/mlir/test/Conversion/AMDGPUToROCDL/memory_counter_wait_tensor.mlir
@@ -0,0 +1,9 @@
+// RUN: mlir-opt %s --convert-amdgpu-to-rocdl=chipset=gfx1250 | FileCheck %s
+
+// CHECK-LABEL: func @memory_counter_wait_tensor
+func.func @memory_counter_wait_tensor() {
+  // CHECK: rocdl.s.wait.tensorcnt 3
+  amdgpu.memory_counter_wait tensor(3)
+
+  return
+}
diff --git a/mlir/test/Conversion/AMDGPUToROCDL/memory_counter_wait_unsupported.mlir b/mlir/test/Conversion/AMDGPUToROCDL/memory_counter_wait_unsupported.mlir
new file mode 100644
index 0000000000000..1d2f692bee488
--- /dev/null
+++ b/mlir/test/Conversion/AMDGPUToROCDL/memory_counter_wait_unsupported.mlir
@@ -0,0 +1,11 @@
+// RUN: mlir-opt %s --verify-diagnostics --convert-amdgpu-to-rocdl=chipset=gfx942
+// RUN: mlir-opt %s --verify-diagnostics --convert-amdgpu-to-rocdl=chipset=gfx1030
+// RUN: mlir-opt %s --verify-diagnostics --convert-amdgpu-to-rocdl=chipset=gfx1100
+
+func.func @memory_counter_wait_tensor() {
+  // expected-error @below{{failed to legalize operation 'amdgpu.memory_counter_wait'}}
+  // expected-error @below{{'amdgpu.memory_counter_wait' op unsupported chipset}}
+  amdgpu.memory_counter_wait tensor(0)
+
+  return
+}
diff --git a/mlir/test/Dialect/AMDGPU/canonicalize.mlir b/mlir/test/Dialect/AMDGPU/canonicalize.mlir
index c66e9ed5d6f6d..cff1d3f2ac1fd 100644
--- a/mlir/test/Dialect/AMDGPU/canonicalize.mlir
+++ b/mlir/test/Dialect/AMDGPU/canonicalize.mlir
@@ -250,10 +250,10 @@ func.func @scaled_mfma_ugly_shapes(%opA: vector<32xf4E2M1FN>, %opB: vector<32xf4
 // CHECK-LABEL fuse_memory_counter_wait
 func.func @fuse_memory_counter_wait() {
   //      CHECK: amdgpu.memory_counter_wait
-  // CHECK-SAME: load(1) store(2) ds(2) exp(1)
+  // CHECK-SAME: load(1) store(2) ds(2) exp(1) tensor(0)
   // CHECK-NEXT: return
-  amdgpu.memory_counter_wait load(1) store(2) ds(3) exp(4)
-  amdgpu.memory_counter_wait load(4) store(3) ds(2) exp(1)
+  amdgpu.memory_counter_wait load(1) store(2) ds(3) exp(4) tensor(5)
+  amdgpu.memory_counter_wait load(4) store(3) ds(2) exp(1) tensor(0)
   return
 }
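
The updated `fuse_memory_counter_wait` test implies that fusing two adjacent waits keeps the smaller value for each counter, now including `tensor`. Below is a minimal standalone sketch of that merge rule, not the actual `FuseMemoryCounterWaitOp` pattern. `CounterWaits` and `fuseWaits` are hypothetical names, and the handling of a counter set on only one of the two ops is an assumption, since the hunk above does not show the loop body.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <optional>

// Hypothetical stand-in for amdgpu.memory_counter_wait: one optional
// threshold per counter, in the order load, store, ds, exp, tensor.
using CounterWaits = std::array<std::optional<int>, 5>;

// Merge two adjacent waits into one. Where both ops specify a counter, keep
// the smaller (stricter) threshold, as the test's expected output suggests.
// Where only one op specifies it, keep that value (an assumption; the diff
// does not show this case).
CounterWaits fuseWaits(const CounterWaits &lhs, const CounterWaits &rhs) {
  CounterWaits fused;
  for (std::size_t i = 0; i < fused.size(); ++i) {
    if (lhs[i] && rhs[i])
      fused[i] = std::min(*lhs[i], *rhs[i]);
    else
      fused[i] = lhs[i] ? lhs[i] : rhs[i];
  }
  return fused;
}
```

Applied to the two ops in the test, `load(1) store(2) ds(3) exp(4) tensor(5)` and `load(4) store(3) ds(2) exp(1) tensor(0)`, this rule gives `load(1) store(2) ds(2) exp(1) tensor(0)`, which matches the new `CHECK-SAME` line.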

llvmbot · 2025-12-08T16:18:07Z

@llvm/pr-subscribers-mlir-gpu

krzysz00

Approved, with a note that the compiler team's working on a marker/wait API for these things that'll eliminate the need to do our own counting.

llvm-ci · 2025-12-08T19:51:28Z

LLVM Buildbot has detected a new failure on builder ppc64le-mlir-rhel-clang running on ppc64le-mlir-rhel-test while building mlir at step 3 "clean-build-dir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/129/builds/34580

Here is the relevant piece of the build log for the reference

Step 3 (clean-build-dir) failure: Delete failed. (failure) (timed out)
Step 6 (test-build-check-mlir-build-only-check-mlir) failure: 1200 seconds without output running [b'ninja', b'check-mlir'], attempting to kill
...
PASS: MLIR :: Analysis/DataFlow/test-last-modified-callgraph.mlir (3729 of 3740)
PASS: MLIR :: Pass/ir-printing-file-tree.mlir (3730 of 3740)
PASS: MLIR-Unit :: IR/./MLIRIRTests/0/131 (3731 of 3740)
PASS: MLIR-Unit :: Interfaces/./MLIRInterfacesTests/11/22 (3732 of 3740)
PASS: MLIR-Unit :: Interfaces/./MLIRInterfacesTests/13/22 (3733 of 3740)
PASS: MLIR-Unit :: Pass/./MLIRPassTests/10/13 (3734 of 3740)
PASS: MLIR-Unit :: IR/./MLIRIRTests/39/131 (3735 of 3740)
PASS: MLIR-Unit :: Interfaces/./MLIRInterfacesTests/12/22 (3736 of 3740)
PASS: MLIR-Unit :: IR/./MLIRIRTests/38/131 (3737 of 3740)
PASS: MLIR :: mlir-reduce/dce-test.mlir (3738 of 3740)
command timed out: 1200 seconds without output running [b'ninja', b'check-mlir'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=2984.808123

Hardcode84 added 2 commits December 8, 2025 17:10

[mlir][amdgpu] memory_counter_wait tensor counter support

31a4218

test

ff9da12

Hardcode84 requested a review from tgymnich December 8, 2025 16:17

Hardcode84 requested review from krzysz00 and kuhar as code owners December 8, 2025 16:17

llvmbot added backend:AMDGPU mlir:gpu mlir mlir:amdgpu labels Dec 8, 2025

krzysz00 approved these changes Dec 8, 2025

tgymnich approved these changes Dec 8, 2025

kuhar approved these changes Dec 8, 2025

Hardcode84 merged commit f88d060 into llvm:main Dec 8, 2025
10 checks passed

Hardcode84 deleted the tensor-waitcnt branch December 8, 2025 17:02

honeygoyal pushed a commit to honeygoyal/llvm-project that referenced this pull request Dec 9, 2025

[mlir][amdgpu] memory_counter_wait tensor counter support (llvm#171153)

d7597b5
