[mlir][gpu] Refactor GpuOpsToROCDLOps pass interface (NFC) #157402

pabloantoniom · 2025-09-08T08:03:27Z

This PR deletes the createLowerGpuOpsToROCDLOpsPass constructor from
the .td file, making the createConvertGpuOpsToROCDLOps pass available to
users. This has the following effects:

- createLowerGpuOpsToROCDLOpsPass is not available anymore. Instead,
  createConvertGpuOpsToROCDLOps should be used. This makes the interface
  consistent with ConvertGpuOpsToNVVMOps.
- To call createConvertGpuOpsToROCDLOps, the options must be passed
  via ConvertGpuOpsToROCDLOpsOptions. This has the side effect of
  making the allowed-dialects option available, which was not
  accessible via C++ before.
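
For readers migrating call sites, the shape of the change can be sketched as follows. This is a self-contained mock, not the real MLIR API: `RocdlOptionsSketch`, `PassSketch`, and `createPassSketch` are hypothetical stand-ins for the generated options struct and factory, and the rule that an empty `allowedDialects` list means "no restriction" is an assumption of the sketch.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-in for a TableGen-generated options struct:
// one field per pass option, each with a default value.
struct RocdlOptionsSketch {
  std::string chipset = "gfx900";
  unsigned indexBitwidth = 0; // 0 stands in for "derive from data layout"
  bool useBarePtrCallConv = false;
  std::vector<std::string> allowedDialects; // assumption: empty = allow all
};

// Hypothetical stand-in for the pass object itself.
struct PassSketch {
  RocdlOptionsSketch options;

  // Returns true if patterns from `dialect` may be populated.
  bool allows(const std::string &dialect) const {
    if (options.allowedDialects.empty())
      return true;
    for (const std::string &name : options.allowedDialects)
      if (name == dialect)
        return true;
    return false;
  }
};

// A single options argument replaces the old positional parameter list.
inline std::unique_ptr<PassSketch> createPassSketch(RocdlOptionsSketch options) {
  assert(!options.chipset.empty() && "chipset must be set");
  auto pass = std::make_unique<PassSketch>();
  pass->options = std::move(options);
  return pass;
}

} // namespace sketch
```

A caller fills in only the fields it cares about and passes the struct to the factory. With the real pass, the corresponding names would be ConvertGpuOpsToROCDLOpsOptions and createConvertGpuOpsToROCDLOps; the exact field names should be checked against the generated header.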

The `convert-gpu-to-rocdl` pass provides the option `allowed-dialects`, which allows users to control which dialects can be used to populate conversions. This PR adds a C++ argument to createLowerGpuOpsToROCDLOpsPass, so that this option can also be controlled programmatically when creating the pass.

llvmbot · 2025-09-08T08:04:00Z

@llvm/pr-subscribers-mlir-gpu

@llvm/pr-subscribers-mlir

Author: Pablo Antonio Martinez (pabloantoniom)

Changes

The convert-gpu-to-rocdl pass provides the option allowed-dialects, which allows users to control which dialects can be used to populate conversions.

This PR adds a C++ argument to createLowerGpuOpsToROCDLOpsPass, so that this option can also be controlled programmatically when creating the pass.

cc: @dhernandez0

Full diff: https://github.com/llvm/llvm-project/pull/157402.diff

2 Files Affected:

(modified) mlir/include/mlir/Conversion/GPUToROCDL/GPUToROCDLPass.h (+5-1)
(modified) mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp (+15-8)

diff --git a/mlir/include/mlir/Conversion/GPUToROCDL/GPUToROCDLPass.h b/mlir/include/mlir/Conversion/GPUToROCDL/GPUToROCDLPass.h
index 291b809071ce9..a6099bde2a70e 100644
--- a/mlir/include/mlir/Conversion/GPUToROCDL/GPUToROCDLPass.h
+++ b/mlir/include/mlir/Conversion/GPUToROCDL/GPUToROCDLPass.h
@@ -10,6 +10,8 @@
 
 #include "mlir/Conversion/GPUToROCDL/Runtimes.h"
 #include "mlir/Conversion/LLVMCommon/LoweringOptions.h"
+#include "llvm/ADT/DenseSet.h"
+#include <cstddef>
 #include <memory>
 
 namespace mlir {
@@ -50,7 +52,9 @@ createLowerGpuOpsToROCDLOpsPass(
     const std::string &chipset = "gfx900",
     unsigned indexBitwidth = kDeriveIndexBitwidthFromDataLayout,
     bool useBarePtrCallConv = false,
-    gpu::amd::Runtime runtime = gpu::amd::Runtime::Unknown);
+    gpu::amd::Runtime runtime = gpu::amd::Runtime::Unknown,
+    const std::optional<llvm::SmallDenseSet<llvm::StringRef>> &allowedDialects =
+        std::nullopt);
 
 } // namespace mlir
 
diff --git a/mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp b/mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp
index 807d1f52ee69b..965089df0303e 100644
--- a/mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp
+++ b/mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp
@@ -288,9 +288,10 @@ struct GPUShuffleOpLowering : public ConvertOpToLLVMPattern<gpu::ShuffleOp> {
 struct LowerGpuOpsToROCDLOpsPass final
     : public impl::ConvertGpuOpsToROCDLOpsBase<LowerGpuOpsToROCDLOpsPass> {
   LowerGpuOpsToROCDLOpsPass() = default;
-  LowerGpuOpsToROCDLOpsPass(const std::string &chipset, unsigned indexBitwidth,
-                            bool useBarePtrCallConv,
-                            gpu::amd::Runtime runtime) {
+  LowerGpuOpsToROCDLOpsPass(
+      const std::string &chipset, unsigned indexBitwidth,
+      bool useBarePtrCallConv, gpu::amd::Runtime runtime,
+      std::optional<llvm::SmallDenseSet<StringRef>> allowedDialects) {
     if (this->chipset.getNumOccurrences() == 0)
       this->chipset = chipset;
     if (this->indexBitwidth.getNumOccurrences() == 0)
@@ -299,6 +300,12 @@ struct LowerGpuOpsToROCDLOpsPass final
       this->useBarePtrCallConv = useBarePtrCallConv;
     if (this->runtime.getNumOccurrences() == 0)
       this->runtime = runtime;
+    if (this->allowedDialects.getNumOccurrences() == 0 &&
+        allowedDialects.has_value()) {
+      for (auto &str : allowedDialects.value()) {
+        this->allowedDialects.push_back(str.str());
+      }
+    }
   }
 
   void getDependentDialects(DialectRegistry &registry) const override {
@@ -501,10 +508,10 @@ void mlir::populateGpuToROCDLConversionPatterns(
 }
 
 std::unique_ptr<OperationPass<gpu::GPUModuleOp>>
-mlir::createLowerGpuOpsToROCDLOpsPass(const std::string &chipset,
-                                      unsigned indexBitwidth,
-                                      bool useBarePtrCallConv,
-                                      gpu::amd::Runtime runtime) {
+mlir::createLowerGpuOpsToROCDLOpsPass(
+    const std::string &chipset, unsigned indexBitwidth, bool useBarePtrCallConv,
+    gpu::amd::Runtime runtime,
+    const std::optional<llvm::SmallDenseSet<StringRef>> &allowedDialects) {
   return std::make_unique<LowerGpuOpsToROCDLOpsPass>(
-      chipset, indexBitwidth, useBarePtrCallConv, runtime);
+      chipset, indexBitwidth, useBarePtrCallConv, runtime, allowedDialects);
 }
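
The constructor in this first revision of the diff applies each programmatic argument only when the matching command-line option was not given. A minimal sketch of that precedence rule for the list-valued option, using a hypothetical `ListOptionSketch` type in place of the real pass option class (only `getNumOccurrences` is taken from the diff; everything else is an assumption of the sketch):

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-in for a list-valued pass option that remembers how
// many times it was set while parsing the command line.
struct ListOptionSketch {
  std::vector<std::string> values;
  int occurrences = 0;

  int getNumOccurrences() const { return occurrences; }

  // Simulates parsing one value from the command line.
  void parse(const std::string &value) {
    assert(!value.empty() && "option value must not be empty");
    values.push_back(value);
    ++occurrences;
  }
};

// Copies the programmatic value into the option only if the command line
// did not set it and a value was actually supplied.
inline void applyProgrammaticValue(
    ListOptionSketch &option,
    const std::optional<std::set<std::string>> &allowedDialects) {
  if (option.getNumOccurrences() == 0 && allowedDialects.has_value()) {
    for (const std::string &name : *allowedDialects)
      option.values.push_back(name);
  }
}

} // namespace sketch
```

In this sketch a value parsed from the command line always wins, and passing `std::nullopt` leaves the option untouched.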

joker-eph · 2025-09-08T13:41:29Z

mlir/include/mlir/Conversion/GPUToROCDL/GPUToROCDLPass.h

-    gpu::amd::Runtime runtime = gpu::amd::Runtime::Unknown);
+    gpu::amd::Runtime runtime = gpu::amd::Runtime::Unknown,
+    const std::optional<llvm::SmallDenseSet<llvm::StringRef>> &allowedDialects =
+        std::nullopt);


TableGen already generates all the suitable creation functions, so we should be able to remove this entirely and use the generated one instead (what is missing?)

I'm not sure I understand your suggestion, sorry. If you try to call createLowerGpuOpsToROCDLOpsPass in C++ (i.e., from the pass manager), prior to this PR there was no way to specify allowedDialects, so this PR adds support for that (unless I'm missing some TableGen functionality that I don't know about).

Mehdi is suggesting that you remove https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/Conversion/Passes.td#L627 and let tablegen generate the create* calls.

Right, I understood that tablegen was supposed to generate the same create* function as the handwritten one, thanks @fabianmcg for the clarification. I have pushed a commit to address this.

fabianmcg · 2025-09-08T14:52:16Z

@pabloantoniom do you have a use case for this, or is it just an improvement? I'm wondering whether it's the latter, because I'm thinking of working on removing* these passes and keeping only convert-to-llvm.

*There would be a transition period where the pass names would be available as pipelines.

pabloantoniom · 2025-09-08T16:12:20Z

Yes, there is a use case for this. You can check this downstream user.

fabianmcg · 2025-09-08T16:20:00Z

mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp

  LowerGpuOpsToROCDLOpsPass() = default;
-  LowerGpuOpsToROCDLOpsPass(const std::string &chipset, unsigned indexBitwidth,
-                            bool useBarePtrCallConv,
-                            gpu::amd::Runtime runtime) {
+  LowerGpuOpsToROCDLOpsPass(ConvertGpuOpsToROCDLOpsOptions options)
+      : ConvertGpuOpsToROCDLOpsBase(options) {}
+  LowerGpuOpsToROCDLOpsPass(
+      const std::string &chipset, unsigned indexBitwidth,
+      bool useBarePtrCallConv, gpu::amd::Runtime runtime,
+      std::optional<llvm::SmallDenseSet<StringRef>> allowedDialects) {
    if (this->chipset.getNumOccurrences() == 0)


Do you need these? Can't we remove?

Absolutely, thanks for the suggestion. By removing this we also match what LowerGpuOpsToNVVMOpsPass does so it should be easier for you if you want to do some cleanup later.

fabianmcg

Preemptively blocking while these get addressed:

Please remove the username from the description:
https://discourse.llvm.org/t/forbidding-username-in-commits/86997

Fix description, and the PR title.

And see my other comment.

joker-eph · 2025-09-08T17:41:35Z

mlir/include/mlir/Conversion/GPUToROCDL/GPUToROCDLPass.h

 #include "mlir/Conversion/LLVMCommon/LoweringOptions.h"
+#include "mlir/Pass/Pass.h"
+#include "llvm/ADT/DenseSet.h"
+#include <cstddef>


Now you're only removing something from the header, so I'm not sure why you need to add new includes?

Ah, I forgot to remove those after deleting the constructor in the .td. Only Pass is needed, since now TableGen is creating the create* function which uses mlir::Pass, but I have removed the include and added Pass to the namespace below instead.
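
The choice described in this reply (declaring Pass in the namespace rather than including its header) works because a function declaration that returns `std::unique_ptr<T>` only needs `T` to be declared, not defined. A minimal self-contained sketch of that idea, with hypothetical names (`sketch::Pass`, `createSketchPass`) standing in for the real MLIR classes:

```cpp
#include <cassert>
#include <memory>
#include <string>

// --- What a header could contain (names are hypothetical) ---
namespace sketch {
class Pass; // forward declaration; the full definition is not needed here
std::unique_ptr<Pass> createSketchPass();
} // namespace sketch

// --- What an implementation file could contain ---
namespace sketch {
class Pass {
public:
  virtual ~Pass() = default;
  virtual std::string getName() const { return "sketch-pass"; }
};

std::unique_ptr<Pass> createSketchPass() {
  auto pass = std::make_unique<Pass>();
  assert(pass && "factory must return a pass");
  return pass;
}
} // namespace sketch
```

Code that only forwards the returned pointer can get by with the forward declaration; code that calls methods on the pass still needs the full definition.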

joker-eph · 2025-09-08T17:42:53Z

I think the question behind the question is: what can you do with this pass that you can't do with convert-to-llvm and what would it take to make it possible to do with convert-to-llvm?

krzysz00 · 2025-09-08T19:25:13Z

1. The pass loads in target-specific patterns - amdgpu-to-rocdl and the gpu to rocdl patterns themselves. Those could maybe be run before general convert-to-llvm, but they depend on the LLVM conversion infrastructure. However, the AMDGPU dialect to ROCDL dialect conversions add entries to the LLVM type converter (for pointer address spaces attributes), and so those would want to be run at the same time as the rest of the LLVM conversion patterns ... which there's no mechanism for with convert-to-llvm
2. This pass sets up other address space handling. There's a call to populateGpuMemorySpaceAttributeConversions in here, which maps memory spaces like #gpu.address_space<workgroup> to their correct platform-specific values. There's no generic mechanism that I know of for populating that mapping in a convert-to-llvm usage.
3. There's the section that starts

   // Manually rewrite known block size attributes so the LLVMIR translation
   // infrastructure can pick them up.

   which compensates for limitations of the gpu-to-llvm rewrites being generic.

In short, convert-gpu-to-rocdl is a pass that does a bunch of non-trivial target-specific setup (and, in one case, post-processing, though maybe that's a bit of a hack) work and adds extra conversion patterns so that we're converting to AMDGPU LLVM, not generic LLVM.

If convert-to-llvm were set up in a way that would let us put this sort of setup on some entity in the context (e.g., a target attribute) and that were plumbed through reliably, we wouldn't need this pass. For now, we do.
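
To make the memory-space point concrete, the kind of mapping being described can be sketched as a plain function. This is a mock under stated assumptions, not the MLIR implementation: the enum and function names are hypothetical, and the numbers are the AMDGPU LLVM address spaces (1 for global, 3 for LDS/workgroup, 5 for private).

```cpp
#include <cassert>

namespace sketch {

// Hypothetical stand-in for the generic GPU dialect address spaces.
enum class GpuAddressSpace { Global, Workgroup, Private };

// Maps a generic GPU address space to the AMDGPU LLVM address space number.
inline unsigned toAmdgpuAddressSpace(GpuAddressSpace space) {
  switch (space) {
  case GpuAddressSpace::Global:
    return 1; // global memory
  case GpuAddressSpace::Workgroup:
    return 3; // LDS, shared within a workgroup
  case GpuAddressSpace::Private:
    return 5; // per-thread scratch
  }
  assert(false && "unknown address space");
  return 0;
}

} // namespace sketch
```

A generic conversion has no way of knowing these target-specific numbers on its own, which is why something has to supply the mapping to the type converter.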

krzysz00 · 2025-09-08T19:26:48Z

Oh, and 4., it sets the data layout to the relevant AMDGPU string.

(or, to give the flippant answer, see LowerGpuOpsToROCDLOpsPass::runOnOperation() )

fabianmcg · 2025-09-08T19:35:39Z

@krzysz00 almost all the underlying technical issues to support what you describe have been solved for a while:
https://github.com/llvm/llvm-project/blob/main/mlir/test/Conversion/GPUToNVVM/gpu-to-nvvm-target-attr.mlir#L5-L24

It's just I haven't had the time to add it for ROCDL, and there were a couple of lingering issues on the data layout side that prevented me from doing it, but some of those were solved here #145899 .

My question was directed more towards: is there some extra complication I need to be aware of when removing GPU to ROCDL, or was this patch mostly an NFC quality-of-life improvement?

krzysz00 · 2025-09-08T23:14:23Z

This reads as NFC to me

pabloantoniom · 2025-09-09T06:18:45Z

Good catch, thanks. Removed username from description, PR title is fine.

pabloantoniom · 2025-09-09T06:41:54Z

In my opinion, my original commit was indeed NFC. However, after Mehdi's suggestion, I'm not sure anymore, since it changes the interface, forcing users of createLowerGpuOpsToROCDLOpsPass (with 4 arguments) to use createConvertGpuOpsToROCDLOps with 1 argument (ConvertGpuOpsToROCDLOpsOptions). This also has the benefit of making the ROCDL pass consistent with the NVVM one, as the latter does not have the constructor in TableGen, whereas the former had it prior to this commit. I will make this clear in the commit description to make everyone aware.

Coming back to the NFC discussion, I guess it depends on where you draw the boundary of NFC. I looked it up here, but it does not give enough context; maybe this is an opportunity to improve it?

pabloantoniom · 2025-09-09T08:16:26Z

Thank you all for the reviews. I have updated PR title and description to make it more consistent with the latest changes. Hope this is compatible with what you were expecting @fabianmcg

fabianmcg

LGTM, thanks for the cleanup!

joker-eph · 2025-09-09T15:20:23Z

In my opinion, my original commit was indeed NFC. However, after Mehdi's suggestion, I'm not sure anymore, since it's changing the interface,

NFC is about the compiler behavior, not the API changes. Our APIs change all the time, but if there is no test change, it better be NFC.

krzysz00

LGTM here too

pabloantoniom requested a review from Hardcode84 September 8, 2025 08:03

pabloantoniom requested a review from fabianmcg as a code owner September 8, 2025 08:03

llvmbot added mlir:gpu mlir labels Sep 8, 2025

pabloantoniom requested review from krzysz00 and kuhar September 8, 2025 08:03

kuhar reviewed Sep 8, 2025

mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp (outdated)

mlir/include/mlir/Conversion/GPUToROCDL/GPUToROCDLPass.h (outdated)

joker-eph reviewed Sep 8, 2025

[Refactor] Force users to use create* function autogenerated from tab…

5572996

…legen and delete handwritten version

fabianmcg reviewed Sep 8, 2025

kuhar approved these changes Sep 8, 2025

fabianmcg self-requested a review September 8, 2025 16:25

fabianmcg requested changes Sep 8, 2025

joker-eph reviewed Sep 8, 2025

pabloantoniom added 2 commits September 9, 2025 02:00

Cleanup GPUToROCDLPass header

5dbb951

Delete constructors and use the ones from base

a118652

pabloantoniom changed the title ~~[mlir][gpu] GPUToROCDL: Add C++ argument to populate allowedDialects~~ [mlir][gpu] Refactor GpuOpsToROCDLOps pass interface Sep 9, 2025

pabloantoniom requested a review from fabianmcg September 9, 2025 08:16

pabloantoniom requested a review from joker-eph September 9, 2025 08:16

fabianmcg approved these changes Sep 9, 2025

joker-eph approved these changes Sep 9, 2025

krzysz00 approved these changes Sep 10, 2025

pabloantoniom changed the title ~~[mlir][gpu] Refactor GpuOpsToROCDLOps pass interface~~ [mlir][gpu] Refactor GpuOpsToROCDLOps pass interface (NFC) Sep 10, 2025

pabloantoniom merged commit dd04668 into llvm:main Sep 10, 2025
9 checks passed
