ravil-mobile
Contributor

@ravil-mobile ravil-mobile commented Oct 8, 2025

This patch adds some missing s.barrier instructions for handling named barriers to the ROCDL dialect.

Specifically:

@llvm.amdgcn.s.barrier.init - s_barrier_init
@llvm.amdgcn.s.barrier.join - s_barrier_join
@llvm.amdgcn.s.barrier.leave - s_barrier_leave
@llvm.amdgcn.s.barrier.signal.isfirst - s_barrier_signal_isfirst
@llvm.amdgcn.s.get.barrier.state - s_get_barrier_state

Tests:

  • Added lit-tests to check MLIR -> LLVM lowering

cc @krzysz00, @amd-eochoalo

@llvmbot
Member

llvmbot commented Oct 8, 2025

@llvm/pr-subscribers-mlir-llvm

@llvm/pr-subscribers-mlir

Author: None (ravil-mobile)

Changes

This patch adds some missing s.barrier instructions for handling named barriers to the ROCDL dialect.

Specifically:

@llvm.amdgcn.s.barrier.init - s_barrier_init
@llvm.amdgcn.s.barrier.join - s_barrier_join
@llvm.amdgcn.s.barrier.leave - s_barrier_leave
@llvm.amdgcn.s.barrier.signal.isfirst - s_barrier_signal_isfirst

Tests:

  • Added lit-tests to check MLIR -> LLVM lowering

cc @krzysz00, @amd-eochoalo


Full diff: https://github.com/llvm/llvm-project/pull/162488.diff

3 Files Affected:

  • (modified) mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td (+32-1)
  • (modified) mlir/test/Dialect/LLVMIR/rocdl.mlir (+35)
  • (modified) mlir/test/Target/LLVMIR/rocdl.mlir (+35)
diff --git a/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td b/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
index db1b7e3af62fd..7560b61137afc 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
@@ -292,18 +292,50 @@ def ROCDL_BarrierOp : ROCDL_Op<"barrier"> {
   let assemblyFormat = "attr-dict";
 }
 
+def ROCDLBufferLDS : LLVM_PointerInAddressSpace<3>;
+
+def ROCDL_BarrierInitOp : ROCDL_IntrOp<"s.barrier.init", [], [], [Pure], 0, 0, 0, 0, [1], ["id"]>,
+  Arguments<(ins Arg<ROCDLBufferLDS, "", [MemRead]>:$ptr, I32Attr:$id)> {
+  let results = (outs);
+  let assemblyFormat = "$ptr `,` $id attr-dict";
+}
+
 def ROCDL_BarrierSignalOp : ROCDL_ConcreteNonMemIntrOp<"s.barrier.signal", [], 0, [0], ["id"]>,
   Arguments<(ins I32Attr:$id)> {
   let results = (outs);
   let assemblyFormat = "$id attr-dict";
 }
 
+def ROCDL_BarrierSignalVarOp : ROCDL_IntrOp<"s.barrier.signal.var", [], [], [Pure], 0, 0, 0, 0, [1], ["id"]>,
+  Arguments<(ins Arg<ROCDLBufferLDS, "", [MemRead]>:$ptr, I32Attr:$id)> {
+  let results = (outs);
+  let assemblyFormat = "$ptr `,` $id attr-dict";
+}
+
+def ROCDL_BarrierJoinOp : ROCDL_IntrOp<"s.barrier.join", [], [], [Pure], 0>,
+  Arguments<(ins Arg<ROCDLBufferLDS, "", [MemRead]>:$ptr)> {
+  let results = (outs);
+  let assemblyFormat = "$ptr attr-dict";
+}
+
+def ROCDL_BarrierLeaveOp : ROCDL_ConcreteNonMemIntrOp<"s.barrier.leave", [], 0, [0], ["id"]>,
+  Arguments<(ins I16Attr:$id)> {
+  let results = (outs);
+  let assemblyFormat = "$id attr-dict";
+}
+
 def ROCDL_BarrierWaitOp : ROCDL_ConcreteNonMemIntrOp<"s.barrier.wait", [], 0, [0], ["id"]>,
   Arguments<(ins I16Attr:$id)> {
   let results = (outs);
   let assemblyFormat = "$id attr-dict";
 }
 
+def ROCDL_BarrierSignalIsfirstOp : ROCDL_ConcreteNonMemIntrOp<"s.barrier.signal.isfirst", [], 1, [0], ["id"]>,
+  Arguments<(ins I32Attr:$id)> {
+  let results = (outs I1:$res);
+  let assemblyFormat = "$id attr-dict `:` type($res)";
+}
+
 def ROCDL_WaitDscntOp: ROCDL_ConcreteNonMemIntrOp<"s.wait.dscnt", [], 0, [0], ["count"]>,
   Arguments<(ins I16Attr:$count)> {
   let summary = "Wait until DSCNT is less than or equal to `count`";
@@ -497,7 +529,6 @@ def ROCDL_wmma_i32_16x16x32_iu4 : ROCDL_Wmma_IntrOp<"wmma.i32.16x16x32.iu4", [1]
 // LDS transpose intrinsics (available in GFX950)
 
 def ROCDLGlobalBuffer : LLVM_PointerInAddressSpace<1>;
-def ROCDLBufferLDS : LLVM_PointerInAddressSpace<3>;
 
 class ROCDL_LDS_Read_Tr_IntrOp<string mnemonic> :
   ROCDL_IntrOp<mnemonic, [1], [], [], 1, 0, 1> {
diff --git a/mlir/test/Dialect/LLVMIR/rocdl.mlir b/mlir/test/Dialect/LLVMIR/rocdl.mlir
index a88b59aeb61b2..9e13baf77a689 100644
--- a/mlir/test/Dialect/LLVMIR/rocdl.mlir
+++ b/mlir/test/Dialect/LLVMIR/rocdl.mlir
@@ -951,6 +951,13 @@ llvm.func @rocdl.s.barrier() {
   llvm.return
 }
 
+llvm.func @rocdl.s.barrier.init(%ptr : !llvm.ptr<3>) {
+  // CHECK-LABEL: rocdl.s.barrier.init
+  // CHECK: rocdl.s.barrier.init %[[PTR:.+]], 1
+  rocdl.s.barrier.init %ptr, 1
+  llvm.return
+}
+
 llvm.func @rocdl.s.barrier.signal() {
   // CHECK-LABEL: rocdl.s.barrier.signal
   // CHECK: rocdl.s.barrier.signal -1
@@ -958,6 +965,27 @@ llvm.func @rocdl.s.barrier.signal() {
   llvm.return
 }
 
+llvm.func @rocdl.s.barrier.signal.var(%ptr : !llvm.ptr<3>) {
+  // CHECK-LABEL: rocdl.s.barrier.signal.var
+  // CHECK: rocdl.s.barrier.signal.var %[[PTR:.+]], 1
+  rocdl.s.barrier.signal.var %ptr, 1
+  llvm.return
+}
+
+llvm.func @rocdl.s.barrier.join(%ptr : !llvm.ptr<3>) {
+  // CHECK-LABEL: rocdl.s.barrier.join
+  // CHECK: rocdl.s.barrier.join %[[PTR:.+]]
+  rocdl.s.barrier.join %ptr
+  llvm.return
+}
+
+llvm.func @rocdl.s.barrier.leave() {
+  // CHECK-LABEL: rocdl.s.barrier.leave
+  // CHECK: rocdl.s.barrier.leave 1
+  rocdl.s.barrier.leave 1
+  llvm.return
+}
+
 llvm.func @rocdl.s.barrier.wait() {
   // CHECK-LABEL: rocdl.s.barrier.wait
   // CHECK: rocdl.s.barrier.wait -1
@@ -965,6 +993,13 @@ llvm.func @rocdl.s.barrier.wait() {
   llvm.return
 }
 
+llvm.func @rocdl.s.barrier.signal.isfirst() {
+  // CHECK-LABEL: rocdl.s.barrier.signal.isfirst
+  // CHECK: rocdl.s.barrier.signal.isfirst 1
+  %0 = rocdl.s.barrier.signal.isfirst 1 : i1
+  llvm.return
+}
+
 llvm.func @rocdl.s.wait.dscnt() {
   // CHECK-LABEL: rocdl.s.wait.dscnt
   // CHECK: rocdl.s.wait.dscnt 0
diff --git a/mlir/test/Target/LLVMIR/rocdl.mlir b/mlir/test/Target/LLVMIR/rocdl.mlir
index 1c0c2eba002aa..3eb404bcb8a40 100644
--- a/mlir/test/Target/LLVMIR/rocdl.mlir
+++ b/mlir/test/Target/LLVMIR/rocdl.mlir
@@ -192,6 +192,13 @@ llvm.func @rocdl.barrier() {
   llvm.return
 }
 
+llvm.func @rocdl.s.barrier.init(%ptr : !llvm.ptr<3>) {
+  // CHECK-LABEL: rocdl.s.barrier.init
+  // CHECK: call void @llvm.amdgcn.s.barrier.init(ptr addrspace(3) %[[PTR:.+]], i32 1)
+  rocdl.s.barrier.init %ptr, 1
+  llvm.return
+}
+
 llvm.func @rocdl.s.barrier.signal() {
   // CHECK-LABEL: rocdl.s.barrier.signal
   // CHECK-NEXT: call void @llvm.amdgcn.s.barrier.signal(i32 -1)
@@ -199,6 +206,27 @@ llvm.func @rocdl.s.barrier.signal() {
   llvm.return
 }
 
+llvm.func @rocdl.s.barrier.signal.var(%ptr : !llvm.ptr<3>) {
+  // CHECK-LABEL: rocdl.s.barrier.signal.var
+  // CHECK: call void @llvm.amdgcn.s.barrier.signal.var(ptr addrspace(3) %[[PTR:.+]], i32 1)
+  rocdl.s.barrier.signal.var %ptr, 1
+  llvm.return
+}
+
+llvm.func @rocdl.s.barrier.join(%ptr : !llvm.ptr<3>) {
+  // CHECK-LABEL: rocdl.s.barrier.join
+  // CHECK: call void @llvm.amdgcn.s.barrier.join(ptr addrspace(3) %[[PTR:.+]])
+  rocdl.s.barrier.join %ptr
+  llvm.return
+}
+
+llvm.func @rocdl.s.barrier.leave() {
+  // CHECK-LABEL: rocdl.s.barrier.leave
+  // CHECK: call void @llvm.amdgcn.s.barrier.leave(i16 1)
+  rocdl.s.barrier.leave 1
+  llvm.return
+}
+
 llvm.func @rocdl.s.barrier.wait() {
   // CHECK-LABEL: rocdl.s.barrier.wait
   // CHECK-NEXT: call void @llvm.amdgcn.s.barrier.wait(i16 -1)
@@ -206,6 +234,13 @@ llvm.func @rocdl.s.barrier.wait() {
   llvm.return
 }
 
+llvm.func @rocdl.s.barrier.signal.isfirst() {
+  // CHECK-LABEL: rocdl.s.barrier.signal.isfirst
+  // CHECK:  %[[OUT:.+]] = call i1 @llvm.amdgcn.s.barrier.signal.isfirst(i32 1)
+  %0 = rocdl.s.barrier.signal.isfirst 1 : i1
+  llvm.return
+}
+
 llvm.func @rocdl.s.wait.dscnt() {
   // CHECK-LABEL: rocdl.s.wait.dscnt
   // CHECK-NEXT: call void @llvm.amdgcn.s.wait.dscnt(i16 0)

@@ -292,18 +292,56 @@ def ROCDL_BarrierOp : ROCDL_Op<"barrier"> {
let assemblyFormat = "attr-dict";
}

def ROCDLBufferLDS : LLVM_PointerInAddressSpace<3>;

def ROCDL_BarrierInitOp : ROCDL_IntrOp<"s.barrier.init", [], [], [Pure], 0, 0, 0, 0, [1], ["id"]>,
Member

Is this pure? I don't see any results returned by this op

Contributor

Yeah, I don't think this is pure

Contributor

... Heck, the LLVM intrinsic has IntrHasSideEffects

Member

Yeah, this should not be pure. This op does not have a return value, so it will get eliminated if it is pure.

Contributor Author

Agreed. Removed the trait. As far as I can see, there is no HasSideEffect trait in MLIR, so I left the list of traits empty.

Member

@kuhar kuhar left a comment

}

def ROCDL_BarrierJoinOp : ROCDL_IntrOp<"s.barrier.join", [], [], [Pure], 0>,
Arguments<(ins Arg<ROCDLBufferLDS, "", [MemRead]>:$ptr)> {
Contributor

I don't think it's necessary to mark this as MemRead - the LLVM intrinsics are marked NoMem.

Contributor Author

Yep, you are right. All these functions are marked with the IntrNoMem trait in LLVM. Removed it.
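For illustration only, here is a sketch of how the join op might look once both review points are applied, with the trait list emptied instead of `[Pure]` and the `[MemRead]` annotation dropped from the pointer argument. This is an assumption about the shape of the revision based on the replies in this thread, not a copy of the merged code:

```tablegen
// Sketch only (assumed shape of the revised definition, not the merged code).
// Changes relative to the diff:
//   - the trait list (fourth template argument) is [] instead of [Pure]
//   - $ptr is a plain ROCDLBufferLDS operand, without Arg<..., [MemRead]>
def ROCDL_BarrierJoinOp : ROCDL_IntrOp<"s.barrier.join", [], [], [], 0>,
  Arguments<(ins ROCDLBufferLDS:$ptr)> {
  let results = (outs);
  let assemblyFormat = "$ptr attr-dict";
}
```

With no traits and no declared memory effects, MLIR should treat the op conservatively as having unknown side effects, so a result-less barrier op is not removed as trivially dead.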

@lialan
Member

lialan commented Oct 9, 2025

All LLVM intrinsics added in this PR have the IntrHasSideEffects property.

def ROCDLBufferLDS : LLVM_PointerInAddressSpace<3>;

def ROCDL_BarrierInitOp : ROCDL_IntrOp<"s.barrier.init", [], [], [Pure], 0, 0, 0, 0, [1], ["id"]>,
Arguments<(ins Arg<ROCDLBufferLDS, "", [MemRead]>:$ptr, I32Attr:$id)> {
Member

See: https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/AMDGPU/s-barrier.ll#L83
s_barrier_init takes a single m0 operand, so it should not read memory.

The same applies to signal.var and join.

@lialan
Member

lialan commented Oct 9, 2025

The discrepancy here is that barriers need to be defined as global variables at the LLVM IR level.
However, those intrinsics do not actually read a global address in the emitted assembly (as can be seen in the lit tests).

@ravil-mobile
Contributor Author

Can you add descriptions saying which gfxip version introduced these? Example: https://github.com/ravil-mobile/llvm-project/blob/5135c4fdae71ef27192ec190298f3a00ba85ae64/mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td#L1043-L1045

Done

@lialan
Member

lialan commented Oct 9, 2025

Just want to make sure: do you know whether, at the rocdl level, there is any way to declare an amdgcn.named.barrier?

Contributor

@krzysz00 krzysz00 left a comment

Overall lgtm

@krzysz00
Contributor

krzysz00 commented Oct 9, 2025

@lialan To answer your question, the declarations I see in tests for these named barriers are

@bar3 = internal addrspace(3) global target("amdgcn.named.barrier", 0) poison

So, at the MLIR level, you just need a poison global value, in LDS, of type
!llvm.target<"amdgcn.named.barrier", 0>

(No, I don't know what the 0 is for)

So future work will be adding amdgpu.named_barrier @symbol

@lialan
Member

lialan commented Oct 9, 2025

So, at the MLIR level, you just need a poison global value, in LDS, of type !llvm.target<"amdgcn.named.barrier", 0>

@krzysz00 Yeah, I do not see a wrapper for the amdgcn.named.barrier type in the rocdl dialect. Or do I not actually need one?

@krzysz00
Contributor

krzysz00 commented Oct 9, 2025

@lialan You don't need such a wrapper, just use llvm.target types.

(We can add a wrapper to AMDGPU, but we don't need one in rocdl)

@kuhar kuhar merged commit 3c080ed into llvm:main Oct 10, 2025
9 checks passed