[AMDGPU][Clang] use ScopeModel to translate integer scopes [NFC]#198250

Merged
ssahasra merged 2 commits into main from users/ssahasra/clang-scope-metadata
May 21, 2026

Conversation

Contributor

@ssahasra ssahasra commented May 18, 2026

The assumption here is that AMDGPU builtins (typically prefixed with __builtin_amdgcn) take their scope argument from the __MEMORY_SCOPE_* enumeration and not from the __HIP_MEMORY_SCOPE_* enumeration, which is how it should be.
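
To illustrate why the choice of enumeration matters, here is a minimal standalone sketch. The numeric values are assumptions about Clang's predefined macros and may differ between versions; the enum and function names are hypothetical and are not part of Clang. It shows that the same integer names a different scope depending on which enumeration is used to read it.

```cpp
#include <string_view>

// Assumed values of the generic __MEMORY_SCOPE_* macros; verify them against
// the Clang version in use before relying on them.
enum GenericScope : unsigned {
  GenericSystem = 0,
  GenericDevice = 1,
  GenericWorkgroup = 2,
  GenericWavefront = 3,
  GenericSingle = 4,
};

// Assumed values of the HIP-specific __HIP_MEMORY_SCOPE_* macros.
enum HipScope : unsigned {
  HipSingleThread = 1,
  HipWavefront = 2,
  HipWorkgroup = 3,
  HipAgent = 4,
  HipSystem = 5,
};

// Hypothetical helper: name the scope an integer denotes when it is read as a
// generic scope value.
constexpr std::string_view readAsGeneric(unsigned V) {
  switch (V) {
  case GenericSystem:    return "system";
  case GenericDevice:    return "device";
  case GenericWorkgroup: return "workgroup";
  case GenericWavefront: return "wavefront";
  case GenericSingle:    return "single thread";
  default:               return "invalid";
  }
}

// Hypothetical helper: name the scope the same integer denotes when it is
// read as a HIP scope value.
constexpr std::string_view readAsHip(unsigned V) {
  switch (V) {
  case HipSingleThread: return "single thread";
  case HipWavefront:    return "wavefront";
  case HipWorkgroup:    return "workgroup";
  case HipAgent:        return "device";
  case HipSystem:       return "system";
  default:              return "invalid";
  }
}

// Under these assumed values, 3 is "wavefront" in the generic enumeration but
// "workgroup" in the HIP enumeration.
static_assert(readAsGeneric(3) == "wavefront");
static_assert(readAsHip(3) == "workgroup");
```

If the values are as assumed, casting the raw integer straight to an internal scope type would silently pick the wrong scope for some inputs, which is the kind of mismatch an explicit scope model avoids.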

Assisted-By: Claude Opus 4.6

@ssahasra ssahasra requested review from Pierre-vh and changpeng May 18, 2026 08:13
@llvmorg-github-actions llvmorg-github-actions Bot added labels May 18, 2026: clang (Clang issues not falling into any other category), backend:AMDGPU, clang:codegen (IR generation bugs: mangling, exceptions, etc.)

llvmorg-github-actions Bot commented May 18, 2026

@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-clang-codegen

Author: Sameer Sahasrabuddhe (ssahasra)

Changes

Assisted-By: Claude Opus 4.6


Full diff: https://github.com/llvm/llvm-project/pull/198250.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp (+17-9)
  • (modified) clang/test/CodeGenOpenCL/builtins-amdgcn-gfx1250-load-monitor.cl (+20)
diff --git a/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp b/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
index 751cd9847bd31..ce4a53c768d35 100644
--- a/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+++ b/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
@@ -383,6 +383,22 @@ static llvm::AtomicOrdering mapCABIAtomicOrdering(unsigned AO) {
   llvm_unreachable("Unknown AtomicOrderingCABI enum");
 }
 
+/// Convert a __MEMORY_SCOPE_* integer constant to a metadata node containing
+/// the target-specific sync scope string.
+static llvm::MetadataAsValue *
+emitScopeMD(CodeGenFunction &CGF, unsigned ScopeInt,
+            llvm::AtomicOrdering AO =
+                llvm::AtomicOrdering::SequentiallyConsistent) {
+  AtomicScopeGenericModel ScopeModel;
+  SyncScope Scope = ScopeModel.map(ScopeInt);
+  StringRef ScopeStr = CGF.CGM.getTargetCodeGenInfo().getLLVMSyncScopeStr(
+      CGF.CGM.getLangOpts(), Scope, AO);
+  llvm::LLVMContext &Ctx = CGF.CGM.getLLVMContext();
+  llvm::MDNode *MD =
+      llvm::MDNode::get(Ctx, {llvm::MDString::get(Ctx, ScopeStr)});
+  return llvm::MetadataAsValue::get(Ctx, MD);
+}
+
 // For processing memory ordering and memory scope arguments of various
 // amdgcn builtins.
 // \p Order takes a C++11 compatible memory-ordering specifier and converts
@@ -927,22 +943,14 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
       break;
     }
 
-    LLVMContext &Ctx = CGM.getLLVMContext();
     llvm::Type *LoadTy = ConvertType(E->getType());
     llvm::Value *Addr = EmitScalarExpr(E->getArg(0));
 
     auto *AOExpr = cast<llvm::ConstantInt>(EmitScalarExpr(E->getArg(1)));
     auto *ScopeExpr = cast<llvm::ConstantInt>(EmitScalarExpr(E->getArg(2)));
-
-    auto Scope = static_cast<SyncScope>(ScopeExpr->getZExtValue());
     llvm::AtomicOrdering AO = mapCABIAtomicOrdering(AOExpr->getZExtValue());
 
-    StringRef ScopeStr = CGM.getTargetCodeGenInfo().getLLVMSyncScopeStr(
-        CGM.getLangOpts(), Scope, AO);
-
-    llvm::MDNode *MD =
-        llvm::MDNode::get(Ctx, {llvm::MDString::get(Ctx, ScopeStr)});
-    llvm::Value *ScopeMD = llvm::MetadataAsValue::get(Ctx, MD);
+    llvm::Value *ScopeMD = emitScopeMD(*this, ScopeExpr->getZExtValue(), AO);
     llvm::Function *F = CGM.getIntrinsic(IID, {LoadTy});
     return Builder.CreateCall(F, {Addr, AOExpr, ScopeMD});
   }
diff --git a/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx1250-load-monitor.cl b/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx1250-load-monitor.cl
index 8ecd6ba61a03e..fcc246cecd1dd 100644
--- a/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx1250-load-monitor.cl
+++ b/clang/test/CodeGenOpenCL/builtins-amdgcn-gfx1250-load-monitor.cl
@@ -64,3 +64,23 @@ v4i test_amdgcn_flat_load_monitor_b128(v4i* inptr)
 {
   return __builtin_amdgcn_flat_load_monitor_b128(inptr, __ATOMIC_RELAXED, __MEMORY_SCOPE_SYSTEM);
 }
+
+// CHECK-GFX1250-LABEL: @test_amdgcn_global_load_monitor_b32_wavefront(
+// CHECK-GFX1250-NEXT:  entry:
+// CHECK-GFX1250-NEXT:    [[TMP0:%.*]] = tail call i32 @llvm.amdgcn.global.load.monitor.b32.i32(ptr addrspace(1) [[INPTR:%.*]], i32 0, metadata [[META12:![0-9]+]])
+// CHECK-GFX1250-NEXT:    ret i32 [[TMP0]]
+//
+int test_amdgcn_global_load_monitor_b32_wavefront(global int* inptr)
+{
+  return __builtin_amdgcn_global_load_monitor_b32(inptr, __ATOMIC_RELAXED, __MEMORY_SCOPE_WVFRNT);
+}
+
+// CHECK-GFX1250-LABEL: @test_amdgcn_global_load_monitor_b32_single(
+// CHECK-GFX1250-NEXT:  entry:
+// CHECK-GFX1250-NEXT:    [[TMP0:%.*]] = tail call i32 @llvm.amdgcn.global.load.monitor.b32.i32(ptr addrspace(1) [[INPTR:%.*]], i32 0, metadata [[META13:![0-9]+]])
+// CHECK-GFX1250-NEXT:    ret i32 [[TMP0]]
+//
+int test_amdgcn_global_load_monitor_b32_single(global int* inptr)
+{
+  return __builtin_amdgcn_global_load_monitor_b32(inptr, __ATOMIC_RELAXED, __MEMORY_SCOPE_SINGLE);
+}

github-actions Bot commented May 18, 2026

✅ With the latest revision this PR passed the C/C++ code formatter.

github-actions Bot commented May 18, 2026

🐧 Linux x64 Test Results

  • 117983 tests passed
  • 4726 tests skipped

✅ The build succeeded and all tests passed.

github-actions Bot commented May 18, 2026

🪟 Windows x64 Test Results

  • 55604 tests passed
  • 2570 tests skipped

✅ The build succeeded and all tests passed.

@ssahasra ssahasra changed the base branch from users/ssahasra/load-monitor-scope to main May 19, 2026 02:13
@ssahasra ssahasra force-pushed the users/ssahasra/clang-scope-metadata branch from 2f966a9 to 39e36ba Compare May 19, 2026 02:20
Contributor Author

This change can now be reviewed independently from #198245

@ssahasra ssahasra changed the title from "[AMDGPU][Clang] use a ScopeModel when emitting load_monitor [NFC]" to "[AMDGPU][Clang] use ScopeModel to translate integer scopes [NFC]" May 19, 2026
Contributor Author

This is now a general refactoring that replaces special handling of integer scopes with a call to getLLVMSyncScopeStr(). This also eliminates a switch statement that was designed just for SPIRV on AMDGPU.
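
The shape of this refactoring can be sketched outside of Clang as a two-stage translation: a scope model turns the front-end integer into an abstract scope, and a target hook turns the abstract scope into a sync scope string. Everything below is a hypothetical stand-in (the types, the `scopeStringFor` helper, and the numeric values are assumptions, and the strings are modeled on AMDGPU's sync scope names); it is not the code from this PR.

```cpp
#include <string>

// Hypothetical stand-ins for the abstract scope and the atomic ordering.
enum class Scope { System, Device, Workgroup, Wavefront, Single };
enum class Ordering { Relaxed, Acquire, Release, AcqRel, SeqCst };

// Stage 1: a scope model maps a front-end integer to an abstract scope.
// The 0..4 numbering is an assumption about the generic enumeration.
struct GenericScopeModel {
  static bool isValid(unsigned V) { return V <= 4; }
  static Scope map(unsigned V) { return static_cast<Scope>(V); }
};

// Stage 2: a target hook maps the abstract scope to a sync scope string.
// This sketch ignores the ordering, which a real hook may consult.
struct AmdgpuLikeTarget {
  std::string syncScopeStr(Scope S, Ordering) const {
    switch (S) {
    case Scope::System:    return "";
    case Scope::Device:    return "agent";
    case Scope::Workgroup: return "workgroup";
    case Scope::Wavefront: return "wavefront";
    case Scope::Single:    return "singlethread";
    }
    return "";
  }
};

// Shared helper: every caller goes through the same two stages instead of
// carrying its own switch. Invalid integers fall back to system scope here.
template <typename Target>
std::string scopeStringFor(const Target &T, unsigned ScopeInt,
                           Ordering AO = Ordering::SeqCst) {
  Scope S = GenericScopeModel::isValid(ScopeInt)
                ? GenericScopeModel::map(ScopeInt)
                : Scope::System;
  return T.syncScopeStr(S, AO);
}
```

With this layering, a call such as `scopeStringFor(AmdgpuLikeTarget{}, 3, Ordering::Relaxed)` only has to pass the raw integer; the knowledge of which enumeration it belongs to and which strings the target uses lives in one place each.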

}

// Map a __MEMORY_SCOPE_* integer constant to the AMDGPU-specific syncscope.
// Invalid scope values are mapped to system scope (empty string).
Contributor

I understand this is the current behavior, though I wonder whether we should error out instead.

Contributor Author

@ssahasra ssahasra May 20, 2026

Maybe, but perhaps there is a better place to error out, such as the AMDGPU verifier. The function being refactored here might not be reached on all paths. It just provides a reasonable default within its area of influence, and that default could be changed to an unreachable if there were a verifier check.

Contributor

But the verifier doesn't see the value, right?

Contributor Author

I really haven't thought about it much. It's just a refactor that establishes a consistent handling of scopes for now.
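
The two options weighed in this thread, defaulting an invalid scope to system scope versus reporting it, can be contrasted in a small standalone sketch. The function names, the 0..4 numbering, and the error text are hypothetical, and the strings are modeled on AMDGPU's sync scope names; none of this is taken from the PR.

```cpp
#include <optional>
#include <string>

// Hypothetical lookup: generic scope integer -> sync scope string, or nothing
// when the integer is outside the assumed 0..4 range.
inline std::optional<std::string> lookupScope(unsigned ScopeInt) {
  static const char *const Names[] = {"", "agent", "workgroup", "wavefront",
                                      "singlethread"};
  if (ScopeInt >= sizeof(Names) / sizeof(Names[0]))
    return std::nullopt;
  return std::string(Names[ScopeInt]);
}

// Option A: invalid values quietly become system scope (the empty string).
inline std::string scopeOrSystem(unsigned ScopeInt) {
  return lookupScope(ScopeInt).value_or("");
}

// Option B: invalid values are reported to the caller, who decides how to
// diagnose them. Returns true and fills Out on success; fills Error otherwise.
inline bool scopeOrDiagnose(unsigned ScopeInt, std::string &Out,
                            std::string &Error) {
  std::optional<std::string> S = lookupScope(ScopeInt);
  if (!S) {
    Error = "invalid memory scope value " + std::to_string(ScopeInt);
    return false;
  }
  Out = *S;
  return true;
}
```

Option A keeps every caller simple and never fails, at the cost of hiding a bad argument. Option B needs a place to report the error, which is the open question in the discussion above.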

@ssahasra ssahasra merged commit 6255ecd into main May 21, 2026
10 checks passed
@ssahasra ssahasra deleted the users/ssahasra/clang-scope-metadata branch May 21, 2026 11:33

Labels

backend:AMDGPU, clang:codegen (IR generation bugs: mangling, exceptions, etc.), clang (Clang issues not falling into any other category)

2 participants