
[AMDGPU] Set MaxAtomicSizeInBitsSupported. #75185

Merged
merged 2 commits into llvm:main from jyknight:atomic-max-amdgpu on Dec 18, 2023

Conversation

jyknight
Member

This will result in larger atomic operations getting expanded to __atomic_* libcalls via AtomicExpandPass, which matches what Clang already does in the frontend.

While AMDGPU currently disables the use of all libcalls, I've changed it to instead disable all of them except the atomic ones. Those are already emitted by the Clang frontend, and enabling them in the backend allows the same behavior there.
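
For context, here is a minimal sketch (mine, not part of the PR) of what that expansion looks like in IR. The i128 load mirrors the new test below; the `__atomic_load_16` signature and the memory-order value of 5 follow the C ABI for the `__atomic_*` runtime calls:

```llvm
; Before AtomicExpandPass: an atomic access wider than the 64 bits the
; target now reports via MaxAtomicSizeInBitsSupported.
define i128 @oversize_load(ptr %a) {
  %v = load atomic i128, ptr %a seq_cst, align 16
  ret i128 %v
}

; After AtomicExpandPass (sketch): the load is rewritten into a call to
; the sized runtime helper; i32 5 is the C ABI encoding of
; memory_order_seq_cst.
define i128 @oversize_load.expanded(ptr %a) {
  %v = call i128 @__atomic_load_16(ptr %a, i32 5)
  ret i128 %v
}

declare i128 @__atomic_load_16(ptr, i32)
```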

@llvmbot
Collaborator

llvmbot commented Dec 12, 2023

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-backend-amdgpu

Author: James Y Knight (jyknight)



Full diff: https://github.com/llvm/llvm-project/pull/75185.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp (+7-3)
  • (added) llvm/test/CodeGen/AMDGPU/atomic-oversize.ll (+10)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
index fcbdf51b03c1f..78092675057df 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
@@ -506,9 +506,11 @@ AMDGPUTargetLowering::AMDGPUTargetLowering(const TargetMachine &TM,
   setOperationAction(ISD::SELECT, MVT::v12f32, Promote);
   AddPromotedToType(ISD::SELECT, MVT::v12f32, MVT::v12i32);
 
-  // There are no libcalls of any kind.
-  for (int I = 0; I < RTLIB::UNKNOWN_LIBCALL; ++I)
-    setLibcallName(static_cast<RTLIB::Libcall>(I), nullptr);
+  // Disable most libcalls.
+  for (int I = 0; I < RTLIB::UNKNOWN_LIBCALL; ++I) {
+    if (I < RTLIB::ATOMIC_LOAD || I > RTLIB::ATOMIC_FETCH_NAND_16)
+      setLibcallName(static_cast<RTLIB::Libcall>(I), nullptr);
+  }
 
   setSchedulingPreference(Sched::RegPressure);
   setJumpIsExpensive(true);
@@ -556,6 +558,8 @@ AMDGPUTargetLowering::AMDGPUTargetLowering(const TargetMachine &TM,
                        ISD::FSUB,       ISD::FNEG,
                        ISD::FABS,       ISD::AssertZext,
                        ISD::AssertSext, ISD::INTRINSIC_WO_CHAIN});
+
+  setMaxAtomicSizeInBitsSupported(64);
 }
 
 bool AMDGPUTargetLowering::mayIgnoreSignedZero(SDValue Op) const {
diff --git a/llvm/test/CodeGen/AMDGPU/atomic-oversize.ll b/llvm/test/CodeGen/AMDGPU/atomic-oversize.ll
new file mode 100644
index 0000000000000..f62a93f523365
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/atomic-oversize.ll
@@ -0,0 +1,10 @@
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -verify-machineinstrs < %s | FileCheck %s
+
+define void @test(ptr %a) nounwind {
+; CHECK-LABEL: test:
+; CHECK: __atomic_load_16
+; CHECK: __atomic_store_16
+  %1 = load atomic i128, ptr %a seq_cst, align 16
+  store atomic i128 %1, ptr %a seq_cst, align 16
+  ret void
+}
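
A note on the mechanics, since the diff is terse (this is my reading, not text from the PR): `RTLIB::ATOMIC_LOAD` through `RTLIB::ATOMIC_FETCH_NAND_16` bracket the contiguous block of `__atomic_*` entries in the `RTLIB::Libcall` enum, so the rewritten loop clears every libcall name except that block. `setMaxAtomicSizeInBitsSupported(64)` then instructs AtomicExpandPass to rewrite any atomic wider than 64 bits into calls to those surviving entries, which is exactly what the new atomic-oversize.ll test checks for.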

@arsenm
Contributor

arsenm commented Dec 18, 2023

What's the point of doing this? There aren't any libcalls, and the libcall handling is unaware of the address space anyway. This is just changing a hard error into producing nonworking code.

@jyknight
Member Author

The purpose is to match what's currently produced by Clang, with the eventual goal of deleting parts of the Clang atomic lowering, leaning on the backend instead.

@arsenm
Contributor

arsenm left a comment

I bet this breaks differently with 32-bit address space pointers

@jyknight
Member Author

I don't know. I guess we can deal with that if it comes up in the future.

@jyknight jyknight merged commit 137f785 into llvm:main Dec 18, 2023
5 checks passed
@jyknight jyknight deleted the atomic-max-amdgpu branch December 18, 2023 21:51