Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang][AMDGPU] fix the return type for ballot #73906

Merged
merged 3 commits into from
Dec 4, 2023

Conversation

ssahasra
Copy link
Collaborator

In the builtins declaration, "ULi" is a 32-bit integer on Windows. Use "WUi" instead to ensure a 64-bit integer on all platforms.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Nov 30, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Nov 30, 2023

@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-clang

Author: Sameer Sahasrabuddhe (ssahasra)

Changes

In the builtins declaration, "ULi" is a 32-bit integer on Windows. Use "WUi" instead to ensure a 64-bit integer on all platforms.


Full diff: https://github.com/llvm/llvm-project/pull/73906.diff

2 Files Affected:

  • (modified) clang/include/clang/Basic/BuiltinsAMDGPU.def (+2-2)
  • (added) clang/test/CodeGenHIP/ballot.cpp (+15)
diff --git a/clang/include/clang/Basic/BuiltinsAMDGPU.def b/clang/include/clang/Basic/BuiltinsAMDGPU.def
index a19c8bd5f219ec6..8b59b3790d7bc66 100644
--- a/clang/include/clang/Basic/BuiltinsAMDGPU.def
+++ b/clang/include/clang/Basic/BuiltinsAMDGPU.def
@@ -150,8 +150,8 @@ BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui", "nc")
 // Ballot builtins.
 //===----------------------------------------------------------------------===//
 
-TARGET_BUILTIN(__builtin_amdgcn_ballot_w32, "Uib", "nc", "wavefrontsize32")
-TARGET_BUILTIN(__builtin_amdgcn_ballot_w64, "LUib", "nc", "wavefrontsize64")
+TARGET_BUILTIN(__builtin_amdgcn_ballot_w32, "ZUib", "nc", "wavefrontsize32")
+TARGET_BUILTIN(__builtin_amdgcn_ballot_w64, "WUib", "nc", "wavefrontsize64")
 
 // Deprecated intrinsics in favor of __builtin_amdgn_ballot_{w32|w64}
 BUILTIN(__builtin_amdgcn_uicmp, "WUiUiUiIi", "nc")
diff --git a/clang/test/CodeGenHIP/ballot.cpp b/clang/test/CodeGenHIP/ballot.cpp
new file mode 100644
index 000000000000000..5685c5cad30d66a
--- /dev/null
+++ b/clang/test/CodeGenHIP/ballot.cpp
@@ -0,0 +1,15 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -target-cpu gfx900 -x hip -emit-llvm -fcuda-is-device -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -target-cpu gfx900 -x hip -S -fcuda-is-device -o - %s | FileCheck %s --check-prefix=GFX9
+
+// CHECK-LABEL: @_Z3fooi
+// CHECK: call i64 @llvm.amdgcn.ballot.i64
+
+// GFX9-LABEL: _Z3fooi:
+// GFX9: v_cmp_ne_u32_e64
+
+#define __device__ __attribute__((device))
+
+__device__ unsigned long long foo(int p) {
+  return __builtin_amdgcn_ballot_w64(p);
+}

TARGET_BUILTIN(__builtin_amdgcn_ballot_w32, "Uib", "nc", "wavefrontsize32")
TARGET_BUILTIN(__builtin_amdgcn_ballot_w64, "LUib", "nc", "wavefrontsize64")
TARGET_BUILTIN(__builtin_amdgcn_ballot_w32, "ZUib", "nc", "wavefrontsize32")
TARGET_BUILTIN(__builtin_amdgcn_ballot_w64, "WUib", "nc", "wavefrontsize64")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably do this for every single builtin

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked now. The tell is whether the builtin uses "L" in its type descriptor. None of them do except ballot.

@@ -0,0 +1,15 @@
// REQUIRES: amdgpu-registered-target
// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -target-cpu gfx900 -x hip -emit-llvm -fcuda-is-device -o - %s | FileCheck %s
// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -target-cpu gfx900 -x hip -S -fcuda-is-device -o - %s | FileCheck %s --check-prefix=GFX9
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this testing anything different than the existing builtin test without a windows host component?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. The precommit test needed fixing. The updated version uses an aux-triple to specify a Windows host. Now the test exposes the problem even on a Linux build when trying cross-compile to Windows.

Added another note in the test: The same problem does not arise with existing OpenCL tests because the language fully specifies that "long" is 64 bits on all platforms.

The Clang declaration of the wave-64 builtin uses "UL" as the return type, which
is interpreted as a 32-bit unsigned integer on Windows. This emits an incorrect
LLVM declaration with i32 return type instead of i64. The clang declaration
needs to be fixed to use "WU" instead.
In the builtins declaration, "ULi" is a 32-bit integer on Windows. Use "WUi"
instead to ensure a 64-bit integer on all platforms.
@ssahasra ssahasra merged commit 9db6423 into llvm:main Dec 4, 2023
3 checks passed
@ssahasra ssahasra deleted the ssahasra/ballot branch January 2, 2024 08:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants