-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[OpenCL][NVPTX] Don't set calling convention for OpenCL kernel #170170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-clang Author: Hongyu Chen (XChy) ChangesFixes #154772 Full diff: https://github.com/llvm/llvm-project/pull/170170.diff 2 Files Affected:
diff --git a/clang/lib/CodeGen/Targets/NVPTX.cpp b/clang/lib/CodeGen/Targets/NVPTX.cpp
index f6715861d91bc..5afef658c840b 100644
--- a/clang/lib/CodeGen/Targets/NVPTX.cpp
+++ b/clang/lib/CodeGen/Targets/NVPTX.cpp
@@ -277,7 +277,9 @@ void NVPTXTargetCodeGenInfo::setTargetAttributes(
}
}
// Attach kernel metadata directly if compiling for NVPTX.
- if (FD->hasAttr<DeviceKernelAttr>())
+ // NOTE: Don't set kernel calling convention for handled OpenCL kernel,
+ // otherwise the stub version of kernel would be incorrect.
+ if (FD->hasAttr<DeviceKernelAttr>() && !M.getLangOpts().OpenCL)
F->setCallingConv(getDeviceKernelCallingConv());
}
diff --git a/clang/test/CodeGenOpenCL/ptx-calls.cl b/clang/test/CodeGenOpenCL/ptx-calls.cl
index ae187173b1730..d5e27fce426a7 100644
--- a/clang/test/CodeGenOpenCL/ptx-calls.cl
+++ b/clang/test/CodeGenOpenCL/ptx-calls.cl
@@ -1,11 +1,22 @@
-// RUN: %clang_cc1 %s -triple nvptx-unknown-unknown -emit-llvm -O0 -o - | FileCheck %s
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 6
+// RUN: %clang_cc1 %s -triple nvptx-unknown-unknown -emit-llvm -O1 -o - | FileCheck %s
+// CHECK-LABEL: define dso_local void @device_function(
+// CHECK-SAME: ) local_unnamed_addr #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: ret void
+//
void device_function() {
}
-// CHECK-LABEL: define{{.*}} void @device_function()
+// CHECK-LABEL: define dso_local ptx_kernel void @kernel_function(
+// CHECK-SAME: ) local_unnamed_addr #[[ATTR1:[0-9]+]] !kernel_arg_addr_space [[META7:![0-9]+]] !kernel_arg_access_qual [[META7]] !kernel_arg_type [[META7]] !kernel_arg_base_type [[META7]] !kernel_arg_type_qual [[META7]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT: ret void
+//
__kernel void kernel_function() {
device_function();
}
-// CHECK-LABEL: define{{.*}} ptx_kernel void @kernel_function()
-// CHECK: call void @device_function()
+//.
+// CHECK: [[META7]] = !{}
+//.
|
🐧 Linux x64 Test Results
✅ The build succeeded and all tests passed. |
shiltian
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem like a right fix. To me it seems like getDeviceKernelCallingConv should not return the wrong CC in the first place.
I am unable to determine whether a function is a stub in the context of |
clang/lib/CodeGen/Targets/NVPTX.cpp
Outdated
| if (FD->hasAttr<DeviceKernelAttr>()) | ||
| // NOTE: Don't set kernel calling convention for handled OpenCL kernel, | ||
| // otherwise the stub version of kernel would be incorrect. | ||
| if (FD->hasAttr<DeviceKernelAttr>() && !M.getLangOpts().OpenCL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Language check is certainly the wrong thing to be looking for. Isn't this code invoked twice for each of them? This should only apply to one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the context here, I didn't find a way to check whether it's a stub unless by the mangled name. So I apply this code to the ones without __clang_ocl_kern_imp_ prefix in the mangled name.
Fixes #154772
We previously set
ptx_kernelfor all kernels. But it's incorrect to addptx_kernelto the stub version of kernel introduced in #115821. This patch copies the workaround of AMDGPU.