-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[AMDGPU][ASAN] Remove amdgpu-no-flat-scratch-init attribute from lowered kernels in amdgpu-sw-lower-lds pass #160541
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ed kernels in amdgpu-sw-lower-lds pass
@llvm/pr-subscribers-backend-amdgpu Author: Chaitanya (skc7) Changes
Full diff: https://github.com/llvm/llvm-project/pull/160541.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUSwLowerLDS.cpp b/llvm/lib/Target/AMDGPU/AMDGPUSwLowerLDS.cpp
index 4a9437b37aa39..d610fbf9bb06b 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUSwLowerLDS.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUSwLowerLDS.cpp
@@ -1213,7 +1213,8 @@ bool AMDGPUSwLowerLDS::run() {
removeFnAttrFromReachable(
CG, Func,
{"amdgpu-no-workitem-id-x", "amdgpu-no-workitem-id-y",
- "amdgpu-no-workitem-id-z", "amdgpu-no-heap-ptr"});
+ "amdgpu-no-workitem-id-z", "amdgpu-no-heap-ptr",
+ "amdgpu-no-flat-scratch-init"});
if (!LDSParams.IndirectAccess.StaticLDSGlobals.empty() ||
!LDSParams.IndirectAccess.DynamicLDSGlobals.empty())
removeFnAttrFromReachable(CG, Func, {"amdgpu-no-lds-kernel-id"});
diff --git a/llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-no-scratch-init.ll b/llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-no-scratch-init.ll
new file mode 100644
index 0000000000000..660b0f32590ba
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/amdgpu-sw-lower-lds-no-scratch-init.ll
@@ -0,0 +1,18 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals all --version 4
+; RUN: opt < %s -passes=amdgpu-sw-lower-lds -S -amdgpu-asan-instrument-lds=false -mtriple=amdgcn-amd-amdhsa | FileCheck %s
+; RUN: llc < %s -enable-new-pm -stop-after=amdgpu-sw-lower-lds -amdgpu-asan-instrument-lds=false -mtriple=amdgcn-amd-amdhsa | FileCheck %s
+
+; Test to check if amdgpu-no-flat-scratch-init is removed from the kernel.
+@lds_1 = internal addrspace(3) global [1 x i8] poison, align 4
+
+define amdgpu_kernel void @k0() #0 {
+ store i8 7, ptr addrspace(3) @lds_1, align 4
+ ret void
+}
+
+attributes #0 = { sanitize_address "amdgpu-no-flat-scratch-init"}
+
+!llvm.module.flags = !{!0}
+!0 = !{i32 4, !"nosanitize_address", i32 1}
+
+; CHECK: attributes #[[ATTR0:.*]] = { sanitize_address "amdgpu-lds-size"="8" }
|
I think it would be better to change the implementation to make sure that is not possible |
I haven't any idea where that could be happening. It was certainly not deliberate. @skc7 do you know where it's coming from? |
The runtime should be compiled as OpenCL 1.x to ensure this isn't possible. IIRC we are compiling as 2.x to work around the wrong address space numbering bug. I'm not sure what the behavior is if you use 1.x/3.x with the generic extension enabled. Can you try compiling the library in that mode and see what IR you get? |
![]() Above image is a diff of LLVM IR for |
You need to change the language in the runtime build. We also could do a better job emitting cleaner IR in clang, but you need to change the language to make it structurally impossible to require the addrspacecast |
I see now. That private pointer is probably the "expected" argument to the compare_exchange. It ought to be optimized away completely later. I don't understand why we would be casting it to generic. Is the builtin requiring it? |
We can't simply change the way the runtime is built. There are functions that must accept generic pointers. Removing an incorrectly added attribute seems like the right way to go. Do you disagree that the attribute has become incorrect? |
Which can be managed to minimize the use of generic pointers to that interface. The actual library content should not be using generic to access scratch
I think any compiler runtime library needs to have its properties carefully controlled to not cause trouble for the compiler. There is no reason for such a cast to exist in the library. We should have a custom verification that we can assume none of these functions rely on any implicit arguments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be addressed in the compiler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can separately and thoughtfully pursue changing the device library build and implementation. We need to fix this before the problem appears in yet another release. Let's land this for the time being.
Let's not. I have low confidence this will ever be removed |
The correct fix of this is in the runtime library build. There are 4 plausible fixes:
|
Thanks @arsenm for feedback. Raised PR #161319 for option 3. |
Closing this PR since #161319 is merged. |
LLVM IR of __asan_malloc_impl has generic pointers accessing private memory because of patch clang: Fix broken implicit cast to generic address space #138863
Call to __asan_malloc_impl is introduced late in the pass pipeline by "amdgpu-sw-lower-lds" pass to lowered kernels. Prior passes introduce "amdgpu-no-flat-scratch-init" to the kernels since this call is not present in the IR for kernel to start with.
So, to make sure "Flat Addressing for Scratch Memory" works, this PR removes "amdgpu-no-flat-scratch-init" from the lowered kernels.