[clang][CodeGen] Fix GPU-specific attributes being dropped by bitcode linking

Device libs make use of patterns like this:

```
__attribute__((target("gfx11-insts")))
static unsigned do_intrin_stuff(void)
{
  return __builtin_amdgcn_s_sendmsg_rtnl(0x0);
}
```

This pattern is used for functions that are assumed to be eliminated if the current GPU target doesn't support them. At O0 such functions aren't eliminated by common optimizations but often by AMDGPURemoveIncompatibleFunctions instead, which sees the "+gfx11-insts" attribute on, say, GFX9 and knows it's not valid, so it removes the function.

D142907 accidentally made it so such attributes were dropped during bitcode linking, making it impossible for RemoveIncompatibleFunctions to catch the functions and causing ISel to catch fire eventually.

This fixes the issue and adds a new test to ensure we don't accidentally fall into this trap again.

Fixes SWDEV-403642

Reviewed By: arsenm, yaxunl

Differential Revision: https://reviews.llvm.org/D152251
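
To make the failure mode concrete, here is a minimal C++ sketch of the kind of check the message describes. It is not the code of AMDGPURemoveIncompatibleFunctions; `DeviceFunction`, `isCompatible`, and `removeIncompatible` are hypothetical names, and features are modelled as plain strings. It only illustrates why a function whose feature list was dropped can no longer be recognised as incompatible.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device function and the features named in
// its "target-features" attribute (for example {"gfx11-insts"}).
struct DeviceFunction {
  std::string name;
  std::vector<std::string> requiredFeatures;
};

// True when every feature the function requires is offered by the target.
// A function with an empty feature list is always considered compatible.
bool isCompatible(const DeviceFunction &fn,
                  const std::set<std::string> &targetFeatures) {
  for (const std::string &feature : fn.requiredFeatures)
    if (targetFeatures.count(feature) == 0)
      return false;
  return true;
}

// Removes functions the target cannot run and returns their names.
std::vector<std::string>
removeIncompatible(std::vector<DeviceFunction> &fns,
                   const std::set<std::string> &targetFeatures) {
  std::vector<std::string> removed;
  std::vector<DeviceFunction> kept;
  for (DeviceFunction &fn : fns) {
    if (isCompatible(fn, targetFeatures))
      kept.push_back(std::move(fn));
    else
      removed.push_back(fn.name);
  }
  fns = std::move(kept);
  return removed;
}
```

With a feature set that lacks "gfx11-insts", a function that still carries the attribute is removed. The same function with an empty feature list, which is the situation after the attribute has been dropped, is kept and reaches later stages unchanged.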
Showing 6 changed files with 80 additions and 14 deletions.
    @@ -0,0 +1,7 @@
    typedef unsigned long ulong;

    __attribute__((target("gfx11-insts")))
    ulong do_intrin_stuff(void)
    {
      return __builtin_amdgcn_s_sendmsg_rtnl(0x0);
    }
clang/test/CodeGenCUDA/link-builtin-bitcode-gpu-attrs-preserved.cu
48 changes: 48 additions & 0 deletions
    @@ -0,0 +1,48 @@
    // Verify the behavior of the +gfxN-insts in the way that
    // rocm-device-libs should be built with. e.g. If the device libraries has a function
    // with "+gfx11-insts", that attribute should still be present after linking and not
    // overwritten with the current target's settings.

    // This is important because at this time, many device-libs functions that are only
    // available on some GPUs put an attribute such as "+gfx11-insts" so that
    // AMDGPURemoveIncompatibleFunctions can detect & remove them if needed.

    // Build the fake device library in the way rocm-device-libs should be built.
    //
    // RUN: %clang_cc1 -x cl -triple amdgcn-amd-amdhsa\
    // RUN: -mcode-object-version=none -emit-llvm-bc \
    // RUN: %S/Inputs/ocml-sample-target-attrs.cl -o %t.bc

    // Check the default behavior
    // RUN: %clang_cc1 -x hip -triple amdgcn-amd-amdhsa -target-cpu gfx803 -fcuda-is-device \
    // RUN: -mlink-builtin-bitcode %t.bc \
    // RUN: -emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK,INTERNALIZE

    // RUN: %clang_cc1 -x hip -triple amdgcn-amd-amdhsa -target-cpu gfx1101 -fcuda-is-device \
    // RUN: -mlink-builtin-bitcode %t.bc -emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK,INTERNALIZE

    // Check the case where no internalization is performed
    // RUN: %clang_cc1 -x hip -triple amdgcn-amd-amdhsa -target-cpu gfx803 \
    // RUN: -fcuda-is-device -mlink-bitcode-file %t.bc -emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK,NOINTERNALIZE

    // Check the case where no internalization is performed
    // RUN: %clang_cc1 -x hip -triple amdgcn-amd-amdhsa -target-cpu gfx1101 \
    // RUN: -fcuda-is-device -mlink-bitcode-file %t.bc -emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK,NOINTERNALIZE


    // CHECK: define {{.*}} i64 @do_intrin_stuff() #[[ATTR:[0-9]+]]
    // INTERNALIZE: attributes #[[ATTR]] = {{.*}} "target-cpu"="gfx{{.*}}" "target-features"="+gfx11-insts"
    // NOINTERNALIZE: attributes #[[ATTR]] = {{.*}} "target-features"="+gfx11-insts"

    #define __device__ __attribute__((device))
    #define __global__ __attribute__((global))

    typedef unsigned long ulong;

    extern "C" {
    __device__ ulong do_intrin_stuff(void);

    __global__ void kernel_f16(ulong* out) {
      *out = do_intrin_stuff();
    }
    }
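
The INTERNALIZE check above expects the linked function to gain a "target-cpu" value while keeping its own "target-features" value. One way to picture that outcome is a merge that only fills in attributes the function does not already have. The following C++ sketch assumes attributes can be modelled as a string map; `AttrMap` and `mergeDefaultsPreservingExisting` are hypothetical names, and this is not clang's actual implementation of the fix.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model: a function's string attributes as key/value pairs.
using AttrMap = std::map<std::string, std::string>;

// Copies each target default onto the function only when the function does
// not already define that attribute. std::map::insert leaves an existing
// key untouched, so attributes set by the library author are preserved.
void mergeDefaultsPreservingExisting(AttrMap &fnAttrs,
                                     const AttrMap &targetDefaults) {
  for (const auto &entry : targetDefaults)
    fnAttrs.insert(entry);
}
```

Applied to a function carrying "target-features"="+gfx11-insts" and target defaults for gfx803, this merge adds "target-cpu"="gfx803" and leaves "+gfx11-insts" in place, which is the shape the INTERNALIZE pattern matches.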