[Clang][AMDGPU] Permit language address spaces for AMDGPU globals (#66205)

Summary:
Currently, an assertion prevents us from emitting an AMDGPU global whose
address space is not target-specific (i.e. not given by a numerical
attribute). I'm unsure what the original intention of this assertion
was, but we should be able to use OpenCL address spaces when compiling
directly to AMDGPU from C++. This is already permitted on NVPTX, so it
is unclear what the assertion is guarding against. The patch simply
removes the assertion and adds a test to ensure that these globals are
emitted in the expected address spaces.

Fixes #65069
jhuber6 committed Sep 13, 2023
1 parent ba81cd1 commit 1b7a095
Showing 2 changed files with 62 additions and 1 deletion.
1 change: 0 additions & 1 deletion clang/lib/CodeGen/Targets/AMDGPU.cpp
@@ -446,7 +446,6 @@ AMDGPUTargetCodeGenInfo::getGlobalVarAddressSpace(CodeGenModule &CGM,
     return DefaultGlobalAS;

   LangAS AddrSpace = D->getType().getAddressSpace();
-  assert(AddrSpace == LangAS::Default || isTargetAddressSpace(AddrSpace));
   if (AddrSpace != LangAS::Default)
     return AddrSpace;

62 changes: 62 additions & 0 deletions clang/test/CodeGen/amdgpu-address-spaces.cpp
@@ -0,0 +1,62 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --check-globals --version 3
// RUN: %clang_cc1 -cc1 -triple amdgcn-amd-amdhsa -emit-llvm %s -o - | FileCheck %s

int [[clang::opencl_global]] a = 100;
int [[clang::opencl_generic]] b = 42;
int [[clang::opencl_constant]] c = 999;
[[clang::loader_uninitialized]] int [[clang::opencl_local]] d;
[[clang::loader_uninitialized]] int [[clang::opencl_private]] e;

int [[clang::address_space(1)]] x = 100;
int [[clang::address_space(0)]] y = 42;
int [[clang::address_space(4)]] z = 999;
[[clang::loader_uninitialized]] int [[clang::address_space(3)]] w;
[[clang::loader_uninitialized]] int [[clang::address_space(5)]] u;

int [[clang::address_space(6)]] aaa = 1000;
int [[clang::address_space(999)]] bbb = 1234;

//.
// CHECK: @a = addrspace(1) global i32 100, align 4
// CHECK: @b = global i32 42, align 4
// CHECK: @c = addrspace(4) constant i32 999, align 4
// CHECK: @d = addrspace(3) global i32 undef, align 4
// CHECK: @e = addrspace(5) global i32 undef, align 4
// CHECK: @x = addrspace(1) global i32 100, align 4
// CHECK: @y = global i32 42, align 4
// CHECK: @z = addrspace(4) global i32 999, align 4
// CHECK: @w = addrspace(3) global i32 undef, align 4
// CHECK: @u = addrspace(5) global i32 undef, align 4
// CHECK: @aaa = addrspace(6) global i32 1000, align 4
// CHECK: @bbb = addrspace(999) global i32 1234, align 4
// CHECK: @llvm.amdgcn.abi.version = weak_odr hidden local_unnamed_addr addrspace(4) constant i32 400
//.
// CHECK-LABEL: define dso_local amdgpu_kernel void @foo(
// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: store i32 0, ptr addrspace(1) @a, align 4
// CHECK-NEXT: store i32 0, ptr @b, align 4
// CHECK-NEXT: store i32 0, ptr addrspace(3) @d, align 4
// CHECK-NEXT: store i32 0, ptr addrspace(5) @e, align 4
// CHECK-NEXT: store i32 0, ptr addrspace(1) @x, align 4
// CHECK-NEXT: store i32 0, ptr @y, align 4
// CHECK-NEXT: store i32 0, ptr addrspace(3) @d, align 4
// CHECK-NEXT: store i32 0, ptr addrspace(5) @u, align 4
// CHECK-NEXT: store i32 0, ptr addrspace(6) @aaa, align 4
// CHECK-NEXT: store i32 0, ptr addrspace(999) @bbb, align 4
// CHECK-NEXT: ret void
//
extern "C" [[clang::amdgpu_kernel]] void foo() {
a = 0;
b = 0;
d = 0;
e = 0;

x = 0;
y = 0;
d = 0;
u = 0;

aaa = 0;
bbb = 0;
}
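
For readers tracing the CHECK lines, the following is a minimal C++ sketch of the mapping the test expects, assuming a simplified stand-in for clang's LangAS enum. The enum layout and the helpers fromTargetAS, passesOldAssertion, and expectedAMDGPUAddrSpace are hypothetical and written for illustration only. The condition inside passesOldAssertion mirrors the removed assert, and the address-space numbers are read off the expected output above, not taken from clang's sources.

```cpp
#include <cassert>

// Simplified stand-in for clang's LangAS (assumption: NOT the real layout).
enum class LangAS : unsigned {
  Default = 0,
  opencl_global,
  opencl_local,
  opencl_constant,
  opencl_private,
  opencl_generic,
  FirstTargetAddressSpace // numeric address_space(N) values start here
};

// True for address spaces written as [[clang::address_space(N)]].
inline bool isTargetAddressSpace(LangAS AS) {
  return static_cast<unsigned>(AS) >=
         static_cast<unsigned>(LangAS::FirstTargetAddressSpace);
}

// Hypothetical helper: model value for [[clang::address_space(N)]].
inline LangAS fromTargetAS(unsigned N) {
  return static_cast<LangAS>(
      N + static_cast<unsigned>(LangAS::FirstTargetAddressSpace));
}

// Mirrors the condition of the assertion that the patch removes.
inline bool passesOldAssertion(LangAS AS) {
  return AS == LangAS::Default || isTargetAddressSpace(AS);
}

// Hypothetical helper: the addrspace(N) number the CHECK lines expect.
// LangAS::Default is deliberately outside this sketch.
inline unsigned expectedAMDGPUAddrSpace(LangAS AS) {
  assert(AS != LangAS::Default && "Default is not modelled here");
  if (isTargetAddressSpace(AS))
    return static_cast<unsigned>(AS) -
           static_cast<unsigned>(LangAS::FirstTargetAddressSpace);
  switch (AS) {
  case LangAS::opencl_global:   return 1; // @a
  case LangAS::opencl_local:    return 3; // @d
  case LangAS::opencl_constant: return 4; // @c
  case LangAS::opencl_private:  return 5; // @e
  default:                      return 0; // opencl_generic: @b, no qualifier
  }
}
```

Under this model, passesOldAssertion(LangAS::opencl_global) is false, which is the kind of global the removed assertion rejected, while expectedAMDGPUAddrSpace(LangAS::opencl_global) and expectedAMDGPUAddrSpace(fromTargetAS(1)) both give 1, matching @a and @x in the test.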
