
Conversation

jhuber6
Contributor

@jhuber6 jhuber6 commented Nov 11, 2024

Summary:
Address spaces are used in several embedded and GPU targets to describe
accesses to different types of memory. Currently we use the address
space enumerations to control which address spaces are considered
supersets of each other; however, this is also a target-level property, as
described by the C standard's passing mentions of address spaces. This
patch allows the address space checks to use the target information to
decide whether a pointer conversion is legal. For AMDGPU and NVPTX, all
supported address spaces can be converted to the default address space.

More semantic checks can be added on top of this; for now I'm mainly
looking to get more standard semantics working for C/C++. Right now the
address space conversions must all be done explicitly in C/C++, unlike
the offloading languages, which define their own custom address spaces
that just map to the same target-specific ones anyway. The main question
is whether this behavior is a function of the target or of the language.
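For illustration, this is roughly the kind of conversion the change is meant to accept when targeting AMDGPU or NVPTX (a minimal sketch of my own, not code from the patch; the reverse direction still requires an explicit cast, which TR 18037 permits):

// Accepted once the default (flat) address space is declared a superset of the
// target-specific ones.
void *to_flat(void [[clang::address_space(3)]] *p) { return p; }

// The other direction is not a superset relation, so it still needs a cast.
void [[clang::address_space(3)]] *to_shared(void *p) {
  return (void [[clang::address_space(3)]] *)p;
}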

@llvmbot llvmbot added the clang, clang:driver, and clang:frontend labels Nov 11, 2024
@llvmbot
Member

llvmbot commented Nov 11, 2024

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: Joseph Huber (jhuber6)

Changes

Summary:
GPU targets support several different address spaces which have
differing semantics. When targeting C/C++ we take a very pessimistic
view that these address spaces are completely incompatible. This has a
lot of unfortunate effects that limit using address spaces in C++ as well
as making them more difficult to work with. Flat addressing is supported
by the major GPU targets, so it's highly desirable to use.

The C/C++ standard says nothing about address spaces, so we cannot make
any assumptions. However, OpenCL has an option that causes all pointers
to be seen as 'generic'. This patch adds support for making every
address space __generic by default, similar to the CL extensions.
This allows us to use this behavior outside of OpenCL mode. I have
re-used the language option as it seemed easier than creating a second
one.

This works in most cases; however, it does cause some problems for cases
like the one below, as the default pointer type is now __generic T, so it
fails to bind to T. But since this is an opt-in thing it seems fine to
force the user to add an extra template parameter, or remove the qualifiers.

template<typename T> void foo(T *, T);
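As a hypothetical illustration of the failure mode (my own sketch, not from the patch; the exact diagnostic may differ):

void bar(int *p, int x) {
  // With every pointer implicitly __generic, 'p' has type '__generic int *', so T
  // is deduced as '__generic int' from the first argument but as plain 'int' from
  // the second, and the conflicting deductions make the call fail. Adding a second
  // template parameter (or stripping the qualifier) sidesteps this.
  foo(p, x);
}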

Full diff: https://github.com/llvm/llvm-project/pull/115777.diff

7 Files Affected:

  • (modified) clang/include/clang/Driver/Options.td (+3)
  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+3)
  • (modified) clang/lib/Frontend/CompilerInvocation.cpp (+4)
  • (modified) clang/lib/Sema/Sema.cpp (+1-1)
  • (modified) clang/lib/Sema/SemaDeclCXX.cpp (+2-1)
  • (modified) clang/lib/Sema/SemaType.cpp (+7-4)
  • (added) clang/test/CodeGen/generic-addrspace.cpp (+219)
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 1304ef3c5a228b..0d6f2c3410e9a0 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3710,6 +3710,9 @@ def fopenmp_assume_no_nested_parallelism : Flag<["-"], "fopenmp-assume-no-nested
 } // let Visibility = [ClangOption, CC1Option, FC1Option]
 } // let Flags = [NoArgumentUnused, HelpHidden]
 
+def fdefault_generic_addrspace : Flag<["-"], "fdefault-generic-addrspace">, Group<f_Group>,
+  Flags<[NoArgumentUnused]>, Visibility<[ClangOption, CC1Option]>,
+  HelpText<"Allow pointers to be implicitly casted to other address spaces.">;
 def fopenmp_offload_mandatory : Flag<["-"], "fopenmp-offload-mandatory">, Group<f_Group>,
   Flags<[NoArgumentUnused]>, Visibility<[ClangOption, CC1Option]>,
   HelpText<"Do not create a host fallback if offloading to the device fails.">,
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 0952262c360185..d997a1d232e83d 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -7067,6 +7067,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
   if (Args.hasArg(options::OPT_nogpulib))
     CmdArgs.push_back("-nogpulib");
 
+  if (Args.hasArg(options::OPT_fdefault_generic_addrspace))
+    CmdArgs.push_back("-fdefault-generic-addrspace");
+
   if (Arg *A = Args.getLastArg(options::OPT_fcf_protection_EQ)) {
     CmdArgs.push_back(
         Args.MakeArgString(Twine("-fcf-protection=") + A->getValue()));
diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index b5fd35aaa1e841..b44f04d0f275e0 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -3662,6 +3662,9 @@ void CompilerInvocationBase::GenerateLangArgs(const LangOptions &Opts,
   if (Opts.Blocks && !(Opts.OpenCL && Opts.OpenCLVersion == 200))
     GenerateArg(Consumer, OPT_fblocks);
 
+  if (Opts.OpenCLGenericAddressSpace)
+    GenerateArg(Consumer, OPT_fdefault_generic_addrspace);
+
   if (Opts.ConvergentFunctions)
     GenerateArg(Consumer, OPT_fconvergent_functions);
   else
@@ -3939,6 +3942,7 @@ bool CompilerInvocation::ParseLangArgs(LangOptions &Opts, ArgList &Args,
   // These need to be parsed now. They are used to set OpenCL defaults.
   Opts.IncludeDefaultHeader = Args.hasArg(OPT_finclude_default_header);
   Opts.DeclareOpenCLBuiltins = Args.hasArg(OPT_fdeclare_opencl_builtins);
+  Opts.OpenCLGenericAddressSpace = Args.hasArg(OPT_fdefault_generic_addrspace);
 
   LangOptions::setLangDefaults(Opts, IK.getLanguage(), T, Includes, LangStd);
 
diff --git a/clang/lib/Sema/Sema.cpp b/clang/lib/Sema/Sema.cpp
index 2b51765e80864a..2920220948d145 100644
--- a/clang/lib/Sema/Sema.cpp
+++ b/clang/lib/Sema/Sema.cpp
@@ -1579,7 +1579,7 @@ NamedDecl *Sema::getCurFunctionOrMethodDecl() const {
 }
 
 LangAS Sema::getDefaultCXXMethodAddrSpace() const {
-  if (getLangOpts().OpenCL)
+  if (getLangOpts().OpenCL || getLangOpts().OpenCLGenericAddressSpace)
     return getASTContext().getDefaultOpenCLPointeeAddrSpace();
   return LangAS::Default;
 }
diff --git a/clang/lib/Sema/SemaDeclCXX.cpp b/clang/lib/Sema/SemaDeclCXX.cpp
index 8d76a35b2d2557..d855d9e3f49bc9 100644
--- a/clang/lib/Sema/SemaDeclCXX.cpp
+++ b/clang/lib/Sema/SemaDeclCXX.cpp
@@ -16178,7 +16178,8 @@ CheckOperatorNewDeleteTypes(Sema &SemaRef, const FunctionDecl *FnDecl,
       << FnDecl->getDeclName();
 
   QualType FirstParamType = FnDecl->getParamDecl(0)->getType();
-  if (SemaRef.getLangOpts().OpenCLCPlusPlus) {
+  if (SemaRef.getLangOpts().OpenCLCPlusPlus ||
+      SemaRef.getLangOpts().OpenCLGenericAddressSpace) {
     // The operator is valid on any address space for OpenCL.
     // Drop address space from actual and expected first parameter types.
     if (const auto *PtrTy =
diff --git a/clang/lib/Sema/SemaType.cpp b/clang/lib/Sema/SemaType.cpp
index eb7516b3ef1ece..7ea663eacd0452 100644
--- a/clang/lib/Sema/SemaType.cpp
+++ b/clang/lib/Sema/SemaType.cpp
@@ -1836,7 +1836,7 @@ QualType Sema::BuildPointerType(QualType T,
   if (getLangOpts().ObjCAutoRefCount)
     T = inferARCLifetimeForPointee(*this, T, Loc, /*reference*/ false);
 
-  if (getLangOpts().OpenCL)
+  if (getLangOpts().OpenCL || getLangOpts().OpenCLGenericAddressSpace)
     T = deduceOpenCLPointeeAddrSpace(*this, T);
 
   // In WebAssembly, pointers to reference types and pointers to tables are
@@ -1913,7 +1913,7 @@ QualType Sema::BuildReferenceType(QualType T, bool SpelledAsLValue,
   if (getLangOpts().ObjCAutoRefCount)
     T = inferARCLifetimeForPointee(*this, T, Loc, /*reference*/ true);
 
-  if (getLangOpts().OpenCL)
+  if (getLangOpts().OpenCL || getLangOpts().OpenCLGenericAddressSpace)
     T = deduceOpenCLPointeeAddrSpace(*this, T);
 
   // In WebAssembly, references to reference types and tables are illegal.
@@ -2741,7 +2741,7 @@ QualType Sema::BuildBlockPointerType(QualType T,
   if (checkQualifiedFunction(*this, T, Loc, QFK_BlockPointer))
     return QualType();
 
-  if (getLangOpts().OpenCL)
+  if (getLangOpts().OpenCL || getLangOpts().OpenCLGenericAddressSpace)
     T = deduceOpenCLPointeeAddrSpace(*this, T);
 
   return Context.getBlockPointerType(T);
@@ -5289,7 +5289,10 @@ static TypeSourceInfo *GetFullTypeForDeclarator(TypeProcessingState &state,
                      DeclaratorContext::LambdaExpr;
         };
 
-        if (state.getSema().getLangOpts().OpenCLCPlusPlus && IsClassMember()) {
+        if ((state.getSema().getLangOpts().OpenCLCPlusPlus ||
+             (!state.getSema().getLangOpts().OpenCL &&
+              state.getSema().getLangOpts().OpenCLGenericAddressSpace)) &&
+            IsClassMember()) {
           LangAS ASIdx = LangAS::Default;
           // Take address space attr if any and mark as invalid to avoid adding
           // them later while creating QualType.
diff --git a/clang/test/CodeGen/generic-addrspace.cpp b/clang/test/CodeGen/generic-addrspace.cpp
new file mode 100644
index 00000000000000..e400172e1c6bd1
--- /dev/null
+++ b/clang/test/CodeGen/generic-addrspace.cpp
@@ -0,0 +1,219 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fdefault-generic-addrspace -emit-llvm -o - %s \
+// RUN:   | FileCheck %s --check-prefix=NVPTX
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fdefault-generic-addrspace -emit-llvm -o - %s \
+// RUN:   | FileCheck %s --check-prefix=AMDGPU
+
+// NVPTX-LABEL: define dso_local void @_Z1fPv(
+// NVPTX-SAME: ptr noundef [[P:%.*]]) #[[ATTR0:[0-9]+]] {
+// NVPTX-NEXT:  [[ENTRY:.*:]]
+// NVPTX-NEXT:    [[P_ADDR:%.*]] = alloca ptr, align 8
+// NVPTX-NEXT:    store ptr [[P]], ptr [[P_ADDR]], align 8
+// NVPTX-NEXT:    ret void
+//
+// AMDGPU-LABEL: define dso_local void @_Z1fPv(
+// AMDGPU-SAME: ptr noundef [[P:%.*]]) #[[ATTR0:[0-9]+]] {
+// AMDGPU-NEXT:  [[ENTRY:.*:]]
+// AMDGPU-NEXT:    [[P_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// AMDGPU-NEXT:    [[P_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[P_ADDR]] to ptr
+// AMDGPU-NEXT:    store ptr [[P]], ptr [[P_ADDR_ASCAST]], align 8
+// AMDGPU-NEXT:    ret void
+//
+void f(void *p) {}
+
+// NVPTX-LABEL: define dso_local void @_Z2p1Pv(
+// NVPTX-SAME: ptr noundef [[P:%.*]]) #[[ATTR0]] {
+// NVPTX-NEXT:  [[ENTRY:.*:]]
+// NVPTX-NEXT:    [[P_ADDR:%.*]] = alloca ptr, align 8
+// NVPTX-NEXT:    store ptr [[P]], ptr [[P_ADDR]], align 8
+// NVPTX-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[P_ADDR]], align 8
+// NVPTX-NEXT:    call void @_Z1fPv(ptr noundef [[TMP0]]) #[[ATTR1:[0-9]+]]
+// NVPTX-NEXT:    ret void
+//
+// AMDGPU-LABEL: define dso_local void @_Z2p1Pv(
+// AMDGPU-SAME: ptr noundef [[P:%.*]]) #[[ATTR0]] {
+// AMDGPU-NEXT:  [[ENTRY:.*:]]
+// AMDGPU-NEXT:    [[P_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// AMDGPU-NEXT:    [[P_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[P_ADDR]] to ptr
+// AMDGPU-NEXT:    store ptr [[P]], ptr [[P_ADDR_ASCAST]], align 8
+// AMDGPU-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[P_ADDR_ASCAST]], align 8
+// AMDGPU-NEXT:    call void @_Z1fPv(ptr noundef [[TMP0]]) #[[ATTR1:[0-9]+]]
+// AMDGPU-NEXT:    ret void
+//
+void p1(void [[clang::opencl_generic]] * p) { f(p); }
+// NVPTX-LABEL: define dso_local noundef ptr @_Z2p2PU3AS3v(
+// NVPTX-SAME: ptr addrspace(3) noundef [[P:%.*]]) #[[ATTR0]] {
+// NVPTX-NEXT:  [[ENTRY:.*:]]
+// NVPTX-NEXT:    [[P_ADDR:%.*]] = alloca ptr addrspace(3), align 8
+// NVPTX-NEXT:    store ptr addrspace(3) [[P]], ptr [[P_ADDR]], align 8
+// NVPTX-NEXT:    [[TMP0:%.*]] = load ptr addrspace(3), ptr [[P_ADDR]], align 8
+// NVPTX-NEXT:    [[TMP1:%.*]] = addrspacecast ptr addrspace(3) [[TMP0]] to ptr
+// NVPTX-NEXT:    ret ptr [[TMP1]]
+//
+// AMDGPU-LABEL: define dso_local noundef ptr @_Z2p2PU3AS3v(
+// AMDGPU-SAME: ptr addrspace(3) noundef [[P:%.*]]) #[[ATTR0]] {
+// AMDGPU-NEXT:  [[ENTRY:.*:]]
+// AMDGPU-NEXT:    [[RETVAL:%.*]] = alloca ptr, align 8, addrspace(5)
+// AMDGPU-NEXT:    [[P_ADDR:%.*]] = alloca ptr addrspace(3), align 4, addrspace(5)
+// AMDGPU-NEXT:    [[RETVAL_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[RETVAL]] to ptr
+// AMDGPU-NEXT:    [[P_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[P_ADDR]] to ptr
+// AMDGPU-NEXT:    store ptr addrspace(3) [[P]], ptr [[P_ADDR_ASCAST]], align 4
+// AMDGPU-NEXT:    [[TMP0:%.*]] = load ptr addrspace(3), ptr [[P_ADDR_ASCAST]], align 4
+// AMDGPU-NEXT:    [[TMP1:%.*]] = addrspacecast ptr addrspace(3) [[TMP0]] to ptr
+// AMDGPU-NEXT:    ret ptr [[TMP1]]
+//
+void *p2(void [[clang::opencl_local]] * p) { return p; }
+// NVPTX-LABEL: define dso_local noundef ptr @_Z2p3PU3AS3v(
+// NVPTX-SAME: ptr addrspace(3) noundef [[P:%.*]]) #[[ATTR0]] {
+// NVPTX-NEXT:  [[ENTRY:.*:]]
+// NVPTX-NEXT:    [[P_ADDR:%.*]] = alloca ptr addrspace(3), align 8
+// NVPTX-NEXT:    store ptr addrspace(3) [[P]], ptr [[P_ADDR]], align 8
+// NVPTX-NEXT:    [[TMP0:%.*]] = load ptr addrspace(3), ptr [[P_ADDR]], align 8
+// NVPTX-NEXT:    [[TMP1:%.*]] = addrspacecast ptr addrspace(3) [[TMP0]] to ptr
+// NVPTX-NEXT:    ret ptr [[TMP1]]
+//
+// AMDGPU-LABEL: define dso_local noundef ptr @_Z2p3PU3AS3v(
+// AMDGPU-SAME: ptr addrspace(3) noundef [[P:%.*]]) #[[ATTR0]] {
+// AMDGPU-NEXT:  [[ENTRY:.*:]]
+// AMDGPU-NEXT:    [[RETVAL:%.*]] = alloca ptr, align 8, addrspace(5)
+// AMDGPU-NEXT:    [[P_ADDR:%.*]] = alloca ptr addrspace(3), align 4, addrspace(5)
+// AMDGPU-NEXT:    [[RETVAL_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[RETVAL]] to ptr
+// AMDGPU-NEXT:    [[P_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[P_ADDR]] to ptr
+// AMDGPU-NEXT:    store ptr addrspace(3) [[P]], ptr [[P_ADDR_ASCAST]], align 4
+// AMDGPU-NEXT:    [[TMP0:%.*]] = load ptr addrspace(3), ptr [[P_ADDR_ASCAST]], align 4
+// AMDGPU-NEXT:    [[TMP1:%.*]] = addrspacecast ptr addrspace(3) [[TMP0]] to ptr
+// AMDGPU-NEXT:    ret ptr [[TMP1]]
+//
+void *p3(void [[clang::address_space(3)]] * p) { return p; }
+
+struct S {
+  S() = default;
+  ~S() = default;
+// NVPTX-LABEL: define linkonce_odr void @_ZN1S3fooEv(
+// NVPTX-SAME: ptr noundef nonnull align 1 dereferenceable(1) [[THIS:%.*]]) #[[ATTR0]] comdat align 2 {
+// NVPTX-NEXT:  [[ENTRY:.*:]]
+// NVPTX-NEXT:    [[THIS_ADDR:%.*]] = alloca ptr, align 8
+// NVPTX-NEXT:    store ptr [[THIS]], ptr [[THIS_ADDR]], align 8
+// NVPTX-NEXT:    [[THIS1:%.*]] = load ptr, ptr [[THIS_ADDR]], align 8
+// NVPTX-NEXT:    ret void
+//
+// AMDGPU-LABEL: define linkonce_odr void @_ZN1S3fooEv(
+// AMDGPU-SAME: ptr noundef nonnull align 1 dereferenceable(1) [[THIS:%.*]]) #[[ATTR0]] comdat align 2 {
+// AMDGPU-NEXT:  [[ENTRY:.*:]]
+// AMDGPU-NEXT:    [[THIS_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// AMDGPU-NEXT:    [[THIS_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[THIS_ADDR]] to ptr
+// AMDGPU-NEXT:    store ptr [[THIS]], ptr [[THIS_ADDR_ASCAST]], align 8
+// AMDGPU-NEXT:    [[THIS1:%.*]] = load ptr, ptr [[THIS_ADDR_ASCAST]], align 8
+// AMDGPU-NEXT:    ret void
+//
+  void foo() {}
+};
+
+S s1;
+S [[clang::opencl_global]] s2;
+S [[clang::opencl_local]] s3;
+
+// NVPTX-LABEL: define dso_local void @_Z1gv(
+// NVPTX-SAME: ) #[[ATTR0]] {
+// NVPTX-NEXT:  [[ENTRY:.*:]]
+// NVPTX-NEXT:    call void @_ZN1S3fooEv(ptr noundef nonnull align 1 dereferenceable(1) @s1) #[[ATTR1]]
+// NVPTX-NEXT:    call void @_ZN1S3fooEv(ptr noundef nonnull align 1 dereferenceable(1) addrspacecast (ptr addrspace(1) @s2 to ptr)) #[[ATTR1]]
+// NVPTX-NEXT:    call void @_ZN1S3fooEv(ptr noundef nonnull align 1 dereferenceable(1) addrspacecast (ptr addrspace(3) @s3 to ptr)) #[[ATTR1]]
+// NVPTX-NEXT:    ret void
+//
+// AMDGPU-LABEL: define dso_local void @_Z1gv(
+// AMDGPU-SAME: ) #[[ATTR0]] {
+// AMDGPU-NEXT:  [[ENTRY:.*:]]
+// AMDGPU-NEXT:    call void @_ZN1S3fooEv(ptr noundef nonnull align 1 dereferenceable(1) addrspacecast (ptr addrspace(1) @s1 to ptr)) #[[ATTR1]]
+// AMDGPU-NEXT:    call void @_ZN1S3fooEv(ptr noundef nonnull align 1 dereferenceable(1) addrspacecast (ptr addrspace(1) @s2 to ptr)) #[[ATTR1]]
+// AMDGPU-NEXT:    call void @_ZN1S3fooEv(ptr noundef nonnull align 1 dereferenceable(1) addrspacecast (ptr addrspace(3) @s3 to ptr)) #[[ATTR1]]
+// AMDGPU-NEXT:    ret void
+//
+void g() {
+  s1.foo();
+  s2.foo();
+  s3.foo();
+}
+
+template <typename Ty> void foo(Ty *) {}
+
+// NVPTX-LABEL: define dso_local void @_Z2t1Pv(
+// NVPTX-SAME: ptr noundef [[P:%.*]]) #[[ATTR0]] {
+// NVPTX-NEXT:  [[ENTRY:.*:]]
+// NVPTX-NEXT:    [[P_ADDR:%.*]] = alloca ptr, align 8
+// NVPTX-NEXT:    store ptr [[P]], ptr [[P_ADDR]], align 8
+// NVPTX-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[P_ADDR]], align 8
+// NVPTX-NEXT:    call void @_Z3fooIvEvPT_(ptr noundef [[TMP0]]) #[[ATTR1]]
+// NVPTX-NEXT:    ret void
+//
+// AMDGPU-LABEL: define dso_local void @_Z2t1Pv(
+// AMDGPU-SAME: ptr noundef [[P:%.*]]) #[[ATTR0]] {
+// AMDGPU-NEXT:  [[ENTRY:.*:]]
+// AMDGPU-NEXT:    [[P_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// AMDGPU-NEXT:    [[P_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[P_ADDR]] to ptr
+// AMDGPU-NEXT:    store ptr [[P]], ptr [[P_ADDR_ASCAST]], align 8
+// AMDGPU-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[P_ADDR_ASCAST]], align 8
+// AMDGPU-NEXT:    call void @_Z3fooIvEvPT_(ptr noundef [[TMP0]]) #[[ATTR1]]
+// AMDGPU-NEXT:    ret void
+//
+void t1(void *p) { foo(p); }
+// NVPTX-LABEL: define dso_local void @_Z2t2Pv(
+// NVPTX-SAME: ptr noundef [[P:%.*]]) #[[ATTR0]] {
+// NVPTX-NEXT:  [[ENTRY:.*:]]
+// NVPTX-NEXT:    [[P_ADDR:%.*]] = alloca ptr, align 8
+// NVPTX-NEXT:    store ptr [[P]], ptr [[P_ADDR]], align 8
+// NVPTX-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[P_ADDR]], align 8
+// NVPTX-NEXT:    call void @_Z3fooIvEvPT_(ptr noundef [[TMP0]]) #[[ATTR1]]
+// NVPTX-NEXT:    ret void
+//
+// AMDGPU-LABEL: define dso_local void @_Z2t2Pv(
+// AMDGPU-SAME: ptr noundef [[P:%.*]]) #[[ATTR0]] {
+// AMDGPU-NEXT:  [[ENTRY:.*:]]
+// AMDGPU-NEXT:    [[P_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// AMDGPU-NEXT:    [[P_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[P_ADDR]] to ptr
+// AMDGPU-NEXT:    store ptr [[P]], ptr [[P_ADDR_ASCAST]], align 8
+// AMDGPU-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[P_ADDR_ASCAST]], align 8
+// AMDGPU-NEXT:    call void @_Z3fooIvEvPT_(ptr noundef [[TMP0]]) #[[ATTR1]]
+// AMDGPU-NEXT:    ret void
+//
+void t2(void [[clang::opencl_generic]] *p) { foo(p); }
+// NVPTX-LABEL: define dso_local void @_Z2t3PU3AS3v(
+// NVPTX-SAME: ptr addrspace(3) noundef [[P:%.*]]) #[[ATTR0]] {
+// NVPTX-NEXT:  [[ENTRY:.*:]]
+// NVPTX-NEXT:    [[P_ADDR:%.*]] = alloca ptr addrspace(3), align 8
+// NVPTX-NEXT:    store ptr addrspace(3) [[P]], ptr [[P_ADDR]], align 8
+// NVPTX-NEXT:    [[TMP0:%.*]] = load ptr addrspace(3), ptr [[P_ADDR]], align 8
+// NVPTX-NEXT:    call void @_Z3fooIU3AS3vEvPT_(ptr addrspace(3) noundef [[TMP0]]) #[[ATTR1]]
+// NVPTX-NEXT:    ret void
+//
+// AMDGPU-LABEL: define dso_local void @_Z2t3PU3AS3v(
+// AMDGPU-SAME: ptr addrspace(3) noundef [[P:%.*]]) #[[ATTR0]] {
+// AMDGPU-NEXT:  [[ENTRY:.*:]]
+// AMDGPU-NEXT:    [[P_ADDR:%.*]] = alloca ptr addrspace(3), align 4, addrspace(5)
+// AMDGPU-NEXT:    [[P_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[P_ADDR]] to ptr
+// AMDGPU-NEXT:    store ptr addrspace(3) [[P]], ptr [[P_ADDR_ASCAST]], align 4
+// AMDGPU-NEXT:    [[TMP0:%.*]] = load ptr addrspace(3), ptr [[P_ADDR_ASCAST]], align 4
+// AMDGPU-NEXT:    call void @_Z3fooIU3AS3vEvPT_(ptr addrspace(3) noundef [[TMP0]]) #[[ATTR1]]
+// AMDGPU-NEXT:    ret void
+//
+void t3(void [[clang::opencl_local]] *p) { foo(p); }
+// NVPTX-LABEL: define dso_local void @_Z2t4PU5AS999v(
+// NVPTX-SAME: ptr addrspace(999) noundef [[P:%.*]]) #[[ATTR0]] {
+// NVPTX-NEXT:  [[ENTRY:.*:]]
+// NVPTX-NEXT:    [[P_ADDR:%.*]] = alloca ptr addrspace(999), align 8
+// NVPTX-NEXT:    store ptr addrspace(999) [[P]], ptr [[P_ADDR]], align 8
+// NVPTX-NEXT:    [[TMP0:%.*]] = load ptr addrspace(999), ptr [[P_ADDR]], align 8
+// NVPTX-NEXT:    call void @_Z3fooIU5AS999vEvPT_(ptr addrspace(999) noundef [[TMP0]]) #[[ATTR1]]
+// NVPTX-NEXT:    ret void
+//
+// AMDGPU-LABEL: define dso_local void @_Z2t4PU5AS999v(
+// AMDGPU-SAME: ptr addrspace(999) noundef [[P:%.*]]) #[[ATTR0]] {
+// AMDGPU-NEXT:  [[ENTRY:.*:]]
+// AMDGPU-NEXT:    [[P_ADDR:%.*]] = alloca ptr addrspace(999), align 8, addrspace(5)
+// AMDGPU-NEXT:    [[P_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[P_ADDR]] to ptr
+// AMDGPU-NEXT:    store ptr addrspace(999) [[P]], ptr [[P_ADDR_ASCAST]], align 8
+// AMDGPU-NEXT:    [[TMP0:%.*]] = load ptr addrspace(999), ptr [[P_ADDR_ASCAST]], align 8
+// AMDGPU-NEXT:    call void @_Z3fooIU5AS999vEvPT_(ptr addrspace(999) noundef [[TMP0]]) #[[ATTR1]]
+// AMDGPU-NEXT:    ret void
+//
+void t4(void [[clang::address_space(999)]] *p) { foo(p); }

@@ -1579,7 +1579,7 @@ NamedDecl *Sema::getCurFunctionOrMethodDecl() const {
 }
 
 LangAS Sema::getDefaultCXXMethodAddrSpace() const {
-  if (getLangOpts().OpenCL)
+  if (getLangOpts().OpenCL || getLangOpts().OpenCLGenericAddressSpace)
Contributor


I think this whole thing is just working around a defect in how OpenCL was implemented. There should be no difference between generic and default address space

Contributor Author


I think it's because the DefaultAS applies to every single language and it wasn't deemed proper to modify C/C++ behavior? I'm not an expert; maybe @AnastasiaStulova can chime in. Regardless, I'm mostly just going for the smallest number of changes to what we've got that gives me the desired result.

@jhuber6
Contributor Author

jhuber6 commented Nov 11, 2024

See https://godbolt.org/z/qWGaejTx9 for this case. I'm wondering if there's a way to resolve this by considering the AS as part of the pointer, since I don't know whether an AS means anything on a non-pointer type.

@Artem-B
Member

Artem-B commented Nov 11, 2024

This has a lot of unfortunate effects that limit using address spaces in C++ as well
as making them more difficult to work with.

Can you give some examples?

It sounds like what you really want is for address space qualifiers to not be part of a type signature. OpenCL sort of happens to avoid that by sticking a __generic AS qualifier on all pointers without one, thus making plain pointers become __generic ones, which keeps some C++ code happy, but that does look like a quirk of OpenCL. Normally, we a) do not have an explicit generic AS (or it would be indistinguishable from a plain pointer, as it is at the LLVM level w/ AS(0)), and b) when we specialize a function with an AS-specific pointer type, we generally do want it to be that type.

Clang and C++ indeed still have issues with AS-qualified pointers.
E.g. attempts to define function overloads with AS(0) and plain pointers result in an error about conflicting name mangling: https://godbolt.org/z/fW3dP4an5

Yet C++ does not consider the types equivalent, so you can't pass a plain pointer as an AS(0)-qualified argument and vice versa: https://godbolt.org/z/K4PEv9e7j

It's a bug that needs fixing, IMO. We should either treat unqualified and AS(0) pointer types as different and give them different mangling, or treat AS(0) and plain pointer types the same, diagnose AS0/plain overloads as redefinitions (no need to wait until we run into a mangling conflict), and allow using them interchangeably.

Would straightening this out help with the issue you're trying to solve?
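A minimal sketch of the two behaviors being described (illustrative only; based on the description above, not on the godbolt output itself):

// Defining both overloads is rejected because they mangle to the same name.
void f(int *p);
void f(int [[clang::address_space(0)]] *p);

// Yet the types are not treated as interchangeable, so this call is rejected
// as an incompatible pointer conversion without an explicit cast.
void g(int [[clang::address_space(0)]] *q);
void h(int *p) { g(p); }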

@jhuber6
Contributor Author

jhuber6 commented Nov 11, 2024

This has a lot of unfortunate effects that limit using address spaces in C++ as well
as making them more difficult to work with.

Can you give some examples?

It sounds like what you really want is for address space qualifiers to not be part of a type signature. OpenCL sort of happens to avoid that by sticking a __generic AS qualifier on all pointers without one, thus making plain pointers become __generic ones, which keeps some C++ code happy, but that does look like a quirk of OpenCL. Normally, we a) do not have an explicit generic AS (or it would be indistinguishable from a plain pointer, as it is at the LLVM level w/ AS(0)), and b) when we specialize a function with an AS-specific pointer type, we generally do want it to be that type.

Clang and C++ indeed still have issues with AS-qualified pointers. E.g. attempts to define function overloads with AS(0) and plain pointers result in an error about conflicting name mangling: https://godbolt.org/z/fW3dP4an5

Yet C++ does not consider the types equivalent, so you can't pass a plain pointer as an AS(0)-qualified argument and vice versa: https://godbolt.org/z/K4PEv9e7j

It's a bug that needs fixing, IMO. We should either treat unqualified and AS(0) pointer types as different and give them different mangling, or treat AS(0) and plain pointer types the same, diagnose AS0/plain overloads as redefinitions (no need to wait until we run into a mangling conflict), and allow using them interchangeably.

Would straightening this out help with the issue you're trying to solve?

Stuff like this is mostly what I'm talking about: https://godbolt.org/z/1K8KGdqe9. I previously wanted to relax the handling of the default address space for all targets, but that made @AlexVlx unhappy in #112248. It's not about the AS not being part of the type signature; it's more about just having default conversion rules. The case I provided with the template is just an edge case where this diverges from standard C++, which is a little weird.

@jhuber6
Contributor Author

jhuber6 commented Nov 12, 2024

This has a lot of unfortunate effects that limit using address spaces in C++ as well
as making them more difficult to work with.

Can you give some examples?
It sounds like what you really want is for address space qualifiers to not be part of a type signature. OpenCL sort of happens to avoid that by sticking a __generic AS qualifier on all pointers without one, thus making plain pointers become __generic ones, which keeps some C++ code happy, but that does look like a quirk of OpenCL. Normally, we a) do not have an explicit generic AS (or it would be indistinguishable from a plain pointer, as it is at the LLVM level w/ AS(0)), and b) when we specialize a function with an AS-specific pointer type, we generally do want it to be that type.
Clang and C++ indeed still have issues with AS-qualified pointers. E.g. attempts to define function overloads with AS(0) and plain pointers result in an error about conflicting name mangling: https://godbolt.org/z/fW3dP4an5
Yet C++ does not consider the types equivalent, so you can't pass a plain pointer as an AS(0)-qualified argument and vice versa: https://godbolt.org/z/K4PEv9e7j
It's a bug that needs fixing, IMO. We should either treat unqualified and AS(0) pointer types as different and give them different mangling, or treat AS(0) and plain pointer types the same, diagnose AS0/plain overloads as redefinitions (no need to wait until we run into a mangling conflict), and allow using them interchangeably.
Would straightening this out help with the issue you're trying to solve?

Stuff like this is mostly what I'm talking about: https://godbolt.org/z/1K8KGdqe9. I previously wanted to relax the handling of the default address space for all targets, but that made @AlexVlx unhappy in #112248. It's not about the AS not being part of the type signature; it's more about just having default conversion rules. The case I provided with the template is just an edge case where this diverges from standard C++, which is a little weird.

Honestly I would prefer to just have an option that lets DefaultAS work as the generic one... Then we wouldn't have the weirdness of templates suddenly not working because all pointers are void __generic *.

@Artem-B
Member

Artem-B commented Nov 12, 2024

I think I generally agree with @AlexVlx's argument. While the patch may solve your immediate issue, I think it's not going to give you a usable compilation model for AS-qualified pointers.

If you are defining your own C++ extension along the lines of CUDA/HIP/OpenCL, you would have some wiggle room on how you want things to behave, but if you want it to be regular C++ compilation, you've got to play by C++ rules. The diagnostics in your example look quite sensible to me -- you're asking the compiler to do something with things of different types, but you do not provide an implementation for it. And because C++ has no idea about what can and can't be done with a non-default AS, it's all on you to do it.

I don't think you can have this cake (explicit AS-qualified pointers) and eat it (make them all usable out of the box in a consistent manner, magically compatible with everything C++). Sticking __generic on plain pointer types will likely lead to issues.

For example, it looks like the pointer promotion will be ABI-breaking. How will you mangle function names with pointer arguments? Normally AS is encoded in the mangled name. And both the caller and callee must be in sync in order to call the right function -> therefore everything must be compiled with the same setting of this flag, including libc, etc.

@jhuber6
Contributor Author

jhuber6 commented Nov 12, 2024

I think I generally agree with @AlexVlx's argument. While the patch may solve your immediate issue, I think it's not going to give you a usable compilation model for AS-qualified pointers.

If you are defining your own C++ extension along the lines of CUDA/HIP/OpenCL, you would have some wiggle room on how you want things to behave, but if you want it to be regular C++ compilation, you've got to play by C++ rules. The diagnostics in your example look quite sensible to me -- you're asking the compiler to do something with things of different types, but you do not provide an implementation for it. And because C++ has no idea about what can and can't be done with a non-default AS, it's all on you to do it.

I don't think you can have this cake (explicit AS-qualified pointers) and eat it (make them all usable out of the box in a consistent manner, magically compatible with everything C++). Sticking __generic on plain pointer types will likely lead to issues.

For example, it looks like the pointer promotion will be ABI-breaking. How will you mangle function names with pointer arguments? Normally AS is encoded in the mangled name. And both the caller and callee must be in sync in order to call the right function -> therefore everything must be compiled with the same setting of this flag, including libc, etc.

I agree in general that C/C++ has no semantic meaning ascribed to address spaces, so we cannot rely on intelligent semantic checking since C++ is strongly typed. Realizing that this literally changes all the address spaces makes it a bit difficult to argue that it's just C++; however, it's pretty much impossible to use classes as-is since we cannot convert 'this' to the object's address space.

I think this approach is fine as an opt-in to have really lenient rules, since that's what AMDGPU and CUDA support already... But doing it with opencl_generic does not seem to be the way to go, as you mentioned, since now every function gets a different mangling and templates all work differently.

This was the 'easy' solution; alternatives would be to allow this for target-specific address spaces (i.e. AMDGPU supports converting any addrspace to the default). Another option would be to do the same treatment here but with DefaultAS instead. I'd say the third option is to just put C-style casts everywhere, but that doesn't really work for classes, as evidenced here.

@AaronBallman
Collaborator

I agree in general that C/C++ has no semantic meaning ascribed to address spaces.

Err, C does. Please see TR 18037 (https://standards.iso.org/ittf/PubliclyAvailableStandards/c051126_ISO_IEC_TR_18037_2008.zip)

AIUI, we've extended the C extension into C++, so we should be following the C semantics while mapping them on to C++'s stronger type system as closely as we can.

@jhuber6
Contributor Author

jhuber6 commented Nov 12, 2024

I agree in general that C/C++ has no semantic meaning ascribed to address spaces.

Err, C does. Please see TR 18037 (https://standards.iso.org/ittf/PubliclyAvailableStandards/c051126_ISO_IEC_TR_18037_2008.zip)

AIUI, we've extended the C extension into C++, so we should be following the C semantics while mapping them on to C++'s stronger type system as closely as we can.

Interesting, though reading through that I didn't see any mention of implicit casts. It simply states that conversions are allowed if the target lists one address space as a subset of another. However, that would mean that whether a program passes Sema depends on the target machine. I'm actually not opposed to that; we could tear out the isAddrSpaceSuperSetOf check so that it keeps a reference to the TargetInfo when making its decisions.

Also, it's worth noting that the documentation has examples like __X char x, y, z, which is a warning in clang due to how type-level attributes work: https://godbolt.org/z/Pfv36nEdq.

Maybe someone who reads standardese better than I do can help with this. My ultimate goal is to make the following legal when targeting NVPTX or AMDGPU, which is easily done by stating that DefaultAS is a superset of these other AS's. Let me know if making this target-dependent is a good idea.

void *foo(void [[clang::address_space(3)]] *p) { return p; }
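For reference, on those targets this would presumably lower the same way as the p3/t4 cases in the test above, i.e. through an addrspacecast in the IR:

  %1 = addrspacecast ptr addrspace(3) %0 to ptr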

@jhuber6 jhuber6 changed the title from "[Clang] Add -fdefault-generic-addrspace flag for targeting GPUs" to "[Clang] Use TargetInfo when deciding is an address space is compatible" Nov 12, 2024
@jhuber6
Contributor Author

jhuber6 commented Nov 12, 2024

I updated the patch to use target info instead; this doesn't affect OpenCL because it doesn't use the DefaultAS. I would appreciate some information from the people who know the standards better than I do.

@AlexVlx
Contributor

AlexVlx commented Nov 13, 2024

I could see only applying these rules to the target-specific (i.e. non-language) AS's, so AS(3) * would convert to * but not __shared__ *. The above example doesn't work because we consider cuda_shared and AS(3) distinct even though they lower to the same thing for the target.

Which is the gist of what I am saying: linguistic constructs carry semantics (meaning); we should not work back from their disembodied lowering into a target-specific quantity to generalise the latter into a linguistic construct beyond the initial one. CUDA made language design choices: it is an AS-agnostic language, and cuda_shared doesn't work like an address space in the OCL / Clang sense, even though it ends up lowered through the same mechanism / leads to a numbered AS being CodeGen-ed. TL;DR: this doesn't work, by design rather than by accident = those ARE distinct from a language perspective.

@jhuber6
Contributor Author

jhuber6 commented Nov 13, 2024

Which is the gist of what I am saying: linguistic constructs carry semantics (meaning); we should not work back from their disembodied lowering into a target-specific quantity to generalise the latter into a linguistic construct beyond the initial one. CUDA made language design choices: it is an AS-agnostic language, and cuda_shared doesn't work like an address space in the OCL / Clang sense, even though it ends up lowered through the same mechanism / leads to a numbered AS being CodeGen-ed. TL;DR: this doesn't work, by design rather than by accident = those ARE distinct from a language perspective.

I've updated this to make the language AS's and target AS's distinct, so now this won't conflict at all. This patch just uses target information to ascribe some semantic meaning to the target AS's, which I believe is legitimate. The text that @AaronBallman provided says the following.

If a pointer into address space A is assigned to a pointer into a different address space B, a
constraint requires that A be a subset of B. (As usual, this constraint can be circumvented by a cast
of the source pointer before the assignment.)
...
For every pair of distinct address spaces A and B, it is implementation-defined whether A
encloses B.
...
If one address space encloses another, the two address spaces overlap, and their combined
address space is the one that encloses the other.

My interpretation of this is that it's fine to have target-specific behavior for address spaces, and that an assignment B = A seems okay if we say one is a subset of the other (which they are for this target).

@rjmccall
Contributor

Address spaces from language dialects generally have required relationships and behaviors in the language, and that really shouldn't be overridden by targets. However, targets do need to be able to decide how target-specific address spaces work, including how they interact with language address spaces. We can square this circle in two ways that I can see. The first is that we could only defer to the TargetInfo when at least one of the address spaces is a target AS. The second is by having the default implementation implement the standard language rules and just expecting that the target will know to defer to it when its target address spaces aren't involved. I don't particularly care which way we go.

It seems to me that the AST-level comparison routines ought to take an ASTContext & instead of the TI directly, though.

@jhuber6
Contributor Author

jhuber6 commented Nov 13, 2024

Address spaces from language dialects generally have required relationships and behaviors in the language, and that really shouldn't be overridden by targets. However, targets do need to be able to decide how target-specific address spaces work, including how they interact with language address spaces. We can square this circle in two ways that I can see. The first is that we could only defer to the TargetInfo when at least one of the address spaces is a target AS. The second is by having the default implementation implement the standard language rules and just expecting that the target will know to defer to it when its target address spaces aren't involved. I don't particularly care which way we go.

This is pretty much what I've got now; we only check the target-specific address spaces in the target info, so the languages should be unaffected.

It seems to me that the AST-level comparison routines ought to take an ASTContext & instead of the TI directly, though.

I can make that change; it might make it easier to expand in the future. See below.
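For reference, my rough reading of the resulting check looks something like this (a sketch only; isLanguageAddressSpaceSupersetOf is an illustrative name for the existing language rules, not a real helper):

// Pairs involving a target address space defer to the target; language address
// spaces keep their existing rules.
static bool isAddressSpaceSupersetOf(LangAS A, LangAS B, const ASTContext &Ctx) {
  if (A == B)
    return true;
  if (isTargetAddressSpace(A) || isTargetAddressSpace(B))
    return Ctx.getTargetInfo().isAddressSpaceSupersetOf(A, B);
  return isLanguageAddressSpaceSupersetOf(A, B); // illustrative placeholder
}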

@jhuber6
Contributor Author

jhuber6 commented Nov 13, 2024

Okay, the problem with using ASTContext here is that it creates some recursive includes. I can do this by moving the check into Type.cpp instead, so it will be a function call instead of being inlined. That would require a lot of extra stuff, so I'm going to defer it unless we really think it needs to be done.

@rjmccall
Contributor

Okay, the problem with using ASTContext here is that it creates some recursive includes. I can do this by moving the check into Type.cpp instead, so it will be a function call instead of being inlined. That would require a lot of extra stuff, so I'm going to defer it unless we really think it needs to be done.

Can you just outline the slow path where you have to actually call the target info hook?

@jhuber6
Contributor Author

jhuber6 commented Nov 13, 2024

Okay, the problem with using ASTContext here is that it creates some recursive includes. I can do this by moving the check into Type.cpp instead, so it will be a function call instead of being inlined. That would require a lot of extra stuff, so I'm going to defer it unless we really think it needs to be done.

Can you just outline the slow path where you have to actually call the target info hook?

It would require pushing some unrelated functions into .cpp files as well, since the final check is shared between a lot of functions as shown in the diff. I'm not keen, but if you think it's a blocker I can make it happen. Maybe we'd need it in the future if we wanted to control this with a language option? But for now I'm not sure it's worth the extra effort.

@rjmccall
Contributor

rjmccall commented Nov 13, 2024

I'm not sure what unrelated code you're saying would need to be pulled into a .cpp file. It looks like there's only one actual call to TI.isAddressSpaceSupersetOf, so if you just pass around an ASTContext & to that point, nothing else will need to drill into it. And frankly, that entire expression in Qualifiers::isAddressSpaceSupersetOf after the equality check is so unwieldy that it probably ought to have been pulled into a .cpp file a long time ago.

@jhuber6
Contributor Author

jhuber6 commented Nov 13, 2024

I'm not sure what unrelated code you're saying would need to be pulled into a .cpp file. It looks like there's only one actual call to TI.isAddressSpaceSupersetOf, so if you just pass around an ASTContext & to that point, nothing else will need to drill into it. And frankly, that entire expression in Qualifiers::isAddressSpaceSupersetOf after the equality check is so unwieldy that it probably ought to have been pulled into a .cpp file a long time ago.

The checks for bool isMoreQualifiedThan in clang/include/clang/AST/CanonicalType.h. That header is included by ASTContext.h and doesn't have a .cpp file, so I'd need to find somewhere to put it.

@rjmccall
Contributor

Oh, you can just forward-declare class ASTContext at the top of that file. It's funny that that isn't already there.
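For reference, the forward declaration amounts to something like this near the top of the header (a sketch; it would sit inside the existing clang namespace):

namespace clang {
class ASTContext; // forward declaration is enough since the header only takes it by reference
} // namespace clang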

@jhuber6
Contributor Author

jhuber6 commented Nov 14, 2024

Oh, you can just forward-declare class ASTContext at the top of that file. It's funny that that isn't already there.

Done, and a sema test.

Contributor

@rjmccall rjmccall left a comment


Broadly LGTM. A few structural requests.

@jayfoad
Contributor

jayfoad commented Nov 14, 2024

Use TargetInfo when deciding is an address space is compatible

Typo? "Use TargetInfo when deciding if an address space is compatible"

@jhuber6 jhuber6 changed the title from "[Clang] Use TargetInfo when deciding is an address space is compatible" to "[Clang] Use TargetInfo when deciding if an address space is compatible" Nov 14, 2024
Contributor

@rjmccall rjmccall left a comment


Code LGTM. I would still prefer this PR to just contain the NFC refactor, with a follow-up that adds the semantic changes to the AMDGPU and NVPTX targets.

@jhuber6 jhuber6 merged commit b9d678d into llvm:main Nov 15, 2024
8 checks passed
nikic added a commit that referenced this pull request Nov 15, 2024
This was introduced in #115777, but isn't actually used.
@@ -31,6 +31,7 @@
 #include "clang/Basic/PointerAuthOptions.h"
 #include "clang/Basic/SourceLocation.h"
 #include "clang/Basic/Specifiers.h"
+#include "clang/Basic/TargetInfo.h"
Contributor


I dropped this include again in bc3b0fa because it was causing a significant build time regression. I assume this was a leftover from some earlier iteration of the patch.

Contributor Author

@jhuber6 jhuber6 Nov 15, 2024


Shouldn't be necessary; I can make a PR to remove it again. Ah, you meant you did it already; never mind. Thanks for cleaning that up. Making the requested change was really annoying, so I did it off of the diff and forgot to delete that.

@@ -111,6 +111,18 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo {
     return getPointerWidthV(AddrSpace);
   }
 
+  virtual bool isAddressSpaceSupersetOf(LangAS A, LangAS B) const override {
+    // The flat address space AS(0) is a superset of all the other address
+    // spaces used by the backend target.
Contributor


This should exclude several address spaces. It should exclude region, and arguably also the fat pointers.

Contributor Author

@jhuber6 jhuber6 Nov 16, 2024


The logic should only permit 0-5. I was torn on rejecting AS(2); I can edit that.

Contributor


We should reject it; it's not flat addressable. We don't have a magic aperture constant for it, and the addrspacecast codegen will fail.
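A rough sketch of the restriction being discussed (my reading only; the exact set of flat-addressable address spaces is the point under review):

bool isAddressSpaceSupersetOf(LangAS A, LangAS B) const override {
  // Treat flat (the default AS) as a superset only of target address spaces that
  // are flat addressable: global(1), local(3), constant(4), private(5).
  // Region (AS 2) has no aperture, so an addrspacecast to flat cannot be emitted.
  if (A == B)
    return true;
  if (A != LangAS::Default || !isTargetAddressSpace(B))
    return false;
  unsigned AS = toTargetAddressSpace(B);
  return AS == 1 || AS == 3 || AS == 4 || AS == 5;
}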
