[OpenCL] Add missing OpenCL 3.0 features to OpenCLExtensions.def; revert header-only macros #168016
base: main
…ns.def Adds the remaining optional feature macros from the OpenCL C 3.0 spec (section 6.2.1 table). Targets can now enable these via OpenCLFeaturesMap returned by getSupportedOpenCLOpts().
Pull Request Overview
This PR adds missing OpenCL C 3.0 optional feature macros to the OpenCLExtensions.def file, ensuring completeness with the OpenCL C 3.0 specification (section 6.2.1 table). This enables targets to properly advertise support for these optional features through the OpenCLFeaturesMap.
- Adds 11 missing optional feature macros for OpenCL C 3.0
- Ensures all features from the OpenCL C 3.0 spec table are represented in the codebase
I would expect the clang/test/SemaOpenCL/features.cl tests to be extended to cover the new features.
@llvm/pr-subscribers-backend-amdgpu
Author: Wenju He (wenju-he)
Changes
Adds the remaining optional feature macros from the OpenCL C 3.0 spec (section 6.2.1 table). Targets can now enable these via OpenCLFeaturesMap returned by getSupportedOpenCLOpts().
Patch is 39.74 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/168016.diff
6 Files Affected:
diff --git a/clang/include/clang/Basic/OpenCLExtensions.def b/clang/include/clang/Basic/OpenCLExtensions.def
index 6f73b26137500..9fc8c31056f4d 100644
--- a/clang/include/clang/Basic/OpenCLExtensions.def
+++ b/clang/include/clang/Basic/OpenCLExtensions.def
@@ -78,10 +78,55 @@ OPENCL_EXTENSION(cl_khr_depth_images, true, 120)
OPENCL_EXTENSION(cl_khr_gl_msaa_sharing,true, 120)
// OpenCL 2.0.
+OPENCL_EXTENSION(cl_ext_float_atomics, true, 200)
+OPENCL_EXTENSION(cl_khr_extended_bit_ops, true, 200)
+OPENCL_EXTENSION(cl_khr_integer_dot_product, true, 200)
+OPENCL_EXTENSION(cl_khr_kernel_clock, true, 200)
OPENCL_EXTENSION(cl_khr_mipmap_image, true, 200)
OPENCL_EXTENSION(cl_khr_mipmap_image_writes, true, 200)
OPENCL_EXTENSION(cl_khr_srgb_image_writes, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_ballot, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_clustered_reduce, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_extended_types, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_non_uniform_arithmetic, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_non_uniform_vote, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_rotate, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_shuffle_relative, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_shuffle, true, 200)
OPENCL_EXTENSION(cl_khr_subgroups, true, 200)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_acq_rel, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_seq_cst, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_scope_all_devices, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_scope_device, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_device_enqueue, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_global_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_global_atomic_load_store, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_global_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_local_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_local_atomic_load_store, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_local_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_global_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_global_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_local_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_local_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_global_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_global_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_local_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_local_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_image_raw10_raw12, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_image_unorm_int_2_101010, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_image_unsigned_10x6_12x4_14x2, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_generic_address_space, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_images, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_integer_dot_product_input_4x8bit, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_integer_dot_product_input_4x8bit_packed, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_kernel_clock_scope_device, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_kernel_clock_scope_sub_group, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_kernel_clock_scope_work_group, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_pipes, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_program_scope_global_variables, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_read_write_images, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_work_group_collective_functions, false, 200, OCL_C_20)
// Clang Extensions.
OPENCL_EXTENSION(cl_clang_storage_class_specifiers, true, 100)
@@ -100,17 +145,9 @@ OPENCL_EXTENSION(cl_intel_subgroups_short, true, 120)
OPENCL_EXTENSION(cl_intel_device_side_avc_motion_estimation, true, 120)
// OpenCL C 3.0 features (6.2.1. Features)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_pipes, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_generic_address_space, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_acq_rel, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_seq_cst, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_subgroups, false, 300, OCL_C_30)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_3d_image_writes, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_device_enqueue, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_read_write_images, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_program_scope_global_variables, false, 300, OCL_C_30)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_fp64, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_images, false, 300, OCL_C_30)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_subgroups, false, 300, OCL_C_30)
#undef OPENCL_OPTIONALCOREFEATURE
#undef OPENCL_COREFEATURE
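The hunk above lowers the third column of several `__opencl_c_*` entries from 300 to 200. A minimal sketch of what that could mean for macro visibility follows, under the assumption that a feature macro is defined only when the target reports support and the language version is at least the entry's minimum version. `DemoEntry` and `demoMacroDefined` are hypothetical names and the rule is a simplification, not Clang's actual logic.

```cpp
// Hypothetical model of one table row plus the target's answer for it.
struct DemoEntry {
  unsigned MinVersion;  // third column of the entry (200 or 300 in the patch)
  bool TargetSupports;  // what the target reported for this feature
};

// Assumed rule: the macro is visible only if the target supports the feature
// and the language version has reached the entry's minimum version.
constexpr bool demoMacroDefined(DemoEntry E, unsigned LangVersion) {
  return E.TargetSupports && LangVersion >= E.MinVersion;
}

// With a minimum of 300, a CL2.0 compile (200) would not see the macro.
static_assert(!demoMacroDefined({300, true}, 200), "hidden before 3.0");
// With the minimum lowered to 200, the same compile would see it.
static_assert(demoMacroDefined({200, true}, 200), "visible from 2.0");
// Target support still gates the result in either case.
static_assert(!demoMacroDefined({200, false}, 300), "unsupported on target");
```

Under this assumption, lowering the minimum version is what would let a compiler-side table replace unconditional header definitions for OpenCL C 2.0.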
diff --git a/clang/lib/Basic/Targets/AMDGPU.h b/clang/lib/Basic/Targets/AMDGPU.h
index dfcc79402257a..95a3d133c07ac 100644
--- a/clang/lib/Basic/Targets/AMDGPU.h
+++ b/clang/lib/Basic/Targets/AMDGPU.h
@@ -320,6 +320,8 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo {
Opts["__opencl_c_3d_image_writes"] = true;
Opts["cl_khr_3d_image_writes"] = true;
Opts["__opencl_c_program_scope_global_variables"] = true;
+ Opts["__opencl_c_atomic_scope_all_devices"] = true;
+ Opts["__opencl_c_atomic_order_seq_cst"] = true;
if (GPUKind >= llvm::AMDGPU::GK_GFX700) {
Opts["__opencl_c_generic_address_space"] = true;
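The AMDGPU hunk enables two more features unconditionally and keeps `__opencl_c_generic_address_space` behind a GPU-generation check. The shape of that logic can be sketched as below, using a plain `std::map` and a made-up `DemoGpuKind` enum in place of the real Clang and LLVM types; `setDemoSupportedOpts` is a hypothetical helper, not the function in `AMDGPU.h`.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the GPU generation enum used by the real target.
enum class DemoGpuKind { Gfx600 = 600, Gfx700 = 700, Gfx900 = 900 };

// Hypothetical stand-in for the feature map a target fills in.
using DemoFeatureMap = std::map<std::string, bool>;

// Mirrors the structure of the hunk: some features are always on, one is
// gated on the GPU generation.
inline void setDemoSupportedOpts(DemoFeatureMap &Opts, DemoGpuKind Kind) {
  Opts["__opencl_c_program_scope_global_variables"] = true;
  Opts["__opencl_c_atomic_scope_all_devices"] = true;  // added by the patch
  Opts["__opencl_c_atomic_order_seq_cst"] = true;      // added by the patch
  if (Kind >= DemoGpuKind::Gfx700)
    Opts["__opencl_c_generic_address_space"] = true;
}
```

In this sketch a `Gfx600` target never gets the generic address space entry, which is presumably related to the header test in this patch adding `-target-cpu gfx700` to its AMDGPU `RUN` lines.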
diff --git a/clang/lib/Headers/opencl-c-base.h b/clang/lib/Headers/opencl-c-base.h
index 414f10ad832ce..898026c66614a 100644
--- a/clang/lib/Headers/opencl-c-base.h
+++ b/clang/lib/Headers/opencl-c-base.h
@@ -9,105 +9,6 @@
#ifndef _OPENCL_BASE_H_
#define _OPENCL_BASE_H_
-// Define extension macros
-
-#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
-// For SPIR and SPIR-V all extensions are supported.
-#if defined(__SPIR__) || defined(__SPIRV__)
-#define cl_khr_subgroup_extended_types 1
-#define cl_khr_subgroup_non_uniform_vote 1
-#define cl_khr_subgroup_ballot 1
-#define cl_khr_subgroup_non_uniform_arithmetic 1
-#define cl_khr_subgroup_shuffle 1
-#define cl_khr_subgroup_shuffle_relative 1
-#define cl_khr_subgroup_clustered_reduce 1
-#define cl_khr_subgroup_rotate 1
-#define cl_khr_extended_bit_ops 1
-#define cl_khr_integer_dot_product 1
-#define __opencl_c_integer_dot_product_input_4x8bit 1
-#define __opencl_c_integer_dot_product_input_4x8bit_packed 1
-#define cl_ext_float_atomics 1
-#ifdef cl_khr_fp16
-#define __opencl_c_ext_fp16_global_atomic_load_store 1
-#define __opencl_c_ext_fp16_local_atomic_load_store 1
-#define __opencl_c_ext_fp16_global_atomic_add 1
-#define __opencl_c_ext_fp16_local_atomic_add 1
-#define __opencl_c_ext_fp16_global_atomic_min_max 1
-#define __opencl_c_ext_fp16_local_atomic_min_max 1
-#endif
-#ifdef cl_khr_fp64
-#define __opencl_c_ext_fp64_global_atomic_add 1
-#define __opencl_c_ext_fp64_local_atomic_add 1
-#define __opencl_c_ext_fp64_global_atomic_min_max 1
-#define __opencl_c_ext_fp64_local_atomic_min_max 1
-#endif
-#define __opencl_c_ext_fp32_global_atomic_add 1
-#define __opencl_c_ext_fp32_local_atomic_add 1
-#define __opencl_c_ext_fp32_global_atomic_min_max 1
-#define __opencl_c_ext_fp32_local_atomic_min_max 1
-#define __opencl_c_ext_image_raw10_raw12 1
-#define __opencl_c_ext_image_unorm_int_2_101010 1
-#define __opencl_c_ext_image_unsigned_10x6_12x4_14x2 1
-#define cl_khr_kernel_clock 1
-#define __opencl_c_kernel_clock_scope_device 1
-#define __opencl_c_kernel_clock_scope_work_group 1
-#define __opencl_c_kernel_clock_scope_sub_group 1
-
-#endif // defined(__SPIR__) || defined(__SPIRV__)
-#endif // (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
-
-// Define feature macros for OpenCL C 2.0
-#if (__OPENCL_CPP_VERSION__ == 100 || __OPENCL_C_VERSION__ == 200)
-#define __opencl_c_pipes 1
-#define __opencl_c_generic_address_space 1
-#define __opencl_c_work_group_collective_functions 1
-#define __opencl_c_atomic_order_acq_rel 1
-#define __opencl_c_atomic_order_seq_cst 1
-#define __opencl_c_atomic_scope_device 1
-#define __opencl_c_atomic_scope_all_devices 1
-#define __opencl_c_device_enqueue 1
-#define __opencl_c_read_write_images 1
-#define __opencl_c_program_scope_global_variables 1
-#define __opencl_c_images 1
-#endif
-
-// Define header-only feature macros for OpenCL C 3.0.
-#if (__OPENCL_CPP_VERSION__ == 202100 || __OPENCL_C_VERSION__ == 300)
-// For the SPIR and SPIR-V target all features are supported.
-#if defined(__SPIR__) || defined(__SPIRV__)
-#define __opencl_c_work_group_collective_functions 1
-#define __opencl_c_atomic_order_seq_cst 1
-#define __opencl_c_atomic_scope_device 1
-#define __opencl_c_atomic_scope_all_devices 1
-#define __opencl_c_read_write_images 1
-#endif // defined(__SPIR__)
-
-#endif // (__OPENCL_CPP_VERSION__ == 202100 || __OPENCL_C_VERSION__ == 300)
-
-// Undefine any feature macros that have been explicitly disabled using
-// an __undef_<feature> macro.
-#ifdef __undef___opencl_c_work_group_collective_functions
-#undef __opencl_c_work_group_collective_functions
-#endif
-#ifdef __undef___opencl_c_atomic_order_seq_cst
-#undef __opencl_c_atomic_order_seq_cst
-#endif
-#ifdef __undef___opencl_c_atomic_scope_device
-#undef __opencl_c_atomic_scope_device
-#endif
-#ifdef __undef___opencl_c_atomic_scope_all_devices
-#undef __opencl_c_atomic_scope_all_devices
-#endif
-#ifdef __undef___opencl_c_read_write_images
-#undef __opencl_c_read_write_images
-#endif
-#ifdef __undef___opencl_c_integer_dot_product_input_4x8bit
-#undef __opencl_c_integer_dot_product_input_4x8bit
-#endif
-#ifdef __undef___opencl_c_integer_dot_product_input_4x8bit_packed
-#undef __opencl_c_integer_dot_product_input_4x8bit_packed
-#endif
-
#if !defined(__opencl_c_generic_address_space)
// Internal feature macro to provide named (global, local, private) address
// space overloads for builtin functions that take a pointer argument.
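The block removed from `opencl-c-base.h` followed a define-then-opt-out pattern: the header defined a feature macro, then undefined it again if a matching `__undef_<feature>` macro was present. The sketch below reproduces that pattern with hypothetical `DEMO_*` names (none of them exist in Clang), simulating an opt-out for feature B only.

```cpp
// Simulate an opt-out request for feature B. In the removed header code the
// equivalent macro had the form __undef_<feature>.
#define DEMO_UNDEF_FEATURE_B 1

// Step 1: the header advertises both features unconditionally.
#define DEMO_FEATURE_A 1
#define DEMO_FEATURE_B 1

// Step 2: any feature with a matching opt-out macro is withdrawn again.
#ifdef DEMO_UNDEF_FEATURE_A
#undef DEMO_FEATURE_A
#endif
#ifdef DEMO_UNDEF_FEATURE_B
#undef DEMO_FEATURE_B
#endif

// Helpers that report which macros survived preprocessing.
constexpr bool demoHasFeatureA() {
#ifdef DEMO_FEATURE_A
  return true;
#else
  return false;
#endif
}

constexpr bool demoHasFeatureB() {
#ifdef DEMO_FEATURE_B
  return true;
#else
  return false;
#endif
}
```

Here feature A stays defined and feature B is withdrawn. The patch drops this header-level mechanism for the listed macros in favor of entries in `OpenCLExtensions.def`.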
diff --git a/clang/test/Headers/opencl-c-header.cl b/clang/test/Headers/opencl-c-header.cl
index 17cbb67f26038..e26f16827b20f 100644
--- a/clang/test/Headers/opencl-c-header.cl
+++ b/clang/test/Headers/opencl-c-header.cl
@@ -33,7 +33,7 @@
// ===
// Compile for OpenCL 2.0 for the first time. The module should change.
// RUN: %clang_cc1 -triple spir-unknown-unknown -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -fdisable-module-hash -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
-// RUN: not diff %t/1_0.pcm %t/opencl_c.pcm
+// RUN: not diff %t/1_0.pcm %t/opencl_c.pcm > /dev/null
// RUN: chmod u-w %t/opencl_c.pcm
// ===
@@ -44,10 +44,10 @@
// RUN: rm -rf %t
// RUN: mkdir -p %t
// RUN: %clang_cc1 -triple spir64-unknown-unknown -emit-llvm -o - -cl-std=CL1.2 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK --check-prefix=CHECK-MOD %s
-// RUN: %clang_cc1 -triple amdgcn--amdhsa -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
+// RUN: %clang_cc1 -triple amdgcn--amdhsa -target-cpu gfx700 -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
// RUN: chmod u-w %t
// RUN: %clang_cc1 -triple spir64-unknown-unknown -emit-llvm -o - -cl-std=CL1.2 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK --check-prefix=CHECK-MOD %s
-// RUN: %clang_cc1 -triple amdgcn--amdhsa -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
+// RUN: %clang_cc1 -triple amdgcn--amdhsa -target-cpu gfx700 -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
// RUN: chmod u+w %t
// Verify that called builtins occur in the generated IR.
diff --git a/clang/test/SemaOpenCL/extension-version.cl b/clang/test/SemaOpenCL/extension-version.cl
index 4d92eff6ae3ac..35a1818706f35 100644
--- a/clang/test/SemaOpenCL/extension-version.cl
+++ b/clang/test/SemaOpenCL/extension-version.cl
@@ -165,6 +165,150 @@
#endif
#pragma OPENCL EXTENSION cl_khr_subgroups : enable
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_ext_float_atomics
+#error "Missing cl_ext_float_atomics define"
+#endif
+#else
+#ifdef cl_ext_float_atomics
+#error "Incorrect cl_ext_float_atomics define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_ext_float_atomics' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_ext_float_atomics : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_extended_bit_ops
+#error "Missing cl_khr_extended_bit_ops define"
+#endif
+#else
+#ifdef cl_khr_extended_bit_ops
+#error "Incorrect cl_khr_extended_bit_ops define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_extended_bit_ops' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_extended_bit_ops : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_integer_dot_product
+#error "Missing cl_khr_integer_dot_product define"
+#endif
+#else
+#ifdef cl_khr_integer_dot_product
+#error "Incorrect cl_khr_integer_dot_product define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_integer_dot_product' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_integer_dot_product : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_kernel_clock
+#error "Missing cl_khr_kernel_clock define"
+#endif
+#else
+#ifdef cl_khr_kernel_clock
+#error "Incorrect cl_khr_kernel_clock define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_kernel_clock' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_kernel_clock : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_ballot
+#error "Missing cl_khr_subgroup_ballot define"
+#endif
+#else
+#ifdef cl_khr_subgroup_ballot
+#error "Incorrect cl_khr_subgroup_ballot define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_ballot' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_ballot : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_clustered_reduce
+#error "Missing cl_khr_subgroup_clustered_reduce define"
+#endif
+#else
+#ifdef cl_khr_subgroup_clustered_reduce
+#error "Incorrect cl_khr_subgroup_clustered_reduce define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_clustered_reduce' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_clustered_reduce : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_extended_types
+#error "Missing cl_khr_subgroup_extended_types define"
+#endif
+#else
+#ifdef cl_khr_subgroup_extended_types
+#error "Incorrect cl_khr_subgroup_extended_types define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_extended_types' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_extended_types : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_non_uniform_arithmetic
+#error "Missing cl_khr_subgroup_non_uniform_arithmetic define"
+#endif
+#else
+#ifdef cl_khr_subgroup_non_uniform_arithmetic
+#error "Incorrect cl_khr_subgroup_non_uniform_arithmetic define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_non_uniform_arithmetic' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_non_uniform_arithmetic : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_non_uniform_vote
+#error "Missing cl_khr_subgroup_non_uniform_vote define"
+#endif
+#else
+#ifdef cl_khr_subgroup_non_uniform_vote
+#error "Incorrect cl_khr_subgroup_non_uniform_vote define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_non_uniform_vote' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_non_uniform_vote : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_rotate
+#error "Missing cl_khr_subgroup_rotate define"
+#endif
+#else
+#ifdef cl_khr_subgroup_rotate
+#error "Incorrect cl_khr_subgroup_rotate define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_rotate' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_rotate : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_shuffle_relative
+#error "Missing cl_khr_subgroup_shuffle_relative define"
+#endif
+#else
+#ifdef cl_khr_subgroup_shuffle_relative
+#error "Incorrect cl_khr_subgroup_shuffle_relative define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_shuffle_relative' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_shuffle_relative : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_shuffle
+#error "Missing cl_khr_subgroup_shuffle define"
+#endif
+#else
+#ifdef cl_khr_subgroup_shuffle
+#error "Incorrect cl_khr_subgroup_shuffle define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_shuffle' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_shuffle : enable
+
#ifndef cl_amd_media_ops
#error "Missing cl_amd_media_ops define"
#endif
@@ -224,14 +368,62 @@
//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_order_acq_rel' unknown or does not require pragma - ignoring}}
#pragma OPENCL EXTENSION __opencl_c_atomic_order_seq_cst : disable
//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_order_seq_cst' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_atomic_scope_all_devices : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_scope_all_devices' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_atomic_scope_device : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_scope_device' unknown or does not require pragma - ignoring}}
#pragma OPENCL EXTENSION __opencl_c_device_enqueue : disable
//expected-warning@-1{{OpenCL extension '__opencl_c_device_enqueue' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_ext_fp16_global_atomic_add : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_ext_fp16_global_atomic_add' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_ext_fp16_global_atomic_load_store : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_ext_fp16_global_atomic_load_store' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_ext_fp16_global_atomic_min_max : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_ext_fp16_global_atomic_min_max' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_ext_fp16_local_atomic_add : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_ext_fp16_local_atomic_add' unknown or does not require pragma - ignoring}}...
[truncated]
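Each block added to `extension-version.cl` above has the same shape: from language version 200 upward the extension macro must be defined, and below that it must not be. A compact sketch of that check follows, with a hypothetical `DEMO_LANG_VERSION` knob and a hypothetical `demo_ext` macro standing in for the compiler-provided ones. The real test also exercises `#pragma OPENCL EXTENSION` and its diagnostics, which plain C++ cannot reproduce.

```cpp
// Hypothetical stand-in for __OPENCL_C_VERSION__; change to 120 to exercise
// the other branch.
#define DEMO_LANG_VERSION 200

// Pretend the compiler predefines the extension macro from version 200 on.
#if DEMO_LANG_VERSION >= 200
#define demo_ext 1
#endif

// Same structure as the test blocks: the macro is required at or above 200
// and forbidden below it.
#if DEMO_LANG_VERSION >= 200
#ifndef demo_ext
#error "Missing demo_ext define"
#endif
#else
#ifdef demo_ext
#error "Incorrect demo_ext define"
#endif
#endif

// Reaching this point means neither #error fired for the chosen version.
constexpr bool demoExtensionCheckPassed() { return true; }
```

If the simulated compiler and the check disagree, for example by defining `demo_ext` unconditionally while `DEMO_LANG_VERSION` is 120, compilation stops at the matching `#error`.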
@llvm/pr-subscribers-backend-x86
Author: Wenju He (wenju-he)
Changes
Adds the remaining optional feature macros from the OpenCL C 3.0 spec (section 6.2.1 table). Targets can now enable these via OpenCLFeaturesMap returned by getSupportedOpenCLOpts().
Patch is 39.74 KiB, full version: https://github.com/llvm/llvm-project/pull/168016.diff
// ===
@@ -44,10 +44,10 @@
// RUN: rm -rf %t
// RUN: mkdir -p %t
// RUN: %clang_cc1 -triple spir64-unknown-unknown -emit-llvm -o - -cl-std=CL1.2 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK --check-prefix=CHECK-MOD %s
-// RUN: %clang_cc1 -triple amdgcn--amdhsa -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
+// RUN: %clang_cc1 -triple amdgcn--amdhsa -target-cpu gfx700 -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
// RUN: chmod u-w %t
// RUN: %clang_cc1 -triple spir64-unknown-unknown -emit-llvm -o - -cl-std=CL1.2 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK --check-prefix=CHECK-MOD %s
-// RUN: %clang_cc1 -triple amdgcn--amdhsa -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
+// RUN: %clang_cc1 -triple amdgcn--amdhsa -target-cpu gfx700 -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
// RUN: chmod u+w %t
// Verify that called builtins occur in the generated IR.
diff --git a/clang/test/SemaOpenCL/extension-version.cl b/clang/test/SemaOpenCL/extension-version.cl
index 4d92eff6ae3ac..35a1818706f35 100644
--- a/clang/test/SemaOpenCL/extension-version.cl
+++ b/clang/test/SemaOpenCL/extension-version.cl
@@ -165,6 +165,150 @@
#endif
#pragma OPENCL EXTENSION cl_khr_subgroups : enable
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_ext_float_atomics
+#error "Missing cl_ext_float_atomics define"
+#endif
+#else
+#ifdef cl_ext_float_atomics
+#error "Incorrect cl_ext_float_atomics define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_ext_float_atomics' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_ext_float_atomics : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_extended_bit_ops
+#error "Missing cl_khr_extended_bit_ops define"
+#endif
+#else
+#ifdef cl_khr_extended_bit_ops
+#error "Incorrect cl_khr_extended_bit_ops define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_extended_bit_ops' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_extended_bit_ops : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_integer_dot_product
+#error "Missing cl_khr_integer_dot_product define"
+#endif
+#else
+#ifdef cl_khr_integer_dot_product
+#error "Incorrect cl_khr_integer_dot_product define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_integer_dot_product' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_integer_dot_product : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_kernel_clock
+#error "Missing cl_khr_kernel_clock define"
+#endif
+#else
+#ifdef cl_khr_kernel_clock
+#error "Incorrect cl_khr_kernel_clock define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_kernel_clock' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_kernel_clock : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_ballot
+#error "Missing cl_khr_subgroup_ballot define"
+#endif
+#else
+#ifdef cl_khr_subgroup_ballot
+#error "Incorrect cl_khr_subgroup_ballot define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_ballot' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_ballot : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_clustered_reduce
+#error "Missing cl_khr_subgroup_clustered_reduce define"
+#endif
+#else
+#ifdef cl_khr_subgroup_clustered_reduce
+#error "Incorrect cl_khr_subgroup_clustered_reduce define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_clustered_reduce' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_clustered_reduce : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_extended_types
+#error "Missing cl_khr_subgroup_extended_types define"
+#endif
+#else
+#ifdef cl_khr_subgroup_extended_types
+#error "Incorrect cl_khr_subgroup_extended_types define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_extended_types' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_extended_types : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_non_uniform_arithmetic
+#error "Missing cl_khr_subgroup_non_uniform_arithmetic define"
+#endif
+#else
+#ifdef cl_khr_subgroup_non_uniform_arithmetic
+#error "Incorrect cl_khr_subgroup_non_uniform_arithmetic define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_non_uniform_arithmetic' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_non_uniform_arithmetic : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_non_uniform_vote
+#error "Missing cl_khr_subgroup_non_uniform_vote define"
+#endif
+#else
+#ifdef cl_khr_subgroup_non_uniform_vote
+#error "Incorrect cl_khr_subgroup_non_uniform_vote define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_non_uniform_vote' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_non_uniform_vote : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_rotate
+#error "Missing cl_khr_subgroup_rotate define"
+#endif
+#else
+#ifdef cl_khr_subgroup_rotate
+#error "Incorrect cl_khr_subgroup_rotate define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_rotate' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_rotate : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_shuffle_relative
+#error "Missing cl_khr_subgroup_shuffle_relative define"
+#endif
+#else
+#ifdef cl_khr_subgroup_shuffle_relative
+#error "Incorrect cl_khr_subgroup_shuffle_relative define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_shuffle_relative' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_shuffle_relative : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_shuffle
+#error "Missing cl_khr_subgroup_shuffle define"
+#endif
+#else
+#ifdef cl_khr_subgroup_shuffle
+#error "Incorrect cl_khr_subgroup_shuffle define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_shuffle' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_shuffle : enable
+
#ifndef cl_amd_media_ops
#error "Missing cl_amd_media_ops define"
#endif
@@ -224,14 +368,62 @@
//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_order_acq_rel' unknown or does not require pragma - ignoring}}
#pragma OPENCL EXTENSION __opencl_c_atomic_order_seq_cst : disable
//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_order_seq_cst' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_atomic_scope_all_devices : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_scope_all_devices' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_atomic_scope_device : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_scope_device' unknown or does not require pragma - ignoring}}
#pragma OPENCL EXTENSION __opencl_c_device_enqueue : disable
//expected-warning@-1{{OpenCL extension '__opencl_c_device_enqueue' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_ext_fp16_global_atomic_add : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_ext_fp16_global_atomic_add' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_ext_fp16_global_atomic_load_store : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_ext_fp16_global_atomic_load_store' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_ext_fp16_global_atomic_min_max : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_ext_fp16_global_atomic_min_max' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_ext_fp16_local_atomic_add : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_ext_fp16_local_atomic_add' unknown or does not require pragma - ignoring}}...
[truncated]
svenvh
left a comment
LGTM with some formatting nits.
cc: @arsenm for the AMDGPU.h changes.
Adds the remaining optional feature macros from the OpenCL C 3.0 spec (section 6.2.1 table). Targets can now enable these via OpenCLFeaturesMap returned by getSupportedOpenCLOpts().
Revert a84599f (header‑only feature macros).
Header‑only macros are difficult to disable on SPIR-V targets,
and the prior undef approach (a60b8f4) does not scale.
After this PR, they can be disabled via -cl-ext=-<feature>.
KhronosGroup/OpenCL-Docs#1328 also notes that the unconditional definition of the header-only macros in opencl-c-base.h should be removed.
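A minimal sketch of the source-level guard that benefits from this change, assuming the feature macro is supplied by the compiler rather than by opencl-c-base.h. The helper name `seq_cst_available` is hypothetical and not part of this PR. The snippet is written as plain C so it also compiles on a host, where the macro is not defined; that mimics the disabled case.

```c
/* Hypothetical helper, for illustration only: reports whether the
 * __opencl_c_atomic_order_seq_cst feature macro is visible to the
 * preprocessor. A plain host C compile does not define the macro,
 * so the fallback branch is taken there. */
static int seq_cst_available(void) {
#if defined(__opencl_c_atomic_order_seq_cst)
  return 1; /* feature macro is defined: seq_cst code path may be used */
#else
  return 0; /* feature macro is absent: use a fallback code path */
#endif
}
```

Under the description above, the same guard in a .cl file compiled for OpenCL C 3.0 would be expected to take the fallback branch when the feature is turned off with -cl-ext=-__opencl_c_atomic_order_seq_cst, without needing an __undef_<feature> macro.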