Skip to content

Conversation

@wenju-he
Copy link
Contributor

@wenju-he wenju-he commented Nov 14, 2025

Adds the remaining optional feature macros from the OpenCL C 3.0 spec (section 6.2.1 table). Targets can now enable these via OpenCLFeaturesMap returned by getSupportedOpenCLOpts().

Revert a84599f (header‑only feature macros).
Header‑only macros are difficult to disable on SPIR-V targets,
and the prior undef approach (a60b8f4) does not scale.
After this PR, they can be disabled via -cl-ext=-<feature>.

KhronosGroup/OpenCL-Docs#1328 also notes that unconditional
definition of the header‑only macros in opencl-c-base.h should be removed.

…ns.def

Adds the remaining optional feature macros from the OpenCL C 3.0 spec (section
6.2.1 table). Targets can now enable these via OpenCLFeaturesMap returned by
getSupportedOpenCLOpts().
@wenju-he wenju-he requested a review from Copilot November 14, 2025 04:43
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Nov 14, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds missing OpenCL C 3.0 optional feature macros to the OpenCLExtensions.def file, ensuring completeness with the OpenCL C 3.0 specification (section 6.2.1 table). This enables targets to properly advertise support for these optional features through the OpenCLFeaturesMap.

  • Adds 11 missing optional feature macros for OpenCL C 3.0
  • Ensures all features from the OpenCL C 3.0 spec table are represented in the codebase

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@wenju-he wenju-he requested review from bader and svenvh November 14, 2025 04:43
@bader
Copy link
Contributor

bader commented Nov 18, 2025

I would expect clang/test/SemaOpenCL/features.cl tests will be extended to cover new features.
Potentially clang/test/Headers/opencl-c-header.cl and clang/test/SemaOpenCL/extension-version.cl as well.

@llvmbot llvmbot added backend:AMDGPU backend:X86 clang:headers Headers provided by Clang, e.g. for intrinsics labels Nov 18, 2025
@llvmbot
Copy link
Member

llvmbot commented Nov 18, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Wenju He (wenju-he)

Changes

Adds the remaining optional feature macros from the OpenCL C 3.0 spec (section 6.2.1 table). Targets can now enable these via OpenCLFeaturesMap returned by getSupportedOpenCLOpts().


Patch is 39.74 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/168016.diff

6 Files Affected:

  • (modified) clang/include/clang/Basic/OpenCLExtensions.def (+46-9)
  • (modified) clang/lib/Basic/Targets/AMDGPU.h (+2)
  • (modified) clang/lib/Headers/opencl-c-base.h (-99)
  • (modified) clang/test/Headers/opencl-c-header.cl (+3-3)
  • (modified) clang/test/SemaOpenCL/extension-version.cl (+244)
  • (modified) clang/test/SemaOpenCL/features.cl (+66-34)
diff --git a/clang/include/clang/Basic/OpenCLExtensions.def b/clang/include/clang/Basic/OpenCLExtensions.def
index 6f73b26137500..9fc8c31056f4d 100644
--- a/clang/include/clang/Basic/OpenCLExtensions.def
+++ b/clang/include/clang/Basic/OpenCLExtensions.def
@@ -78,10 +78,55 @@ OPENCL_EXTENSION(cl_khr_depth_images, true, 120)
 OPENCL_EXTENSION(cl_khr_gl_msaa_sharing,true, 120)
 
 // OpenCL 2.0.
+OPENCL_EXTENSION(cl_ext_float_atomics, true, 200)
+OPENCL_EXTENSION(cl_khr_extended_bit_ops, true, 200)
+OPENCL_EXTENSION(cl_khr_integer_dot_product, true, 200)
+OPENCL_EXTENSION(cl_khr_kernel_clock, true, 200)
 OPENCL_EXTENSION(cl_khr_mipmap_image, true, 200)
 OPENCL_EXTENSION(cl_khr_mipmap_image_writes, true, 200)
 OPENCL_EXTENSION(cl_khr_srgb_image_writes, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_ballot, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_clustered_reduce, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_extended_types, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_non_uniform_arithmetic, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_non_uniform_vote, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_rotate, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_shuffle_relative, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_shuffle, true, 200)
 OPENCL_EXTENSION(cl_khr_subgroups, true, 200)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_acq_rel, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_seq_cst, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_scope_all_devices, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_scope_device, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_device_enqueue, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_global_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_global_atomic_load_store, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_global_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_local_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_local_atomic_load_store, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_local_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_global_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_global_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_local_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_local_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_global_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_global_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_local_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_local_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_image_raw10_raw12, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_image_unorm_int_2_101010, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_image_unsigned_10x6_12x4_14x2, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_generic_address_space, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_images, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_integer_dot_product_input_4x8bit, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_integer_dot_product_input_4x8bit_packed, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_kernel_clock_scope_device, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_kernel_clock_scope_sub_group, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_kernel_clock_scope_work_group, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_pipes, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_program_scope_global_variables, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_read_write_images, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_work_group_collective_functions, false, 200, OCL_C_20)
 
 // Clang Extensions.
 OPENCL_EXTENSION(cl_clang_storage_class_specifiers, true, 100)
@@ -100,17 +145,9 @@ OPENCL_EXTENSION(cl_intel_subgroups_short, true, 120)
 OPENCL_EXTENSION(cl_intel_device_side_avc_motion_estimation, true, 120)
 
 // OpenCL C 3.0 features (6.2.1. Features)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_pipes, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_generic_address_space, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_acq_rel, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_seq_cst, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_subgroups, false, 300, OCL_C_30)
 OPENCL_OPTIONALCOREFEATURE(__opencl_c_3d_image_writes, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_device_enqueue, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_read_write_images, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_program_scope_global_variables, false, 300, OCL_C_30)
 OPENCL_OPTIONALCOREFEATURE(__opencl_c_fp64, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_images, false, 300, OCL_C_30)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_subgroups, false, 300, OCL_C_30)
 
 #undef OPENCL_OPTIONALCOREFEATURE
 #undef OPENCL_COREFEATURE
diff --git a/clang/lib/Basic/Targets/AMDGPU.h b/clang/lib/Basic/Targets/AMDGPU.h
index dfcc79402257a..95a3d133c07ac 100644
--- a/clang/lib/Basic/Targets/AMDGPU.h
+++ b/clang/lib/Basic/Targets/AMDGPU.h
@@ -320,6 +320,8 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo {
       Opts["__opencl_c_3d_image_writes"] = true;
       Opts["cl_khr_3d_image_writes"] = true;
       Opts["__opencl_c_program_scope_global_variables"] = true;
+      Opts["__opencl_c_atomic_scope_all_devices"] = true;
+      Opts["__opencl_c_atomic_order_seq_cst"] = true;
 
       if (GPUKind >= llvm::AMDGPU::GK_GFX700) {
         Opts["__opencl_c_generic_address_space"] = true;
diff --git a/clang/lib/Headers/opencl-c-base.h b/clang/lib/Headers/opencl-c-base.h
index 414f10ad832ce..898026c66614a 100644
--- a/clang/lib/Headers/opencl-c-base.h
+++ b/clang/lib/Headers/opencl-c-base.h
@@ -9,105 +9,6 @@
 #ifndef _OPENCL_BASE_H_
 #define _OPENCL_BASE_H_
 
-// Define extension macros
-
-#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
-// For SPIR and SPIR-V all extensions are supported.
-#if defined(__SPIR__) || defined(__SPIRV__)
-#define cl_khr_subgroup_extended_types 1
-#define cl_khr_subgroup_non_uniform_vote 1
-#define cl_khr_subgroup_ballot 1
-#define cl_khr_subgroup_non_uniform_arithmetic 1
-#define cl_khr_subgroup_shuffle 1
-#define cl_khr_subgroup_shuffle_relative 1
-#define cl_khr_subgroup_clustered_reduce 1
-#define cl_khr_subgroup_rotate 1
-#define cl_khr_extended_bit_ops 1
-#define cl_khr_integer_dot_product 1
-#define __opencl_c_integer_dot_product_input_4x8bit 1
-#define __opencl_c_integer_dot_product_input_4x8bit_packed 1
-#define cl_ext_float_atomics 1
-#ifdef cl_khr_fp16
-#define __opencl_c_ext_fp16_global_atomic_load_store 1
-#define __opencl_c_ext_fp16_local_atomic_load_store 1
-#define __opencl_c_ext_fp16_global_atomic_add 1
-#define __opencl_c_ext_fp16_local_atomic_add 1
-#define __opencl_c_ext_fp16_global_atomic_min_max 1
-#define __opencl_c_ext_fp16_local_atomic_min_max 1
-#endif
-#ifdef cl_khr_fp64
-#define __opencl_c_ext_fp64_global_atomic_add 1
-#define __opencl_c_ext_fp64_local_atomic_add 1
-#define __opencl_c_ext_fp64_global_atomic_min_max 1
-#define __opencl_c_ext_fp64_local_atomic_min_max 1
-#endif
-#define __opencl_c_ext_fp32_global_atomic_add 1
-#define __opencl_c_ext_fp32_local_atomic_add 1
-#define __opencl_c_ext_fp32_global_atomic_min_max 1
-#define __opencl_c_ext_fp32_local_atomic_min_max 1
-#define __opencl_c_ext_image_raw10_raw12 1
-#define __opencl_c_ext_image_unorm_int_2_101010 1
-#define __opencl_c_ext_image_unsigned_10x6_12x4_14x2 1
-#define cl_khr_kernel_clock 1
-#define __opencl_c_kernel_clock_scope_device 1
-#define __opencl_c_kernel_clock_scope_work_group 1
-#define __opencl_c_kernel_clock_scope_sub_group 1
-
-#endif // defined(__SPIR__) || defined(__SPIRV__)
-#endif // (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
-
-// Define feature macros for OpenCL C 2.0
-#if (__OPENCL_CPP_VERSION__ == 100 || __OPENCL_C_VERSION__ == 200)
-#define __opencl_c_pipes 1
-#define __opencl_c_generic_address_space 1
-#define __opencl_c_work_group_collective_functions 1
-#define __opencl_c_atomic_order_acq_rel 1
-#define __opencl_c_atomic_order_seq_cst 1
-#define __opencl_c_atomic_scope_device 1
-#define __opencl_c_atomic_scope_all_devices 1
-#define __opencl_c_device_enqueue 1
-#define __opencl_c_read_write_images 1
-#define __opencl_c_program_scope_global_variables 1
-#define __opencl_c_images 1
-#endif
-
-// Define header-only feature macros for OpenCL C 3.0.
-#if (__OPENCL_CPP_VERSION__ == 202100 || __OPENCL_C_VERSION__ == 300)
-// For the SPIR and SPIR-V target all features are supported.
-#if defined(__SPIR__) || defined(__SPIRV__)
-#define __opencl_c_work_group_collective_functions 1
-#define __opencl_c_atomic_order_seq_cst 1
-#define __opencl_c_atomic_scope_device 1
-#define __opencl_c_atomic_scope_all_devices 1
-#define __opencl_c_read_write_images 1
-#endif // defined(__SPIR__)
-
-#endif // (__OPENCL_CPP_VERSION__ == 202100 || __OPENCL_C_VERSION__ == 300)
-
-// Undefine any feature macros that have been explicitly disabled using
-// an __undef_<feature> macro.
-#ifdef __undef___opencl_c_work_group_collective_functions
-#undef __opencl_c_work_group_collective_functions
-#endif
-#ifdef __undef___opencl_c_atomic_order_seq_cst
-#undef __opencl_c_atomic_order_seq_cst
-#endif
-#ifdef __undef___opencl_c_atomic_scope_device
-#undef __opencl_c_atomic_scope_device
-#endif
-#ifdef __undef___opencl_c_atomic_scope_all_devices
-#undef __opencl_c_atomic_scope_all_devices
-#endif
-#ifdef __undef___opencl_c_read_write_images
-#undef __opencl_c_read_write_images
-#endif
-#ifdef __undef___opencl_c_integer_dot_product_input_4x8bit
-#undef __opencl_c_integer_dot_product_input_4x8bit
-#endif
-#ifdef __undef___opencl_c_integer_dot_product_input_4x8bit_packed
-#undef __opencl_c_integer_dot_product_input_4x8bit_packed
-#endif
-
 #if !defined(__opencl_c_generic_address_space)
 // Internal feature macro to provide named (global, local, private) address
 // space overloads for builtin functions that take a pointer argument.
diff --git a/clang/test/Headers/opencl-c-header.cl b/clang/test/Headers/opencl-c-header.cl
index 17cbb67f26038..e26f16827b20f 100644
--- a/clang/test/Headers/opencl-c-header.cl
+++ b/clang/test/Headers/opencl-c-header.cl
@@ -33,7 +33,7 @@
 // ===
 // Compile for OpenCL 2.0 for the first time. The module should change.
 // RUN: %clang_cc1 -triple spir-unknown-unknown -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -fdisable-module-hash -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
-// RUN: not diff %t/1_0.pcm %t/opencl_c.pcm
+// RUN: not diff %t/1_0.pcm %t/opencl_c.pcm > /dev/null
 // RUN: chmod u-w %t/opencl_c.pcm
 
 // ===
@@ -44,10 +44,10 @@
 // RUN: rm -rf %t
 // RUN: mkdir -p %t
 // RUN: %clang_cc1 -triple spir64-unknown-unknown -emit-llvm -o - -cl-std=CL1.2 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK --check-prefix=CHECK-MOD %s
-// RUN: %clang_cc1 -triple amdgcn--amdhsa -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
+// RUN: %clang_cc1 -triple amdgcn--amdhsa -target-cpu gfx700 -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
 // RUN: chmod u-w %t 
 // RUN: %clang_cc1 -triple spir64-unknown-unknown -emit-llvm -o - -cl-std=CL1.2 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK --check-prefix=CHECK-MOD %s
-// RUN: %clang_cc1 -triple amdgcn--amdhsa -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
+// RUN: %clang_cc1 -triple amdgcn--amdhsa -target-cpu gfx700 -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
 // RUN: chmod u+w %t
 
 // Verify that called builtins occur in the generated IR.
diff --git a/clang/test/SemaOpenCL/extension-version.cl b/clang/test/SemaOpenCL/extension-version.cl
index 4d92eff6ae3ac..35a1818706f35 100644
--- a/clang/test/SemaOpenCL/extension-version.cl
+++ b/clang/test/SemaOpenCL/extension-version.cl
@@ -165,6 +165,150 @@
 #endif
 #pragma OPENCL EXTENSION cl_khr_subgroups : enable
 
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_ext_float_atomics
+#error "Missing cl_ext_float_atomics define"
+#endif
+#else
+#ifdef cl_ext_float_atomics
+#error "Incorrect cl_ext_float_atomics define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_ext_float_atomics' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_ext_float_atomics : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_extended_bit_ops
+#error "Missing cl_khr_extended_bit_ops define"
+#endif
+#else
+#ifdef cl_khr_extended_bit_ops
+#error "Incorrect cl_khr_extended_bit_ops define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_extended_bit_ops' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_extended_bit_ops : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_integer_dot_product
+#error "Missing cl_khr_integer_dot_product define"
+#endif
+#else
+#ifdef cl_khr_integer_dot_product
+#error "Incorrect cl_khr_integer_dot_product define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_integer_dot_product' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_integer_dot_product : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_kernel_clock
+#error "Missing cl_khr_kernel_clock define"
+#endif
+#else
+#ifdef cl_khr_kernel_clock
+#error "Incorrect cl_khr_kernel_clock define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_kernel_clock' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_kernel_clock : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_ballot
+#error "Missing cl_khr_subgroup_ballot define"
+#endif
+#else
+#ifdef cl_khr_subgroup_ballot
+#error "Incorrect cl_khr_subgroup_ballot define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_ballot' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_ballot : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_clustered_reduce
+#error "Missing cl_khr_subgroup_clustered_reduce define"
+#endif
+#else
+#ifdef cl_khr_subgroup_clustered_reduce
+#error "Incorrect cl_khr_subgroup_clustered_reduce define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_clustered_reduce' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_clustered_reduce : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_extended_types
+#error "Missing cl_khr_subgroup_extended_types define"
+#endif
+#else
+#ifdef cl_khr_subgroup_extended_types
+#error "Incorrect cl_khr_subgroup_extended_types define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_extended_types' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_extended_types : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_non_uniform_arithmetic
+#error "Missing cl_khr_subgroup_non_uniform_arithmetic define"
+#endif
+#else
+#ifdef cl_khr_subgroup_non_uniform_arithmetic
+#error "Incorrect cl_khr_subgroup_non_uniform_arithmetic define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_non_uniform_arithmetic' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_non_uniform_arithmetic : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_non_uniform_vote
+#error "Missing cl_khr_subgroup_non_uniform_vote define"
+#endif
+#else
+#ifdef cl_khr_subgroup_non_uniform_vote
+#error "Incorrect cl_khr_subgroup_non_uniform_vote define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_non_uniform_vote' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_non_uniform_vote : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_rotate
+#error "Missing cl_khr_subgroup_rotate define"
+#endif
+#else
+#ifdef cl_khr_subgroup_rotate
+#error "Incorrect cl_khr_subgroup_rotate define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_rotate' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_rotate : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_shuffle_relative
+#error "Missing cl_khr_subgroup_shuffle_relative define"
+#endif
+#else
+#ifdef cl_khr_subgroup_shuffle_relative
+#error "Incorrect cl_khr_subgroup_shuffle_relative define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_shuffle_relative' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_shuffle_relative : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_shuffle
+#error "Missing cl_khr_subgroup_shuffle define"
+#endif
+#else
+#ifdef cl_khr_subgroup_shuffle
+#error "Incorrect cl_khr_subgroup_shuffle define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_shuffle' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_shuffle : enable
+
 #ifndef cl_amd_media_ops
 #error "Missing cl_amd_media_ops define"
 #endif
@@ -224,14 +368,62 @@
 //expected-warning@-1{{OpenCL extension '__opencl_c_atomic_order_acq_rel' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_atomic_order_seq_cst : disable
 //expected-warning@-1{{OpenCL extension '__opencl_c_atomic_order_seq_cst' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_atomic_scope_all_devices : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_scope_all_devices' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_atomic_scope_device : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_scope_device' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_device_enqueue : disable
 //expected-warning@-1{{OpenCL extension '__opencl_c_device_enqueue' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_ext_fp16_global_atomic_add : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_ext_fp16_global_atomic_add' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_ext_fp16_global_atomic_load_store : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_ext_fp16_global_atomic_load_store' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_ext_fp16_global_atomic_min_max : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_ext_fp16_global_atomic_min_max' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_ext_fp16_local_atomic_add : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_ext_fp16_local_atomic_add' unknown or does not require pragma - ignoring}}...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Nov 18, 2025

@llvm/pr-subscribers-backend-x86

Author: Wenju He (wenju-he)

Changes

Adds the remaining optional feature macros from the OpenCL C 3.0 spec (section 6.2.1 table). Targets can now enable these via OpenCLFeaturesMap returned by getSupportedOpenCLOpts().


Patch is 39.74 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/168016.diff

6 Files Affected:

  • (modified) clang/include/clang/Basic/OpenCLExtensions.def (+46-9)
  • (modified) clang/lib/Basic/Targets/AMDGPU.h (+2)
  • (modified) clang/lib/Headers/opencl-c-base.h (-99)
  • (modified) clang/test/Headers/opencl-c-header.cl (+3-3)
  • (modified) clang/test/SemaOpenCL/extension-version.cl (+244)
  • (modified) clang/test/SemaOpenCL/features.cl (+66-34)
diff --git a/clang/include/clang/Basic/OpenCLExtensions.def b/clang/include/clang/Basic/OpenCLExtensions.def
index 6f73b26137500..9fc8c31056f4d 100644
--- a/clang/include/clang/Basic/OpenCLExtensions.def
+++ b/clang/include/clang/Basic/OpenCLExtensions.def
@@ -78,10 +78,55 @@ OPENCL_EXTENSION(cl_khr_depth_images, true, 120)
 OPENCL_EXTENSION(cl_khr_gl_msaa_sharing,true, 120)
 
 // OpenCL 2.0.
+OPENCL_EXTENSION(cl_ext_float_atomics, true, 200)
+OPENCL_EXTENSION(cl_khr_extended_bit_ops, true, 200)
+OPENCL_EXTENSION(cl_khr_integer_dot_product, true, 200)
+OPENCL_EXTENSION(cl_khr_kernel_clock, true, 200)
 OPENCL_EXTENSION(cl_khr_mipmap_image, true, 200)
 OPENCL_EXTENSION(cl_khr_mipmap_image_writes, true, 200)
 OPENCL_EXTENSION(cl_khr_srgb_image_writes, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_ballot, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_clustered_reduce, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_extended_types, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_non_uniform_arithmetic, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_non_uniform_vote, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_rotate, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_shuffle_relative, true, 200)
+OPENCL_EXTENSION(cl_khr_subgroup_shuffle, true, 200)
 OPENCL_EXTENSION(cl_khr_subgroups, true, 200)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_acq_rel, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_seq_cst, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_scope_all_devices, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_scope_device, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_device_enqueue, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_global_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_global_atomic_load_store, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_global_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_local_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_local_atomic_load_store, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_local_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_global_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_global_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_local_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_local_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_global_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_global_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_local_atomic_add, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_local_atomic_min_max, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_image_raw10_raw12, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_image_unorm_int_2_101010, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_image_unsigned_10x6_12x4_14x2, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_generic_address_space, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_images, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_integer_dot_product_input_4x8bit, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_integer_dot_product_input_4x8bit_packed, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_kernel_clock_scope_device, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_kernel_clock_scope_sub_group, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_kernel_clock_scope_work_group, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_pipes, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_program_scope_global_variables, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_read_write_images, false, 200, OCL_C_20)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_work_group_collective_functions, false, 200, OCL_C_20)
 
 // Clang Extensions.
 OPENCL_EXTENSION(cl_clang_storage_class_specifiers, true, 100)
@@ -100,17 +145,9 @@ OPENCL_EXTENSION(cl_intel_subgroups_short, true, 120)
 OPENCL_EXTENSION(cl_intel_device_side_avc_motion_estimation, true, 120)
 
 // OpenCL C 3.0 features (6.2.1. Features)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_pipes, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_generic_address_space, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_acq_rel, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_seq_cst, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_subgroups, false, 300, OCL_C_30)
 OPENCL_OPTIONALCOREFEATURE(__opencl_c_3d_image_writes, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_device_enqueue, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_read_write_images, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_program_scope_global_variables, false, 300, OCL_C_30)
 OPENCL_OPTIONALCOREFEATURE(__opencl_c_fp64, false, 300, OCL_C_30)
-OPENCL_OPTIONALCOREFEATURE(__opencl_c_images, false, 300, OCL_C_30)
+OPENCL_OPTIONALCOREFEATURE(__opencl_c_subgroups, false, 300, OCL_C_30)
 
 #undef OPENCL_OPTIONALCOREFEATURE
 #undef OPENCL_COREFEATURE
diff --git a/clang/lib/Basic/Targets/AMDGPU.h b/clang/lib/Basic/Targets/AMDGPU.h
index dfcc79402257a..95a3d133c07ac 100644
--- a/clang/lib/Basic/Targets/AMDGPU.h
+++ b/clang/lib/Basic/Targets/AMDGPU.h
@@ -320,6 +320,8 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo {
       Opts["__opencl_c_3d_image_writes"] = true;
       Opts["cl_khr_3d_image_writes"] = true;
       Opts["__opencl_c_program_scope_global_variables"] = true;
+      Opts["__opencl_c_atomic_scope_all_devices"] = true;
+      Opts["__opencl_c_atomic_order_seq_cst"] = true;
 
       if (GPUKind >= llvm::AMDGPU::GK_GFX700) {
         Opts["__opencl_c_generic_address_space"] = true;
diff --git a/clang/lib/Headers/opencl-c-base.h b/clang/lib/Headers/opencl-c-base.h
index 414f10ad832ce..898026c66614a 100644
--- a/clang/lib/Headers/opencl-c-base.h
+++ b/clang/lib/Headers/opencl-c-base.h
@@ -9,105 +9,6 @@
 #ifndef _OPENCL_BASE_H_
 #define _OPENCL_BASE_H_
 
-// Define extension macros
-
-#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
-// For SPIR and SPIR-V all extensions are supported.
-#if defined(__SPIR__) || defined(__SPIRV__)
-#define cl_khr_subgroup_extended_types 1
-#define cl_khr_subgroup_non_uniform_vote 1
-#define cl_khr_subgroup_ballot 1
-#define cl_khr_subgroup_non_uniform_arithmetic 1
-#define cl_khr_subgroup_shuffle 1
-#define cl_khr_subgroup_shuffle_relative 1
-#define cl_khr_subgroup_clustered_reduce 1
-#define cl_khr_subgroup_rotate 1
-#define cl_khr_extended_bit_ops 1
-#define cl_khr_integer_dot_product 1
-#define __opencl_c_integer_dot_product_input_4x8bit 1
-#define __opencl_c_integer_dot_product_input_4x8bit_packed 1
-#define cl_ext_float_atomics 1
-#ifdef cl_khr_fp16
-#define __opencl_c_ext_fp16_global_atomic_load_store 1
-#define __opencl_c_ext_fp16_local_atomic_load_store 1
-#define __opencl_c_ext_fp16_global_atomic_add 1
-#define __opencl_c_ext_fp16_local_atomic_add 1
-#define __opencl_c_ext_fp16_global_atomic_min_max 1
-#define __opencl_c_ext_fp16_local_atomic_min_max 1
-#endif
-#ifdef cl_khr_fp64
-#define __opencl_c_ext_fp64_global_atomic_add 1
-#define __opencl_c_ext_fp64_local_atomic_add 1
-#define __opencl_c_ext_fp64_global_atomic_min_max 1
-#define __opencl_c_ext_fp64_local_atomic_min_max 1
-#endif
-#define __opencl_c_ext_fp32_global_atomic_add 1
-#define __opencl_c_ext_fp32_local_atomic_add 1
-#define __opencl_c_ext_fp32_global_atomic_min_max 1
-#define __opencl_c_ext_fp32_local_atomic_min_max 1
-#define __opencl_c_ext_image_raw10_raw12 1
-#define __opencl_c_ext_image_unorm_int_2_101010 1
-#define __opencl_c_ext_image_unsigned_10x6_12x4_14x2 1
-#define cl_khr_kernel_clock 1
-#define __opencl_c_kernel_clock_scope_device 1
-#define __opencl_c_kernel_clock_scope_work_group 1
-#define __opencl_c_kernel_clock_scope_sub_group 1
-
-#endif // defined(__SPIR__) || defined(__SPIRV__)
-#endif // (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
-
-// Define feature macros for OpenCL C 2.0
-#if (__OPENCL_CPP_VERSION__ == 100 || __OPENCL_C_VERSION__ == 200)
-#define __opencl_c_pipes 1
-#define __opencl_c_generic_address_space 1
-#define __opencl_c_work_group_collective_functions 1
-#define __opencl_c_atomic_order_acq_rel 1
-#define __opencl_c_atomic_order_seq_cst 1
-#define __opencl_c_atomic_scope_device 1
-#define __opencl_c_atomic_scope_all_devices 1
-#define __opencl_c_device_enqueue 1
-#define __opencl_c_read_write_images 1
-#define __opencl_c_program_scope_global_variables 1
-#define __opencl_c_images 1
-#endif
-
-// Define header-only feature macros for OpenCL C 3.0.
-#if (__OPENCL_CPP_VERSION__ == 202100 || __OPENCL_C_VERSION__ == 300)
-// For the SPIR and SPIR-V target all features are supported.
-#if defined(__SPIR__) || defined(__SPIRV__)
-#define __opencl_c_work_group_collective_functions 1
-#define __opencl_c_atomic_order_seq_cst 1
-#define __opencl_c_atomic_scope_device 1
-#define __opencl_c_atomic_scope_all_devices 1
-#define __opencl_c_read_write_images 1
-#endif // defined(__SPIR__)
-
-#endif // (__OPENCL_CPP_VERSION__ == 202100 || __OPENCL_C_VERSION__ == 300)
-
-// Undefine any feature macros that have been explicitly disabled using
-// an __undef_<feature> macro.
-#ifdef __undef___opencl_c_work_group_collective_functions
-#undef __opencl_c_work_group_collective_functions
-#endif
-#ifdef __undef___opencl_c_atomic_order_seq_cst
-#undef __opencl_c_atomic_order_seq_cst
-#endif
-#ifdef __undef___opencl_c_atomic_scope_device
-#undef __opencl_c_atomic_scope_device
-#endif
-#ifdef __undef___opencl_c_atomic_scope_all_devices
-#undef __opencl_c_atomic_scope_all_devices
-#endif
-#ifdef __undef___opencl_c_read_write_images
-#undef __opencl_c_read_write_images
-#endif
-#ifdef __undef___opencl_c_integer_dot_product_input_4x8bit
-#undef __opencl_c_integer_dot_product_input_4x8bit
-#endif
-#ifdef __undef___opencl_c_integer_dot_product_input_4x8bit_packed
-#undef __opencl_c_integer_dot_product_input_4x8bit_packed
-#endif
-
 #if !defined(__opencl_c_generic_address_space)
 // Internal feature macro to provide named (global, local, private) address
 // space overloads for builtin functions that take a pointer argument.
diff --git a/clang/test/Headers/opencl-c-header.cl b/clang/test/Headers/opencl-c-header.cl
index 17cbb67f26038..e26f16827b20f 100644
--- a/clang/test/Headers/opencl-c-header.cl
+++ b/clang/test/Headers/opencl-c-header.cl
@@ -33,7 +33,7 @@
 // ===
 // Compile for OpenCL 2.0 for the first time. The module should change.
 // RUN: %clang_cc1 -triple spir-unknown-unknown -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -fdisable-module-hash -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
-// RUN: not diff %t/1_0.pcm %t/opencl_c.pcm
+// RUN: not diff %t/1_0.pcm %t/opencl_c.pcm > /dev/null
 // RUN: chmod u-w %t/opencl_c.pcm
 
 // ===
@@ -44,10 +44,10 @@
 // RUN: rm -rf %t
 // RUN: mkdir -p %t
 // RUN: %clang_cc1 -triple spir64-unknown-unknown -emit-llvm -o - -cl-std=CL1.2 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK --check-prefix=CHECK-MOD %s
-// RUN: %clang_cc1 -triple amdgcn--amdhsa -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
+// RUN: %clang_cc1 -triple amdgcn--amdhsa -target-cpu gfx700 -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
 // RUN: chmod u-w %t 
 // RUN: %clang_cc1 -triple spir64-unknown-unknown -emit-llvm -o - -cl-std=CL1.2 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK --check-prefix=CHECK-MOD %s
-// RUN: %clang_cc1 -triple amdgcn--amdhsa -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
+// RUN: %clang_cc1 -triple amdgcn--amdhsa -target-cpu gfx700 -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
 // RUN: chmod u+w %t
 
 // Verify that called builtins occur in the generated IR.
diff --git a/clang/test/SemaOpenCL/extension-version.cl b/clang/test/SemaOpenCL/extension-version.cl
index 4d92eff6ae3ac..35a1818706f35 100644
--- a/clang/test/SemaOpenCL/extension-version.cl
+++ b/clang/test/SemaOpenCL/extension-version.cl
@@ -165,6 +165,150 @@
 #endif
 #pragma OPENCL EXTENSION cl_khr_subgroups : enable
 
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_ext_float_atomics
+#error "Missing cl_ext_float_atomics define"
+#endif
+#else
+#ifdef cl_ext_float_atomics
+#error "Incorrect cl_ext_float_atomics define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_ext_float_atomics' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_ext_float_atomics : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_extended_bit_ops
+#error "Missing cl_khr_extended_bit_ops define"
+#endif
+#else
+#ifdef cl_khr_extended_bit_ops
+#error "Incorrect cl_khr_extended_bit_ops define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_extended_bit_ops' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_extended_bit_ops : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_integer_dot_product
+#error "Missing cl_khr_integer_dot_product define"
+#endif
+#else
+#ifdef cl_khr_integer_dot_product
+#error "Incorrect cl_khr_integer_dot_product define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_integer_dot_product' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_integer_dot_product : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_kernel_clock
+#error "Missing cl_khr_kernel_clock define"
+#endif
+#else
+#ifdef cl_khr_kernel_clock
+#error "Incorrect cl_khr_kernel_clock define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_kernel_clock' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_kernel_clock : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_ballot
+#error "Missing cl_khr_subgroup_ballot define"
+#endif
+#else
+#ifdef cl_khr_subgroup_ballot
+#error "Incorrect cl_khr_subgroup_ballot define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_ballot' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_ballot : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_clustered_reduce
+#error "Missing cl_khr_subgroup_clustered_reduce define"
+#endif
+#else
+#ifdef cl_khr_subgroup_clustered_reduce
+#error "Incorrect cl_khr_subgroup_clustered_reduce define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_clustered_reduce' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_clustered_reduce : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_extended_types
+#error "Missing cl_khr_subgroup_extended_types define"
+#endif
+#else
+#ifdef cl_khr_subgroup_extended_types
+#error "Incorrect cl_khr_subgroup_extended_types define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_extended_types' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_extended_types : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_non_uniform_arithmetic
+#error "Missing cl_khr_subgroup_non_uniform_arithmetic define"
+#endif
+#else
+#ifdef cl_khr_subgroup_non_uniform_arithmetic
+#error "Incorrect cl_khr_subgroup_non_uniform_arithmetic define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_non_uniform_arithmetic' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_non_uniform_arithmetic : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_non_uniform_vote
+#error "Missing cl_khr_subgroup_non_uniform_vote define"
+#endif
+#else
+#ifdef cl_khr_subgroup_non_uniform_vote
+#error "Incorrect cl_khr_subgroup_non_uniform_vote define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_non_uniform_vote' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_non_uniform_vote : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_rotate
+#error "Missing cl_khr_subgroup_rotate define"
+#endif
+#else
+#ifdef cl_khr_subgroup_rotate
+#error "Incorrect cl_khr_subgroup_rotate define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_rotate' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_rotate : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_shuffle_relative
+#error "Missing cl_khr_subgroup_shuffle_relative define"
+#endif
+#else
+#ifdef cl_khr_subgroup_shuffle_relative
+#error "Incorrect cl_khr_subgroup_shuffle_relative define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_shuffle_relative' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_shuffle_relative : enable
+
+#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
+#ifndef cl_khr_subgroup_shuffle
+#error "Missing cl_khr_subgroup_shuffle define"
+#endif
+#else
+#ifdef cl_khr_subgroup_shuffle
+#error "Incorrect cl_khr_subgroup_shuffle define"
+#endif
+// expected-warning@+2{{unsupported OpenCL extension 'cl_khr_subgroup_shuffle' - ignoring}}
+#endif
+#pragma OPENCL EXTENSION cl_khr_subgroup_shuffle : enable
+
 #ifndef cl_amd_media_ops
 #error "Missing cl_amd_media_ops define"
 #endif
@@ -224,14 +368,62 @@
 //expected-warning@-1{{OpenCL extension '__opencl_c_atomic_order_acq_rel' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_atomic_order_seq_cst : disable
 //expected-warning@-1{{OpenCL extension '__opencl_c_atomic_order_seq_cst' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_atomic_scope_all_devices : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_scope_all_devices' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_atomic_scope_device : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_atomic_scope_device' unknown or does not require pragma - ignoring}}
 #pragma OPENCL EXTENSION __opencl_c_device_enqueue : disable
 //expected-warning@-1{{OpenCL extension '__opencl_c_device_enqueue' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_ext_fp16_global_atomic_add : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_ext_fp16_global_atomic_add' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_ext_fp16_global_atomic_load_store : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_ext_fp16_global_atomic_load_store' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_ext_fp16_global_atomic_min_max : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_ext_fp16_global_atomic_min_max' unknown or does not require pragma - ignoring}}
+#pragma OPENCL EXTENSION __opencl_c_ext_fp16_local_atomic_add : disable
+//expected-warning@-1{{OpenCL extension '__opencl_c_ext_fp16_local_atomic_add' unknown or does not require pragma - ignoring}}...
[truncated]

@wenju-he
Copy link
Contributor Author

I would expect clang/test/SemaOpenCL/features.cl tests will be extended to cover new features. Potentially clang/test/Headers/opencl-c-header.cl and clang/test/SemaOpenCL/extension-version.cl as well.

updated clang tests in da29d83 and 8118731

@wenju-he wenju-he changed the title [OpenCL] Add missing OpenCL C 3.0 optional features to OpenCLExtensions.def [OpenCL] Add missing OpenCL 3.0 features to OpenCLExtensions.def; revert header-only macros Nov 18, 2025
@github-actions
Copy link

github-actions bot commented Nov 18, 2025

🐧 Linux x64 Test Results

  • 111323 tests passed
  • 4417 tests skipped

@bader bader requested a review from svenvh November 18, 2025 15:46
Copy link
Member

@svenvh svenvh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some formatting nits.

cc: @arsenm for the AMDGPU.h changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:AMDGPU backend:X86 clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:headers Headers provided by Clang, e.g. for intrinsics clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants