Merged
16 changes: 5 additions & 11 deletions clang/docs/OpenCLSupport.rst
@@ -217,9 +217,9 @@ This section explains how to extend clang with the new functionality.

**Parsing functionality**

If an extension modifies the standard parsing it needs to be added to
the clang frontend source code. This also means that the associated macro
indicating the presence of the extension should be added to clang.
If a new extension is added it needs to be added to the clang frontend source
code. This also means that the associated macro indicating the presence of the
extension should be added to clang.

The default flow for adding a new extension into the frontend is to
modify `OpenCLExtensions.def
@@ -242,21 +242,15 @@ with :option:`-cl-ext` command-line flags.
**Library functionality**

If an extension adds functionality that does not modify standard language
parsing it should not require modifying anything other than header files and
parsing it may not require modifying anything other than header files and
``OpenCLBuiltins.td`` detailed in :ref:`OpenCL builtins <opencl_builtins>`.
Most commonly such extensions add functionality via libraries (by adding
non-native types or functions) parsed regularly. Similar to other languages this
is the most common way to add new functionality.

Clang has standard headers where new types and functions are being added,
for more details refer to
:ref:`the section on the OpenCL Header <opencl_header>`. The macros indicating
the presence of such extensions can be added in the standard header files
conditioned on target specific predefined macros or/and language version
predefined macros (see `feature/extension preprocessor macros defined in
opencl-c-base.h
<https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/opencl-c-base.h>`__).

:ref:`the section on the OpenCL Header <opencl_header>`.
**Pragmas**

Some extensions alter standard parsing dynamically via pragmas.
5 changes: 5 additions & 0 deletions clang/docs/ReleaseNotes.rst
@@ -234,6 +234,11 @@ C23 Feature Support

Non-comprehensive list of changes in this release
-------------------------------------------------
- Removed OpenCL header-only feature macros (previously unconditionally enabled
  on SPIR and SPIR-V targets and only selectively disabled via
  ``-D__undef_<feature>``). All OpenCL extensions and features are now
  centralized in ``OpenCLExtensions.def``, allowing consistent control via
  ``getSupportedOpenCLOpts`` and ``-cl-ext``.

- Added ``__builtin_elementwise_ldexp``.

- Added ``__builtin_elementwise_fshl`` and ``__builtin_elementwise_fshr``.
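
The first release note above says features are now controlled through ``-cl-ext`` rather than ``-D__undef_<feature>`` macros. As a rough illustration only, the sketch below models a comma-separated list of ``+name`` / ``-name`` entries toggling a name-to-bool map. It is not clang's implementation: ``DemoExtMap`` and ``demo_apply_cl_ext`` are hypothetical names, and handling of special values such as ``all`` is left out.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for a table of supported OpenCL options.
using DemoExtMap = std::map<std::string, bool>;

// Applies a comma-separated list such as "+a,-b" to `opts`.
// Entries without a leading '+' or '-' are ignored in this sketch.
void demo_apply_cl_ext(DemoExtMap &opts, const std::string &spec) {
  std::istringstream in(spec);
  std::string item;
  while (std::getline(in, item, ',')) {
    if (item.size() < 2 || (item[0] != '+' && item[0] != '-'))
      continue;
    // '+' enables the named option, '-' disables it.
    opts[item.substr(1)] = (item[0] == '+');
  }
}
```

In this model, one list can disable a feature that a target enabled by default and enable another one in the same step, which is the kind of uniform control the note describes.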
55 changes: 46 additions & 9 deletions clang/include/clang/Basic/OpenCLExtensions.def
@@ -78,10 +78,55 @@ OPENCL_EXTENSION(cl_khr_depth_images, true, 120)
OPENCL_EXTENSION(cl_khr_gl_msaa_sharing,true, 120)

// OpenCL 2.0.
OPENCL_EXTENSION(cl_ext_float_atomics, false, 200)
OPENCL_EXTENSION(cl_khr_extended_bit_ops, false, 200)
OPENCL_EXTENSION(cl_khr_integer_dot_product, false, 200)
OPENCL_EXTENSION(cl_khr_kernel_clock, false, 200)
OPENCL_EXTENSION(cl_khr_mipmap_image, true, 200)
OPENCL_EXTENSION(cl_khr_mipmap_image_writes, true, 200)
OPENCL_EXTENSION(cl_khr_srgb_image_writes, true, 200)
OPENCL_EXTENSION(cl_khr_subgroup_ballot, false, 200)
OPENCL_EXTENSION(cl_khr_subgroup_clustered_reduce, false, 200)
OPENCL_EXTENSION(cl_khr_subgroup_extended_types, false, 200)
OPENCL_EXTENSION(cl_khr_subgroup_non_uniform_arithmetic, false, 200)
OPENCL_EXTENSION(cl_khr_subgroup_non_uniform_vote, false, 200)
OPENCL_EXTENSION(cl_khr_subgroup_rotate, false, 200)
OPENCL_EXTENSION(cl_khr_subgroup_shuffle_relative, false, 200)
OPENCL_EXTENSION(cl_khr_subgroup_shuffle, false, 200)
OPENCL_EXTENSION(cl_khr_subgroups, true, 200)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_acq_rel, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_seq_cst, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_scope_all_devices, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_scope_device, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_device_enqueue, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_global_atomic_add, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_global_atomic_load_store, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_global_atomic_min_max, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_local_atomic_add, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_local_atomic_load_store, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp16_local_atomic_min_max, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_global_atomic_add, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_global_atomic_min_max, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_local_atomic_add, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp32_local_atomic_min_max, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_global_atomic_add, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_global_atomic_min_max, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_local_atomic_add, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_fp64_local_atomic_min_max, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_image_raw10_raw12, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_image_unorm_int_2_101010, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_ext_image_unsigned_10x6_12x4_14x2, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_generic_address_space, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_images, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_integer_dot_product_input_4x8bit, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_integer_dot_product_input_4x8bit_packed, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_kernel_clock_scope_device, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_kernel_clock_scope_sub_group, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_kernel_clock_scope_work_group, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_pipes, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_program_scope_global_variables, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_read_write_images, false, 200, OCL_C_20)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_work_group_collective_functions, false, 200, OCL_C_20)

// Clang Extensions.
OPENCL_EXTENSION(cl_clang_storage_class_specifiers, true, 100)
@@ -100,17 +145,9 @@ OPENCL_EXTENSION(cl_intel_subgroups_short, true, 120)
OPENCL_EXTENSION(cl_intel_device_side_avc_motion_estimation, true, 120)

// OpenCL C 3.0 features (6.2.1. Features)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_pipes, false, 300, OCL_C_30)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_generic_address_space, false, 300, OCL_C_30)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_acq_rel, false, 300, OCL_C_30)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_atomic_order_seq_cst, false, 300, OCL_C_30)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_subgroups, false, 300, OCL_C_30)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_3d_image_writes, false, 300, OCL_C_30)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_device_enqueue, false, 300, OCL_C_30)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_read_write_images, false, 300, OCL_C_30)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_program_scope_global_variables, false, 300, OCL_C_30)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_fp64, false, 300, OCL_C_30)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_images, false, 300, OCL_C_30)
OPENCL_OPTIONALCOREFEATURE(__opencl_c_subgroups, false, 300, OCL_C_30)

#undef OPENCL_OPTIONALCOREFEATURE
#undef OPENCL_COREFEATURE
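
For readers unfamiliar with ``.def`` files, the entries above follow the X-macro pattern: the table is written once, and each consumer defines the macro to extract what it needs. The sketch below is a self-contained miniature, not clang's code. All ``Demo``/``DEMO_`` names are hypothetical, and reading the second and third columns as "needs a pragma" and "minimum OpenCL version" is an assumption that this hunk does not itself document. Only the three entry names and values are copied from the hunks above.

```cpp
#include <cstring>

// Hypothetical miniature of an X-macro table.
#define DEMO_FEATURE_TABLE(X)          \
  X(cl_khr_subgroups, true, 200)       \
  X(cl_khr_kernel_clock, false, 200)   \
  X(__opencl_c_fp64, false, 300)

struct DemoFeature {
  const char *name;
  bool needs_pragma;     // assumed meaning of the second column
  unsigned min_version;  // assumed meaning of the third column
};

// Expand each table entry into one initializer row.
#define DEMO_AS_ROW(name, pragma, avail) {#name, pragma, avail},
static const DemoFeature kDemoFeatures[] = {DEMO_FEATURE_TABLE(DEMO_AS_ROW)};
#undef DEMO_AS_ROW

// Returns true when `name` is listed and `version` meets its minimum.
bool demo_is_available(const char *name, unsigned version) {
  for (const DemoFeature &f : kDemoFeatures)
    if (std::strcmp(f.name, name) == 0)
      return version >= f.min_version;
  return false;
}
```

Because every consumer expands the same table, adding one line to it is enough for all of them to see the new entry, which is the centralization this change relies on.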
2 changes: 2 additions & 0 deletions clang/lib/Basic/Targets/AMDGPU.h
@@ -320,6 +320,8 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo {
Opts["__opencl_c_3d_image_writes"] = true;
Opts["cl_khr_3d_image_writes"] = true;
Opts["__opencl_c_program_scope_global_variables"] = true;
Opts["__opencl_c_atomic_order_seq_cst"] = true;
Opts["__opencl_c_atomic_scope_all_devices"] = true;

if (GPUKind >= llvm::AMDGPU::GK_GFX700) {
Opts["__opencl_c_generic_address_space"] = true;
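
The hunk above shows a target enabling features by writing into a name-keyed option table, with some entries gated on the GPU generation. A minimal standalone sketch of that shape follows. ``DemoGpuKind``, ``DemoOpts`` and ``demo_set_supported_opts`` are hypothetical stand-ins rather than clang types, and only the three option names are taken from the hunk.

```cpp
#include <map>
#include <string>

// Hypothetical stand-ins; the real code uses clang and LLVM types.
enum class DemoGpuKind { Gfx600 = 600, Gfx700 = 700 };
using DemoOpts = std::map<std::string, bool>;

// Marks the features a (pretend) target supports for the given GPU kind.
void demo_set_supported_opts(DemoOpts &opts, DemoGpuKind kind) {
  // Enabled regardless of GPU generation, mirroring the two added lines.
  opts["__opencl_c_atomic_order_seq_cst"] = true;
  opts["__opencl_c_atomic_scope_all_devices"] = true;

  // Enabled only from a given generation onward, mirroring the GFX700 check.
  if (kind >= DemoGpuKind::Gfx700)
    opts["__opencl_c_generic_address_space"] = true;
}
```

With the header-only macros removed, a table like this is the place where a target states which features it supports.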
99 changes: 0 additions & 99 deletions clang/lib/Headers/opencl-c-base.h
@@ -9,105 +9,6 @@
#ifndef _OPENCL_BASE_H_
#define _OPENCL_BASE_H_

// Define extension macros

#if (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)
// For SPIR and SPIR-V all extensions are supported.
#if defined(__SPIR__) || defined(__SPIRV__)
#define cl_khr_subgroup_extended_types 1
#define cl_khr_subgroup_non_uniform_vote 1
#define cl_khr_subgroup_ballot 1
#define cl_khr_subgroup_non_uniform_arithmetic 1
#define cl_khr_subgroup_shuffle 1
#define cl_khr_subgroup_shuffle_relative 1
#define cl_khr_subgroup_clustered_reduce 1
#define cl_khr_subgroup_rotate 1
#define cl_khr_extended_bit_ops 1
#define cl_khr_integer_dot_product 1
#define __opencl_c_integer_dot_product_input_4x8bit 1
#define __opencl_c_integer_dot_product_input_4x8bit_packed 1
#define cl_ext_float_atomics 1
#ifdef cl_khr_fp16
#define __opencl_c_ext_fp16_global_atomic_load_store 1
#define __opencl_c_ext_fp16_local_atomic_load_store 1
#define __opencl_c_ext_fp16_global_atomic_add 1
#define __opencl_c_ext_fp16_local_atomic_add 1
#define __opencl_c_ext_fp16_global_atomic_min_max 1
#define __opencl_c_ext_fp16_local_atomic_min_max 1
#endif
#ifdef cl_khr_fp64
#define __opencl_c_ext_fp64_global_atomic_add 1
#define __opencl_c_ext_fp64_local_atomic_add 1
#define __opencl_c_ext_fp64_global_atomic_min_max 1
#define __opencl_c_ext_fp64_local_atomic_min_max 1
#endif
#define __opencl_c_ext_fp32_global_atomic_add 1
#define __opencl_c_ext_fp32_local_atomic_add 1
#define __opencl_c_ext_fp32_global_atomic_min_max 1
#define __opencl_c_ext_fp32_local_atomic_min_max 1
#define __opencl_c_ext_image_raw10_raw12 1
#define __opencl_c_ext_image_unorm_int_2_101010 1
#define __opencl_c_ext_image_unsigned_10x6_12x4_14x2 1
#define cl_khr_kernel_clock 1
#define __opencl_c_kernel_clock_scope_device 1
#define __opencl_c_kernel_clock_scope_work_group 1
#define __opencl_c_kernel_clock_scope_sub_group 1

#endif // defined(__SPIR__) || defined(__SPIRV__)
#endif // (defined(__OPENCL_CPP_VERSION__) || __OPENCL_C_VERSION__ >= 200)

// Define feature macros for OpenCL C 2.0
#if (__OPENCL_CPP_VERSION__ == 100 || __OPENCL_C_VERSION__ == 200)
#define __opencl_c_pipes 1
#define __opencl_c_generic_address_space 1
#define __opencl_c_work_group_collective_functions 1
#define __opencl_c_atomic_order_acq_rel 1
#define __opencl_c_atomic_order_seq_cst 1
#define __opencl_c_atomic_scope_device 1
#define __opencl_c_atomic_scope_all_devices 1
#define __opencl_c_device_enqueue 1
#define __opencl_c_read_write_images 1
#define __opencl_c_program_scope_global_variables 1
#define __opencl_c_images 1
#endif

// Define header-only feature macros for OpenCL C 3.0.
#if (__OPENCL_CPP_VERSION__ == 202100 || __OPENCL_C_VERSION__ == 300)
// For the SPIR and SPIR-V target all features are supported.
#if defined(__SPIR__) || defined(__SPIRV__)
#define __opencl_c_work_group_collective_functions 1
#define __opencl_c_atomic_order_seq_cst 1
#define __opencl_c_atomic_scope_device 1
#define __opencl_c_atomic_scope_all_devices 1
#define __opencl_c_read_write_images 1
#endif // defined(__SPIR__)

#endif // (__OPENCL_CPP_VERSION__ == 202100 || __OPENCL_C_VERSION__ == 300)

// Undefine any feature macros that have been explicitly disabled using
// an __undef_<feature> macro.
#ifdef __undef___opencl_c_work_group_collective_functions
#undef __opencl_c_work_group_collective_functions
#endif
#ifdef __undef___opencl_c_atomic_order_seq_cst
#undef __opencl_c_atomic_order_seq_cst
#endif
#ifdef __undef___opencl_c_atomic_scope_device
#undef __opencl_c_atomic_scope_device
#endif
#ifdef __undef___opencl_c_atomic_scope_all_devices
#undef __opencl_c_atomic_scope_all_devices
#endif
#ifdef __undef___opencl_c_read_write_images
#undef __opencl_c_read_write_images
#endif
#ifdef __undef___opencl_c_integer_dot_product_input_4x8bit
#undef __opencl_c_integer_dot_product_input_4x8bit
#endif
#ifdef __undef___opencl_c_integer_dot_product_input_4x8bit_packed
#undef __opencl_c_integer_dot_product_input_4x8bit_packed
#endif

#if !defined(__opencl_c_generic_address_space)
// Internal feature macro to provide named (global, local, private) address
// space overloads for builtin functions that take a pointer argument.
6 changes: 3 additions & 3 deletions clang/test/Headers/opencl-c-header.cl
@@ -33,7 +33,7 @@
// ===
// Compile for OpenCL 2.0 for the first time. The module should change.
// RUN: %clang_cc1 -triple spir-unknown-unknown -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -fdisable-module-hash -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
// RUN: not diff %t/1_0.pcm %t/opencl_c.pcm
// RUN: not diff %t/1_0.pcm %t/opencl_c.pcm > /dev/null
// RUN: chmod u-w %t/opencl_c.pcm

// ===
@@ -44,10 +44,10 @@
// RUN: rm -rf %t
// RUN: mkdir -p %t
// RUN: %clang_cc1 -triple spir64-unknown-unknown -emit-llvm -o - -cl-std=CL1.2 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK --check-prefix=CHECK-MOD %s
// RUN: %clang_cc1 -triple amdgcn--amdhsa -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
// RUN: %clang_cc1 -triple amdgcn--amdhsa -target-cpu gfx700 -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
// RUN: chmod u-w %t
// RUN: %clang_cc1 -triple spir64-unknown-unknown -emit-llvm -o - -cl-std=CL1.2 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK --check-prefix=CHECK-MOD %s
// RUN: %clang_cc1 -triple amdgcn--amdhsa -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
// RUN: %clang_cc1 -triple amdgcn--amdhsa -target-cpu gfx700 -O0 -emit-llvm -o - -cl-std=CL2.0 -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=%t -ftime-report %s 2>&1 | FileCheck --check-prefix=CHECK20 --check-prefix=CHECK-MOD %s
// RUN: chmod u+w %t

// Verify that called builtins occur in the generated IR.