Portable Computing Language (PoCL)

PoCL is a conformant implementation (for CPU and Level Zero GPU targets) of the OpenCL 3.0 standard which can be easily adapted for new targets.

Official web page

Full documentation

Building

This section contains instructions for building PoCL in its default configuration and a subset of driver backends. You can find the full build instructions including a list of available options in the install guide.

Requirements

In order to build PoCL, you need the following support libraries and tools:

  • Latest released version of LLVM & Clang
  • development files for LLVM & Clang + their transitive dependencies (e.g. libclang-dev, libclang-cpp-dev, libllvm-dev, zlib1g-dev, libtinfo-dev...)
  • CMake 3.15 or newer
  • GNU make or ninja
  • Optional: pkg-config
  • Optional: hwloc v1.0 or newer (e.g. libhwloc-dev)
  • Optional (but enabled by default): python3 (for support of LLVM bitcode with SPIR target)
  • Optional: llvm-spirv (version-compatible with LLVM) and spirv-tools (required for SPIR-V support in CPU / CUDA; Vulkan driver supports SPIR-V through clspv)

For more details, consult the install guide.

Building PoCL follows the usual CMake build steps. Note, however, that PoCL can be used directly from the build directory, without installing it system-wide.
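As a rough illustration of the usual CMake steps, the sketch below wraps an out-of-tree configure and build in a small shell function. The directory names, the `Release` build type and the `RUN` dry-run switch are assumptions made for this example, not PoCL requirements; by default the function only prints the commands so they can be reviewed first.

```shell
# Minimal sketch of an out-of-tree CMake build (directory names are assumptions).
# By default the commands are only printed; set RUN=1 to actually execute them.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

build_pocl() {
  src_dir="${1:-.}"        # PoCL source checkout (assumed location)
  build_dir="${2:-build}"  # separate build directory (assumed name)

  # Configure; the install guide lists further -D options.
  run cmake -S "$src_dir" -B "$build_dir" -DCMAKE_BUILD_TYPE=Release || return 1
  # Compile with whichever generator CMake selected (make or ninja).
  run cmake --build "$build_dir"
}
```

Calling `build_pocl . build` prints the two commands; `RUN=1 build_pocl . build` executes them.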

Device drivers

PoCL supports several backend drivers, with different levels of maturity in terms of received testing, reliability and available features.

CPU/x86_64 is continuously tested to pass CTS, and also passes >99% of all CTS tests when built with Thread or Address sanitizers. The CPU driver is also tested on RISC-V and ARM64. On ARM32, i386, PPC and S390x the CPU driver is neither tested nor supported: we don't prevent building on these architectures, but we don't actively support them currently.

CTS pass rate reflects the expected pass rate of OpenCL-CTS tests when PoCL is compiled with the ENABLE_CONFORMANCE=ON setting.

| Driver | Maturity | CTS pass rate | Dev. OpenCL ver. | Input SPIR-V |
|---|---|---|---|---|
| CPU/x86_64 | very high | 100% | 3.0 | 1.4 |
| CPU/ARM64 | high | >95% | 3.0 | 1.4 |
| CPU/RISCV | high | >99% | 3.0 | 1.4 |
| LevelZero | high | >99% | 3.0 | as LZ runtime |
| CUDA | low | | 3.0 | 1.2 |
| OpenASIP | low | | 1.2 | none |
| Vulkan | low | | 3.0 | ExecModel=Shader only |

Supported OpenCL extensions & OpenCL C features

Support legend:

🟢 : Supported with all hardware & LLVM versions

🟡 : Partially supported, see notes

🔴 : Unsupported

empty cell : Unknown status

OpenCL extensions

Some extensions are available at Platform level:

  • cl_khr_icd - if compiled with ENABLE_ICD=1
  • cl_khr_create_command_queue
  • cl_pocl_content_size
  • cl_ext_buffer_device_address

Note that Remote devices pass-through most of their extensions, with a few exceptions; these are marked Unsupported in the table.

Extension CPU device Level Zero CUDA OpenASIP Remote
cl_exp_tensor 🟡 2️⃣ 🟡 1️⃣
cl_exp_defined_builtin_kernels 🟡 2️⃣ 🟡 1️⃣
cl_ext_buffer_device_address 🟢 🟡 1️⃣ 🔴
cl_ext_float_atomics 🟢 🟡 1️⃣ 🟢
cl_intel_command_queue_families 🔴
cl_intel_device_attribute_query 🔴 🟡 1️⃣
cl_intel_required_subgroup_size 🟡 2️⃣
cl_intel_split_work_group_barrier 🔴 🟡 2️⃣
cl_intel_spirv_subgroups 🔴 🟡 1️⃣
cl_intel_subgroups 🟡 2️⃣ 🟡 2️⃣
cl_intel_subgroups_short 🟡 2️⃣ 🟡 2️⃣
cl_intel_subgroups_char 🟡 2️⃣ 🟡 2️⃣
cl_intel_subgroups_long 🔴 🟡 2️⃣
cl_intel_subgroup_local_block_io 🔴 🟡 2️⃣
cl_intel_unified_shared_memory 🟢 🟡 1️⃣ 🔴
cl_khr_3d_image_writes 🟢 🟡 2️⃣
cl_khr_byte_addressable_store 🟢 🟢 🟢
cl_khr_device_uuid 🟢 🟢 🔴
cl_khr_extended_bit_ops 🟡 6️⃣
cl_khr_global_int32_base_atomics 🟢 🟢 🟢
cl_khr_global_int32_extended_atomics 🟢 🟢 🟢
cl_khr_local_int32_base_atomics 🟢 🟢 🟢
cl_khr_local_int32_extended_atomics 🟢 🟢 🟢
cl_khr_int64_base_atomics 🟢 🟡 2️⃣ 1️⃣ 🟢
cl_khr_int64_extended_atomics 🟢 🟡 2️⃣ 1️⃣ 🟢
cl_khr_suggested_local_work_size 🟢
cl_khr_pci_bus_info 🔴 🟡 1️⃣
cl_khr_depth_images 🔴 🟡 2️⃣
cl_khr_integer_dot_product 🟢 🟡 1️⃣
cl_khr_command_buffer 🟡 2️⃣ 🟡 2️⃣
cl_khr_command_buffer_multi_device 🟡 2️⃣
cl_khr_command_buffer_mutable_dispatch 🟡 2️⃣
cl_khr_subgroups 🟡 2️⃣ 🟡 1️⃣ 🟡 2️⃣
cl_khr_subgroup_ballot 🟡 2️⃣ 🟡 2️⃣
cl_khr_subgroup_shuffle 🟡 2️⃣ 🟡 2️⃣
cl_khr_subgroup_shuffle_relative 🔴 🟡 2️⃣
cl_khr_subgroup_extended_types 🔴 🟡 2️⃣
cl_khr_subgroup_non_uniform_arithmetic 🔴 🟡 2️⃣
cl_khr_subgroup_non_uniform_vote 🔴 🟡 2️⃣
cl_khr_subgroup_clustered_reduce 🔴 🟡 2️⃣
cl_khr_il_program 🟡 3️⃣ 🟡 3️⃣ 🟡 3️⃣
cl_khr_spir 🔴 🔴 🔴 🔴 🔴
cl_khr_spirv_queries 🟡 3️⃣ 🟡 3️⃣ 🟡 3️⃣
cl_khr_spirv_no_integer_wrap_decoration 🟢 🟢
cl_khr_spirv_linkonce_odr 🟢 🟡 1️⃣
cl_khr_fp16 🟡 4️⃣ 🟡 2️⃣ 1️⃣ 🟡 7️⃣
cl_khr_fp64 🟡 5️⃣ 🟡 2️⃣ 1️⃣ 🟢
cl_nv_device_attribute_query 🔴 🔴 🟢 🔴
cl_pocl_svm_rect 🟡 2️⃣
cl_pocl_command_buffer_svm 🟡 2️⃣
cl_pocl_command_buffer_host_buffer 🟡 2️⃣
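
To act on the extension list at run time, an application typically inspects the space-separated extension string that a device reports (for example via `CL_DEVICE_EXTENSIONS`). The helper below is a minimal sketch of an exact-match lookup in such a string; the function name `has_extension` and the sample string are hypothetical and not part of PoCL.

```shell
# has_extension EXT_STRING NAME
# Succeeds only if NAME appears as a whole word in the space-separated
# EXT_STRING, so "cl_intel_subgroups" does not match "cl_intel_subgroups_short".
has_extension() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Hypothetical extension string, for illustration only.
exts="cl_khr_byte_addressable_store cl_khr_fp64 cl_khr_il_program"

if has_extension "$exts" cl_khr_fp64; then
  echo "cl_khr_fp64 reported"
fi
```

Padding both the string and the pattern with spaces keeps the match on whole names without needing external tools such as `grep`.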

OpenCL C features

Some of these have prerequisites (e.g. __opencl_c_ext_fp64_local_atomic_add requires cl_khr_fp64 & cl_ext_float_atomics); these must additionally be supported by the device.

Features CPU device Level Zero CUDA OpenASIP Remote
__opencl_c_images 🟢 🟡 1️⃣
__opencl_c_3d_image_writes 🟢 🟡 1️⃣
__opencl_c_atomic_order_acq_rel 🟢 🟡 1️⃣ 🟢
__opencl_c_atomic_order_seq_cst 🟢 🟡 1️⃣ 🟢
__opencl_c_atomic_scope_device 🟢 🟡 1️⃣ 🟢
__opencl_c_atomic_scope_all_devices 🟢 🟡 1️⃣
__opencl_c_generic_address_space 🟢 🟢 🟢
__opencl_c_work_group_collective_functions 🟢 🟢
__opencl_c_integer_dot_product_input_4x8bit 🟢 🟡 2️⃣ 1️⃣
__opencl_c_integer_dot_product_input_4x8bit_packed 🟢 🟡 2️⃣ 1️⃣
__opencl_c_subgroups 🟡 2️⃣ 🟡 2️⃣ 1️⃣ 🟡 2️⃣
__opencl_c_read_write_images 🟡 2️⃣ 🟡 1️⃣
__opencl_c_program_scope_global_variables 🟡 2️⃣ 🟡 2️⃣ 🟢
__opencl_c_ext_fp32_global_atomic_add 🟢 🟡 1️⃣ 🟢
__opencl_c_ext_fp32_local_atomic_add 🟢 🟡 1️⃣ 🟢
__opencl_c_ext_fp32_global_atomic_min_max 🟢 🟡 1️⃣ 🟢
__opencl_c_ext_fp32_local_atomic_min_max 🟢 🟡 1️⃣ 🟢
__opencl_c_ext_fp64_global_atomic_add 🟢 🟡 1️⃣ 🟢
__opencl_c_ext_fp64_local_atomic_add 🟢 🟡 1️⃣ 🟢
__opencl_c_ext_fp64_global_atomic_min_max 🟢 🟡 1️⃣ 🟢
__opencl_c_ext_fp64_local_atomic_min_max 🟢 🟡 1️⃣ 🟢
__opencl_c_work_group_collective_functions 🔴 🟢

Notes

  1. Availability depends on hardware and runtime (LevelZero, CUDA) support; if both are available, the extensions/features are enabled by default.
  2. These extensions are only enabled when ENABLE_CONFORMANCE=OFF, because they are incomplete or fail some corner cases.
  3. These extensions are supported when PoCL is compiled with SPIR-V support.
  4. The cl_khr_fp16 extension is enabled on CPU if all of these conditions are met:
    • both the host & device compilers support the required type (_Float16) and can emulate / execute operations on the type (note: GCC only supports _Float16 since version 12)
    • LLVM >= 19, ENABLE_CONFORMANCE=OFF, Linux, CpuArch != i386
  5. The cl_khr_fp64 extension is enabled by default on all CPU architectures, unless explicitly disabled.
  6. The cl_khr_extended_bit_ops extension is only supported with LLVM 20+.
  7. The cl_khr_fp16 extension is supported on CUDA devices with Compute Capability >= 6.0 only.
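
The version and platform conditions in note 4 can be summarised as a small predicate. This is only an illustrative sketch: the function name and argument order are hypothetical, and it does not check the separate requirement that the host and device compilers support `_Float16`.

```shell
# fp16_cpu_conditions_met LLVM_MAJOR CONFORMANCE OS ARCH
# Encodes the last bullet of note 4: LLVM >= 19, ENABLE_CONFORMANCE=OFF,
# Linux, and a CPU architecture other than i386.
# Compiler support for _Float16 must be verified separately.
fp16_cpu_conditions_met() {
  [ "$1" -ge 19 ] && [ "$2" = "OFF" ] && [ "$3" = "Linux" ] && [ "$4" != "i386" ]
}

# Example with illustrative values: LLVM 20, conformance mode off, Linux on x86_64.
if fp16_cpu_conditions_met 20 OFF Linux x86_64; then
  echo "version/platform conditions for cl_khr_fp16 on CPU are met"
else
  echo "version/platform conditions for cl_khr_fp16 on CPU are not met"
fi
```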

Supported CI environments

CI status:

x86-64 x86-64 ARM64 CUDA Level Zero OpenASIP+Vulkan Remote Apple Silicon Windows

Support Matrix legend:

🔷 : Achieved status of OpenCL conformant implementation

🔶 : Tested in CI extensively, including OpenCL-CTS tests

🟢 : Tested in CI

🟡 : Should work, but is untested

🔴 : Unsupported

Linux

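| CPU device | LLVM 18 | LLVM 19 | LLVM 20 | LLVM 21 | LLVM 22 |
|---|---|---|---|---|---|
| x86-64 | 🔷 | 🟢 | 🟢 | 🔶 | 🔶 |
| ARM64 | 🟡 | 🟡 | 🟡 | 🟡 | 🟢 |
| i686 | 🟡 | 🟡 | 🟡 | 🟡 | 🟡 |
| ARM32 | 🟡 | 🟡 | 🟡 | 🟡 | 🟡 |
| RISC-V | 🟡 | 🟡 | 🟡 | 🟡 | 🟡 |
| PowerPC | 🟡 | 🟡 | 🟡 | 🟡 | 🟡 |

| GPU device | LLVM 18 | LLVM 19 | LLVM 20 | LLVM 21 | LLVM 22 |
|---|---|---|---|---|---|
| CUDA SM5.0 | 🟡 | 🟡 | 🟢 | 🔴 | 🟢 |
| CUDA SM other than 5.0 | 🟡 | 🟡 | 🟡 | 🔴 | 🟡 |
| Level Zero | 🟡 | 🟡 | 🟢 | 🔶 | 🟢 |
| Vulkan | 🟢 | 🔴 | 🔴 | 🔴 | 🔴 |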

Note: CUDA with LLVM 21 is broken due to a bug in Clang (llvm/llvm-project#154772).

| Special device | LLVM 18 | LLVM 19 | LLVM 20 | LLVM 21 | LLVM 22 |
|---|---|---|---|---|---|
| OpenASIP | 🔴 | 🔴 | 🔴 | 🟢 | 🔴 |
| Remote | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |

Mac OS X

| CPU device | LLVM 18 | LLVM 19 | LLVM 20 | LLVM 21 | LLVM 22 |
|---|---|---|---|---|---|
| Apple Silicon | 🟡 | 🟡 | 🟢 | 🟢 | 🟡 |
| Intel CPU | 🟡 | 🔴 | 🔴 | 🔴 | 🔴 |

Windows

| CPU device | LLVM 18 | LLVM 19 | LLVM 20 | LLVM 21 | LLVM 22 |
|---|---|---|---|---|---|
| MinGW / x86-64 | 🟡 | 🟢 | 🟡 | 🟡 | 🟡 |
| MSVC / x86-64 | 🟡 | 🟢 | 🟢 | 🟡 | 🟡 |

Binary packages

Linux distros

PoCL with CPU device support is packaged by many Linux distributions. See latest packaged version(s).

PoCL with CUDA driver

PoCL with CUDA driver support for Linux x86_64, aarch64 and ppc64le is available from the conda-forge distribution. First install Mambaforge:

wget "https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh"
bash Mambaforge-$(uname)-$(uname -m).sh   # install mambaforge

To install PoCL with the CUDA driver:

mamba install pocl-cuda

To install all drivers:

mamba install pocl

macOS

Homebrew

PoCL with CPU driver support for Intel and Apple Silicon chips is available from Homebrew and can be installed with

brew install pocl

Note that this installs an ICD loader from KhronosGroup, and the builtin OpenCL implementation will be invisible when your application is linked to this loader.

Conda

PoCL with CPU driver support for Intel and Apple Silicon chips is available from the conda-forge distribution and can be installed with

curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh"
bash Mambaforge-$(uname)-$(uname -m).sh

To install the CPU driver

mamba install pocl

Note that this installs an ICD loader from KhronosGroup, and the builtin OpenCL implementation will be invisible when your application is linked to this loader. To make both PoCL and the builtin OpenCL implementation visible, run

mamba install pocl ocl_icd_wrapper_apple

License

PoCL is distributed under the terms of the MIT license. Contributions are expected to be made under the same terms.
