# Installation of PyOpenCL for SIMD programming

Elwin van 't Wout

PUC Chile

25-9-2024


This tutorial shows how to configure Google Colab for SIMD programming and install the necessary drivers and libraries to use PyOpenCL. Similar installation procedures can be used on other Linux machines as well.

The Google Colab environment provides different hardware backends. By default, a CPU is used, but one can select a GPU accelerator as well. To use a GPU in Google Colab, click Runtime -> Change Runtime Type in the menu and select the T4 GPU.

In the upper right corner, click "Connect T4" to start using a session with GPU hardware. After the runtime environment has been initialized, you can click on the T4 symbol in the upper right corner and see the GPU usage.

Since Python does not run on GPUs natively, special libraries have to be used to program GPU hardware. OpenCL is an open standard for programming heterogeneous hardware, including CPUs, GPUs and other processing units. The PyOpenCL library is a Python interface to OpenCL, whose native API is written in C. However, installing the PyOpenCL library itself is not sufficient because it depends on the correct drivers for the computing devices. Unfortunately, the Google Colab environment recently changed its configuration and removed the default OpenCL drivers, so we have to install them manually.

Remember that the Google Colab environment is a virtual Linux machine, so we can use bash commands as well. These have to be executed after an exclamation mark (!). We even have sudo (super user) access to the virtual machine.

For example, the following cell searches for the path of the python executable.

In [None]:
!which python

/usr/local/bin/python


We can see that Python is available in the `/usr/local/bin` directory.
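
The same lookup can also be done from Python itself, which is convenient when the result is needed in later cells. The following is a minimal sketch that uses only the standard library; the helper name `locate` is hypothetical and not part of the original notebook. Note that the interpreter running the notebook (`sys.executable`) need not be the same file as the `python` found on the search path.

```python
import shutil
import sys


def locate(name):
    """Return the full path of an executable on PATH, or None if it is absent."""
    # shutil.which performs the same search as the `which` shell command.
    return shutil.which(name)


print("python on PATH:       ", locate("python"))
print("notebook interpreter: ", sys.executable)
```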

The drivers for OpenCL are stored in the `/etc/OpenCL/vendors` directory.

In [None]:
!ls /etc/OpenCL/vendors

ls: cannot access '/etc/OpenCL/vendors': No such file or directory


In [None]:
!clinfo

Number of platforms                               0


Unfortunately, the OpenCL drivers are not installed by default on the Google Colab virtual machines. Therefore, OpenCL cannot detect any computing device, even if it is available. The previous outputs show that no drivers are present in the corresponding folder so that OpenCL cannot find any platform to run on.

However, the GPU hardware is available on the virtual machine. The next command provides characteristics of the GPU.

In [None]:
!nvidia-smi

Wed Sep 25 18:34:14 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   56C    P8              10W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

The above output shows that the Tesla T4 GPU card is indeed available. However, OpenCL cannot detect the GPU because it lacks the corresponding information to find the drivers. Hence, we need to install the NVIDIA drivers for the T4 GPU manually.
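
If the GPU characteristics are needed inside Python, `nvidia-smi` can be asked for a machine-readable summary instead of the table above. The sketch below assumes that the `--query-gpu` and `--format=csv,noheader` options of `nvidia-smi` are available on the machine; the helper names `query_gpus` and `parse_gpu_csv` are hypothetical.

```python
import shutil
import subprocess

FIELDS = ("name", "memory.total", "driver_version")


def parse_gpu_csv(text):
    """Turn the CSV text printed by nvidia-smi into one dict per GPU."""
    gpus = []
    for line in text.splitlines():
        values = [value.strip() for value in line.split(",")]
        if len(values) == len(FIELDS):
            gpus.append(dict(zip(FIELDS, values)))
    return gpus


def query_gpus():
    """Return a list of GPU descriptions, or None if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS), "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    # A non-zero return code means the tool could not talk to the driver.
    return parse_gpu_csv(result.stdout) if result.returncode == 0 else None


print(query_gpus())
```

On a CPU-only runtime the function returns `None`, which makes it a simple check that the GPU runtime has actually been selected.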

The `apt` tool installs programs on an Ubuntu machine. Let us first update the package lists. The `-y` flag answers yes to all queries.

In [None]:
!sudo apt -y update

[33m0% [Working][0m            Get:1 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease [3,626 B]
Get:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease [1,581 B]
Get:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  Packages [1,001 kB]
Get:4 http://security.ubuntu.com/ubuntu jammy-security InRelease [129 kB]
Ign:5 https://r2u.stat.illinois.edu/ubuntu jammy InRelease
Get:6 https://r2u.stat.illinois.edu/ubuntu jammy Release [5,713 B]
Hit:7 http://archive.ubuntu.com/ubuntu jammy InRelease
Get:8 https://r2u.stat.illinois.edu/ubuntu jammy Release.gpg [793 B]
Get:9 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [128 kB]
Hit:10 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease
Get:11 https://r2u.stat.illinois.edu/ubuntu jammy/main amd64 Packages [2,586 kB]
Hit:12 https://ppa.launchpadcontent.net/graphics-drivers/ppa/ubuntu jammy InRelease
Hit:13 https://ppa.launchpadcontent.

Google Colab provides access to a T4 card, which is a GPU from NVIDIA's Tesla product line. On NVIDIA GPU cards, the OpenCL implementation is supplied by NVIDIA and runs on top of the CUDA platform. That is, we write OpenCL code and NVIDIA's implementation compiles it for the GPU. Hence, we need to install the CUDA toolkit to use the GPU.

Notice that installing the GPU drivers may take several minutes.

In [None]:
!sudo apt install -y nvidia-cuda-toolkit

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  fonts-dejavu-core fonts-dejavu-extra libaccinj64-11.5 libatk-wrapper-java
  libatk-wrapper-java-jni libbabeltrace1 libcub-dev libcublas11 libcublaslt11
  libcudart11.0 libcufft10 libcufftw10 libcuinj64-11.5 libcupti-dev
  libcupti-doc libcupti11.5 libcurand10 libcusolver11 libcusolvermg11
  libcusparse11 libdebuginfod-common libdebuginfod1 libegl-dev libfontenc1
  libgail-common libgail18 libgl-dev libgl1-mesa-dev libgles-dev libgles1
  libglvnd-core-dev libglvnd-dev libglx-dev libgtk2.0-0 libgtk2.0-bin
  libgtk2.0-common libipt2 libnppc11 libnppial11 libnppicc11 libnppidei11
  libnppif11 libnppig11 libnppim11 libnppist11 libnppisu11 libnppitc11
  libnpps11 libnvblas11 libnvidia-compute-495 libnvidia-compute-510
  libnvidia-compute-535 libnvidia-ml-dev libnvjpeg11 libnvrtc-builtins11.5
  libnvrtc11.2 libnvtoolsext1 libnvvm4 libopengl-de

Let's check if the NVIDIA GPU driver is available to the OpenCL installation.

In [None]:
!ls /etc/OpenCL/vendors

nvidia.icd


The output should now display `nvidia.icd`, which is the *Installable Client Driver* for NVIDIA GPU cards like the T4.
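
An `.icd` file is a small text file that tells the OpenCL loader which shared library implements a platform. The following sketch prints what each file in the vendors directory points to. It assumes the files are plain text and readable by the current user; the helper name `read_icd_entries` is hypothetical.

```python
from pathlib import Path


def read_icd_entries(vendor_dir="/etc/OpenCL/vendors"):
    """Map each .icd file name to the library name or path stored inside it."""
    directory = Path(vendor_dir)
    if not directory.is_dir():
        # No vendors directory means no OpenCL driver has been registered.
        return {}
    return {
        icd.name: icd.read_text().strip()
        for icd in sorted(directory.glob("*.icd"))
    }


for name, library in read_icd_entries().items():
    print(f"{name} -> {library}")
```

An empty result corresponds to the situation at the start of this tutorial, where `ls` reported that the directory did not exist.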



One of the strengths of OpenCL is that it can run the same code on different types of hardware, most importantly on both a CPU and GPU. However, it always needs the corresponding drivers for the hardware, also for a CPU. The POCL library provides the OpenCL drivers for most CPUs. Again, Google Colab does not provide the drivers by default and they need to be installed manually.

In [None]:
!sudo apt install -y pocl-opencl-icd

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  binfmt-support clang-11 libclang-common-11-dev libclang-cpp11 libclang1-11
  libffi-dev libllvm11 libpfm4 libpocl2 libpocl2-common libz3-4 libz3-dev
  llvm-11 llvm-11-dev llvm-11-linker-tools llvm-11-runtime llvm-11-tools
  python3-pygments python3-yaml
Suggested packages:
  clang-11-doc llvm-11-doc python-pygments-doc ttf-bitstream-vera
The following NEW packages will be installed:
  binfmt-support clang-11 libclang-common-11-dev libclang-cpp11 libclang1-11
  libffi-dev libllvm11 libpfm4 libpocl2 libpocl2-common libz3-4 libz3-dev
  llvm-11 llvm-11-dev llvm-11-linker-tools llvm-11-runtime llvm-11-tools
  pocl-opencl-icd python3-pygments python3-yaml
0 upgraded, 20 newly installed, 0 to remove and 50 not upgraded.
Need to get 98.9 MB of archives.
After this operation, 554 MB of additional disk space will be used.
Get:1 http://archive.ubu

Let us check if the ICD is available to OpenCL. The next command should now display both `nvidia.icd` and `pocl.icd`.

In [None]:
!ls /etc/OpenCL/vendors

nvidia.icd  pocl.icd


The following bash command prints out the characteristics of all platforms available to OpenCL.

In [None]:
!clinfo

Number of platforms                               2
  Platform Name                                   Portable Computing Language
  Platform Vendor                                 The pocl project
  Platform Version                                OpenCL 2.0 pocl 1.8  Linux, None+Asserts, RELOC, LLVM 11.1.0, SLEEF, DISTRO, POCL_DEBUG
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_pocl_content_size
  Platform Extensions function suffix             POCL

  Platform Name                                   NVIDIA CUDA
  Platform Vendor                                 NVIDIA Corporation
  Platform Version                                OpenCL 3.0 CUDA 12.2.138
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_
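
Since the full `clinfo` report is long, it can be handy to extract only the platform count when checking the installation from a script. The sketch below assumes the report starts with a `Number of platforms` line, as in the outputs shown in this tutorial; the helper names `count_platforms` and `run_clinfo` are hypothetical.

```python
import re
import shutil
import subprocess


def count_platforms(clinfo_text):
    """Extract the platform count from a clinfo report, or None if absent."""
    match = re.search(r"^Number of platforms\s+(\d+)", clinfo_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def run_clinfo():
    """Return the text printed by clinfo, or None if the tool is not installed."""
    if shutil.which("clinfo") is None:
        return None
    return subprocess.run(["clinfo"], capture_output=True, text=True).stdout


report = run_clinfo()
print("Platforms found:", None if report is None else count_platforms(report))
```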

Now, we have OpenCL and the hardware drivers for the CPU and GPU installed. However, we still need to install the PyOpenCL library to use the Python interface to OpenCL. This can be done with `pip`.

In [None]:
!pip install pyopencl

Collecting pyopencl
  Downloading pyopencl-2024.2.7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.7 kB)
Collecting pytools>=2024.1.5 (from pyopencl)
  Downloading pytools-2024.1.14-py3-none-any.whl.metadata (3.0 kB)
Downloading pyopencl-2024.2.7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (698 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m698.1/698.1 kB[0m [31m15.7 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading pytools-2024.1.14-py3-none-any.whl (89 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m89.9/89.9 kB[0m [31m8.6 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: pytools, pyopencl
Successfully installed pyopencl-2024.2.7 pytools-2024.1.14
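
To confirm from Python which versions were installed, the package metadata can be queried without importing the library. This is a minimal sketch using the standard library; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for package in ("pyopencl", "pytools"):
    print(package, installed_version(package))
```

A value of `None` indicates that `pip` installed the package for a different interpreter than the one running the notebook, or that the installation did not succeed.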


In [None]:
import pyopencl as cl

  warn("Unable to import recommended hash 'siphash24.siphash13', "


Let's print out some characteristics of the OpenCL configuration.

In [None]:
print('OpenCL Devices')
for p, platform in enumerate(cl.get_platforms()):
    print('')
    print('Platform ' + str(p) + ' - Name:    ' + platform.name)
    print('Platform ' + str(p) + ' - Vendor:  ' + platform.vendor)
    print('Platform ' + str(p) + ' - Version: ' + platform.version)
    print('Platform ' + str(p) + ' - Profile: ' + platform.profile)
    for d, device in enumerate(platform.get_devices()):
        print('')
        print('Device ' + str(p) + '.' + str(d) + ' - Name:  ' + device.name)
        print('Device ' + str(p) + '.' + str(d) + ' - Type:  ' + cl.device_type.to_string(device.type))
        print('Device ' + str(p) + '.' + str(d) + ' - Max Clock Speed:  {0} Mhz'.format(device.max_clock_frequency))
        print('Device ' + str(p) + '.' + str(d) + ' - Compute Units:  {0}'.format(device.max_compute_units))
        print('Device ' + str(p) + '.' + str(d) + ' - Local Memory:  {0:.0f} KB'.format(device.local_mem_size/1024.0))
        print('Device ' + str(p) + '.' + str(d) + ' - Constant Memory:  {0:.0f} KB'.format(device.max_constant_buffer_size/1024.0))
        print('Device ' + str(p) + '.' + str(d) + ' - Global Memory:  {0:.0f} GB'.format(device.global_mem_size/1073741824.0))
        print('Device ' + str(p) + '.' + str(d) + ' - Max Work Group Size:  {0:.0f}'.format(device.max_work_group_size))

OpenCL Devices

Platform 0 - Name:    NVIDIA CUDA
Platform 0 - Vendor:  NVIDIA Corporation
Platform 0 - Version: OpenCL 3.0 CUDA 12.2.138
Platform 0 - Profile: FULL_PROFILE

Device 0.0 - Name:  Tesla T4
Device 0.0 - Type:  ALL | GPU
Device 0.0 - Max Clock Speed:  1590 Mhz
Device 0.0 - Compute Units:  40
Device 0.0 - Local Memory:  48 KB
Device 0.0 - Constant Memory:  64 KB
Device 0.0 - Global Memory:  15 GB
Device 0.0 - Max Work Group Size:  1024

Platform 1 - Name:    Portable Computing Language
Platform 1 - Vendor:  The pocl project
Platform 1 - Version: OpenCL 2.0 pocl 1.8  Linux, None+Asserts, RELOC, LLVM 11.1.0, SLEEF, DISTRO, POCL_DEBUG
Platform 1 - Profile: FULL_PROFILE

Device 1.0 - Name:  pthread-Intel(R) Xeon(R) CPU @ 2.00GHz
Device 1.0 - Type:  ALL | CPU
Device 1.0 - Max Clock Speed:  2000 Mhz
Device 1.0 - Compute Units:  2
Device 1.0 - Local Memory:  512 KB
Device 1.0 - Constant Memory:  512 KB
Device 1.0 - Global Memory:  11 GB
Device 1.0 - Max Work Group Size:  4096


The output now displays all platforms that PyOpenCL can find. You should see two platforms, one based on NVIDIA CUDA and the other based on the Portable Computing Language. The first platform supports the Tesla T4 GPU while the second platform supports the Intel Xeon CPU.
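
Listing the devices shows that the drivers are visible, but not yet that a kernel can be compiled and executed on them. The following sketch is one possible smoke test, not part of the original notebook: it selects devices by type and adds two vectors on each of them. The names `find_devices`, `vector_add_on`, `smoke_test` and the kernel `add` are hypothetical, and the code assumes NumPy is installed alongside PyOpenCL.

```python
try:
    import numpy as np
    import pyopencl as cl
except ImportError:  # allows the helpers to be defined without PyOpenCL
    np = cl = None

# Illustrative kernel: each work item adds one pair of elements.
KERNEL_SOURCE = """
__kernel void add(__global const float *a,
                  __global const float *b,
                  __global float *out)
{
    int i = get_global_id(0);
    out[i] = a[i] + b[i];
}
"""


def find_devices(platforms, type_mask):
    """Return all devices on the given platforms whose type matches type_mask."""
    return [
        device
        for platform in platforms
        for device in platform.get_devices()
        if device.type & type_mask
    ]


def vector_add_on(device, n=1024):
    """Add two random float32 vectors on `device`; return the largest error."""
    ctx = cl.Context(devices=[device])
    queue = cl.CommandQueue(ctx)
    a = np.random.rand(n).astype(np.float32)
    b = np.random.rand(n).astype(np.float32)
    out = np.empty_like(a)
    mf = cl.mem_flags
    a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
    b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
    out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, out.nbytes)
    program = cl.Program(ctx, KERNEL_SOURCE).build()
    program.add(queue, (n,), None, a_buf, b_buf, out_buf)
    cl.enqueue_copy(queue, out, out_buf)  # copies the result back to the host
    return float(np.max(np.abs(out - (a + b))))


def smoke_test():
    """Run the vector addition on every CPU and GPU device PyOpenCL finds."""
    if cl is None:
        print("PyOpenCL or NumPy is not installed.")
        return
    try:
        platforms = cl.get_platforms()
    except cl.Error as err:
        print("No OpenCL platform available:", err)
        return
    mask = cl.device_type.CPU | cl.device_type.GPU
    for device in find_devices(platforms, mask):
        print(device.name, "-> max abs error:", vector_add_on(device))
```

Calling `smoke_test()` in a cell runs the kernel once per device. Filtering on `device.type` in Python, rather than asking each platform for a specific type, keeps the selection logic independent of how a platform reports missing device types.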

If you get the error `clGetPlatformIDs failed: PLATFORM_NOT_FOUND_KHR`, then the PyOpenCL library is correctly installed, but it cannot find the GPU drivers. Try restarting the runtime: click Runtime -> Restart runtime in the menu. Installing the NVIDIA toolkit again should not be necessary, since it remains the same virtual machine.

If PyOpenCL still cannot find the GPU, explicitly install the GPU drivers: `!sudo apt install -y nvidia-driver-530`. Notice that this can take a few minutes and you might be requested to select a keyboard layout. Type in `85` and then `1` when prompted.
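
When troubleshooting, it helps to see in one place whether the problem lies with PyOpenCL, with the registered ICD files, or with the platforms themselves. The sketch below gathers these checks; the helper name `diagnose` is hypothetical, and the code assumes the default vendors directory used throughout this tutorial.

```python
from pathlib import Path

try:
    import pyopencl as cl
except ImportError:
    cl = None


def diagnose(vendor_dir="/etc/OpenCL/vendors"):
    """Return a one-line summary of the OpenCL setup as seen from Python."""
    if cl is None:
        return "PyOpenCL is not installed: run `pip install pyopencl`."
    directory = Path(vendor_dir)
    icd_files = sorted(p.name for p in directory.glob("*.icd")) if directory.is_dir() else []
    try:
        platforms = cl.get_platforms()
    except cl.Error as err:
        # PyOpenCL raises an error when the loader reports no usable platform.
        return f"No OpenCL platform found ({err}); ICD files present: {icd_files}"
    names = [platform.name for platform in platforms]
    return f"Found {len(names)} platform(s): {names}; ICD files present: {icd_files}"


print(diagnose())
```

If ICD files are listed but no platform is found, restarting the runtime as described above is the first thing to try.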