|
| 1 | +.. _GPU_mode: |
| 2 | + |
| 3 | +============== |
| 4 | +GPU Mode |
| 5 | +============== |
| 6 | + |
| 7 | +.. include:: check.rst |
| 8 | + |
| 9 | +.. contents:: Table of Contents |
| 10 | + :depth: 4 |
| 11 | + :local: |
| 12 | + |
| 13 | +.. note:: This feature is very experimental and may change in the future. |
| 14 | + |
| 15 | +The *GPU* mode of LLVM's libc is an experimental mode used to support calling |
| 16 | +libc routines during GPU execution. The goal of this project is to provide |
| 17 | +access to the standard C library on systems running accelerators. To begin using |
| 18 | +this library, build and install the ``libcgpu.a`` static archive following the |
| 19 | +instructions in :ref:`building_gpu_mode` and link with your offloading |
| 20 | +application. |
| 21 | + |
| 22 | +.. _building_gpu_mode: |
| 23 | + |
| 24 | +Building the GPU library |
| 25 | +======================== |
| 26 | + |
| 27 | +LLVM's libc GPU support *must* be built using the same compiler as the final |
| 28 | +application to ensure relative LLVM bitcode compatibility. This can be done |
| 29 | +automatically using the ``LLVM_ENABLE_RUNTIMES=libc`` option. Furthermore, |
| 30 | +building for the GPU is only supported in :ref:`fullbuild_mode`. To enable the |
| 31 | +GPU build, set the target OS to ``gpu`` via ``LLVM_LIBC_TARGET_OS=gpu``. By |
| 32 | +default, ``libcgpu.a`` will be built using every supported GPU architecture. To |
| 33 | +restrict the number of architectures built, set ``LLVM_LIBC_GPU_ARCHITECTURES`` |
| 34 | +to the list of desired architectures or use ``all``. A typical ``cmake`` |
| 35 | +configuration will look like this: |
| 36 | + |
| 37 | +.. code-block:: sh |
| 38 | +
|
| 39 | + $> cd llvm-project # The llvm-project checkout |
| 40 | + $> mkdir build |
| 41 | + $> cd build |
| 42 | + $> cmake ../llvm -G Ninja \ |
| 43 | + -DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt" \ |
| 44 | + -DLLVM_ENABLE_RUNTIMES="libc;openmp" \ |
| 45 | + -DCMAKE_BUILD_TYPE=<Debug|Release> \ # Select build type |
| 46 | + -DLLVM_LIBC_FULL_BUILD=ON \ # We need the full libc |
| 47 | + -DLLVM_LIBC_TARGET_OS=gpu \ # Build in GPU mode |
| 48 | + -DLLVM_LIBC_GPU_ARCHITECTURES=all \ # Build all supported architectures |
| 49 | + -DCMAKE_INSTALL_PREFIX=<PATH> # Where 'libcgpu.a' will live |
| 50 | + $> ninja install |
| 51 | +
|
| 52 | +Since we want to include ``clang``, ``lld`` and ``compiler-rt`` in our |
| 53 | +toolchain, we list them in ``LLVM_ENABLE_PROJECTS``. To ensure ``libc`` is built |
| 54 | +using a compatible compiler and to support ``openmp`` offloading, we list them |
| 55 | +in ``LLVM_ENABLE_RUNTIMES`` to build them after the enabled projects using the |
| 56 | +newly built compiler. ``CMAKE_INSTALL_PREFIX`` specifies the installation |
| 57 | +directory in which to install the ``libcgpu.a`` library along with LLVM. |
| 58 | + |
| 59 | +Usage |
| 60 | +===== |
| 61 | + |
| 62 | +Once the ``libcgpu.a`` static archive has been built in |
| 63 | +:ref:`building_gpu_mode`, it can be linked directly with offloading applications |
| 64 | +as a standard library. This process is described in the `clang documentation |
| 65 | +<https://clang.llvm.org/docs/OffloadingDesign.html>`_. This linking mode is used |
| 66 | +by the OpenMP toolchain, but is currently opt-in for the CUDA and HIP toolchains |
| 67 | +using the ``--offload-new-driver`` and ``-fgpu-rdc`` flags. A typical usage |
| 68 | +will look like this: |
| 69 | + |
| 70 | +.. code-block:: sh |
| 71 | +
|
| 72 | + $> clang foo.c -fopenmp --offload-arch=gfx90a -lcgpu |
| 73 | +
|
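As a rough illustration of what ``foo.c`` might contain, the sketch below calls ``strlen`` and ``isalpha`` (both listed as supported below) from inside an OpenMP target region. The helper name ``count_alpha`` and the 64-byte buffer are hypothetical choices made for this example and are not part of the library. Without offloading enabled, the pragma is ignored and the same code runs on the host.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libc): counts alphabetic characters.
 * The counting loop runs inside an OpenMP target region, so with
 * offloading enabled the strlen and isalpha calls execute on the device. */
int count_alpha(const char *text) {
  char buf[64] = {0};                  /* fixed-size copy that is easy to map */
  strncpy(buf, text, sizeof(buf) - 1); /* host-side copy, always terminated */

  int count = 0;
#pragma omp target map(to : buf) map(tofrom : count)
  {
    size_t len = strlen(buf);
    for (size_t i = 0; i < len; ++i)
      if (isalpha((unsigned char)buf[i]))
        ++count;
  }
  return count;
}
```

Calling ``count_alpha("GPU mode 123")`` from ``main`` should return 7 whether the region runs on the host or the device. Copying into a fixed-size array keeps the ``map`` clause simple, at the cost of truncating inputs longer than 63 characters.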
| 74 | +The ``libcgpu.a`` static archive is a fat binary containing LLVM IR for each |
| 75 | +supported target device. The supported architectures can be seen using LLVM's |
| 76 | +objdump with the ``--offloading`` flag: |
| 77 | + |
| 78 | +.. code-block:: sh |
| 79 | +
|
| 80 | + $> llvm-objdump --offloading libcgpu.a |
| 81 | + libcgpu.a(strcmp.cpp.o): file format elf64-x86-64 |
| 82 | +
|
| 83 | + OFFLOADING IMAGE [0]: |
| 84 | + kind llvm ir |
| 85 | + arch gfx90a |
| 86 | + triple amdgcn-amd-amdhsa |
| 87 | + producer <none> |
| 88 | +
|
| 89 | +Because the device code is stored inside a fat binary, it can be difficult to |
| 90 | +inspect the resulting code. This can be done using the following utilities: |
| 91 | + |
| 92 | +.. code-block:: sh |
|  | + |
| 93 | + $> llvm-ar x libcgpu.a strcmp.cpp.o |
| 94 | + $> clang-offload-packager strcmp.cpp.o --image=arch=gfx90a,file=gfx90a.bc |
| 95 | + $> opt -S gfx90a.bc |
| 96 | + ... |
| 97 | +
|
| 98 | +Supported Functions |
| 99 | +=================== |
| 100 | + |
| 101 | +The following functions and headers are supported at least partially on the |
| 102 | +device. Currently, only basic device functions that do not require an operating |
| 103 | +system are supported on the device. Supporting functions like ``malloc`` using an |
| 104 | +RPC mechanism is a work-in-progress. |
| 105 | + |
| 106 | +ctype.h |
| 107 | +------- |
| 108 | + |
| 109 | +============= ========= |
| 110 | +Function Name Available |
| 111 | +============= ========= |
| 112 | +isalnum |check| |
| 113 | +isalpha |check| |
| 114 | +isascii |check| |
| 115 | +isblank |check| |
| 116 | +iscntrl |check| |
| 117 | +isdigit |check| |
| 118 | +isgraph |check| |
| 119 | +islower |check| |
| 120 | +isprint |check| |
| 121 | +ispunct |check| |
| 122 | +isspace |check| |
| 123 | +isupper |check| |
| 124 | +isxdigit |check| |
| 125 | +toascii |check| |
| 126 | +tolower |check| |
| 127 | +toupper |check| |
| 128 | +============= ========= |
| 129 | + |
| 130 | +string.h |
| 131 | +-------- |
| 132 | + |
| 133 | +============= ========= |
| 134 | +Function Name Available |
| 135 | +============= ========= |
| 136 | +bcmp |check| |
| 137 | +bzero |check| |
| 138 | +memccpy |check| |
| 139 | +memchr |check| |
| 140 | +memcmp |check| |
| 141 | +memcpy |check| |
| 142 | +memmove |check| |
| 143 | +mempcpy |check| |
| 144 | +memrchr |check| |
| 145 | +memset |check| |
| 146 | +stpcpy |check| |
| 147 | +stpncpy |check| |
| 148 | +strcat |check| |
| 149 | +strchr |check| |
| 150 | +strcmp |check| |
| 151 | +strcpy |check| |
| 152 | +strcspn |check| |
| 153 | +strlcat |check| |
| 154 | +strlcpy |check| |
| 155 | +strlen |check| |
| 156 | +strncat |check| |
| 157 | +strncmp |check| |
| 158 | +strncpy |check| |
| 159 | +strnlen |check| |
| 160 | +strpbrk |check| |
| 161 | +strrchr |check| |
| 162 | +strspn |check| |
| 163 | +strstr |check| |
| 164 | +strtok |check| |
| 165 | +strtok_r |check| |
| 166 | +strdup |
| 167 | +strndup |
| 168 | +============= ========= |