Commit 194788b

[libc][docs] Add documentation for the new GPU mode
This patch introduces documentation for the new GPU mode added in D138608. The documentation includes instructions for building and using the library, along with a description of the supported functions and headers.

Reviewed By: sivachandra, lntue, michaelrj

Differential Revision: https://reviews.llvm.org/D138856
1 parent 55151e1 commit 194788b

2 files changed

+169
-0
lines changed

libc/docs/gpu_mode.rst

Lines changed: 168 additions & 0 deletions
@@ -0,0 +1,168 @@
.. _GPU_mode:

==============
GPU Mode
==============

.. include:: check.rst

.. contents:: Table of Contents
  :depth: 4
  :local:

.. note:: This feature is very experimental and may change in the future.

The *GPU* mode of LLVM's libc is an experimental mode used to support calling
libc routines during GPU execution. The goal of this project is to provide
access to the standard C library on systems running accelerators. To begin using
this library, build and install the ``libcgpu.a`` static archive following the
instructions in :ref:`building_gpu_mode` and link with your offloading
application.

.. _building_gpu_mode:

Building the GPU library
========================

LLVM's libc GPU support *must* be built using the same compiler as the final
application to ensure LLVM bitcode compatibility. This can be done
automatically using the ``LLVM_ENABLE_RUNTIMES=libc`` option. Furthermore,
building for the GPU is only supported in :ref:`fullbuild_mode`. To enable the
GPU build, set the target OS to ``gpu`` via ``LLVM_LIBC_TARGET_OS=gpu``. By
default, ``libcgpu.a`` will be built for every supported GPU architecture. To
restrict the architectures that are built, set ``LLVM_LIBC_GPU_ARCHITECTURES``
to a list of the desired architectures; ``all`` selects every supported
architecture. A typical ``cmake`` configuration will look like this:

.. code-block:: sh

  $> cd llvm-project  # The llvm-project checkout
  $> mkdir build
  $> cd build
  # CMAKE_BUILD_TYPE:            select the build type
  # LLVM_LIBC_FULL_BUILD:        we need the full libc
  # LLVM_LIBC_TARGET_OS:         build in GPU mode
  # LLVM_LIBC_GPU_ARCHITECTURES: build all supported architectures
  # CMAKE_INSTALL_PREFIX:        where 'libcgpu.a' will live
  $> cmake ../llvm -G Ninja \
     -DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt" \
     -DLLVM_ENABLE_RUNTIMES="libc;openmp" \
     -DCMAKE_BUILD_TYPE=<Debug|Release> \
     -DLLVM_LIBC_FULL_BUILD=ON \
     -DLLVM_LIBC_TARGET_OS=gpu \
     -DLLVM_LIBC_GPU_ARCHITECTURES=all \
     -DCMAKE_INSTALL_PREFIX=<PATH>
  $> ninja install

Since we want to include ``clang``, ``lld`` and ``compiler-rt`` in our
toolchain, we list them in ``LLVM_ENABLE_PROJECTS``. To ensure ``libc`` is built
using a compatible compiler and to support ``openmp`` offloading, we list
``libc`` and ``openmp`` in ``LLVM_ENABLE_RUNTIMES`` so that they are built after
the enabled projects using the newly built compiler. ``CMAKE_INSTALL_PREFIX``
specifies the directory in which to install the ``libcgpu.a`` library along
with LLVM.

Usage
=====

Once the ``libcgpu.a`` static archive has been built as described in
:ref:`building_gpu_mode`, it can be linked directly with offloading applications
as a standard library. This process is described in the `clang documentation
<https://clang.llvm.org/docs/OffloadingDesign.html>`_. This linking mode is used
by the OpenMP toolchain, but is currently opt-in for the CUDA and HIP toolchains
using the ``--offload-new-driver`` and ``-fgpu-rdc`` flags. A typical usage
will look like this:

.. code-block:: sh

  $> clang foo.c -fopenmp --offload-arch=gfx90a -lcgpu
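
As an illustration, the ``foo.c`` in the command above might contain a function
like the following. This is only a sketch under stated assumptions:
``count_alpha_on_device`` is a hypothetical name, and the sketch assumes that
``isalpha`` (listed under ``ctype.h`` in the supported functions) resolves
against ``libcgpu.a`` inside the OpenMP target region. When compiled without
``-fopenmp`` the pragma is ignored and the loop simply runs on the host.

.. code-block:: c

  #include <ctype.h>
  #include <stddef.h>
  #include <string.h>

  /* Hypothetical helper (not part of libc): counts the alphabetic
     characters of 'text' inside an OpenMP target region. */
  int count_alpha_on_device(const char *text) {
    size_t len = strlen(text); /* evaluated on the host */
    int count = 0;

    /* Copy the string to the device and copy the counter back.
       Assumption: isalpha() is provided on the device by libcgpu.a. */
  #pragma omp target map(to : text[0 : len + 1]) map(tofrom : count)
    {
      for (size_t i = 0; i < len; ++i)
        if (isalpha((unsigned char)text[i]))
          ++count;
    }
    return count;
  }

With this helper, ``count_alpha_on_device("abc123")`` evaluates to ``3``.
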

The ``libcgpu.a`` static archive is a fat binary containing LLVM IR for each
supported target device. The supported architectures can be seen using LLVM's
objdump with the ``--offloading`` flag:

.. code-block:: sh

  $> llvm-objdump --offloading libcgpu.a
  libcgpu.a(strcmp.cpp.o):    file format elf64-x86-64

  OFFLOADING IMAGE [0]:
  kind            llvm ir
  arch            gfx90a
  triple          amdgcn-amd-amdhsa
  producer        <none>

Because the device code is stored inside a fat binary, it can be difficult to
inspect the resulting code. The device code can be extracted and inspected
using the following utilities:

.. code-block:: sh

  $> llvm-ar x libcgpu.a strcmp.cpp.o
  $> clang-offload-packager strcmp.cpp.o --image=arch=gfx90a,file=gfx90a.bc
  $> opt -S gfx90a.bc
  ...

Supported Functions
===================

The following functions and headers are supported at least partially on the
device. Currently, only basic functions that do not require an operating
system are supported. Supporting functions like ``malloc`` using an RPC
mechanism is a work in progress.

ctype.h
-------

============= =========
Function Name Available
============= =========
isalnum       |check|
isalpha       |check|
isascii       |check|
isblank       |check|
iscntrl       |check|
isdigit       |check|
isgraph       |check|
islower       |check|
isprint       |check|
ispunct       |check|
isspace       |check|
isupper       |check|
isxdigit      |check|
toascii       |check|
tolower       |check|
toupper       |check|
============= =========

string.h
--------

============= =========
Function Name Available
============= =========
bcmp          |check|
bzero         |check|
memccpy       |check|
memchr        |check|
memcmp        |check|
memcpy        |check|
memmove       |check|
mempcpy       |check|
memrchr       |check|
memset        |check|
stpcpy        |check|
stpncpy       |check|
strcat        |check|
strchr        |check|
strcmp        |check|
strcpy        |check|
strcspn       |check|
strlcat       |check|
strlcpy       |check|
strlen        |check|
strncat       |check|
strncmp       |check|
strncpy       |check|
strnlen       |check|
strpbrk       |check|
strrchr       |check|
strspn        |check|
strstr        |check|
strtok        |check|
strtok_r      |check|
strdup
strndup
============= =========
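
The table above leaves ``strdup`` and ``strndup`` unchecked, presumably because
they allocate memory. A minimal sketch of one way to work without them,
assuming the caller supplies the destination storage: ``copy_into_buffer`` is a
hypothetical helper, not part of the library, and it relies only on ``strlen``
and ``memcpy``, both of which are marked as available.

.. code-block:: c

  #include <stddef.h>
  #include <string.h>

  /* Hypothetical helper (not part of libc): copies 'src' into the
     caller-provided buffer 'dst' of capacity 'cap', truncating if needed.
     Returns the number of characters copied, excluding the terminator. */
  size_t copy_into_buffer(char *dst, size_t cap, const char *src) {
    if (cap == 0)
      return 0; /* no room even for the terminator */
    size_t len = strlen(src);
    if (len > cap - 1)
      len = cap - 1; /* leave one byte for the terminator */
    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
  }

Copying ``"abcdef"`` into a four-byte buffer with this helper stores ``"abc"``
and returns ``3``.
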

libc/docs/index.rst

Lines changed: 1 addition & 0 deletions
@@ -52,6 +52,7 @@ stages there is no ABI stability in any form.
    usage_modes
    overlay_mode
    fullbuild_mode
+   gpu_mode

 .. toctree::
    :hidden:
