Skip to content

Commit a957634

Browse files
committed
[OpenMP] Add documentation for error messages and release notes
Add extra information to the runtime page describing the error messages and add information to the release notes for clang 12.0 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D94562
1 parent f0abe2a commit a957634

File tree

2 files changed

+116
-2
lines changed

2 files changed

+116
-2
lines changed

openmp/docs/ReleaseNotes.rst

Lines changed: 25 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -13,11 +13,34 @@ Introduction
1313
============
1414

1515
This document contains the release notes for the OpenMP runtime, release 12.0.0.
16-
Here we describe the status of openmp, including major improvements
17-
from the previous release. All openmp releases may be downloaded
16+
Here we describe the status of OpenMP, including major improvements
17+
from the previous release. All OpenMP releases may be downloaded
1818
from the `LLVM releases web site <https://llvm.org/releases/>`_.
1919

2020
Non-comprehensive list of changes in this release
2121
=================================================
2222

23+
- Extended the ``libomptarget`` API functions to include source location
24+
information and OpenMP target mapper support. This allows ``libomptarget`` to
25+
know the source location of the OpenMP region it is executing, as well as the
26+
name and declarations of all the variables used inside the region. Each
27+
function generated now uses its ``mapper`` variant. The old API calls now call
28+
into the new API functions with ``nullptr`` arguments for backwards
29+
compatibility with old binaries. Source location information for
30+
``libomptarget`` is now generated by Clang at any level of debugging
31+
information.
2332

33+
- Added improved error messages for ``libomptarget`` and ``CUDA`` plugins. Error
34+
messages are now presented without requiring a debug build of
35+
``libomptarget``. The newly added source location information can also be used
36+
to identify which OpenMP target region the failure occurred in. More
37+
information can be found :ref:`here <libopenmptarget_errors>`.
38+
39+
- Added additional environment variables to control output from the
40+
``libomptarget`` runtime library. ``LIBOMPTARGET_PROFILE`` to
41+
generate time profile output similar to Clang's ``-ftime-trace`` option.
42+
``LIBOMPTARGET_MEMORY_MANAGER_THRESHOLD`` sets the threshold size for which
43+
the ``libomptarget`` memory manager will handle the allocation.
44+
``LIBOMPTARGET_INFO`` allows the user to request certain information from the
45+
``libomptarget`` runtime using a 32-bit field. A full description of each
46+
environment variable is described :ref:`here <libopenmptarget_environment_vars>`.

openmp/docs/design/Runtimes.rst

Lines changed: 91 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,8 @@ the LLVM/OpenMP host runtime, aka. `libomp.so`, is available as a `pdf
1616
LLVM/OpenMP Target Host Runtime (``libomptarget``)
1717
--------------------------------------------------
1818

19+
.. _libopenmptarget_environment_vars:
20+
1921
Environment Variables
2022
^^^^^^^^^^^^^^^^^^^^^
2123

@@ -171,6 +173,95 @@ shows that ``D`` will be copied back from the device once the OpenMP device
171173
kernel region ends even though it isn't written to. Finally, at the end of the
172174
OpenMP data region the entries for ``X`` and ``Y`` are removed from the table.
173175

176+
.. _libopenmptarget_errors:
177+
178+
Errors:
179+
^^^^^^^
180+
181+
``libomptarget`` provides error messages when the program fails inside the
182+
OpenMP target region. Common causes of failure could be an invalid pointer
183+
access, running out of device memory, or trying to offload when the device is
184+
busy. If the application was built with debugging symbols the error messages
185+
will additionally provide the source location of the OpenMP target region.
186+
187+
For example, consider the following code that implements a simple parallel
188+
reduction on the GPU. This code has a bug that causes it to fail in the
189+
offloading region.
190+
191+
.. code-block:: c++
192+
193+
#include <cstdio>
194+
195+
double sum(double *A, std::size_t N) {
196+
double sum = 0.0;
197+
#pragma omp target teams distribute parallel for reduction(+:sum)
198+
for (int i = 0; i < N; ++i)
199+
sum += A[i];
200+
201+
return sum;
202+
}
203+
204+
int main() {
205+
const int N = 1024;
206+
double A[N];
207+
sum(A, N);
208+
}
209+
210+
If this code is compiled and run, there will be an error message indicating what is
211+
going wrong.
212+
213+
.. code-block:: console
214+
215+
$ clang++ -fopenmp -fopenmp-targets=nvptx64 -O3 -gline-tables-only sum.cpp -o sum
216+
$ ./sum
217+
218+
.. code-block:: text
219+
220+
CUDA error: Error when copying data from device to host.
221+
CUDA error: an illegal memory access was encountered
222+
Libomptarget error: Copying data from device failed.
223+
Libomptarget error: Call to targetDataEnd failed, abort target.
224+
Libomptarget error: Failed to process data after launching the kernel.
225+
Libomptarget error: Run with LIBOMPTARGET_INFO=4 to dump host-target pointer mappings.
226+
sum.cpp:5:1: Libomptarget error 1: failure of target construct while offloading is mandatory
227+
228+
This shows that there is an illegal memory access occuring inside the OpenMP
229+
target region once execution has moved to the CUDA device, suggesting a
230+
segmentation fault. This then causes a chain reaction of failures in
231+
``libomptarget``. Another message suggests using the ``LIBOMPTARGET_INFO``
232+
environment variable as described in :ref:`libopenmptarget_environment_vars`. If
233+
we do this it will print the sate of the host-target pointer mappings at the
234+
time of failure.
235+
236+
.. code-block:: console
237+
238+
$ clang++ -fopenmp -fopenmp-targets=nvptx64 -O3 -gline-tables-only sum.cpp -o sum
239+
$ env LIBOMPTARGET_INFO=4 ./sum
240+
241+
.. code-block:: text
242+
243+
info: OpenMP Host-Device pointer mappings after block at sum.cpp:5:1:
244+
info: Host Ptr Target Ptr Size (B) RefCount Declaration
245+
info: 0x00007ffc058280f8 0x00007f4186600000 8 1 sum at sum.cpp:4:10
246+
247+
This tells us that the only data mapped between the host and the device is the
248+
``sum`` variable that will be copied back from the device once the reduction has
249+
ended. There is no entry mapping the host array ``A`` to the device. In this
250+
situation, the compiler cannot determine the size of the array at compile time
251+
so it will simply assume that the pointer is mapped on the device already by
252+
default. The solution is to add an explicit map clause in the target region.
253+
254+
.. code-block:: c++
255+
256+
double sum(double *A, std::size_t N) {
257+
double sum = 0.0;
258+
#pragma omp target teams distribute parallel for reduction(+:sum) map(to:A[0 : N])
259+
for (int i = 0; i < N; ++i)
260+
sum += A[i];
261+
262+
return sum;
263+
}
264+
174265
.. toctree::
175266
:hidden:
176267
:maxdepth: 1

0 commit comments

Comments
 (0)