Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -67,3 +67,7 @@ Please follow instructions in the [DEBUGGING.md](DEBUGGING.md)
## Reporting issues

Please use https://github.com/IntelPython/numba-dppy/issues to report issues and bugs.

## Features

Read this guide for additional features [INDEX.md](docs/INDEX.md)
18 changes: 18 additions & 0 deletions docs/INDEX.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
# numba-dppy

Below is the functionality that is implemented in numba-dppy. You can follow the detailed descriptions of some of the features.

## Offload Diagnostics

Setting the debug environment variable `NUMBA_DPPY_OFFLOAD_DIAGNOSTICS `
(e.g. `export NUMBA_DPPY_OFFLOAD_DIAGNOSTICS=1`) enables the parallel and offload diagnostics information.

If set to an integer value between 1 and 4 (inclusive) diagnostic information about parallel transforms undertaken by Numba will be written to STDOUT. The higher the value set the more detailed the information produced.
In the "Auto-offloading" section there is the information on which device (device name) this parfor or kernel was offloaded.

## Controllable Fallback

With the default behavior of numba-dppy, if a section of code cannot be offloaded on the GPU, then it is automatically executed on the CPU and printed a warning. This behavior only applies to njit functions and auto-offloading of numpy functions, array expressions, and prange loops.

Setting the debug environment variable `NUMBA_DPPY_FALLBACK_OPTION `
(e.g. `export NUMBA_DPPY_FALLBACK_OPTION=0`) enables the code is not automatically offload to the CPU, and an error occurs. This is necessary in order to understand at an early stage which parts of the code do not work on the GPU, and not to wait for the program to execute on the CPU if you don't need it.