This page lists the Python features supported in CUDA Python. It applies to all kernel and device functions compiled with @cuda.jit and to other higher-level Numba decorators that target the CUDA GPU.
CUDA Python maps directly to the single-instruction, multiple-thread (SIMT) execution model of CUDA. Each instruction is implicitly executed by multiple threads in parallel. With this execution model, array expressions are less useful because we don't want multiple threads to perform the same task. Instead, we want threads to perform a task in a cooperative fashion.
For details please consult the CUDA Programming Guide.
The following Python constructs are not supported:
- Exception handling (try .. except, try .. finally)
- Context management (the with statement)
- Comprehensions (either list, dict, set or generator comprehensions)
- Generator (any yield statements)
The raise statement is supported.
The assert statement is supported, but only has an effect when debug=True is passed to the numba.cuda.jit decorator. This is similar to the behavior of the assert keyword in CUDA C/C++, which is ignored unless compiling with device debug turned on.
Printing of strings, integers, and floats is supported, but printing is an asynchronous operation. To ensure that all output is printed after a kernel launch, it is necessary to call numba.cuda.synchronize. Eliding the call to synchronize is acceptable, but output from a kernel may appear during other later driver operations (e.g. subsequent kernel launches, memory transfers, etc.), or fail to appear before the program execution completes.
Support for the following built-in types is inherited from CPU nopython mode:
- int
- float
- complex
- bool
- None
- tuple
See nopython built-in types.
The following built-in functions are supported:
abs
bool
complex
enumerate
float
int: only the one-argument form
len
min: only the multiple-argument form
max: only the multiple-argument form
pow
range
round
zip
The following functions from the cmath module are supported:
cmath.acos
cmath.acosh
cmath.asin
cmath.asinh
cmath.atan
cmath.atanh
cmath.cos
cmath.cosh
cmath.exp
cmath.isfinite
cmath.isinf
cmath.isnan
cmath.log
cmath.log10
cmath.phase
cmath.polar
cmath.rect
cmath.sin
cmath.sinh
cmath.sqrt
cmath.tan
cmath.tanh
The following functions from the math module are supported:
math.acos
math.asin
math.atan
math.acosh
math.asinh
math.atanh
math.cos
math.sin
math.tan
math.hypot
math.cosh
math.sinh
math.tanh
math.atan2
math.erf
math.erfc
math.exp
math.expm1
math.fabs
math.frexp
math.ldexp
math.gamma
math.lgamma
math.log
math.log2
math.log10
math.log1p
math.sqrt
math.remainder: Python 3.7+
math.pow
math.ceil
math.floor
math.copysign
math.fmod
math.modf
math.isnan
math.isinf
math.isfinite
The following functions from the operator module are supported:
operator.add
operator.and_
operator.eq
operator.floordiv
operator.ge
operator.gt
operator.iadd
operator.iand
operator.ifloordiv
operator.ilshift
operator.imod
operator.imul
operator.invert
operator.ior
operator.ipow
operator.irshift
operator.isub
operator.itruediv
operator.ixor
operator.le
operator.lshift
operator.lt
operator.mod
operator.mul
operator.ne
operator.neg
operator.not_
operator.or_
operator.pos
operator.pow
operator.rshift
operator.sub
operator.truediv
operator.xor
Due to the CUDA programming model, dynamic memory allocation inside a kernel is inefficient and is often not needed. Numba disallows any memory-allocating features. This disables a large number of NumPy APIs. For best performance, users should write code such that each thread deals with a single element at a time.
Supported NumPy features:
- accessing ndarray attributes .shape, .strides, .ndim, .size, etc.
- scalar ufuncs that have equivalents in the math module; e.g. np.sin(x[0]), where x is a 1D array.
- indexing and slicing.
Unsupported NumPy features:
- array creation APIs.
- array methods.
- functions that return a new array.