In [13]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

In [4]:
import numpy as np
import numba as nb

Numba is a just-in-time (JIT) compiler for Python; it works best on code that uses NumPy arrays and loops.

**JIT compilation**:

A JIT compiler runs after the program has started and compiles code (usually bytecode or some other kind of VM instructions) on the fly, or just in time, into a faster form, typically the host CPU's native instruction set. Unlike a standard compiler, a JIT has access to dynamic runtime information, so it can make better optimizations, such as inlining functions that are called frequently.

This is in contrast to a traditional compiler that compiles all the code to machine language before the program is first run.

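The GIL (global interpreter lock) means that, within a single CPython interpreter process, only one thread can execute Python bytecode at a time. Threads share the same memory heap (unlike processes, which run independently with their own memory), so unsynchronized access from several threads could cause race conditions on the interpreter's internal state, for example on reference counts. The GIL was introduced to prevent that.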

In [5]:
# numba nopython mode supports only a subset of Python and NumPy (not arbitrary Python objects)

# nb.jit - JIT compiler

x = np.arange(100).reshape(10, 10)

@nb.jit(nopython=True) # set "nopython" mode for best performance, equivalent to @njit
def go_fast(a): # Function is compiled to machine code when called the first time
    trace = 0.0
    for i in range(a.shape[0]):   # Numba likes loops
        trace += np.tanh(a[i, i]) # Numba likes NumPy functions
    return a + trace              # Numba likes NumPy broadcasting

print(go_fast(x))

[[  9.  10.  11.  12.  13.  14.  15.  16.  17.  18.]
 [ 19.  20.  21.  22.  23.  24.  25.  26.  27.  28.]
 [ 29.  30.  31.  32.  33.  34.  35.  36.  37.  38.]
 [ 39.  40.  41.  42.  43.  44.  45.  46.  47.  48.]
 [ 49.  50.  51.  52.  53.  54.  55.  56.  57.  58.]
 [ 59.  60.  61.  62.  63.  64.  65.  66.  67.  68.]
 [ 69.  70.  71.  72.  73.  74.  75.  76.  77.  78.]
 [ 79.  80.  81.  82.  83.  84.  85.  86.  87.  88.]
 [ 89.  90.  91.  92.  93.  94.  95.  96.  97.  98.]
 [ 99. 100. 101. 102. 103. 104. 105. 106. 107. 108.]]


In Numba versions before 0.59, @jit first tries nopython mode and falls back to object mode if that fails; from 0.59 on, @jit compiles in nopython mode by default. It is recommended to use @njit directly (same as @jit(nopython=True)) to make sure the code runs without the Python interpreter, which is where the performance improvement actually comes from.

The first call of an @njit function adds some overhead because of compilation. Once compiled, the machine code is kept in memory for those argument types, so subsequent calls have no compilation overhead and are hopefully faster than the pure Python version.

In [14]:
# @nb.vectorize produces a numpy ufunc (an element-wise function that broadcasts over arrays); np.vectorize offers a similar interface but still loops in Python
# used when we need to both compile a function of scalar arguments and vectorize it

import math
arr = np.array([1.2, 3.2, 5.4])

# math.ceil(arr)  # won't work since math.ceil is not a ufunc (it only accepts scalars)
arr_ceil = np.vectorize(math.ceil)
arr_ceil(arr)

# arr_ceil_nb = nb.njit(arr_ceil)  # won't work

@nb.vectorize
def arr_ceil_nb(x):
    return math.ceil(x)

arr_ceil_nb(arr)

array([2, 4, 6])

array([2, 4, 6])

In [19]:
# eager compilation - specify signature, types must be imported from numba
from numba import int32

@nb.njit(int32(int32, int32))
def f(x, y):
    return x + y

# signature syntax: return_type(arg_types...); int32, int64, float64 etc. for scalars, void for functions with no return value, int64[:], int64[:,:] etc. for 1-D, 2-D arrays

In [18]:
# a certain subset of functions from numpy and math is supported by numba and can be compiled
# apart from those, only compiled user functions can be called inside other compiled functions (in nopython mode, of course)
@nb.njit
def square(x):
    return x ** 2

@nb.njit
def hypot(x, y):
    return math.sqrt(square(x) + square(y))

* The Python interpreter uses the GIL. If a function is compiled by numba in nopython mode (njit), it no longer needs to hold Python's GIL while it runs.

* Thus we can pass the boolean argument `nogil=True` to @jit or @njit
* nogil mode can give better performance on multi-core systems when the function is called from several threads, but we have to take care of race conditions etc. (if that's an issue)

### arguments of @jit
* nopython
* nogil : release GIL if True
* cache : if True, cache the compiled function in a file on the first run
* parallel : enables parallelization for certain types of operations

In [21]:
# parallel : at the moment only works on CPUs

# some operations are naturally suitable for parallelization (like adding a scalar to an array) - if parallel=True, numba will automatically detect and parallelize such parts of a function

# can use numba.prange to make a loop parallel (loop iterations must not depend on each other)
# in plain Python, or without parallel=True, prange behaves just like range
# best to use with nogil=True

In [25]:
# can pass multiple possible signatures in a list
from numba import int32, int64, float32, float64

@nb.vectorize([int32(int32, int32),
            int64(int64, int64),
            float32(float32, float32),
            float64(float64, float64)])
def f(x, y):
    return x + y

# generalized vectorized functions (gufuncs) operate on sub-arrays instead of scalars; the layout string '(n),()->(n)' means: 1-D array and scalar in, 1-D array out
@nb.guvectorize([(int64[:], int64, int64[:])], '(n),()->(n)')
def g(x, y, res):
    for i in range(x.shape[0]):
        res[i] = x[i] + y

arr = np.arange(6).reshape(2, 3)
arr
g(arr, 10)

array([[0, 1, 2],
       [3, 4, 5]])

array([[10, 11, 12],
       [13, 14, 15]])

### both @nb.vectorize and @nb.guvectorize support nopython=True

In [34]:
# can use structured arrays inside jitted functions (plain Python dicts, on the other hand, are not supported; numba.typed.Dict is the typed alternative)
# structured arrays will often help when we separate and jit numerical part of a function on a pandas df
# (and typically already have column names, etc)

# use pandas method df.to_records() to convert df to structured array and use in numba jit function

@nb.njit
def f(x):
    print(x.ndim)  # the ndim attribute is supported
    return x['a'].sum()

arr = np.array([(4, 5), (3, 6), (2, 3)], dtype=np.dtype([('a', 'f8'), ('b', 'f8')]))

f(arr)

1


9.0

In [55]:
@nb.njit
def get_ndim(x):
    print(x.ndim)


get_ndim(np.array(4))  # can pass arguments of different types
get_ndim(np.array([1, 2, 3]))  # with JIT compilation, a separate specialization is compiled at call time for each new combination of argument types, when the types are already known (ahead-of-time compilation, in contrast, needs a well-defined signature up front)

0
1
