## Introduction to numba

- Accelerate pure Python code
- JIT compiler
- Easy to use
- Supports some parallelization (YMMV)
- Ability to write GPU code (from Python)
- https://numba.pydata.org
- Cross platform


## How does it work?

- Analyzes your code
- Generates low level machine code
- Uses LLVM, the compiler infrastructure also behind Clang, Rust, Julia, and Swift


## Installation

- Install with `conda install numba` or `pip install numba`
- Works on Linux, macOS, and Windows


## Features

- Can happily use numpy code
- Broadcasting and numpy-style indexing
- Pure Python data structures will not be faster
- Nor will generic Python modules like pandas etc.
- Much easier to write than native GPU code for GPU execution

<br/>

- Ideally suited for numerical computation


## Simple example

- First write a function in Python, with numpy and with a plain loop
- Then compare its speed with the numba version


In [None]:
import numpy as np
import numba

In [None]:
def vaxpb(y, x, a, b):
    y[:] = a*x + np.sin(b)

In [None]:
def axpb(y, x, a, b):
    for i in range(y.shape[0]):
        y[i] = a[i]*x[i] + np.sin(b[i])

## Performance with numpy


In [None]:
def make_data(n):
    x = np.linspace(0, 2*np.pi, n)
    a, b = np.random.random((2, n))
    y = np.zeros_like(x)
    return y, x, a, b

```python
y, x, a, b = make_data(100)
```

In [None]:
y, x, a, b = make_data(1000)

In [None]:
%timeit vaxpb(y, x, a, b)

In [None]:
%timeit axpb(y, x, a, b)

## With numba


In [None]:
@numba.njit
def nvaxpb(y, x, a, b):
    y[:] = a*x + np.sin(b)

In [None]:
def dumb_dec(f):
    print("Haha I got the function 2")
    def _shadow_f(x):
        print("I am called every time!")
        return f(x)
    return _shadow_f

In [None]:
@dumb_dec
def g(x):
    return x + 1

In [None]:
g(1)

Applying `@numba.njit` as a decorator is the same as calling it on the function:

In [None]:
nvaxpb = numba.njit(vaxpb)

In [None]:
naxpb = numba.njit(axpb)

In [None]:
%timeit nvaxpb(y, x, a, b)

In [None]:
%timeit naxpb(y, x, a, b)

## Some details

- `numba.njit` == `numba.jit(nopython=True)`
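- What is nopython? The function is compiled entirely to machine code and never falls back to the Python interpreter
- Avoid the alternative, object mode (`nopython=False`): it is much slower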


## Parallel computing

- This has been somewhat experimental


In [None]:
from numba import prange

@numba.njit(parallel=True)
def paxpb(y, x, a, b):
    for i in prange(y.shape[0]):
        y[i] = a[i]*x[i] + np.sin(b[i])


- Doesn't work for me!

In [None]:
y, x, a, b = make_data(1000000)

In [None]:
%timeit paxpb(y, x, a, b)

In [None]:
@numba.njit(parallel=True)
def pvaxpb(y, x, a, b):
    y[:] = a*x + np.sin(b)

- Works and is very fast.

In [None]:
%timeit pvaxpb(y, x, a, b)

In [None]:
@numba.vectorize
def junk(x):
    if x > 0:
        return np.sin(x)
    else:
        return np.cos(x)

In [None]:
x = np.linspace(-1, 1, 100000)

In [None]:
%timeit junk(x)

## More options

- `@vectorize` - numpy ufuncs

- `@jitclass` - for jitted classes (in `numba.experimental`)

- Many more: see the documentation at https://numba.pydata.org

- Possible to get excellent performance with Python
- Use the right tools
