# Introduction to Numba

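* Numba is a just-in-time (JIT) compiler: it compiles Python functions to optimised machine code at run time.
* It works well with numerical code (loops, arithmetic, NumPy arrays). It is less effective with general Python objects (e.g. strings) and classes.
* It is extremely easy to use: often a single decorator is enough.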

## Simple example 

In [1]:
from numba import jit

In [2]:
def sum_n(n):
    """Sum up numbers from 1 to n"""
    total = 0
    for i in range(1, n+1):
        total += i
    return total

In [3]:
t1 = %timeit -o sum_n(1000)

60.3 µs ± 1.27 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [4]:
t1.average

6.0337307385738575e-05

In [5]:
@jit
def sum_n(n):
    """Sum up numbers from 1 to n"""
    total = 0
    for i in range(1, n+1):
        total += i
    return total

In [6]:
t2 = %timeit -o sum_n(1000)

188 ns ± 5.28 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [7]:
print(f'Speedup factor: {t1.average/t2.average:.2f}')

Speedup factor: 320.98


## What's the catch?

* Numba works well with numeric code and data types

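* Numba has two modes: *nopython* and *object*. It's worth knowing which mode is being used. Summary: *nopython* is good (the function is compiled without calling into the Python interpreter); *object* is bad (likely no faster than plain Python). Note that recent Numba releases make *nopython* the default for `@jit`, so the object-mode fallback shown below reflects older releases.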

In [8]:
@jit
def add_stuff(a,b):
    return a+b

When `add_stuff` is called for the first time, Numba inspects the data types of the function arguments. It then compiles a version of the function specialised for those types.

In [9]:
add_stuff(1., 2.)

3.0

We can look at the inferred data types with the `inspect_types()` method. Here we can see that `float64` is used. As this is a numeric type, Numba is using *nopython* mode.

In [10]:
add_stuff.inspect_types()

add_stuff (float64, float64)
--------------------------------------------------------------------------------
# File: <ipython-input-8-c27ce7f5b967>
# --- LINE 1 --- 
# label 0
#   del b
#   del a
#   del $0.3

@jit

# --- LINE 2 --- 

def add_stuff(a,b):

    # --- LINE 3 --- 
    #   a = arg(0, name=a)  :: float64
    #   b = arg(1, name=b)  :: float64
    #   $0.3 = a + b  :: float64
    #   $0.4 = cast(value=$0.3)  :: float64
    #   return $0.4

    return a+b




But Python is dynamically typed, so we can call `add_stuff` with other types, e.g. strings:

In [11]:
add_stuff('Good', ' morning')

'Good morning'

Numba has now compiled a second, string version of the function. Using `inspect_types()` we can see both specialisations: the original `float64` one and a new one that takes `str` arguments. Note also the use of `pyobject`: this means Numba is in *object* mode for the `str` version.

In [12]:
add_stuff.inspect_types()

add_stuff (float64, float64)
--------------------------------------------------------------------------------
# File: <ipython-input-8-c27ce7f5b967>
# --- LINE 1 --- 
# label 0
#   del b
#   del a
#   del $0.3

@jit

# --- LINE 2 --- 

def add_stuff(a,b):

    # --- LINE 3 --- 
    #   a = arg(0, name=a)  :: float64
    #   b = arg(1, name=b)  :: float64
    #   $0.3 = a + b  :: float64
    #   $0.4 = cast(value=$0.3)  :: float64
    #   return $0.4

    return a+b


add_stuff (str, str)
--------------------------------------------------------------------------------
# File: <ipython-input-8-c27ce7f5b967>
# --- LINE 1 --- 
# label 0
#   del b
#   del a
#   del $0.3

@jit

# --- LINE 2 --- 

def add_stuff(a,b):

    # --- LINE 3 --- 
    #   a = arg(0, name=a)  :: pyobject
    #   b = arg(1, name=b)  :: pyobject
    #   $0.3 = a + b  :: pyobject
    #   $0.4 = cast(value=$0.3)  :: pyobject
    #   return $0.4

    return a+b




### Force nopython mode 
We can force nopython mode with the `@njit` decorator:

In [13]:
from numba import njit

In [14]:
@njit
def add_stuff(a,b):
    return a+b

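With `@njit`, Numba raises an error instead of falling back to object mode when it cannot compile the function for the given argument types, such as strings in the Numba version used here: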

In [None]:
add_stuff('Bad', ' morning')

This is helpful because it fails fast: the error tells you immediately that Numba cannot compile the code efficiently, rather than leaving you with a silent slowdown.

## Numba is also multi-threaded

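Numba can run code on multiple threads, for example in functions compiled with `parallel=True`. You can control the number of threads with the `NUMBA_NUM_THREADS` environment variable.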

## Further information

Numba site: https://numba.pydata.org

Great video on Numba in more detail: https://youtu.be/1AwG0T4gaO0

## Summary

My tips for writing optimised code:

* Use NumPy/SciPy as much as possible
* 'Pythonic' code is often faster: prefer built-ins and vectorised operations over explicit `for` loops. But what exactly is 'Pythonic'?
* If you still need loops, use Numba
* Use profiling to check that code changes result in a real speed-up
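As an illustration of the last tip, the standard-library `timeit` module can be used outside IPython to check whether a rewrite is really faster. The sketch below compares an explicit loop against the built-in `sum`. The function names are illustrative and the measured times depend on the machine.

```python
import timeit


def sum_loop(n):
    """Sum 1..n with an explicit for loop."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_builtin(n):
    """Sum 1..n with the built-in sum()."""
    return sum(range(1, n + 1))


n = 1000

# Check that both versions agree before comparing their speed
assert sum_loop(n) == sum_builtin(n)

# Take the best of 5 repeats, each running the function 1000 times
loop_time = min(timeit.repeat(lambda: sum_loop(n), number=1000, repeat=5))
builtin_time = min(timeit.repeat(lambda: sum_builtin(n), number=1000, repeat=5))

print(f"loop:    {loop_time:.4f} s per 1000 calls")
print(f"builtin: {builtin_time:.4f} s per 1000 calls")
```

Checking correctness first matters: a faster version that returns a different answer is not an optimisation.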