# Just-in-Time Compiling

Numba's central feature is the `numba.jit()` decorator. Applying it marks a function for optimization by Numba's JIT compiler. Different invocation modes trigger different compilation options and behaviors.


### Python Decorators

Decorators are a way to uniformly modify functions in a particular way. You can think of them as functions that take functions as input and produce a function as output. See the Python reference documentation for a detailed discussion.

A function definition may be wrapped by one or more decorator expressions. Decorator expressions are evaluated when the function is defined, in the scope that contains the function definition. The result must be callable, which is invoked with the function object as the only argument. The returned value is bound to the function name instead of the function object. Multiple decorators are applied in nested fashion.
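To make the mechanism concrete, here is a minimal sketch of a plain Python decorator, unrelated to Numba. The names `count_calls` and `add` are hypothetical and chosen only for illustration. It shows that `@count_calls` is shorthand for rebinding the function name to whatever the decorator returns.

```python
import functools

def count_calls(func):
    """Hypothetical decorator: wrap func and count how often it is called."""
    @functools.wraps(func)          # keep the original name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@count_calls                        # same as: add = count_calls(add)
def add(a, b):
    return a + b

result = add(1, 2)
print(result, add.calls)
```

Numba's `@jit` follows the same pattern: it takes the Python function and binds the name to a compiled dispatcher object instead.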

Let's see Numba in action. The following is a Python implementation of bubblesort for NumPy arrays.

In [1]:
def bubblesort(X):
    N = len(X)
    for end in range(N, 1, -1):
        for i in range(end - 1):
            cur = X[i]
            if (cur > X[i + 1]):
                tmp = X[i]
                X[i] = X[i + 1]
                X[i + 1] = tmp

First, we'll create an array of sorted values and randomly shuffle them.

In [2]:
import numpy as np

original = np.arange(.0, 10., .01, dtype='f4')
shuffled = original.copy()
np.random.shuffle(shuffled)

Next, create a copy and perform a bubble sort on the copy.

In [3]:
sorted_copy = shuffled.copy()
bubblesort(sorted_copy)
print(np.array_equal(sorted_copy, original))

True


In [4]:
# Timing the execution.
# NOTE: copy the shuffled array before each run. Sorting an already sorted
# array is faster and would distort the timing.
%timeit sorted_copy[:] = shuffled[:]; bubblesort(sorted_copy)

218 ms ± 6.76 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


Now we know the speed of the Python implementation. The recommended way to use the `@jit` decorator is to let Numba decide when and how to optimize, so we simply add the decorator to the function:

In [5]:
from numba import jit

@jit 
def bubblesort(X):
    N = len(X)
    for end in range(N, 1, -1):
        for i in range(end - 1):
            cur = X[i]
            if (cur > X[i + 1]):
                tmp = X[i]
                X[i] = X[i + 1]
                X[i + 1] = tmp

In [6]:
%timeit sorted_copy[:] = shuffled[:]; bubblesort(sorted_copy)

640 µs ± 7.81 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


Using the decorator in this way will defer compilation until the first function execution, so the first execution will be significantly slower.

Numba infers the argument types at call time and generates optimized code based on this information. It can also compile separate specializations depending on the input types.

# Function Signatures

Questions:
> _Is it possible to use function type information to improve performance with Numba?_

Objectives:

> Learn how to specify function signatures.

> Learn the different function signature notations.


It is also possible to specify the signature of the Numba function. A function signature describes the types of the arguments and the return type of the function. This can produce slightly faster code, as the compiler does not need to infer the types. However, the function is no longer able to accept other types.

In [7]:
from numba import jit, int32, float64

@jit(float64(int32, int32))
def f(x, y):
    return (x + y)/3.14

In this example, `float64(int32, int32)` is the function’s signature specifying a function that takes two 32-bit integer arguments and returns a double precision float. Numba provides a shorthand notation, so the same signature can be specified as `f8(i4, i4)`.

The specialization will be compiled by the `@jit` decorator, and no other specialization will be allowed. This is useful if you want fine-grained control over types chosen by the compiler (for example, to use single-precision floats).

If you omit the return type, e.g. by writing `(int32, int32)` instead of `float64(int32, int32)`, Numba will try to infer it for you. Function signatures can also be strings, and you can pass several of them as a list; see the `numba.jit()` documentation for more details.

Of course, the compiled function gives the expected results:

In [8]:
f(1, 3)

1.2738853503184713

In [9]:
# Trying the short version
from numba import jit, f8, i4

@jit(f8(i4, i4))
def f(x, y):
    return (x + y)/3.14

In [10]:
f(1, 3)

1.2738853503184713

In [11]:
# The bubblesort function
from numba import jit

# The argument is a 1-D array of 32-bit floats, not a scalar integer.
@jit('void(f4[:])')
def bubblesort(X):
    N = len(X)
    for end in range(N, 1, -1):
        for i in range(end - 1):
            cur = X[i]
            if cur > X[i + 1]:
                tmp = X[i]
                X[i] = X[i + 1]
                X[i + 1] = tmp

In [12]:
%timeit sorted_copy[:] = shuffled[:]; bubblesort(sorted_copy)

668 µs ± 25.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


### Numba Functions

***Calling other functions***

Numba functions can call other Numba functions. Both functions should have the `@jit` decorator; otherwise the code will be much slower.

In [13]:
import numpy as np
from numba import jit 

@jit('void(f4[:])', nopython=True)
def bubblesort(X):
    N = len(X)
    for end in range(N, 1, -1):
        for i in range(end - 1):
            cur = X[i]
            if cur > X[i + 1]:
                tmp = X[i]
                X[i] = X[i + 1]
                X[i + 1] = tmp
                
@jit('void(f4[:])', nopython=True)
def do_sort(X):
    # Parameter renamed from `sorted` to avoid shadowing the Python builtin.
    bubblesort(X)
    
original = np.arange(.0, 10., .01, dtype='f4')
shuffled = original.copy()
np.random.shuffle(shuffled)
sorted_copy = shuffled.copy()
%timeit sorted_copy[:]=shuffled[:]; do_sort(sorted_copy)

756 µs ± 3.68 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


### NumPy Universal Functions

Numba's `@vectorize` decorator allows Python functions taking scalar input arguments to be used as NumPy `ufunc`s. Creating a traditional NumPy `ufunc` is not the most straightforward process and involves writing some C code. Numba makes this easy. Using the `@vectorize` decorator, Numba can compile a pure Python function into a `ufunc` that operates over NumPy arrays as fast as traditional `ufunc`s written in C.

***Universal Functions (ufunc)***

A universal function (or `ufunc` for short) is a function that operates on NumPy arrays (ndarrays) in an element-by-element fashion. They support array broadcasting, type casting, and several other standard features.

A `ufunc` is a "vectorized" wrapper for a function that takes a fixed number of scalar inputs and produces a fixed number of scalar outputs.