# Using `jit`

We'll start with a trivial example but get to some more realistic applications shortly.

## Array sum

The function below is a naive `sum` that adds up every element of a 2D array using explicit Python loops.

In [1]:
def sum_array(inp):
    J, I = inp.shape
    
    # This is a bad idea in plain Python: every element access goes through the interpreter
    mysum = 0
    for j in range(J):
        for i in range(I):
            mysum += inp[j, i]
            
    return mysum

In [2]:
import numpy

In [3]:
arr = numpy.random.random((300, 300))

In [4]:
sum_array(arr)

45041.071854295071

In [5]:
plain = %timeit -o sum_array(arr)

10 loops, best of 3: 20.5 ms per loop


# Let's get started

In [6]:
from numba import jit

## As a function call

In [7]:
sum_array_numba = jit()(sum_array)

Why the double parentheses? `jit()` returns a decorator, and that decorator is then called on `sum_array`. We'll cover this in more detail in a little bit.

In [8]:
sum_array_numba(arr)

45041.07185429507

In [9]:
jitted = %timeit -o sum_array_numba(arr)

10000 loops, best of 3: 86.2 µs per loop


In [10]:
plain.best / jitted.best

238.1571011913437
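The `%timeit -o` magic only exists inside IPython. As a rough sketch of the same measurement in a plain script, the standard-library `timeit` module can produce a comparable "best per-call time". The helper `best_time` is a hypothetical name introduced here, and the comparison uses `arr.sum()` as a stand-in for the fast function so that the snippet runs without Numba; substituting the jitted function should give a ratio analogous to the one above.

```python
import timeit

import numpy


def sum_array(inp):
    """Naive element-by-element sum over a 2D array."""
    J, I = inp.shape
    mysum = 0
    for j in range(J):
        for i in range(I):
            mysum += inp[j, i]
    return mysum


def best_time(func, *args, repeat=3, number=5):
    """Hypothetical helper: best per-call time in seconds.

    Runs `repeat` rounds of `number` calls each and keeps the fastest round.
    """
    timings = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return min(timings) / number


# A smaller array than in the text, to keep the script quick
arr = numpy.random.random((100, 100))

plain_best = best_time(sum_array, arr)
fast_best = best_time(arr.sum)

# A ratio above 1 means the naive loop is slower than the fast version
speedup = plain_best / fast_best
print(f"naive: {plain_best:.2e} s, fast: {fast_best:.2e} s, ratio: {speedup:.1f}")
```

Taking the minimum over several rounds mirrors the "best of 3" figure that `%timeit` reports, which is less sensitive to background noise than an average.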

## As a decorator (more common)

In [11]:
@jit
def sum_array(inp):
    I, J = inp.shape
    
    mysum = 0
    for i in range(I):
        for j in range(J):
            mysum += inp[i, j]
            
    return mysum

In [12]:
sum_array(arr)

45041.07185429507

In [13]:
%timeit sum_array(arr)

10000 loops, best of 3: 89.1 µs per loop
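The decorator form and the earlier function-call form are two spellings of the same operation. The sketch below illustrates only that calling convention in plain Python. `toy_jit` is a hypothetical stand-in that performs no compilation and is not how Numba is implemented; it just shows how one callable can support both `toy_jit()(f)` and a bare `@toy_jit`.

```python
import functools


def toy_jit(func=None, **options):
    """Hypothetical stand-in for a decorator usable with or without ()."""

    def decorate(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            # A real JIT would dispatch to compiled code here;
            # this toy version just calls the original function.
            return f(*args, **kwargs)

        wrapper.options = options
        return wrapper

    if func is None:
        # Called as toy_jit() or toy_jit(opt=...): return the decorator itself
        return decorate
    # Called as toy_jit(f) or used as a bare @toy_jit: decorate immediately
    return decorate(func)


def add(a, b):
    return a + b


# Function-call form: the first () builds the decorator, the second applies it
add_called = toy_jit()(add)


# Decorator form: Python passes the function to toy_jit for us
@toy_jit
def add_decorated(a, b):
    return a + b


print(add_called(2, 3), add_decorated(2, 3))  # prints "5 5"
```

The empty first pair of parentheses is where options would go, which is why the function-call form needs two calls even when no options are passed.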


## How does this compare to NumPy?

In [14]:
%timeit arr.sum()

The slowest run took 5.33 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 40.7 µs per loop


## When is Numba faster than NumPy?

When the computation is more complex than a single built-in NumPy operation, or when the data uses a less common integer type such as `int16`:

In [15]:
arr_int16 = (arr * 4096).astype(numpy.int16)

In [17]:
jitted_int16 = %timeit -o sum_array_numba(arr_int16)

10000 loops, best of 3: 20 µs per loop


In [18]:
numpy_int16 = %timeit -o arr_int16.sum()

The slowest run took 7.39 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 108 µs per loop


In [19]:
numpy_int16.best / jitted_int16.best

5.420978311244756

NumPy doesn't have a specialized version of `sum()` for 16-bit integers, but Numba just generated one that ran about 5 times faster in this benchmark!  Numba can take advantage of features such as AVX support for packed integers, while NumPy has to cast to a larger datatype to use one of its precompiled implementations.
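One visible sign of the wider datatype is the dtype of the result. The sketch below rebuilds an `int16` array the same way as above and inspects what `sum()` returns. It shows only the accumulator type that NumPy documents for small integer inputs, not the internal code path NumPy takes.

```python
import numpy

arr = numpy.random.random((300, 300))
arr_int16 = (arr * 4096).astype(numpy.int16)

total = arr_int16.sum()

# The input elements are 2 bytes wide...
print(arr_int16.dtype, arr_int16.dtype.itemsize)
# ...but the result uses NumPy's default platform integer, which is wider
print(total.dtype, total.dtype.itemsize)

# Forcing a 16-bit accumulator wraps around for an array this size,
# so the two totals disagree
wrapped = arr_int16.sum(dtype=numpy.int16)
print(int(total), int(wrapped))
```

On most 64-bit platforms the second line prints `int64 8`; the exact width depends on the platform's default integer.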

## When does `numba` compile things?

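By default, the first time you call the function with a given set of argument types. That first call includes the compilation time; later calls with the same types reuse the compiled code.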