## Numba Example

From “Python for Data Analysis” by Wes McKinney (O’Reilly); see Appendix A.

Example of:

- a slow loop in Python
- a faster NumPy vectorized calculation
- an even faster Numba-compiled approach

In [1]:
import numpy as np

def mean_distance(x, y):
    nx = len(x)
    result = 0.0
    count = 0
    for i in range(nx):
        result += x[i] - y[i]
        count += 1
    return result / count

Here we compute the mean difference between two relatively large arrays filled with random values.

The %timeit magic function (i.e., an IPython command available in Jupyter notebooks) gives quick timing estimates for a piece of code.
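Outside a notebook, where `%timeit` is not available, a similar estimate can be obtained with the standard-library `timeit` module. The following is a minimal sketch and not part of the book's example. The array size (100,000) and the repeat count are assumptions chosen to keep the run short.

```python
import timeit

import numpy as np


def mean_distance(x, y):
    # Same loop-based function as above, repeated so this block runs on its own.
    nx = len(x)
    result = 0.0
    count = 0
    for i in range(nx):
        result += x[i] - y[i]
        count += 1
    return result / count


# Smaller arrays than in the notebook (assumption) so the loop finishes quickly.
x = np.random.randn(100_000)
y = np.random.randn(100_000)

# One call per measurement, three measurements in total.
timings = timeit.repeat(lambda: mean_distance(x, y), number=1, repeat=3)

# The minimum is usually the least noisy estimate of the run time.
best = min(timings)
print(f"best of {len(timings)}: {best * 1e3:.1f} ms per call")
```

Unlike `%timeit`, which chooses the number of loops automatically, `timeit.repeat` needs `number` and `repeat` set explicitly.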

In [6]:
x = np.random.randn(5000000)
y = np.random.randn(5000000)

%timeit mean_distance(x, y)

2.19 s ± 15.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [7]:
%timeit (x - y).mean()

14.9 ms ± 26.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [8]:
import numba as nb

# Create a compiled version of the loop-based function

numba_mean_distance = nb.jit(mean_distance)

In [9]:
%timeit numba_mean_distance(x, y)

5.06 ms ± 32.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
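Numba compiles the function the first time it is called with a given set of argument types, so the first call is slower than later ones. The sketch below, which is not from the book, makes one warm-up call and then checks that the compiled loop agrees with the vectorized NumPy result. It uses `nb.njit`, Numba's shorthand for `jit(nopython=True)`. The array size, the name `fast_mean_distance`, and the fallback to the uncompiled function when Numba is not installed are assumptions made for illustration.

```python
import numpy as np

try:
    import numba as nb
except ImportError:  # assumption: let the sketch run even without Numba
    nb = None


def mean_distance(x, y):
    # Same loop-based function as above, repeated so this block runs on its own.
    nx = len(x)
    result = 0.0
    count = 0
    for i in range(nx):
        result += x[i] - y[i]
        count += 1
    return result / count


# Compile with Numba when available; otherwise keep the plain Python function.
fast_mean_distance = nb.njit(mean_distance) if nb is not None else mean_distance

# Smaller arrays than in the notebook (assumption) to keep the check quick.
x = np.random.randn(100_000)
y = np.random.randn(100_000)

# Warm-up call: with Numba, this triggers compilation for float64 arrays.
fast_mean_distance(x, y)

loop_result = fast_mean_distance(x, y)
vector_result = (x - y).mean()

# The two approaches sum in a different order, so compare with a tolerance.
print(np.isclose(loop_result, vector_result))
```

Comparing with `np.isclose` instead of `==` allows for the small floating-point differences that come from adding the same values in a different order.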
