### **Numba Python Library**
**What is Numba?** <br>
"Numba is an open source JIT (just-in-time) compiler that translates a subset of Python and NumPy code into fast machine code" - numba.pydata.org <br>
 - Uses LLVM compiler library to translate Python functions to optimized machine code <br>
 - Numba's speed is comparable to code in C, C++ and Fortran <br>
 
**Why Numba?**
 - Useful to speed up computationally heavy Python functions, especially loops <br>
 - Can run a simple NumPy operation up to about 2 times faster and complex Python loops 4 or more times faster, depending on the workload
 - Can still use the same Python code; you only have to add a decorator to a function
 - The compiled machine code is kept in memory after the first call, so later calls with the same argument types skip compilation and run faster
    

In [16]:
# pip install numba

In [17]:
import numpy as np
import numba
from numba import jit

In [18]:
print(numba.__version__)

0.52.0


### Numba's JIT Compiler
The central feature of Numba is the numba.jit() decorator. This decorator marks a function for optimization and decides when and how to optimize.  

In [19]:
@jit 
def f(x, y):
    output = x - y
    return output

In [20]:
print(f(3,2))
print(f(3j,2))

1
(-2+3j)


In this example, compilation is deferred until the function is first called. Numba infers the argument types at call time and generates a separate optimized specialization for each combination of types. In the example above, f() was compiled twice: once for two integer arguments and once for a complex and an integer argument, which is why the second call returns a complex result.

We can also specify the function signature, meaning the argument and return types we expect. With a signature, the @jit decorator compiles that one specialization immediately, when the function is defined, and no other specialization is allowed.

In [21]:
from numba import jit, int64
@jit(int64(int64, int64))
def f(x, y):
    output = x - y
    return output

In [22]:
print(f(5, 3))
print(f(2**62, 2**62 + 1))

2
-1


### JIT Signatures
 - void - functions that return nothing
 - intp and uintp - pointer-sized integers
 - intc and uintc - same as C integer types
 - int8, uint8, int16, uint16, int32, uint32, int64, uint64
 - float32, float64
 - complex64, complex128 - single and double precision complex numbers
 - float32[:] and int8[:] - array types

### Compilation Options
Numba has two compilation modes: <br>
 - object mode - handles all values as Python objects and uses the Python C API to run the code. It is usually no faster than plain Python code.
 - nopython mode - produces much faster code that runs without the Python interpreter, but requires that Numba can infer a native type for every value in the function. In the Numba version used here (0.52), @jit falls back to object mode when type inference fails; pass nopython=True to raise an error instead.

### Numba Speed Test

We will build a function with the Numba compiler and compare its run time against the same function run as plain Python. The compiled version uses the nopython=True compilation mode.

In [23]:
@jit(nopython=True)
def go_fast(x):
    a = np.arange(x)
    #print(a)
    for i in range(len(a)):
        a[i] = np.sqrt(a[i])
    return a 

In [24]:
go_fast(100)

array([0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4,
       4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6,
       6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8,
       8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9,
       9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9])

Now that we have built our function, we can use the %timeit magic function to time how long it takes for the function to execute <br>
 - The %timeit magic function will run go_fast many times in a loop in order to get a more accurate estimation of its execution time

In [25]:
%timeit go_fast(100)

723 ns ± 2.51 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


We can then use the .py_func attribute to run go_fast as the original, uncompiled Python function

In [26]:
np.testing.assert_array_equal(go_fast(100), go_fast.py_func(100)) # checks that both arrays are equal
%timeit go_fast.py_func(100)

202 µs ± 7.86 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


We can compare the loop speed for each function:
 - Numba Function: 723 ns ± 2.51 ns per loop <br>
 - Original Python Function: 202 µs ± 7.86 µs per loop <br>
The Numba function is roughly 280 times faster than the Python function here. Numba is fast in this case because it compiles the explicit loop to machine code, whereas the same loop runs slowly in the Python interpreter.

In [27]:
@jit(nopython=True)
def car_simulation(N):
    #N = 5
    m = np.random.randint(1,4,N)
    t = np.random.randint(-1,2,N)
    x = [0 for i in range(N)]
    y = [0 for i in range(N)]
    d = [0 for i in range(N)]

    point_x = 0
    point_y = 0

    for i in range(0,N):
        d[i] = (d[i-1] + t[i]) % 4
        if d[i] == 0:
            point_y = point_y + m[i] 
        elif d[i] == 1:
            point_x = point_x + m[i]
        elif d[i] == 2:
            point_y = point_y - m[i] 
        else:
            point_x = point_x - m[i] 
        x[i] = point_x
        y[i] = point_y

In [28]:
%timeit car_simulation(10)

The slowest run took 13.86 times longer than the fastest. This could mean that an intermediate result is being cached.
7.43 µs ± 8.41 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [29]:
np.testing.assert_array_equal(car_simulation(10), car_simulation.py_func(10)) # car_simulation has no return statement, so both calls return None; this only confirms that both versions run
%timeit car_simulation.py_func(10)

35.8 µs ± 1.01 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


### Disadvantages of Numba
 - Numba is beneficial mainly for numerical code built from Python loops and NumPy arrays. It cannot compile code that relies on libraries it does not understand, such as Pandas.
 - However, it only takes a few minutes to check whether it works with your code

### Sources
 - https://numba.pydata.org/ 
 - https://numba.pydata.org/numba-doc/latest/user/jit.html
 - https://towardsdatascience.com/speed-up-your-algorithms-part-2-numba-293e554c5cc1 
 - https://towardsdatascience.com/heres-how-you-can-get-some-free-speed-on-your-python-code-with-numba-89fdc8249ef3