# Using ```numba```: Automatic parallelization with ```jit```


## Overview

In this section we introduce <a href="https://numba.pydata.org/">numba</a>. Numba is a just-in-time (JIT) compiler for numerical Python code.
It translates Python functions to optimized machine code at runtime using the industry-standard LLVM compiler library. Numba-compiled numerical algorithms in Python can approach the speeds of C or FORTRAN. Numba offers a range of options for parallelizing your code for CPUs and GPUs, often with only minor code changes.

## Using ```numba```

Setting the parallel option for ```jit()``` enables a Numba transformation pass that attempts to automatically parallelize and perform other optimizations on (part of) a function. At the moment, this feature only works on CPUs.

### Explicit parallel loops

Another feature of the code transformation pass (when ```parallel=True```) is support for explicit parallel loops. One can use Numba’s ```prange``` instead of ```range``` to specify that a loop can be parallelized. The user is responsible for making sure that the loop has no cross-iteration dependencies, except for supported reductions. Here is an example.

In [35]:
import numpy as np
from numba import njit, prange, jit
from typing import Callable, Tuple
import random
import time

In [2]:


@njit(parallel=True)
def prange_test(arr):
    s = 0
    # Without "parallel=True" in the jit-decorator
    # the prange statement is equivalent to range
    for i in prange(arr.shape[0]):
        s += arr[i]
    return s

In [3]:
data = np.arange(1000000)
prange_test(data)

499999500000

We can also perform reductions into slices or elements of an array. However, if the elements specified by the slice or index are written simultaneously by multiple parallel threads, a race condition occurs, and the compiler may not detect it.

## Example 1

In this example we use ```numba``` to parallelize the basic Monte Carlo integration method. 
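For reference, the estimator computed by the code can be written as follows, assuming the points $x_i$ are drawn independently and uniformly from $[a, b]$. The symbols are introduced here for illustration and mirror the variable names ```integral```, ```y``` and ```se_hat```.

$$
\hat{I}_n = \frac{b-a}{n}\sum_{i=1}^{n} f(x_i), \qquad
y_i = (b-a)\, f(x_i), \qquad
\widehat{\mathrm{se}} = \frac{1}{\sqrt{n}}\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(y_i - \hat{I}_n\right)^2}
$$

An approximate 95% confidence interval is then $\hat{I}_n \pm 1.96\,\widehat{\mathrm{se}}$. For $f(x) = x^3$ on $[0, 1]$ the exact value is $\int_0^1 x^3\,dx = 1/4$.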

In [32]:
@jit(parallel=True, nopython=True)
def mc_parallel(n: int, 
                interval: Tuple[float, float]) -> Tuple[float,float]:
    
    # the function to integrate. It is defined
    # inline because Numba gives warnings
    # (nopython=False) or fails (nopython=True)
    # if a plain Python function is passed
    # as an argument
    def f(x):
        return x*x*x
    
    # initialize
    integral = 0.0
    y = np.zeros(n)
    
    a = interval[0]
    b = interval[1]
                 
    for i in prange(n):
        
        # sample a point 
        point = random.uniform(a,b)
        val_f = f(point)
        integral += val_f
        y[i] = val_f*(b-a)
                 
    # compute the answer
    integral = integral *(b-a) / float(n) 
                 
    # compute standard error
    sum_2 = 0.0
    for i in prange(len(y)):
        sum_2 += (y[i] - integral)*(y[i] - integral)

    # sample standard deviation of the y values
    s = np.sqrt(sum_2/(n-1))
    se_hat = s/np.sqrt(n)
                 
    return integral, se_hat

In [36]:
interval = (0.0, 1.0)
exact_answer = 1.0 / 4.0

start = time.time()
integral, se_hat = mc_parallel(n=10000, interval=interval)
end = time.time()
print(f"Calculated answer={integral}")
print(f"Standard error of estimate {se_hat}")
print(f"95% C.I. for estimate=[{integral - 1.96*se_hat}, {integral + 1.96*se_hat}]")
print("Elapsed (with compilation) = %s" % (end - start))

Calculated answer=0.25250796910413603
Standard error of estimate 0.0028373751302988447
95% C.I. for estimate=[0.2469467138487503, 0.2580692243595218]
Elapsed (with compilation) = 0.032784223556518555
