Python is an interpreted language, which speeds up development. The flip side is slower execution: the interpreter has to process each statement at run time, every time it runs, and that overhead becomes a problem at scale. So what can we do to speed up the computations? What if we compile the Python code; will that make it faster? **That is exactly what Numba does.**

In this kernel, we will see how to speed up numerical computations in Python using Numba.

Computing cosine similarity is a common operation in NLP when working with embeddings. Let us try to improve its computation time.

In [1]:
import numpy as np

We will start by defining two vectors, u and v, of 50 dimensions each, and then compute the cosine similarity between them.

In [1]:
u = np.random.rand(50)
u

In [1]:
v = np.random.rand(50)
v

Let us now define the function to compute cosine similarity.

In [1]:
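def cosine_similarity(u:np.ndarray, v:np.ndarray):
    # Both vectors must have the same number of dimensions.
    assert(u.shape[0] == v.shape[0])
    uv = 0  # running dot product of u and v
    uu = 0  # running squared norm of u
    vv = 0  # running squared norm of v
    for i in range(u.shape[0]):
        uv += u[i]*v[i]
        uu += u[i]*u[i]
        vv += v[i]*v[i]
    # If either vector is all zeros, the similarity defaults to 1.
    cos_theta = 1
    if uu!=0 and vv!=0:
        cos_theta = uv/np.sqrt(uu*vv)
    return cos_theta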

In [1]:
cosine_similarity(u, v)
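
As a sanity check, the loop above can be compared against a vectorized NumPy expression. The sketch below is not part of the original kernel: the name `cosine_similarity_np` is hypothetical, and it assumes the same convention of returning 1 when either vector is all zeros.

```python
import numpy as np

def cosine_similarity_np(u: np.ndarray, v: np.ndarray) -> float:
    """Vectorized cosine similarity; hypothetical helper for cross-checking."""
    assert u.shape[0] == v.shape[0]
    uu = np.dot(u, u)  # squared norm of u
    vv = np.dot(v, v)  # squared norm of v
    if uu == 0 or vv == 0:
        # Same convention as the loop version: return 1 for a zero vector.
        return 1.0
    return float(np.dot(u, v) / np.sqrt(uu * vv))

# Example vectors chosen so the result is easy to verify by hand.
a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])
print(cosine_similarity_np(a, b))  # cos(45 degrees), about 0.7071
```

Calling both functions on the same `u` and `v` should give values that agree up to floating-point rounding.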

### Numba

According to its website, Numba is an open source JIT (just-in-time) compiler that translates a subset of Python and NumPy code into fast machine code. It is designed to be used with NumPy arrays and functions, and it optimizes array-oriented and math-heavy Python code.

Numba comes pre-installed in Kaggle kernels. If it is missing, we can install it by running
> pip install numba

In [1]:
!pip show numba

Using Numba is very simple. Just follow these steps:
1. Import jit from numba.
2. Add the jit decorator with nopython=True above the definition of the cosine similarity function.

In [1]:
from numba import jit

@jit(nopython=True)
def cosine_similarity_numba(u:np.ndarray, v:np.ndarray):
    assert(u.shape[0] == v.shape[0])
    uv = 0
    uu = 0
    vv = 0
    for i in range(u.shape[0]):
        uv += u[i]*v[i]
        uu += u[i]*u[i]
        vv += v[i]*v[i]
    cos_theta = 1
    if uu!=0 and vv!=0:
        cos_theta = uv/np.sqrt(uu*vv)
    return cos_theta

In [1]:
cosine_similarity_numba(u, v)
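
One thing to keep in mind when timing: Numba compiles a function the first time it is called with a given set of argument types, so the first call is typically much slower than later ones. The sketch below illustrates this with `time.perf_counter`. The function `dot_loop` is a hypothetical stand-in for the function above, and the `try`/`except` fallback (a no-op decorator when Numba is not installed) is an assumption added only so the snippet runs anywhere.

```python
import time
import numpy as np

try:
    from numba import njit  # njit is shorthand for jit(nopython=True)
except ImportError:
    # Assumption for portability: without Numba, use a no-op decorator,
    # in which case both calls below run as plain Python.
    def njit(func):
        return func

@njit
def dot_loop(u, v):
    # Hypothetical stand-in: an explicit loop like the one in cosine_similarity_numba.
    total = 0.0
    for i in range(u.shape[0]):
        total += u[i] * v[i]
    return total

u = np.random.rand(50)
v = np.random.rand(50)

start = time.perf_counter()
first = dot_loop(u, v)   # with Numba, this call includes compilation
first_call = time.perf_counter() - start

start = time.perf_counter()
second = dot_loop(u, v)  # reuses the already compiled machine code
second_call = time.perf_counter() - start

print(f"first call: {first_call:.6f} s, second call: {second_call:.6f} s")
```

In this kernel, the single call to `cosine_similarity_numba(u, v)` above already triggers compilation, so any timing done afterwards measures only the compiled code.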

We will now run the same cosine similarity computation in a loop. We will keep increasing the number of iterations, with and without Numba, and observe the difference in computation time.

In [1]:
k = 10 # Change this value and run the below cells to experiment with different number of computations.

In [1]:
%%timeit
for i in range(k):
    cosine_similarity(u, v)

In [1]:
%%timeit
for i in range(k):
    cosine_similarity_numba(u, v)
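
`%%timeit` is an IPython magic, so it only works inside a notebook. Outside one, a similar measurement can be sketched with the standard `timeit` module. The helper name `best_time` and its `repeats` parameter are hypothetical, and the pure-Python function is repeated here only to keep the snippet self-contained.

```python
import timeit
import numpy as np

def cosine_similarity(u, v):
    # Same pure-Python loop as in the kernel, repeated for self-containment.
    assert u.shape[0] == v.shape[0]
    uv = uu = vv = 0.0
    for i in range(u.shape[0]):
        uv += u[i] * v[i]
        uu += u[i] * u[i]
        vv += v[i] * v[i]
    if uu != 0 and vv != 0:
        return uv / np.sqrt(uu * vv)
    return 1.0

def best_time(func, u, v, k, repeats=5):
    """Return the best wall-clock time in seconds for k calls of func(u, v)."""
    def run():
        for _ in range(k):
            func(u, v)
    # number=1 runs the k-call loop once per repeat; min() keeps the fastest repeat.
    return min(timeit.repeat(run, number=1, repeat=repeats))

u = np.random.rand(50)
v = np.random.rand(50)
for k in (1, 10, 100):
    print(k, best_time(cosine_similarity, u, v, k))
```

The same helper can be passed `cosine_similarity_numba` instead, as long as that function has been called once beforehand so that compilation time is not included.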

### Results


|No. of computations|Without Numba|With Numba|
|--|--|--|
|1|83.6 µs|1.11 µs|
|100|8.34 ms|70.9 µs|
|1000|86.6 ms|706 µs|
|10000|830 ms|7.09 ms|
|100000|8.08 s|70.2 ms|
|1000000|1min 27s|699 ms|

The difference is evident.
> **In these measurements, Numba made our computations roughly 75 to 125 times faster.**
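
To put numbers on the difference, the speedup for each row can be computed directly from the table. The sketch below only does arithmetic on the timings reported above, converted to seconds; the variable names are illustrative.

```python
# Timings copied from the results table, converted to seconds:
# (number of computations, without Numba, with Numba)
results = [
    (1, 83.6e-6, 1.11e-6),
    (100, 8.34e-3, 70.9e-6),
    (1000, 86.6e-3, 706e-6),
    (10000, 830e-3, 7.09e-3),
    (100000, 8.08, 70.2e-3),
    (1000000, 87.0, 699e-3),  # 1min 27s = 87 s
]

# Speedup = time without Numba divided by time with Numba.
speedups = {k: without / with_numba for k, without, with_numba in results}
for k, ratio in speedups.items():
    print(f"{k:>8} computations: about {ratio:.0f}x faster")
```

The ratios come out at about 75x for a single computation and between roughly 115x and 125x for the larger loops.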

In this kernel, we have only scratched the surface. There is a lot more we can do with Numba. I will leave that exploration to you.
