This post is a short comparison of different speed optimizations in Python. I compare several approaches to computing on vectors: pure Python code, NumPy, and Numba.

https://towardsdatascience.com/speed-up-your-algorithms-part-2-numba-293e554c5cc1
https://jekel.me/2017/Python-with-Numba-faster-than-fortran/
https://jakevdp.github.io/blog/2013/06/15/numba-vs-cython-take-2/
https://anaconda.org/ijstokes/accelerating-python-with-numba/notebook

Numba has two compilation modes: nopython mode and object mode. The former produces much faster code, but has limitations that can force Numba to fall back to the latter. To prevent Numba from falling back, and instead raise an error, pass nopython=True.
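
As a minimal sketch of what passing nopython=True looks like in practice, the snippet below decorates a small loop. The function name `sum_squares` is made up for illustration, and the `try/except` fallback is only an assumption that lets the sketch run on a machine without Numba (in which case nothing is compiled).

```python
try:
    from numba import jit
except ImportError:
    # Assumption for this sketch only: fall back to a no-op decorator
    # so the example still runs where Numba is not installed.
    def jit(*args, **kwargs):
        return lambda func: func


@jit(nopython=True)  # compile the whole function or raise an error
def sum_squares(n):
    # Plain Python loop over integers; simple enough for nopython mode.
    total = 0
    for i in range(n):
        total += i * i
    return total


print(sum_squares(10))  # 0 + 1 + 4 + ... + 81 = 285
```

The first call includes compilation, so it is noticeably slower than later calls with the same argument types.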

In [1]:
#%load_ext line_profiler

In [68]:
import numpy as np
import numba
from numba import njit, jit 
import time as time
from timeit import default_timer as timer
# njit is equivalent to @jit(nopython=True), which tries to convert the full function into compiled code.
# If that does not succeed, it raises an error. You can then try @jit, which will try to compile only part of the code.

In [9]:
def frexp(x):
    return np.frexp(x)
frexp(3.2)

(0.80000000000000004, 2)

The cell below raises an error, because not all NumPy functions are supported by Numba:
    https://numba.pydata.org/numba-doc/dev/reference/numpysupported.html

In [10]:
@njit
def frexp(x):
    return np.frexp(x)
frexp(3.2)

UntypedAttributeError: Failed at nopython (nopython frontend)
Unknown attribute 'frexp' of type Module(<module 'numpy' from 'C:\\Program Files\\Anaconda3\\lib\\site-packages\\numpy\\__init__.py'>)
File "<ipython-input-10-30a0e374b004>", line 3
[1] During: typing of get attribute at <ipython-input-10-30a0e374b004> (3)
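
One possible workaround, sketched below, is to swap the unsupported NumPy call for a scalar equivalent from the standard library. This assumes that your Numba version supports `math.frexp` in nopython mode (the Numba documentation lists it among the supported `math` functions, but check the list for your version). The name `frexp_scalar` is hypothetical, and the `try/except` fallback only keeps the sketch runnable without Numba.

```python
import math

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch only: no-op decorator when Numba is missing.
    def njit(func):
        return func


@njit
def frexp_scalar(x):
    # math.frexp splits x into (mantissa, exponent) with x == mantissa * 2**exponent.
    return math.frexp(x)


mantissa, exponent = frexp_scalar(3.2)
print(mantissa, exponent)
```

Unlike `np.frexp`, this version works on a single float, so applying it to an array would need an explicit loop.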

In [60]:
rows = 1000
cols = 100
X, Y = np.random.rand(rows, cols), np.random.rand(rows, cols)

In [61]:
def python_01(X, Y):
    i_list = range(X.shape[0])
    j_list = range(X.shape[1])
    k = 0
    for i in i_list:
        z=0
        for j in j_list:
            z += (X[i,j]**2 + Y[i,j]**2)**0.5
        k += z
    return k

In [62]:
def numpy_01(X, Y):
    dis = ((X**2+Y**2)**0.5).sum()
    return dis

def numpy_02(X, Y):
    dis = (np.sqrt((X**2+Y**2))).sum()
    return dis

def numpy_03(X, Y):
    dis = (np.sqrt((np.power(X, 2)+np.power(Y, 2)))).sum()
    return dis

In [63]:
@njit
def numba_01(X, Y):
    i_list = range(X.shape[0])
    j_list = range(X.shape[1])
    k = 0
    for i in i_list:
        z = 0
        for j in j_list:
            z += (X[i,j]**2 + Y[i,j]**2)**0.5
        k += z
    return k

@njit
def numba_02(X, Y):
    i_list = range(X.shape[0])
    j_list = range(X.shape[1])
    k = 0
    for i in i_list:
        z = 0
        for j in j_list:
            z = z + (X[i,j]**2 + Y[i,j]**2)**0.5
        k = k + z
    return k

@njit
def numba_03(X, Y):
    dis = (np.sqrt((np.power(X, 2)+np.power(Y, 2)))).sum()
    return dis

@njit
def numba_04(X, Y):
    dis = ((X**2+Y**2)**0.5).sum()
    return dis

@numba.vectorize(["float64(float64,float64)"], nopython=True,target='parallel')
def numba_05(x, y):
    return (x**2 + y**2)**0.5
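
A function built with numba.vectorize behaves like a NumPy ufunc: it returns an element-wise array, so the sum has to be taken separately. The sketch below shows that usage with a hypothetical name `elementwise_norm`. It uses the default target rather than target='parallel', and it falls back to plain NumPy broadcasting when Numba is not installed; both choices are assumptions made to keep the example portable.

```python
import numpy as np

try:
    from numba import vectorize
except ImportError:
    vectorize = None  # assumption: allow the sketch to run without Numba


def _norm(x, y):
    # Scalar kernel: Euclidean norm of the pair (x, y).
    return (x**2 + y**2)**0.5


if vectorize is not None:
    # Compile the scalar kernel into a ufunc that broadcasts over arrays.
    elementwise_norm = vectorize(["float64(float64,float64)"], nopython=True)(_norm)
else:
    # The same expression already broadcasts over NumPy arrays.
    elementwise_norm = _norm

rng = np.random.default_rng(0)
X = rng.random((100, 50))
Y = rng.random((100, 50))

norms = elementwise_norm(X, Y)          # element-wise result, same shape as X
total = norms.sum()                     # reduce to a single number
reference = np.sqrt(X**2 + Y**2).sum()  # NumPy-only result for comparison
print(np.isclose(total, reference))
```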

In [80]:
def time_comp(funcs, rows, cols, n_runs = 5):
    for ff in funcs:
        time_tot = 0
        np.random.seed(0) 
        X = np.random.rand(rows, cols)
        np.random.seed(1) 
        Y = np.random.rand(rows, cols)
        res = ff(X,Y) # warm-up call: triggers JIT compilation for Numba functions; result is kept to compare and validate
        for i in range(n_runs):
            start_time = timer()
            ff(X,Y)
            time_tot = time_tot + timer() - start_time
        print("Average execution time of " + ff.__name__+ " is {:.4f}, sum {:4f}".format(time_tot/n_runs, res))
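
Because the first call to a Numba function includes compilation, it can help to report that call separately from the later runs. The helper below is a sketch of that idea using only the standard library. The name `time_call` and its return layout are my own choices, not part of the benchmark above.

```python
from timeit import default_timer as timer


def time_call(func, *args, n_runs=5):
    """Return (result, first_call_seconds, best_run_seconds) for func(*args)."""
    # First call: for a JIT-compiled function this includes compilation time.
    start = timer()
    result = func(*args)
    first_call = timer() - start

    # Later calls reuse the compiled code; keep the fastest run.
    run_times = []
    for _ in range(n_runs):
        start = timer()
        func(*args)
        run_times.append(timer() - start)
    return result, first_call, min(run_times)


# Usage with a built-in function; any of the functions above could be passed instead.
result, first, best = time_call(sum, range(1000))
print(result, first, best)
```

Taking the minimum instead of the mean makes the number less sensitive to other processes running on the machine.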

In [81]:
funcs_diff = [python_01, numpy_01, numpy_02, numpy_03, numba_01, numba_02, numba_03, numba_04]
time_comp(funcs_diff, 1000, 1000, n_runs=5)

Average execution time of python_01 is 2.3191, sum 765339.822723
Average execution time of numpy_01 is 0.0430, sum 765339.822723
Average execution time of numpy_02 is 0.0416, sum 765339.822723
Average execution time of numpy_03 is 0.1782, sum 765339.822723
Average execution time of numba_01 is 0.0065, sum 765339.822723
Average execution time of numba_02 is 0.0066, sum 765339.822723
Average execution time of numba_03 is 0.0170, sum 765339.822723
Average execution time of numba_04 is 0.0189, sum 765339.822723
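
To make the comparison easier to read, the timings printed above can be turned into speedup factors relative to the pure Python version. The numbers in the dictionary are copied from the output above; the variable names are only for illustration.

```python
# Average execution times in seconds, copied from the output above.
timings = {
    "python_01": 2.3191,
    "numpy_01": 0.0430,
    "numpy_02": 0.0416,
    "numpy_03": 0.1782,
    "numba_01": 0.0065,
    "numba_02": 0.0066,
    "numba_03": 0.0170,
    "numba_04": 0.0189,
}

baseline = timings["python_01"]
# Speedup = how many times faster than the pure Python loop.
speedups = {name: baseline / seconds for name, seconds in timings.items()}

for name, factor in speedups.items():
    print(f"{name:10s} {factor:8.1f}x")
```

On these numbers the explicit-loop Numba versions are a few hundred times faster than pure Python and several times faster than the best NumPy variant.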


In [82]:
def python_add(X, Y):
    i_list = range(X.shape[0])
    j_list = range(X.shape[1])
    k = 0
    for i in i_list:
        z=0
        for j in j_list:
            z += X[i,j] + Y[i,j]
        k += z
    return k

def numpy_add(X, Y):
    return (X+Y).sum()

@njit
def numba_add(X, Y):
    i_list = range(X.shape[0])
    j_list = range(X.shape[1])
    k = 0
    for i in i_list:
        z = 0
        for j in j_list:
            z += X[i,j]+ Y[i,j]
        k += z
    return k

In [83]:
# funcs_add = [python_add, numpy_add, numba_add]
funcs_add = [numpy_add, numba_add]
time_comp(funcs_add, 10, 100000, n_runs=5)

Average execution time of numpy_add is 0.0119, sum 1000335.952249
Average execution time of numba_add is 0.0015, sum 1000335.952249


More on the speed gain and where it comes from:
http://numba.pydata.org/numba-doc/0.12/tutorial_numpy_and_numba.html