# 01 - Test NumPy basic operations

This notebook tests some NumPy operations from the [GitHub.dev](https://github.dev) console, running Python directly in the browser 😍

## NumPy test

In this quick example I will test Euclidean distance calculations using the following approaches:

* using Python's built-in [math library](https://docs.python.org/3.9/library/math.html),
* using the same approach, but with [NumPy's](https://numpy.org/doc/stable/reference/) built-in functions,
* using the [numpy.linalg.norm](https://numpy.org/doc/stable/reference/generated/numpy.linalg.norm.html) function (see [this answer](https://stackoverflow.com/questions/1401712/how-can-the-euclidean-distance-be-calculated-with-numpy/1401828#1401828) on Stack Overflow for more information),
* using the [scipy.spatial.distance.euclidean](https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.euclidean.html) function, which is designed for exactly this.
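
For reference, all four approaches should compute the same quantity, the standard Euclidean distance between two vectors of length $n$. The symbols $v_1$, $v_2$ and $n$ are chosen here to mirror the function arguments used in the code cells:

$$
d(v_1, v_2) = \sqrt{\sum_{i=1}^{n} \left(v_{1,i} - v_{2,i}\right)^2}
$$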

Then I use the [timeit](https://docs.python.org/3.9/library/timeit.html) module to benchmark these approaches. *All done in the browser*, yeah!! 🥸

In [None]:
# Whoa, we're doing Python in the browser!

import math
import numpy as np
import pandas as pd
from scipy.spatial import distance

Definition of all four approaches:

In [None]:
def euclidean_dist_math(v1, v2):
    dist = [math.pow((a - b), 2) for a, b in zip(v1, v2)]
    eudist = math.sqrt(sum(dist))
    return eudist


def euclidean_dist_numpy_1(v1, v2):
    # Convert the inputs to arrays so the subtraction also works for plain lists
    v1_a = np.array(v1)
    v2_a = np.array(v2)
    sd = np.sum((v1_a - v2_a) ** 2)
    eudist = np.sqrt(sd)
    return eudist

def euclidean_dist_numpy_2(v1, v2):
    return np.linalg.norm(v1 - v2)

def euclidean_dist_scipy(v1, v2):
    return distance.euclidean(v1, v2)
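
Before timing anything, it can be worth checking that the approaches agree on a case with a known answer. The snippet below is a minimal sketch, not part of the original notebook: it uses the 3-4-5 right triangle and re-implements the `math` and NumPy variants inline so that it runs on its own (the SciPy variant is left out only to keep the sketch dependency-light). The names `a`, `b`, `d_math`, `d_sum` and `d_norm` are illustrative.

```python
import math
import numpy as np

# Illustrative points: the distance from (0, 0) to (3, 4) is 5.
a = [0.0, 0.0]
b = [3.0, 4.0]

# Pure-Python variant using the math module.
d_math = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# NumPy variant: sum of squared differences, then square root.
a_arr, b_arr = np.asarray(a), np.asarray(b)
d_sum = float(np.sqrt(np.sum((a_arr - b_arr) ** 2)))

# NumPy variant: 2-norm of the difference vector.
d_norm = float(np.linalg.norm(a_arr - b_arr))

# All three should match the known answer up to floating-point tolerance.
for d in (d_math, d_sum, d_norm):
    assert math.isclose(d, 5.0)
```

If any variant disagreed here, the benchmark numbers would be comparing different computations, so a check like this is cheap insurance.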


I'm generating two random vectors for tests:

In [40]:
dis1 = np.random.rand(20)
dis2 = np.random.rand(20)
v1, v2 = np.array(dis1), np.array(dis2)
v1, v2

(array([0.85785269, 0.4487316 , 0.86080665, 0.15414342, 0.24974709,
        0.2895639 , 0.2340785 , 0.53340573, 0.96725424, 0.65639269,
        0.98669588, 0.87662192, 0.50414805, 0.85620101, 0.33123078,
        0.2265702 , 0.34091318, 0.30261994, 0.97186833, 0.005325  ]),
 array([0.25932477, 0.68312249, 0.84058217, 0.64248053, 0.49378973,
        0.46038518, 0.5698131 , 0.73716253, 0.47597153, 0.37743853,
        0.3753025 , 0.10587056, 0.68730655, 0.04646169, 0.16477113,
        0.40117483, 0.39630528, 0.04704349, 0.06771778, 0.61792316]))
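
The vectors above change on every run because the generator is not seeded. If repeatable inputs are wanted, one option is NumPy's `default_rng` with a fixed seed. This is a sketch under that assumption; the seed value `42` and the name `rng` are arbitrary choices, not something the original notebook uses.

```python
import numpy as np

# Hypothetical seed; any fixed integer gives a repeatable sequence.
rng = np.random.default_rng(42)

# Two vectors of 20 floats drawn uniformly from [0, 1),
# comparable to np.random.rand(20).
v1 = rng.random(20)
v2 = rng.random(20)

print(v1.shape, v2.shape)
```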

And then testing all:

In [None]:
# Inspired by https://stackoverflow.com/questions/37794849/efficient-and-precise-calculation-of-the-euclidean-distance

import timeit

def wrapper(func, *args, **kwargs):
    def wrapped():
        return func(*args, **kwargs)
    return wrapped

wrappered_math = wrapper(euclidean_dist_math, v1, v2)
wrappered_numpy1 = wrapper(euclidean_dist_numpy_1, v1, v2)
wrappered_numpy2 = wrapper(euclidean_dist_numpy_2, v1, v2)
wrappered_scipy = wrapper(euclidean_dist_scipy, v1, v2)
t_math = timeit.repeat(wrappered_math, repeat=3, number=100000)
t_numpy1 = timeit.repeat(wrappered_numpy1, repeat=3, number=100000)
t_numpy2 = timeit.repeat(wrappered_numpy2, repeat=3, number=100000)
t_scipy = timeit.repeat(wrappered_scipy, repeat=3, number=100000)

print(f'math approach: {sum(t_math)/len(t_math)}')
print(f'numpy simple approach: {sum(t_numpy1)/len(t_numpy1)}')
print(f'numpy.linalg.norm approach: {sum(t_numpy2)/len(t_numpy2)}')
print(f'scipy.distance approach: {sum(t_scipy)/len(t_scipy)}')

math approach: 3.378466666999998
numpy simple approach: 1.5132333330000165
numpy.linalg.norm approach: 1.048533332999985
scipy.distance approach: 2.0003666663333206


The test came out as expected (run on an Apple MacBook Pro with M1 in the Brave browser). Each figure is the total time for 100,000 calls, averaged over 3 repeats:

* The fastest is the [numpy.linalg.norm](https://numpy.org/doc/stable/reference/generated/numpy.linalg.norm.html) function, 1.048533s per 100,000 calls,
* Followed by [NumPy's](https://numpy.org/doc/stable/reference/) built-in functions, 1.513233s per 100,000 calls,
* Then the [scipy.spatial.distance.euclidean](https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.euclidean.html) function, 2.000367s per 100,000 calls,
* And finally Python's built-in [math library](https://docs.python.org/3.9/library/math.html) with 3.378467s per 100,000 calls.
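
Since `timeit.repeat` was called with `number=100000`, each figure above is a total for 100,000 calls; dividing by that count gives roughly 10.5 µs per call for `numpy.linalg.norm` and roughly 33.8 µs for the `math` version. The sketch below shows one way to report per-call times directly. The helper `per_call_microseconds` and the `sample` function are hypothetical names introduced for illustration, and it takes the minimum over repeats rather than the mean, which the `timeit` documentation suggests is usually the more useful number.

```python
import math
import timeit


def per_call_microseconds(func, number=10_000, repeat=3):
    """Return the best observed time of one func() call, in microseconds."""
    # Each entry in totals is the time for `number` consecutive calls.
    totals = timeit.repeat(func, repeat=repeat, number=number)
    # Take the fastest repeat and convert it to microseconds per call.
    return min(totals) / number * 1e6


def sample():
    # Stand-in workload: Euclidean distance between (0, 0) and (3, 4).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip((0.0, 0.0), (3.0, 4.0))))


best_us = per_call_microseconds(sample)
print(f"best of 3: {best_us:.3f} µs per call")
```

Any zero-argument callable can be passed in place of `sample`, including the wrapped functions built with `wrapper` earlier in the notebook.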

*All in the browser!* 😊