# 01 - Test NumPy basic operations

This notebook tests some NumPy operations directly from the [GitHub.dev](https://github.dev) console, using only the browser.

## NumPy test

In this quick example I test Euclidean distance calculations using the following approaches:

* using Python's built-in [math library](https://docs.python.org/3.9/library/math.html),
* using the same approach, but with [NumPy's](https://numpy.org/doc/stable/reference/) built-in functions,
* using the [numpy.linalg.norm](https://numpy.org/doc/stable/reference/generated/numpy.linalg.norm.html) function (see [this answer](https://stackoverflow.com/questions/1401712/how-can-the-euclidean-distance-be-calculated-with-numpy/1401828#1401828) on Stack Overflow for more information),
* using the [scipy.spatial.distance.euclidean](https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.euclidean.html) function, which is designed for this purpose.
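
For reference, all four approaches compute the same quantity: the square root of the sum of squared coordinate differences. Here is a minimal sketch on a hand-checkable case (the 3-4-5 right triangle). The helper name `euclidean_distance` is only for illustration and is not used elsewhere in this notebook.

```python
import math


def euclidean_distance(p, q):
    # Square each coordinate difference, sum them, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Legs of length 3 and 4 give a hypotenuse of length 5.
d = euclidean_distance([0.0, 0.0], [3.0, 4.0])
print(d)  # 5.0
```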

Then I use the [timeit](https://docs.python.org/3.9/library/timeit.html) module to benchmark these approaches. *All done in the browser*, yeah!! 🥸

In [None]:
# Whoa, we're doing Python in the browser!

import math
import numpy as np
import pandas as pd
from scipy.spatial import distance

Definition of all four approaches:

In [None]:
def euclidean_dist_math(v1, v2):
    dist = [math.pow((a - b), 2) for a, b in zip(v1, v2)]
    eudist = math.sqrt(sum(dist))
    return eudist


def euclidean_dist_numpy_1(v1, v2):
    # Convert the inputs to arrays so that plain lists work too.
    v1_a = np.array(v1)
    v2_a = np.array(v2)
    sd = np.sum((v1_a - v2_a) ** 2)
    eudist = np.sqrt(sd)
    return eudist

def euclidean_dist_numpy_2(v1, v2):
    return np.linalg.norm(v1 - v2)

def euclidean_dist_scipy(v1, v2):
    return distance.euclidean(v1, v2)
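
Before timing anything, it may be worth checking that the approaches agree numerically. The sketch below is self-contained, so it repeats the formulas inline instead of calling the functions above. The fixed seed, the names `a`, `b` and `results`, and the tolerance are assumptions made for illustration. The SciPy function can be added to the dictionary in the same way.

```python
import math

import numpy as np

rng = np.random.default_rng(42)  # fixed seed, assumed here for reproducibility
a, b = rng.random(20), rng.random(20)

# Pure-Python result, used as the reference value.
reference = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

results = {
    "numpy sum/sqrt": float(np.sqrt(np.sum((a - b) ** 2))),
    "numpy.linalg.norm": float(np.linalg.norm(a - b)),
}

for name, value in results.items():
    # The results may differ in the last bits, so compare with a tolerance.
    assert math.isclose(value, reference, rel_tol=1e-12), name
    print(f"{name}: {value:.12f}")
```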


I generate two random vectors for the tests:

In [15]:
dis1 = np.random.rand(20)
dis2 = np.random.rand(20)
v1, v2 = np.array(dis1), np.array(dis2)
v1, v2

(array([0.5344385 , 0.98860181, 0.52826368, 0.59355232, 0.34808987,
        0.20042474, 0.10012634, 0.43215682, 0.40427057, 0.59510778,
        0.51997115, 0.42772395, 0.42790569, 0.35361587, 0.99976697,
        0.84024487, 0.15112119, 0.99122512, 0.87993399, 0.06136295]),
 array([0.60584606, 0.93038409, 0.45462116, 0.45039148, 0.17457065,
        0.09512527, 0.85745713, 0.23797963, 0.91281345, 0.56607402,
        0.9454191 , 0.46736936, 0.06308445, 0.15377244, 0.8472466 ,
        0.8856591 , 0.76454757, 0.85972132, 0.88369905, 0.69685654]))

And then testing all of them:

In [None]:
# Inspired by https://stackoverflow.com/questions/37794849/efficient-and-precise-calculation-of-the-euclidean-distance

import timeit

def wrapper(func, *args, **kwargs):
    def wrapped():
        return func(*args, **kwargs)
    return wrapped

wrappered_math = wrapper(euclidean_dist_math, v1, v2)
wrappered_numpy1 = wrapper(euclidean_dist_numpy_1, v1, v2)
wrappered_numpy2 = wrapper(euclidean_dist_numpy_2, v1, v2)
wrappered_scipy = wrapper(euclidean_dist_scipy, v1, v2)
t_math = timeit.repeat(wrappered_math, repeat=3, number=100000)
t_numpy1 = timeit.repeat(wrappered_numpy1, repeat=3, number=100000)
t_numpy2 = timeit.repeat(wrappered_numpy2, repeat=3, number=100000)
t_scipy = timeit.repeat(wrappered_scipy, repeat=3, number=100000)

print(f'math approach: {sum(t_math)/len(t_math)}')
print(f'numpy simple approach: {sum(t_numpy1)/len(t_numpy1)}')
print(f'numpy.linalg.norm approach: {sum(t_numpy2)/len(t_numpy2)}')
print(f'scipy.distance approach: {sum(t_scipy)/len(t_scipy)}')

math approach: 3.3649666663332027
numpy simple approach: 1.5148333333333237
numpy.linalg.norm approach: 1.0447666666667221
scipy.distance approach: 1.9894333336665113
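
The numbers above are the mean total time of three runs of 100,000 calls each. The `timeit` documentation suggests that the minimum of the repeats is usually the more informative figure, since higher values tend to reflect interference from other processes. Here is a minimal sketch of that variant, using a `lambda` in place of the `wrapper` helper. The smaller `number` and the seeded vectors are assumptions chosen to keep the example quick.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)  # seeded vectors, assumed for this sketch
v1, v2 = rng.random(20), rng.random(20)

number = 10_000  # calls per run; smaller than in the cell above
times = timeit.repeat(lambda: np.linalg.norm(v1 - v2), repeat=3, number=number)

best = min(times)  # lower bound on how fast this machine runs the call
mean = sum(times) / len(times)
print(f"best: {best / number * 1e6:.2f} us per call")
print(f"mean: {mean / number * 1e6:.2f} us per call")
```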


The test came out as expected (tested on an Apple MacBook Pro with M1, using the Brave browser). Each figure is the mean total time for one run of 100,000 calls:

* The fastest is the [numpy.linalg.norm](https://numpy.org/doc/stable/reference/generated/numpy.linalg.norm.html) function, 1.044767s per run,
* followed by [NumPy's](https://numpy.org/doc/stable/reference/) built-in functions, 1.514833s per run,
* then the [scipy.spatial.distance.euclidean](https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.euclidean.html) function, 1.989433s per run,
* and finally Python's built-in [math library](https://docs.python.org/3.9/library/math.html), with 3.364967s per run.
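
Since each run covers 100,000 calls, the totals are easier to compare as time per call. The sketch below just redoes the arithmetic on the figures printed by the benchmark cell. The dictionary `totals` and the other names are introduced here for illustration.

```python
# Mean total times in seconds per 100,000 calls, copied from the benchmark output.
totals = {
    "math": 3.3649666663332027,
    "numpy simple": 1.5148333333333237,
    "numpy.linalg.norm": 1.0447666666667221,
    "scipy.distance": 1.9894333336665113,
}
number = 100_000  # calls per timing run, as passed to timeit.repeat

fastest = min(totals.values())
# List the approaches from fastest to slowest.
for name, total in sorted(totals.items(), key=lambda item: item[1]):
    per_call_us = total / number * 1e6
    ratio = total / fastest
    print(f"{name}: {per_call_us:.2f} us per call, {ratio:.2f}x the fastest")
```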

*All in the browser!* 😊