<h1 align="center">Scientific Programming in Python</h1>
<h2 align="center">Topic 5: Accelerating Python with Cython: Writing C in Python </h2> 
<h3 align="center">Maximiliano Bombin - 201104308-1 </h3> 

In [128]:
import numba
import numpy as np
from math import sqrt

%load_ext Cython

The Cython extension is already loaded. To reload it, use:
  %reload_ext Cython


## The Hausdorff distance, once again...

In this activity we will implement the Hausdorff distance/metric again, this time using Cython.

__The Hausdorff metric__ is a metric, or distance, used to measure how dissimilar two given subsets are. 

It has many applications, in particular for measuring the similarity between images. When the sets are two-dimensional arrays, the definition is as follows:

Let $X \in \mathbb{R}^{m \times 3}$ and $Y \in \mathbb{R}^{n \times 3}$ be two matrices. The Hausdorff metric/distance between them is defined as:

$$
d_H(X,Y) = \max \left(\ \max_{i\leq m} \min_{j \leq n} d(X[i],Y[j]), \ \max_{j\leq n} \min_{i \leq m} d(Y[j],X[i]) \ \right)
$$

where $d$ is the classical _Euclidean distance_ and $X[i]$ denotes the i-th row of $X$.

__One-dimensional illustration:__ Distance between functions.
<img src='data/hausdorff.png' style="width: 600px;">
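The definition above can be mirrored almost literally in plain Python. The following is only a minimal sketch for small inputs, not part of the assignment: the names `directed_hausdorff` and `hausdorff_reference` and the toy point sets are assumptions introduced here for illustration.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

def directed_hausdorff(X, Y):
    """max over rows x of X of the distance from x to its nearest row of Y."""
    return max(min(dist(x, y) for y in Y) for x in X)

def hausdorff_reference(X, Y):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(X, Y), directed_hausdorff(Y, X))

# Hypothetical toy data: two small point sets in R^3.
X = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
Y = [(0.0, 0.0, 0.0), (0.0, 3.0, 4.0)]

print(directed_hausdorff(X, Y))   # 1.0: the farthest X point, (1,0,0), is 1 away from Y
print(directed_hausdorff(Y, X))   # 5.0: the farthest Y point, (0,3,4), is 5 away from X
print(hausdorff_reference(X, Y))  # 5.0
```

The two directed distances differ here (1 versus 5), which is why the definition takes the maximum of both directions.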

Below you are given 3 functions that implement this metric using __Numba__.

In [22]:
@numba.jit('float64 (float64[:], float64[:])')
def metric_numba(x, y):
    """
    standard Euclidean distance
    """
    ret = x-y
    ret *= ret
    return np.sqrt(ret.sum())


@numba.jit('float64 (float64[:], float64[:,:])', nopython=True)
def inf_dist_numba(x, Y):
    """
    inf distance between row x and array Y
    """
    m = Y.shape[0]
    inf = np.inf
    
    for i in range(m):
        dist = metric_numba(x, Y[i])
        if dist < inf:
            inf = dist
    return inf

@numba.jit('float64 (float64[:,:], float64[:,:])', nopython=True)
def hausdorff_numba(X, Y):
    """
    Hausdorff distance between arrays X and Y
    """
    m = X.shape[0]
    n = Y.shape[0]
    sup1 = -1.
    sup2 = -1.
    
    for i in range(m):
        inf1 = inf_dist_numba(X[i], Y)
        if inf1 > sup1:
            sup1 = inf1
    for i in range(n):
        inf2 = inf_dist_numba(Y[i], X)
        if inf2 > sup2:
            sup2 = inf2
            
    return max(sup1, sup2)

You are asked to do the following:

1. Write the __Cython__ equivalent of the three functions above, using every optimization available: __Compiler directives__, __Memory Views__, __Inline Functions__, __Pure C functions__, or any other optimization you consider appropriate.
2. Create `10` pairs of random arrays $X,Y$ with an increasing number of rows, and analyze the execution times of the __Numba__ and __Cython__ versions of the functions above on these arrays.
3. Draw conclusions.
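Before comparing timings, it may be worth confirming that the implementations agree on the result. One possible way is a vectorized NumPy reference, sketched below; the name `hausdorff_numpy` and the seeded test arrays are assumptions made for this illustration, and the approach builds a full $m \times n$ distance matrix, so it is meant for checking correctness rather than for speed.

```python
import numpy as np

def hausdorff_numpy(X, Y):
    """Hausdorff distance via a full (m, n) matrix of pairwise distances."""
    # diff[i, j, :] = X[i] - Y[j], built by broadcasting (m, 1, 3) against (1, n, 3).
    diff = X[:, None, :] - Y[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))
    # Row-wise minima give min_j d(X[i], Y[j]); column-wise minima give min_i d(Y[j], X[i]).
    return max(D.min(axis=1).max(), D.min(axis=0).max())

rng = np.random.default_rng(0)  # fixed seed so the check is repeatable
X = rng.random((50, 3))
Y = rng.random((80, 3))

d = hausdorff_numpy(X, Y)
print(d)
print(np.isclose(d, hausdorff_numpy(Y, X)))  # the metric is symmetric
```

With the notebook's functions defined, the same `X` and `Y` could be passed to `hausdorff_numba` and to the Cython version, and each result compared against `d` with `np.isclose`.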

In [106]:
%%cython -c=-fPIC -c=-fwrapv -c=-O3 -c=-fno-strict-aliasing
#!python
#cython: cdivision=True, boundscheck=False, nonecheck=False, wraparound=False, initializedcheck=False

import numpy as np
cimport numpy as cnp

ctypedef cnp.float64_t float64_t
from libc.math cimport sqrt, INFINITY

cdef inline float64_t metric_cython(float64_t[::1] a, float64_t[::1] b):
    # Euclidean distance between two contiguous 1-D vectors, with no temporary arrays
    cdef:
        int i = 0
        int n = a.shape[0]
        float64_t ret = 0
    
    for i in range(n):
        ret += (a[i]-b[i])**2
    
    return sqrt(ret)

cdef inline float64_t inf_dist_cython(float64_t[::1] v, float64_t[:,::1] M, long rows):
    # Smallest distance between vector v and the first `rows` rows of M
    cdef:
        long i = 0
        float64_t inf = INFINITY  # C constant; avoids a Python-level lookup of np.inf
        float64_t dist = 0.0
    
    for i in range(rows):
        dist = metric_cython(v, M[i])
        if dist < inf:
            inf = dist
    
    return inf

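def hausdorff_distance_cython(float64_t[:,::1] M1, float64_t[:,::1] M2):
    """
    Hausdorff distance between C-contiguous float64 arrays M1 and M2
    """
    cdef:
        long i = 0
        long j = 0
        long m1_rows = M1.shape[0]
        long m2_rows = M2.shape[0]
        # inf1/inf2 are declared explicitly so they are C doubles, not Python objects
        float64_t inf1 = 0.0
        float64_t inf2 = 0.0
        float64_t sup1 = -1.0
        float64_t sup2 = -1.0

    # sup over rows of M1 of the inf distance to M2
    for i in range(m1_rows):
        inf1 = inf_dist_cython(M1[i], M2, m2_rows)
        if inf1 > sup1:
            sup1 = inf1

    # sup over rows of M2 of the inf distance to M1
    for j in range(m2_rows):
        inf2 = inf_dist_cython(M2[j], M1, m1_rows)
        if inf2 > sup2:
            sup2 = inf2

    if sup1 > sup2:
        return sup1

    return sup2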

In [125]:
# Ten (A, B) pairs with a growing number of rows:
# A has 100*k rows and B has 200*k rows, for k = 1, ..., 10.
arrays = [(np.random.random((100 * k, 3)), np.random.random((200 * k, 3)))
          for k in range(1, 11)]

In [127]:
# %timeit -o returns a TimeitResult, so each timing is stored as well as printed.
times_cython = []
times_numba = []

for k, (A, B) in enumerate(arrays, start=1):
    print("Experiment {}:\n".format(k))
    print("cython time:")
    t_cython = %timeit -o hausdorff_distance_cython(A, B)
    times_cython.append(t_cython)
    print("\nnumba time:")
    t_numba = %timeit -o hausdorff_numba(A, B)
    times_numba.append(t_numba)
    print("\n")

Experiment 1:

cython time:
1000 loops, best of 3: 873 µs per loop

numba time:
100 loops, best of 3: 8.62 ms per loop


Experiment 2:

cython time:
100 loops, best of 3: 3.53 ms per loop

numba time:
10 loops, best of 3: 32.5 ms per loop


Experiment 3:

cython time:
100 loops, best of 3: 8 ms per loop

numba time:
10 loops, best of 3: 75.9 ms per loop


Experiment 4:

cython time:
100 loops, best of 3: 13.7 ms per loop

numba time:
10 loops, best of 3: 127 ms per loop


Experiment 5:

cython time:
10 loops, best of 3: 20.7 ms per loop

numba time:
1 loop, best of 3: 198 ms per loop


Experiment 6:

cython time:
10 loops, best of 3: 30.7 ms per loop

numba time:
1 loop, best of 3: 285 ms per loop


Experiment 7:

cython time:
10 loops, best of 3: 40.5 ms per loop

numba time:
1 loop, best of 3: 397 ms per loop


Experiment 8:

cython time:
10 loops, best of 3: 53.4 ms per loop

numba time:
1 loop, best of 3: 511 ms per loop


Experiment 9:

cython time:
10 loops, best of 3: 6

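Comparing the execution times of the two implementations, the functions written in Cython were considerably faster than the Numba ones in all cases. In the outputs shown above the Cython version is roughly ten times faster, for example 873 µs versus 8.62 ms in Experiment 1 and 53.4 ms versus 511 ms in Experiment 8. One likely contributor is that `metric_numba` allocates a temporary array for `x-y` on every call, whereas `metric_cython` accumulates the squared differences in a scalar.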
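To put a number on the gap, the ratio between the two timings can be computed per experiment. The sketch below simply copies the `%timeit` values printed above for Experiments 1 to 8 (the later outputs are not fully available here); the variable names are assumptions chosen for this illustration.

```python
# Timings copied from the %timeit output above, in milliseconds (Experiments 1 to 8).
cython_ms = [0.873, 3.53, 8.0, 13.7, 20.7, 30.7, 40.5, 53.4]
numba_ms = [8.62, 32.5, 75.9, 127.0, 198.0, 285.0, 397.0, 511.0]

# Speedup of the Cython version over the Numba version in each experiment.
ratios = [n / c for c, n in zip(cython_ms, numba_ms)]

for k, r in enumerate(ratios, start=1):
    print(f"Experiment {k}: numba/cython = {r:.1f}x")
```

On these eight measurements every ratio falls between 9 and 10, so the relative advantage stays roughly constant as the arrays grow.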