# Using `numba.jit` to speed up the computation of the Euclidean distance matrix

In this notebook we implement a function to compute the Euclidean distance matrix using Numba's *just-in-time* compilation decorator. We compare it with the NumPy function we wrote before.

In [None]:
import numpy as np
import numba

In [None]:
@numba.njit(parallel=True)
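def euclidean_numba(x, y):
    """Squared Euclidean distance matrix computed with explicit loops, compiled by Numba."""

    num_samples, num_feat = x.shape
    num_targets = y.shape[0]  # equals num_samples when x and y have the same shape
    dist_matrix = np.zeros((num_samples, num_targets))
    for i in numba.prange(num_samples):  # iterations of the outer loop run in parallel
        for j in range(num_targets):
            r = 0.0
            for k in range(num_feat):
                r += (x[i, k] - y[j, k])**2
            dist_matrix[i, j] = r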

    return dist_matrix

Let's include our NumPy implementation here for comparison.

In [None]:
def euclidean_numpy(x, y):
    """Squared Euclidean distance matrix.

    Inputs:
    x: (N, m) numpy array
    y: (N, m) numpy array

    Output:
    (N, N) squared Euclidean distance matrix:
    r_ij = sum_k (x_ik - y_jk)^2
    """

    x2 = np.einsum('ij,ij->i', x, x)[:, np.newaxis]  # equivalent to (x * x).sum(axis=1) but faster
    y2 = np.einsum('ij,ij->i', y, y)[np.newaxis, :]

    xy = np.dot(x, y.T)

    return np.abs(x2 + y2 - 2. * xy)
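
The vectorised function relies on the expansion ||x_i - y_j||^2 = ||x_i||^2 + ||y_j||^2 - 2 x_i·y_j. As a minimal sketch, the cell below checks that expansion against a direct computation of the differences on small random data. The names `sq_dist_broadcast` and `sq_dist_expanded` are introduced here only for illustration and are not part of the notebook's own functions.

```python
import numpy as np


def sq_dist_broadcast(x, y):
    """Squared distances from explicit differences (builds an (N, M, m) array)."""
    diff = x[:, np.newaxis, :] - y[np.newaxis, :, :]
    return (diff ** 2).sum(axis=-1)


def sq_dist_expanded(x, y):
    """Squared distances from the expansion |x|^2 + |y|^2 - 2 x.y."""
    x2 = np.einsum('ij,ij->i', x, x)[:, np.newaxis]
    y2 = np.einsum('ij,ij->i', y, y)[np.newaxis, :]
    return x2 + y2 - 2.0 * (x @ y.T)


rng = np.random.default_rng(0)
x_small = rng.random((5, 3))
y_small = rng.random((5, 3))

# Largest absolute disagreement between the two formulations
max_diff = np.abs(sq_dist_broadcast(x_small, y_small)
                  - sq_dist_expanded(x_small, y_small)).max()
print(max_diff)
```

The two results should agree up to floating-point rounding. Rounding can also leave the expanded form slightly negative where the true distance is zero, which is presumably why `euclidean_numpy` wraps its result in `np.abs`.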

### Exercise 1
Before running the different functions, could you say which of the two implementations (NumPy or Numba) would be faster?

In [None]:
# Let's check that both implementations give the same result
a = 10. * np.random.random([100, 10])

print(np.abs(euclidean_numpy(a, a) - euclidean_numba(a, a)).max())

On a larger, more realistic problem size, our NumPy implementation is much faster:

In [None]:
nsamples = 6000
nfeat = 50

x = 10. * np.random.random([nsamples, nfeat])

%timeit euclidean_numpy(x, x)
%timeit euclidean_numba(x, x)
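
Outside IPython the `%timeit` magic is not available. As a rough sketch, a comparison could be timed with a small helper like the one below. The helper `time_call` and the stand-in function `sq_dist_numpy` are hypothetical names used only for this example. The warm-up call matters for a JIT-compiled function, whose first call includes compilation.

```python
import time

import numpy as np


def time_call(fn, *args, repeats=5):
    """Best wall-clock time in seconds over `repeats` calls, after one warm-up call."""
    fn(*args)  # warm-up: for a JIT-compiled function this call triggers compilation
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


def sq_dist_numpy(x, y):
    """Stand-in for the function being timed (squared distances via the expansion)."""
    x2 = np.einsum('ij,ij->i', x, x)[:, np.newaxis]
    y2 = np.einsum('ij,ij->i', y, y)[np.newaxis, :]
    return np.abs(x2 + y2 - 2.0 * (x @ y.T))


data = 10.0 * np.random.random((200, 10))
elapsed = time_call(sq_dist_numpy, data, data)
print(f"best of 5: {elapsed:.6f} s")
```

Passing `euclidean_numba` or `euclidean_numpy` in place of the stand-in would time the functions defined in this notebook in the same way.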