# Distance Metrics

Let's explore some distance metrics and how to implement them in
Python.

Resources:

* https://www.maartengrootendorst.com/blog/distances/
* https://www.mathsisfun.com/algebra/vectors-dot-product.html
* https://www.w3resource.com/python-exercises/numpy/compute-various-distance-metrics-using-numpy-and-scipy.php
* https://medium.com/@jodancker/a-brief-introduction-to-distance-measures-ac89cbd2298

## Euclidean distance

**Formula**

![Euclidean distance formula](https://www.gstatic.com/education/formulas2/553212783/en/euclidean_distance.svg)

**Illustration**

![Illustration](https://upload.wikimedia.org/wikipedia/commons/5/55/Euclidean_distance_2d.svg)

In [1]:
x = [3, 0]
y = [0, 4]
# We know the result should be 5
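
The expected value follows from applying the Euclidean formula to these two points. This is a worked check using the same `x` and `y` as above, with $n = 2$ dimensions:

$$
d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} = \sqrt{(3 - 0)^2 + (0 - 4)^2} = \sqrt{9 + 16} = 5
$$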

Calculate using plain **Python lists** and the built-in **math** module

In [2]:
import math

def euclidean_distance_math(array1, array2):
    sum_of_squared_diffs = 0
    for i in range(len(array1)):
        sum_of_squared_diffs += (array1[i] - array2[i]) ** 2
    return math.sqrt(sum_of_squared_diffs)

euclidean_distance_math(x, y)

5.0

Calculate using **NumPy** array operations

In [3]:
import numpy as np

xn = np.array(x)
yn = np.array(y)

def euclidean_distance_numpy(array1, array2):
    return np.sqrt(np.sum((array1 - array2) ** 2))

euclidean_distance_numpy(xn, yn)

np.float64(5.0)

Calculate using NumPy **linalg.norm()**

In [4]:
def euclidean_distance_numpy_linalg(array1, array2):
    return np.linalg.norm(array1 - array2)

euclidean_distance_numpy_linalg(xn, yn)

np.float64(5.0)

Calculate using **scipy.spatial.distance.euclidean**

In [5]:
from scipy.spatial.distance import euclidean

euclidean(xn, yn)

np.float64(5.0)

In [6]:
def euclidean_distance_scipy(array1, array2):
    return euclidean(array1, array2)

euclidean_distance_scipy(xn, yn)

np.float64(5.0)

In [7]:
from scipy.spatial.distance import euclidean, cdist
import numpy as np

point_a = [1, 2, 3]
point_b = [4, 5, 6]

# Using scipy.spatial.distance.euclidean
dist_scipy = euclidean(point_a, point_b)
print(f"Distance using scipy.spatial.distance.euclidean: {dist_scipy}")

# Using cdist for a distance matrix (requires input points to be array-like)
points_set1 = np.array([[1, 2], [3, 4]])
points_set2 = np.array([[5, 6], [7, 8]])

distance_matrix = cdist(points_set1, points_set2, 'euclidean')
print(f"Distance matrix using cdist:\n{distance_matrix}")


Distance using scipy.spatial.distance.euclidean: 5.196152422706632
Distance matrix using cdist:
[[5.65685425 8.48528137]
 [2.82842712 5.65685425]]
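
To make it clearer what `cdist` computes, here is a minimal sketch that builds the same matrix with NumPy broadcasting. The helper name `pairwise_euclidean_numpy` is hypothetical and only for illustration. Entry `[i, j]` is the Euclidean distance between row `i` of the first set and row `j` of the second.

```python
import numpy as np
from scipy.spatial.distance import cdist

points_set1 = np.array([[1, 2], [3, 4]])
points_set2 = np.array([[5, 6], [7, 8]])

def pairwise_euclidean_numpy(set_a, set_b):
    # Hypothetical helper: shape (len(set_a), 1, dims) minus shape (1, len(set_b), dims)
    # broadcasts to every pairwise difference, shape (len(set_a), len(set_b), dims).
    diffs = set_a[:, np.newaxis, :] - set_b[np.newaxis, :, :]
    # Sum the squared differences over the coordinate axis, then take the root.
    return np.sqrt(np.sum(diffs ** 2, axis=-1))

manual_matrix = pairwise_euclidean_numpy(points_set1, points_set2)
scipy_matrix = cdist(points_set1, points_set2, 'euclidean')

# Both approaches should agree up to floating-point rounding.
print(np.allclose(manual_matrix, scipy_matrix))
```

For large collections the broadcast version allocates an intermediate array of all pairwise differences, so `cdist` is likely the more practical choice there.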


## Getting help from AI assistants

**Answer by the assistant**

**Using the SciPy library**

The SciPy library provides optimized functions for scientific computing, including distance calculations. For large arrays these are generally faster than pure Python loops.

* `scipy.spatial.distance.euclidean(p, q)`: calculates the Euclidean distance between two points.
* `scipy.spatial.distance.cdist(XA, XB, 'euclidean')`: calculates the distance between every point in one collection (`XA`) and every point in another collection (`XB`), returning a distance matrix.

Python example:

```python
# You would need to install scipy first: pip install scipy
from scipy.spatial.distance import euclidean, cdist
import numpy as np # Note: Scipy typically uses numpy internally for data structures

point_a = [1, 2, 3]
point_b = [4, 5, 6]

# Using scipy.spatial.distance.euclidean
dist_scipy = euclidean(point_a, point_b)
print(f"Distance using scipy.spatial.distance.euclidean: {dist_scipy}")

# Using cdist for a distance matrix (requires input points to be array-like)
points_set1 = np.array([[1, 2], [3, 4]])
points_set2 = np.array([[5, 6], [7, 8]])

distance_matrix = cdist(points_set1, points_set2, 'euclidean')
print(f"Distance matrix using cdist:\n{distance_matrix}")

```


**Example Prompt:**

"i want to calculate euclideans distance using scipy library. here is an example of import math and numpy versions of it.

import math 

def euclidean_distance(array1, array2): sum_of_squared_diffs = 0 for i in range(len(array1)): sum_of_squared_diffs += (array1[i] - array2[i]) ** 2 return math.sqrt(sum_of_squared_diffs) euclidean_distance(x, y) 

import numpy as np 

xn = np.array(x) yn = np.array(y) def euclidean_distance_numpy(array1, array2): return np.sqrt(np.sum((array1 - array2) ** 2)) euclidean_distance_numpy(xn, yn)"

---

**Answer by the assistant**

Here is the correct and clean way to calculate Euclidean distance using SciPy, based on your examples with math and numpy.

✅ Euclidean Distance using SciPy

SciPy has a built-in function for this:

👉 `scipy.spatial.distance.euclidean()`

Here is the exact version you need:

```python
from scipy.spatial.distance import euclidean

euclidean(x, y)
```

That's it: SciPy handles everything for you.

✅ Full Example (copy-paste ready)

```python
import numpy as np
from scipy.spatial.distance import euclidean

# Example vectors
x = [2, 4, 5]
y = [6, 1, 3]

# Convert to numpy arrays (optional)
xn = np.array(x)
yn = np.array(y)

# SciPy Euclidean distance
dist = euclidean(xn, yn)
print("Euclidean distance (SciPy):", dist)
```

📌 Matching your function style (optional)

If you want it wrapped in a function like your other examples:

```python
from scipy.spatial.distance import euclidean

def euclidean_distance_scipy(array1, array2):
    return euclidean(array1, array2)
```

Then you call:

```python
euclidean_distance_scipy(x, y)
```

## Testing All Euclidean Distance Calculations

### 3, 4, 5 right triangle

In [8]:
x = [3, 0]
y = [0, 4]
xn = np.array(x)
yn = np.array(y)

print(euclidean_distance_math(x, y))
print(euclidean_distance_numpy(xn, yn))
print(euclidean_distance_numpy_linalg(xn, yn))
print(euclidean_distance_scipy(xn, yn))

5.0
5.0
5.0
5.0


### Multi-dimensional input

In [9]:
x = [3, 0, 1, 10, 3, 6, 7]
y = [0, 4, 4, -2, 5, 6.7, 9]
xn = np.array(x)
yn = np.array(y)

print(euclidean_distance_math(x, y))
print(euclidean_distance_numpy(xn, yn))
print(euclidean_distance_numpy_linalg(xn, yn))
print(euclidean_distance_scipy(xn, yn))

13.656134152826708
13.656134152826708
13.656134152826708
13.656134152826708
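
Comparing printed values by eye works for a notebook, but the same check can be automated. The sketch below is one possible approach, not part of the original notebook: it recomputes the distance inline with each library, adds the standard-library `math.dist` (available from Python 3.8) as a reference, and asserts that all results agree within floating-point tolerance.

```python
import math

import numpy as np
from scipy.spatial.distance import euclidean

x = [3, 0, 1, 10, 3, 6, 7]
y = [0, 4, 4, -2, 5, 6.7, 9]
xn = np.array(x)
yn = np.array(y)

# One entry per approach shown earlier, plus math.dist as a reference value.
results = {
    "math.dist": math.dist(x, y),
    "numpy sum": float(np.sqrt(np.sum((xn - yn) ** 2))),
    "numpy linalg.norm": float(np.linalg.norm(xn - yn)),
    "scipy euclidean": float(euclidean(xn, yn)),
}

reference = results["math.dist"]
for name, value in results.items():
    # isclose tolerates tiny rounding differences between implementations.
    assert math.isclose(value, reference, rel_tol=1e-9), name

print("All implementations agree")
```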


## Manhattan Distance

**Illustration**

![Manhattan Distance Illustration](https://www.maartengrootendorst.com/images/posts/2021-01-02-distances/manhattan.svg)

**Formula**

![Manhattan Distance Formula](https://www.maartengrootendorst.com/images/posts/2021-01-02-distances/manhattan_distance.png)

Using NumPy

In [10]:
def manhattan_distance_numpy(array1, array2):
    return np.sum(abs(array1 - array2))

manhattan_distance_numpy(xn, yn)

np.float64(26.7)

Using SciPy

In [11]:
from scipy.spatial.distance import cityblock

def manhattan_distance_scipy(array1, array2):
    return cityblock(array1, array2)

manhattan_distance_scipy(xn, yn)

np.float64(26.7)
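
For comparison with the list-based Euclidean version earlier, here is a minimal pure-Python sketch of the same calculation. The function name `manhattan_distance_python` is an assumption chosen to match the naming pattern used above. The comment spells out the arithmetic behind the 26.7 result.

```python
import math

def manhattan_distance_python(array1, array2):
    # Sum of absolute coordinate differences, no NumPy required.
    return sum(abs(a - b) for a, b in zip(array1, array2))

x = [3, 0, 1, 10, 3, 6, 7]
y = [0, 4, 4, -2, 5, 6.7, 9]

# Term by term: 3 + 4 + 3 + 12 + 2 + 0.7 + 2 = 26.7
result = manhattan_distance_python(x, y)

# Compare with a tolerance, since 6 - 6.7 is not exactly representable in binary.
print(math.isclose(result, 26.7))
```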

## Chebyshev Distance

**Illustration**

![Chebyshev Distance Illustration](https://www.maartengrootendorst.com/images/posts/2021-01-02-distances/chebyshev.svg)

**Formula**

![Chebyshev Distance Formula](https://www.maartengrootendorst.com/images/posts/2021-01-02-distances/chebyshev_distance.png)

Using NumPy

In [12]:
def chebyshev_distance_numpy(array1, array2):
    return np.max(abs(array1 - array2))

chebyshev_distance_numpy(xn, yn)

np.float64(12.0)
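
Using SciPy

Following the same pattern as the Euclidean and Manhattan sections, a SciPy version could look like the sketch below. It relies on `scipy.spatial.distance.chebyshev`. The wrapper name `chebyshev_distance_scipy` is an assumption chosen to match the earlier functions. The largest absolute coordinate difference for these inputs is |10 - (-2)| = 12, so the result should agree with the NumPy version above.

```python
import numpy as np
from scipy.spatial.distance import chebyshev

def chebyshev_distance_scipy(array1, array2):
    # Thin wrapper, mirroring the style of the earlier SciPy helpers.
    return chebyshev(array1, array2)

xn = np.array([3, 0, 1, 10, 3, 6, 7])
yn = np.array([0, 4, 4, -2, 5, 6.7, 9])

chebyshev_result = chebyshev_distance_scipy(xn, yn)
print(chebyshev_result)
```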