# DISTANCE FORMULA


## 1. Euclidean Distance

**Euclidean Distance** is the most commonly used distance formula. To find the Euclidean distance between two points, we first calculate the squared difference in each dimension. If we add up all of these squared differences and take the square root, we’ve computed the Euclidean distance.

Let’s take a look at the equation that represents what we just learned:

$d = \sqrt{(a_1-b_1)^2+(a_2-b_2)^2+\ldots+(a_n - b_n)^2}$

The image below shows a visual of Euclidean distance being calculated:

<img src = "img/euclidean.svg" height = "50%" width = "50%">


The Euclidean distance between two points.

$d = \sqrt{(a_1-b_1)^2+(a_2-b_2)^2}$
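As a quick worked example of the two-dimensional formula, take the points $(1, 2)$ and $(4, 0)$:

$d = \sqrt{(1-4)^2+(2-0)^2} = \sqrt{9+4} = \sqrt{13} \approx 3.606$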


### Python Implementation of Euclidean Formula

In [3]:
def euclidean_distance(pt1, pt2):
    distance = 0
    for i in range(len(pt1)):
        distance += (pt1[i] - pt2[i]) ** 2
    return distance ** 0.5

print(euclidean_distance([1, 2], [4, 0]))
print(euclidean_distance([5, 4, 3], [1, 7, 9]))
print(euclidean_distance([2, 3, 4], [1, 2]))  # raises IndexError: the points have different lengths

3.605551275463989
7.810249675906654


IndexError: list index out of range

**Note:** The distance between points with different numbers of dimensions is undefined, which is why the last call raises an `IndexError`.
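The loop over `range(len(pt1))` only fails when the first point is the longer one; with the arguments swapped, the extra coordinate of the second point would be silently ignored. One way to catch both cases is to compare the lengths up front. The sketch below is an illustration only: the name `euclidean_distance_checked` and the choice of `ValueError` are assumptions, not part of the original function.

```python
def euclidean_distance_checked(pt1, pt2):
    """Euclidean distance that rejects points of different dimensionality."""
    # Hypothetical guard (not in the original function): fail early and
    # clearly, whichever of the two points is longer.
    if len(pt1) != len(pt2):
        raise ValueError(
            f"points must have the same number of dimensions, "
            f"got {len(pt1)} and {len(pt2)}"
        )
    # zip pairs up the matching coordinates of the two points.
    squared_sum = sum((a - b) ** 2 for a, b in zip(pt1, pt2))
    return squared_sum ** 0.5


print(euclidean_distance_checked([1, 2], [4, 0]))

try:
    # The first point is shorter here, so the original loop would not fail.
    euclidean_distance_checked([1, 2], [2, 3, 4])
except ValueError as err:
    print("ValueError:", err)
```

Raising an explicit error keeps a mismatch from producing a plausible-looking but meaningless number.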

## 2. Manhattan Distance
**Manhattan Distance** is very similar to Euclidean distance. Rather than summing the squared differences and taking the square root, we sum the absolute values of the differences in each dimension. It’s called Manhattan distance because it’s similar to how you might navigate when walking city blocks. If you’ve ever wondered “how many blocks will it take me to get from point A to point B”, you’ve computed the Manhattan distance.

The equation is shown below:

$\mid a_1 - b_1 \mid + \mid a_2 - b_2 \mid + \ldots + \mid a_n - b_n \mid$

Note that the Manhattan distance between two points is always greater than or equal to the Euclidean distance between them. The image below visualizes Manhattan distance:

<img src = "img/manhattan.svg" height = "50%" width = "50%">

The Manhattan distance between two points.

$d = \mid a_1 - b_1 \mid + \mid a_2 - b_2 \mid$
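The claim that Manhattan distance is never smaller than Euclidean distance follows from squaring the Manhattan sum: the result is the sum of squared differences plus cross terms that are never negative. A minimal numerical check is sketched below; the helper names `euclidean` and `manhattan`, the random test points, and the small floating-point tolerance are all choices made for this illustration.

```python
import random


def euclidean(a, b):
    # Square root of the summed squared differences.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    # Sum of the absolute differences.
    return sum(abs(x - y) for x, y in zip(a, b))


random.seed(0)  # fixed seed so the check is repeatable
for _ in range(1000):
    n = random.randint(1, 5)  # number of dimensions
    a = [random.uniform(-10, 10) for _ in range(n)]
    b = [random.uniform(-10, 10) for _ in range(n)]
    # Tolerance absorbs floating-point rounding (the two are equal when n == 1).
    assert manhattan(a, b) >= euclidean(a, b) - 1e-9

print("Manhattan >= Euclidean held for every sampled pair")
```

A passing check is not a proof, but it is consistent with the inequality. The two distances coincide when the points differ in at most one dimension.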

### Python Implementation of Manhattan Formula

In [4]:
def manhattan_distance(pt1, pt2):
    distance = 0
    for i in range(len(pt1)):
        distance += abs(pt1[i] - pt2[i])
    return distance

print(manhattan_distance([1, 2], [4, 0]))
print(manhattan_distance([5, 4, 3], [1, 7, 9]))

5
13


## 3. Hamming Distance
Hamming Distance is another slightly different variation on the distance formula. Instead of finding the difference of each dimension, Hamming distance only cares about whether the dimensions are exactly equal. When finding the Hamming distance between two points, add one for every dimension that has different values.

Hamming distance can be used in spell-checking algorithms to compare words of equal length. For example, the Hamming distance between the word “there” and the typo “thete” is one. Each letter is a dimension, and each dimension has the same value except for one.

### Python Implementation of Hamming Formula

In [7]:
def hamming_distance(pt1, pt2):
    distance = 0
    for i in range(len(pt1)):
        if pt1[i] != pt2[i]:
            distance += 1
    return distance

print(hamming_distance([1, 2], [1, 100]))
print(hamming_distance([5, 4, 9], [1, 7, 9]))

1
2
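The spell-checking example can be reproduced by treating each character of a word as one dimension. The sketch below is illustrative only: the name `hamming_distance_str` and the equal-length check are assumptions added here, not part of the implementation above.

```python
def hamming_distance_str(word1, word2):
    """Count the positions at which two equal-length strings differ."""
    # Hypothetical guard: Hamming distance needs the same number of positions.
    if len(word1) != len(word2):
        raise ValueError("Hamming distance needs inputs of equal length")
    # Each comparison is True (1) or False (0), so the sum counts mismatches.
    return sum(c1 != c2 for c1, c2 in zip(word1, word2))


print(hamming_distance_str("there", "thete"))  # 1
```

Only the fourth letter differs ("r" versus "t"), so the distance is one.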


## 4. SciPy Distances
Now that you’ve written these three distance formulas yourself, let’s look at the equivalent functions in Python’s SciPy library, found in the `scipy.spatial.distance` module:

- Euclidean Distance `.euclidean()`
- Manhattan Distance `.cityblock()`
- Hamming Distance `.hamming()`

There are a few noteworthy details to talk about:

First, the `scipy` implementation of Manhattan distance is called `cityblock()`. Remember, computing Manhattan distance is like asking how many blocks away you are from a point.

Second, the `scipy` implementation of Hamming distance always returns a number between `0` and `1`. Rather than returning the number of dimensions that differ, this implementation divides that count by the total number of dimensions. For example, in your implementation, the Hamming distance between `[1, 2, 3]` and `[7, 2, -10]` would be `2`. In `scipy`’s version, it would be `2/3`.

In [10]:
from scipy.spatial import distance

print(distance.euclidean([1, 2], [4, 0]))
print(distance.cityblock([1, 2], [4, 0]))
print(distance.hamming([1, 2], [4, 0]))

3.605551275463989
5
1.0
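To make the relationship between the two Hamming conventions concrete, the sketch below computes both the raw count and the fraction in plain Python. It mirrors the normalization described above rather than calling SciPy itself, and the names `hamming_count` and `hamming_fraction` are hypothetical.

```python
def hamming_count(pt1, pt2):
    # Number of dimensions whose values differ (the hand-written convention).
    return sum(a != b for a, b in zip(pt1, pt2))


def hamming_fraction(pt1, pt2):
    # The same count divided by the number of dimensions, as described
    # for the SciPy version; assumes equal-length, non-empty inputs.
    return hamming_count(pt1, pt2) / len(pt1)


a = [1, 2, 3]
b = [7, 2, -10]

print(hamming_count(a, b))     # 2
print(hamming_fraction(a, b))  # 0.6666666666666666

# Multiplying the fraction by the number of dimensions recovers the count.
print(round(hamming_fraction(a, b) * len(a)))  # 2
```

The same conversion should apply to a value returned by `distance.hamming()`: multiply by the number of dimensions to get back the count of differing positions.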
