# DISTANCE FORMULA
# Representing Points
In this lesson, you will learn three different ways to define the distance between two points:

* Euclidean Distance
* Manhattan Distance
* Hamming Distance

Before diving into the distance formulas, it is first important to consider how to represent points in your code.

In this exercise, we will use a list, where each item in the list represents a dimension of the point. For example, the point (5, 8) could be represented in Python like this:

pt1 = [5, 8]

Points aren’t limited to just two dimensions. For example, a five-dimensional point could be represented as [4, 8, 15, 16, 23].

Ultimately, we want to find the distance between two points. We’ll be writing functions that look like this:

distance([1, 2, 3], [5, 8, 9])

Note that we can only find the distance between two points if they have the same number of dimensions!
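
As a minimal sketch of this representation, assuming points are plain Python lists, len() gives the number of dimensions, so two points can be compared before any distance is computed. The helper name same_dimensions is hypothetical and used only for illustration.

```python
# Points are plain lists; each item is one dimension.
pt1 = [5, 8]                 # two-dimensional point
pt2 = [4, 8, 15, 16, 23]     # five-dimensional point

def same_dimensions(a, b):
    # Hypothetical helper: True when both points have the same number of dimensions.
    return len(a) == len(b)

print(len(pt1))                               # 2
print(same_dimensions(pt1, pt2))              # False
print(same_dimensions([1, 2, 3], [5, 8, 9]))  # True
```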

# Euclidean Distance
Euclidean distance is the most commonly used distance formula. To find the Euclidean distance between two points, we first calculate the squared difference in each dimension. Adding up all of these squared differences and taking the square root of the sum gives the Euclidean distance.

Let’s take a look at the equation that represents what we just learned:

$ \sqrt{(a_1-b_1)^2+(a_2-b_2)^2+\ldots+(a_n - b_n)^2} $
 
The image below shows a visual of Euclidean distance being calculated:


<img src="images/euclidean.svg" style="width: 400px;"/>

The Euclidean distance between two points.

$ d = \sqrt{(a_1-b_1)^2+(a_2-b_2)^2} $
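
To make the formula concrete, here is a small worked sketch for the points [1, 2] and [4, 0] that prints each intermediate step. The variable names a, b, squared_diffs, and total are chosen for illustration; math.sqrt comes from the Python standard library.

```python
import math

a = [1, 2]
b = [4, 0]

# Squared difference in each dimension: (1 - 4)^2 and (2 - 0)^2.
squared_diffs = [(x - y) ** 2 for x, y in zip(a, b)]
print(squared_diffs)  # [9, 4]

# Sum of the squared differences.
total = sum(squared_diffs)
print(total)  # 13

# The square root of the sum is the Euclidean distance (about 3.606).
d = math.sqrt(total)
print(d)
```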


# Questions

1.
Create a function named euclidean_distance() that takes two lists as parameters named pt1 and pt2.

In the function, create a variable named distance, set it equal to 0, and return distance.



2.
After defining distance, create a for loop to loop through the dimensions of each point.

Add the squared difference between each dimension to distance.

Remember, in Python, you can square the variable num by using num ** 2.



3.
Outside of the for loop, take the square root of distance and return that value.



4.
Print the Euclidean distance between [1, 2] and [4, 0].

Print the Euclidean distance between [5, 4, 3] and [1, 7, 9].

Why can’t you find the distance between [2, 3, 4] and [1, 2]?

In [5]:
def euclidean_distance(pt1, pt2):
  # Sum the squared difference in each dimension, then take the square root.
  distance = 0
  for i in range(len(pt1)):
    distance += (pt1[i] - pt2[i]) ** 2
  return distance ** 0.5

print(euclidean_distance([1, 2], [4, 0]))

print(euclidean_distance([5, 4, 3], [1, 7, 9]))

# Answer: You can’t find the distance between [2, 3, 4] and [1, 2] because the points have a different number of dimensions!

3.605551275463989
7.810249675906654
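
Because euclidean_distance() loops over the indices of pt1, mismatched points are handled inconsistently: a longer pt1 raises an IndexError, while a shorter pt1 silently ignores the extra dimensions of pt2. One possible guard is sketched below, assuming that raising a ValueError is the desired behavior. The name checked_euclidean_distance is hypothetical.

```python
def checked_euclidean_distance(pt1, pt2):
    # Hypothetical variant of the lesson's function with an explicit dimension check.
    if len(pt1) != len(pt2):
        raise ValueError("points must have the same number of dimensions")
    distance = 0
    for i in range(len(pt1)):
        distance += (pt1[i] - pt2[i]) ** 2
    return distance ** 0.5

print(checked_euclidean_distance([1, 2], [4, 0]))

try:
    checked_euclidean_distance([2, 3, 4], [1, 2])
except ValueError as err:
    # The mismatch is reported up front instead of failing mid-loop.
    print("ValueError:", err)
```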


# Manhattan Distance

Manhattan Distance is extremely similar to Euclidean distance. Rather than summing the squared difference between each dimension, we instead sum the absolute value of the difference between each dimension. It’s called Manhattan distance because it’s similar to how you might navigate when walking city blocks. If you’ve ever wondered “how many blocks will it take me to get from point A to point B”, you’ve computed the Manhattan distance.

The equation is shown below:

$ \lvert a_1 - b_1 \rvert + \lvert a_2 - b_2 \rvert + \ldots + \lvert a_n - b_n \rvert $

Note that Manhattan distance will always be greater than or equal to Euclidean distance. Take a look at the image below visualizing Manhattan Distance:

<img src="images/manhattan.svg" style="width: 400px;"/>

The Manhattan distance between two points.

$ d = \lvert a_1 - b_1 \rvert + \lvert a_2 - b_2 \rvert $
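
The claim that Manhattan distance is never smaller than Euclidean distance can be spot-checked on a few points. The sketch below uses compact, zip-based versions of both formulas, written only for this check. The third pair is an added illustration in which the points differ in a single dimension, so the two distances are equal.

```python
import math

def euclidean_distance(pt1, pt2):
    # Square root of the summed squared differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pt1, pt2)))

def manhattan_distance(pt1, pt2):
    # Sum of the absolute differences.
    return sum(abs(a - b) for a, b in zip(pt1, pt2))

pairs = [
    ([1, 2], [4, 0]),
    ([5, 4, 3], [1, 7, 9]),
    ([0, 0], [0, 7]),  # differs in one dimension only
]

for p, q in pairs:
    m = manhattan_distance(p, q)
    e = euclidean_distance(p, q)
    # The last value printed is True whenever Manhattan >= Euclidean.
    print(p, q, m, e, m >= e)
```

A handful of examples is not a proof, but all three pairs here satisfy the inequality.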


# Questions
1.
Below euclidean_distance(), create a function called manhattan_distance() that takes two lists named pt1 and pt2 as parameters.

In the function, create a variable named distance, set it equal to 0, and return it.



2.
After defining distance, create a for loop to loop through the dimensions of each point.

Add the absolute value of the difference between each dimension to distance.

Remember, in Python, you can take the absolute value of num by using abs(num).



3.
You’re done with manhattan_distance()! Go ahead and find the Manhattan distance between the same points as last time.

Below the print statements for Euclidean distance, print the Manhattan distance between [1, 2] and [4, 0].

Also print the Manhattan distance between [5, 4, 3] and [1, 7, 9].

In [6]:
def manhattan_distance(pt1, pt2):
  # Sum the absolute difference in each dimension.
  distance = 0
  for i in range(len(pt1)):
    distance += abs(pt1[i] - pt2[i])
  return distance

print(manhattan_distance([1, 2], [4, 0]))

print(manhattan_distance([5, 4, 3], [1, 7, 9]))

5
13


# Hamming Distance
Hamming Distance is another slightly different variation on the distance formula. Instead of finding the difference of each dimension, Hamming distance only cares about whether the dimensions are exactly equal. When finding the Hamming distance between two points, add one for every dimension that has different values.

Hamming distance is used in spell checking algorithms. For example, the Hamming distance between the word “there” and the typo “thete” is one. Each letter is a dimension, and each dimension has the same value except for one.
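
The spell-checking example can be sketched directly, since Python strings can be compared position by position just like lists. The function name count_mismatches is hypothetical, and the sketch assumes both words have the same length.

```python
def count_mismatches(word1, word2):
    # Hypothetical helper: count the positions where two equal-length words differ.
    return sum(1 for a, b in zip(word1, word2) if a != b)

# Only the fourth letter differs ("r" versus "t").
print(count_mismatches("there", "thete"))  # 1
```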

# Questions
1.
Below manhattan_distance(), define your function in the same way as before. It should be named hamming_distance() and have two parameters named pt1 and pt2.

Create a variable named distance, have it start at 0, and return it.

2.
After defining distance, create a for loop to loop through the dimensions of each point. If the values at each dimension are different, add 1 to distance.



3.
hamming_distance() is done as well!

Print the Hamming distance between [1, 2] and [1, 100].

Print the Hamming distance between [5, 4, 9] and [1, 7, 9].

In [7]:
def hamming_distance(pt1, pt2):
  # Count the dimensions whose values differ.
  distance = 0
  for i in range(len(pt1)):
    if pt1[i] != pt2[i]:
      distance += 1
  return distance

print(hamming_distance([1, 2], [1, 100]))

print(hamming_distance([5, 4, 9], [1, 7, 9]))

1
2


# SciPy Distances

Now that you’ve written these three distance formulas yourself, let’s look at how to call the equivalent functions from the SciPy library’s scipy.spatial.distance module:

* Euclidean Distance .euclidean()
* Manhattan Distance .cityblock()
* Hamming Distance .hamming()

There are a few noteworthy details to talk about:

First, the scipy implementation of Manhattan distance is called cityblock(). Remember, computing Manhattan distance is like asking how many blocks away you are from a point.

Second, the scipy implementation of Hamming distance will always return a number between 0 and 1. Rather than returning the number of dimensions that differ, this implementation divides that count by the total number of dimensions. For example, in your implementation, the Hamming distance between [1, 2, 3] and [7, 2, -10] would be 2. In scipy’s version, it would be 2/3.
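
The relationship between the two versions can be sketched in pure Python, without importing scipy. The names hamming_count and hamming_proportion are hypothetical. They illustrate the normalization described above and are not part of SciPy.

```python
def hamming_count(pt1, pt2):
    # Number of dimensions whose values differ (the lesson's version).
    return sum(1 for a, b in zip(pt1, pt2) if a != b)

def hamming_proportion(pt1, pt2):
    # The same count divided by the number of dimensions (the normalized version).
    return hamming_count(pt1, pt2) / len(pt1)

print(hamming_count([1, 2, 3], [7, 2, -10]))       # 2
print(hamming_proportion([1, 2, 3], [7, 2, -10]))  # roughly 0.667
```

Multiplying the proportion by the number of dimensions recovers the count, which is a convenient way to compare the two results.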

# Questions

1.
Call distance.euclidean() using the points [1, 2] and [4, 0] as parameters.

Print the result.



2.
Call distance.cityblock() using the points [1, 2] and [4, 0] as parameters.

Print the result.



3.
Call distance.hamming() using [5, 4, 9] and [1, 7, 9] as parameters and print the results.

Your answer shouldn’t match your function’s results. Remember, scipy divides by the number of dimensions.

In [8]:
from scipy.spatial import distance

def euclidean_distance(pt1, pt2):
  distance = 0
  for i in range(len(pt1)):
    distance += (pt1[i] - pt2[i]) ** 2
  return distance ** 0.5

def manhattan_distance(pt1, pt2):
  distance = 0
  for i in range(len(pt1)):
    distance += abs(pt1[i] - pt2[i])
  return distance

def hamming_distance(pt1, pt2):
  distance = 0
  for i in range(len(pt1)):
    if pt1[i] != pt2[i]:
      distance += 1
  return distance

print(euclidean_distance([1, 2], [4, 0]))
print(distance.euclidean([1, 2], [4, 0]))  # from SciPy
print("")
print(manhattan_distance([1, 2], [4, 0]))
print(distance.cityblock([1, 2], [4, 0]))  # from SciPy
print("")
print(hamming_distance([5, 4, 9], [1, 7, 9]))
print(distance.hamming([5, 4, 9], [1, 7, 9]))  # from SciPy




3.605551275463989
3.605551275463989

5
5

2
0.6666666666666666
