# Distance Metrics - Lab

## Introduction

In this lab, we'll calculate various distances between multiple points using the distance metrics we learned about!

## Objectives

You will be able to:

* Calculate Euclidean Distance between 2 points
* Calculate Manhattan Distance between 2 points
* Compare and contrast Manhattan, Euclidean, and Minkowski Distance

## Getting Started

To begin this lab, we'll start by writing a generalized function to calculate any of the three distance metrics we've learned about. Let's review what we know so far:

## How These Distance Metrics Are Related

Recall from the previous lesson that **_Manhattan Distance_** and **_Euclidean Distance_** are both just special cases of **_Minkowski Distance_**. Take a look at the formula for Minkowski Distance below:

<img src='minkowski-equation.png'>

**_Manhattan Distance_** is the special case where $r=1$ in the equation above (so the root operation drops out, and we just sum the absolute differences in each dimension).

**_Euclidean Distance_** is a special case where $r=2$ in the equation above.
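
Written out in symbols, the Minkowski distance and its two special cases take the following standard form. The notation here is an assumption chosen to match this lab: $a_i$ and $b_i$ are the $i$-th components of the points `a` and `b`, $n$ is the number of dimensions, and $r$ is the norm.

$$d_r(a, b) = \left( \sum_{i=1}^{n} |a_i - b_i|^{r} \right)^{1/r}$$

$$d_1(a, b) = \sum_{i=1}^{n} |a_i - b_i| \qquad\qquad d_2(a, b) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^{2}}$$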

Knowing this, we can create a generalized `distance` function that calculates Minkowski distance and takes in `r` as a parameter. That way, we can use the same function for every problem and still calculate Manhattan and Euclidean distances just by passing in the appropriate value for `r`!

In the cell below:

* Complete the `distance` function. 
* This function should take in 3 arguments:
    * `a`, a tuple or array that describes a vector in n-dimensional space. 
    * `b`, a tuple or array that describes a vector in n-dimensional space (this must be the same length as `a`!)
    * `r`, the norm to use when calculating the distance (`1` gives Manhattan distance, while `2` gives Euclidean distance)
* Since Euclidean distance is the most commonly used distance metric, this function should default to `r=2` if no value is set for `r`.
* Include a parameter called `verbose` which is set to `True` by default. If true, the function should print whether the distance it returns is a Manhattan, Euclidean, or Minkowski distance.
* This function should implement the Minkowski distance equation above, and return the result.

**_NOTE:_**  Remember to use `np.abs()` to get the absolute value of the difference in each dimension. Squaring makes the differences positive for Euclidean distance, but for Manhattan distance (and for any odd or fractional `r`) nothing else does this for us!

**_HINT:_** Use `np.power()` as an easy way to implement both powers and roots. `np.power(a, 3)` will return the cube of `a`, while `np.power(a, 1/3)` will return the cube root of `a`. For more information on this function, see the numpy [documentation](https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.power.html)!
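
As a quick illustration of the note and hint above, here is a small sketch of `np.abs()` and `np.power()` working together. The variable names (`diffs`, `abs_diffs`, `cubes`, `cube_root`) are made up for this example, and the numbers are the per-dimension differences between the points (1, 2) and (4, 6).

```python
import numpy as np

# Per-dimension differences between (1, 2) and (4, 6); both are negative
diffs = np.array([1 - 4, 2 - 6], dtype=float)

# Absolute value of each difference
abs_diffs = np.abs(diffs)

# Element-wise cube of each absolute difference
cubes = np.power(abs_diffs, 3)

# A fractional exponent gives a root: here, the cube root of the summed cubes
cube_root = np.power(cubes.sum(), 1/3)

print(abs_diffs, cubes, cube_root)
```

Without the `np.abs()` step, the cubes would stay negative, and `np.power()` returns `nan` for a fractional power of a negative number.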

In [3]:
import numpy as np

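# Complete this function! 
def distance(a, b, r=2):
    """Return the Minkowski distance of order r between vectors a and b."""
    if len(a) != len(b):
        raise ValueError("vectors need to be of equal length")

    running_total = 0

    # Sum |a_i - b_i|^r over every dimension
    for index, value_a in enumerate(a):
        running_total += np.power(np.abs(value_a - b[index]), r)

    # Take the r-th root of the total
    return np.power(running_total, 1/r)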


test_point_1 = (1, 2)
test_point_2 = (4, 6)
print(distance(test_point_1, test_point_2)) # Expected Output: 5.0
print(distance(test_point_1, test_point_2, r=1)) # Expected Output: 7.0
print(distance(test_point_1, test_point_2, r=3)) # Expected Output: 4.497941445275415

5.0
7.0
4.497941445275415
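
The instructions also ask for a `verbose` parameter, which the solution above leaves out. One possible way to add it is sketched below. The name `distance_verbose`, the exact wording of the printed messages, and the vectorized style are all assumptions made for this example, not the lab's official solution.

```python
import numpy as np

def distance_verbose(a, b, r=2, verbose=True):
    """Hypothetical variant of `distance` that can also report the metric name."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("vectors need to be of equal length")

    # Report which named metric this value of r corresponds to
    if verbose:
        if r == 1:
            print("Manhattan distance")
        elif r == 2:
            print("Euclidean distance")
        else:
            print(f"Minkowski distance with r={r}")

    # Vectorized form of the same formula: (sum |a_i - b_i|^r)^(1/r)
    return np.power(np.sum(np.power(np.abs(a - b), r)), 1 / r)

print(distance_verbose((1, 2), (4, 6)))
print(distance_verbose((1, 2), (4, 6), r=1, verbose=False))
```

Converting the inputs with `np.asarray` lets the subtraction, absolute value, and power all run element-wise, so no explicit loop is needed.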


Great job! 

Now, let's use the function to solve some practice problems.

## Problem 1:

Calculate the **_Euclidean Distance_** between the following points in 5-dimensional space:

Point 1: (-2, -3.4, 4, 15, 7)

Point 2: (3, -1.2, -2, -1, 7)

In [17]:
point_1 = [-2, -3.4, 4, 15, 7]
point_2 = [3, -1.2, -2, -1, 7]

distance(point_1, point_2, 2)  # Expected Output: 17.939899665271266

17.939899665271266
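
As an optional sanity check, the same Euclidean distance can be computed with `np.linalg.norm`, which returns the 2-norm of a 1-D array by default. The names `p1`, `p2`, and `euclidean_check` are made up for this sketch.

```python
import numpy as np

p1 = np.array([-2, -3.4, 4, 15, 7])
p2 = np.array([3, -1.2, -2, -1, 7])

# The length of the difference vector is the Euclidean distance between the points
euclidean_check = np.linalg.norm(p1 - p2)
print(euclidean_check)
```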

## Problem 2:

Calculate the **_Manhattan Distance_** between the following points in 10-dimensional space:

Point 1: \[0, 0, 0, 7, 16, 2, 0, 1, 2, 1\]  
Point 2: \[1, -1, 5, 7, 14, 3, -2, 3, 3, 6\]

In [19]:
Point_1 = [0, 0, 0, 7, 16, 2, 0, 1, 2, 1]
Point_2 = [1, -1, 5, 7, 14, 3, -2, 3, 3, 6]

distance(Point_1, Point_2, 1)  # Expected Output: 20.0

20.0
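
Because Manhattan distance is just the sum of the absolute differences, it is easy to double-check this result directly. The sketch below does so in two equivalent ways; the names `q1`, `q2`, `manhattan_sum`, and `manhattan_norm` are made up for this example.

```python
import numpy as np

q1 = np.array([0, 0, 0, 7, 16, 2, 0, 1, 2, 1])
q2 = np.array([1, -1, 5, 7, 14, 3, -2, 3, 3, 6])

# Sum of absolute differences across all 10 dimensions
manhattan_sum = np.sum(np.abs(q1 - q2))

# Same quantity expressed as the 1-norm of the difference vector
manhattan_norm = np.linalg.norm(q1 - q2, ord=1)

print(manhattan_sum, manhattan_norm)
```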

## Problem 3: 

Calculate the **_Minkowski Distance_** with a norm of 3.5 between the following points:

Point 1: (-2, 7, 3.4)

Point 2: (3, 4, 1.5)

In [31]:
Point1 = [-2, 7, 3.4]
Point2 = [3, 4, 1.5] 

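# Sweep r from 0.2 up to (but not including) 20 in steps of 0.2 to see how the
# distance changes with the norm. Note that r=3.5 is not one of these steps;
# distance(Point1, Point2, 3.5) gives the answer to Problem 3.
for i in np.arange(0.2, 20, .2):
    print(distance(Point1, Point2, i))

# Expected Output for r=3.5: 5.268789659188307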

753.9544012205047
49.12752783141338
19.97411489554371
12.833369774402511
9.9
8.36689810597748
7.448436088708126
6.8485233785468855
6.4329493321731395
6.132699242584786
5.908874097140145
5.737968816509579
5.604985755328779
5.499929784657178
5.41589964155518
5.347989204120525
5.292624920692467
5.247150165626501
5.2095554954471925
5.178298360745649
5.15217946978256
5.130256068320937
5.111779893088137
5.096151993573655
5.082889321820591
5.071599682392741
5.061962719132103
5.053715325335617
5.046640338159674
5.040557700563029
5.035317497149551
5.0307944269391704
5.026883387709852
5.023495927116698
5.020557374647352
5.018004511943378
5.015783671452576
5.013849177805333
5.012162064863687
5.010689015598436
5.009401482905608
5.008274957980125
5.007288359510046
5.006423522177005
5.005664767074114
5.004998539928722
5.004413105631125
5.0038982896650195
5.0034452587212295
5.003046334138435
5.00269483291944
5.002384931970973
5.002111551949663
5.001870257699031
5.0016571727574375
5.001468905825183
5.
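
The values printed above shrink toward 5 as `r` grows. That matches a standard property of Minkowski distance: as `r` becomes very large, it approaches the largest single-dimension difference (often called Chebyshev distance), which is 5 for these points. The sketch below checks this and also computes the `r=3.5` answer directly. The helper `minkowski` and the other names are made up for this example.

```python
import numpy as np

p = np.array([-2, 7, 3.4])
q = np.array([3, 4, 1.5])

# Per-dimension absolute differences between the two points
abs_diffs = np.abs(p - q)

def minkowski(abs_diffs, r):
    """Minkowski distance computed from precomputed absolute differences."""
    return np.power(np.sum(np.power(abs_diffs, r)), 1 / r)

answer = minkowski(abs_diffs, 3.5)   # the norm Problem 3 asks for
chebyshev = np.max(abs_diffs)        # largest single-dimension difference
large_r = minkowski(abs_diffs, 50)   # a large r, for comparison with chebyshev

print(answer, chebyshev, large_r)
```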

## Summary

Great job! Now that we know how to calculate distance metrics, we can easily apply this to writing a K-Nearest Neighbors classifier from scratch!