# Distance Metrics - Lab

## Introduction

In this lab, we'll calculate various distances between multiple points using the distance metrics we learned about!

## Objectives

You will be able to:

* Calculate Euclidean Distance between 2 points
* Calculate Manhattan Distance between 2 points
* Compare and Contrast Manhattan, Euclidean, and Minkowski Distance

## Getting Started

To begin this lab, we'll start by writing a generalized function to calculate any of the three distance metrics we've learned about. Let's review what we know so far:

## How These Distance Metrics Are Related

Recall from the previous lesson that **_Manhattan Distance_** and **_Euclidean Distance_** are both just special cases of **_Minkowski Distance_**. Take a look at the formula for Minkowski Distance below:

<img src='minkowski-equation.png' width='300px'>

**_Manhattan Distance_** is a special case where $c=1$ in the equation above (which means that we can remove the root operation and just keep the summation).  

**_Euclidean Distance_** is a special case where $c=2$ in the equation above.
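
For reference, the relationship can be written out explicitly. This is the standard form of the Minkowski distance, on the assumption that the image above uses $c$ for the order and $n$ for the number of dimensions; the symbols $d$, $x$, and $y$ are chosen here for illustration only.

$$d(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^{c} \right)^{1/c}$$

Setting $c=1$ makes both the inner power and the outer root the identity, which leaves the Manhattan distance:

$$d_{1}(x, y) = \sum_{i=1}^{n} |x_i - y_i|$$

Setting $c=2$ gives the Euclidean distance, where squaring makes the absolute value redundant:

$$d_{2}(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^{2}}$$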

Knowing this, we can create a generalized `distance` function that calculates Minkowski distance and takes in `c` as a parameter. That way, we can use the same function for every problem, and still calculate Manhattan and Euclidean distance by passing in the appropriate value for the `c` parameter!

In the cell below:

* Complete the `distance` function. 
* This function should take in 3 arguments:
    * `a`, a tuple or array that describes a vector in n-dimensional space. 
    * `b`, a tuple or array that describes a vector in n-dimensional space (this must be the same length as `a`!)
    * `c`, the order of the Minkowski distance to calculate (if set to `1`, the result will be Manhattan distance, while `2` will calculate Euclidean distance)
* Since Euclidean distance is the most common distance metric used, this function should default to using `c=2` if no value is set for `c`.
* Include a parameter called `verbose` which is set to `True` by default. If true, the function should print out whether the distance metric returned is a measurement of Manhattan, Euclidean, or Minkowski distance.  
* This function should implement the Minkowski distance equation above, and return the result. 

**_NOTE:_**  Remember that for Manhattan Distance, you need to make use of `np.abs()` to get the absolute value of the distance for each dimension, since we don't have the squaring function to make this positive for us! The same holds for any other value of `c` that is not an even integer.
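
To illustrate the note above, here is a minimal sketch of what goes wrong when the absolute value is skipped. The vectors `p` and `q` are hypothetical, picked so that the signed differences cancel each other out.

```python
import numpy as np

# Hypothetical example vectors (not part of the lab's problems)
p = np.array([0, 0])
q = np.array([3, -3])

diff = p - q  # signed per-dimension differences: -3 and 3

# Without the absolute value, positive and negative differences cancel
signed_sum = diff.sum()

# With the absolute value, every dimension contributes a positive amount
manhattan = np.abs(diff).sum()

print("Sum of signed differences:", signed_sum)
print("Manhattan distance:", manhattan)
```

The two points are clearly not in the same place, so a "distance" of zero from the signed sum would be meaningless.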

**_HINT:_** Use `np.power()` as an easy way to implement both powers and roots. `np.power(a, 3)` will return the cube of `a`, while `np.power(a, 1/3)` will return the cube root of `a`. For more information on this function, see the NumPy [documentation](https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.power.html)!
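
The hint above can be tried out directly. This is a small sketch using an arbitrary value `x = 8.0` (an assumption for illustration); it also shows why a negative base combined with a fractional exponent is a problem, which is one more reason to take the absolute value first.

```python
import numpy as np

x = 8.0  # arbitrary example value

cube = np.power(x, 3)         # x raised to the third power
cube_root = np.power(x, 1/3)  # cube root of x, up to floating-point error

# A negative float base with a fractional exponent has no real-valued result
# in NumPy, so np.power returns nan (and would normally emit a warning).
with np.errstate(invalid="ignore"):
    bad = np.power(-8.0, 1/3)

# Taking the absolute value first avoids the nan
ok = np.power(np.abs(-8.0), 1/3)

print(cube, cube_root, bad, ok)
```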

In [13]:
import numpy as np

a = (1, 2)
b = np.array((4, 6))

In [16]:
for number in 10:
    print(number)

TypeError: 'int' object is not iterable

In [17]:
for number in range(10):
    print(number)

0
1
2
3
4
5
6
7
8
9


In [14]:
a[1]

2

In [15]:
for num in a:
    print(a)

(1, 2)
(1, 2)


In [10]:
for num in a:
    number = a - b
    print(number)

[-3 -4]
[-3 -4]


In [1]:
import numpy as np


def calculate_distance(a, b, c=2, verbose=True):
    """
    Calculates the Minkowski distance of order c between two vectors.
    Returns a float rounded to 3 decimal places.
    """
    if verbose:
        if c == 1:
            print("Calculating the Manhattan distance:")
        elif c == 2:
            print("Calculating the Euclidean distance:")
        else:
            # Any other order (including non-integer values) is general Minkowski
            print("Calculating the Minkowski distance:")

    # Convert to arrays so that tuples and lists are accepted as well
    diff_ab = np.asarray(a) - np.asarray(b)
    abs_ab = np.absolute(diff_ab)
    power_ab = abs_ab**c
    sum_ab = power_ab.sum()
    root_ab = np.power(sum_ab, 1/c)
    output = np.round(root_ab, 3)

    return output



i = np.array([1, 2])
j = np.array([4, 6])

for num in [1, 2, 3]:
    output = calculate_distance(a=i, b=j, c=num)
    print(output)


Calculating the Manhattan distance:
7.0
Calculating the Euclidean distance:
5.0
Calculating the Minkowski distance:
4.498
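
One way to sanity-check results like these is to compare them against NumPy's built-in vector norm. The sketch below relies on `np.linalg.norm` with its `ord` argument, which for a 1-D array computes `sum(abs(x)**ord)**(1/ord)`; the dictionary name `check` is introduced here purely for illustration.

```python
import numpy as np

i = np.array([1, 2])
j = np.array([4, 6])

# The norm of the difference vector is the distance between the two points
check = {c: np.linalg.norm(i - j, ord=c) for c in (1, 2, 3)}

for c, value in check.items():
    # Round to 3 decimals to match the rounding used in this lab's function
    print(c, round(float(value), 3))
```

If the two approaches disagree, the most likely culprits are a missing absolute value or a root taken with the wrong exponent.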


Great job! 

Now, let's use the function to solve some practice problems.

## Problem 1:

Calculate the **_Euclidean Distance_** between the following points in 5-dimensional space:

Point 1: (-2, -3.4, 4, 15, 7)

Point 2: (3, -1.2, -2, -1, 7)

In [2]:
# Expected Output: 17.939899665271266
i = (-2, -3.4, 4, 15, 7)
j = np.array([3, -1.2, -2, -1, 7])

calculate_distance(a=i, b=j, c=2)

Calculating the Euclidean distance:


17.94

## Problem 2:

Calculate the **_Manhattan Distance_** between the following points in 10-dimensional space:

Point 1: \[0, 0, 0, 7, 16, 2, 0, 1, 2, 1\]  
Point 2: \[1, -1, 5, 7, 14, 3, -2, 3, 3, 6\]

In [3]:
# Expected Output: 20
i = np.array([0, 0, 0, 7, 16, 2, 0, 1, 2, 1])
j = np.array([1, -1, 5, 7, 14, 3, -2, 3, 3, 6])

calculate_distance(a=i, b=j, c=1)

Calculating the Manhattan distance:


20.0

## Problem 3: 

Calculate the **_Minkowski Distance_** with a norm of 3.5 between the following points:

Point 1: (-2, 7, 3.4)

Point 2: (3, 4, 1.5)

In [4]:
# Expected Output: 5.268789659188307
i = np.array([-2, 7, 3.4])
j = np.array([3, 4, 1.5])

calculate_distance(a=i, b=j, c=3.5)

Calculating the Minkowski distance:


5.269
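
The `# Expected Output` comments in the three problems show full-precision values, while the function used in this lab rounds to 3 decimal places. As a rough sketch, the unrounded values can be reproduced with a helper that skips the rounding step. The name `minkowski_unrounded` is hypothetical and not part of the lab.

```python
import numpy as np

def minkowski_unrounded(a, b, c=2):
    """Hypothetical helper: Minkowski distance of order c, without rounding."""
    diff = np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float))
    return float(np.sum(diff ** c) ** (1 / c))

# Problem 1: Euclidean distance
p1 = minkowski_unrounded((-2, -3.4, 4, 15, 7), (3, -1.2, -2, -1, 7), c=2)

# Problem 2: Manhattan distance
p2 = minkowski_unrounded([0, 0, 0, 7, 16, 2, 0, 1, 2, 1],
                         [1, -1, 5, 7, 14, 3, -2, 3, 3, 6], c=1)

# Problem 3: Minkowski distance of order 3.5
p3 = minkowski_unrounded((-2, 7, 3.4), (3, 4, 1.5), c=3.5)

print(p1, p2, p3)
```

Rounding each of these to 3 decimal places should give the outputs shown in the cells above.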

## Summary

Great job! Now that we know how to calculate distance metrics, we can easily apply this to writing a K-Nearest Neighbors classifier from scratch!