In [10]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('ggplot')
import seaborn as sns
sns.set(context='notebook', style='darkgrid', palette='colorblind')
from ipywidgets import interact

# Vertical Line Test

## 1.1 Create two graphs, one that passes the vertical line test and one that does not.

### Passes Vertical Line Test

In [24]:
def f_x_3(vline):
    x = np.linspace(-1, 1, 100)
    y = x**3
    plt.figure(figsize=(10, 10))
    plt.plot(x, y)
    plt.axhline(y=0, color='k')
    plt.axvline(x=0, color='k')
    plt.axvline(x=vline, color='gold')
    plt.xlabel('X-Axis')
    plt.ylabel('Y-Axis')
    plt.show()
    

interact(f_x_3, vline=(-1, 1, 0.1));


### Doesn't Pass Vertical Line Test

In [38]:
def f_circle(vline):
    radius = 2
    x = (np.linspace(-radius, radius, 100))
    y_top = np.sqrt(radius**2 - x**2)
    y_botm = -1 * y_top
    plt.figure(figsize=(10, 10))
    plt.plot(x, y_top, x, y_botm, color='dodgerblue')
    plt.axhline(y=0, color='k')
    plt.axvline(x=0, color='k')
    plt.axvline(x=vline, color='gold')
    plt.xlabel('X-Axis')
    plt.ylabel('Y-Axis')
    plt.xlim(-5, 5)
    plt.ylim(-5, 5)
    plt.show()
    

interact(f_circle, vline=(-1, 1, 0.1));


## 1.2 Why are graphs that don't pass the vertical line test not considered "functions?"

> By definition, a function maps each input in its domain to exactly one output. A graph like the circle above fails the vertical line test because a single $x$ value corresponds to two $y$ values, so it cannot be the graph of a function.

Formally, a function $f$ from a set $X$ to a set $Y$ is defined by a set $G$ of ordered pairs $(x, y)$ such that $x \in X$, $y \in Y$, and every element of $X$ is the first component of exactly one ordered pair in $G$. In other words, for every $x \in X$, there is exactly one element $y$ such that the ordered pair $(x, y)$ belongs to the set of pairs defining the function $f$.


#### Is a function:

![](https://upload.wikimedia.org/wikipedia/commons/thumb/8/83/Injection_keine_Injektion_2a.svg/200px-Injection_keine_Injektion_2a.svg.png)

#### Not a function:

![](https://upload.wikimedia.org/wikipedia/commons/thumb/b/bd/Injection_keine_Injektion_1.svg/200px-Injection_keine_Injektion_1.svg.png)

# Functions as Relations

## 2.1 Which of the following relations are functions? Why?

\begin{align}
\text{Relation 1: } \{(1, 2), (3, 2), (1, 3)\}
\\
\text{Relation 2: } \{(1, 3), (2, 3), (6, 7)\}
\\
\text{Relation 3: } \{(9, 4), (2, 1), (9, 6)\}
\\
\text{Relation 4: } \{(6, 2), (8, 3), (6, 4)\}
\\
\text{Relation 5: } \{(2, 6), (2, 7), (2, 4)\}
\end{align}

- Relation 1: Not a function, since the $x$ value 1 maps to multiple $y$ values (2 and 3).
- Relation 2: A function, since every $x$ value maps to exactly one $y$ value, so it passes the vertical line test.
- Relation 3: Not a function; it fails the vertical line test because the $x$ value 9 maps to both 4 and 6.
- Relation 4: Not a function; it fails the vertical line test because the $x$ value 6 maps to both 2 and 4.
- Relation 5: Not a function; it fails the vertical line test because the $x$ value 2 maps to 6, 7, and 4. All of its points lie on the vertical line $x = 2$.
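
The same check can be sketched in code. The helper below is hypothetical (`is_function` is not part of the assignment) and takes the domain to be just the $x$ values that appear in the pairs, so it only tests that no $x$ is paired with more than one $y$.

```python
from collections import defaultdict


def is_function(pairs):
    """Return True if no x value is paired with more than one distinct y value."""
    outputs = defaultdict(set)
    for x, y in pairs:
        outputs[x].add(y)
    return all(len(ys) == 1 for ys in outputs.values())


# The five relations from question 2.1
relations = {
    1: [(1, 2), (3, 2), (1, 3)],
    2: [(1, 3), (2, 3), (6, 7)],
    3: [(9, 4), (2, 1), (9, 6)],
    4: [(6, 2), (8, 3), (6, 4)],
    5: [(2, 6), (2, 7), (2, 4)],
}

results = {name: is_function(pairs) for name, pairs in relations.items()}
print(results)
```

Only Relation 2 comes back `True`, matching the answers above.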

![](https://i.imgur.com/QRqgvTt.png)


Verify with an interactive link: https://www.desmos.com/calculator/u5rdbzkkwf

# Functions as a mapping between dimensions

## 3.1 For the following functions what is the dimensionality of the domain (input) and codomain (range/output)?

$$
\begin{align}
m(x_1, x_2, x_3) = (x_1+x_2, x_1+x_3, x_2+x_3)
\\
n(x_1, x_2, x_3, x_4) = (x_2^2 + x_3, x_2x_4)
\end{align}
$$

$m(x_1, x_2, x_3)$ maps from $\mathbb{R}^3$ to $\mathbb{R}^3$, so the domain and codomain have the same dimensionality (3-D to 3-D).

$n(x_1, x_2, x_3, x_4)$ maps from $\mathbb{R}^4$ to $\mathbb{R}^2$, so from 4-D to 2-D.
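
As a quick sanity check, the two functions can be written with NumPy and the sizes of their inputs and outputs compared. This is only a sketch; the sample input values are arbitrary assumptions.

```python
import numpy as np


def m(x):
    """m: R^3 -> R^3, as defined in question 3.1."""
    x1, x2, x3 = x
    return np.array([x1 + x2, x1 + x3, x2 + x3])


def n(x):
    """n: R^4 -> R^2, as defined in question 3.1 (x1 is unused by the formula)."""
    x1, x2, x3, x4 = x
    return np.array([x2**2 + x3, x2 * x4])


# Arbitrary sample inputs, chosen only for illustration
m_in = np.array([1.0, 2.0, 3.0])
n_in = np.array([1.0, 2.0, 3.0, 4.0])

m_out = m(m_in)
n_out = n(n_in)

print(m_in.size, "->", m_out.size)
print(n_in.size, "->", n_out.size)
```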

## 3.2 Do you think it's possible to create a function that maps from a lower dimensional space to a higher dimensional space? If so, provide an example.

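Yes. A function can take a single number as input and return multiple numbers as output. Such functions are often grouped under multivariable functions, even though they take only one input.

For example,

$f(t)\ =\ (\cos(t), \sin(t))$

maps from $\mathbb{R}^1$ to $\mathbb{R}^2$.

By contrast,

$f(x)\ =\ \sin(x) + 2\sqrt{x}$

takes a single number and returns a single number, so it does not map to a higher-dimensional space.

The table below clarifies this further: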

![](https://i.imgur.com/gndH0rO.png)

**Read more about it here: [Khan Academy, What are multivariable functions?](https://www.khanacademy.org/math/multivariable-calculus/thinking-about-multivariable-function/ways-to-represent-multivariable-functions/a/multivariable-functions)**
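
A minimal sketch of the $(\cos(t), \sin(t))$ example in NumPy, showing one input number producing a two-component output. The name `circle_point` and the sample value of `t` are assumptions made for illustration.

```python
import numpy as np


def circle_point(t):
    """Map a single number t (1-D input) to a point in the plane (2-D output)."""
    return np.array([np.cos(t), np.sin(t)])


t = np.pi / 3  # arbitrary sample input
point = circle_point(t)

print("input dimension: 1, output dimension:", point.size)
print("distance from origin:", np.linalg.norm(point))
```

Every output lies on the unit circle, so the image is a one-dimensional curve sitting inside a two-dimensional space.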

# Vector Transformations

## 4.1 Plug the corresponding unit vectors into each function. Use the output vectors to create a transformation matrix.

$$
\begin{align}
p(\begin{bmatrix}x_1 \\ x_2 \end{bmatrix}) = \begin{bmatrix} x_1 + 3x_2 \\-x_1 + 2x_2\\  \end{bmatrix}
\\
\\
q(\begin{bmatrix}x_1 \\ x_2 \\ x_3\end{bmatrix}) = \begin{bmatrix} 4x_1 + x_2 + 2x_3 \\ -x_1 + 2x_2 + 3x_3 \\ 5x_1 + x_2 -2x_3 \end{bmatrix}
\end{align}
$$


For $p$,

$
\begin{align}
T = \begin{bmatrix} 1 & 3 \\ -1 & 2 \end{bmatrix}
\end{align}
$

For $q$,

$
\begin{align}
T = \begin{bmatrix} 4 & 1 & 2\\ -1 & 2 & 3\\ 5 & 1 &-2 \end{bmatrix}
\end{align}
$
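
The "plug in the unit vectors" procedure can be sketched in NumPy: each unit vector's output becomes one column of the matrix. The helper name `transformation_matrix` is hypothetical.

```python
import numpy as np


def p(x):
    x1, x2 = x
    return np.array([x1 + 3 * x2, -x1 + 2 * x2])


def q(x):
    x1, x2, x3 = x
    return np.array([4 * x1 + x2 + 2 * x3,
                     -x1 + 2 * x2 + 3 * x3,
                     5 * x1 + x2 - 2 * x3])


def transformation_matrix(func, dim):
    """Apply func to each standard unit vector and stack the outputs as columns."""
    # The rows of the identity matrix are the unit vectors e_1, ..., e_dim
    columns = [func(e) for e in np.eye(dim)]
    return np.column_stack(columns)


T_p = transformation_matrix(p, 2)
T_q = transformation_matrix(q, 3)

print(T_p)
print(T_q)
```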





## 4.2 Verify that your transformation matrices are correct by choosing an input matrix and calculating the result both via the traditional functions above and also via vector-matrix multiplication.

**For $p$,**

Vector to transform, or map:

$
\begin{bmatrix} 1 \\ 1 \end{bmatrix}
$

Using **Traditional Functions**:

$
p(\begin{bmatrix} 1 \\ 1 \end{bmatrix}) = \begin{bmatrix} 1 + 3\\ -1 + 2 \end{bmatrix} = \begin{bmatrix} 4 \\ 1 \end{bmatrix}
$

Using **Matrix-vector** multiplication:

$
\begin{align}
\begin{bmatrix} 1 & 3 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 1 \end{bmatrix}
\end{align}
$


**For $q$,**

Vector to transform, or map:
$
\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
$

Using **Traditional Functions**:

$
q\left(\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} 4 + 1 + 2 \\ -1 + 2 + 3 \\ 5 + 1 -2 \end{bmatrix} = \begin{bmatrix} 7 \\ 4 \\ 4 \end{bmatrix}
$

Using **Matrix-vector** multiplication:

$
\begin{align}
\begin{bmatrix} 4 & 1 & 2\\ -1 & 2 & 3\\ 5 & 1 &-2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 7 \\ 4 \\ 4 \end{bmatrix}
\end{align}
$
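
The same verification can be run numerically. The sketch below compares the function form with matrix-vector multiplication for the all-ones vectors used above, plus one randomly drawn vector as an extra check (the random vector and seed are assumptions, not part of the exercise).

```python
import numpy as np

T_p = np.array([[1, 3], [-1, 2]])
T_q = np.array([[4, 1, 2], [-1, 2, 3], [5, 1, -2]])


def p(x):
    x1, x2 = x
    return np.array([x1 + 3 * x2, -x1 + 2 * x2])


def q(x):
    x1, x2, x3 = x
    return np.array([4 * x1 + x2 + 2 * x3,
                     -x1 + 2 * x2 + 3 * x3,
                     5 * x1 + x2 - 2 * x3])


# The vectors used in the worked answers above
v2 = np.array([1, 1])
v3 = np.array([1, 1, 1])

p_direct, p_matrix = p(v2), T_p @ v2
q_direct, q_matrix = q(v3), T_q @ v3
print(p_direct, p_matrix)
print(q_direct, q_matrix)

# Extra check with an arbitrary random vector
rng = np.random.default_rng(0)
w = rng.normal(size=3)
random_check = np.allclose(q(w), T_q @ w)
print(random_check)
```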

# Eigenvalues and Eigenvectors

## 5.1 In your own words, give an explanation for the intuition behind eigenvalues and eigenvectors.

Under a linear transformation, an eigenvector is a vector that does not get knocked off its own span: it is only stretched or squished along the same line. The factor by which it gets stretched or squished is a scalar, the eigenvalue. In symbols, $Av = \lambda v$.
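
To make this concrete, here is a small sketch using `np.linalg.eig`. The matrix `A` is an arbitrary example chosen for illustration. For each eigenpair it checks that transforming the eigenvector gives the same result as scaling it by the eigenvalue.

```python
import numpy as np

# Arbitrary example matrix (an assumption for illustration)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)

# Each column of `eigenvectors` is an eigenvector; iterate over columns via .T
stays_on_span = []
for lam, v in zip(eigenvalues, eigenvectors.T):
    # A @ v should equal lam * v: same direction, only rescaled
    stays_on_span.append(np.allclose(A @ v, lam * v))
    print(f"eigenvalue {lam:.1f}, eigenvector {v}")

print(stays_on_span)
```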

# The Curse of Dimensionality

## 6.1 What are some of the challenges of working with high dimensional spaces?

The following are some of the challenges of working with high-dimensional data:

- It is hard to visualize data beyond 3-D space
- Not every feature is equally important when it comes to capturing relationships in the data
- Increased computation
- Increased sparsity of the data, where measures of distance lose meaning
- A low number of observations relative to the number of dimensions increases the risk of overfitting
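
The point about distances losing meaning can be illustrated with a rough simulation. The function name, point count, dimensions, and seed below are all assumptions. It draws uniform random points and measures the relative contrast between the farthest and nearest point from a query point.

```python
import numpy as np


def distance_contrast(dim, n_points=500, seed=0):
    """Relative gap between the farthest and nearest point from a random query."""
    rng = np.random.default_rng(seed)
    points = rng.random((n_points, dim))  # uniform in the unit hypercube
    query = rng.random(dim)
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()


contrasts = {dim: distance_contrast(dim) for dim in (2, 10, 100, 1000)}
for dim, contrast in contrasts.items():
    print(f"dim={dim:5d}  relative contrast={contrast:.2f}")
```

In this simulation the contrast is far smaller in 1000 dimensions than in 2: the nearest and farthest points end up at nearly the same distance, so "nearest" carries much less information.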

## 6.2 What is the rule of thumb for how many observations you should have compared to parameters in your model?

- At least 5 times as many observations (rows) as parameters (features, columns, dimensions) in your model
- Having more observations is always better; as Ryan said, `More data covereth a multitude of sins`.
- Don't measure similarity via Euclidean distances
- Discard redundant data

# Principal Component Analysis

## 7.1 Load the UCI Machine Learning Repository's [Iris Dataset](https://gist.githubusercontent.com/curran/a08a1080b88344b0c8a7/raw/d546eaee765268bf2f487608c537c05e22e4b221/iris.csv) and use PCA to isolate the dataset's first and second principal components and plot them on a graph. 

In [39]:
import pandas as pd

df = pd.read_csv(
    filepath_or_buffer='https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data', 
    header=None, 
    sep=',')

df.columns=['sepal_len', 'sepal_wid', 'petal_len', 'petal_wid', 'class']
df.dropna(how="all", inplace=True) # drops the empty line at file-end

df.head()

| | sepal_len | sepal_wid | petal_len | petal_wid | class |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |


In [43]:
# split data table into data X and class labels y

X = df.iloc[:,0:4]
y = df.iloc[:,4]

In [44]:
X.head()

| | sepal_len | sepal_wid | petal_len | petal_wid |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 |


In [45]:
y.head()

0    Iris-setosa
1    Iris-setosa
2    Iris-setosa
3    Iris-setosa
4    Iris-setosa
Name: class, dtype: object
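
The cells above stop before the PCA step itself. One possible way to finish is sketched below using only NumPy: center the data, take its SVD, and project onto the first two principal directions. The function name is hypothetical, and the synthetic `demo` array is a stand-in so the block runs without downloading the dataset; it is not the Iris data.

```python
import numpy as np


def first_two_principal_components(data):
    """Project data onto its first two principal components via SVD."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:2]
    scores = centered @ components.T  # coordinates along PC1 and PC2
    explained_variance = singular_values[:2] ** 2 / (len(data) - 1)
    return scores, components, explained_variance


# Synthetic stand-in with 150 rows and 4 columns (NOT the Iris measurements)
rng = np.random.default_rng(0)
demo = rng.normal(size=(150, 4)) * np.array([3.0, 1.0, 0.5, 0.1])

scores, components, explained_variance = first_two_principal_components(demo)
print(scores.shape)
print(explained_variance)
```

With the notebook's own data, the call would be `first_two_principal_components(X.values)`, and the two columns of `scores` could then be plotted with `plt.scatter(scores[:, 0], scores[:, 1])`, colored by `y`. This sketch does not standardize the features first; whether to do so is a separate choice.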

# Stretch Goal

## 1) Do NOT work on the stretch goal until you feel like you have a firm grasp of eigenvectors, eigenvalues, and PCA. Prioritize self-study over the stretch goal if you are not comfortable with those topics yet.

## 2) Explore further the intuition behind eigenvalues and eigenvectors by creating your very own eigenfaces:

![Eigenfaces](https://i.pinimg.com/236x/1c/f1/01/1cf101a9859437a5d096a04b05be06b4--faces-tattoo.jpg)

You don't necessarily have to use this resource, but this will get you started: 
[Eigenface Tutorial](https://sandipanweb.wordpress.com/2018/01/06/eigenfaces-and-a-simple-face-detector-with-pca-svd-in-python/)