# Data Science and Machine Learning Cheat Sheet

## Introduction

This notebook briefly explains some of the math notation and library methods you'll come across when studying data science and machine learning.

## Math Notation

$$\Large x \in Y$$

Element $x$ is in set $Y$.  

Example - if $Y$ is the set of numbers $\{2,4,6\}$, then $2 \in Y$.

---


$$\Large x \notin Y$$ 

Element $x$ is *not* in $Y$.

Example - if $Y$ is the set of numbers $\{2,4,6\}$, then $3 \notin Y$.
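
In Python, the `in` and `not in` operators play the role of $\in$ and $\notin$. A minimal sketch using the example set above (the variable names are only illustrative):

```python
# Y is the example set from the text
Y = {2, 4, 6}

in_result = 2 in Y          # corresponds to 2 ∈ Y
not_in_result = 3 not in Y  # corresponds to 3 ∉ Y

print(in_result, not_in_result)  # True True
```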

---

$$\Large x_i$$ 

The $i^{th}$ item of $x$.  

Example - if $x$ is the vector $(1,5,6,2)$ then $x_3 = 6$.  

Notes - Confusingly, this is often 1-indexed in math but 0-indexed in code.  It can also get a little messy when $x$ is a dataset of vectors (as often happens in ML contexts).  In that case, $x_3$ means the third vector/datapoint in the dataset.
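
The indexing mismatch is easiest to see side by side. A small sketch, assuming numpy arrays; the dataset `X` is made up for illustration:

```python
import numpy as np

x = np.array([1, 5, 6, 2])

# Math notation x_3 (1-indexed) is x[2] in Python (0-indexed)
x_3 = x[2]
print(x_3)  # 6

# When the data is a collection of vectors, each row is one datapoint
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # made-up data
third_point = X[2]  # the third datapoint
print(third_point)
```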

---

$$\Large \sum_{i \in Y} x_i$$  

The sum of all $x_i$ where $i \in Y$.  

Example - if $Y$ is $\{1,3\}$ and $x$ is the vector $(7, 9, 10)$ then this is equivalent to $x_1 + x_3 = 7 + 10 = 17$.  

Notes - People often get lazy about specifying the indices, so sometimes you'll just see $\sum x_i$ to mean "sum together every $x_i$".  Using the above definition of $x$, this would mean $x_1 + x_2 + x_3 = 7 + 9 + 10 = 26$.
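
Both sums can be reproduced in a few lines. This is a sketch that keeps $Y$ 1-indexed, as in the math, and shifts each index by one for Python:

```python
import numpy as np

x = np.array([7, 9, 10])
Y = [1, 3]  # 1-indexed positions, as in the math

# Subtract 1 from each index to convert to Python's 0-indexing
partial_sum = sum(x[i - 1] for i in Y)

# With no index set given, sum every element
total_sum = x.sum()

print(partial_sum, total_sum)  # 17 26
```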

---

$$\Large x \cdot y$$

The [dot product](https://en.wikipedia.org/wiki/Dot_product) of $x$ and $y$.  If $x$ and $y$ are vectors of length $n$, then this is equivalent to $\sum_{i=1}^{n} x_i y_i$.

Example - If $x$ is the vector $(3, 5, 7)$ and $y$ is the vector $(1, 0, 2)$, then $x \cdot y$ is $x_1y_1 + x_2y_2 + x_3y_3 = (3*1) + (5*0) + (7*2) = 17$.

Notes - Sometimes we'll write [matrix multiplication](https://en.wikipedia.org/wiki/Matrix_multiplication) using this notation.  More often, we'll denote matrix multiplication by just juxtaposing the matrix symbols.  That is, if $A$ and $B$ are matrices, then $A \cdot B = AB$.  Notice how each entry of the product is just the dot product of a row of $A$ with a column of $B$.
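
To check the example in numpy, and to see the row-by-column view of matrix multiplication, here is a short sketch. The matrices `A` and `B` are arbitrary values chosen for illustration:

```python
import numpy as np

x = np.array([3, 5, 7])
y = np.array([1, 0, 2])

dot_builtin = np.dot(x, y)
dot_manual = sum(x_i * y_i for x_i, y_i in zip(x, y))
print(dot_builtin, dot_manual)  # 17 17

# Each entry of a matrix product is a row-by-column dot product
A = np.array([[1, 2], [3, 4]])  # arbitrary example values
B = np.array([[5, 6], [7, 8]])
AB = A @ B
entry_00 = np.dot(A[0, :], B[:, 0])  # row 0 of A dotted with column 0 of B
print(AB[0, 0], entry_00)  # 19 19
```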

---

$$\Large \lVert x \rVert$$ 

This is the [Euclidean norm](https://www.quora.com/What-is-Euclidean-norm-and-what-are-its-uses) of vector $x$.  This is equivalent to $\sqrt{x \cdot x} = \sqrt{\sum_i x_i^2}$.

Example - If $x$ is $(3,4)$, then $\lVert x \rVert = \sqrt{x_1^2 + x_2^2} = \sqrt{3^2 + 4^2} = 5$.

Notes - This can be thought of as the length of vector $x$.
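
In numpy, `np.linalg.norm` computes the Euclidean norm of a vector by default. A quick sketch comparing it against the definition:

```python
import numpy as np

x = np.array([3, 4])

norm_builtin = np.linalg.norm(x)
norm_manual = np.sqrt(np.dot(x, x))  # sqrt(x . x), straight from the definition

print(norm_builtin, norm_manual)  # 5.0 5.0
```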

---

$$\Large \lVert x \rVert^2$$ 


This is the squared Euclidean norm. This is equivalent to $\left(\sqrt{x \cdot x}\right)^2 = x \cdot x = \sum_i x_i^2$.

Example - If $x$ is $(3,4)$, then $\lVert x \rVert^2 = \left(\sqrt{x_1^2 + x_2^2}\right)^2 = 3^2 + 4^2 = 25$.


Notes - You'll see this as a regularization term in some cost functions, where the vectors whose squared norm is taken are the parameters the model is learning.  In this context, regularization applies a penalty that grows quadratically as these parameters move further from zero.
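
The quadratic growth of the penalty can be sketched as follows. The helper `l2_penalty` and its strength parameter `lam` are hypothetical names introduced here for illustration, not part of any library:

```python
import numpy as np

def l2_penalty(w, lam=1.0):
    """Hypothetical helper: lam times the squared Euclidean norm of w."""
    return lam * np.dot(w, w)

x = np.array([3.0, 4.0])
print(l2_penalty(x))      # 25.0

# Doubling the parameters quadruples the penalty
print(l2_penalty(2 * x))  # 100.0
```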

---

$$\Large \overline{x}$$

The mean of $x$.  Often denoted as $\mu$ in statistics. This is equivalent to $\frac{1}{n}\sum x_i$ where $n$ is the number of items in $x$.

Example - If $x$ is $(3,4)$ then $\overline{x} = \frac{1}{2} \sum x_i = \frac{1}{2} * (3 + 4) = 3.5$.
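
To take the mean in numpy, use `x.mean()` (or `np.mean(x)`). A short sketch comparing it against the formula:

```python
import numpy as np

x = np.array([3, 4])

mean_builtin = x.mean()
mean_manual = x.sum() / len(x)  # (1/n) * sum of x_i

print(mean_builtin, mean_manual)  # 3.5 3.5
```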

---

$$\Large \hat{y}$$

This is used to denote the predictions produced by a model.  

Example - I train a linear regression on two features ($x_1$ and $x_2$).  The model I learned was $\hat{y} = 5 + 2 x_1  + 3 x_2$. 

I now have an observation with features $x_{new} = (7, 9)$.  When I plug $x_{new}$ into my model, the value I get out is $\hat{y} = 5 + (2 * 7) + (3 * 9) = 46$.
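
The same prediction can be written as an intercept plus a dot product. A sketch with the coefficients hard-coded from the example (the names `intercept` and `coefs` are illustrative, not taken from a fitted model object):

```python
import numpy as np

intercept = 5
coefs = np.array([2, 3])   # weights on x_1 and x_2 from the example
x_new = np.array([7, 9])

# y_hat = intercept + coefs . x_new
y_hat = intercept + np.dot(coefs, x_new)
print(y_hat)  # 46
```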

---

$$\Large M_{a \times b}$$


$M$ is a matrix with $a$ rows and $b$ columns. 

Example - $M_{2 \times 3} = \begin{bmatrix}4 & 5 & 1\\0 & 11 & 8\end{bmatrix}$.  $M$ has 2 rows and 3 columns.

Notes - We'll often denote the entry in the $i^{th}$ row and $j^{th}$ column of $M$ by lowercasing the matrix symbol and subscripting, e.g. $m_{2,3} = 8$ in the matrix above because 8 is the entry at row 2, column 3.

In [3]:
import numpy as np
M = np.array([[4,5,1], [0, 11, 8]])
M

array([[ 4,  5,  1],
       [ 0, 11,  8]])
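
To read off the dimensions and a single entry, a short sketch. Keep in mind that numpy indices start at 0, so $m_{2,3}$ is `M[1, 2]`:

```python
import numpy as np

M = np.array([[4, 5, 1], [0, 11, 8]])

print(M.shape)  # (2, 3): 2 rows, 3 columns

m_23 = M[1, 2]  # row 2, column 3 in math notation
print(m_23)     # 8
```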

---

$$\Large MN$$

Multiply matrix $M$ by matrix $N$.  Also written $M \times N$ and $M \cdot N$.

Example - [Here's a great explanation of the mechanics of matrix multiplication](https://math.stackexchange.com/a/2063291)

Note - A surprisingly helpful property when reasoning about math that involves matrix multiplication:  $MN$ is only defined if $M$ has the same number of columns as $N$ has rows.  That is, if $M$'s dimensions are $a \times b$ then $N$ must be a $b \times c$ matrix. Their product will have dimension $a \times c$.

To multiply two matrices in numpy, use `M.dot(N)`.

In [6]:
import numpy as np
M = np.array([[4,5,1], [0, 11, 8]])
N = np.array([[7],[0],[1]])
M.dot(N)

array([[29],
       [ 8]])
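
The dimension rule can be checked directly. This sketch uses the `@` operator, which performs the same matrix multiplication as `M.dot(N)` for 2-D arrays, and shows that numpy raises a `ValueError` when the inner dimensions don't match:

```python
import numpy as np

M = np.array([[4, 5, 1], [0, 11, 8]])  # 2 x 3
N = np.array([[7], [0], [1]])          # 3 x 1

product = M @ N
print(product.shape)  # (2, 1)

# N @ M is undefined: N has 1 column but M has 2 rows
try:
    N @ M
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)  # True
```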

---

$$\Large M^T$$

This is the transpose of $M$.  Transposing a matrix flips it over the diagonal so that the first row becomes the first column, the second row becomes the second column, etc.

Example - $M =\begin{bmatrix}4 & 5 & 1\\0 & 11 & 8\end{bmatrix}$ then $M^T = \begin{bmatrix}4 & 0 \\ 5 & 11 \\ 1 & 8\end{bmatrix}$

Note - You'll often transpose matrices to make the dimensions "right" for multiplication.

To take the transpose of a matrix in numpy, use `M.T`.

In [7]:
import numpy as np
M = np.array([[4,5,1], [0, 11, 8]])
M.T

array([[ 4,  0],
       [ 5, 11],
       [ 1,  8]])
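
As a sketch of transposing to make the dimensions line up: $M$ below is $2 \times 3$, so $MM$ is undefined, but $MM^T$ is a $2 \times 3$ matrix times a $3 \times 2$ matrix and works fine.

```python
import numpy as np

M = np.array([[4, 5, 1], [0, 11, 8]])  # 2 x 3

# (2 x 3) times (3 x 2) gives a 2 x 2 result
product = M @ M.T
print(product.shape)  # (2, 2)
print(product)
```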

---

$$\Large I_k$$

By convention, $I_k$ is the identity matrix: a square matrix with $k$ rows and $k$ columns that has zeros everywhere except the diagonal entries, which are 1s.

Example - $I_3 = \begin{bmatrix}1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{bmatrix}$

To get $I_k$ in numpy, use `np.eye(k)`. 

In [12]:
import numpy as np
np.eye(3)

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])
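
The identity matrix behaves like the number 1 under multiplication: multiplying by it leaves a matrix unchanged. A small sketch; note that the size of the identity has to match the side it multiplies on:

```python
import numpy as np

M = np.array([[4, 5, 1], [0, 11, 8]])  # 2 x 3

left = np.eye(2) @ M   # I_2 on the left
right = M @ np.eye(3)  # I_3 on the right

print(np.array_equal(left, M), np.array_equal(right, M))  # True True
```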

---

$$\Large M^{-1}$$

This is the inverse of $M$.  The inverse is defined as the matrix that makes the following true: $M \times M^{-1} = M^{-1} \times M = I$.

Only square matrices (i.e. matrices with the same number of rows and columns) have inverses, and not all square matrices have one.

To take the inverse of a matrix in numpy, use `np.linalg.inv(M)`.

In [13]:
import numpy as np
M = np.array([[4,6], [1,7]])
M_inv = np.linalg.inv(M)
print(M_inv)
M.dot(M_inv)

[[ 0.31818182 -0.27272727]
 [-0.04545455  0.18181818]]


array([[ 1.,  0.],
       [ 0.,  1.]])
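
Two practical points are worth a sketch. Because of floating-point round-off, it is safer to compare against $I$ with `np.allclose` than with exact equality. And for a square matrix with no inverse, `np.linalg.inv` raises `np.linalg.LinAlgError`. The matrix `S` is a made-up example whose second row is twice the first:

```python
import numpy as np

M = np.array([[4, 6], [1, 7]])
M_inv = np.linalg.inv(M)

# Round-off means the product may not be exactly I, so compare with a tolerance
print(np.allclose(M @ M_inv, np.eye(2)))  # True

# A square matrix with no inverse: the second row is twice the first
S = np.array([[1, 2], [2, 4]])
try:
    np.linalg.inv(S)
    singular_raised = False
except np.linalg.LinAlgError:
    singular_raised = True
print(singular_raised)
```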