# Pre-requisite Material for Foundations of Machine Learning

* The following is a list of topics that you should know to succeed in this course.
* This covers material from calculus, linear algebra, statistics, and programming in Python.  
* Please review the following material carefully.  
* If you do not feel confident in all of the following material, you may want to reconsider taking the course at this time. 


# Python Review

Some Python tutorials and references to go over are: 

* Learn Python the hard way: https://learnpythonthehardway.org/python3/
* Python for absolute beginners: https://www.youtube.com/playlist?list=PLS1QulWo1RIaJECMeUT4LFwJ-ghgoSH6n
* Python Docs tutorial: https://docs.python.org/3/tutorial/
* Numpy quickstart tutorial: https://docs.scipy.org/doc/numpy/user/quickstart.html
    
- Python For Data Science - A Cheat Sheet For Beginners: https://www.datacamp.com/community/tutorials/python-data-science-cheat-sheet-basics
- Other Python Cheat Sheets: https://towardsdatascience.com/collecting-data-science-cheat-sheets-d2cdff092855 (Credits to Karlijn Willems)

# Calculus Review

Calculus topics to review include:

* Derivatives: https://www.youtube.com/watch?v=9vKqVkMQHKk&list=PLZHQObOWTQDMsr9K-rj53DwVRMYO3t5Yr&index=2
* Chain rule and product rule: https://www.youtube.com/watch?v=YG15m2VwSjA&list=PLZHQObOWTQDMsr9K-rj53DwVRMYO3t5Yr&index=4
* Integration: https://www.youtube.com/watch?v=rfG8ce4nNh0&list=PLZHQObOWTQDMsr9K-rj53DwVRMYO3t5Yr&index=8
* Taylor Series: https://www.youtube.com/watch?v=3d6DsjIBzJ4&list=PLZHQObOWTQDMsr9K-rj53DwVRMYO3t5Yr&index=11

Additional resources:
* Full 3Blue1Brown Calculus series: https://www.youtube.com/playlist?list=PLZHQObOWTQDMsr9K-rj53DwVRMYO3t5Yr
* ML-CheatSheet for Calculus: https://ml-cheatsheet.readthedocs.io/en/latest/calculus.html
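
As a quick self-check on the calculus topics above, the following sketch compares a few analytic results with numerical approximations. The helper names (`central_difference`, `trapezoid`, `exp_taylor`) and the example functions are illustrative choices made here, not part of the course material.

```python
import math

def central_difference(f, x, h=1e-5):
    """Approximate f'(x) with a symmetric finite difference."""
    return (f(x + h) - f(x - h)) / (2 * h)

def trapezoid(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with n trapezoids."""
    step = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * step)
    return total * step

def exp_taylor(x, terms=15):
    """Partial sum of the Taylor series of e^x around 0."""
    return sum(x ** k / math.factorial(k) for k in range(terms))

# Derivative: d/dx x^3 = 3x^2, so the slope at x = 2 is 12.
deriv_cube = central_difference(lambda x: x ** 3, 2.0)

# Chain rule: d/dx sin(x^2) = cos(x^2) * 2x.
x0 = 1.5
chain_numeric = central_difference(lambda x: math.sin(x ** 2), x0)
chain_analytic = math.cos(x0 ** 2) * 2 * x0

# Product rule: d/dx [x^2 * e^x] = 2x * e^x + x^2 * e^x.
product_numeric = central_difference(lambda x: x ** 2 * math.exp(x), x0)
product_analytic = 2 * x0 * math.exp(x0) + x0 ** 2 * math.exp(x0)

# Integration: the integral of x^2 from 0 to 1 is 1/3.
integral_sq = trapezoid(lambda x: x ** 2, 0.0, 1.0)

# Taylor series: the partial sum at x = 1 should approach e.
taylor_e = exp_taylor(1.0)

print('derivative of x^3 at 2:', deriv_cube)
print('chain rule (numeric, analytic):', chain_numeric, chain_analytic)
print('product rule (numeric, analytic):', product_numeric, product_analytic)
print('integral of x^2 on [0, 1]:', integral_sq)
print('Taylor approximation of e:', taylor_e, 'vs math.e:', math.e)
```

If the numeric and analytic values disagree for a function you derived by hand, that is usually a sign the hand derivation needs another look.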

# Linear Algebra Review



Topics and definitions to know include: 
  * Vector: https://www.youtube.com/watch?v=fNk_zzaMoSs&list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab&index=2&t=0s
  \begin{equation} \mathbf{x} = \left[ \begin{array}{c} x_1 \\ x_2 \\ \vdots \\ x_D\end{array} \right] \end{equation}
  * Matrix: 
  \begin{equation}
  \mathbf{X}  = \left[\!\begin{array}{c c c c}
    x_{11} & x_{12} & \cdots & x_{1n}\\
    x_{21} & x_{22} & \cdots & x_{2n}\\
    \vdots & \vdots & \ddots & \vdots\\
    x_{d1} & x_{d2} & \cdots & x_{dn}\end{array}\!\right]\! \!\in\! \mathbb{R}^{d \times n}
  \end{equation}
  * Transpose operation: 
    \begin{equation}
		 \mathbf{x}^T = \left[  x_1,  x_2 , \cdots , x_D \right]
         \end{equation}
         
    \begin{equation}\left(\mathbf{A}^T\mathbf{B}\right)^T = \mathbf{B}^T\mathbf{A}\end{equation}
         
  * Vector/Matrix scaling:  Given a vector $\mathbf{x} \in \mathbb{R}^D$ and a scalar value $a$, *what is $a\mathbf{x}$?*  *What does this operation do geometrically?* 
  * Vector/Matrix addition: Given $\mathbf{x} \in \mathbb{R}^D$ and $\mathbf{y} \in \mathbb{R}^D$, *what is $\mathbf{x} + \mathbf{y}$?  What is the geometric interpretation?*
  * Vector/Matrix subtraction: Given $\mathbf{x} \in \mathbb{R}^D$ and $\mathbf{y} \in \mathbb{R}^D$, *what is $\mathbf{x} - \mathbf{y}$? What is the geometric interpretation?*
  * Inner product: $\mathbf{x}^T\mathbf{y} = \mathbf{y}^T\mathbf{x} = \sum_{i=1}^D x_iy_i$
  * Outer product: for $\mathbf{x} \in \mathbb{R}^d$ and $\mathbf{y} \in \mathbb{R}^n$, $\mathbf{x}\mathbf{y}^T \!=\! \left[\!\begin{array}{c}
x_1\\
x_2\\
\vdots\\
x_d\end{array}\!\right]\!\!
\left[\!\begin{array}{c}
y_1\\
y_2\\
\vdots\\
y_n\end{array}\!\right]^T \!\!=\! \left[\!\begin{array}{c c c c}
x_1y_1 & x_1y_2 & \cdots & x_1y_n\\
x_2y_1 & x_2y_2 & \cdots & x_2y_n\\
\vdots & \vdots & \ddots & \vdots\\
x_dy_1 & x_dy_2 & \cdots & x_dy_n\end{array}\!\right]\! \!\in\! \mathbb{R}^{d \times n}.
$
* Linear transformations: https://www.youtube.com/watch?v=kYB8IZa5AuE&list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab&index=3
* Inverse: https://www.youtube.com/watch?v=uQhTuRlWMxw&list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab&index=7

* L-p norm: Given a vector $\mathbf{x} \in \mathbb{R}^D$ and a value $p \geq 1$, the $l_p$ norm is defined as:
\begin{eqnarray}
\left\|\mathbf{x}\right\|_p = \left( \sum_{d=1}^D |x_d|^p \right)^{\frac{1}{p}}
\end{eqnarray}

For example, if $p=2$, then the $l_2$ (Euclidean) norm of a vector is:
\begin{eqnarray}
\left\|\mathbf{x}\right\|_2 = \left( \sum_{d=1}^D |x_d|^2 \right)^{\frac{1}{2}}
\end{eqnarray}

* Eigenvectors and Eigenvalues:  https://www.youtube.com/watch?v=PFDu9oVAE-g&list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab&index=14
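
The transpose identity, linear transformations, inverses, and eigenvectors listed above can all be checked numerically. The sketch below uses small matrices `A`, `B` and a vector `v` that were picked arbitrarily for illustration; only standard NumPy calls (`np.linalg.inv`, `np.linalg.eig`, `np.allclose`) are used.

```python
import numpy as np

# Arbitrary example matrices and a column vector (illustrative values only).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[0.0, 1.0],
              [4.0, 2.0]])
v = np.array([[1.0],
              [2.0]])

# Transpose identity: (A^T B)^T = B^T A
transpose_ok = np.allclose((A.T @ B).T, B.T @ A)
print('(A^T B)^T == B^T A:', transpose_ok)

# Linear transformation: multiplying by A maps v to a new vector.
Av = A @ v  # [[2*1 + 1*2], [1*1 + 3*2]] = [[4], [7]]
print('A @ v:', Av.ravel())

# Inverse: A^{-1} undoes the transformation, so A^{-1}(A v) = v.
A_inv = np.linalg.inv(A)
v_back = A_inv @ Av
print('A_inv @ A is the identity:', np.allclose(A_inv @ A, np.eye(2)))
print('recovered v:', v_back.ravel())

# Eigenvectors and eigenvalues: each column u of eigvecs satisfies A u = lambda u.
eigvals, eigvecs = np.linalg.eig(A)
eig_ok = all(
    np.allclose(A @ eigvecs[:, i], eigvals[i] * eigvecs[:, i])
    for i in range(len(eigvals))
)
print('eigenvalues:', eigvals)
print('A u = lambda u holds for every pair:', eig_ok)
```

Note that `np.linalg.inv` raises an error for singular matrices; the example matrix `A` has determinant 5, so it is invertible.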



In [None]:
# Python code to review some of the concepts listed above.

import numpy as np
a = 2
x = np.array([[1],[2],[3]])
y = np.array([[4],[5],[6]])

#Print Vector x
print('x:',x)

#Transpose Vector x
print('x.T:',x.T)

#Scale Vector x with scalar a
print('a*x:', a*x)

#Vector addition
print('x+y:', x+y)

#Vector subtraction
print('y-x:',y-x)

In [None]:
#Several ways to compute the inner product of vectors x and y in python/numpy

#First by using matrix multiplication operator '@' that numpy supports:
print(x.T@y)
print(y.T@x)

#Second with numpy.matmul function for matrix multiplication:
print(np.matmul(x.T,y))
print(np.matmul(y.T,x))

#Third with numpy.inner function:
print(np.inner(x.T,y.T))

#Fourth with numpy.dot function, note that numpy.dot acts the same as '@' or 'np.matmul' for 2D arrays:
print(x.T.dot(y))
print(y.T.dot(x))

#For 1D arrays, numpy.dot behaves like the numpy.inner function:
print(np.dot([1,2,3],[4,5,6]))

In [None]:
#Compute the outer product of vectors x and y
print(np.outer(x,y))

In [None]:
#Transpose of the outer product, which equals np.outer(y,x)
print(np.outer(x,y).T)

In [None]:
#Compute l-p norm for p = 2, 3, 1
x = np.array([1, 2, 3])
print(np.linalg.norm(x,ord=2))
print((x@x)**(1/2))
print(np.linalg.norm(x,ord=3))
print(np.linalg.norm(x,ord=1))

In [None]:
#Compute l-p norm for p = 2, 3, 1, and ord=0 (ord=0 counts the non-zero entries; it is not a true norm)
x = np.array([-1, -2, -3])
print(np.linalg.norm(x,ord=2))
print((x@x)**(1/2))
print(np.linalg.norm(x,ord=3))
print(np.linalg.norm(x,ord=1))
print(np.linalg.norm(x,ord=0))

In [None]:
#Compute l-p norm for p = 2, 3, 1, and ord=0 (ord=0 counts the non-zero entries; it is not a true norm)
x = np.array([-1, 0, 3])
print(np.linalg.norm(x,ord=2))
print((x@x)**(1/2))
print(np.linalg.norm(x,ord=3))
print(np.linalg.norm(x,ord=1))
print(np.linalg.norm(x,ord=0))

### Note the notation:  
   * scalar values are unbolded (e.g., $N$, $x$)
   * vectors are lower case and bolded (e.g., $\mathbf{x}$)
   * matrices are uppercase and bolded (e.g., $\mathbf{A} \in \mathbb{R}^{D \times N}$)
   * vectors are generally assumed to be column vectors (e.g., $\mathbf{x}^T = \left(x_1, \ldots, x_N\right)$ and $\mathbf{x} =  \left(x_1, \ldots, x_N\right)^T$ )
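
The column-vector convention above maps onto NumPy array shapes as in the following sketch. The names `x_col`, `A_mat`, and `flat` and the sizes `D = 3`, `N = 4` are chosen here purely for illustration.

```python
import numpy as np

D, N = 3, 4

# A column vector in R^D is stored as a 2D array of shape (D, 1).
x_col = np.arange(1, D + 1).reshape(D, 1)
print('x_col.shape:', x_col.shape)      # (3, 1)

# Its transpose is a row vector of shape (1, D).
print('x_col.T.shape:', x_col.T.shape)  # (1, 3)

# A matrix in R^{D x N} has shape (D, N), so A^T x is a column vector in R^N.
A_mat = np.ones((D, N))
print('(A_mat.T @ x_col).shape:', (A_mat.T @ x_col).shape)  # (4, 1)

# Caution: a 1D array has shape (D,), and transposing it changes nothing.
flat = np.array([1, 2, 3])
print('flat.shape:', flat.shape, 'flat.T.shape:', flat.T.shape)  # (3,) (3,)
```

Keeping vectors as explicit `(D, 1)` arrays makes the code match the written math more closely, at the cost of a little extra reshaping.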


### Additional reading and videos to review linear algebra concepts:

* Strang, Gilbert. Introduction to Linear Algebra, 4th Edition. Wellesley, MA: Wellesley-Cambridge Press, 2009. Chapters 1-7.
    
* Lay, David C. Linear Algebra and Its Applications, 3rd Updated Edition. 2005.

* MIT OpenCourseWare Linear Algebra: https://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/video-lectures/
        
* 3Blue1Brown Linear Algebra Review:  https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab

* SciPy Cheat Sheet: Linear Algebra in Python: https://www.datacamp.com/community/blog/python-scipy-cheat-sheet
    

# Statistics Review

Topics and definitions to know include:

- Likelihood and Probability
- Expected Value: https://www.youtube.com/watch?v=j__Kredt7vY
- Variance and covariance: https://www.youtube.com/watch?v=ualmyZiPs9w
- Random variables: https://www.youtube.com/watch?v=3v9w79NhsfI
- Probability density functions: https://www.youtube.com/watch?v=Fvi9A_tEmXQ
- Marginal and conditional probability: https://www.youtube.com/watch?v=CAXQvTKP8sg
- Independence and conditional independence: https://www.youtube.com/watch?v=uzkc-qNVoOk
- Normal/Gaussian distribution: https://www.youtube.com/watch?v=hgtMWR3TFnY
- Central Limit Theorem: https://www.youtube.com/watch?v=JNm3M9cqWyc
- Bayes' Rule: https://www.youtube.com/watch?v=XQoLVl31ZfQ
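
Several of the definitions above can be explored with a few lines of NumPy. The sketch below is one possible set of toy examples (a fair die, a noisy linear relationship, and a made-up diagnostic test); all numbers, variable names, and the random seed are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Expected value and variance of a fair six-sided die, from the definitions.
faces = np.arange(1, 7)
probs = np.full(6, 1 / 6)
expected = np.sum(faces * probs)                    # E[X] = 3.5
variance = np.sum((faces - expected) ** 2 * probs)  # Var[X] = 35/12

# Covariance: y depends linearly on x, so Cov(x, y) should be close to 2.
x = rng.normal(size=10_000)
y = 2 * x + rng.normal(scale=0.1, size=10_000)
cov_xy = np.cov(x, y)[0, 1]

# Central Limit Theorem: means of 50 die rolls cluster around 3.5 with
# standard deviation close to sqrt(Var[X] / 50).
sample_means = rng.integers(1, 7, size=(5_000, 50)).mean(axis=1)
clt_std = np.sqrt(variance / 50)

# Bayes' rule: P(D | +) = P(+ | D) P(D) / P(+), with hypothetical rates.
p_d = 0.01                # prior probability of the condition
p_pos_given_d = 0.95      # true positive rate
p_pos_given_not_d = 0.05  # false positive rate
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)  # marginal P(+)
p_d_given_pos = p_pos_given_d * p_d / p_pos

print('E[X], Var[X] for a fair die:', expected, variance)
print('sample covariance of x and y:', cov_xy)
print('mean and std of sample means:', sample_means.mean(), sample_means.std())
print('std predicted by the CLT:', clt_std)
print('P(D | +):', p_d_given_pos)
```

The Bayes' rule example illustrates why a positive result from a fairly accurate test can still correspond to a low posterior probability when the prior is small.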

Additional Reading: Goodfellow, I. et al. "Deep Learning", MIT Press, 2016. Chapter 3: Probability and Information Theory, Pages 51-70. http://www.deeplearningbook.org/contents/prob.html