# Mathematics Foundations for Machine Learning
### 1-Hour Lecture Notebook
---
This lecture covers:
- Linear Algebra (vectors, matrices, eigenvalues)
- Calculus (gradients, optimization)
- Probability & Statistics (distributions, Bayes' theorem)
- Optimization methods (SGD, Adam, convexity)

---

## Linear Algebra Basics
### Vectors and Matrices
A **vector** is an ordered list of numbers, called its components:
$$ v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} $$

A **matrix** is a rectangular array of numbers; multiplying a vector by a matrix applies a linear transformation to it:
$$ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} $$

In [None]:
import numpy as np
import matplotlib.pyplot as plt

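v = np.array([1, 2])             # 2-D vector
A = np.array([[2, -1], [1, 3]])  # 2x2 matrix
Av = A @ v                       # matrix-vector product

v, Av  # display the original and the transformed vector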
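
To see what `A @ v` computes, the sketch below rebuilds the product in two equivalent ways: as a weighted sum of the columns of `A`, and as a dot product of each row of `A` with `v`. It reuses the values from the cell above; the names `column_combo` and `row_dots` are introduced here only for illustration.

```python
import numpy as np

# Same values as in the cell above.
v = np.array([1, 2])
A = np.array([[2, -1], [1, 3]])

# Column view: A @ v is v[0] times the first column plus v[1] times the second.
column_combo = v[0] * A[:, 0] + v[1] * A[:, 1]

# Row view: each entry of A @ v is the dot product of one row of A with v.
row_dots = np.array([A[0] @ v, A[1] @ v])

print(A @ v)         # direct product
print(column_combo)  # same result from the column view
print(row_dots)      # same result from the row view
```

All three lines print the same vector, which is why a matrix can be read either as a collection of column directions being mixed or as a stack of row "detectors" applied to the input.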

### Eigenvalues and Eigenvectors
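An **eigenvector** of $A$ is a nonzero vector $x$ that $A$ maps to a scalar multiple of itself; the scalar $\lambda$ is the corresponding **eigenvalue**: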
$$ A x = \lambda x $$
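
As a worked example, one standard route to the eigenvalues is the characteristic equation $\det(A - \lambda I) = 0$. Applied to the matrix $A = \begin{bmatrix} 2 & -1 \\ 1 & 3 \end{bmatrix}$ used in this notebook's code cells, it gives:

$$ \det \begin{bmatrix} 2-\lambda & -1 \\ 1 & 3-\lambda \end{bmatrix} = (2-\lambda)(3-\lambda) + 1 = \lambda^2 - 5\lambda + 7 = 0 $$

$$ \lambda = \frac{5 \pm \sqrt{25 - 28}}{2} = \frac{5 \pm i\sqrt{3}}{2} $$

The discriminant is negative, so this particular matrix has a complex-conjugate pair of eigenvalues: no real direction is left unrotated by $A$.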

In [None]:
# Returns (eigenvalues, eigenvectors); column i of the eigenvector array pairs with eigenvalue i.
np.linalg.eig(A)
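
A minimal sketch of how the output of `np.linalg.eig` can be checked against the defining equation $A x = \lambda x$. It assumes the same matrix as above; the variable names (`eigenvalues`, `eigenvectors`, `residual`) are chosen here for illustration.

```python
import numpy as np

A = np.array([[2, -1], [1, 3]])

# eigenvalues has one entry per eigenpair;
# column i of eigenvectors belongs to eigenvalues[i].
eigenvalues, eigenvectors = np.linalg.eig(A)

for i in range(len(eigenvalues)):
    lam = eigenvalues[i]
    x = eigenvectors[:, i]
    # Size of A x - lambda x; should be zero up to rounding error.
    residual = np.linalg.norm(A @ x - lam * x)
    print(lam, residual)

# Two quick sanity checks that hold for any square matrix:
# the eigenvalues sum to the trace and multiply to the determinant.
trace_matches = np.isclose(eigenvalues.sum(), np.trace(A))
det_matches = np.isclose(eigenvalues.prod(), np.linalg.det(A))
print(trace_matches, det_matches)
```

For this matrix the eigenvalues come back as complex numbers, so the eigenvectors are complex as well; the residual check works unchanged.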

## Calculus Foundations
### Gradients
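The gradient of a function points in the direction of steepest increase, and its magnitude is the rate of change in that direction. For a function of one variable it reduces to the ordinary derivative: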
$$ f(x) = x^2 \quad \Rightarrow \quad \nabla f = 2x $$

In [None]:
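x = np.linspace(-3, 3, 200)
f = x**2   # function values
df = 2*x   # analytic derivative

# Plot the function together with its derivative (df was previously computed but never shown).
plt.plot(x, f, label='f(x) = x^2')
plt.plot(x, df, label="f'(x) = 2x")
plt.title('f(x) = x^2 and its derivative')
plt.xlabel('x')
plt.ylabel('value')
plt.legend()
plt.show()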
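
The analytic derivative $2x$ can be cross-checked numerically. The sketch below uses a central finite difference, a common way to sanity-check hand-derived gradients; the helper `central_difference` and the step size `h` are assumptions made for this example, not part of the original notebook.

```python
import numpy as np

def f(x):
    """The example function f(x) = x^2."""
    return x**2

def central_difference(func, x, h=1e-5):
    """Approximate the derivative of func at x with a symmetric difference."""
    return (func(x + h) - func(x - h)) / (2 * h)

x = np.linspace(-3, 3, 200)
numeric = central_difference(f, x)  # finite-difference estimate
analytic = 2 * x                    # derivative from the formula above

# Largest disagreement between the two over the whole grid.
max_error = np.max(np.abs(numeric - analytic))
print(max_error)
```

The printed error should be tiny, limited only by floating-point rounding, because the central difference is exact for quadratics in exact arithmetic.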

## Probability & Statistics
### Normal Distribution
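A normal (Gaussian) random variable with mean $\mu$ and standard deviation $\sigma$ has the density:
$$ p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}} $$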

In [None]:
mu, sigma = 0, 1
x = np.linspace(-4, 4, 300)
p = (1/np.sqrt(2*np.pi*sigma**2)) * np.exp(-(x-mu)**2/(2*sigma**2))

plt.plot(x, p)
plt.title('Normal Distribution')
plt.xlabel('x')
plt.ylabel('p(x)')
plt.show()
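
Two properties of the density are easy to confirm numerically: the total area under the curve is 1, and roughly 68% of the probability lies within one standard deviation of the mean. The sketch below is one way to check this, assuming the standard normal ($\mu = 0$, $\sigma = 1$); the helpers `normal_pdf` and `trapezoid_area` are written here for illustration.

```python
import numpy as np

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution, as in the formula above."""
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

def trapezoid_area(y, x):
    """Area under sampled points (x, y) using the trapezoid rule."""
    return np.sum((y[1:] + y[:-1]) / 2 * np.diff(x))

# A wide, fine grid so that the tails outside it are negligible.
x = np.linspace(-8, 8, 4001)
p = normal_pdf(x)

total_area = trapezoid_area(p, x)

# Restrict to the interval within one standard deviation of the mean.
inside = np.abs(x) <= 1.0
one_sigma_area = trapezoid_area(p[inside], x[inside])

print(total_area)      # should be very close to 1
print(one_sigma_area)  # should be close to 0.68
```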

## Optimization: SGD
### Stochastic Gradient Descent Update
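Each step moves the weights $w$ against the gradient of the loss $L$, scaled by the learning rate $\eta$. In SGD the gradient is estimated from a random sample or mini-batch rather than computed on the full dataset:
$$ w := w - \eta \nabla_w L $$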
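
As a worked case, take the quadratic $f(w) = w^2$ with exact gradient $2w$. The update then has a closed form, which makes the role of the learning rate explicit:

$$ w_{k+1} = w_k - \eta \cdot 2 w_k = (1 - 2\eta)\, w_k \quad \Rightarrow \quad w_k = (1 - 2\eta)^k\, w_0 $$

With $\eta = 0.1$ and $w_0 = 4$ this gives $w_k = 4 \cdot 0.8^k$, so after 20 steps $w_{20} = 4 \cdot 0.8^{20} \approx 0.046$. The iterates shrink toward the minimizer $w = 0$ whenever $|1 - 2\eta| < 1$, that is for $0 < \eta < 1$, and diverge for $\eta > 1$.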

In [None]:
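eta = 0.1   # learning rate
w = 4.0     # initial weight

trajectory = [w]  # include the starting point in the plotted path
for _ in range(20):
    # Exact gradient of f(w) = w^2. No sampling is involved,
    # so this loop is plain (deterministic) gradient descent.
    grad = 2*w
    w = w - eta * grad
    trajectory.append(w)

plt.plot(trajectory)
plt.title('Gradient Descent Path for f(w)=w^2')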
plt.xlabel('Iteration')
plt.ylabel('w')
plt.show()
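
To show where the "stochastic" part comes in, here is a minimal sketch in which the gradient is estimated from a random mini-batch at each step. Everything about the setup is an assumption made for illustration: the synthetic data, the loss $L(w) = \text{mean}\,(w - x_i)^2$ (whose minimizer is the sample mean), the batch size, and the step count.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

# Hypothetical dataset: 1000 noisy observations centred near 3.0.
data = rng.normal(loc=3.0, scale=1.0, size=1000)

eta = 0.1        # learning rate, same value as in the cell above
batch_size = 10  # number of samples used for each gradient estimate
w = 0.0          # initial weight
trajectory = [w]

for _ in range(200):
    # Draw a random mini-batch; this is the source of stochasticity.
    batch = rng.choice(data, size=batch_size, replace=False)
    # Gradient of mean((w - x)^2) over the batch is 2 * mean(w - x).
    grad = 2 * np.mean(w - batch)
    w = w - eta * grad
    trajectory.append(w)

# The full-data loss is minimized at the sample mean, so w should end up near it.
print(w, data.mean())
```

Unlike the deterministic path, `trajectory` here keeps jittering around the minimizer instead of settling exactly, because each mini-batch gives a slightly different gradient. Plotting it with `plt.plot(trajectory)` makes the noise visible; a smaller `eta` or a larger `batch_size` reduces it.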