# Week 1 - Linear Regression Model

## Hypothesis

> $h_{\theta}(x) = \theta_{0} + \theta_{1}x$

## Cost Function

> $\displaystyle J(\theta_{0},\theta_{1}) = \frac{1}{2m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)}) - y^{(i)})^2$

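The hypothesis and cost function can be evaluated directly in a few lines. This is a minimal sketch in plain Python; the function names `hypothesis` and `cost` and the toy data are assumptions made for illustration, not part of the course material.

```python
def hypothesis(theta0, theta1, x):
    """h_theta(x) = theta0 + theta1 * x"""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """J(theta0, theta1) = 1/(2m) * sum of squared prediction errors."""
    m = len(xs)
    squared_errors = (
        (hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)
    )
    return sum(squared_errors) / (2 * m)


# Hypothetical toy data lying exactly on the line y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

perfect_fit = cost(0.0, 2.0, xs, ys)  # every error is zero
poor_fit = cost(0.0, 0.0, xs, ys)     # errors are -2, -4, -6
```

A lower value of $J$ means the line fits the data better; here `perfect_fit` is zero because the chosen parameters reproduce every $y^{(i)}$ exactly.
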

## Gradient Descent Algorithm

Repeat until convergence, updating $\theta_{0}$ and $\theta_{1}$ simultaneously:

>$\displaystyle \theta_{j} := \theta_{j} - \alpha\frac{\partial}{\partial \theta_{j}} J(\theta_{0},\theta_{1}), j \in \{0,1\}$
>
> $\alpha$ is the **learning rate**.

**or**, with the partial derivatives written out:

>$\displaystyle \theta_{0} := \theta_{0} - \alpha\frac{1}{m} \sum_{i=1}^{m} (h_{\theta}(x^{(i)}) - y^{(i)})$
>
>$\displaystyle \theta_{1} := \theta_{1} - \alpha\frac{1}{m} \sum_{i=1}^{m} (h_{\theta}(x^{(i)}) - y^{(i)})\cdot x^{(i)}$

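The two update rules translate into a short loop. The sketch below is one possible implementation under stated assumptions: the name `gradient_descent`, the defaults for `alpha` and `iterations`, and the toy data are all hypothetical, and a fixed number of iterations stands in for a real convergence test.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Fit h(x) = theta0 + theta1 * x by gradient descent on J."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # Prediction errors h(x_i) - y_i over all m training examples.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients were computed from the
        # old parameter values before either parameter changes.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Hypothetical toy data generated from y = 1 + 2x.
xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]
theta0, theta1 = gradient_descent(xs, ys)
```

On this data the parameters approach $\theta_0 = 1$ and $\theta_1 = 2$. A learning rate that is too large can make the updates diverge, so `alpha` usually needs tuning for other data.
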

## Batch Gradient Descent

Each step of gradient descent uses all $m$ training examples: the sums in the update rules run over $i = 1, \dots, m$.


## Matrix Notation

> $\mathbf{A} = \begin{bmatrix}
    1 & 2 & 3 \\
    4 & 5 & 6
\end{bmatrix} \in \mathbb{R}^{2 \times 3}$ 
>
> $\mathbf{A}_{ij}$ = entry in the $i^{th}$ row and the $j^{th}$ column.
>
> $\mathbf{A}_{21} = 4$

## Vector Notation

> $\mathbf{y} = \begin{bmatrix}
    460 \\
    232 \\
    315 \\
    178
\end{bmatrix} \in \mathbb{R}^{4}$
>
> $\mathbf{y_{i}} = i^{th}$ element.
>
> $\mathbf{y_2} = 232$
>
> Prefer 1-indexed vectors: the first element is $\mathbf{y_1}$, not $\mathbf{y_0}$.

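Python lists are 0-indexed, so the 1-indexed notation needs an offset when it is translated into code. A small sketch, with the helper names `entry` and `element` assumed for illustration:

```python
A = [[1, 2, 3],
     [4, 5, 6]]
y = [460, 232, 315, 178]


def entry(matrix, i, j):
    """Return A_ij using the 1-indexed convention of the notes."""
    return matrix[i - 1][j - 1]


def element(vector, i):
    """Return y_i using the 1-indexed convention of the notes."""
    return vector[i - 1]


rows, cols = len(A), len(A[0])  # dimensions: rows x columns
a_21 = entry(A, 2, 1)           # second row, first column
y_2 = element(y, 2)             # second element
```
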
## Matrix Addition

> $\begin{bmatrix}
    1 & 0 \\
    2 & 5 \\
    3 & 1
\end{bmatrix} + \begin{bmatrix}
    4 & 0.5 \\
    2 & 5 \\
    0 & 1
\end{bmatrix} = \begin{bmatrix}
    5 & 0.5 \\
    4 & 10 \\
    3 & 2
\end{bmatrix}$
>
> Only matrices of the same dimensions can be added; the sum is computed entry by entry.

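Element-wise addition can be sketched with nested lists. The function name `add` is an assumption; the dimension check mirrors the rule that only matrices of the same size can be added.

```python
def add(A, B):
    """Add two matrices entry by entry; dimensions must match."""
    same_rows = len(A) == len(B)
    same_cols = all(len(ra) == len(rb) for ra, rb in zip(A, B))
    if not (same_rows and same_cols):
        raise ValueError("matrices must have the same dimensions")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, 0], [2, 5], [3, 1]]
B = [[4, 0.5], [2, 5], [0, 1]]
C = add(A, B)
```
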
## Scalar Multiplication

> $3 \times \begin{bmatrix}
    1 & 0 \\
    2 & 5 \\
    3 & 1
\end{bmatrix} = \begin{bmatrix}
    3 & 0 \\
    6 & 15 \\
    9 & 3
\end{bmatrix}$
>
> $\begin{bmatrix}
    4 & 0 \\
    6 & 3
\end{bmatrix} / 4 = \begin{bmatrix}
    1 & 0 \\
    \frac{3}{2} & \frac{3}{4}
\end{bmatrix}$

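Multiplying or dividing by a scalar applies the operation to every entry. In this sketch the name `scale` is an assumption, and division by 4 is written as multiplication by `Fraction(1, 4)` so that the results stay exact fractions.

```python
from fractions import Fraction


def scale(k, A):
    """Multiply every entry of matrix A by the scalar k."""
    return [[k * a for a in row] for row in A]


tripled = scale(3, [[1, 0], [2, 5], [3, 1]])
# Dividing by 4 is the same as multiplying by 1/4.
quartered = scale(Fraction(1, 4), [[4, 0], [6, 3]])
```
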
## Matrix Vector Multiplication

> $\begin{bmatrix}
    1 & 3 \\
    4 & 0 \\
    2 & 1
\end{bmatrix} \begin{bmatrix}
    1 \\
    5
\end{bmatrix} = \begin{bmatrix}
    16 \\
    4 \\
    7
\end{bmatrix}$

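Each entry of the result is the dot product of one row of the matrix with the vector. A minimal sketch, with the function name `mat_vec` assumed for illustration:

```python
def mat_vec(A, x):
    """Multiply an m x n matrix by an n-element vector."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("each row of A must be as long as x")
    # One dot product per row of A.
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


result = mat_vec([[1, 3], [4, 0], [2, 1]], [1, 5])
```

For the first row this gives $1 \cdot 1 + 3 \cdot 5 = 16$, and similarly for the other rows.
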
## Matrix Matrix Multiplication

> $\begin{bmatrix}
    1 & 3 & 2 \\
    4 & 0 & 1
\end{bmatrix} \begin{bmatrix}
    1 & 3 \\
    0 & 1 \\
    5 & 2
\end{bmatrix} = \begin{bmatrix}
    11 & 10 \\
    9 & 14
\end{bmatrix}$
>
> $\mathbf{A} \times \mathbf{B} = \mathbf{C},
    \mathbf{A} \in \mathbb{R}^{m \times n},
    \mathbf{B} \in \mathbb{R}^{n \times o},
    \mathbf{C} \in \mathbb{R}^{m \times o}$
    
Matrix multiplication is not commutative in general.

> $\mathbf{A} \times \mathbf{B} \neq \mathbf{B} \times \mathbf{A}$

Matrix multiplication is associative.

> $(\mathbf{A} \times \mathbf{B}) \times \mathbf{C} = \mathbf{A} \times (\mathbf{B} \times \mathbf{C})$

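The dimension rule and both properties can be checked numerically. This is a sketch in plain Python; the function name `matmul` and the small matrices `P` and `Q` are assumptions chosen for illustration.

```python
def matmul(A, B):
    """Multiply an m x n matrix by an n x o matrix, giving m x o."""
    if any(len(row) != len(B) for row in A):
        raise ValueError("columns of A must equal rows of B")
    cols_of_B = list(zip(*B))
    return [
        [sum(a * b for a, b in zip(row, col)) for col in cols_of_B]
        for row in A
    ]


A = [[1, 3, 2], [4, 0, 1]]      # 2 x 3
B = [[1, 3], [0, 1], [5, 2]]    # 3 x 2
AB = matmul(A, B)               # 2 x 2
BA = matmul(B, A)               # 3 x 3, not even the same shape as AB

# Not commutative, even for square matrices of equal size.
P = [[1, 1], [0, 0]]
Q = [[0, 0], [2, 0]]
PQ = matmul(P, Q)
QP = matmul(Q, P)

# Associative: the grouping does not change the result.
left_grouping = matmul(matmul(A, B), P)
right_grouping = matmul(A, matmul(B, P))
```
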
## Identity Matrix

> $I \text{ or } I_{n \times n}$
>
> $\begin{bmatrix}
    1
\end{bmatrix} \in \mathbb{R}^{1\times 1}$
>    
> $\begin{bmatrix}
    1 & 0 \\
    0 & 1
\end{bmatrix} \in \mathbb{R}^{2\times 2}$
>
> $\begin{bmatrix}
    1 & 0 & 0 \\
    0 & 1 & 0 \\
    0 & 0 & 1
\end{bmatrix} \in \mathbb{R}^{3\times 3}$
>
> $\begin{bmatrix}
    1 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0 \\
    0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 1
\end{bmatrix} \in \mathbb{R}^{4\times 4}$
>
> $\mathbf{A} \cdot \mathbf{I} = \mathbf{I} \cdot \mathbf{A} = \mathbf{A}$

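The identity property can be verified with a small sketch. The names `identity` and `matmul` are assumptions. For a non-square $\mathbf{A}$, the identity on the left and the one on the right must have different sizes for the products to be defined.

```python
def identity(n):
    """Return the n x n identity matrix as nested lists."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Multiply an m x n matrix by an n x o matrix."""
    if any(len(row) != len(B) for row in A):
        raise ValueError("columns of A must equal rows of B")
    cols_of_B = list(zip(*B))
    return [
        [sum(a * b for a, b in zip(row, col)) for col in cols_of_B]
        for row in A
    ]


A = [[1, 2, 3], [4, 5, 6]]       # 2 x 3
left = matmul(identity(2), A)    # I (2 x 2) times A
right = matmul(A, identity(3))   # A times I (3 x 3)
```
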
## Inverse Matrix

> $\mathbf{A}\mathbf{A}^{-1} = \mathbf{A}^{-1}\mathbf{A} = \mathbf{I}, \mathbf{A} \in \mathbb{R}^{m \times m}$

Only square matrices can have inverses, and not every square matrix has one.

Matrices that don't have an inverse are called "singular" or "degenerate", for example the zero matrix:

> $\begin{bmatrix}
    0 & 0 \\
    0 & 0
\end{bmatrix}$

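For the $2 \times 2$ case, an inverse can be computed with the standard determinant formula, which these notes do not cover; it is used here only to illustrate the definition. The function names and the example matrix `A` are assumptions, and `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction


def inverse_2x2(M):
    """Return the inverse of a 2 x 2 matrix, or None if it is singular."""
    a, b, c, d = (Fraction(v) for row in M for v in row)
    det = a * d - b * c
    if det == 0:
        return None  # singular (degenerate): no inverse exists
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(X, Y):
    """Multiply two 2 x 2 matrices."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


A = [[3, 4], [2, 16]]                     # hypothetical example matrix
A_inv = inverse_2x2(A)
product = matmul_2x2(A, A_inv)            # should be the identity
reverse_product = matmul_2x2(A_inv, A)    # also the identity
zero_inverse = inverse_2x2([[0, 0], [0, 0]])
```

Both products come out as the identity matrix, while the zero matrix yields `None` because its determinant is zero.
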
## Matrix Transpose

> $\mathbf{A} = \begin{bmatrix}
    1 & 2 & 0 \\
    3 & 5 & 9
\end{bmatrix}$
>
> $\mathbf{A}^{\top} = \begin{bmatrix}
    1 & 3 \\
    2 & 5 \\
    0 & 9
\end{bmatrix}$
>
> $\mathbf{B} = \mathbf{A}^{\top},
    \mathbf{A} \in \mathbb{R}^{m \times n},
    \mathbf{B} \in \mathbb{R}^{n \times m},
    \mathbf{B}_{ij} = \mathbf{A}_{ji}$
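
Transposing swaps rows and columns, which `zip(*A)` does directly for nested lists. A minimal sketch, with the function name `transpose` assumed for illustration:

```python
def transpose(A):
    """Return the transpose of an m x n matrix as an n x m matrix."""
    return [list(column) for column in zip(*A)]


A = [[1, 2, 0],
     [3, 5, 9]]
B = transpose(A)

# Check B_ij = A_ji for every entry (0-indexed in code).
entries_match = all(
    B[i][j] == A[j][i]
    for i in range(len(B))
    for j in range(len(B[0]))
)
```
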