In [1]:
#Name: Chaeyoon Kim
#City Email: Chaeyoon.Kim@city.ac.uk

import numpy as np

Derive the gradient of the objective function with respect to the slope, $m$. Rearrange it to show that the update equation written above does find the stationary points of the objective function. By computing its derivative, show that it's a minimum.

**Derive the gradient of the objective function with respect to the slope, $m$**


\begin{align*}
  E(m, c) = \sum_{i=1}^n (y_i-mx_i-c)^2\\
  \frac{\text{d}}{\text{d}m}E(m, c) = -2\sum_{i=1}^n x_i (y_i - mx_i - c) 
\end{align*}
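As a sanity check on the derivative above, the analytic gradient can be compared against a central finite difference. This is only a sketch: the arrays `x_demo` and `y_demo`, the point `(m0, c0)` and the helper names `objective` and `grad_m` are made up for illustration and are not part of the lab data.

```python
import numpy as np

def objective(m, c, x, y):
    """Sum of squared errors E(m, c)."""
    return np.sum((y - m * x - c) ** 2)

def grad_m(m, c, x, y):
    """Analytic derivative of E(m, c) with respect to m."""
    return -2.0 * np.sum(x * (y - m * x - c))

# Illustrative data only (assumption, not the lab's data set)
x_demo = np.array([1.0, 2.0, 3.0, 4.0])
y_demo = np.array([2.1, 3.9, 6.2, 7.8])
m0, c0 = 0.5, 0.1

# Central finite difference in m, holding c fixed
eps = 1e-6
numeric = (objective(m0 + eps, c0, x_demo, y_demo)
           - objective(m0 - eps, c0, x_demo, y_demo)) / (2 * eps)
analytic = grad_m(m0, c0, x_demo, y_demo)
print(numeric, analytic)
```

The two printed values should agree to several decimal places if the derivation is right.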



**Rearrange it to show that the update equation written above does find the stationary points of the objective function.**


\begin{align*}
  0 = -2\sum_{i=1}^n x_i (y_i - mx_i - c) \\
  0 = 2m\sum_{i=1}^n x^2_i -2 \sum_{i=1}^n x_i(y_i - c) \\
  2 \sum_{i=1}^n x_i(y_i - c)  = 2m\sum_{i=1}^n x^2_i\\
  m = \frac{\sum_{i=1}^n x_i(y_i - c)}{\sum_{i=1}^n x^2_i}\\
  m = \frac{\sum_{i=1}^n (y_i - c)x_i}{\sum_{i=1}^n x^2_i}
\end{align*}
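The rearrangement above can be checked numerically: plugging the updated slope back into the gradient should give zero, up to floating point error. The data and the fixed offset `c_fixed` below are illustrative assumptions.

```python
import numpy as np

# Illustrative data only (assumption, not the lab's data set)
x_demo = np.array([1.0, 2.0, 3.0, 4.0])
y_demo = np.array([2.1, 3.9, 6.2, 7.8])
c_fixed = 0.1  # offset held fixed while m is updated

# Update equation for m obtained by setting the gradient to zero
m_star = np.sum((y_demo - c_fixed) * x_demo) / np.sum(x_demo ** 2)

# Gradient with respect to m, evaluated at the updated slope
grad_at_m_star = -2.0 * np.sum(x_demo * (y_demo - m_star * x_demo - c_fixed))
print(m_star, grad_at_m_star)
```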


                       
                       
                       
**By computing the second derivative show that it's a minimum.**

\begin{align*}
  \frac{\text{d}}{\text{d}m}E(m, c) = -2\sum_{i=1}^n x_i (y_i - mx_i - c) \\
  \frac{\text{d}}{\text{d}m}E(m, c) = 2m\sum_{i=1}^n x^2_i -2 \sum_{i=1}^n x_i(y_i - c) \\
  \frac{\text{d}^2}{\text{d}m^2}E(m, c) = 2\sum_{i=1}^n x^2_i 
\end{align*}


which is strictly positive whenever at least one $x_i$ is non-zero, because each $x_i^2$ is non-negative. The objective is therefore convex in $m$, so the stationary point is a minimum.
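A small numerical illustration of the same point, assuming made-up data: the second derivative is a positive constant, and moving the slope away from the stationary point in either direction increases the error.

```python
import numpy as np

def objective(m, c, x, y):
    """Sum of squared errors E(m, c)."""
    return np.sum((y - m * x - c) ** 2)

# Illustrative data only (assumption, not the lab's data set)
x_demo = np.array([1.0, 2.0, 3.0, 4.0])
y_demo = np.array([2.1, 3.9, 6.2, 7.8])
c_fixed = 0.1

# Second derivative with respect to m: it does not depend on m or c
second_derivative = 2.0 * np.sum(x_demo ** 2)

# Stationary point from the update equation, and the error around it
m_star = np.sum((y_demo - c_fixed) * x_demo) / np.sum(x_demo ** 2)
E_star = objective(m_star, c_fixed, x_demo, y_demo)
E_left = objective(m_star - 0.1, c_fixed, x_demo, y_demo)
E_right = objective(m_star + 0.1, c_fixed, x_demo, y_demo)
print(second_derivative, E_left > E_star, E_right > E_star)
```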

## Multiple Input Solution with Linear Algebra

$$
f(x) = mx + c
$$


I now have to turn the multiplications and additions into a linear algebraic form. We have one multiplication ($m\times x$) and one addition ($mx + c$), but I can turn this into an inner product by writing it in the following way,
$$
f(x) = m \times x + c \times 1,
$$
which is the inner product of the two vectors

$$
\mathbf{x} = \begin{bmatrix} 1\\ x\end{bmatrix}.
$$
$$
\mathbf{w} = \begin{bmatrix} c \\ m\end{bmatrix}
$$
because if we now take the inner product between these two vectors we recover
$$
\mathbf{x}\cdot\mathbf{w} = 1 \times c + x \times m = mx + c
$$
In `numpy` we can define this vector as follows

In [3]:
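# update for the slope m; x, y and c are assumed to be defined in earlier cells
m = ((y - c)*x).sum()/(x**2).sum()

# define the vector w = [c, m]^T, matching the definition of w above
w = np.zeros(shape=(2, 1))
w[0] = c
w[1] = m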
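To see the inner product at work, here is a minimal sketch that evaluates $f(x)$ both directly and as $\mathbf{x}\cdot\mathbf{w}$, using the ordering from the definitions above (offset first, slope second). The values `m_demo`, `c_demo` and `x_scalar` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical parameter values and a single input (assumptions)
m_demo, c_demo = 2.0, 0.5
x_scalar = 3.0

# Augmented input and parameter vectors as column vectors
x_vec = np.array([[1.0], [x_scalar]])
w_vec = np.array([[c_demo], [m_demo]])

# Inner product x^T w, converted from a 1x1 array to a Python float
f_inner = (x_vec.T @ w_vec).item()

# Direct evaluation of f(x) = m x + c for comparison
f_direct = m_demo * x_scalar + c_demo
print(f_inner, f_direct)
```

Both values match only when the entries of the two vectors are paired consistently, so the position of the `1` in `x_vec` must line up with the position of the offset in `w_vec`.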

The result of this multiplication is of the form
$$
\begin{bmatrix}c_1\\c_2 \\ \vdots \\ c_k\end{bmatrix} = 
\begin{bmatrix} b_{1,1} & b_{1, 2} & \dots & b_{1, k} \\
b_{2, 1} & b_{2, 2} & \dots & b_{2, k} \\
\vdots & \vdots & \ddots & \vdots \\
b_{k, 1} & b_{k, 2} & \dots & b_{k, k} \end{bmatrix} \begin{bmatrix}a_1\\a_2 \\ \vdots\\ a_k\end{bmatrix} = \begin{bmatrix} b_{1, 1}a_1 + b_{1, 2}a_2 + \dots + b_{1, k}a_k\\
b_{2, 1}a_1 + b_{2, 2}a_2 + \dots + b_{2, k}a_k \\ 
\vdots\\ 
b_{k, 1}a_1 + b_{k, 2}a_2 + \dots + b_{k, k}a_k\end{bmatrix}
$$
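The row-by-row pattern of a matrix-vector product can be reproduced in `numpy` with the `@` operator. The sketch below compares it with an explicit sum over columns; the matrix `B` and vector `a` are made-up values, and the result is named `c_vec` to avoid clashing with the offset `c`.

```python
import numpy as np

# Made-up 2x2 matrix and 2x1 column vector (assumptions for illustration)
B = np.array([[1.0, 2.0],
              [3.0, 4.0]])
a = np.array([[5.0],
              [6.0]])

# Matrix-vector product using numpy's matrix multiplication operator
c_vec = B @ a

# The same product written out: each entry is a row of B dotted with a
c_manual = np.array([[sum(B[i, j] * a[j, 0] for j in range(B.shape[1]))]
                     for i in range(B.shape[0])])
print(c_vec.ravel(), c_manual.ravel())
```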

### Differentiating the Objective
**So the full gradient is**

$$
\frac{\text{d}E(\mathbf{w})}{\text{d}\mathbf{w}}=\begin{bmatrix}\frac{\text{d}E(m,c)}{\text{d}c} \\ \frac{\text{d}E(m,c)}{\text{d}m}\end{bmatrix}=-2\begin{bmatrix}
\mathbf{1}^\top\mathbf{y} \\ \mathbf{x}^\top \mathbf{y}
\end{bmatrix}+2\begin{bmatrix}
nc + \mathbf{1}^\top\mathbf{x}m \\
\mathbf{x}^\top\mathbf{1}c + \mathbf{x}^\top\mathbf{x}m
\end{bmatrix}=  
\begin{bmatrix}-2\sum_{i=1}^n(y_i - mx_i -c)\\
-2\sum_{i=1}^n x_i (y_i-mx_i - c)
\end{bmatrix}.
$$
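The agreement between the linear algebra form and the summation form of the gradient can be checked numerically. The sketch below computes each partial derivative both ways and refers to them by name, so it does not depend on how the two entries are stacked. The data and the point `(m0, c0)` are illustrative assumptions.

```python
import numpy as np

# Illustrative data only (assumption, not the lab's data set)
x_demo = np.array([1.0, 2.0, 3.0, 4.0])
y_demo = np.array([2.1, 3.9, 6.2, 7.8])
m0, c0 = 0.5, 0.1

n = x_demo.shape[0]
ones = np.ones(n)

# Linear algebra form, built from inner products of x, y and the ones vector
dE_dm_vec = -2.0 * (x_demo @ y_demo) + 2.0 * ((x_demo @ x_demo) * m0 + (x_demo @ ones) * c0)
dE_dc_vec = -2.0 * (ones @ y_demo) + 2.0 * ((x_demo @ ones) * m0 + n * c0)

# Summation form, built from the residuals y_i - m x_i - c
residual = y_demo - m0 * x_demo - c0
dE_dm_sum = -2.0 * np.sum(x_demo * residual)
dE_dc_sum = -2.0 * np.sum(residual)

print(dE_dm_vec, dE_dm_sum)
print(dE_dc_vec, dE_dc_sum)
```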