### Gradient Descent for Machine Learning

To find the line of best fit, different combinations of slope and intercept are tested. For each combination, the error is the vertical distance between each data point and the line, squared so that every term is non-negative and then averaged over all points. The combination with the minimum error gives the line of best fit.

Mean-squared Error:

![Mean-squared error](mserror.png)
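
As a rough sketch of the idea above, the snippet below computes the mean-squared error for a few candidate lines. The data points and the candidate (intercept, slope) pairs are hypothetical values chosen only for illustration, and `mean_squared_error` is a helper name introduced here.

```python
import numpy as np

# Hypothetical data points, for illustration only.
x = np.array([0.5, 2.3, 2.9])
y = np.array([1.4, 1.9, 3.2])

def mean_squared_error(intercept, slope, x, y):
    """Average of the squared vertical distances between the points and the line."""
    predicted = intercept + slope * x
    return np.mean((y - predicted) ** 2)

# Try a few hypothetical (intercept, slope) combinations and compare their errors.
candidates = [(0.0, 1.0), (0.5, 1.0), (1.0, 0.5)]
for intercept, slope in candidates:
    error = mean_squared_error(intercept, slope, x, y)
    print(f"intercept={intercept}, slope={slope}, MSE={error:.4f}")
```

The candidate with the smallest printed error is the best fit among those tried; gradient descent replaces this manual search with a systematic one.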

Plotting the error against both the slope and the intercept gives a surface, illustrated by the three-axis diagram below:

![Gradient Descent Illustration](gradientdescent.png)

Separate the variables and plot each one against the mean-squared error on its own, holding the other fixed. Each graph should look like a parabola with a single minimum (shown in the picture below):

![Gradient Descent Illustration 2](Gd2.png)


### How to perform gradient descent (regardless of loss function used)

Step 1: Take the derivative of the loss function with respect to each parameter in it (that is, take the gradient of the loss function).

E.g.

>$\sum$ squared residuals (loss function) = $\sum(observed_{value} - predicted_{value})^2$
>
>$\sum$ squared residuals (loss function) = $\sum(observed_{value} - (intercept + slope \times weight))^2$
>
>$\frac{\partial}{\partial\, intercept}$ $\sum$ squared residuals (loss function)
>
>$\frac{\partial}{\partial\, slope}$ $\sum$ squared residuals (loss function)
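
For reference, applying the chain rule to the sum of squared residuals written above gives the two derivatives explicitly. This is a worked sketch using the same symbols as the example; it is not taken from the original figures.

>$\frac{\partial}{\partial\, intercept} \sum(observed_{value} - (intercept + slope \times weight))^2 = \sum -2\,(observed_{value} - (intercept + slope \times weight))$
>
>$\frac{\partial}{\partial\, slope} \sum(observed_{value} - (intercept + slope \times weight))^2 = \sum -2 \times weight \times (observed_{value} - (intercept + slope \times weight))$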

Step 2: Pick random values for the parameters.

Pick random intercept and slope values.

>$intercept = 0.5$
>
>$slope = 1$

Step 3: Plug the parameter values into the derivatives (ahem, the Gradient).
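
A minimal sketch of Step 3, assuming the sum-of-squared-residuals loss from Step 1 and the starting values from Step 2. The `weight` and `observed` arrays are hypothetical data, and `gradient` is a helper name introduced here for illustration.

```python
import numpy as np

# Hypothetical data, for illustration only.
weight = np.array([0.5, 2.3, 2.9])
observed = np.array([1.4, 1.9, 3.2])

def gradient(intercept, slope):
    """Return the derivatives of the sum of squared residuals
    with respect to the intercept and the slope."""
    residuals = observed - (intercept + slope * weight)
    d_intercept = np.sum(-2 * residuals)
    d_slope = np.sum(-2 * weight * residuals)
    return d_intercept, d_slope

# Plug in the starting values from Step 2.
d_intercept, d_slope = gradient(intercept=0.5, slope=1.0)
print(d_intercept, d_slope)
```

A positive derivative means the loss increases as that parameter increases, so the update in the following steps moves the parameter in the opposite direction.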

Step 4: Calculate the step sizes:

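>***Step size*** = *Derivative* (evaluated at the current parameter values) x *Learning Rate*

> The learning rate is a value you choose rather than one learned from the data, typically somewhere between 0.001 and 0.1.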

Step 5: Calculate the New Parameters

>***New Parameter*** = *Old Parameter* - *Step Size*
>
>Repeat Steps 3-5 until the **Step Size** is very small or the *maximum* number of steps is reached.

>The step size is directly proportional to the partial derivative of the loss function with respect to that parameter (the slope of the loss curve). So when the step size approaches 0, the derivative is close to 0 and the parameter is at, or very near, a minimum of the loss.
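
The five steps can be put together in a short loop. The sketch below assumes the sum-of-squared-residuals loss and the starting values from the example above; the data, the learning rate of 0.01, the stopping threshold, and the names `gradient` and `gradient_descent` are all illustrative choices, not values from the original notebook.

```python
import numpy as np

# Hypothetical data, for illustration only.
weight = np.array([0.5, 2.3, 2.9])
observed = np.array([1.4, 1.9, 3.2])

def gradient(intercept, slope):
    """Step 1 and Step 3: derivatives of the sum of squared residuals."""
    residuals = observed - (intercept + slope * weight)
    return np.array([np.sum(-2 * residuals), np.sum(-2 * weight * residuals)])

def gradient_descent(intercept=0.5, slope=1.0, learning_rate=0.01,
                     min_step=1e-6, max_steps=10_000):
    # Step 2: start from the chosen initial parameter values.
    params = np.array([intercept, slope], dtype=float)
    for step in range(1, max_steps + 1):
        # Step 4: step size = derivative x learning rate.
        step_sizes = learning_rate * gradient(*params)
        # Step 5: new parameter = old parameter - step size.
        params = params - step_sizes
        # Stop once every step size is very small.
        if np.all(np.abs(step_sizes) < min_step):
            break
    return params, step

(fitted_intercept, fitted_slope), steps_taken = gradient_descent()
print(f"intercept={fitted_intercept:.3f}, slope={fitted_slope:.3f}, steps={steps_taken}")
```

With a learning rate that is too large for the data, the updates can overshoot and diverge, which is one reason small values are usually tried first.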

#### Import Libraries

In [3]:
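# Registers the '3d' projection on older Matplotlib versions
from mpl_toolkits import mplot3d
import numpy as np
import matplotlib.pyplot as plt
# Interactive figures in the notebook (requires the ipympl package)
%matplotlib widget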


#### Define Loss Function


In [None]:
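def f(x, y):
    # Example surface for the 3D plot; it stands in for a loss function of
    # two parameters and is not the squared-residuals loss described above.
    return np.sin(np.sqrt(x ** 2 + y ** 2))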


#### Input data arrays

In [None]:

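# 30 evenly spaced values along each axis
x = np.linspace(-6, 6, 30)
y = np.linspace(-6, 6, 30)

# Build the 2D grid and evaluate the function at every grid point
X, Y = np.meshgrid(x, y)
Z = f(X, Y)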


#### Plotting

In [None]:
fig = plt.figure(figsize=(8,10))
ax = plt.axes(projection='3d')
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap='viridis', edgecolor='none')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
ax.set_title('3D surface')
plt.show()