### Gradient Descent

When you start a regression, you initialize the regression coefficients with random values, then use gradient descent to minimize the error (the cost function).

![Slope](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Slope.png)
_Image Source: https://davidmatablog.wordpress.com/2017/08/01/linear-regression-gradient-descent-with-python/_

Notice how we slowly inch down to the minimum of the error; this is gradient descent in action.  Let's start with a simple example!

![Iteration1](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Initial1.png)

In this example we start with an input of 3 and an actual value of 14.  The weight is 3 (this was chosen completely at random at the beginning), which makes our predicted value 9 (the input of 3 multiplied by the weight of 3).  This gives us an error of -5 (the predicted value of 9 minus the actual value of 14).  We now need to update the weight, and we are going to do that with gradient descent.  To update the weight we need 2 things:

1. Slope of the loss function (the squared error) with respect to the prediction, which is 2 * error
2. Value of the input

For part 1, the slope of the loss function is 2 * -5 = -10.  For part 2, the value of the input is 3.
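Where these two pieces come from can be sketched with a short derivation, assuming the loss is the squared error of a single example. The symbols $w$ (weight), $x$ (input), $y$ (actual value) and $L$ (loss) are introduced here for illustration and do not appear elsewhere in these notes.

```latex
\begin{aligned}
L(w) &= (wx - y)^2 \\
\frac{dL}{dw} &= 2\,(wx - y)\cdot x = 2 \cdot \text{error} \cdot \text{input} \\
\text{With } w = 3,\ x = 3,\ y = 14:\quad \frac{dL}{dw} &= 2 \cdot (9 - 14) \cdot 3 = -30
\end{aligned}
```

The chain rule is what brings in the input: the prediction is $wx$, so a small change in the weight changes the prediction by $x$ times as much.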

Next we multiply these numbers together (-10 x 3) and get -30, which is the slope of the loss with respect to the weight.  We then multiply this number by a learning rate.  The learning rate is a hyper-parameter that controls how much we adjust the weight at each step.  It's a good idea to use a small number so that we do not go past the minimum value.  A common choice is 0.01, which is what we will use.  The new weight is 3 - (0.01 x -30), which is 3.3.  Let's try it out!
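The update just described can be sketched in a few lines of Python. The function name `update_weight` and its parameter names are made up for this illustration, and the learning rate of 0.2 is only there to show what going past the minimum looks like; it is not a value used anywhere else in these notes.

```python
def update_weight(weight, input_val, actual_val, learning_rate):
    """Take one gradient descent step for the squared error of one example."""
    error = input_val * weight - actual_val      # predicted minus actual
    gradient = 2 * error * input_val             # slope of the loss w.r.t. the weight
    return weight - learning_rate * gradient

# Same numbers as the example above: input 3, actual value 14, starting weight 3.
small_step = update_weight(3, 3, 14, learning_rate=0.01)
big_step = update_weight(3, 3, 14, learning_rate=0.2)

print(round(small_step, 3))            # 3.3
print(round(big_step, 3))              # 9.0
print(round(3 * small_step - 14, 3))   # -4.1, the error moved toward 0
print(round(3 * big_step - 14, 3))     # 13.0, the error jumped past 0 and grew
```

With a learning rate of 0.01 the error goes from -5 to -4.1. With 0.2 the weight jumps to 9, the prediction becomes 27, and the error is now larger than where we started, which is the overshoot a small learning rate is meant to avoid.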

![Iteration2](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_2.png)

This time our error is -4.1, which is an improvement over iteration number 1, where our error was -5.  Let's update the weight and continue.

New weight = 3.3 - (0.01 x -24.6) = 3.546, where -24.6 is the new slope (2 x -4.1 x 3)

![Iteration3](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_3.png)

Now we have an error of -3.362, which means the error is still shrinking toward 0.  Let's update the weight again.

New weight = 3.546 - (0.01 x -20.172) = 3.7477

![Iteration4](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_4.png)

The new error is -2.7569.  Let's update the weight again.

New weight = 3.7477 - (0.01 x -16.5414) = 3.913

![Iteration5](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_5.png)

- New error: -2.261
- New weight: 3.913 - (0.01 x -13.566) = 4.049

![Iteration6](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_6.png)

- New error: -1.853
- New weight: 4.049 - (0.01 x -11.118) = 4.160

![Iteration7](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_7.png)

- New error: -1.52
- New weight: 4.160 - (0.01 x -9.12) = 4.251

![Iteration8](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_8.png)

- New error: -1.247
- New weight: 4.251 - (0.01 x -7.482) = 4.326

![Iteration9](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_9.png)

- New error: -1.022
- New weight: 4.326 - (0.01 x -6.132) = 4.387

![Iteration10](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_10.png)

- New error: -0.839
- New weight: 4.387 - (0.01 x -5.034) = 4.437

![Iteration11](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_11.png)

- New error: -0.689
- New weight: 4.437 - (0.01 x -4.134) = 4.478

![Iteration12](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration12.png)

- New error: -0.566
- New weight: 4.478 - (0.01 x -3.396) = 4.512

![Iteration13](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_13.png)

- New error: -0.464
- New weight: 4.512 - (0.01 x -2.784) = 4.540

![Iteration14](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_14.png)

- New error: -0.380
- New weight: 4.540 - (0.01 x -2.28) = 4.563

![Iteration15](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_15.png)

- New error: -0.311
- New weight: 4.563 - (0.01 x -1.866) = 4.582

![Iteration16](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_16.png)

- New error: -0.254
- New weight: 4.582 - (0.01 x -1.524) = 4.597

![Iteration17](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_17.png)

- New error: -0.209
- New weight: 4.597 - (0.01 x -1.254) = 4.610

![Iteration18](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_18.png)

- New error: -0.170
- New weight: 4.610 - (0.01 x -1.02) = 4.620

![Iteration19](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_19.png)

- New error: -0.140
- New weight: 4.620 - (0.01 x -0.84) = 4.628

![Iteration20](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Iteration_20.png)

- New error: -0.12
- New weight: 4.628 - (0.01 x -0.72) = 4.635

At iteration 20 I have a weight of 4.628 and a predicted value of 13.88.  If I continue this iterative process, the prediction will get closer and closer to 14.  I am now going to demonstrate how to program this.

In [82]:
input_val = 3
weight = 3              # initial weight (picked at random)
actual_val = 14
learning_rate = 0.01
counter = 1             # iteration number
predicted_val = input_val * weight
error = predicted_val - actual_val
error_list = []         # error from every iteration, kept for plotting later

# Keep looping while the absolute error is greater than 0.01
while abs(error) > 0.01:
    predicted_val = input_val * weight
    error = predicted_val - actual_val
    error_list.append(error)
    print('Iteration Number {} - Error: {} - Predicted Value {} - Weight {}'.format(
        counter, round(error, 3), round(predicted_val, 3), round(weight, 3)))
    # Slope of the squared error with respect to the weight: 2 * error * input
    weight = weight - learning_rate * (2 * error * input_val)
    counter += 1

Iteration Number 1 - Error: -5 - Predicted Value 9 - Weight 3
Iteration Number 2 - Error: -4.1 - Predicted Value 9.9 - Weight 3.3
Iteration Number 3 - Error: -3.362 - Predicted Value 10.638 - Weight 3.546
Iteration Number 4 - Error: -2.757 - Predicted Value 11.243 - Weight 3.748
Iteration Number 5 - Error: -2.261 - Predicted Value 11.739 - Weight 3.913
Iteration Number 6 - Error: -1.854 - Predicted Value 12.146 - Weight 4.049
Iteration Number 7 - Error: -1.52 - Predicted Value 12.48 - Weight 4.16
Iteration Number 8 - Error: -1.246 - Predicted Value 12.754 - Weight 4.251
Iteration Number 9 - Error: -1.022 - Predicted Value 12.978 - Weight 4.326
Iteration Number 10 - Error: -0.838 - Predicted Value 13.162 - Weight 4.387
Iteration Number 11 - Error: -0.687 - Predicted Value 13.313 - Weight 4.438
Iteration Number 12 - Error: -0.564 - Predicted Value 13.436 - Weight 4.479
Iteration Number 13 - Error: -0.462 - Predicted Value 13.538 - Weight 4.513
Iteration Number 14 - Error: -0.379 - Predic

After 33 iterations we end up with a predicted value of 13.991.  I saved all of the error terms in a list, so let's plot them to see what happens as our error gets smaller and smaller.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

X = np.arange(1, len(error_list) + 1)  # iteration numbers, 1 through 33 for the run above
plt.plot(X, np.abs(error_list))
plt.xlabel('Iterations')
plt.ylabel('Absolute Value of Error')
plt.title('Absolute Value of Error by Number of Iterations')
plt.show()

![Plot](https://raw.githubusercontent.com/sik-flow/notes/master/Pics/Plot.png)

At the beginning, when the error is large, the error decreases by a larger amount on each iteration.  As the error gets smaller, each iteration makes only a very small decrease in the error, because the size of each weight update is proportional to the error.
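One way to see why the curve flattens, offered as a sketch rather than as part of the original notebook: with a single input, each update multiplies the error by the constant factor 1 - (2 x learning rate x input squared), which is 1 - (2 x 0.01 x 9) = 0.82 here. The error therefore shrinks by 18% on every iteration, so the absolute drop is large while the error is large and tiny once the error is small. The helper `run_gradient_descent` is a hypothetical name that repeats the same update for a fixed number of steps.

```python
def run_gradient_descent(input_val, actual_val, weight, learning_rate, steps):
    """Return the error (predicted minus actual) seen at each step."""
    errors = []
    for _ in range(steps):
        error = input_val * weight - actual_val
        errors.append(error)
        # Same update rule as in the walkthrough: slope is 2 * error * input
        weight = weight - learning_rate * (2 * error * input_val)
    return errors

errors = run_gradient_descent(input_val=3, actual_val=14, weight=3,
                              learning_rate=0.01, steps=33)

# Ratio between consecutive errors: expected to stay at 1 - 2 * 0.01 * 3**2 = 0.82
ratios = [later / earlier for earlier, later in zip(errors, errors[1:])]
print(round(min(ratios), 4), round(max(ratios), 4))

# How much the absolute error dropped on the first step and on the last step
drops = [abs(earlier) - abs(later) for earlier, later in zip(errors, errors[1:])]
print(round(drops[0], 3), round(drops[-1], 4))
```

The first drop is 0.9 (from 5 to 4.1), while the last one is only a few thousandths, even though both are the same 18% reduction.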