<img src="images/kiksmeisedwengougent.png" alt="Banner" width="1100"/>

<div>
    <font color=#690027 markdown="1">
<h1>GRADIENT DESCENT</h1>    </font>
</div>

<div class="alert alert-box alert-success">
In this notebook, you look at an example of a function that depends on just one variable. This function is a quadratic function and has one minimum. We let points on the graph of the function move to the minimum using derivatives.</div>

<div class="alert alert-box alert-info">
The <b><em>loss function</em></b> is a function of the <b><em>weights</em></b>. The ML model adjusts the <em>weights</em> so that the <em>loss function</em> reaches its minimum value. It does this based on the slope of the tangents to the graph of the <em>loss function</em>, i.e. by using derivatives. This method is called <b><em>gradient descent</em></b>.</div>

### Import necessary modules

To solve equations, calculate coordinates, find zeros and solve systems of equations, you will use the SymPy module. <br>

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sympy import Symbol

<div>
    <font color=#690027 markdown="1">
<h2>1. Graph of a parabola</h2>    </font>
</div>

<div class="alert alert-block alert-warning"> 
You already learned in the notebook 'Graphs' how you can draw a graph with the Matplotlib module.</div>

Run the following code cell.

In [None]:
# GRAPH OF PARABOLA WITH GIVEN EQUATION

# choose the x-coordinates of the points that are plotted
x = np.linspace(-9.5, 9.5, 50)   
# equation of the parabola: y = 3x² + 2x + 5
# calculate the y-value for every x-coordinate
y = 3 * x**2 + 2 * x + 5

# plot parabola
# range and calibrate axes
plt.axis(xmin=-10, xmax=10, ymin=-5, ymax=100)
plt.xticks(np.arange(-10, 11, step=5))
plt.yticks(np.arange(-5, 100, step=10))
# plot parabola
plt.plot(x, y, color="blue", linewidth=1.0, linestyle="solid")     

# open drawing window
plt.show()

<div>
    <font color=#690027 markdown="1">
<h2>2. Plotting point on the parabola</h2>    </font>
</div>

Run the following code cell.

In [None]:
# PLOTTING PARABOLA WITH RANDOM POINT ON IT

# GRAPH OF PARABOLA WITH GIVEN EQUATION

# choose the x-coordinates of the points being plotted
x = np.linspace(-9.5, 9.5, 50)
# equation of the parabola: y = 3x² + 2x + 5
# calculate the y-value for each x-coordinate
y = 3 * x**2 + 2 * x + 5      # relationship between x and y for concrete values of x

# plot parabola
# range and calibration axes
plt.axis(xmin=-10, xmax=10, ymin=-5, ymax=100)
plt.xticks(np.arange(-10, 11, step=5))
plt.yticks(np.arange(-5, 100, step=10))
# plot parabola
plt.plot(x, y, color="blue", linewidth=1.0, linestyle="solid")     

# point P on the parabola, P(-5, ...)
# treat x and y as symbols and not as variables
x = Symbol("x")
y = Symbol("y")
y = 3 * x**2 + 2 * x + 5      # symbolic equation of parabola
x_P = -5
y_P = y.subs(x, -5)       # calculate y_P by replacing x with -5 in symbolic equation
plt.plot(x_P, y_P, color="purple", marker="o")  # plot point P on parabola

# open drawing window
plt.show()

<div>
    <font color=#690027 markdown="1">
<h2>3. Derivatives</h2>    </font>
</div>

The *derivative at a point* of a parabola is the slope of the tangent line to the parabola at that point. <br>Where the parabola rises, the tangent is an ascending straight line and its slope is positive. Where the parabola falls, the slope is negative. <br>At the vertex of the parabola, the tangent line is horizontal and the derivative is 0. <br>The steeper the tangent line, the greater its slope in absolute value, and so the greater the derivative in absolute value.
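As a small sketch of this idea, SymPy's `diff` can compute the derivative function of the parabola used in this notebook, and `subs` evaluates the slope at a few x-values. The helper name `slope_at` is only an illustration and is not used elsewhere in the notebook.

```python
from sympy import Symbol, diff, Rational

x = Symbol("x")
y = 3 * x**2 + 2 * x + 5     # symbolic equation of the parabola
Dy = diff(y, x)              # derivative function: 6*x + 2


def slope_at(x_value):
    """Slope of the tangent line to the parabola at x = x_value (illustrative helper)."""
    return Dy.subs(x, x_value)


print(Dy)                          # 6*x + 2
print(slope_at(-5))                # -28: the parabola falls, negative slope
print(slope_at(4))                 # 26: the parabola rises, positive slope
print(slope_at(Rational(-1, 3)))   # 0: horizontal tangent at the vertex
```

The slope at x = -5 is larger in absolute value than the slope at x = 4, which matches the fact that -5 lies further from the vertex.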

<div>
    <font color=#690027 markdown="1">
<h2>4. Making the point move to the lowest point on the parabola</h2>    </font>
</div>

The aim is to move the point P on the parabola to the lowest point, the vertex. The x-value must therefore be adjusted in steps, so that P actually approaches the vertex. Once close to the vertex, you must take care not to step past it.<br><br>The tangent at P also plays a role. If P is far from the vertex, the tangent is steep. If P is close to the vertex, the tangent is nearly horizontal.<br>The steeper the tangent line, the greater its slope in absolute value, so the greater the derivative in absolute value. <br><br>If you choose the steps so that they are proportional to the slope of the tangent, you take relatively large steps for a point P far from the vertex (where the tangent is steeper), and small steps for a point P close to the vertex (where the tangent is nearly horizontal). Just what you want!<br>You choose a proportionality factor $\eta$ between 0 and 1 ($\eta$ is the Greek letter *eta*). <br>For a point P to the left of the vertex, the slope of the tangent at P is negative, so you subtract $\eta$ times the slope from the x-value, so that P moves to the right.
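As a worked example of this rule, take the starting point and the value of $\eta$ used in the code cells of this notebook. Writing $Dy(x) = 6x + 2$ for the derivative function, one step from $x_P = -5$ with $\eta = 0.3$ gives:

$$x_{new} = x_P - \eta \cdot Dy(x_P) = -5 - 0.3 \cdot (6 \cdot (-5) + 2) = -5 - 0.3 \cdot (-28) = 3.4$$

The slope $-28$ is negative, so subtracting it moves P to the right. Note that in this particular case the step is large enough to carry P past the vertex at $x = -1/3$; the following steps then bring it back from the other side. The subscript *new* is only a label chosen here for the updated x-value.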

### First attempt

Execute the following code cell.

In [None]:
# move point on parabola to vertex, proportional to slope of tangent line

# GRAPH OF PARABOLA WITH GIVEN EQUATION

# select the x-coordinates of the points that are plotted
x = np.linspace(-9.5, 9.5, 50)
# equation of the parabola: y = 3x² + 2x + 5
# calculate the y-value for each x-coordinate
y = 3 * x**2 + 2 * x + 5      # relationship between x and y for concrete values of x

# plot parabola
# range and calibrate axes
plt.axis(xmin=-10, xmax=10, ymin=-5, ymax=100)
plt.xticks(np.arange(-10, 11, step=5))
plt.yticks(np.arange(-5, 100, step=10))
# plot parabola
plt.plot(x, y, color="blue", linewidth=1.0, linestyle="solid")

# point P on the parabola, P(-5, ...)
# Treat x and y as symbols, not variables
x = Symbol("x")
y = Symbol("y")
y = 3 * x**2 + 2 * x + 5      # symbolic equation of parabola
Dy = 6 * x + 2            # the derivative function gives the slope of the tangent at that point for every value of x
x_P = -5
y_P = y.subs(x, -5)
print(x_P, y_P)
plt.plot(x_P, y_P, color="purple", marker="o")    # plot point P on parabola

# point moves on parabola
eta = 0.3                     # proportionality factor
for c in ["lightgreen", "lightblue", "grey", "pink", "orange", "yellow"]:
    x_P = x_P - eta * Dy.subs(x, x_P)
    y_P = y.subs(x, x_P)
    print(x_P, y_P)
    plt.plot(x_P, y_P, color=c, marker="o")      # note the changing colors
    
# open drawing window
plt.show()

The yellow point is already quite close to the vertex, but there is still a bit to go!

### Assignment 4.1
- Adjust the proportionality factor `eta` to 0.1 and 0.5.
- Choose a value for the proportionality factor `eta` yourself.

### Challenge 4.2
- Try a very small value as well. You can use the copied script below for this.

In [None]:
# move point on parabola towards vertex, proportional to the slope of the tangent line

# GRAPH OF PARABOLA WITH GIVEN EQUATION

# choose the x-coordinates of the points that are being plotted
x = np.linspace(-9.5, 9.5, 50)
# equation of the parabola: y = 3x² + 2x + 5
# calculate the y-value for each x-coordinate
y = 3 * x**2 + 2 * x + 5      # relationship between x and y for concrete values of x

# plot parabola
# range and calibrate axes
plt.axis(xmin=-10, xmax=10, ymin=-5, ymax=100)
plt.xticks(np.arange(-10, 11, step=5))
plt.yticks(np.arange(-5, 100, step=10))
# plot parabola
plt.plot(x, y, color="blue", linewidth=1.0, linestyle="solid")

# point P on the parabola, P(-5, ...)
# treat x and y as symbols and not as variables
x = Symbol("x")
y = Symbol("y")
y = 3 * x**2 + 2 * x + 5      # symbolic equation of parabola
Dy = 6 * x + 2            # derivative function gives the slope of the tangent at that point for each value of x
x_P = -5
y_P = y.subs(x, -5)
print(x_P, y_P)
plt.plot(x_P, y_P, color="purple", marker="o")    # plot point P on parabola

# point moves on parabola
eta = 0.3                         # proportionality factor
for c in ["lightgreen", "lightblue", "grey", "pink", "orange", "yellow","lightgreen", "lightblue", "grey", "pink", "orange", "yellow", "lightgreen", "lightblue", "grey", "pink", "orange", "yellow"]:
    x_P = x_P - eta * Dy.subs(x, x_P)    
    y_P = y.subs(x, x_P)
    print(x_P, y_P)    
    plt.plot(x_P, y_P, color=c, marker="o")      # note the changing colors

# open drawing window
plt.show()

<div class="alert alert-box alert-info">
If the proportionality factor is too large, the point overshoots the minimum and may even move further and further away from it. If it is too small, the point needs a very large number of steps to get there. Therefore, it is very important to determine a suitable value for the proportionality factor. This proportionality factor is called the <b><em>learning rate</em></b>.</div>
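The effect of the learning rate can be checked numerically. The sketch below repeats the update rule from the cells above with plain numbers instead of SymPy symbols; the function name `gradient_descent` and the choice of 20 steps are assumptions made for this illustration only.

```python
def gradient_descent(x_start, eta, steps):
    """Return the x-value after `steps` updates x <- x - eta * (6x + 2)."""
    x_P = x_start
    for _ in range(steps):
        slope = 6 * x_P + 2        # derivative of y = 3x² + 2x + 5 at x_P
        x_P = x_P - eta * slope    # step proportional to the slope
    return x_P


# compare a very small, two moderate and a too large learning rate
for eta in [0.001, 0.1, 0.3, 0.4]:
    x_end = gradient_descent(-5, eta, 20)
    print(f"eta = {eta}: x after 20 steps = {x_end:.4f}")
```

With `eta = 0.001` the point has barely left its starting position after 20 steps, with `eta = 0.1` it is practically at the vertex x = -1/3, and with `eta = 0.4` every step lands further from the vertex than the previous one. For this particular parabola each step multiplies the distance to the vertex by 1 - 6·eta, so the point only approaches the vertex when eta is smaller than 1/3.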

### Second attempt
The script is slightly adjusted so that the number of steps is easy to change, in order to get all the way to the vertex.

Run the following code cell.

In [None]:
# point moving on parabola towards vertex, proportional to slope of tangent line

# GRAPH OF PARABOLA WITH GIVEN EQUATION

# choose the x-coordinates of the points being plotted
x = np.linspace(-9.5, 9.5, 50)
# equation of the parabola: y = 3x² + 2x + 5
# calculate the y-value for each x-coordinate
y = 3 * x**2 + 2 * x + 5      # relationship between x and y for specific values of x

# plot parabola
# range and calibration axes
plt.axis(xmin=-10, xmax=10, ymin=-5, ymax=100)
plt.xticks(np.arange(-10, 11, step=5))
plt.yticks(np.arange(-5, 100, step=10))
# plot parabola
plt.plot(x, y, color="blue", linewidth=1.0, linestyle="solid")

# point P on the parabola, P(-5, ...)
# treat x and y as symbols, not as variables
x = Symbol("x")
y = Symbol("y")
y = 3 * x**2 + 2 * x + 5      # symbolic equation of parabola
Dy = 6 * x + 2            # derivative function gives the slope of the tangent at that point for every value of x
x_P = -5
y_P = y.subs(x, -5)
print(x_P, y_P)
plt.plot(x_P, y_P, color="purple", marker="o")    # plot point P on parabola

# point moves on parabola
eta = 0.3                         # proportionality factor
for i in range(7):                                 # adjusted for more steps    
    x_P = x_P - eta * Dy.subs(x, x_P)
    y_P = y.subs(x, x_P)
    print(x_P, y_P)    
    plt.plot(x_P, y_P, color="red", marker="o")
    
# open drawing window
plt.show()

### Assignment 4.3
The vertex of this parabola is at (-1/3, ...). So you're almost there.<br>Adjust the code until you are close to (-0.3333...; ...).

In [None]:
# point moving on parabola towards the vertex, proportional to the slope of the tangent line

# GRAPH OF PARABOLA WITH GIVEN EQUATION

# select the x-coordinates of the points that are plotted
x = np.linspace(-9.5, 9.5, 50)
# equation of the parabola: y = 3x² + 2x + 5
# calculate the y-value for each x-coordinate
y = 3 * x**2 + 2 * x + 5      # relationship between x and y for concrete values of x

# plot parabola
# range and calibrate axes
plt.axis(xmin=-10, xmax=10, ymin=-5, ymax=100)
plt.xticks(np.arange(-10, 11, step=5))
plt.yticks(np.arange(-5, 100, step=10))
# plot parabola
plt.plot(x, y, color="blue", linewidth=1.0, linestyle="solid")

# point P on the parabola, P(-5, ...)
# treat x and y as symbols and not as variables
x = Symbol("x")
y = Symbol("y")
y = 3 * x**2 + 2 * x + 5      # symbolic equation of parabola
Dy = 6 * x + 2            # derivative function gives slope of tangent at that point for each value of x
x_P = -5
y_P = y.subs(x, -5)
print(x_P, y_P)
plt.plot(x_P, y_P, color="purple", marker="o")    # plot point P on parabola

# point moves on parabola
eta = 0.3                         # proportionality factor
for i in range(7):                                 # adjusted for more steps    
    x_P = x_P - eta * Dy.subs(x, x_P)
    y_P = y.subs(x, x_P)
    print(x_P, y_P)    
    plt.plot(x_P, y_P, color="red", marker="o")
    
# open drawing window
plt.show()

Answer:

<div>
    <font color=#690027 markdown="1">
<h2>5. Exercise: Starting point to the right of the vertex</h2>    </font>
</div>

Now do the same, starting from a point Q(4, ...) to the right of the vertex of the parabola. Adapt the script accordingly.

<div>
    <font color=#690027 markdown="1">
<h2>6. Exercise: Cubic Function</h2>    </font>
</div>

Use the *gradient descent* method to move a chosen point to the local minimum of the cubic curve $$k \leftrightarrow  y = -2 x^{3} +10 x^{2} -5 x + 10.$$

Answer:

<div>
    <font color=#690027 markdown="1">
<h2>7. Exercise: Fourth Degree Function</h2>    </font>
</div>

Use the *gradient descent* method to move a chosen point to the absolute minimum of the fourth-degree curve $$k \leftrightarrow  y = 3 x^{4} - 28 x^{3} + 84 x^{2} - 96x + 70.$$

Answer:

<div>
<h2>With support from</h2></div>

<img src="images/kikssteun.png" alt="Banner" width="1100"/>

<img src="images/cclic.png" alt="Banner" align="left" width="100"/><br><br>
Notebook KIKS, see <a href="http://www.aiopschool.be">AI At School</a>, by F. wyffels & N. Gesquière, is licensed under a <a href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.