# Gradient descent basics.

Gradient descent is a general-purpose algorithm for solving optimization problems.  We're going to use it a lot this semester.

The optimization problem we want to solve is: given a function, what input value will minimize the output of the function?

In this notebook we'll look at a single-variable function.  For gradient descent, we'll need to be able to find the derivative of the function at a certain input value.

We'll write code to compute the derivative of a function using two different methods: the analytical method and the numerical method.

Then, we'll use this code to perform gradient descent, and see how the learning rate parameter of gradient descent works.
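As a rough sketch of the idea (the symbols $x_t$ for the current guess and $\eta$ for the learning rate are notation assumed here, not taken from the notebook), each step of gradient descent moves the input a small distance against the derivative:

$$x_{t+1} = x_t - \eta \, f'(x_t)$$

When the derivative is positive the input decreases, and when it is negative the input increases, so in both cases the step heads downhill.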

V0.0   author: Glenn Bruns

### Instructions:

Read through the code, then enter code in the cells below each numbered problem.

The instructions are not detailed.  I expect you to think and to use good judgement.

Only insert code where you see # YOUR CODE HERE.  

Please restart your notebook and run it from top to bottom before submitting.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
sns.set()
sns.set_context('notebook')
sns.set_style('whitegrid')

### Define and plot a function

Function f is our example function.  From the plot below you can see that the minimum of the function is at an input value of about 5.5 (at least when considering inputs between 0 and 10).

In [None]:
def f(x):
    return (x - 5)**2 + 3*np.sin(2*x)

In [None]:
xs = np.linspace(0, 10, 100)
ys = f(xs)
plt.plot(xs, ys);

### Problem 1.  Write a function that gives the derivative of function f.

Use the analytic (exact) method.  Create the function by first figuring out the derivative of f(x) using calculus.

Remember that the derivative of sin(x) is cos(x).
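To illustrate the analytic method on a different function, here is a minimal sketch.  The function g and its derivative dg are hypothetical examples and are not part of the assignment; they only show how the chain rule turns into code when a trig function has an inner term such as 3x.

```python
import math

# Hypothetical example function (not the notebook's f): g(x) = x^2 + cos(3x)
def g(x):
    return x**2 + math.cos(3 * x)

# Analytic derivative of g, term by term:
#   d/dx x^2     = 2x
#   d/dx cos(3x) = -sin(3x) * 3   (chain rule: the inner function 3x has derivative 3)
def dg(x):
    return 2 * x - 3 * math.sin(3 * x)

print(f'dg/dx (0.5): {dg(0.5):0.4g}')
```

The same pattern applies to any function built from polynomials and trig terms: differentiate each term on paper, then translate the result into a Python expression.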

In [None]:
def df(x):
    # YOUR CODE HERE

In [None]:
print(f'df/dx (0.5): {df(0.5):0.4g}')
print(f'df/dx (0.3): {df(0.3):0.4g}')

### Problem 2. Using df(), plot the derivative of f.

Use the array xs that was defined earlier.  

In [None]:
# YOUR CODE HERE

### Problem 3.  Write a function that gives the derivative of _any_ function

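Use the numerical method with a 2-sided approximation.  As the test cell below shows, deriv(f) should return a function, so that deriv(f)(x) gives the approximate derivative of f at x.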
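For intuition about why a 2-sided approximation is preferred, here is a small sketch comparing it with a 1-sided one.  The helper names forward_diff and central_diff are made up for this illustration, and they take the point x directly instead of returning a function, so this is not the deriv() the problem asks for.

```python
import math

def forward_diff(g, x, h):
    # One-sided approximation: slope of the line through x and x + h
    return (g(x + h) - g(x)) / h

def central_diff(g, x, h):
    # Two-sided approximation: slope of the line through x - h and x + h
    return (g(x + h) - g(x - h)) / (2 * h)

# math.exp is its own derivative, so the exact slope at x = 0 is 1
h = 1e-3
forward_err = abs(forward_diff(math.exp, 0.0, h) - 1.0)
central_err = abs(central_diff(math.exp, 0.0, h) - 1.0)
print(f'forward error: {forward_err:0.3g}')
print(f'central error: {central_err:0.3g}')
```

For the same h, the two-sided error should come out much smaller than the one-sided error.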

In [None]:
def deriv(f, h=1e-5):
    # YOUR CODE HERE

In [None]:
print(f'df/dx (0.5): {deriv(f)(0.5):0.3g}')
print(f'df/dx (0.3): {deriv(f)(0.3):0.3g}')

### Problem 4. Using functions deriv() and f(), plot the derivative of f.

Use the array xs that was defined earlier.  

In [None]:
# YOUR CODE HERE

### Problem 5.  Compare the analytic and numerical versions of the derivative of function f().

Plot the analytic derivative and the numerical derivative.

Use functions f(), df(), and deriv() to create the data for the plot.

In [None]:
# YOUR CODE HERE

### Problem 6.  Repeat problem 5, but this time use a value of 0.5 for parameter h of function deriv().

The goal is to see the impact of a change in h.
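One way to reason about the effect of h, assuming the function is smooth enough to have a Taylor expansion around x (an assumption added here), is to look at the error term of the 2-sided approximation:

$$\frac{f(x+h) - f(x-h)}{2h} = f'(x) + \frac{h^2}{6} f'''(x) + O(h^4)$$

The leading error term scales with $h^2$, so increasing h from 1e-5 to 0.5 can make the approximation visibly worse wherever the third derivative is not close to zero.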

In [None]:
# YOUR CODE HERE

### Problem 7.  Define a function to perform gradient descent.

Here we will use a numeric approach.  Use your function deriv() in your code.

Add code only where shown with # YOUR CODE HERE.

Your code will contain a loop that iterates max_iterations times.

Don't forget your return statement.

In [None]:
def argmin(f, learning_rate=0.1, max_iterations=1000, random_state=0):
    
    """ Find the argmin of function f using gradient descent.  
    Start at a random point between 0 and 1.
    """
    
    # start at a random point between 0 and 1
    np.random.seed(random_state)
    x = np.random.rand()
    
    # YOUR CODE HERE

Test argmin().

In [None]:
x = argmin(f)

In [None]:
print(f'x: {x:0.3g}, f(x): {f(x):0.3g}')

Plot function f, and draw a vertical line at value x.  The red line shows the input that argmin() computed, and should ideally be the input that gives the minimum value of f().

In [None]:
ys = f(xs)
plt.plot(xs, ys)
plt.axvline(x, color='red');

Try again, with a smaller learning rate.

In [None]:
x = argmin(f, learning_rate=0.0001)

ys = f(xs)
plt.plot(xs, ys)
plt.axvline(x, color='red');

Try again, with a larger learning rate.

In [None]:
x = argmin(f, learning_rate=0.2)

ys = f(xs)
plt.plot(xs, ys)
plt.axvline(x, color='red');

Try again, with an even larger learning rate.

In [None]:
x = argmin(f, learning_rate=0.5)

ys = f(xs)
plt.plot(xs, ys)
plt.axvline(x, color='red');
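To see the effect of the learning rate in isolation, it can help to try a toy function whose behavior can be worked out by hand.  The sketch below is only an illustration: the helper descend() and the function g(x) = x^2 are hypothetical, and it uses an exact derivative supplied by the caller, unlike argmin(), which uses deriv().

```python
def descend(dg, x0, learning_rate, n_steps):
    """Hypothetical helper: run n_steps of gradient descent from x0,
    using a derivative function dg supplied by the caller."""
    x = x0
    for _ in range(n_steps):
        x = x - learning_rate * dg(x)   # step against the slope
    return x

# Toy function g(x) = x^2 has derivative 2x and its minimum at x = 0
def dg(x):
    return 2 * x

for lr in [0.0001, 0.1, 1.1]:
    x_final = descend(dg, x0=1.0, learning_rate=lr, n_steps=100)
    print(f'learning rate {lr}: final x = {x_final:0.3g}')
```

For this particular g, each step multiplies x by (1 - 2 * learning_rate).  A tiny rate barely moves x in 100 steps, a moderate rate shrinks x toward 0, and a rate above 1 makes the factor larger than 1 in magnitude, so x grows instead of converging.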

### Problem 8.  Summary

Write a paragraph or two about what you learned about derivatives and gradient descent from this notebook.

Be thoughtful.  Collect your ideas before writing.

*Replace this text with your thoughts.*