# Basic Gradient Descent

Gradient descent is an iterative algorithm for minimizing a differentiable function: starting from an initial guess, it repeatedly takes a small step in the direction of the negative gradient.  Here is an example, implemented in Python, from <a href="https://en.wikipedia.org/wiki/Gradient_descent#Computational_examples">https://en.wikipedia.org/wiki/Gradient_descent#Computational_examples</a>.  The program finds the minimum of

$$ f(x) = x^4 - 3x^3 + 2 $$
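
Each iteration moves the current estimate against the derivative, scaled by a step-size multiplier $\gamma$ (the variable `gamma` in the code below; the subscript notation $x_n$ is introduced here only for illustration):

$$ x_{n+1} = x_n - \gamma f'(x_n) $$

The iteration stops once the step size $|x_{n+1} - x_n|$ falls below the chosen precision.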

In [0]:
# From calculation, it is expected that the local minimum occurs at x=9/4

cur_x = 6 # The algorithm starts at x=6
gamma = 0.01 # step size multiplier
precision = 0.00001
previous_step_size = cur_x

def df(x):
    return 4 * x**3 - 9 * x**2

while previous_step_size > precision:
    prev_x = cur_x
    cur_x += -gamma * df(prev_x)
    previous_step_size = abs(cur_x - prev_x)

print("The local minimum occurs at %f" % cur_x)
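
The expected value stated in the comment above can be checked by hand. Setting the derivative to zero gives

$$ f'(x) = 4x^3 - 9x^2 = x^2(4x - 9) = 0 \quad\Longrightarrow\quad x = 0 \ \text{ or } \ x = \tfrac{9}{4} $$

The second derivative, $f''(x) = 12x^2 - 18x$, equals $20.25 > 0$ at $x = 9/4$, so that point is a local minimum. At $x = 0$ the second derivative vanishes and $f'$ does not change sign, so $x = 0$ is an inflection point rather than a minimum.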

## Import modules

In [0]:
import numpy as np
import scipy as sp
import matplotlib
import matplotlib.pyplot as plt

The following line is a Jupyter magic command that tells matplotlib to render plots inline within the Jupyter notebook.

In [0]:
%matplotlib inline

Define the function $f(x)$ from above:

In [0]:
def f(x):
    return  x**4 - 3 * x**3+2

In order to plot this function, we generate a set of (x, y) coordinates on the graph.  Before doing this, let's look at how arithmetic and functions apply element-wise to a NumPy array.

In [0]:
x = np.linspace(0,5,11)
print(x)

In [0]:
x**2

In [0]:
np.sin(x)

In [0]:
x = np.linspace(0, 4, 1000)

In [0]:
y = f(x)

In [0]:
f(2)

In [0]:
f(a)  # raises NameError at this point: a is not defined until the SymPy cells below

In [0]:
from sympy import Symbol  # import only what is used, rather than a wildcard import

In [0]:
a = Symbol('a')

In [0]:
f(a)
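
Because `f` accepts a symbolic argument, SymPy can also differentiate it, which offers one way to check the hand-written derivative `df` used in the descent loop. The following is a minimal sketch; it imports `sympy` under its own name (an assumption made here so it does not clash with `sp`, which this notebook uses for SciPy), and the names `f_prime` and `critical_points` are illustrative.

```python
import sympy

def f(x):
    return x**4 - 3 * x**3 + 2

a = sympy.Symbol('a')

# Differentiate f symbolically with respect to a
f_prime = sympy.diff(f(a), a)
print(f_prime)

# Solve f'(a) = 0 to find the critical points of f
critical_points = sympy.solve(f_prime, a)
print(critical_points)
```

The critical points should include $9/4$, the value the gradient descent loop is expected to approach.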

In [0]:
matplotlib.rcParams.update({'font.size': 18, 'text.usetex': True})
fig = plt.figure()
axes = fig.add_axes([0.1, 0.1, 0.9, 0.7]) # left, bottom, width, height (range 0 to 1)
axes.plot(x,y, 'g', label=r'$y = x^4-3x^3+2$') # g for green
axes.legend(loc=2); # upper left corner
axes.set_xlabel('$x$')
axes.set_ylabel('$y$') ;
axes.set_title('$y = x^4 - 3  x^3+2$');

In [0]:
cur_x = 6 # The algorithm starts at x=6
gamma = 0.01 # step size multiplier
precision = 0.00001
previous_step_size = cur_x

x_list = [cur_x]; y_list = [f(cur_x)]

def df(x):
    return 4 * x**3 - 9 * x**2

while previous_step_size > precision:
    prev_x = cur_x
    cur_x += -gamma * df(prev_x)
    previous_step_size = abs(cur_x - prev_x)
    x_list.append(cur_x)
    y_list.append(f(cur_x))

print("Local minimum occurs at:", cur_x)
print("Number of steps:", len(x_list))
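
The loop above can also be wrapped in a reusable function, which makes it easier to try other starting points or step sizes. This is only a sketch: the function name `gradient_descent` and the `max_steps` safeguard are assumptions added here, not part of the original example. The cap on iterations guards against a step size that never drops below the precision.

```python
def df(x):
    # Derivative of f(x) = x**4 - 3*x**3 + 2
    return 4 * x**3 - 9 * x**2

def gradient_descent(df, x0, gamma=0.01, precision=0.00001, max_steps=10000):
    """Run 1D gradient descent from x0 and return (final x, list of visited x)."""
    x = x0
    path = [x]
    for _ in range(max_steps):  # hypothetical safeguard against non-termination
        step = gamma * df(x)
        x = x - step
        path.append(x)
        # Stop once the most recent step is no larger than the precision
        if abs(step) <= precision:
            break
    return x, path

x_min, path = gradient_descent(df, x0=6)
print("Local minimum occurs at:", x_min)
print("Number of points visited:", len(path))
```

Returning the full path keeps the visited points available for plotting, in the same way `x_list` is used above.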

In [0]:
fig = plt.figure()
axes = fig.add_axes([0.1, 0.1, 0.9, 0.7]) # left, bottom, width, height (range 0 to 1)
axes.plot(x,y, 'g', label=r'$y = x^4-3x^3+2$') # g for green
axes.scatter(x_list,y_list,c="r")
axes.plot(x_list,y_list,c="r",label='gradient descent steps')
axes.legend(loc=2); # upper left corner
axes.set_xlabel('$x$')
axes.set_ylabel('$y$') ;

In [0]:
xzoom = np.linspace(0,2.5,100)
yzoom=f(xzoom)


fig = plt.figure()
axes = fig.add_axes([0.1, 0.1, 0.9, 0.7]) # left, bottom, width, height (range 0 to 1)
axes.plot(xzoom,yzoom, 'g', label=r'$y = x^4-3x^3+2$') # g for green
axes.scatter(x_list[8:],y_list[8:],c="r")
axes.plot(x_list[8:],y_list[8:],c="r",label='gradient descent steps')
axes.legend(loc=3); # lower left corner
axes.set_xlabel('$x$')
axes.set_ylabel('$y$') ;

## Two-dimensional example

In [0]:
def g(x,y):
    return (x-2)**2*(y-1)**2

In [0]:
def grad_g(x,y):
    return np.array([2*(x-2)*(y-1)**2, (x-2)**2*2*(y-1)])
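
A hand-derived gradient is easy to get wrong, so it can be worth comparing it against a numerical approximation before using it. The sketch below uses central finite differences; the helper `numerical_grad` and its `eps` parameter are hypothetical names introduced here for illustration.

```python
import numpy as np

def g(x, y):
    return (x - 2)**2 * (y - 1)**2

def grad_g(x, y):
    return np.array([2 * (x - 2) * (y - 1)**2, (x - 2)**2 * 2 * (y - 1)])

def numerical_grad(func, x, y, eps=1e-6):
    """Approximate the gradient of func at (x, y) with central differences."""
    dx = (func(x + eps, y) - func(x - eps, y)) / (2 * eps)
    dy = (func(x, y + eps) - func(x, y - eps)) / (2 * eps)
    return np.array([dx, dy])

analytic = grad_g(3.0, 3.0)
numeric = numerical_grad(g, 3.0, 3.0)
print("analytic:", analytic)
print("numeric: ", numeric)
print("agree:", np.allclose(analytic, numeric, atol=1e-5))
```

If the two results disagree by more than a small tolerance, the analytic gradient most likely contains an error.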

In [0]:
h = 0.1 # step size (must be defined before it is used here)
[a, b] = h * grad_g(3, 3)
print([a, b])

In [0]:
x_old, y_old = 1,2 
h = 0.1 # step size
precision = 0.00001

x_list = [x_old]; y_list = [y_old]; z_list= [g(x_old,y_old)]

[x_new, y_new] = [x_old, y_old] - h * grad_g(x_old,y_old)

while (abs(x_new - x_old)+abs(y_new - y_old)) > precision:
    x_old, y_old = x_new, y_new
    direction = - grad_g(x_old,y_old)
    [x_new, y_new] = [x_old,y_old] + h * direction
    x_list.append(x_new)
    y_list.append(y_new)
    z_list.append(g(x_new,y_new))
print("Local minimum occurs at:", x_new, y_new)
print("Number of steps:", len(z_list))

In [0]:
g(1.97076954928, 1.02923045072)