# Gradient Descent Lab

We have coded examples of gradient descent for two and three variables in class and discussion. Today, we will attempt to code gradient descent for a more general function. That is, given some random, differentiable function from $\mathbb{R}^n\rightarrow \mathbb{R}$, our function will run gradient descent and attempt to find the minimum value of the function, if it exists. Recall, the formula for gradient descent is given by

$x\leftarrow x-\eta \nabla f(x)$.

I have given you code for a function of 4 variables that has a unique minimum, for you to test your code on.

In [16]:
import numpy as np
#x is a numpy array
def f(x):
    return x[0]**2+x[1]**2+x[2]**2+x[3]**2+x[4]**2
#on your code adjust so each variable is squared
#This piece of code will take in a function, and give the number of indices in its input
def varfunc(g):
    i=1
    while True:
        try:
            array=np.zeros(i)
            g(array)
        except:
            pass
        else: 
            break
        i=i+1
    return i
def gradient(f,x,length,h=.00001):
    #Takes in function, point where the gradient is desired, the number of indices in that point, and 
    # a small number h used to estimate the partial derivative
    
    #returns the estimated gradient at a point
    gradientvector=np.zeros(length,dtype=np.float)
    for i in range(length):
        y=x.copy()
        y[i]=y[i]+h
        gradientvector[i]=(f(y)-f(x))/h
        
    return gradientvector
def gradientdescent(f,numsteps=20000,learn_par=.001):
    # Takes in a function, the number of steps and a learning parameter
    
    #randomly choose a starting vector using the normal distribution
    #Repeatedly iterate using gradient descent
    #store each point in x_val
    
    #output the final last point, the function value, and the distance between the last point, and the second to the last point
    length=varfunc(f)
    x_vals=[]
    x0=np.random.normal(0,1,length)
    for i in range(numsteps):
        grad=gradient(f,x0,length)
        x0=x0-learn_par*grad
        x_vals.append(x0)
    
    
    return x0,f(x0),np.linalg.norm(x_vals[-1]-x_vals[-2])

In [18]:
print(gradientdescent(f))

(array([ -5.00000000e-06,  -5.00000000e-06,  -5.00000000e-06]), 7.499999999989098e-11, 1.442445132346284e-20)


$f(x,y,z,w)=x^2+y^2+z^2+w^2$

$\nabla f=[2x,2y,2z,2w]$

To find the partial derivative with respect to $x_i$, we can estimate it by $(f(x+he_i)-f(x))/h$

x=[1,2,3]

$x+he_1$=[1+h,2,3]

$\nabla f(x)=\langle \frac{\partial f}{\partial x_i}\rangle$

if we are dealing with 4 variables, and $x=[x_1,x_2,x_3,x_4]$, then the gradient is estimated by $[(f([x_1+h,x_2,x_2,x_4])-f([x_1,x_2,x_3,x_4]))/h,\dots,(f([x_1,x_2,x_2,x_4+h])-f([x_1,x_2,x_3,x_4]))/h]$