## Matrix Calculus and Optimization
---
Agenda
> - Calculus in higher dimensions: Continues on iPad

> - Gradient Descent Algorithm

In [1]:
import numpy as np

### Gradient Descent Algorithm

Find a local minimum of a function $y=f(x)$ iteratively, starting from an initial guess $x=x_0$ for the location of the minimum.

Start with an initial guess $x_{current} = x_0$ (for example, chosen at random) and repeat until convergence:
<div class='eqnbox'>
$$ \large
x_{next}= x_{current} - \alpha f'(x_{current})
 $$
</div>

where $\alpha > 0$ is called the 'learning rate'; it controls the size of each step.
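As a quick illustration of the update rule, the following sketch applies it step by step to $f(x) = x^2 - 2x + 5$, using the exact derivative $f'(x) = 2x - 2$. The names `f_prime`, `history`, and the choices $x_0 = 0$ and $\alpha = 0.1$ are assumptions made for this example only.

```python
# Minimal sketch of the update rule x_next = x_current - alpha * f'(x_current).
# Assumed example: f(x) = x**2 - 2*x + 5, whose exact derivative is 2*x - 2.
def f_prime(x):
    return 2 * x - 2

alpha = 0.1    # learning rate (assumed value)
x = 0.0        # starting point x_0 (assumed value)
history = [x]  # iterates x_0, x_1, ...
for step in range(5):
    x = x - alpha * f_prime(x)   # one gradient descent step
    history.append(x)
    print("step {}: x = {:.5f}".format(step + 1, x))
```

For this particular function each step multiplies the distance to the minimizer $x = 1$ by $1 - 2\alpha = 0.8$, so the iterates approach $1$ geometrically.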



In [2]:
# Define a function that approximates the derivative of f at a given point
def derivative(f, x_value):
    '''
    Arguments:
    f: a function of one variable (e.g. defined with lambda)
    x_value: the point at which the derivative is approximated
    Return: The approximate derivative of the function f at x_value
    '''
    h=0.01
    # Numerical differentiation of f at x_value by the central difference formula
    df_central_dif = (f(x_value+h) - f(x_value-h)) /(2*h)
    return df_central_dif

In [16]:
# Example: a one-line (lambda) definition of f(x) = x^2 - 2x + 5
squareFunc = lambda x: x**2 - 2*x+5
print("The derivative of the function at {} = {}".format(1.0,derivative(squareFunc,1.0)))
print("The derivative of the function at {} = {}".format(2.0,derivative(squareFunc,2.0)))

The derivative of the function at 1.0 = 0.0
The derivative of the function at 2.0 = 1.9999999999999574
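The small deviation from $2$ in the second output is floating-point rounding: in exact arithmetic the central difference is exact for quadratics, since $((x+h)^2 - (x-h)^2)/(2h) = 2x$. For other functions there is a truncation error of order $h^2$. The sketch below compares the central difference with a forward difference on an assumed cubic; the helper names `forward_difference` and `central_difference` are hypothetical and not part of the notebook.

```python
# Sketch: compare forward and central differences on a cubic (assumed example).
def forward_difference(f, x, h=0.01):
    # one-sided formula, truncation error of order h
    return (f(x + h) - f(x)) / h

def central_difference(f, x, h=0.01):
    # symmetric formula, truncation error of order h**2
    return (f(x + h) - f(x - h)) / (2 * h)

cubic = lambda x: 2*x**3 - 9*x**2 + 12*x
cubic_prime = lambda x: 6*x**2 - 18*x + 12   # exact derivative

x = 1.2
exact = cubic_prime(x)
err_forward = abs(forward_difference(cubic, x) - exact)
err_central = abs(central_difference(cubic, x) - exact)
print("forward difference error:", err_forward)
print("central difference error:", err_central)
```

With $h = 0.01$ the central difference error here is about $2h^2$, roughly two orders of magnitude smaller than the forward difference error.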


In [5]:
def gradient_descent(f, x0=0.0, delta=0.000001, max_iter=100, alpha=0.1):
    '''
    This function uses the gradient descent method to find a local minimum point of a function
    and the value there. For a convex function, a local minimum is also a global minimum.
    Arguments:
    f: The function whose minimum is to be found, e.g. passed as an anonymous (lambda) function
    x0: Initial guess for the minimum point
    delta: tolerance; iteration stops once successive iterates differ by less than delta
    max_iter: maximum number of iterations to perform
    alpha: the learning rate in the gradient descent algorithm
    Returns:
    x_min: the approximate point of local minimum found
    f(x_min): the value of the function at x_min
    '''
    x_current = x0
    for i in range(max_iter):
        # Find the derivative or approximate derivative at x_current
        df = derivative(f, x_current)
        # The gradient descent step
        x_next = x_current - alpha*df
        # Stop iterating once desired accuracy is achieved
        if np.abs(x_next - x_current) < delta:
            break
        x_current  = x_next
        #print("\n Iteration {0:d}: {1:.8f}".format(i+1,x_current))
    return x_current, f(x_current)

In [6]:
# Example 1
squareFunc = lambda x: x**2 - 2*x+5
x_min, min_value = gradient_descent(squareFunc)
print("The minimum of the given function is f({}) = {} ".format(x_min, min_value))

The minimum of the given function is f(0.9999953231947645) = 4.000000000021872 
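The learning rate matters for convergence. For $f(x) = x^2 - 2x + 5$ each step multiplies the distance to the minimizer $x = 1$ by $1 - 2\alpha$, so the iteration converges only for $0 < \alpha < 1$. The sketch below illustrates this with the exact derivative; `run_steps` is a hypothetical helper, and the three values of $\alpha$ are chosen only for illustration.

```python
# Sketch: effect of the learning rate alpha on f(x) = x**2 - 2*x + 5.
# Uses the exact derivative 2*x - 2; run_steps is a hypothetical helper.
def run_steps(alpha, x0=0.0, n_steps=20):
    x = x0
    for _ in range(n_steps):
        x = x - alpha * (2 * x - 2)   # gradient descent step
    return x

for alpha in (0.1, 0.5, 1.1):
    print("alpha = {}: x after 20 steps = {:.6g}".format(alpha, run_steps(alpha)))
```

With $\alpha = 0.1$ the iterates approach $1$ slowly, with $\alpha = 0.5$ they land on $1$ after a single step, and with $\alpha = 1.1$ they oscillate around $1$ with growing amplitude.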


In [25]:
# Example 2: this function is not convex, so gradient descent can find only a local minimum
cubicFunc = lambda x: 2*x**3 - 9* x**2 + 12*x
x_min, min_value = gradient_descent(cubicFunc,x0=1.2, alpha=0.01)
print("The minimum of the given function is f({}) = {} ".format(x_min, min_value))


 Iteration 1: 1.20959800

 Iteration 2: 1.21953600

 Iteration 3: 1.22981440

 Iteration 4: 1.24043238

 Iteration 5: 1.25138786

 Iteration 6: 1.26267738

 Iteration 7: 1.27429606

 Iteration 8: 1.28623752

 Iteration 9: 1.29849386

 Iteration 10: 1.31105558

 Iteration 11: 1.32391158

 Iteration 12: 1.33704915

 Iteration 13: 1.35045397

 Iteration 14: 1.36411013

 Iteration 15: 1.37800017

 Iteration 16: 1.39210513

 Iteration 17: 1.40640465

 Iteration 18: 1.42087704

 Iteration 19: 1.43549942

 Iteration 20: 1.45024780

 Iteration 21: 1.46509728

 Iteration 22: 1.48002219

 Iteration 23: 1.49499624

 Iteration 24: 1.50999274

 Iteration 25: 1.52498475

 Iteration 26: 1.53994529

 Iteration 27: 1.55484756

 Iteration 28: 1.56966506

 Iteration 29: 1.58437187

 Iteration 30: 1.59894275

 Iteration 31: 1.61335337

 Iteration 32: 1.62758043

 Iteration 33: 1.64160183

 Iteration 34: 1.65539676

 Iteration 35: 1.66894587

 Iteration 36: 1.68223131

 Iteration 37: 1.69523682

 Iteratio
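The iterates printed above move from $x_0 = 1.2$ toward $x = 2$. Since $f'(x) = 6(x-1)(x-2)$, the cubic has a local maximum at $x = 1$ and a local minimum at $x = 2$, and it is unbounded below as $x \to -\infty$. The following sketch, which uses the exact derivative and a hypothetical helper `descend`, suggests how the outcome depends on which side of $x = 1$ the iteration starts.

```python
# Sketch: for the non-convex cubic, the result depends on the starting point.
# Exact derivative of 2*x**3 - 9*x**2 + 12*x is 6*(x - 1)*(x - 2).
def cubic_prime(x):
    return 6 * (x - 1) * (x - 2)

def descend(x0, alpha=0.01, n_steps=10):
    # hypothetical helper: a fixed number of gradient descent steps
    x = x0
    for _ in range(n_steps):
        x = x - alpha * cubic_prime(x)
    return x

x_right = descend(1.2, n_steps=1000)  # starts to the right of the local maximum at x = 1
x_left = descend(0.9, n_steps=10)     # starts to the left of it
print("from x0 = 1.2:", x_right)
print("from x0 = 0.9 after 10 steps:", x_left)
```

Starting at $1.2$ the iterates settle at the local minimum $x = 2$, while starting at $0.9$ they drift left, away from it, because the function keeps decreasing in that direction.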

**HW Problem**: Modify the provided code to find the minimum of a function of two variables. Show the output for the function
$$
f(x_1, x_2) = x_1^2+x_2^2-2x_1+4x_2+8
$$
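
As a possible starting point, and not necessarily the intended solution, the one-variable update can be generalized by replacing the derivative with the gradient. Completing the square also gives a closed form that can serve as a check on the numerical output:

$$
\begin{aligned}
\mathbf{x}_{next} &= \mathbf{x}_{current} - \alpha \nabla f(\mathbf{x}_{current}), \qquad
\nabla f(x_1, x_2) = \begin{pmatrix} 2x_1 - 2 \\ 2x_2 + 4 \end{pmatrix}, \\
f(x_1, x_2) &= (x_1 - 1)^2 + (x_2 + 2)^2 + 3 .
\end{aligned}
$$

From the second line, a correct implementation should approach the point $(1, -2)$, where the function value is $3$.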