<a href="https://colab.research.google.com/github/johanhoffman/DD2363-VT19/blob/tobzed/Lab-7/tedwards_lab7.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Lab 7: Optimization and learning**
**Tobias Edwards**
9th of March 2019


# **Abstract**

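The focus of this lab was function minimization. Two methods were implemented and tested: gradient descent, with step sizes chosen by the Barzilai & Borwein method, and Newton's method.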

# **About the code**

In [0]:
"""DD2363 Methods in Scientific Computing, """
"""KTH Royal Institute of Technology, Stockholm, Sweden."""

# Code by Tobias Edwards (tedwards@kth.se)


# **Set up environment**

In [0]:
# Load necessary modules.
from google.colab import files
from math import *
import numpy as np
import unittest

# **Introduction**

Given an objective function $f:D \rightarrow \mathbb{R}$, where $D$ is some predefined domain, we seek an element $x^* \in D$ such that 
$f(x^*) \leq f(x) \quad \forall x \in D$. The methods implemented below are motivated by the fact that a minimum of a differentiable $f$ is located either on the domain boundary *or* at a point where $\nabla f = 0$. The methods follow gradients and stop when the norm of the gradient falls below a given tolerance.

One way of approximating the minimum of $f$ is by the gradient descent method. This method, as per the name, iteratively steps along the negative gradient. 
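Written out, one gradient descent iteration takes the following form. The superscript $(k)$ for the iteration index and the symbol $\alpha^{(k)}$ for the step size are notation assumed here for illustration:

$$x^{(k+1)} = x^{(k)} - \alpha^{(k)} \nabla f(x^{(k)}), \qquad \alpha^{(k)} > 0$$

The iteration stops once $\|\nabla f(x^{(k)})\|$ is smaller than the chosen tolerance.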

Another method is the familiar Newton's method. However, instead of approximating roots of $f$ using the Jacobian $J(f)$, we approximate roots of $\nabla f$, replacing $J(f)$ with the Hessian of $f$, defined as $H(f) = J(\nabla f)$, i.e., $H$ is the matrix of second partial derivatives of $f$.
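As a sketch of where the update comes from (the iteration index $(k)$ and the symbol $\Delta x$ are notation assumed here), linearize the gradient around the current iterate and require the linearization to vanish:

$$\nabla f(x^{(k)} + \Delta x) \approx \nabla f(x^{(k)}) + H(f)(x^{(k)}) \, \Delta x = 0$$

This gives a linear system for the step, followed by the update:

$$H(f)(x^{(k)}) \, \Delta x = -\nabla f(x^{(k)}), \qquad x^{(k+1)} = x^{(k)} + \Delta x$$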


# **Results**


### Assignment 1. Gradient Descent

Note that I used the Barzilai & Borwein method to choose each step size. The method is explained in detail [here](http://pages.cs.wisc.edu/~swright/726/handouts/barzilai-borwein.pdf).
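The step size used in this lab can be written as $\alpha = \frac{\Delta g^T \Delta x}{\Delta g^T \Delta g}$, with $\Delta x$ the difference between the two latest iterates and $\Delta g$ the difference between their gradients. The following is a minimal sketch of one such step on the quadratic used in the tests. The helper name `bb_step` is hypothetical and only mirrors the formula; it omits the zero-denominator guard.

```python
import numpy as np

def bb_step(x_prev, x_curr, grad_f):
    """Hypothetical helper: Barzilai-Borwein step (dg . dx) / (dg . dg)."""
    dx = x_curr - x_prev
    dg = grad_f(x_curr) - grad_f(x_prev)
    return dg.dot(dx) / dg.dot(dg)

# Gradient of f(x) = (x0 - 1)^2 + (x1 + 2)^2 - 3, minimized at (1, -2).
grad_f = lambda x: 2.0 * (x - np.array([1.0, -2.0]))

x_prev = np.array([10.0, 3.0])
x_curr = x_prev - 1.0 * grad_f(x_prev)    # first step with alpha = 1
alpha = bb_step(x_prev, x_curr, grad_f)   # step size for the second step
x_next = x_curr - alpha * grad_f(x_curr)  # second step

print(alpha)   # 0.5
print(x_next)  # the minimizer (1, -2)
```

For this particular function the Hessian is $2I$, so the formula returns $1/2$, and a single step with that size lands on the minimizer.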



In [0]:
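def barzilai_borwein(x_prev,x_curr,grad_f,alpha,beta):
    # Barzilai-Borwein step size: (delta_grad . delta_x) / (delta_grad . delta_grad)
    delta_grad = grad_f(x_curr) - grad_f(x_prev)
    delta_x = x_curr - x_prev
    denom = delta_grad.dot(delta_grad)
    numer = delta_grad.dot(delta_x)
    if denom == 0:
        # gradient did not change: fall back to shrinking the previous step
        return alpha*beta
    return numer/denom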

def gradient_descent(f,grad_f,x_current,TOL):
    alpha = 1 # initial step size
    beta = 0.9 # shrink factor used when the Barzilai-Borwein step is undefined
    max_iter = 1000
    n_iter = 0 # named n_iter to avoid shadowing the built-in iter()
    while np.linalg.norm(grad_f(x_current)) >= TOL and n_iter < max_iter:
        old_x = x_current
        x_current = old_x - alpha*grad_f(old_x) # step along the negative gradient
        alpha = barzilai_borwein(old_x,x_current,grad_f,alpha,beta)
        n_iter += 1
    return x_current

### Assignment 2. Newton's Method

In [0]:
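def newtons_method(f,grad_f,Hf,x_curr,TOL):
    x_curr = np.array(x_curr, dtype=float) # float copy, so the caller's array is not modified
    max_iter = 1000 # guard against non-convergence
    n_iter = 0
    while np.linalg.norm(grad_f(x_curr)) >= TOL and n_iter < max_iter:
        # solve Hf(x) * x_delta = -grad_f(x) for the Newton step
        x_delta = np.linalg.solve(Hf(x_curr),-grad_f(x_curr))
        x_curr = x_curr + x_delta
        n_iter += 1
    return x_curr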
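To see the iteration take more than one step, it can help to try a function that is not quadratic. The following is a minimal sketch under stated assumptions: the function $f(x, y) = e^x - x + y^2$ (minimum at the origin) and the helper name `newton_sketch` are chosen here for illustration and are not part of the lab code.

```python
import numpy as np

def newton_sketch(grad_f, hess_f, x0, tol=1e-10, max_iter=50):
    """Illustrative Newton iteration that also records the gradient norms."""
    x = np.array(x0, dtype=float)
    history = [np.linalg.norm(grad_f(x))]
    for _ in range(max_iter):
        if history[-1] < tol:
            break
        # Solve H(x) * dx = -grad f(x), then update the iterate.
        x = x + np.linalg.solve(hess_f(x), -grad_f(x))
        history.append(np.linalg.norm(grad_f(x)))
    return x, history

# Assumed example: f(x, y) = exp(x) - x + y^2, minimized at (0, 0).
grad_f = lambda x: np.array([np.exp(x[0]) - 1.0, 2.0 * x[1]])
hess_f = lambda x: np.array([[np.exp(x[0]), 0.0], [0.0, 2.0]])

x_min, history = newton_sketch(grad_f, hess_f, [1.0, 1.0])
print(x_min)
for k, g in enumerate(history):
    print("iteration %d | gradient norm %e" % (k, g))
```

The recorded gradient norms should shrink faster and faster as the iterate approaches the minimum, which is the behavior expected of Newton's method near a minimum with a nonsingular Hessian.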

# **Tests**


In [0]:
class Lab7FunctionsTest(unittest.TestCase):

    def test_gradient_descent(self):
        f = lambda x: (x[0]-1.0)**2 + (x[1]+2.0)**2 - 3.0
        # minimum of f is at (1,-2) and f(1,-2) = -3
        grad_f = lambda x: np.array([2.0*(x[0]-1.0), 2.0*(x[1]+2.0)])
        TOL_list = [.1,.01,.001,.0001,.00001,.000001,.0000001]
        rel_error = []
        exact = np.array([1.0,-2.0])
        for TOL in TOL_list:
            x_curr = gradient_descent(f,grad_f,np.array([10.0,3.0]),TOL)
            rel_error.append(np.linalg.norm(x_curr-exact)/np.linalg.norm(exact))
            # for this f, ||grad_f|| < TOL implies a relative error below TOL
            self.assertLess(rel_error[-1], TOL)
        print("Results for Gradient Descent")
        print("The relative error for each tolerance:")
        for i in range(len(TOL_list)):
            print("Tolerance: %.0e | Error: %e" %(TOL_list[i],rel_error[i]))

    def test_newtons_method(self):
        f = lambda x: (x[0]-2.0)**2 + (x[1]+1.0)**2 -3
        grad_f = lambda x: np.array([2.*(x[0]-2.0),2.0*(x[1]+1.0)])
        Hf = lambda x: np.array([
                    [2.0, 0.0],
                    [0.0, 2.0]
                ])
        TOL_list = [.1,.01,.001,.0001,.00001,.000001,.0000001]
        rel_error = []
        exact = np.array([2.0,-1.0])
        for TOL in TOL_list:
            x_curr = newtons_method(f,grad_f,Hf,np.array([10.0,3.0]),TOL)
            rel_error.append(np.linalg.norm(x_curr-exact)/np.linalg.norm(exact))
            # for this f, ||grad_f|| < TOL implies a relative error below TOL
            self.assertLess(rel_error[-1], TOL)
        print("Results for Newton's method")
        print("The relative error for each tolerance:")
        for i in range(len(TOL_list)):
            print("Tolerance: %.0e | Error: %e" %(TOL_list[i],rel_error[i]))
if __name__ == '__main__':
    unittest.main(argv=['first-arg-is-ignored'], exit=False)

# **Discussion**

I find the methods and their mathematical derivations easy enough to understand. However, I have difficulty getting an intuition for what is going on. I also find that testing has become quite difficult when I can't intuitively see what or how to test. Currently I plot the graphs of my solutions against the exact solutions and examine the difference. For instance, I tried a squared-error measure, but this showed no measurable error. I couldn't tell whether this was because the error was so small that it was practically 0 or because the approximation was giving the exact points.
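One possible way to tell a tiny error apart from an exact zero is to print in scientific notation, since fixed-point formatting with `%f` rounds to six decimals. The following is a small sketch; the value `1e-9` is a made-up error used only for illustration.

```python
small_error = 1e-9   # hypothetical tiny but nonzero error
exact_zero = 0.0     # an error that is exactly zero

# Fixed-point formatting hides the difference between the two.
fixed_small = "%f" % small_error
fixed_zero = "%f" % exact_zero
print(fixed_small, fixed_zero)   # 0.000000 0.000000

# Scientific notation keeps them distinguishable.
sci_small = "%e" % small_error
sci_zero = "%e" % exact_zero
print(sci_small, sci_zero)       # 1.000000e-09 0.000000e+00

# A direct comparison also answers the question.
print(small_error == 0.0, exact_zero == 0.0)   # False True
```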