# Assignment Goals

Assignment 6 requires you to implement gradient-descent-based optimization.

- Minimum requirement: adapt the code from the presentation to optimize as many of the functions below as possible.
- Write a generic function that takes two other functions as input, along with a range of values within which to search, and implements gradient descent to find the optimum. The basic requirements of gradient descent are already available in the presentation.
- For some problems, the gradient has not been given. You can either write the gradient function on your own or suggest other methods that achieve the same purpose.
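
As a rough illustration of the generic function described in the list above, here is a minimal 1-D sketch. The name `gradient_descent`, its parameters (`tol`, `n_starts`), the defaults, the clipping to the search range, and the test function are all assumptions made for this example and are not taken from the presentation.

```python
import numpy as np

def gradient_descent(func, grad, search_range, learning_rate=0.1,
                     tol=1e-6, max_iterations=10000, n_starts=20):
    """Minimize a 1-D function by gradient descent from several start points.

    All names and defaults here are illustrative assumptions.
    """
    low, high = search_range
    best_x, best_val = None, np.inf
    for x in np.linspace(low, high, n_starts):
        for _ in range(max_iterations):
            # Step downhill, then clip so the search stays inside the range.
            x_new = np.clip(x - learning_rate * grad(x), low, high)
            converged = abs(x_new - x) < tol
            x = x_new
            if converged:
                break
        val = func(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Hypothetical test function g(x) = (x - 1)**2, whose minimum is at x = 1.
x_min, g_min = gradient_descent(lambda x: (x - 1) ** 2,
                                lambda x: 2 * (x - 1),
                                [-5, 5])
```

Returning the location together with the value makes it easier to check the result against a known optimum.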

In [1]:
# The following imports are assumed for the rest of the problems
import numpy as np
from numpy import cos, sin, pi, exp 

## Problem 1 - 1-D simple polynomial

The gradient is not specified. You can write the gradient function on your own. The range within which to search for the minimum is [-5, 5].
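
For reference, the derivative of this polynomial can be worked out by hand, which also gives an exact minimum to compare the numerical result against:

$$
f_1(x) = x^2 + 3x + 8, \qquad f_1'(x) = 2x + 3
$$

$$
f_1'(x) = 0 \;\Rightarrow\; x^* = -\tfrac{3}{2}, \qquad f_1(x^*) = \tfrac{9}{4} - \tfrac{9}{2} + 8 = 5.75
$$

Since $x^* = -1.5$ lies inside $[-5, 5]$, gradient descent should return a value close to 5.75.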

In [22]:
def f1(x):
    return x ** 2 + 3 * x + 8

def f2(x):
    # Derivative of f1, written by hand: d/dx (x**2 + 3*x + 8) = 2*x + 3
    return 2*x + 3

def gradient_descent_for_1var(func, derivative, start_point_range, learning_rate, precision=1e-6, max_iterations=10000):
    # Run gradient descent from 100 evenly spaced start points and return the lowest value found.
    st = np.linspace(start_point_range[0], start_point_range[1], 100)
    function_values = []
    for start_point in st:
        current_point = start_point
        for i in range(max_iterations):
            prev_point = current_point
            gradient = derivative(current_point)
            current_point -= learning_rate * gradient
            # Stop once the update is smaller than the requested precision.
            if abs(current_point - prev_point) < precision:
                break
        function_values.append(func(current_point))
    return min(function_values)
        
    
gradient_descent_for_1var(f1,f2,[-5,5],0.1)

5.750000000010272

## Problem 2 - 2-D polynomial

Functions for derivatives, as well as the range of values within which to search for the minimum, are given.

In [3]:
xlim3 =  [-10, 10]
ylim3 =  [-10, 10]
def f3(x, y):
    return x**4 - 16*x**3 + 96*x**2 - 256*x + y**2 - 4*y + 262

def df3_dx(x, y):
    return 4*x**3 - 48*x**2 + 192*x - 256

def df3_dy(x, y):
    return 2*y - 4

def euclidean_distance(x1, y1, x2, y2):
    # Straight-line distance between (x1, y1) and (x2, y2)
    return np.sqrt((x1 - x2)**2 + (y1 - y2)**2)


def gradient_descent_for_2variables(func, derivative_x, derivative_y, start_point_x_range, start_point_y_range, learning_rate, precision=1e-6, max_iterations=10000):
    st_x = np.linspace(start_point_x_range[0], start_point_x_range[1], 100)
    st_y = np.linspace(start_point_y_range[0], start_point_y_range[1], 100)

    z_vals = []
    # zip pairs the i-th x with the i-th y, so the 100 start points lie on the
    # diagonal of the search box rather than covering the whole box.
    for start_point_x, start_point_y in zip(st_x, st_y):
        current_point_x, current_point_y = start_point_x, start_point_y
        for i in range(max_iterations):
            prev_point_x, prev_point_y = current_point_x, current_point_y
            # Both partial derivatives are evaluated at the same (previous) point.
            gradient_x = derivative_x(prev_point_x, prev_point_y)
            gradient_y = derivative_y(prev_point_x, prev_point_y)

            current_point_x -= learning_rate * gradient_x
            current_point_y -= learning_rate * gradient_y

            # Stop once the step length drops below the requested precision.
            if euclidean_distance(current_point_x, current_point_y, prev_point_x, prev_point_y) < precision:
                break
        z_vals.append(func(current_point_x, current_point_y))

    return min(z_vals)
        
    
gradient_descent_for_2variables(f3, df3_dx,df3_dy,xlim3,ylim3, 0.001)  

2.00001072887693
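
The cell above takes its start points in pairs, so they all lie on the diagonal of the search box. As one possible alternative, the sketch below starts from every point of a small grid instead and also reports where the minimum was found. The helper names `grad_f3` and `descend_from_grid` and the parameter `n_per_axis` are hypothetical, and the grid size is an arbitrary choice that keeps the run time short.

```python
import numpy as np

def f3(x, y):
    return x**4 - 16*x**3 + 96*x**2 - 256*x + y**2 - 4*y + 262

def grad_f3(p):
    # Gradient of f3 as a vector, using the partial derivatives given above.
    x, y = p
    return np.array([4*x**3 - 48*x**2 + 192*x - 256, 2*y - 4])

def descend_from_grid(grad, xlim, ylim, learning_rate, n_per_axis=4,
                      tol=1e-6, max_iterations=10000):
    """Run gradient descent from every point of an n x n grid of starts."""
    results = []
    for x0 in np.linspace(xlim[0], xlim[1], n_per_axis):
        for y0 in np.linspace(ylim[0], ylim[1], n_per_axis):
            p = np.array([x0, y0], dtype=float)
            for _ in range(max_iterations):
                step = learning_rate * grad(p)
                p = p - step
                # Stop once the step length drops below the tolerance.
                if np.linalg.norm(step) < tol:
                    break
            results.append(p)
    return results

end_points = descend_from_grid(grad_f3, [-10, 10], [-10, 10], 0.001)
best = min(end_points, key=lambda p: f3(p[0], p[1]))
```

The partial derivative in x equals `4*(x - 4)**3`, which is very flat near x = 4, so with this learning rate the x-coordinate tends to end up close to 4 rather than exactly on it. The exact minimum is `f3(4, 2) = 2`.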

## Problem 3 - 2-D function 

Derivatives and limits given. 

In [33]:
xlim4 = [-np.pi, np.pi]
ylim4 = [-np.pi,np.pi]
def f4(x,y):
    return np.exp(-(x - y)**2)*np.sin(y)

def f4_dx(x, y):
    return -2*np.exp(-(x - y)**2)*np.sin(y)*(x - y)

def f4_dy(x, y):
    return np.exp(-(x - y)**2)*np.cos(y) + 2*np.exp(-(x - y)**2)*np.sin(y)*(x - y)


gradient_descent_for_2variables(f4, f4_dx,f4_dy,xlim4,ylim4, 0.1)  

-0.9999999999046567
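
When derivatives are supplied by hand, a quick numerical check can catch sign or algebra mistakes before running the optimizer. The sketch below compares the given partial derivatives of `f4` with central finite differences at random points in the search box. The helper `max_gradient_error`, the step `h`, and the number of sample points are assumptions made for this example.

```python
import numpy as np

def f4(x, y):
    return np.exp(-(x - y)**2) * np.sin(y)

def f4_dx(x, y):
    return -2 * np.exp(-(x - y)**2) * np.sin(y) * (x - y)

def f4_dy(x, y):
    return (np.exp(-(x - y)**2) * np.cos(y)
            + 2 * np.exp(-(x - y)**2) * np.sin(y) * (x - y))

def max_gradient_error(f, dfdx, dfdy, points, h=1e-5):
    """Largest gap between the analytic partials and central differences."""
    worst = 0.0
    for x, y in points:
        num_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
        num_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
        worst = max(worst, abs(num_dx - dfdx(x, y)), abs(num_dy - dfdy(x, y)))
    return worst

# 50 random (x, y) pairs inside [-pi, pi] x [-pi, pi], seeded for repeatability.
rng = np.random.default_rng(0)
sample_points = rng.uniform(-np.pi, np.pi, size=(50, 2))
err = max_gradient_error(f4, f4_dx, f4_dy, sample_points)
```

A small `err` suggests the analytic derivatives agree with the function; a large one points to a mistake in either the derivative or the function itself.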

## Problem 4 - 1-D trigonometric

Derivative not given.  Optimization range [0, 2*pi]
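
One option when the derivative is not given is to differentiate by hand with the chain rule. For the function in this problem, that gives:

$$
f_5(x) = \cos^4 x - \sin^3 x - 4\sin^2 x + \cos x + 1
$$

$$
f_5'(x) = -4\cos^3 x\,\sin x - 3\sin^2 x\,\cos x - 8\sin x\,\cos x - \sin x
$$

A function implementing this expression could be passed to `gradient_descent_for_1var` from Problem 1 as an alternative to a numerical derivative.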

In [27]:
def f5(x):
    return np.cos(x)**4 - np.sin(x)**3 - 4*np.sin(x)**2 + np.cos(x) + 1

def gradient_descent_for_1var_2(func, d_fun, start_point_range, learning_rate, precision=1e-6, max_iterations=10000):
    # Same as gradient_descent_for_1var, except that d_fun(func, x) estimates
    # the derivative numerically instead of using a hand-written derivative.
    st = np.linspace(start_point_range[0], start_point_range[1], 100)
    function_values = []
    for start_point in st:
        current_point = start_point
        for i in range(max_iterations):
            prev_point = current_point
            gradient = d_fun(func, current_point)
            current_point -= learning_rate * gradient
            # Stop once the update is smaller than the requested precision.
            if abs(current_point - prev_point) < precision:
                break
        function_values.append(func(current_point))
    return min(function_values)
        

In [31]:
def d_fun(foo, x):
    # Backward-difference estimate of the derivative of foo at x
    h = 1e-6
    der_foo = (foo(x) - foo(x - h)) / h
    return der_foo

In [32]:
gradient_descent_for_1var_2(f5, d_fun, [0,2*np.pi], 0.1)

-4.045412051571511
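
The derivative estimate used above is a one-sided (backward) difference. A central difference is a common alternative that is usually more accurate for the same step size. The sketch below compares the two on `np.sin`, whose derivative is known exactly; the function names and the test point `x0` are assumptions made for this example.

```python
import numpy as np

def backward_difference(f, x, h=1e-6):
    # One-sided estimate; the error shrinks roughly in proportion to h.
    return (f(x) - f(x - h)) / h

def central_difference(f, x, h=1e-6):
    # Symmetric estimate; the truncation error shrinks roughly like h**2.
    return (f(x + h) - f(x - h)) / (2 * h)

x0 = 1.0
exact = np.cos(x0)  # exact derivative of sin at x0
err_backward = abs(backward_difference(np.sin, x0) - exact)
err_central = abs(central_difference(np.sin, x0) - exact)
```

Either estimator has the `(func, x)` signature that `gradient_descent_for_1var_2` expects for its `d_fun` argument.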

In [None]:
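def gradient_descent_multivar(func, derivative, start_point_range, learning_rate, error=1e-6, max_iterations=100000, n_starts=100):
    # Assumption: start_point_range holds one (low, high) pair per variable, and
    # func and derivative each take a single NumPy array. n_starts is an added
    # parameter; start points are spaced evenly along the diagonal of the box.
    lows, highs = np.asarray(start_point_range, dtype=float).T
    st = np.linspace(lows, highs, n_starts)
    best_value, best_point = np.inf, None
    for start_point in st:
        current_point = start_point
        for i in range(max_iterations):
            prev_point = current_point
            gradient = derivative(current_point)
            # Out-of-place update: an in-place -= on an array would also change prev_point.
            current_point = current_point - learning_rate * gradient
            if np.linalg.norm(current_point - prev_point) < error:
                break
        # Keep the best result over all start points, not just the last one.
        value = func(current_point)
        if value < best_value:
            best_value, best_point = value, current_point
    return best_value, best_point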