# Homework 1
Consider the unconstrained problem
$$
\min f(x), \hspace{0.5 cm} \text{where  } f(x) = \left\{100(x_2-x_1)^2+(1-x_1)^2\right\}
$$
whose exact minimizer is $x^* = (1, 1)$. Find an estimate of $x^*$ using:
* [a gradient method](#Gradient-descent),
* Newton's method,
* a conjugate direction method.

In all cases use $x_0=(2,5)$ as the starting point, backtracking line search with $\alpha=0.25$ and $\beta=0.5$, and $\|\nabla f(x_k)\|<10^{-5}$ as the stopping criterion. How many iterations does each method need to find its estimate of $x^*$?

In [8]:
import numpy as np
from numpy.typing import ArrayLike

# Answer
First we define the functions shared by all three methods: the function we want to minimize, its gradient, and the backtracking line search.

In [9]:
X = np.array([2,5]) # the initial point


def f(x: ArrayLike) -> float:
    """The function that we want to minimize"""
    assert len(x) == 2, "Size doesn't match, the size must be 2."
    return 100*(x[1]-x[0])**2 + (1-x[0])**2


def gradient(x: ArrayLike) -> ArrayLike:
    """Returns the value of the gradient at the given point"""
    assert len(x) == 2, "Size doesn't match, the size must be 2."
    return np.array([-2*(1-x[0]) - 200*(x[1]-x[0]), 200*(x[1]-x[0])])


def line_search(d_k: ArrayLike, x: ArrayLike) -> float:
    """Backtracking line search: starting from t = 1, shrink the step
    size t by the factor beta until the sufficient decrease criterion is fulfilled"""

    alpha = 0.25
    beta = 0.5
    t = 1
    while f(x + t*d_k) > f(x) + alpha * t * np.dot(gradient(x),d_k):
        t *= beta
    return t
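
For reference, the test inside the `while` loop of `line_search` is the negation of the sufficient decrease condition (often called the Armijo condition). Written with the symbols used in this notebook, the step size $t$ is accepted as soon as
$$
f(x_k + t\,d_k) \le f(x_k) + \alpha\, t\, \nabla f(x_k)^T d_k,
$$
and otherwise it is replaced by $\beta t$, starting from $t = 1$. Since $d_k$ is a descent direction, $\nabla f(x_k)^T d_k < 0$, so the right-hand side requires an actual decrease of $f$ that is proportional to $t$.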

    


The three methods that we have to implement differ only in the way we define the descent direction. The first one is gradient descent.

## Gradient descent

In this method we set our descent direction $d_k = -\nabla f(x_k)$.

The stopping criterion is $\|\nabla f(x_k)\|<10^{-5}$. We also cap the number of iterations in case this criterion is never met. To avoid a square root, the code compares the squared norm, computed as the dot product $\nabla f(x_k)^T\nabla f(x_k)$, against $(10^{-5})^2$.

In [10]:
def gradient_descend(stop: float, x: ArrayLike) -> tuple[ArrayLike, int]:
    """Gradient descent with backtracking line search"""

    grad = gradient(x)
    
    for ite in range(10_000):
        # The descent direction is the negative gradient
        descend_direction = -grad
        t = line_search(descend_direction, x)
        x = x + t * descend_direction
        grad = gradient(x)
        # Compare squared norms: ||grad||^2 < stop^2
        if np.dot(grad, grad) < stop**2:
            # ite is zero-based, so ite + 1 updates have been performed here
            return x, ite
        
    print("The stopping criterion wasn't met in 10000 iterations.")
    return x, ite
   
    

In [11]:
gradient_descend(1.0e-05, X)

(array([1.0000059 , 1.00000595]), 2220)
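
As a cross-check on the estimate above, note that the gradient of $f$ is linear in $x$, so setting it to zero gives a $2\times 2$ linear system that can be solved directly. The sketch below is not part of the assignment; the names `A`, `b`, `x_star` and `x_gd` are introduced here only for illustration, and `x_gd` is simply the estimate printed above.

```python
import numpy as np

# Setting the gradient to zero gives the linear system
#   202 x1 - 200 x2 = 2
#  -200 x1 + 200 x2 = 0
A = np.array([[202.0, -200.0], [-200.0, 200.0]])
b = np.array([2.0, 0.0])
x_star = np.linalg.solve(A, b)  # stationary point of f

# Estimate reported by gradient_descend above (copied from the output)
x_gd = np.array([1.0000059, 1.00000595])
error = np.linalg.norm(x_gd - x_star)

print("stationary point:", x_star)
print("distance of the gradient descent estimate:", error)
```

The distance printed here measures the error in $x$, whereas the stopping criterion bounds the norm of the gradient, so the two numbers are related but not identical.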

## Newton's method

In this case we define the descent direction as $d_k = -(\nabla^2f(x_k))^{-1}\nabla f(x_k)$. This method needs the Hessian matrix $\nabla^2f(x_k)$, so we first write down the first derivatives of the function:
$$
\frac{\partial f}{\partial x_1} = -200 (x_2-x_1)-2 (1-x_1), 
$$
$$
\frac{\partial f}{\partial x_2} = 200 (x_2-x_1).
$$
From these expressions we can see that the second derivatives are constant, so the Hessian matrix is:
$$
\nabla^2f(x)=\begin{pmatrix} 202 & -200 \\ -200 & 200\end{pmatrix}
$$

In [12]:
HESSIAN = np.array([[202, -200], [-200, 200]])


def newtons_method(stop: float, x: ArrayLike) -> tuple[ArrayLike, int]:
    pass 
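
The function above is left as a stub. One way it could be filled in is sketched below, assuming the same backtracking line search and stopping criterion as in gradient descent. The name `newton_sketch` and the `max_iter` parameter are hypothetical, the helper functions are repeated so that the block runs on its own, and the direction is obtained with `np.linalg.solve` instead of forming the inverse Hessian explicitly. Unlike `gradient_descend`, this sketch returns the number of updates actually performed.

```python
import numpy as np

# Constant Hessian of f, as derived above
HESSIAN = np.array([[202.0, -200.0], [-200.0, 200.0]])


def f(x):
    return 100 * (x[1] - x[0]) ** 2 + (1 - x[0]) ** 2


def gradient(x):
    return np.array([-2 * (1 - x[0]) - 200 * (x[1] - x[0]), 200 * (x[1] - x[0])])


def line_search(d_k, x, alpha=0.25, beta=0.5):
    # Backtracking: shrink t until the sufficient decrease condition holds
    t = 1.0
    while f(x + t * d_k) > f(x) + alpha * t * np.dot(gradient(x), d_k):
        t *= beta
    return t


def newton_sketch(stop, x, max_iter=100):
    """Hypothetical Newton iteration; returns the estimate and the number of updates."""
    x = np.asarray(x, dtype=float)
    grad = gradient(x)
    steps = 0
    while np.dot(grad, grad) >= stop**2 and steps < max_iter:
        # Solve HESSIAN @ d = -grad rather than inverting the Hessian
        direction = -np.linalg.solve(HESSIAN, grad)
        t = line_search(direction, x)
        x = x + t * direction
        grad = gradient(x)
        steps += 1
    return x, steps


x_est, n_steps = newton_sketch(1.0e-5, [2, 5])
print(x_est, n_steps)
```

Because $f$ is quadratic with a constant Hessian, the full Newton step ($t = 1$) lands on the stationary point from any starting point, and it passes the backtracking test since $\alpha < 1/2$. Under these assumptions the sketch stops after a single update.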
