# Homework 5

## Q1
1. Compute the gradient $\nabla f(x)$ and Hessian $\nabla^{2} f(x)$ of the Rosenbrock
function

\begin{aligned}
f\left(x_{1}, x_{2}\right)=100\left(x_{2}-x_{1}^{2}\right)^{2}+\left(1-x_{1}\right)^{2}
\end{aligned}

## Q1 Solution

The gradient has the following form:

\begin{equation}
\nabla f=\left[\begin{array}{c}
\frac{\partial f}{\partial x_{1}} \\
\frac{\partial f}{\partial x_{2}}
\end{array}\right]
\end{equation}

Let's calculate the derivative with respect to $x_1$ first:

\begin{aligned}
&\frac{\partial f}{\partial x_{1}}=200\left(x_{2}-x_{1}^{2}\right)\left(-2 x_{1}\right)-2\left(1-x_{1}\right) \\
&\frac{\partial f}{\partial x_{1}}=-2\left[\left(200 x_{2}-200 x_{1}^{2}\right)\left(x_{1}\right)+1-x_{1}\right] \\
&\frac{\partial f}{\partial x_{1}}=-2\left(200 x_{1} x_{2}-200 x_{1}^{3}+1-x_{1}\right) \\
&\frac{\partial f}{\partial x_{1}}=2\left(200 x_{1}^{3}-200 x_{1} x_{2}+x_{1}-1\right)
\end{aligned}

Then, we calculate the derivative with respect to $x_2$:

\begin{equation}
\frac{\partial f}{\partial x_{2}}=200\left(x_{2}-x_{1}^{2}\right)
\end{equation}

Thus, the gradient $\nabla f(x)$ is:

\begin{equation}
\nabla f=\left[\begin{array}{c}
2\left(200 x_{1}^{3}-200 x_1 x_2+x_{1}-1\right) \\
200\left(x_{2}-x_{1}^{2}\right)
\end{array}\right]
\end{equation}
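
As a sanity check on the derivation above, the analytic gradient can be compared against a central finite-difference approximation. The sketch below is illustrative only: the names `grad_f` and `numerical_grad`, the step `h`, and the test point are assumptions chosen for this example and are not part of the assignment.

```python
import numpy as np

def f(x):
    # Rosenbrock function from Q1
    return 100 * (x[1] - x[0] ** 2) ** 2 + (1 - x[0]) ** 2

def grad_f(x):
    # Analytic gradient derived above
    return np.array([
        2 * (200 * x[0] ** 3 - 200 * x[0] * x[1] + x[0] - 1),
        200 * (x[1] - x[0] ** 2),
    ])

def numerical_grad(func, x, h=1e-6):
    # Central differences: (f(x + h e_i) - f(x - h e_i)) / (2 h)
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (func(x + e) - func(x - e)) / (2 * h)
    return g

x_test = np.array([-1.2, 1.0])  # arbitrary test point (an assumption)
grad_error = np.max(np.abs(grad_f(x_test) - numerical_grad(f, x_test)))
print("analytic gradient:", grad_f(x_test))
print("max abs difference:", grad_error)
```

If the two gradients agree to several digits, the closed-form expression is very likely correct; a large difference would point to a sign or coefficient error.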

We will use the result above to derive the Hessian matrix, which has the following form:

\begin{equation}
\textbf H f=\left[\begin{array}{ll}
\frac{\partial^{2} f}{\partial x_{1}^{2}} & \frac{\partial^{2} f}{\partial x_{1} \partial x_{2}} \\
\frac{\partial^{2} f}{\partial x_2 \partial x_{1}} & \frac{\partial^{2} f}{\partial x_{2}^{2}}
\end{array}\right]
\end{equation}

Differentiating each component of the gradient once more, we have:

\begin{equation}
\begin{aligned}
&\frac{\partial^{2} f}{\partial x_{1}^{2}}=1200 x_{1}^{2}-400 x_{2}+2 \\
&\frac{\partial^{2} f}{\partial x_{2}^{2}}=200 \\
&\frac{\partial^{2} f}{\partial x_{1} \partial x_{2}}=-400 x_{1} \\
&\frac{\partial^{2} f}{\partial x_{2} \partial x_{1}}=-400 x_{1}
\end{aligned}
\end{equation}
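
For reference, each entry follows from differentiating one gradient component once more. The intermediate steps, written out under the same notation, are:

\begin{equation}
\begin{aligned}
&\frac{\partial^{2} f}{\partial x_{1}^{2}}=\frac{\partial}{\partial x_{1}}\left[2\left(200 x_{1}^{3}-200 x_{1} x_{2}+x_{1}-1\right)\right]=2\left(600 x_{1}^{2}-200 x_{2}+1\right) \\
&\frac{\partial^{2} f}{\partial x_{2}^{2}}=\frac{\partial}{\partial x_{2}}\left[200\left(x_{2}-x_{1}^{2}\right)\right]=200 \\
&\frac{\partial^{2} f}{\partial x_{1} \partial x_{2}}=\frac{\partial}{\partial x_{1}}\left[200\left(x_{2}-x_{1}^{2}\right)\right]=-400 x_{1} \\
&\frac{\partial^{2} f}{\partial x_{2} \partial x_{1}}=\frac{\partial}{\partial x_{2}}\left[2\left(200 x_{1}^{3}-200 x_{1} x_{2}+x_{1}-1\right)\right]=-400 x_{1}
\end{aligned}
\end{equation}

The two mixed partial derivatives agree, as expected for a polynomial, so the Hessian is symmetric.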

And thus the Hessian matrix is:

\begin{equation}
\textbf H f=\left[\begin{array}{ll}
1200x_1^2-400x_2 +2 & -400 x_1 \\
-400 x_1 & 200
\end{array}\right]
\end{equation}
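
The Hessian can be checked in the same spirit by differencing the analytic gradient, and its eigenvalues at $(1,1)$ show what Newton's method sees near the minimizer (since $f(1,1)=0$ and $f \geq 0$, that point is the global minimizer). This is a minimal sketch: `grad_f`, `hess_f`, `numerical_hess`, the step `h`, and the test point are illustrative assumptions.

```python
import numpy as np

def grad_f(x):
    # Analytic gradient of the Rosenbrock function
    return np.array([
        2 * (200 * x[0] ** 3 - 200 * x[0] * x[1] + x[0] - 1),
        200 * (x[1] - x[0] ** 2),
    ])

def hess_f(x):
    # Analytic Hessian derived above
    return np.array([
        [1200 * x[0] ** 2 - 400 * x[1] + 2, -400 * x[0]],
        [-400 * x[0], 200.0],
    ])

def numerical_hess(grad, x, h=1e-6):
    # Column j is the central difference of the gradient along e_j
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        H[:, j] = (grad(x + e) - grad(x - e)) / (2 * h)
    return H

x_test = np.array([-1.2, 1.0])  # arbitrary test point (an assumption)
hess_error = np.max(np.abs(hess_f(x_test) - numerical_hess(grad_f, x_test)))

x_star = np.array([1.0, 1.0])   # global minimizer of f
eigenvalues = np.linalg.eigvalsh(hess_f(x_star))
print("max abs difference:", hess_error)
print("eigenvalues at (1, 1):", eigenvalues)
```

Both eigenvalues at $(1,1)$ are positive, so the Hessian is positive definite there, but they differ by several orders of magnitude, which reflects the narrow curved valley the function is known for.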

## Q2
Implement Newton's method with line search as given in Algorithm 1. Use Newton's
method to minimize the Rosenbrock function from Problem 1. Set the initial step size
$\bar{\alpha}=1$. Select your own choice of $\rho \in(0,1)$ and $c \in(0,1)$. First run the
algorithm from the initial point $x^{0}=(1.2,1.2)^{\top}$, and then try the more
difficult starting point $x^{0}=(-1.2,1)^{\top}$. For each starting point, print out
the step length $\alpha^{k}$ used by the algorithm as well as the point $x^{k}$ for
every step $k$. You should observe that Newton's method converges very fast.

![Algorithm 1](algo_1.jpg)
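
Algorithm 1 is available here only as an image, so its exact steps cannot be verified from the text. Assuming it is the usual backtracking line search on the sufficient-decrease (Armijo) condition, which matches the parameters $\bar{\alpha}$, $\rho$ and $c$ named in the problem, one iteration would look like this:

\begin{equation}
\begin{aligned}
&d^{k}=-\left[\nabla^{2} f\left(x^{k}\right)\right]^{-1} \nabla f\left(x^{k}\right) \\
&\alpha \leftarrow \bar{\alpha}; \quad \text{while } f\left(x^{k}+\alpha d^{k}\right)>f\left(x^{k}\right)+c\, \alpha\, \nabla f\left(x^{k}\right)^{\top} d^{k}: \quad \alpha \leftarrow \rho\, \alpha \\
&\alpha^{k}=\alpha, \quad x^{k+1}=x^{k}+\alpha^{k} d^{k}
\end{aligned}
\end{equation}

Under this reading, $\rho$ is the factor by which a rejected step is shrunk and $c$ sets how much decrease counts as sufficient. Both must lie strictly between 0 and 1 for the loop to make progress.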

In [2]:
import numpy as np

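alpha_bar = 1.0  # initial step size
rho = 0.5        # backtracking shrink factor, must lie in (0, 1)
c = 1e-4         # sufficient-decrease constant, must lie in (0, 1)

eps = 10 ** -4   # stop once the gradient norm drops below eps
max_iter = 100   # safety cap on the number of Newton steps

def f(x):
    return 100 * (x[1] - x[0] ** 2) ** 2 + (1 - x[0]) ** 2

def df(x):
    nabla = np.zeros(2)
    nabla[0] = 2 * (200 * x[0] ** 3 - 200 * x[0] * x[1] + x[0] - 1)
    nabla[1] = 200 * (x[1] - x[0] ** 2)

    return nabla

def d2f(x):
    hess = np.zeros((2, 2))
    hess[0, 0] = 1200 * x[0] ** 2 - 400 * x[1] + 2
    hess[1, 1] = 200
    hess[0, 1] = hess[1, 0] = -400 * x[0]

    return hess

def newton(x_0):
    # Assumption: Algorithm 1 is Newton's method with a backtracking line
    # search on the sufficient-decrease (Armijo) condition.
    x = np.array(x_0, dtype=float)
    k = 0
    print(f"k = {k}, x = {x}")
    while np.linalg.norm(df(x)) > eps and k < max_iter:
        grad = df(x)
        d = -np.linalg.solve(d2f(x), grad)  # Newton direction
        alpha = alpha_bar
        # Shrink alpha until the step gives sufficient decrease
        while f(x + alpha * d) > f(x) + c * alpha * (grad @ d):
            alpha *= rho
        x = x + alpha * d
        k += 1
        print(f"k = {k}, alpha = {alpha}, x = {x}")
    return x

# Run from the easy starting point first, then the harder one
for x_0 in (np.array([1.2, 1.2]), np.array([-1.2, 1.0])):
    print(f"Starting point: {x_0}")
    newton(x_0)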