In [12]:
import sympy as sp
import numpy as np
sp.init_printing(use_latex=True)

Throughout the assignment we define $x:=x_1$, $y:=x_2$, and $\bold x:=[x,y]^T$, so that the notation stays consistent with the variable names used in the SymPy code.

The Rosenbrock function is defined as 
$$
f(\bold x) = 100(y-x^2)^2+(1-x)^2
$$

In [8]:
x, y = sp.symbols("x y")
f = 100*(y - x**2)**2 + (1 - x)**2


# 1

In [9]:
df = sp.Matrix([sp.diff(f, x), sp.diff(f, y)])
df

⎡        ⎛   2    ⎞          ⎤
⎢- 400⋅x⋅⎝- x  + y⎠ + 2⋅x - 2⎥
⎢                            ⎥
⎢             2              ⎥
⎣      - 200⋅x  + 200⋅y      ⎦

To compute the gradient, we take the partial derivatives of $f$ with respect to $x$ and $y$, applying the chain rule to each squared term.
$$
\frac{\partial}{\partial x}f(\bold x)=2\cdot 100(y-x^2)\cdot (-2x)+2(1-x)\cdot (-1)=-400x(y-x^2)-2(1-x)
$$
$$
\frac{\partial}{\partial y}f(\bold x)= 2\cdot 100(y-x^2)\cdot 1=200(y-x^2)
$$

We now have all we need to construct the gradient:
$$
\nabla f(\bold x)=\begin{bmatrix}\frac{\partial}{\partial x}f(\bold x) \\ \frac{\partial}{\partial y}f(\bold x)\end{bmatrix} = \begin{bmatrix} -400x(y-x^2)-2(1-x)\\200(y-x^2)\end{bmatrix}
$$
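As a cross-check, the hand-derived gradient can be compared with what SymPy computes. This is a minimal sketch; the names `grad_sympy`, `grad_hand`, and `gradient_residual` are introduced here for illustration only.

```python
import sympy as sp

x, y = sp.symbols("x y")
f = 100*(y - x**2)**2 + (1 - x)**2

# Gradient as computed by SymPy
grad_sympy = sp.Matrix([sp.diff(f, x), sp.diff(f, y)])

# Gradient as derived by hand
grad_hand = sp.Matrix([-400*x*(y - x**2) - 2*(1 - x), 200*(y - x**2)])

# If both agree, every entry of the difference simplifies to zero
gradient_residual = (grad_sympy - grad_hand).applyfunc(sp.simplify)
print(gradient_residual)
```

If the two expressions agree, the printed vector contains only zeros.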

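The Hessian contains the second-order partial derivatives:
$$
H = \begin{bmatrix}\frac{\partial^2}{\partial x^2}f(\bold x) & \frac{\partial^2}{\partial x\,\partial y}f(\bold x)\\
\frac{\partial^2}{\partial y\,\partial x}f(\bold x) & \frac{\partial^2}{\partial y^2}f(\bold x)\end{bmatrix}
$$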

In [11]:
d2f = sp.Matrix([
    [sp.diff(df[0], x), sp.diff(df[0], y)],
    [sp.diff(df[1], x), sp.diff(df[1], y)]
])
d2f

⎡      2                    ⎤
⎢1200⋅x  - 400⋅y + 2  -400⋅x⎥
⎢                           ⎥
⎣      -400⋅x          200  ⎦

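$$\frac{\partial^2}{\partial x^2}f(\bold x)=-400(y-x^2)-400x\cdot (-2x)+2=-400(y-x^2)+800x^2+2=1200x^2-400y+2$$
$$\frac{\partial^2}{\partial y^2}f(\bold x)=200$$

Since $f$ is a polynomial, its second-order partial derivatives are continuous, so the mixed derivatives are equal and the Hessian is symmetric. We therefore only need to calculate one of them.

$$\frac{\partial^2}{\partial x\,\partial y}f(\bold x)=\frac{\partial}{\partial x}\left[200(y-x^2)\right]=-400x$$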

We can now state the Hessian:
$$
H = \begin{bmatrix}1200x^2-400y+2 & -400x\\
-400x & 200
\end{bmatrix}
$$
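The same kind of cross-check works for the Hessian. The sketch below uses SymPy's built-in `sp.hessian` instead of differentiating the gradient entry by entry; `H_hand` and `hessian_residual` are illustrative names, not part of the original notebook.

```python
import sympy as sp

x, y = sp.symbols("x y")
f = 100*(y - x**2)**2 + (1 - x)**2

# Hessian as computed by SymPy
H_sympy = sp.hessian(f, (x, y))

# Hessian as derived by hand
H_hand = sp.Matrix([
    [1200*x**2 - 400*y + 2, -400*x],
    [-400*x, 200],
])

# If both agree, every entry of the difference simplifies to zero
hessian_residual = (H_sympy - H_hand).applyfunc(sp.simplify)
print(hessian_residual)
print(H_sympy.is_symmetric())
```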

# 2

To show that $\bold x^*=[1,1]^T$ is the only local minimizer, we start by determining all stationary points, since every local minimizer must satisfy the first-order optimality condition $\nabla f(\bold x)=0$.

We thus want to solve the equation
$$
\nabla f(\bold x)= \begin{bmatrix} -400x(y-x^2)-2(1-x)\\200(y-x^2)\end{bmatrix} = 0
$$

Looking at the second equation, $200(y-x^2)=0$, we see that it holds only when $y=x^2$.

By inserting this requirement into the first equation, we can greatly simplify it:
$$-400x(x^2-x^2)-2(1-x)=0\Leftrightarrow$$
$$-2+2x=0\Leftrightarrow$$
$$x=1$$

Since $y=x^2$, it follows that $y=1$.

The function therefore has exactly one stationary point, $(1,1)$.
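The stationary point can also be found by letting SymPy solve the first-order condition directly. This is a sketch under the assumption that `sp.solve` returns every solution of this small polynomial system; `stationary_points` is an illustrative name.

```python
import sympy as sp

x, y = sp.symbols("x y")
f = 100*(y - x**2)**2 + (1 - x)**2

# First-order optimality condition: both partial derivatives equal zero
grad = [sp.diff(f, x), sp.diff(f, y)]

# Solve the system for x and y; each solution is returned as a dict
stationary_points = sp.solve(grad, [x, y], dict=True)
print(stationary_points)
```

A single solution with $x=1$ and $y=1$ agrees with the derivation by hand.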

Let's now insert the stationary point into the Hessian to determine whether it is a minimizer or a maximizer:
$$H([1,1]) = \begin{bmatrix} 1200-400+2 & -400 \\ -400 & 200\end{bmatrix}=\begin{bmatrix} 802 & -400 \\ -400 & 200\end{bmatrix}$$

In [15]:
# Eigenvalues of the Hessian evaluated at the stationary point (1, 1)
eigs = np.linalg.eig(np.array([[802, -400], [-400, 200]]))
print(eigs)

The eigenvalues of this matrix are $\lambda = 501 \pm \sqrt{250601}$, approximately $1001.60$ and $0.40$.

Since both eigenvalues are positive, the Hessian is positive definite at $(1,1)$, so $(1,1)$ is a strict local minimizer. Because it is the only stationary point, it is also the only local minimizer.
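To avoid transcription slips when typing the matrix entries by hand, the Hessian can be evaluated at $(1,1)$ symbolically. The sketch below also checks definiteness with Sylvester's criterion (all leading principal minors positive), which is an alternative to computing eigenvalues and not the method used above; `H_star`, `minor_1`, and `minor_2` are illustrative names.

```python
import sympy as sp

x, y = sp.symbols("x y")
f = 100*(y - x**2)**2 + (1 - x)**2

# Evaluate the symbolic Hessian at the stationary point (1, 1)
H_star = sp.hessian(f, (x, y)).subs({x: 1, y: 1})
print(H_star)

# Sylvester's criterion: a symmetric matrix is positive definite
# exactly when all leading principal minors are positive
minor_1 = H_star[0, 0]
minor_2 = H_star.det()
print(minor_1, minor_2)

# SymPy's own definiteness check, for comparison
print(H_star.is_positive_definite)
```

Both minors being positive leads to the same conclusion as the eigenvalue test.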