# Exercise 5
* Due: 11.2.2016 at noon
* Max points: 8

## General rules
* Send the source code or the written notes of your answers as an attachment to markus.hartikainen@jyu.fi before the due time
* You will get feedback about your answers a week from the due date
* From the second week on, the exercises will be given at the previous Wednesday's lecture

## Exercises

For exercises 1-2, we study the optimization problem
$$
\begin{align}
\min \qquad & x_1^2+x_2^2 + x_3^3+(1-x_4)^2\\
\text{s.t.}\qquad &x_1^2+x_2^2-1=0\\
    &x_1^2+x_3^2-1=0\\
    &x_1,x_2,x_3,x_4\in\mathbb R
\end{align}
$$

1. **(2 points)** Use the SQP method to solve the above problem.
2. **(2 points)** Use the Lagrangian method to solve the above problem.
3. **(2 points)** Solve the problem
$$
\begin{align}
\min   \  & x_1^2 + x_2^2\\
\text{s.t. } & x_1 + x_2 \geq 1.
\end{align}
$$
by using just the optimality conditions.
4. **(2 points)** Consider the problem
$$\begin{align}
\min   \  & f(x)\\
\text{s.t. } & h_k(x)=0, \text{ for all } k=1,\dots,K,
\end{align}
$$
where all the functions are twice differentiable. Show that the gradient of the augmented Lagrangian function is zero at the minimizer $x^*$ of the above problem. In other words, show that $\nabla_xL_A(x^*,\mu^*,\varrho)=0$, where $\mu^*\in\mathbb R^K$ is the corresponding optimal Lagrange multiplier vector.




### Exercise 5.1

In [104]:
def f_ex5(x):
    return \
        x[0] ** 2 + x[1] ** 2 + x[2] ** 3 + (1 - x[3]) ** 2,\
        [],\
        [x[0] ** 2 + x[1] ** 2 - 1, x[0] ** 2 + x[2] ** 2 - 1]


In [162]:
import numpy as np
import ad

# Returns the gradient (k=0) or the Hessian (k=1) of the Lagrangian
# L(x) = f(x) + m^T h(x) at the point x, using automatic differentiation
def diff_L(f, x, m, k):
    # Define the Lagrangian for the given multipliers m and problem f
    L = lambda x_: f(x_)[0] + (np.matrix(f(x_)[2]) * np.matrix(m).transpose())[0,0]
    return ad.gh(L)[k](x)

#Returns the gradients of the equality constraints
def grad_h(f,x):
    return  [ad.gh(lambda y: \
                   f(y)[2][i])[0](x) for i in range(len(f(x)[2]))] 

#Solves the quadratic problem inside the SQP method
def solve_QP(f,x,m):
    left_side_first_row = np.concatenate((\
        np.matrix(diff_L(f, x, m, 1)),\
        np.matrix(grad_h(f, x)).transpose()), axis = 1)
    left_side_second_row = np.concatenate((\
        np.matrix(grad_h(f, x)),\
        np.matrix(np.zeros((len(f(x)[2]), len(f(x)[2]))))), axis = 1)
    right_hand_side = np.concatenate((\
        -1 * np.matrix(diff_L(f, x, m, 0)).transpose(),
        -np.matrix(f(x)[2]).transpose()), axis = 0)
    left_hand_side = np.concatenate((\
                                    left_side_first_row,\
                                    left_side_second_row), axis = 0)
    temp = np.linalg.solve(left_hand_side, right_hand_side)
    return temp[:len(x)], temp[len(x):]
    
    

def SQP(f,start,precision):
    x = start
    m = np.ones(len(f(x)[2]))
    f_old = float('inf')
    f_new = f(x)[0]
    while abs(f_old - f_new) > precision:
        f_old = f_new
        (p, v) = solve_QP(f, x, m)
        x = x + np.array(p.transpose())[0]
        m = m + v
        f_new = f(x)[0]
    return x

In [163]:
from math import sqrt

o = SQP(f_ex5, [0.5, -sqrt(0.75), -sqrt(0.75), 1], 0.001)
print("Optimal is ", o)
print("Values of f(x) and constraints: ", f_ex5(o))

Optimal is  [ 0.99267448 -0.12274973 -0.12274973  1.        ]
Values of f(x) and constraints:  (0.99862059207569676, [], [0.00047012339004348647, 0.00047012339004348647])
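
Each SQP iteration above solves one linear system for the step `p` and the multiplier change `v`. For this particular problem the derivatives are simple enough to write by hand, so a single iteration can be sketched without the `ad` package. This is only an illustration: the helper names (`grad_f`, `h`, `jac_h`, `hess_L`, `newton_kkt_step`) are not part of the solution above, and the Lagrangian is assumed to be $f + m^T h$, as in `diff_L`.

```python
import numpy as np
from math import sqrt

def grad_f(x):
    # Gradient of x1^2 + x2^2 + x3^3 + (1 - x4)^2
    return np.array([2 * x[0], 2 * x[1], 3 * x[2] ** 2, -2 * (1 - x[3])])

def h(x):
    # Values of the two equality constraints
    return np.array([x[0] ** 2 + x[1] ** 2 - 1, x[0] ** 2 + x[2] ** 2 - 1])

def jac_h(x):
    # Rows are the gradients of the equality constraints
    return np.array([[2 * x[0], 2 * x[1], 0.0, 0.0],
                     [2 * x[0], 0.0, 2 * x[2], 0.0]])

def hess_L(x, m):
    # Hessian of f + m[0]*h1 + m[1]*h2; it is diagonal for this problem
    return np.diag([2 + 2 * m[0] + 2 * m[1], 2 + 2 * m[0], 6 * x[2] + 2 * m[1], 2.0])

def newton_kkt_step(x, m):
    # Assemble and solve [[H, J^T], [J, 0]] [p; v] = -[grad L; h]
    H, J = hess_L(x, m), jac_h(x)
    K = np.block([[H, J.T], [J, np.zeros((2, 2))]])
    rhs = -np.concatenate([grad_f(x) + J.T @ m, h(x)])
    sol = np.linalg.solve(K, rhs)
    return sol[:4], sol[4:]

x0 = np.array([0.5, -sqrt(0.75), -sqrt(0.75), 1.0])
m0 = np.ones(2)
p, v = newton_kkt_step(x0, m0)
print("step p:", p)
print("multiplier change v:", v)
```

The second block row of the system enforces the linearized constraints, so the step satisfies `jac_h(x0) @ p == -h(x0)` up to rounding.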


### Exercise 5.2

In [156]:
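import numpy

# Augmented Lagrangian f(x) - mu^T h(x) + (c/2) * ||h(x)||^2, where
# f(x)[0] is the objective and f(x)[2] holds the equality constraints
def augmented_langrangian(f, x, mu, c):
    second_term = float(numpy.matrix(mu) * numpy.matrix(f(x)[2]).transpose())
    third_term = 0.5 * c * numpy.linalg.norm(f(x)[2]) ** 2
    return f(x)[0] - second_term + third_term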

In [178]:
from scipy.optimize import minimize
import numpy
def augmented_langrangian_method(f, start, mu0, c0, constraintNormAccuracy = 0.00001):
    x_old = [float('inf')] * 2
    x_new = start
    mu = mu0
    c = c0
    while numpy.linalg.norm(f(x_new)[2]) > constraintNormAccuracy:
        res = minimize(lambda x:augmented_langrangian(f, x, mu, c), x_new)
        x_old = x_new
        # Multiplier update; the constraints are evaluated at x_old,
        # the starting point of the subproblem that was just solved
        mu = mu - c * numpy.matrix(f(x_old)[2])
        x_new = res.x
        # Double the penalty parameter for the next subproblem
        c = 2 * c
    return x_new,c

In [164]:
from scipy.optimize import minimize
import numpy
def penalty_function_method(f, start, c0, constraintNormAccuracy = 0.00001):
    x_old = [float('inf')]*2
    x_new = start
    c = c0
    # With mu fixed at zero the augmented Lagrangian reduces to the quadratic penalty function
    mu = numpy.zeros(len(f(x_new)[2]))
    while numpy.linalg.norm(f(x_new)[2]) > constraintNormAccuracy:
        res = minimize(lambda x : augmented_langrangian(f, x, mu, c), x_new)
        x_old = x_new
        x_new = res.x
        c = 2 * c
    return x_new, c

In [165]:
initial_x = [0, 0, 0, 0]
initial_mu = [0, 0]
initial_c = 1

In [180]:
result = augmented_langrangian_method(f_ex5, initial_x, initial_mu, initial_c, 0.0001)

print("C at optimal: ", result[1])
print("Optimal x: ", result[0])
print("Optimal f(x): ", f_ex5(result[0]))

C at optimal:  4194304
Optimal x:  [ -1.68422245e-08   1.00002368e+00  -1.00004147e+00   9.99999992e-01]
Optimal f(x):  (-7.7063103568638885e-05, [], [4.7360642478677661e-05, 8.2947443986070013e-05])


In [172]:
result = penalty_function_method(f_ex5, initial_x, initial_c, 0.000001)

print("C at optimal: ", result[1])
print("Optimal x: ", result[0])
print("Optimal f(x): ", f_ex5(result[0]))

C at optimal:  2097152
Optimal x:  [  9.99999509e-01   2.39204391e-07   9.90253476e-04   1.00000001e+00]
Optimal f(x):  (0.99999901893313459, [], [-9.8203790999118468e-07, -1.4360205247143654e-09])


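The augmented Lagrangian method finds the point [0, 1, -1, 1] with f(x) = 0. The penalty function method converges to a different feasible point, [1, 0, 0, 1], with f(x) = 1. On the feasible set the first constraint gives $x_1^2+x_2^2=1$, so the objective reduces to $1+x_3^3+(1-x_4)^2$ with $x_3\in[-1,1]$; its smallest value is 0, attained at $x_3=-1$, $x_4=1$. The augmented Lagrangian result is therefore a global minimizer, while the penalty result is not a minimizer. The cause of the difference is unclear: it may be a bug in the code, or the two methods may simply follow different paths from the same starting point.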

In [182]:
print("Optimal suggested by Augmented Lagrangian: ", f_ex5([0, 1, -1, 1]))
print("Optimal suggested by Penalty Method: ", f_ex5([1, 0, 0, 1]))

Optimal suggested by Augmented Lagrangian:  (0, [], [0, 0])
Optimal suggested by Penalty Method:  (1, [], [0, 0])
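
Both candidate points can also be compared through the first-order optimality conditions. The sketch below is an illustration only: `objective`, `constraints`, `num_jac`, and `kkt_residual` are hypothetical helpers, the derivatives are approximated by central differences, and the multipliers are estimated by least squares under the convention $\nabla f = \sum_k \mu_k \nabla h_k$.

```python
import numpy as np

def objective(x):
    return x[0] ** 2 + x[1] ** 2 + x[2] ** 3 + (1 - x[3]) ** 2

def constraints(x):
    return np.array([x[0] ** 2 + x[1] ** 2 - 1, x[0] ** 2 + x[2] ** 2 - 1])

def num_jac(fun, x, step=1e-6):
    """Central-difference Jacobian of fun at x (rows: outputs, columns: inputs)."""
    x = np.asarray(x, dtype=float)
    cols = []
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = step
        diff = np.atleast_1d(fun(x + e)) - np.atleast_1d(fun(x - e))
        cols.append(diff / (2 * step))
    return np.stack(cols, axis=1)

def kkt_residual(x):
    """Norm of grad f - J^T mu for the least-squares multiplier estimate mu."""
    g = num_jac(objective, x)[0]
    J = num_jac(constraints, x)
    mu, *_ = np.linalg.lstsq(J.T, g, rcond=None)
    return np.linalg.norm(g - J.T @ mu)

res_al = kkt_residual([0, 1, -1, 1])
res_pen = kkt_residual([1, 0, 0, 1])
print("KKT residual at [0, 1, -1, 1]:", res_al)
print("KKT residual at [1, 0, 0, 1]:", res_pen)
```

Both residuals vanish up to the finite-difference error, so both points are stationary for the constrained problem. The first-order conditions alone therefore do not separate them; the objective values 0 and 1 do.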


In [147]:
# TEST CODE FOR METHOD VALIDATION, optimal [0.5 0.5]

def f_constrained2(x):
    return sum([i**2 for i in x]),[],[sum(x)-1]

print("Augmented: ", augmented_langrangian_method(f_constrained2,[0,0],1,1))
print("Penalty: ", penalty_function_method(f_constrained2,[0,0],1))

Augmented:  (array([ 0.5,  0.5]), 2)
Penalty:  (array([ 0.49999618,  0.49999618]), 262144)
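
For comparison, the textbook form of the augmented Lagrangian method updates the multipliers with the constraint values at the minimizer that was just computed. A minimal sketch of that variant on the validation problem follows. It assumes SciPy's default `minimize` solver is accurate enough for the subproblems, and the names `f_test`, `h_test`, and `al_sketch` are hypothetical, not part of the code above.

```python
import numpy as np
from scipy.optimize import minimize

def f_test(x):
    # Objective of the validation problem: x1^2 + x2^2
    return float(np.sum(np.asarray(x) ** 2))

def h_test(x):
    # Equality constraint of the validation problem: x1 + x2 - 1 = 0
    return np.array([np.sum(x) - 1.0])

def al_sketch(f, h, x0, mu0, c0, tol=1e-5, max_outer=50):
    x = np.asarray(x0, dtype=float)
    mu = np.asarray(mu0, dtype=float)
    c = float(c0)
    for _ in range(max_outer):
        # Augmented Lagrangian for the current multipliers and penalty
        L_A = lambda y: f(y) - mu @ h(y) + 0.5 * c * np.sum(h(y) ** 2)
        x = minimize(L_A, x).x
        if np.linalg.norm(h(x)) <= tol:
            break
        # Multiplier update evaluated at the new minimizer x
        mu = mu - c * h(x)
        c = 2.0 * c
    return x, mu, c

x_star, mu_star, c_final = al_sketch(f_test, h_test, [0.0, 0.0], [0.0], 1.0)
print("x:", x_star, "mu:", mu_star, "c:", c_final)
```

Starting from a zero multiplier, the iterates approach $x = (0.5, 0.5)$ while the multiplier estimate approaches 1.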


### Exercise 5.3

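The relations between the variables follow from the stationarity condition. With the Lagrangian
$$
L(x,\lambda,\mu) = (x_1^2+x_2^2)-\mu_1(x_1+x_2-1),
$$
the gradient with respect to $x$ is
$$
\begin{align}
\nabla_x L(x,\lambda,\mu) &= \begin{pmatrix}2x_1-\mu_1\\ 2x_2-\mu_1\end{pmatrix}.
\end{align}
$$
Setting it to zero gives $x_1=x_2=\mu_1/2$, so
$$
\mu_1=x_1+x_2.
$$
The complementarity condition $\mu_1(x_1+x_2-1)=0$ leaves two cases. If $\mu_1=0$, then $x_1=x_2=0$, which violates $x_1+x_2\geq 1$. Otherwise the constraint is active, $x_1+x_2-1=0$, and together with $x_1=x_2$ this gives $x_1=x_2=0.5$ and $\mu_1=x_1+x_2=1\geq 0$. Because the objective is convex and the constraint is linear, this KKT point is the minimizer, with optimal value $0.5$.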
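
The two complementarity cases can also be checked numerically. This is a minimal sketch, assuming NumPy is available; the names `x_inactive`, `case1_feasible`, `A`, and `b` are illustrative. In the active case, stationarity and the active constraint form a linear system in $(x_1, x_2, \mu_1)$.

```python
import numpy as np

# Case 1: mu_1 = 0. Stationarity 2*x_i - mu_1 = 0 then forces x = (0, 0).
x_inactive = np.zeros(2)
case1_feasible = bool(x_inactive.sum() >= 1)  # checks x_1 + x_2 >= 1

# Case 2: the constraint is active. Unknowns (x_1, x_2, mu_1):
#   2*x_1 - mu_1 = 0
#   2*x_2 - mu_1 = 0
#   x_1 + x_2    = 1
A = np.array([[2.0, 0.0, -1.0],
              [0.0, 2.0, -1.0],
              [1.0, 1.0, 0.0]])
b = np.array([0.0, 0.0, 1.0])
x1, x2, mu1 = np.linalg.solve(A, b)

print("Case 1 feasible:", case1_feasible)
print("Case 2 solution:", x1, x2, "multiplier:", mu1)
```

Only the active case yields a feasible point with a nonnegative multiplier.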


### Exercise 5.4

In essence, the stationarity condition
$$
\begin{align}
\nabla_x L(x,\lambda,\mu) &= 0
\end{align}
$$
expresses the balance
$$
\begin{align}
\nabla f(x) &= \sum_{k=1}^K\mu_k\nabla h_k(x).
\end{align}
$$
So at the optimum $x^*$ the Lagrange multipliers $\mu_k$ balance the gradients of the constraints against the gradient of the objective function.
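
To connect this with the augmented Lagrangian named in the exercise, the remaining step can be written out. The sketch assumes the form $L_A(x,\mu,\varrho)=f(x)-\sum_{k=1}^K\mu_kh_k(x)+\frac{\varrho}{2}\sum_{k=1}^Kh_k(x)^2$, which matches the sign convention of the code in Exercise 5.2 (with $c$ in place of $\varrho$) but is not stated in the exercise itself. It also assumes that a constraint qualification holds at $x^*$, so that the multipliers $\mu^*$ exist.
$$
\begin{align}
\nabla_x L_A(x^*,\mu^*,\varrho) &= \nabla f(x^*)-\sum_{k=1}^K\mu_k^*\nabla h_k(x^*)+\varrho\sum_{k=1}^K h_k(x^*)\nabla h_k(x^*)\\
&= \nabla f(x^*)-\sum_{k=1}^K\mu_k^*\nabla h_k(x^*) && \text{since } h_k(x^*)=0 \text{ for all } k\\
&= 0 && \text{by the stationarity condition.}
\end{align}
$$
The penalty term drops out for every value of $\varrho$ because $x^*$ is feasible, so the result does not depend on the penalty parameter.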