# Exercise 6
* Due: 18.2.2016 at noon
* Max points: 8

## General rules
* Send the source code or the written notes of your answers as an attachment to markus.hartikainen@jyu.fi before the due time
* You will get feedback about your answers a week from the due date
* From the second week on, the exercises will be given on the previous Wednesday's lecture

## Exercises

1. **2 points** Solve the optimization problem
$$
\begin{align}
\min \quad &\log(x_1^2+1)+x_2^4+x_1x_3\\
\text{s.t. }\quad &x_1^3-x_2^2\geq 1\\
& x_1,x_3\geq 0, \ x_2\in\mathbb R
\end{align}
$$
using any method that you have available.
2. **2 points** Study the multiobjective optimization problem
$$
\begin{align}
\min & \{\|x-(1,0)\|^2,\|x-(0,1)\|^2\}\\
\text{s.t. }&x\in \mathbb R^2.
\end{align}
$$
Characterize algebraically the full set of Pareto optimal solutions.
3. **2 points** Calculate the ideal and nadir vectors for the problem of 2. You can use any methods available.
4. **2 points** Try to generate a representative set of Pareto optimal solutions using the weighting method for the problem of 2. Compare this set to the set of Pareto optimal solutions from 2. What do you notice?

### Task 1

In [62]:
import math
def ex6task1(x):
    return math.log(x[0] ** 2 + 1) + x[1] ** 4 + x[0] * x[2]
def ex6constraint1(x):
    return x[0] ** 3 - x[1] ** 2 - 1

In [101]:
bounds = ((0, float("inf")), (-float("inf"), float("inf")), (0, float("inf")))
initial_x = [1., 0., 1.]

In [102]:
import numpy as np
from scipy.optimize import minimize
import ad

constraints = ({'type': 'ineq',
                'fun' : ex6constraint1,
                'jac' : ad.gh(ex6constraint1)[0]})

res = minimize(
    ex6task1,
    initial_x, 
    method='SLSQP',
    jac = ad.gh(ex6task1)[0],
    constraints = constraints,
    bounds = bounds,
    options = {'disp' : True})
print("Given optimal ", res.x)
print("f(x) at optimal ", ex6task1(res.x))
print("Constraint value at optimal ", ex6constraint1(res.x))

Optimization terminated successfully.    (Exit mode 0)
            Current function value: 0.69314718056
            Iterations: 2
            Function evaluations: 2
            Gradient evaluations: 2
Given optimal  [ 1.  0.  0.]
f(x) at optimal  0.69314718056
Constraint value at optimal  -1.33226762955e-15


#### How about penalty function method?

In [83]:
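def alpha(x, f):
    # f(x) returns (objective value, inequality values g_j(x) >= 0, equality values h_k(x) = 0)
    (_, ieq, eq) = f(x)
    # Quadratic penalty: only violated inequalities (negative values) contribute
    return sum(min(0, ieq_j) ** 2 for ieq_j in ieq) + sum(eq_k ** 2 for eq_k in eq)
def penalized_function(x, f, r):
    # Objective plus the penalty term weighted by the penalty parameter r
    return f(x)[0] + r * alpha(x, f)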

In [103]:
from scipy.optimize import minimize

def ex6task1_constrained(x):
    return ex6task1(x), [ex6constraint1(x), x[0], x[2]], []

res = minimize(
    lambda x: penalized_function(x, ex6task1_constrained, 100000),
    initial_x,
    method = 'Nelder-Mead',
    options = {'disp': True})
print("Given optimal ", res.x)
print("f(x) at optimal ", ex6task1(res.x))
print("Constraint value at optimal ", ex6constraint1(res.x))

Optimization terminated successfully.
         Current function value: 0.693230
         Iterations: 104
         Function evaluations: 195
Given optimal  [  1.00006057e+00  -4.82302320e-03   2.27404039e-05]
f(x) at optimal  0.693230493818
Constraint value at optimal  0.000158462259558


Thoughts on this: the penalty function method does not look bad here. The problem is simple, so it runs fast despite needing many more iterations and function evaluations than SLSQP (104 and 195, versus 2 and 2). The penalty function also works fine as a constraint handling mechanism here.
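To illustrate why a large penalty parameter was used, here is a minimal sketch that evaluates the penalized objective at one feasible and one slightly infeasible point for a few values of $r$. The helper names and the two test points are assumptions chosen for illustration; they are not part of the cells above.

```python
import math

def objective(x):
    return math.log(x[0] ** 2 + 1) + x[1] ** 4 + x[0] * x[2]

def inequality_values(x):
    # Each entry must be >= 0 for x to be feasible
    return [x[0] ** 3 - x[1] ** 2 - 1, x[0], x[2]]

def penalty(x):
    # Quadratic penalty on violated inequalities only
    return sum(min(0.0, g) ** 2 for g in inequality_values(x))

def penalized(x, r):
    return objective(x) + r * penalty(x)

feasible = [1.0, 0.0, 0.0]    # satisfies all constraints
infeasible = [0.9, 0.0, 0.0]  # violates x1^3 - x2^2 >= 1 (illustrative point)

for r in (1.0, 100.0, 100000.0):
    print("r =", r, "feasible:", penalized(feasible, r),
          "infeasible:", penalized(infeasible, r))
```

With $r = 1$ the infeasible point has the lower penalized value, so an unconstrained minimizer would drift outside the feasible region; with larger $r$ the violation dominates.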

After trying different starting points, the best f(x) I can find is approximately 0.69, which matches $\log 2 \approx 0.693$ at $x = (1, 0, 0)$.
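This value can be cross-checked by hand. The constraint $x_1^3 \geq 1 + x_2^2$ forces $x_1 \geq 1$, and with $x_3 \geq 0$ the terms $x_2^4$ and $x_1x_3$ are non-negative, so $f(x) \geq \log 2$ on the feasible set, with equality at $(1, 0, 0)$. The following is a rough numerical sanity check of that bound, not a proof; the sampling ranges and function names are assumptions.

```python
import math
import random

def objective(x1, x2, x3):
    return math.log(x1 ** 2 + 1) + x2 ** 4 + x1 * x3

def sample_feasible(rng):
    # Pick x2 and x3 freely, then choose x1 so that x1^3 - x2^2 >= 1 holds
    x2 = rng.uniform(-2.0, 2.0)
    x3 = rng.uniform(0.0, 2.0)
    x1_min = (1.0 + x2 ** 2) ** (1.0 / 3.0)
    x1 = x1_min + rng.uniform(0.0, 2.0)
    return x1, x2, x3

rng = random.Random(0)  # fixed seed so the check is repeatable
lower_bound = math.log(2.0)
best_sampled = min(objective(*sample_feasible(rng)) for _ in range(10000))

print("log(2)       =", lower_bound)
print("best sampled =", best_sampled)
```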

### Task 2

The first objective function is minimized to zero at $x = (1, 0)$ and the second at $x = (0, 1)$. The full set of Pareto optimal solutions is therefore the line segment between these two points. This is intuitive, as the objectives measure the distance from the solution to each of the two points: any point off the segment can be projected onto it without increasing either distance, while moving along the segment improves one objective only at the expense of the other. This makes the full set of Pareto optimal solutions
$$
\{x \in \mathbb R^2 \mid x_1+x_2=1,\ x_1, x_2\geq 0\}
$$
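The projection argument can be checked numerically. The sketch below projects random points of the plane onto the segment from $(1, 0)$ to $(0, 1)$ and verifies that neither squared-distance objective gets worse. The function names, the sampling box and the tolerance are assumptions made for this illustration.

```python
import random

A = (1.0, 0.0)
B = (0.0, 1.0)

def objectives(x):
    # Squared distances to A and B, as in the problem statement
    f1 = (x[0] - A[0]) ** 2 + (x[1] - A[1]) ** 2
    f2 = (x[0] - B[0]) ** 2 + (x[1] - B[1]) ** 2
    return f1, f2

def project_to_segment(x):
    # Orthogonal projection onto the segment from A to B, clamped to its ends
    dx, dy = B[0] - A[0], B[1] - A[1]
    t = ((x[0] - A[0]) * dx + (x[1] - A[1]) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return (A[0] + t * dx, A[1] + t * dy)

rng = random.Random(1)
never_worse = True
for _ in range(1000):
    p = (rng.uniform(-3, 3), rng.uniform(-3, 3))
    q = project_to_segment(p)
    fp, fq = objectives(p), objectives(q)
    # Small tolerance for floating-point rounding
    if fq[0] > fp[0] + 1e-12 or fq[1] > fp[1] + 1e-12:
        never_worse = False

print("projection never worse:", never_worse)
```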

### Task 3

$z^{ideal}$ is trivially $(0, 0)$, as each objective function can be minimized to zero at its own point ($x = (1, 0)$ for the first and $x = (0, 1)$ for the second).

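$z^{nadir}$ is $(f_1((0, 1)), f_2((1, 0))) = (2, 2)$, since the objectives are squared distances and $\|(0, 1)-(1, 0)\|^2 = 2$. Each objective attains its largest value over the Pareto optimal set at the opposite end of the segment, where the other objective is minimized. (The code in Task 4 works with unsquared distances, for which the corresponding value is $\sqrt{2}$.)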
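The same vectors can be read off a payoff table. The sketch below assumes the squared objectives of the problem statement (with unsquared distances the off-diagonal entries would be $\sqrt{2}$ instead of 2); the variable names are illustrative. For two objectives with unique individual minimizers, the payoff-table estimate of the nadir vector is exact.

```python
def f1(x):
    return (x[0] - 1.0) ** 2 + x[1] ** 2

def f2(x):
    return x[0] ** 2 + (x[1] - 1.0) ** 2

# Individual minimizers of f1 and f2 (each objective is zero at its own point)
minimizers = [(1.0, 0.0), (0.0, 1.0)]

# Payoff table: row i holds both objective values at the minimizer of f_i
payoff = [[f1(x), f2(x)] for x in minimizers]

# Ideal vector: diagonal of the table; nadir estimate: column-wise maxima
z_ideal = [payoff[i][i] for i in range(2)]
z_nadir = [max(row[j] for row in payoff) for j in range(2)]

print("ideal:", z_ideal)
print("nadir (payoff-table estimate):", z_nadir)
```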

### Task 4

In [104]:
import numpy
def ex6fun(x):
    # Note: these are Euclidean distances, not the squared distances of the problem statement
    return [numpy.linalg.norm(x - numpy.array([1, 0])), numpy.linalg.norm(x - numpy.array([0, 1]))]

In [105]:
import math
def ex6_normalized(x):
    z_ideal = [0, 0]
    z_nadir = [math.sqrt(2), math.sqrt(2)]
    z = ex6fun(x) 
    return [(zi - zideali) / (znadiri - zideali) for 
            (zi, zideali, znadiri) in zip(z, z_ideal, z_nadir)]

In [106]:
import numpy as np
from scipy.optimize import minimize
import ad
def weighting_method(f, w):
    points = []
    bounds = ((0, 1), (0, 1))  # Bounds of the problem
    for wi in w:
        # Weighted sum of the objectives for the current weight vector
        weighted_sum = lambda x: sum(np.array(wi) * np.array(f(x)))
        res = minimize(
            weighted_sum,
            [0.5, 0.5],
            method='SLSQP',
            # Jacobian using automatic differentiation
            jac=ad.gh(weighted_sum)[0],
            # Bounds given above
            bounds=bounds,
            options={'disp': False})
        points.append(res.x)
    return points

In [110]:
w = np.random.random((100, 2)) #n random weights
repr = weighting_method(ex6_normalized, w)

  lc_wrt_args = [y*x**(y - 1), 0.]
  qc_wrt_args = [y*(y - 1)*x**(y - 2), 0.]
  lc_wrt_vars[var1] += dh*fdv1
  qc_wrt_vars[var1] += dh*f.d2(var1) + d2h*fdv1**2
  tmp = dh*f.d2c(var1, var2) + d2h*f.d(var1)*f.d(var2)


In [113]:
import matplotlib.pyplot as plt
m = np.array(repr)  # one row per solution, columns are x1 and x2
plt.scatter(m[:, 0], m[:, 1])
plt.show()

When plotted, only 2 or 3 distinct solutions appear on the line, with many overlapping solutions at the extremes. Many of the returned solutions also contain NaN elements, which indicates numerical problems during the optimization, and the warnings (divide by zero) tell the same story. Note that `ex6fun` uses unsquared norms, whose gradient is undefined at $(1, 0)$ and $(0, 1)$, where the norm itself is zero; this is consistent with the divide-by-zero warnings. So, as demonstrated during the class, the weighting method is not the best one for visualizing the whole Pareto front.
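For comparison with the set from Task 2, here is a minimal sketch of what the weighting method gives when the objectives are taken as the squared distances of the problem statement. This is not a rerun of the cells above, and the function name and weight grid are assumptions. Setting the gradient of $w_1\|x-(1,0)\|^2 + w_2\|x-(0,1)\|^2$ to zero gives $x = (w_1, w_2)/(w_1+w_2)$ for $w_1 + w_2 > 0$, which lies on the segment $x_1 + x_2 = 1$.

```python
def weighted_sum_minimizer(w1, w2):
    # Stationary point of w1*||x-(1,0)||^2 + w2*||x-(0,1)||^2 for w1 + w2 > 0
    total = w1 + w2
    return (w1 / total, w2 / total)

# Evenly spaced weights with w1 + w2 = 1 (an illustrative choice)
weights = [(k / 10.0, 1.0 - k / 10.0) for k in range(11)]
points = [weighted_sum_minimizer(w1, w2) for (w1, w2) in weights]

for (w1, w2), x in zip(weights, points):
    print("w = (%.1f, %.1f) -> x = (%.2f, %.2f)" % (w1, w2, x[0], x[1]))
```

Under this assumption the minimizer moves along the segment as the weights vary, so the clustering at the extremes seen above appears to stem from the unsquared objectives rather than from the problem itself.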