In [1]:
from IPython.core.display import HTML
HTML("<style>.container { width:95% !important; }</style>")

# Lecture 11, Solution methods for multiobjective optimization 

## Reminder:

### Mathematical formulation of multiobjective optimization problems

Multiobjective optimization problems are often formulated as
$$
\begin{align}
\min \quad &\{f_1(x),\ldots,f_k(x)\}\\
\text{s.t.} \quad & g_j(x) \geq 0\text{ for all }j=1,\ldots,J\\
& h_q(x) = 0\text{ for all }q=1,\ldots,Q\\
&a_i\leq x_i\leq b_i\text{ for all } i=1,\ldots,n\\
&x\in \mathbb R^n,
\end{align}
$$
where $$f_1,\ldots,f_k:\{x\in\mathbb R^n: g_j(x) \geq 0 \text{ for all }j=1,\ldots,J \text{ and } h_q(x) = 0\text{ for all }q=1,\ldots,Q\}\to\mathbb R$$ are the objective functions.

## Pareto optimality
A feasible solution $x_1$ is Pareto optimal for the above multiobjective optimization problem if there does not exist a feasible solution $x_2$, $x_1\neq x_2$, such that
$$
\left\{
\begin{align}
&f_i(x_2)\leq f_i(x_1)\text{ for all }i\in \{1,\ldots,k\}\\
&f_j(x_2)<f_j(x_1)\text{ for some }j\in \{1,\ldots,k\}.\\
\end{align}
\right.
$$
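
The definition can be checked numerically for a finite set of candidate objective vectors. The following is a minimal sketch; the helper names `dominates` and `nondominated` and the three candidate vectors are illustrative assumptions, not part of the lecture's code.

```python
def dominates(z2, z1):
    """Return True if objective vector z2 dominates z1 (minimization)."""
    no_worse = all(a <= b for a, b in zip(z2, z1))
    strictly_better = any(a < b for a, b in zip(z2, z1))
    return no_worse and strictly_better


def nondominated(points):
    """Keep the objective vectors that no other vector in `points` dominates."""
    return [z for z in points
            if not any(dominates(other, z) for other in points)]


# Three candidate objective vectors for a two-objective minimization problem
candidates = [(1.0, 3.0), (2.0, 2.0), (2.5, 2.5)]
front = nondominated(candidates)
print(front)
```

Here `(2.5, 2.5)` is removed because `(2.0, 2.0)` is better in both objectives, while the remaining two vectors are mutually incomparable.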

### Basic concepts

* There is no single optimal solution; instead, we have a set of solutions called the **Pareto optimal set**.
* The objectives do not all have the same optimal solution → the notion of optimality needs to be modified

### PARETO OPTIMALITY (PO)
* A solution is Pareto optimal if none of the objectives can be improved without impairing at least one of the others

It means:                   
$$
\text{“Take from Sami to pay Anna”}
$$

* Pareto optimal solutions are located on the lower-left boundary of the feasible region in the objective space (for minimization problems)
![alt text](images/po4.png)

* There are two spaces connected to the problem: the space $\mathbb R^n$ is called the decision space and $\mathbb R^k$ is called the objective space. 

1. **Decision space**: includes the **Pareto optimal solution set**
2. **Objective space**: consists of the image of Pareto optimal solutions (**Pareto frontier**) 

![alt text](images/basic_definitions2.svg "Multiobjective optimization")



## Some more concepts:

In addition to Pareto optimality, two more concepts are important, which are called the ideal and the nadir vector. 

* **Ideal objective vector $z^{ideal}$:** best values for each objective (when optimized independently)
* **Nadir objective vector $z^{nadir}$:** worst values for each objective in the Pareto optimal set

![alt text](images/nadir_ideal.svg "Nadir and the ideal vectors")

Mathematically the ideal vector $z^{ideal}$ can be defined as having 
$$
z^{ideal}_i = \begin{align}
\min \quad &f_i(x)\\
\text{s.t.} \quad &x\text{ is feasible}
\end{align}
$$
for all $i=1,\ldots,k$ (i.e., by **solving single-objective optimization problems, one for each objective**). 

The nadir vector $z^{nadir}$ on the other hand has
$$
z^{nadir}_i = 
\begin{align}
\max \quad &f_i(x)\\
\text{s.t.} \quad &x\text{ is Pareto optimal},
\end{align}
$$
for all $i=1,\ldots,k$ (**for more than two objectives, this is not as straightforward as calculating the ideal vector**).

## Optimization problem formulation
* By optimizing only one criterion, the rest are not considered
* Objective vs. constraint
* Summation of the objectives
   * adding apples and oranges
* Converting the objectives (e.g., as costs)
   * not easy, includes uncertainties
* Multiobjective formulation reveals interdependences between the objectives

## Example

Consider the multiobjective optimization problem
$$
\min \{f_1(x,y)=x^2+y,\quad f_2(x,y)=1-x\}\\
\text{s.t. }x\in[0,1], y\geq0.
$$

#### Pareto optimal solutions
Now, the set of Pareto optimal solutions is

$$
\{(x,0):x\in[0,1]\}.
$$

How to show this?

Let's show that $(x',0)$ is Pareto optimal for all $x'\in[0,1]$. *The idea of the proof: assume that $(x',0)$ is not Pareto optimal and then deduce a contradiction.*

Let's assume there exists a feasible solution $(x,y) \neq (x',0)$ with $x\in[0,1]$ and $y\geq0$ such that

$$
\left\{
\begin{align}
f_1(x,y)=x^2+y\leq x'^2=f_1(x',0),\textbf{ and}\\
f_2(x,y)=1-x\leq 1-x'=f_2(x',0).
\end{align}
\right.
$$

and

$$
\left\{
\begin{align}
f_1(x,y)=x^2+y< x'^2 =f_1(x',0)\textbf{ or}\\
f_2(x,y)=1-x< 1-x'=f_2(x',0).
\end{align}
\right.
$$

The second inequality in the first system gives $x\geq x'$. Since $x,x'\geq 0$, this implies $x^2\geq x'^2$, and the first inequality of the same system then yields

$$
y\leq x'^2-x^2\leq 0.
$$

Since $y\geq0$, this gives $y=0$ and $x'^2-x^2=0$. Because $x$ and $x'$ are both nonnegative, it follows that $x=x'$.

Hence $(x,y)=(x',0)$, which contradicts the assumption $(x,y)\neq(x',0)$; in particular, neither strict inequality in the second system can hold. Therefore, $(x',0)$ has to be Pareto optimal.

Now, we show that no other feasible solution can be Pareto optimal. Let's take a solution $(x,y)$, where $x\in[0,1]$ and $y>0$, and show that it is not Pareto optimal:

By choosing solution $(x,0)$, we have 

$$
\left\{
\begin{align}
f_1(x,0)=x^2<x^2+y=f_1(x,y) ,\text{ and}\\
f_2(x,0)=1-x\leq 1-x=f_2(x,y).
\end{align}
\right.
$$

Thus, the solution $(x,y)$ cannot be Pareto optimal.
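
The result can also be sanity-checked numerically on a finite grid of feasible points. This is only a sketch under assumptions: the grid step of 0.1 and the cut-off $y\leq 1$ are arbitrary choices (the problem itself places no upper bound on $y$), and the helper names are illustrative.

```python
def f(x, y):
    """Objective vector (f_1, f_2) of the example problem."""
    return (x**2 + y, 1 - x)


def dominates(z2, z1):
    """True if z2 is no worse in every objective and better in at least one."""
    return (all(a <= b for a, b in zip(z2, z1))
            and any(a < b for a, b in zip(z2, z1)))


# Assumed grid: x in [0, 1] and y in [0, 1] with step 0.1
grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]
values = {point: f(*point) for point in grid}

# A grid point is kept if no other grid point dominates it
pareto_points = [p for p in grid
                 if not any(dominates(values[q], values[p]) for q in grid)]
print(pareto_points)
```

Every grid point that survives the filter has $y=0$, in line with the proof above.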

#### Ideal
Now

$$
\begin{align}
\min f_1 = x^2+y\\
\text{s.t. }x\in[0,1],\ y\geq0
\end{align}
= 0
$$

and

$$
\begin{align}
\min f_2 = 1-x\\
\text{s.t. }x\in[0,1],\ y\geq0
\end{align}
= 0.
$$

Thus, the ideal is

$$
z^{ideal} = (0,0)^T
$$

#### Nadir
Now,

$$
\begin{align}
\max x^2+y\\
\text{s.t. }x\in[0,1],\ y=0
\end{align}
= 1
$$

and

$$
\begin{align}
\max 1-x\\
\text{s.t. }x\in[0,1],\ y=0
\end{align}
= 1.
$$

Thus, 

$$
z^{nadir}=(1,1)^T.
$$
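
As a numerical cross-check, both vectors can be approximated from a sample of the Pareto optimal set $\{(x,0):x\in[0,1]\}$. This is a minimal sketch; the sample size of 101 points and the variable names are assumptions made here. For this problem the individual minima are attained at Pareto optimal points, so the componentwise minimum over the sample also recovers the ideal.

```python
# Sample the Pareto optimal set {(x, 0) : x in [0, 1]} (101 points is an arbitrary choice)
pareto_sample = [(i / 100, 0.0) for i in range(101)]
front = [(x**2 + y, 1 - x) for x, y in pareto_sample]

# Componentwise best and worst values over the sampled Pareto front
z_ideal_example = [min(z[i] for z in front) for i in range(2)]
z_nadir_example = [max(z[i] for z in front) for i in range(2)]
print(z_ideal_example, z_nadir_example)
```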

# What does solving a multiobjective optimization problem mean?

* **Find all Pareto optimal solutions**
  * As we learned in the previous lecture, there can be infinitely many Pareto optimal solutions for problems having real valued variables $\rightarrow$ extremely difficult and possible only in some simple special cases

* **Find a set of solutions that approximate the set of all Pareto optimal solutions**
  * How to evaluate the goodness of the approximation? (closeness, spread, ...)
  * The number of solutions required for a good approximation grows exponentially with the number of objectives!
  * Works well with two objectives and, in some cases, for three objectives

* **Find a solution/solutions that best satisfies the preferences of a decision maker**
  * Usually, in practical problems, one solution has to be finally selected for further analysis
  * Sometimes, more than one (but not that many) are needed $\rightarrow$, e.g., choose the best design for different types of cars to be manufactured (small and economical sedan, spacious wagon, efficient sports model, etc.)
  * The approach does not depend on the number of objectives

If you want to know more about the topic of this lecture, I urge you to read Professor Miettinen's book *Nonlinear Multiobjective Optimization*.

![Nonlinear Multiobjective Optimization](images/Miettinen2.gif)

## Scalarization
* One way to solve a multiobjective optimization problem is to convert it to a single objective subproblem whose solution is Pareto optimal for the original problem
* The subproblem is called a *scalarization* and it can be solved by using a suitable single objective optimization method
* By changing the values of the parameters in the scalarization, different (Pareto optimal) solutions can be computed
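
As an illustration, one possible scalarization is a weighted sum of the objectives with strictly positive weights, whose minimizers are Pareto optimal. The sketch below applies it to the earlier example problem $\min\{x^2+y,\ 1-x\}$. The choice of the weighted sum, the weight values, the brute-force grid search and the cut-off $y\leq 1$ are all assumptions made for this sketch.

```python
def objectives(x, y):
    """Objective vector of the example problem min {x^2 + y, 1 - x}."""
    return (x**2 + y, 1 - x)


def weighted_sum(weights, x, y):
    """Scalarized objective: weighted sum of the objective values."""
    return sum(w * fi for w, fi in zip(weights, objectives(x, y)))


def solve_scalarized(weights, steps=100):
    """Minimize the scalarization by brute force on a grid (x, y in [0, 1])."""
    grid = [(i / steps, j / steps)
            for i in range(steps + 1) for j in range(steps + 1)]
    return min(grid, key=lambda point: weighted_sum(weights, *point))


# Different weights (the parameters of the scalarization) give different solutions
for weights in [(1.0, 1.0), (2.0, 1.0)]:
    x_opt, y_opt = solve_scalarized(weights)
    print(weights, (x_opt, y_opt), objectives(x_opt, y_opt))
```

Both computed solutions have $y=0$ and therefore belong to the Pareto optimal set derived for the example.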

## Classification of methods

Methods for multiobjective optimization are often characterized by the involvement of the decision maker in the process.

The types of methods are
* **no preference methods**, where the decision maker does not play a role,
* **a priori methods**, where the decision maker first gives his/her preference information and then the optimization method finds the best match to that preference information,
* **a posteriori methods**, where the optimization methods try to characterize all/find a good representation of the Pareto optimal solutions and the decision maker chooses the most preferred one of those,
* **interactive methods**, where the optimization method and the decision maker alternate in an iterative search for the most preferred solution.

## Multiple Criteria Decision Making (MCDM)
* The related research field is called multiple criteria decision making
* More information on the website of the <a href="http://www.mcdmsociety.org/">International Society on MCDM</a>

##  Our example problem for this lecture

We study a hypothetical decision problem of buying a car. You can choose a car with power (denoted by $p$) between 50 and 200 kW and average consumption (denoted by $c$) between 3 and 10 l per 100 km. In addition to the power and the average consumption, you need to decide the volume of the cylinders (denoted by $v$), which may be between 1000 $cm^3$ and 4000 $cm^3$. Finally, the price of the car is given by the function

$$
P = \left(\sqrt{\frac{p-50}{50}}+\left(\frac{p-50}{50}\right)^2+0.3(10-c)+10^{-5}\left(v-\left(1000+3000\frac{p-50}{150}\right)\right)^2\right)\cdot 10000+5000
$$

in euros. This problem can be formulated as a multiobjective optimization problem

$$
\begin{align}
\min \quad & \{c,-p,P\},\\
\text{s.t. }\quad
&50\leq p\leq 200\\
&3\leq c\leq 10\\
&1000\leq v\leq 4000.
\end{align}
$$

In [2]:
#Let us define a Python function which returns the objective values of the car problem
import math
def car_problem(c,p,v):
    """Return [consumption, -power, price] for consumption c, power p and cylinder volume v."""
    return [#Objective function values
        c,-p,
        (math.sqrt((p-50.)/50.)+((p-50.)/50.)**2+
        0.3*(10.-c)+0.00001*(v-(1000.+3000.*(p-50.)/150.))**2)*10000.
        +5000.] 

In [3]:
print("Car with 3 l/100km consumption, 50kW and 1000cm^3 engine would cost "
      +str(car_problem(3,50,1000)[2])+"€")
print("Car with 3 l/100km consumption, 100kW and 2000cm^3 engine would cost "
      +str(car_problem(3,100,2000)[2])+"€")
print("Car with 3 l/100km consumption, 100kW and 1000cm^3 engine would cost "
      +str(car_problem(3,100,1000)[2])+"€")

Car with 3 l/100km consumption, 50kW and 1000cm^3 engine would cost 26000.0€
Car with 3 l/100km consumption, 100kW and 2000cm^3 engine would cost 46000.0€
Car with 3 l/100km consumption, 100kW and 1000cm^3 engine would cost 146000.0€


## Normalization of the objectives

**In many of the methods, the normalization of the objectives is necessary.**

We can normalize the objectives using the nadir and ideal vectors and setting the normalized objective as
$$ \tilde f_i = \frac{f_i-z_i^{ideal}}{z_i^{nadir}-z_i^{ideal}}$$
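
The formula can be sketched as a small helper. The function name `normalize` and the two-objective demo values below are assumptions chosen only for illustration.

```python
def normalize(z, z_ideal, z_nadir):
    """Scale each objective value so that the ideal maps to 0 and the nadir to 1."""
    return [(zi - lo) / (hi - lo)
            for zi, lo, hi in zip(z, z_ideal, z_nadir)]


# Illustrative values (assumed for this sketch, not computed from a real problem)
z_ideal_demo = [0.0, -10.0]
z_nadir_demo = [4.0, -2.0]
print(normalize([1.0, -4.0], z_ideal_demo, z_nadir_demo))
```

With these demo values the first objective is a quarter of the way from the ideal to the nadir and the second three quarters of the way.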

## Calculating the ideal

**Finding the ideal for problems is usually easy, if you can optimize the objective functions separately.**

For the car problem, the ideal can be computed easily using scipy:

In [6]:
#Calculating the ideal
from scipy.optimize import minimize
import ad #Third-party automatic differentiation package
def calc_ideal(f):
    ideal = [0]*3 #Because there are three objectives
    solutions = [] #list for storing the actual solutions, which give the ideal
    bounds = ((3,10),(50,200),(1000,4000)) #Bounds of the problem
    starting_point = [3,50,1000]
    for i in range(3):
        res=minimize(
            #Minimize one objective at a time
            lambda x: f(x[0],x[1],x[2])[i], starting_point, method='SLSQP'
            #Jacobian using automatic differentiation (note: SLSQP can estimate gradients itself with some extra function evaluations)
            ,jac=ad.gh(lambda x: f(x[0],x[1],x[2])[i])[0]
            #bounds given above
            ,bounds = bounds
            ,options = {'disp':True, 'ftol': 1e-20, 'maxiter': 1000})
        solutions.append(f(res.x[0],res.x[1],res.x[2]))
        ideal[i]=res.fun
    return ideal,solutions

In [7]:
ideal, solutions= calc_ideal(car_problem)
print ("ideal is "+str(ideal))

Optimization terminated successfully    (Exit mode 0)
            Current function value: 3.0
            Iterations: 1
            Function evaluations: 1
            Gradient evaluations: 1
Optimization terminated successfully    (Exit mode 0)
            Current function value: -200.0
            Iterations: 5
            Function evaluations: 5
            Gradient evaluations: 5
Optimization terminated successfully    (Exit mode 0)
            Current function value: 5000.0
            Iterations: 4
            Function evaluations: 3
            Gradient evaluations: 3
ideal is [3.0, -200.0, 5000.0]
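
Because the variables only have box bounds, the ideal can be cross-checked without a gradient-based solver by evaluating the objectives on a coarse grid. This is a sketch under assumptions: the grid spacing is an arbitrary choice that happens to include the bounds of every variable, and `car_problem` is repeated only to keep the block self-contained.

```python
import itertools
import math


def car_problem(c, p, v):
    """Objectives of the car problem: consumption, negative power, price."""
    price = (math.sqrt((p - 50.) / 50.) + ((p - 50.) / 50.)**2
             + 0.3 * (10. - c)
             + 0.00001 * (v - (1000. + 3000. * (p - 50.) / 150.))**2) * 10000. + 5000.
    return [c, -p, price]


# Assumed coarse grid that contains the bounds of every variable
c_values = range(3, 11)             # 3, 4, ..., 10
p_values = range(50, 201, 10)       # 50, 60, ..., 200
v_values = range(1000, 4001, 200)   # 1000, 1200, ..., 4000

evaluations = [car_problem(c, p, v)
               for c, p, v in itertools.product(c_values, p_values, v_values)]

# Componentwise minimum over all grid points
grid_ideal = [min(z[i] for z in evaluations) for i in range(3)]
print(grid_ideal)
```

The grid minima agree with the ideal reported by the solver; the price minimum of 5000 is attained at $c=10$, $p=50$, $v=1000$, where every term of the price function vanishes.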


## Pay-off table method

**Finding the nadir value is, however, usually much more complicated.**

Usually, the nadir value is estimated using the so-called pay-off table method.

The pay-off table method does not guarantee finding the exact nadir for problems with more than two objectives. <!--(One of your exercises this week will be to show this.)--> 

The method is, however, a generally accepted way of approximating the nadir vector.

In the pay-off table method:
1. each objective is minimized individually, and the objective vector at each minimizer is added as a row of the table,
2. the nadir is estimated by the maximum of each objective (column) in the table,
3. the ideal values are located on the diagonal of the pay-off table.

![alt text](images/payoff.jpg "Pay-off table")

### $x^{(*,i)} =$ optimal solution for $f_i$ 
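
The steps of the method can be sketched as follows. The function name `payoff_table_estimates` is an assumption, and the table is built by hand for the earlier example $\min\{x^2+y,\ 1-x\}$. Note that $f_2$ alone does not determine $y$, so the sketch assumes the Pareto optimal minimizer $(1,0)$ is used; a different minimizer of $f_2$ would give a different, worse estimate.

```python
def payoff_table_estimates(payoff):
    """Return (ideal, nadir estimate) from a square pay-off table.

    Row i holds the objective vector at a minimizer of objective i.
    """
    k = len(payoff)
    ideal = [payoff[i][i] for i in range(k)]                    # diagonal
    nadir = [max(row[j] for row in payoff) for j in range(k)]   # column maxima
    return ideal, nadir


# Pay-off table of the earlier example min {x^2 + y, 1 - x}:
# f_1 is minimized at (0, 0) and f_2 at (1, 0)
payoff = [[0.0, 1.0],
          [1.0, 0.0]]
print(payoff_table_estimates(payoff))
```

The result reproduces $z^{ideal}=(0,0)^T$ and $z^{nadir}=(1,1)^T$ from the example.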

### The nadir for the car selection problem
Using the *solutions* that we returned while calculating the ideal, the pay-off table becomes

In [8]:
for solution in solutions:
    print(solution) 

[3.0, -50.0, 26000.0]
[3.0, -200.0, 1033320.5080756888]
[10.0, -50.0, 5000.0]


Thus, the estimate of the nadir vector is
$$(10,-50,1033320.5080756888)$$

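The first two components agree with the true nadir vector. The third component, however, overestimates it: the minimizer of $-p$ is not unique, and the solution $(c,p,v)=(3,200,1000)$ returned by the solver is dominated by $(3,200,4000)$, which has the same consumption and power but a price of only about $133320.51$. We nevertheless use the pay-off table estimate in what follows.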

### Normalized car problem

In [9]:
#Let us define a Python function which returns the normalized objective values of the car problem
import math
def car_problem_normalized(c,p,v):
    z_ideal = [3.0, -200.0, 5000]
    z_nadir = [10,-50,1033320.5080756888]
    z = car_problem(c,p,v) 
    return [(zi-zideali)/(znadiri-zideali) for 
            (zi,zideali,znadiri) in zip(z,z_ideal,z_nadir)]

<a href="https://docs.python.org/3.3/library/functions.html#zip">the zip function</a> in Python

In [10]:
print("Normalized value of the car problem at (3,50,1000) is "
      +str(car_problem_normalized(3,50,1000)))
print("Normalized value of the car problem at (3,125,2500) is "
      +str(car_problem_normalized(3,125,2500)))
print("Normalized value of the car problem at (10,100,1000) is "
      +str(car_problem_normalized(10,100,1000)))

Normalized value of the car problem at (3,50,1000) is [0.0, 1.0, 0.020421648537670038]
Normalized value of the car problem at (3,125,2500) is [0.0, 0.5, 0.054212133547970276]
Normalized value of the car problem at (10,100,1000) is [1.0, 0.6666666666666666, 0.11669513450097163]


**So, value 1 now indicates the worst value on the Pareto frontier (as estimated by the pay-off table) and value 0 indicates the best value.**

Let's set the ideal and nadir for later reference:

In [11]:
z_ideal = [3.0, -200.0, 5000]
z_nadir = [10.,-50,1033320.5080756888]

**From now on, we will deal with the normalized problem, although we write just $f$.** The aim of this is to simplify the presentation.