# Multi-objective robust decision making (MORDM)


This exercise demonstrates the application of MORDM on the lake model, which was used in earlier exercises.

MORDM has four main steps:

(i)	    **problem formulation** based on a systems analytical problem definition framework 

(ii)	**searching** for candidate solutions that optimize multiple objectives by using multi-objective evolutionary algorithms 

(iii)	generating an ensemble of scenarios to **explore** the effects of uncertainties 

(iv)	using **scenario discovery** to detect the vulnerabilities of candidate solutions and improving the candidate solutions



## Step 1: Problem formulation
### Lake Model

Remember the lake problem used in the assignments in previous weeks. The lake problem is a hypothetical case where the inhabitants of a lake town decide on the amount of annual pollution they release into a lake. If the pollution in the lake passes a threshold, the lake will suffer irreversible eutrophication.

The lake problem has 4 **outcome indicators**: 
   - **max_P**: maximum pollution over time, to be minimized
   - **utility**: economic benefits obtained from polluting the lake, to be maximized
   - **inertia**: the percentage of significant annual changes in the anthropogenic pollution rate, to be maximized
   - **reliability**: the percentage of years where the pollution level is below the critical threshold, to be maximized
    
See the lake model exercise for the formulation of these outcome variables.

The lake problem is characterized by both stochastic uncertainty and **deep uncertainty**. The stochastic uncertainty arises from the natural inflow. To reduce the effect of this stochastic uncertainty, multiple replications are performed and the average over the replications is taken. Deep uncertainty is represented by uncertainty about the mean $\mu$ and standard deviation $\sigma$ of the lognormal distribution characterizing the natural inflow, the natural removal rate of the lake $b$, the natural recycling rate of the lake $q$, and the discount rate $\delta$. The table below specifies the ranges for the deeply uncertain factors, as well as their best estimate or default values. 

|Parameter	|Range	        |Default value|
|-----------|--------------:|------------:|
|$\mu$    	|0.01 – 0.05	|0.02         |
|$\sigma$	|0.001 – 0.005 	|0.0017       |
|$b$      	|0.1 – 0.45	    |0.42         |
|$q$	    |2 – 4.5	    |2            |
|$\delta$	|0.93 – 0.99	|0.98         |


The lake problem in previous assignments had 100 decision **levers**, meaning that the decision makers independently decide on the amount of anthropogenic pollution at every time step (100). Then a 'policy' was a set of values for these 100 levers, which you composed by sampling from the range [0, 0.1].   

In this exercise, we will use a more advanced way of deciding on the amount of anthropogenic pollution. We will use a **closed loop** version of the lake model, meaning that $a_t$ (anthropogenic pollution) is dependent on $X_t$ (the pollution level at time t). For instance, the rate of anthropogenic pollution is lowered if the pollution level is approaching a critical threshold. Here, we use "cubic radial basis functions" following [Quinn et al. 2017](http://www.sciencedirect.com/science/article/pii/S1364815216302250) and formulate $a_t$ as follows:

\begin{equation}
\begin{aligned}
    a_{t} &= \min\Bigg(\max\bigg(\sum\limits_{j=1}^{n} w_{j}\left\vert{\frac{X_{t,i}-c_{j}}{r_{j}}}\right\vert^3, 0.01\bigg), 0.1\Bigg) \\
    \text{s.t.} \quad & -2 \leq c_{j} \leq 2 \\
    & 0 \leq r_{j} \leq 2 \\
    & 0 \leq w_{j} \leq 1 \\
    & \sum\limits_{j=1}^{n} w_{j} = 1
\end{aligned}
\end{equation}


The parameters that define this function also define the pollution strategy over time. Hence, the decision **levers** are the five parameters of this function, namely $c_1$, $c_2$, $r_1$, $r_2$ and $w_1$ (with $w_2 = 1 - w_1$).

Note: $i$ is the index for the realization, given $m$ realizations; $j$ is the index for the radial basis function, given 2 radial basis functions. 
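As a rough illustration of the release rule above, the following sketch evaluates $a_t$ for a single pollution level with two cubic radial basis functions. The function name `release_rate` and the handling of a zero radius are assumptions made for this example; the actual implementation lives in `dps_lake_model.py` and may differ in detail.

```python
def release_rate(x_t, c1, c2, r1, r2, w1):
    """Anthropogenic release a_t for pollution level x_t (two cubic RBFs)."""
    centers = (c1, c2)
    radii = (r1, r2)
    weights = (w1, 1.0 - w1)  # weights sum to one, so w2 follows from w1

    rule = 0.0
    for c, r, w in zip(centers, radii, weights):
        if r != 0:  # assumption: a basis function with zero radius is skipped
            rule += w * abs((x_t - c) / r) ** 3

    # clip the release to the interval [0.01, 0.1]
    return min(max(rule, 0.01), 0.1)


# the release is clipped at the lower and upper bound
low = release_rate(0.0, c1=0.0, c2=0.0, r1=1.0, r2=1.0, w1=0.5)
mid = release_rate(0.4, c1=0.0, c2=0.0, r1=1.0, r2=1.0, w1=0.5)
high = release_rate(2.0, c1=0.0, c2=0.0, r1=1.0, r2=1.0, w1=0.5)
```

With these illustrative lever values, the release depends only on the current pollution level, which is what makes the policy closed loop.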

**To formulate this problem, do the following:**

**1) Import the lake model function from dps_lake_model.py**

**2) Create an ema_workbench interface for this problem, with corresponding uncertainties, levers and outcomes as specified above**



In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D 
from ema_workbench.analysis import parcoords

from ema_workbench import (Model, RealParameter, ScalarOutcome, TimeSeriesOutcome, perform_experiments,
                           Constant, MultiprocessingEvaluator, ema_logging, Constraint, save_results,
                           Policy)


ema_logging.log_to_stderr(ema_logging.INFO)



<Logger EMA (DEBUG)>

In [2]:
from dps_lake_model import lake_model

In [3]:
model = Model('lakeproblem', function=lake_model)

#specify uncertainties
model.uncertainties = [RealParameter('b', 0.1, 0.45),
                       RealParameter('q', 2.0, 4.5),
                       RealParameter('mean', 0.01, 0.05),
                       RealParameter('stdev', 0.001, 0.005),
                       RealParameter('delta', 0.93, 0.99)]

# set levers
model.levers = [RealParameter("c1", -2, 2),
                RealParameter("c2", -2, 2),
                RealParameter("r1", 0, 2),
                RealParameter("r2", 0, 2),
                RealParameter("w1", 0, 1)]

#specify outcomes
model.outcomes = [ScalarOutcome('max_P', ScalarOutcome.MINIMIZE),
                  ScalarOutcome('utility', ScalarOutcome.MAXIMIZE),
                  ScalarOutcome('inertia', ScalarOutcome.MAXIMIZE),
                  ScalarOutcome('reliability', ScalarOutcome.MAXIMIZE)]

# override some of the defaults of the model
model.constants = [Constant('alpha', 0.41),
                   Constant('nsamples', 150),
                   Constant('myears', 100)]

## Step 2: Searching for candidate solutions

In the second step of MORDM, candidate strategies are identified which are Pareto optimal conditional on a reference scenario. These candidate strategies are identified through search with multi-objective evolutionary algorithms, which iteratively evaluate a large number of alternatives on multiple objectives until they find the best candidates. For instance, when we optimize the lake model levers, the lake model function will be called for each candidate evaluation, and the corresponding four objective values will be generated. 

Take the model interface developed in the previous step and use the optimization functionality of the workbench to identify the Pareto-approximate set of solutions. Try the following:
* change the epsilon values between 0.01 and 0.1. What changes, and why?
* change the number of function evaluations from 1000 to 10,000 (this requires using multiprocessing unless you are very patient). What is the difference? You can use convergence metrics, as explained in assignment 7, for this.
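To build some intuition for the role of epsilon, the sketch below shows only the gridding part of epsilon-dominance: every solution is assigned to a box whose size is set by the epsilon of each objective, and solutions that fall in the same box are treated as equivalent. This is a simplification for illustration (the dominance check between boxes is left out), and the helper names and sample points are hypothetical.

```python
import math


def epsilon_box(objectives, epsilons):
    """Map a vector of objective values to the index of its epsilon box."""
    return tuple(math.floor(v / e) for v, e in zip(objectives, epsilons))


def count_boxes(solutions, epsilons):
    """Number of distinct epsilon boxes occupied by a set of solutions."""
    return len({epsilon_box(s, epsilons) for s in solutions})


# four hypothetical solutions with two objectives each
solutions = [(0.123, 0.514), (0.137, 0.528), (0.185, 0.583), (0.312, 0.745)]

fine = count_boxes(solutions, [0.01, 0.01])   # small boxes keep solutions apart
coarse = count_boxes(solutions, [0.1, 0.1])   # large boxes merge nearby solutions
```

Larger epsilons therefore tend to produce a smaller, coarser set of solutions, while smaller epsilons keep more closely spaced solutions in the archive.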

In [None]:
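# One epsilon is needed for every outcome (four in total);
# try values between 0.01 and 0.1 to see how the result set changes

with MultiprocessingEvaluator(model) as evaluator:
    results = evaluator.optimize(nfe=1000, searchover='levers',
                                 epsilons=[0.01] * len(model.outcomes))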

In [None]:
results

**plot the tradeoffs you have found using a parallel axis plot**

We can visualize these tradeoffs on **parallel axis plots**. In these plots, each dimension is shown as a vertical axis. Each solution is represented by a line on this plot, which crosses the objective axes at the corresponding value. For this, you can use the [parcoords functionality](https://emaworkbench.readthedocs.io/en/latest/ema_documentation/analysis/parcoords.html) that comes with the ema_workbench. Ensure that the direction of desirability is the same for the four objectives.



In [None]:
outcomes = results.loc[:, ['max_P', 'utility', 'inertia', 'reliability']]
outcomes

In [None]:
limits = parcoords.get_limits(outcomes)
axes = parcoords.ParallelAxes(limits)
axes.plot(outcomes)

# we invert this axis so direction of desirability is the same 
axes.invert_axis('max_P') 
plt.show()

**What does this plot tell us about the tradeoffs and conflicting objectives?**

There is a significant trade-off between utility on the one hand and max_P and reliability on the other.
Depending on the 

## Step 3: Re-evaluate candidate solutions under uncertainty

We now have a large number of candidate solutions (policies). We can re-evaluate them over the various deeply uncertain factors to assess their robustness against uncertainties.

For this robustness evaluation, we need to explore the scenarios for each solution. It means that, if we would like to run for instance 1000 scenarios for each solution, we might have to execute a very large number of runs.

Here, to simplify the case, let's suppose that decision makers have a hard constraint on *reliability*. No solution with less than 90% reliability is acceptable to them. Therefore, we can reduce the size of the solution set according to this constraint. 

**Apply this constraint of reliability on the results, and create a new dataframe named new_results**
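One direct way to apply such a constraint is to filter the DataFrame returned by the optimization with a boolean mask. The sketch below uses a small made-up DataFrame (`example_results`, with illustrative values only) in place of the real optimization results; the column name `reliability` matches the outcome defined earlier.

```python
import pandas as pd

# made-up optimization results: one row per candidate solution
example_results = pd.DataFrame({
    "c1": [0.2, -0.5, 1.1],
    "w1": [0.3, 0.7, 0.5],
    "utility": [0.55, 0.30, 0.42],
    "reliability": [0.72, 1.00, 0.93],
})

# keep only the solutions with at least 90% reliability
new_results = example_results[example_results["reliability"] >= 0.9]
```

An alternative is to pass the constraint to the optimizer itself, so that the search only retains feasible solutions.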


In [None]:
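# the constraint function returns the distance to feasibility:
# 0 when reliability >= 0.9, positive otherwise
constraints = [Constraint("min reliability", outcome_names="reliability",
                          function=lambda x: abs(min(0.9, x) - 0.9))]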

In [None]:
with MultiprocessingEvaluator(model) as evaluator:
    results = evaluator.optimize(nfe=1000, searchover='levers',
                                    epsilons=[0.01, 0.01, 0.01, 0.01],
                                    constraints=constraints)

In [None]:
results

In [None]:
results.to_csv("new_results.csv")

Write new_results to a CSV file. You will need it to complete assignment 9. 


**From new_results, which is the reduced dataframe of candidate solutions, make a list of policies in a format that can be passed to the *perform_experiments* function of the EMA workbench.**

*hint: you need to transform each policy to a dict, and then use this dict as input for the Policy class that comes with the workbench*

In [None]:
new_results = pd.read_csv("new_results.csv", index_col=0)
new_results.head()

In [None]:
#from ema_workbench import Policy
#policies = [Policy("no release", **{k.name:0 for k in model.levers})]

In [None]:
#Change to dictionary
policy = new_results.iloc[:, :5]
policy = policy.to_dict(orient="index")

In [None]:
#Add them into the policies frame
policies = []

for i in policy:
    policies.append(Policy(str(i), **policy[i]))

In [None]:
policies

In [None]:
with MultiprocessingEvaluator(model) as evaluator:
    experiments, outcomes = evaluator.perform_experiments(scenarios = 1000, policies = policies)

In [None]:
pd.DataFrame(experiments).to_csv("experiments.csv")
pd.DataFrame(outcomes).to_csv("outcomes.csv")

**Perform 1000 scenarios for each of the policy options. Depending on how many solutions are left after implementing the constraint, consider using multiprocessing or ipyparallel to speed up calculations.**

If you want to use ipyparallel, don't forget to start ipcluster.

We can now evaluate the **robustness** of each of the policy options based on these scenario results. We can calculate the robustness of a policy option in terms of its performance on an outcome indicator across the 1000 scenarios. In other words, we can identify how robust a policy is in terms of each outcome indicator, and investigate the robustness tradeoffs.  

There are multiple metrics to quantify robustness. One of them is the *signal-to-noise ratio*, which is simply the mean of a dataset divided by its standard deviation. For instance, for an outcome indicator to be maximized, we prefer a high average value across the scenarios and a low standard deviation, implying a narrow uncertainty range. Therefore, we want to maximize the signal-to-noise ratio. For an outcome indicator to be minimized, a lower mean and a lower standard deviation are preferred, so the ratio no longer works; a common alternative is to multiply the mean by the standard deviation and minimize that product.

**Write a function to calculate the signal-to-noise ratio for both kinds of outcome indicators. Calculate the signal-to-noise ratios for each outcome and each policy option. Plot the tradeoffs on a parallel axis plot. Which solutions look like a good compromise policy?**
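A minimal sketch of such a function is given below. It assumes the mean-times-standard-deviation form for outcomes that are to be minimized, which is one common choice rather than the only one, and it uses a small made-up DataFrame (`example`) in place of the real experiment results. The helper name `s_to_n` and the `directions` mapping are hypothetical.

```python
import numpy as np
import pandas as pd


def s_to_n(values, maximize=True):
    """Signal-to-noise score of one outcome for one policy."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    std = values.std()  # population standard deviation (ddof=0)
    if maximize:
        return mean / std  # higher is better; undefined when std is zero
    return mean * std      # assumed form for minimization; lower is better


# made-up results: three scenarios for each of two policies
example = pd.DataFrame({
    "policy": [0, 0, 0, 1, 1, 1],
    "utility": [1.0, 2.0, 3.0, 2.0, 2.0, 5.0],
    "max_P": [0.1, 0.2, 0.3, 1.0, 3.0, 5.0],
})

# True means the outcome is to be maximized
directions = {"utility": True, "max_P": False}

# one row per policy, one column per outcome
scores = pd.DataFrame({
    name: example.groupby("policy")[name].apply(s_to_n, maximize=flag)
    for name, flag in directions.items()
})
```

The resulting `scores` table has the same shape as the outcomes table used for the parallel axis plot earlier, so it could be plotted in the same way.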

In [42]:
experiments = pd.read_csv("experiments.csv", index_col = 0)
outcomes = pd.read_csv("outcomes.csv", index_col=0)

In [43]:
results = experiments.join(outcomes)

In [44]:
results

Unnamed: 0,b,delta,mean,q,stdev,c1,c2,r1,r2,w1,scenario,policy,model,max_P,utility,inertia,reliability
0,0.109796,0.988740,0.039896,2.096698,0.003824,0.348639,0.234149,1.001437,0.594397,0.053911,0,0,lakeproblem,10.314222,2.254142,0.970733,0.020000
1,0.381877,0.971309,0.041008,4.469304,0.003480,0.348639,0.234149,1.001437,0.594397,0.053911,1,0,lakeproblem,0.134897,0.192583,0.990000,1.000000
2,0.220966,0.948442,0.010555,3.307183,0.003605,0.348639,0.234149,1.001437,0.594397,0.053911,2,0,lakeproblem,0.102834,0.151757,0.990000,1.000000
3,0.117147,0.965979,0.038028,2.623277,0.004904,0.348639,0.234149,1.001437,0.594397,0.053911,3,0,lakeproblem,9.692053,0.911673,0.973600,0.059133
4,0.256448,0.948138,0.045853,3.960270,0.002585,0.348639,0.234149,1.001437,0.594397,0.053911,4,0,lakeproblem,0.229748,0.135220,0.990000,1.000000
5,0.381027,0.944867,0.021704,2.947896,0.003944,0.348639,0.234149,1.001437,0.594397,0.053911,5,0,lakeproblem,0.095152,0.154196,0.990000,1.000000
6,0.444575,0.962092,0.012454,4.156412,0.003627,0.348639,0.234149,1.001437,0.594397,0.053911,6,0,lakeproblem,0.074020,0.260341,0.990000,1.000000
7,0.272795,0.930839,0.033831,4.493968,0.001524,0.348639,0.234149,1.001437,0.594397,0.053911,7,0,lakeproblem,0.162096,0.116428,0.990000,1.000000
8,0.366202,0.974489,0.026868,2.952568,0.002957,0.348639,0.234149,1.001437,0.594397,0.053911,8,0,lakeproblem,0.106536,0.215664,0.990000,1.000000
9,0.164369,0.967542,0.049079,3.660982,0.001908,0.348639,0.234149,1.001437,0.594397,0.053911,9,0,lakeproblem,6.986131,0.842850,0.973133,0.145333


In [106]:
def signal_to_noise(df, ind):
    # signal-to-noise ratio of outcome `ind` for every policy in `df`
    grouped = df.groupby(by="policy")[ind]
    sn = grouped.mean() / grouped.std()
    return sn.rename("sn_" + ind)

In [107]:
signal_to_noise(results, "max_P")

In [102]:
mean_utility = results.groupby("policy")["utility"].mean()
std_utility = results.groupby("policy")["utility"].std()
sn_utility = mean_utility / std_utility
sn_utility

policy
0     0.980157
1     1.002590
2     1.299499
3     1.333364
4     1.175503
5     1.054837
6     1.365269
7     1.533969
8     1.449310
9     1.490067
10    1.099823
11    1.252143
12    1.189842
13    1.429659
Name: utility, dtype: float64

In [12]:
def signal_to_noise(data):
    mean = np.mean(data)
    std = np.std(data)
    sn = mean / std
    return sn

In [17]:
signal_to_noise(outcomes["max_P"].iloc[:999])

0.6784739233899816

In [None]:
MAXIMIZE = ScalarOutcome.MAXIMIZE  # @UndefinedVariable
MINIMIZE = ScalarOutcome.MINIMIZE  # @UndefinedVariable

robustnes_functions = [
    ScalarOutcome(
        'mean p',
        kind=MINIMIZE,
        variable_name='max_P',
        function=np.mean),
    ScalarOutcome(
        'std p',
        kind=MINIMIZE,
        variable_name='max_P',
        function=np.std),
    ScalarOutcome(
        'sn reliability',
        kind=MAXIMIZE,
        variable_name='reliability',
        function=signal_to_noise)]

In [None]:
with MultiprocessingEvaluator(model) as evaluator:
    # keep the returned solutions so they can be inspected afterwards
    robust_results = evaluator.robust_optimize(robustnes_functions, scenarios=10, nfe=1000,
                                               epsilons=[0.1, ] * len(robustnes_functions),
                                               population_size=5)

Another robustness metric is **maximum regret**, calculated again for each policy and for each outcome indicator. *Regret* is defined for each policy under each scenario, as the difference between the performance of the policy in a specific scenario and the performance of a no-regret policy (i.e. the best possible result in that scenario). The *maximum regret* is then the maximum of such regret values across all scenarios. We of course favor policy options with low *maximum regret* values. 

**Write a function to calculate the maximum regret. Calculate the maximum regret values for each outcome and each policy option. Plot the tradeoffs on a parallel plot. Which solutions look like a good compromise policy?**
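The definition above can be sketched with pandas as follows. The function assumes a long-format DataFrame with `scenario` and `policy` columns, in which every policy has been evaluated on the same scenarios. The name `max_regret` and the sample data are made up for illustration.

```python
import pandas as pd


def max_regret(df, outcome, maximize=True):
    """Maximum regret per policy for a single outcome."""
    # rows are scenarios, columns are policies
    table = df.pivot(index="scenario", columns="policy", values=outcome)

    # best achievable value in each scenario across all policies
    best = table.max(axis=1) if maximize else table.min(axis=1)

    # regret is the absolute distance to the best value in that scenario
    regret = table.sub(best, axis=0).abs()

    # worst case over all scenarios, one value per policy
    return regret.max(axis=0)


# made-up results: two policies evaluated on the same two scenarios
example = pd.DataFrame({
    "scenario": [0, 0, 1, 1],
    "policy": ["a", "b", "a", "b"],
    "utility": [1.0, 0.8, 0.2, 0.5],
})

regret_utility = max_regret(example, "utility", maximize=True)
```

In this toy example, policy `a` is best in scenario 0 but loses 0.3 in scenario 1, while policy `b` never loses more than 0.2, so `b` has the lower maximum regret.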