# Uncertainty analysis

In this part, the candidate solutions provided by the previous step are tested for robustness, a property that our client, Rijkswaterstaat, prefers. Two metrics are used to quantify robustness: the signal-to-noise ratio and the maximum regret.

> Two CSV files were created with the outcomes and experiments of the 9 policies over 5000 scenarios.
> Briefly explain what signal-to-noise and maximum regret actually are.

First, the results from step 2 with the different policy options are available again.
If there are too many, they can be reduced further with extra constraints (see the document by Kwakkel).
The scenario runs should be placed in a separate Python file.

> Import the 2 CSV files as generated in dike_model_uncertainty_simulation.

In [9]:
# Import necessary libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

from ema_workbench import (Model, RealParameter, ScalarOutcome,
                           MultiprocessingEvaluator, ema_logging,
                           Constant, SequentialEvaluator, Policy)
from ema_workbench.analysis import parcoords

# Import the dike_model - unclear whether this is needed
from problem_formulation import get_model_for_problem_formulation

# Define model
dike_model, planning_steps = get_model_for_problem_formulation(2)

In [10]:
policies = pd.read_csv('data/output_data/policies.csv')
policies_to_evaluate = []

for i, policy in policies.iterrows():
    policies_to_evaluate.append(Policy(str(i), **policy.to_dict()))

In [11]:
uncertainty_outcomes = pd.read_csv('data/output_data/outcomes_uncertainty_simulation_5000s.csv')
uncertainty_experiments = pd.read_csv('data/output_data/experiments_uncertainty_simulation_5000s.csv')

In [6]:
uncertainty_outcomes

Unnamed: 0.1,Unnamed: 0,Expected Annual Damage,Dike Investment Costs,RfR Investment Costs,Evacuation Costs,Expected Number of Deaths,policy
0,0,1.298340e+07,2.272627e+08,30700000.0,622.811506,0.001405,0
1,1,8.960759e+06,2.272627e+08,30700000.0,621.736842,0.001399,0
2,2,3.102178e+07,2.272627e+08,30700000.0,1749.951117,0.003745,0
3,3,2.204922e+09,2.272627e+08,30700000.0,25186.549868,0.182174,0
4,4,4.365725e+06,2.272627e+08,30700000.0,286.858792,0.000312,0
...,...,...,...,...,...,...,...
44995,44995,2.576930e+07,1.252766e+08,351800000.0,1870.258571,0.002317,8
44996,44996,0.000000e+00,1.252766e+08,351800000.0,0.000000,0.000000,8
44997,44997,2.898542e+08,1.252766e+08,351800000.0,2804.461927,0.027663,8
44998,44998,4.459176e+08,1.252766e+08,351800000.0,6174.357411,0.044204,8


In [7]:
uncertainty_experiments

Unnamed: 0.1,Unnamed: 0,A.0_ID flood wave shape,A.1_Bmax,A.1_Brate,A.1_pfail,A.2_Bmax,A.2_Brate,A.2_pfail,A.3_Bmax,A.3_Brate,...,A.3_DikeIncrease 2,A.4_DikeIncrease 0,A.4_DikeIncrease 1,A.4_DikeIncrease 2,A.5_DikeIncrease 0,A.5_DikeIncrease 1,A.5_DikeIncrease 2,scenario,policy,model
0,0,119,323.428446,1.5,0.377191,131.061323,10.0,0.293888,124.479343,1.0,...,1,5,1,2,7,0,0,0,0,dikesnet
1,1,121,167.993621,1.0,0.901348,42.789516,10.0,0.288916,196.810449,1.5,...,1,5,1,2,7,0,0,1,0,dikesnet
2,2,82,176.902307,1.5,0.485454,314.770470,1.0,0.107622,198.576705,1.5,...,1,5,1,2,7,0,0,2,0,dikesnet
3,3,25,259.339025,1.5,0.027976,117.269882,1.0,0.785194,327.156934,1.5,...,1,5,1,2,7,0,0,3,0,dikesnet
4,4,109,179.274614,1.0,0.655612,53.193314,1.0,0.737245,163.865646,10.0,...,1,5,1,2,7,0,0,4,0,dikesnet
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
44995,44995,23,132.589527,10.0,0.444795,185.448554,1.0,0.965250,279.006844,1.0,...,0,2,3,0,4,0,0,4995,8,dikesnet
44996,44996,47,225.003940,1.0,0.542024,243.679358,1.5,0.344594,318.874694,10.0,...,0,2,3,0,4,0,0,4996,8,dikesnet
44997,44997,19,55.639644,1.5,0.074683,249.368578,1.0,0.668730,138.086117,10.0,...,0,2,3,0,4,0,0,4997,8,dikesnet
44998,44998,42,101.033240,1.0,0.069266,41.679187,10.0,0.639812,335.528377,10.0,...,0,2,3,0,4,0,0,4998,8,dikesnet
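Both tables carry `Unnamed: 0` columns. These typically appear when a data frame is written to CSV together with its index and read back without `index_col`. The sketch below reproduces the effect on a toy table and shows one way to avoid it; the toy values are illustrative only and are not taken from the simulation output.

```python
import io
import pandas as pd

# Toy stand-in for the outcomes table (values are illustrative only).
toy = pd.DataFrame({"Expected Annual Damage": [1.0, 2.0], "policy": [0, 1]})

# Writing with the default index=True stores the row index as an unnamed column.
buffer = io.StringIO()
toy.to_csv(buffer)

# Reading without index_col turns the stored index into 'Unnamed: 0'.
buffer.seek(0)
with_artifact = pd.read_csv(buffer)

# Reading with index_col=0 restores it as the index instead.
buffer.seek(0)
clean = pd.read_csv(buffer, index_col=0)

print(list(with_artifact.columns))
print(list(clean.columns))
```

The extra columns do not affect the analysis below, because the outcomes are selected by name.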


## Signal-to-noise ratio

The signal-to-noise ratio is computed in three steps. First, a function that calculates the signal-to-noise ratio is defined. Next, the ratio is calculated for every unique policy and the results are stored in a data frame. Lastly, this data frame is used to determine the axis limits and to visualize the outcomes.


In [8]:
# Define how the signal-to-noise ratio is calculated

def s_to_n(uncertainty_outcomes, direction):
    mean = np.mean(uncertainty_outcomes)
    std = np.std(uncertainty_outcomes)

    # The formula depends on the direction of the outcome:
    # maximized outcomes use mean/std, minimized outcomes use mean*std.
    if direction == ScalarOutcome.MAXIMIZE:
        return mean/std
    else:
        return mean*std
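
To make the metric concrete, the sketch below applies the same two formulas to toy data. `toy_s_to_n` is a hypothetical stand-in that takes a boolean instead of `ScalarOutcome.kind`, so it runs without the EMA workbench; the arrays are made-up values with the same mean but a different spread.

```python
import numpy as np

def toy_s_to_n(values, maximize):
    """Hypothetical stand-in for s_to_n: boolean flag instead of ScalarOutcome.kind."""
    mean = np.mean(values)
    std = np.std(values)
    return mean / std if maximize else mean * std

stable = np.array([9.0, 10.0, 11.0])    # mean 10, small spread
volatile = np.array([0.0, 10.0, 20.0])  # mean 10, large spread

# For a minimized outcome, a lower mean*std is better.
stable_score = toy_s_to_n(stable, maximize=False)
volatile_score = toy_s_to_n(volatile, maximize=False)
print(stable_score < volatile_score)
```

For a minimized outcome the series with the smaller spread scores lower, and lower is the preferred direction.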

The second step applies this function and saves the results in a new data frame. The code first iterates over every unique policy in the experiments and builds a boolean mask that selects the rows belonging to that policy. It then iterates over the outcomes of the model, pulls the values of each outcome for the current policy, and calculates the signal-to-noise ratio with the function from the previous cell. The ratios are stored in the `scores` dictionary, which is then added to the `overall_scores` dictionary under the policy name. Finally, the results are converted to a data frame for use in the next step.

In [15]:
overall_scores = {}
for policy in np.unique(uncertainty_experiments['policy']):
    scores = {}

    # Boolean mask selecting the rows that belong to the current policy
    logical = uncertainty_experiments['policy'] == policy

    for outcome in dike_model.outcomes:
        # Name and direction come from the outcome object, not from the data frame
        value = uncertainty_outcomes[outcome.name][logical]
        sn_ratio = s_to_n(value, outcome.kind)
        scores[outcome.name] = sn_ratio
    overall_scores[policy] = scores
scores = pd.DataFrame.from_dict(overall_scores).T
scores
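
The per-policy calculation can also be written with a pandas `groupby`. The sketch below is a minimal illustration on a toy table, assuming a minimized outcome; the numbers and the helper `minimize_score` are hypothetical and not part of the notebook.

```python
import pandas as pd

# Toy outcomes: two policies, three scenarios each (illustrative numbers).
toy_outcomes = pd.DataFrame({
    "policy": [0, 0, 0, 1, 1, 1],
    "Evacuation Costs": [1.0, 2.0, 3.0, 2.0, 2.0, 2.0],
})

def minimize_score(series):
    # mean * std with the population standard deviation (ddof=0), as np.std uses
    return series.mean() * series.std(ddof=0)

# One score per policy for the selected outcome.
toy_scores = toy_outcomes.groupby("policy")["Evacuation Costs"].agg(minimize_score)
print(toy_scores)
```

Policy 1 has no spread across scenarios, so its score is zero, while policy 0 gets a positive score.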

The final step of the signal-to-noise calculation visualizes the results with the parcoords module from the EMA workbench. First, the scores data frame is assigned to `data`. Next, the limits of each column are determined and the lower bound of every axis is set to zero. The data is then plotted on parallel axes within these limits. All outcomes in this problem formulation are to be minimized, so a low score is preferred on every axis and no axis is inverted.

In [None]:
from ema_workbench.analysis import parcoords

# Assign the scores to `data` and determine the axis limits for the visualization.
data = scores
limits = parcoords.get_limits(data)

# Set the lower bound of every outcome axis to zero.
limits.loc[0, :] = 0

# Plot the scores of all policies on parallel axes.
paraxes = parcoords.ParallelAxes(limits)
paraxes.plot(data)
plt.show()

## Maximum regret

The maximum regret provides another metric for the robustness of a policy. First, the regret of each policy is calculated under each scenario. Regret is defined as the difference between the performance of a policy in a given scenario and the performance of a reference, here the best-performing policy in that same scenario. The largest regret of a policy across all scenarios is then called its maximum regret, which is preferred to be as low as possible.

Because outcomes can be either maximized or minimized, the sign of the raw difference depends on the direction of the outcome. The absolute value of the difference is therefore taken, so that regret is non-negative and the results can be compared.
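
As a compact restatement of this definition, the two quantities can be written as follows. The symbols are notation introduced here for illustration: $y_{p,s}$ is the value of one outcome for policy $p$ in scenario $s$, and $y^{*}_{s}$ is the best value of that outcome across all policies in scenario $s$.

$$
r_{p,s} = \left| y^{*}_{s} - y_{p,s} \right|, \qquad R_{p} = \max_{s} \; r_{p,s}
$$

Here $r_{p,s}$ is the regret of policy $p$ in scenario $s$ and $R_{p}$ is its maximum regret.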


The crucial part of the maximum regret approach is the identification of the best possible outcome for each scenario. To ensure this, the following steps are taken in the code.
1. Two empty dictionaries are created to capture the overall regret of the policies and the maximum regret.
2. A data frame is created that consists of the outcome together with the policy and the scenario.
3. The data frame is reshaped by indexing on the scenario id, with one column per policy.
4. The reference value of each row (scenario) is taken and used to calculate the regret.
5. Lastly, all regret results are stored in a dictionary, and the maximum regret per policy is selected from these results and stored in a separate dictionary.

In [None]:
# Use the experiments and outcomes loaded from the CSV files above
experiments, outcomes = uncertainty_experiments, uncertainty_outcomes

overall_regret = {}
max_regret = {}
for outcome in dike_model.outcomes:
    # create a DataFrame with all the relevant information
    # i.e., policy, scenario_id, and scores
    data = pd.DataFrame({outcome.name: outcomes[outcome.name],
                         "policy": experiments['policy'],
                         "scenario": experiments['scenario']})

    # reorient the data by indexing with policy and scenario id
    data = data.pivot(index='scenario', columns='policy')

    # flatten the resulting hierarchical index resulting from
    # pivoting, (might be a nicer solution possible)
    data.columns = data.columns.get_level_values(1)

    # the best value per scenario depends on the direction of the outcome:
    # the row maximum if it is maximized, the row minimum if it is minimized
    if outcome.kind == ScalarOutcome.MAXIMIZE:
        best = data.max(axis=1)
    else:
        best = data.min(axis=1)

    # we need to control the broadcasting.
    # best is a 1d vector across scenario id. By passing
    # np.newaxis we ensure that the shape is the same as the data
    # next we take the absolute value
    #
    # basically we take the difference of the best value across
    # the row and the actual values in the row
    #
    outcome_regret = (best.values[:, np.newaxis] - data).abs()

    overall_regret[outcome.name] = outcome_regret
    max_regret[outcome.name] = outcome_regret.max()
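
The pivot-and-subtract pattern can be traced by hand on a toy table. The sketch below assumes a single minimized outcome, so the best value per scenario is the row minimum; the column name `cost`, the policy labels, and all numbers are made up for illustration.

```python
import pandas as pd

# Toy results for one minimized outcome (illustrative numbers only).
toy = pd.DataFrame({
    "scenario": [0, 0, 1, 1, 2, 2],
    "policy":   ["A", "B", "A", "B", "A", "B"],
    "cost":     [10.0, 12.0, 30.0, 20.0, 5.0, 5.0],
})

# One row per scenario, one column per policy.
wide = toy.pivot(index="scenario", columns="policy", values="cost")

# For a minimized outcome the best value per scenario is the row minimum.
best = wide.min(axis=1)

# Regret per scenario and policy, then the maximum over scenarios.
regret = wide.sub(best, axis=0).abs()
toy_max_regret = regret.max()
print(toy_max_regret)
```

Policy A is best in scenario 0 but falls 10 short in scenario 1, so its maximum regret is 10; policy B never trails by more than 2.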

In [None]:
max_regret = pd.DataFrame(max_regret)
sns.heatmap(max_regret/max_regret.max(), cmap='viridis', annot=True)
plt.show()