# Policy robustness

In this notebook we analyse the robustness of policies found with directed search and then filtered by the client's acceptability criteria. This lets us investigate robustness trade-offs and select the most robust policies.

There are many robustness metrics, because there is no single definition of robustness (McPhail et al., 2018). We therefore filter the policies using two commonly used metrics, each with a different focus.

In the first step we calculate the *signal-to-noise ratio (SnS)*, an expected-value robustness metric. The *signal-to-noise ratio* indicates the expected level of performance across a range of scenarios. For an outcome that should be low, it is simply the product of the mean and the standard deviation of the outcome value across all scenarios. We keep the policies that lie on the (epsilon) Pareto front in terms of minimal *signal-to-noise ratio* across the five outcomes of interest.
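
As a small illustration of the minimisation case, the sketch below computes the mean times the standard deviation for two made-up policies. The policy names, the outcome values and the helper `signal_to_noise_minimize` are invented for this example and are not model output.

```python
import numpy as np

# Hypothetical values of one outcome (e.g. damage) over four scenarios.
# Both policies have the same mean (5.0) but a different spread.
outcome_by_policy = {
    "policy_a": np.array([4.0, 5.0, 6.0, 5.0]),
    "policy_b": np.array([1.0, 9.0, 1.0, 9.0]),
}


def signal_to_noise_minimize(values):
    # For an outcome to be minimised, a lower mean and a lower spread are
    # both better, so the metric is the product of the two.
    return np.mean(values) * np.std(values)


scores = {
    name: signal_to_noise_minimize(values)
    for name, values in outcome_by_policy.items()
}
```

With equal means, the policy whose outcome varies less across scenarios gets the lower (better) score.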

In the second step we take the remaining policies and filter them in a similar fashion, this time using *maximum regret*, which is, as the name suggests, a regret-based metric. The *regret* of a policy is defined as the difference between the performance (outcome value) of the selected policy in a particular scenario and the performance of the best possible policy in that scenario. Maximum regret is calculated per policy and per outcome, across all scenarios. We then keep the policies on the (epsilon) Pareto front in terms of minimal *maximum regret* across the five outcomes of interest.
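
The definition can be sketched on a toy table, assuming the outcome is to be minimised so that the best value in a scenario is the lowest one across policies. All names and numbers below are made up for illustration.

```python
import pandas as pd

# Hypothetical values of one outcome to be minimised:
# rows are scenarios, columns are policies.
outcome = pd.DataFrame(
    {"policy_a": [3.0, 8.0, 5.0], "policy_b": [4.0, 2.0, 5.0]},
    index=["scenario_0", "scenario_1", "scenario_2"],
)

# Best performance per scenario: the lowest value across policies.
best_per_scenario = outcome.min(axis=1)

# Regret: distance of each policy from the best value in the same scenario.
regret = outcome.sub(best_per_scenario, axis=0).abs()

# Maximum regret: the worst-case regret of each policy over all scenarios.
max_regret_example = regret.max()
```

Here `policy_a` is the best choice in two of the three scenarios, yet it has the higher maximum regret because of its poor result in `scenario_1`.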

In [1]:
from ema_workbench.analysis import parcoords

import numpy as np
import pandas as pd
import pareto
from problem_formulation import get_model_for_problem_formulation
import matplotlib.pyplot as plt
import seaborn as sns

from ema_workbench import (
    MultiprocessingEvaluator,
    ScalarOutcome,
    Policy
)

In [2]:
results = pd.read_csv('output/policies__constraints_filtered.csv')
model, steps = get_model_for_problem_formulation('A4 Only')

In [3]:
policies = results.set_index('Policy Name')
policies = policies[[o.name for o in model.levers]]
policies

Policy Name,EWS_DaysToThreat,rfr_0_t0,rfr_0_t1,rfr_0_t2,rfr_1_t0,rfr_1_t1,rfr_1_t2,rfr_2_t0,rfr_2_t1,rfr_2_t2,...,A2_DikeIncrease_t2,A3_DikeIncrease_t0,A3_DikeIncrease_t1,A3_DikeIncrease_t2,A4_DikeIncrease_t0,A4_DikeIncrease_t1,A4_DikeIncrease_t2,A5_DikeIncrease_t0,A5_DikeIncrease_t1,A5_DikeIncrease_t2
s81588_p10,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,2.0,0.0,0.0,2.0,0.0,0.0,2.0,0.0,0.0
s81588_p11,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,2.0,0.0,0.0,3.0,0.0,0.0,2.0,0.0,0.0
s81588_p12,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0
s81588_p13,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,3.0,0.0,0.0,3.0,0.0,0.0,3.0,0.0,0.0
s81588_p15,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,...,0.0,0.0,0.0,0.0,2.0,0.0,0.0,1.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
sReference_p24,4.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0
sReference_p26,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0
sReference_p33,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,...,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0
sReference_p34,3.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,...,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0


In [4]:
#TODO: remove later. we will be getting less than 50 here anyway
policies = policies.head(50)

In [5]:
policies_to_evaluate = []

# build one Policy object per row, named after the row's 'Policy Name' index
for name, policy in policies.iterrows():
    policies_to_evaluate.append(Policy(str(name), **policy.to_dict()))

In [None]:
n_scenarios = 30
#TODO: change the number
# n_scenarios = 1000
with MultiprocessingEvaluator(model) as evaluator:
    results2 = evaluator.perform_experiments(n_scenarios,
                                            policies_to_evaluate)

## Calculating signal-to-noise ratio

In [None]:
# function to calculate SnS metric
def s_to_n(data, direction):
    mean = np.mean(data)
    std = np.std(data)

    if direction==ScalarOutcome.MINIMIZE:
        return mean*std
    else:
        return mean/std

In [None]:
# hardcoded inputs to processing
outcomes_of_interest = ['A4_Expected_Annual_Damage', 'A4_Expected_Number_of_Deaths',
                        'Total_Expected_Annual_Damage', 'Total_Expected_Number_of_Deaths',
                        'Total_Infrastructure_Costs']
outcome_epsilons = [100, 0.01, 100, 100, 0.01]
outcome_columns=[0,1,2,3,4]

In [None]:
#generation of SnS dataframe

experiments, outcomes = results2

overall_scores = {}
for policy in np.unique(experiments['policy']):
    scores = {}
    
    logical = experiments['policy']==policy
    
    for outcome in model.outcomes:
        if(outcome.name in outcomes_of_interest):
            value  = outcomes[outcome.name][logical]
            sn_ratio = s_to_n(value, outcome.kind)
            scores[outcome.name] = sn_ratio
    overall_scores[policy] = scores
sns_scores = pd.DataFrame.from_dict(overall_scores).T

In [None]:
sns_scores.head()

In [None]:
# SnS scores visualisation

data = sns_scores
limits = parcoords.get_limits(data)
limits.loc[0, outcomes_of_interest] = 0

# one colour per policy; the default sns.color_palette() has only 10 colours
colors = sns.color_palette("husl", len(data))
paraxes = parcoords.ParallelAxes(limits, fontsize=10)

for i, (index, row) in enumerate(data.iterrows()):
    paraxes.plot(row.to_frame().T, label=str(index), color=colors[i])
#TODO: maybe enable printing the legend
#paraxes.legend()

plt.show()

In [None]:
# Prepare df for pareto rows selection
sns_scores['Policy Name'] = sns_scores.index

Note that passing epsilons to the Pareto sorting results in a smaller set of policies, because only one of the policies that fall into the same epsilon cell is kept. To keep more policies, remove that parameter.
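
The effect of the epsilons can be sketched with a small, self-contained epsilon-box sort. This is a simplified illustration under the assumption that all objectives are minimised; it is not the implementation used by the `pareto` package, and the function name `eps_nondominated` and the sample points are made up.

```python
import numpy as np


def eps_nondominated(points, epsilons):
    """Return indices of epsilon-nondominated rows (all objectives minimised)."""
    points = np.asarray(points, dtype=float)
    eps = np.asarray(epsilons, dtype=float)
    boxes = np.floor(points / eps)  # epsilon-box index per objective
    # distance of each point to the lower corner of its own box
    dist = np.linalg.norm(points - boxes * eps, axis=1)

    kept = []
    for i in range(len(points)):
        dominated = False
        for j in range(len(points)):
            if i == j:
                continue
            same_box = np.array_equal(boxes[j], boxes[i])
            # box j dominates box i: no worse everywhere and not identical
            box_dominates = bool(np.all(boxes[j] <= boxes[i])) and not same_box
            # within one box keep the point closest to the corner (ties: first)
            closer = dist[j] < dist[i] or (dist[j] == dist[i] and j < i)
            if box_dominates or (same_box and closer):
                dominated = True
                break
        if not dominated:
            kept.append(i)
    return kept


points = [[1.0, 9.0], [1.5, 8.5], [5.0, 5.0], [9.0, 1.0], [9.5, 9.5]]
coarse = eps_nondominated(points, epsilons=[2.0, 2.0])
fine = eps_nondominated(points, epsilons=[0.25, 0.25])
```

With the coarse epsilons the first two points share one box, so only one of them survives. With the fine epsilons each point has its own box and both are kept. The last point is dominated in either case.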

In [None]:
# Selection of policies on the pareto front for SnS.
nondominated_sns = pareto.eps_sort([list(sns_scores.itertuples(False))], objectives=outcome_columns, epsilons=outcome_epsilons)
nondominated_sns_df = pd.DataFrame(columns=list(sns_scores.columns))

for row in nondominated_sns:
    nondominated_sns_df.loc[len(nondominated_sns_df)] = row
    
nondominated_sns_df

In [None]:
robust_policy_names =  list(nondominated_sns_df['Policy Name'])

## Calculating maximum regret

In [None]:
# helper function to calculate regret
def calculate_regret(data, best):
    return np.abs(best-data)

In [None]:
# generation of regret df
overall_regret = {}
max_regret = {}
for outcome in model.outcomes:
    policy_column = experiments['policy']
    
    # create a DataFrame with all the relevant information
    # i.e., policy, scenario_id, and scores
    data = pd.DataFrame({outcome.name: outcomes[outcome.name], 
                         "policy":experiments['policy'],
                         "scenario":experiments['scenario']})
    
    # Filter out rows that are not for policies kept according to signal to noise ratio. 
    data = data[data['policy'].isin(robust_policy_names)]

    # reorient the data by indexing with policy and scenario id
    data = data.pivot(index='scenario', columns='policy')
    
    # flatten the hierarchical column index created by pivoting
    # (a nicer solution might be possible)
    data.columns = data.columns.get_level_values(1)

    # Regret is the absolute difference between each policy's value and the
    # best value attained by any policy in the same scenario. The best value
    # is the row minimum for outcomes to be minimised and the row maximum
    # otherwise. np.newaxis turns the 1d vector of best values into a column
    # so that it broadcasts against the scenario-by-policy DataFrame.
    if outcome.kind == ScalarOutcome.MINIMIZE:
        best = data.min(axis=1)
    else:
        best = data.max(axis=1)
    outcome_regret = (best.values[:, np.newaxis] - data).abs()
    
    overall_regret[outcome.name] = outcome_regret
    max_regret[outcome.name] = outcome_regret.max()

In [None]:
# plotting regret heatmap
max_regret = pd.DataFrame(max_regret)
sns.heatmap(max_regret/max_regret.max(), cmap='viridis', annot=True)
plt.show()

In [None]:
# Max regret visualisation

data = max_regret

# one colour per policy; the default sns.color_palette() has only 10 colours
colors = sns.color_palette("husl", len(data))

limits = parcoords.get_limits(data)
limits.loc[0, outcomes_of_interest] = 0

paraxes = parcoords.ParallelAxes(limits, fontsize=10)

for i, (index, row) in enumerate(data.iterrows()):
    paraxes.plot(row.to_frame().T, label=str(index), color=colors[i])
paraxes.legend()
    
plt.show()

In [None]:
# Prepare df for pareto rows selection
max_regret['Policy Name'] = max_regret.index

In [None]:
# Selection of policies on the pareto front for max regret
nondominated_max_regret = pareto.eps_sort([list(max_regret.itertuples(False))], objectives=outcome_columns, epsilons=outcome_epsilons)
nondominated_max_regret_df = pd.DataFrame(columns=list(max_regret.columns))

for row in nondominated_max_regret:
    nondominated_max_regret_df.loc[len(nondominated_max_regret_df)] = row

nondominated_max_regret_df.head()

## Selecting and ranking the most robust policies

If more than three robust policies remain at this point, we reduce the number manually and rank the final three. We do this by inspecting the mean values of the outcomes of interest across 1000 scenarios.

In [None]:
# list of robust policies
more_robust_policy_names =  list(nondominated_max_regret_df['Policy Name'])

In [None]:
# calculate mean scores df
overall_scores = {}
for policy in more_robust_policy_names:
    scores = {}
    
    logical = experiments['policy']==policy
    
    for outcome in model.outcomes:
        if(outcome.name in outcomes_of_interest):
            value  = outcomes[outcome.name][logical]
            scores[outcome.name] = np.mean(value)
    overall_scores[policy] = scores
mean_scores = pd.DataFrame.from_dict(overall_scores).T

In [None]:
mean_scores.head()

In [None]:
# mean scores visualisation

data = mean_scores

# one colour per policy; the default sns.color_palette() has only 10 colours
colors = sns.color_palette("husl", len(data))

limits = parcoords.get_limits(data)
limits.loc[0, outcomes_of_interest] = 0

paraxes = parcoords.ParallelAxes(limits, fontsize=10)

for i, (index, row) in enumerate(data.iterrows()):
    paraxes.plot(row.to_frame().T, label=str(index), color=colors[i])
paraxes.legend()
    
plt.show()

Based on visual inspection, the three preferred policies are:
1. s65779_p53 - lowest damages in ring A4, which is the client's main focus, and also the lowest costs. However, it is unlikely to be accepted by all stakeholders because of its high total impacts.
2. s65779_p18 - a middle ground between the other two.
3. s65779_p36 - lowest total damages but the highest damages in ring A4, the opposite of policy 1.

In [None]:
#Careful, these are selected manually.
most_robust_policy_names =  ['s65779_p53','s65779_p18','s65779_p36']

robust_policies = policies[policies.index.isin(most_robust_policy_names)]
robust_policies = robust_policies.reset_index()

In [None]:
robust_policies.head()

In [None]:
robust_policies.to_csv('output/policies__robust_filtered.csv')