## 1. Open Exploration

In this notebook, we explore the outcomes of the 'base case' scenario. The base case is the 'do nothing' policy, in which all policy levers are set to zero. The aim is to explore the uncertainty space and get a quick overview of how our KPIs behave.

Afterwards, the worst scenarios for our KPIs are selected and analysed. From this analysis, we can identify the uncertainty ranges in which the model performs worst. Later on, these uncertainties will be given extra attention when searching for the most robust policies.

For our experiments we use Latin hypercube sampling (LHS) to generate points in the parameter space defined by the uncertainties, combined with the 'base case' policy.
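To illustrate what LHS does, here is a minimal standard-library sketch. It is not the workbench's own sampler, which handles sampling internally; the function name `latin_hypercube` and the two parameter ranges are hypothetical and chosen only for illustration. Each range is cut into as many equal strata as there are samples, one point is drawn per stratum, and the strata are shuffled independently per dimension.

```python
import random


def latin_hypercube(bounds, n_samples, seed=None):
    """Draw n_samples points so that, per dimension, each of the
    n_samples equal-width strata contains exactly one point."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each stratum of the unit interval [0, 1).
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired at random across dimensions.
        rng.shuffle(unit)
        # Rescale from [0, 1) to the range of this uncertainty.
        columns.append([low + u * (high - low) for u in unit])
    # Transpose: one tuple per sampled scenario.
    return list(zip(*columns))


# Hypothetical ranges for two uncertainties (illustrative values only).
bounds = [(30.0, 350.0), (0.0, 1.0)]
samples = latin_hypercube(bounds, n_samples=5, seed=42)

for scenario in samples:
    print(scenario)
```

Compared with plain random sampling, this guarantees that every part of each uncertainty's range is visited, which is why a limited number of scenarios can still cover the uncertainty space reasonably evenly.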

In [1]:
# Import libraries
import numpy as np
import scipy as sp
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import networkx as nx

# Import workbench libraries and the model itself
from ema_workbench import (Model, CategoricalParameter, ScalarOutcome, IntegerParameter, RealParameter)
from ema_workbench import (MultiprocessingEvaluator, Policy, Scenario, SequentialEvaluator)
from ema_workbench.em_framework.evaluators import perform_experiments
from ema_workbench.em_framework.samplers import sample_uncertainties
from ema_workbench.util import ema_logging, utilities
from ema_workbench import save_results
from ema_workbench import load_results

from problem_formulation_new import get_model_for_problem_formulation
from dike_model_function import DikeNetwork 

import time

ema_logging.log_to_stderr(ema_logging.INFO)

<Logger EMA (DEBUG)>

Here we define the problem formulation. For our analysis, we pick problem formulation 5, since it yields the most disaggregated form of the outcomes. Later on, we aggregate the outcomes into the KPIs that are of most use to us.

In [2]:
# # Import case function
# #choose problem formulation number, between 0-5

# # 5 objectives PF
# dike_model, planning_steps = get_model_for_problem_formulation(5)

# def sum_over(*args):
#     return sum(args)

First we run a base exploration, with all policy levers set to zero. We chose to run 5000 scenarios, as this gives a broad overview in which the uncertainty space is reasonably well covered.

Some code is commented out because it was run once and its results were saved to a file (e.g. a CSV file).

In [7]:
# # Define the base case policy, which sets all the levers that are present within the dike_model to zero.
# # do nothing policy
# policies = [Policy("base case", **{lever.name: 0 for lever in dike_model.levers})]

# #Running the experiments (commented out)
# with MultiprocessingEvaluator(dike_model, n_processes = 8) as evaluator:
#      results = evaluator.perform_experiments(scenarios=5000, policies=policies)

# utilities.save_results(results, 'results/5000_scenarios_base_case.csv')
# experiments, outcomes = results

# Load in the results from the CSV we had saved from running the experiments
results = utilities.load_results('results/5000_scenarios_base_case.csv')
experiments, outcomes = results

# We create a 'combined' dataframe, in which the experiments and outcomes are merged within one pandas dataframe.
df_outcomes = pd.DataFrame(outcomes)
combined = pd.concat([experiments,df_outcomes],axis=1,sort=False)

combined.head()

[MainProcess/INFO] results loaded succesfully from C:\Users\ASUS\EPA1361\epa1361_final\results\5000_scenarios_base_case.csv


Unnamed: 0,A.0_ID flood wave shape,A.1_Bmax,A.1_Brate,A.1_pfail,A.2_Bmax,A.2_Brate,A.2_pfail,A.3_Bmax,A.3_Brate,A.3_pfail,...,A.3_Dike Investment Costs 2,A.3_Expected Number of Deaths 2,A.4_Expected Annual Damage 2,A.4_Dike Investment Costs 2,A.4_Expected Number of Deaths 2,A.5_Expected Annual Damage 2,A.5_Dike Investment Costs 2,A.5_Expected Number of Deaths 2,RfR Total Costs 2,Expected Evacuation Costs 2
0,82.0,112.588522,10.0,0.060112,207.259283,10.0,0.681905,298.34665,10.0,0.204375,...,0,0.208533,0.0,0,0.0,0.0,0,0.0,0.0,0.0
1,14.0,51.029922,1.0,0.277459,302.598285,1.5,0.960055,156.91659,10.0,0.004221,...,0,1.197254,0.0,0,0.0,0.0,0,0.0,0.0,0.0
2,128.0,56.198749,1.5,0.880576,136.816986,10.0,0.344347,221.692894,10.0,0.175895,...,0,0.870198,0.0,0,0.0,0.0,0,0.0,0.0,0.0
3,96.0,123.505345,1.5,0.797417,322.362756,1.5,0.977793,292.806136,10.0,0.765065,...,0,0.050221,20649670.0,0,0.011789,0.0,0,0.0,0.0,0.0
4,52.0,311.328249,10.0,0.433875,130.977495,1.0,0.64696,95.169589,10.0,0.962059,...,0,0.0,35802770.0,0,0.018065,0.0,0,0.0,0.0,0.0


### Visualization of the retrieved outcomes ###

To extract information from the experiments performed earlier, we visualize the retrieved data in a form that is readable and ready to communicate to our problem owner.

To do so, we aggregated the outcomes from problem formulation 5 into our relevant KPIs. The following KPIs are considered:
 - Expected number of deaths (total)
 - Expected annual damage (per province)
 - Dike Investment Costs (per province)
 - RfR Total Costs
 - Expected Evacuation Costs
 
Since the 'base case' policy does not include any investment in dikes or RfR, all cost outcomes are zero and are therefore left out of this visualization and the further base case analysis.
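To make the aggregation concrete, here is a minimal sketch on a toy table. It assumes the column naming `<location>_<KPI> <planning step>` seen in the output above; the numbers, the reduced set of two locations, and the variable names `toy`, `per_location` and `total` are made up for illustration.

```python
import pandas as pd

# Toy outcomes for two scenarios (rows) and two locations; made-up numbers.
toy = pd.DataFrame({
    "A.1_Expected Number of Deaths 0": [0.1, 0.0],
    "A.1_Expected Number of Deaths 1": [0.2, 0.0],
    "A.1_Expected Number of Deaths 2": [0.3, 0.5],
    "A.2_Expected Number of Deaths 0": [0.0, 1.0],
    "A.2_Expected Number of Deaths 1": [0.0, 1.0],
    "A.2_Expected Number of Deaths 2": [0.4, 1.0],
})

kpi = "Expected Number of Deaths"
locations = ["A.1", "A.2"]
steps = [0, 1, 2]

# Per location: sum the KPI over the planning steps, row by row.
per_location = pd.DataFrame({
    f"{loc}_{kpi}": toy[[f"{loc}_{kpi} {step}" for step in steps]].sum(axis=1)
    for loc in locations
})

# Total: sum the per-location values, again per scenario.
total = per_location.sum(axis=1).rename(kpi)

print(per_location)
print(total)
```

Each row stays one scenario throughout, so the aggregated columns can be placed next to the sampled uncertainties in the same dataframe.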

In [None]:
# The following function is used to aggregate the outcomes into the KPI we want
# This iterates over all the locations and round numbers, and creates a new column summing the values per location and round.
# If we want to aggregate over the location, "aggregate" equals "location" and therefore the KPI is added per location and not in total.
# On the contrary, if aggregate equals "total", the total value is appended to the dataframe.
def aggregate_kpi(data, kpi, aggregate):
    locations = ["A.1", "A.2", "A.3", "A.4", "A.5"]
    kpi_columns = []
    
    if kpi == "RfR Total Costs" or kpi == "Expected Evacuation Costs":
        kpi_columns.append(kpi + " 0")
        kpi_columns.append(kpi + " 1")
        kpi_columns.append(kpi + " 2")
        
        data[kpi] = data[kpi_columns].sum(axis=1)
    else:
        if aggregate == "total":
            for location in locations:
                kpi_columns.append(location + "_" + kpi + " 0")
                kpi_columns.append(location + "_" + kpi + " 1")
                kpi_columns.append(location + "_" + kpi + " 2")

            data[kpi] = data[kpi_columns].sum(axis=1)

        else:
            for location in locations:
                kpi_columns = []
                kpi_columns.append(location + "_" + kpi + " 0")
                kpi_columns.append(location + "_" + kpi + " 1")
                kpi_columns.append(location + "_" + kpi + " 2")

                data[location + "_" + kpi] = data[kpi_columns].sum(axis=1)
                
    return data

# Append the KPIs we would like to analyse. The RfR total costs and expected evacuation costs are included for later on in the analysis, but are not used for the base case analysis, as they are still 0.
combined = aggregate_kpi(combined, "Expected Number of Deaths", "location")
combined = aggregate_kpi(combined, "Expected Number of Deaths", "total")
combined = aggregate_kpi(combined, "Expected Annual Damage", "location")
combined = aggregate_kpi(combined, "Expected Annual Damage", "total")
combined = aggregate_kpi(combined, "RfR Total Costs", "total")
combined = aggregate_kpi(combined, "Expected Evacuation Costs", "total")