In [1]:
from pyDOE import lhs  # Latin hypercube sampler, the only pyDOE function used here
import pandas as pd
import numpy as np

In [2]:
limits = pd.read_csv("uncertain_variables.csv").set_index("name")

These limits were chosen arbitrarily for illustration and need to be updated. The "unknown" parameters in the transport model (i.e. elasticity of transport demand to cost increase) need to be added.

In [3]:
limits

name,variable description,min,current hypothesis,max
ini_infiltration,initial inflitration (mm),0.5,1.0,2.0
drainage,drainage (mm/h),1,2.0,4.0
runoff,runoff (%),20,50.0,70.0
threshold,threshold for transport disruption (cm),20,40.0,50.0
elas_demand,elasticity of transport demand to cost,-2,-1.0,-0.5
elas_switch,elasticity for mode switch,-2,-1.0,-0.5
cost_multiplier,transport unit cost multiplier (applied to all...,0.5,1.0,1.5
traffic_growth,traffic growth factor,2,3.0,5.0
duration_multiplier,duration of incidents multiplier,0.5,1.0,2.0
frequency,incidents per year,based on the GCM results,,


For now I drop the parameters that enter only the economic model, because the economic model can be run as many times as we want.

In [4]:
limits = limits.drop(["frequency","cost_multiplier","traffic_growth","duration_multiplier"])
limits[['min','max']] = limits[['min','max']].astype(float)

## This is a phased sampling strategy in which we explore model outputs one by one instead of combining everything from the beginning

### We first limit the exploration of uncertainty to scalgo

In [5]:
limits_flood_model = limits.loc[["ini_infiltration","drainage","runoff"],:]

In [6]:
limits_flood_model

name,variable description,min,current hypothesis,max
ini_infiltration,initial inflitration (mm),0.5,1.0,2.0
drainage,drainage (mm/h),1.0,2.0,4.0
runoff,runoff (%),20.0,50.0,70.0


#### Here I use 10 scenarios, but we could run more if possible

In [7]:
# Scale the unit-hypercube LHS sample to the [min, max] range of each variable.
ranges = np.diff(limits_flood_model[['min','max']].values).T
scenarios_flood = lhs(len(limits_flood_model.index), samples=10, criterion="corr") * ranges + limits_flood_model['min'].values
scenarios_flood = pd.DataFrame(scenarios_flood, columns=limits_flood_model.index)

new candidate solution found with max,abs corrcoef = 0.9999977602461374
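
The scaling in the cell above maps samples from the unit hypercube onto the [min, max] range of each variable. The sketch below illustrates the idea with plain NumPy on the three flood-model ranges from the limits table. `simple_lhs` is a hypothetical helper, not pyDOE's implementation, and it leaves out the correlation-minimising step that `criterion="corr"` performs.

```python
import numpy as np

def simple_lhs(n_vars, n_samples, rng):
    """Basic Latin hypercube on [0, 1): one sample per stratum and per variable."""
    # Stratum k covers [k/n, (k+1)/n); draw one point inside each stratum.
    points = (np.arange(n_samples)[:, None] + rng.random((n_samples, n_vars))) / n_samples
    # Shuffle the strata independently for each variable.
    for j in range(n_vars):
        points[:, j] = rng.permutation(points[:, j])
    return points

# Ranges copied from the limits table: ini_infiltration, drainage, runoff.
lower = np.array([0.5, 1.0, 20.0])
upper = np.array([2.0, 4.0, 70.0])

rng = np.random.default_rng(0)  # fixed seed, for reproducibility only
unit = simple_lhs(n_vars=3, n_samples=10, rng=rng)
scaled = lower + unit * (upper - lower)  # broadcasting scales column by column
```

Each column of `scaled` then contains exactly one value in each tenth of that variable's range, which is what makes 10 runs cover the range evenly.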


In [8]:
scenarios_flood

name,ini_infiltration,drainage,runoff
0,0.64678,1.15018,48.806338
1,0.972315,3.316981,57.001041
2,1.33979,2.771678,33.038617
3,1.862928,1.49517,43.058856
4,1.840135,1.641037,22.905541
5,1.598211,3.774615,39.30802
6,1.141723,3.60211,51.346521
7,1.46874,2.426772,65.929028
8,0.751857,2.915168,63.552843
9,0.921544,2.020141,29.47114


We can also select a subset of the rainfall events (e.g. 2, 6, 10, 14 and 20 mm/hr), so that we run scalgo only 50 times (10 scenarios × 5 events).

In [9]:
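# One copy of the flood scenarios per rainfall event (mm/hr). assign() returns a
# new frame, so scenarios_flood itself is not modified; DataFrame.append was
# removed in pandas 2.0, hence pd.concat.
all_simulations_scalgo = pd.concat(
    [scenarios_flood.assign(rainfall_event=event) for event in [2, 6, 10, 14, 20]]
)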

In [10]:
all_simulations_scalgo.to_csv("all_simulations_scalgo.csv",index=False)

### We then analyze the scalgo results in terms of (i) water depth on critical infrastructure and (ii) flood extent for each of the rainfall events, and we identify the uncertain variables that significantly change the results (if there are significant changes)
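
One simple way to screen which uncertain variables matter is to rank-correlate each input with a scalgo output across the runs. The sketch below does this with NumPy on synthetic placeholder numbers, since the real scalgo outputs are not available yet. `spearman` is a hypothetical helper, and the made-up output depends mostly on runoff by construction.

```python
import numpy as np

def spearman(x, y):
    """Spearman rank correlation (no tie handling, fine for continuous samples)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return np.corrcoef(rank_x, rank_y)[0, 1]

rng = np.random.default_rng(1)
n_runs = 50
# Synthetic inputs drawn uniformly within the ranges of the limits table.
inputs = {
    "ini_infiltration": rng.uniform(0.5, 2.0, n_runs),
    "drainage": rng.uniform(1.0, 4.0, n_runs),
    "runoff": rng.uniform(20.0, 70.0, n_runs),
}
# Placeholder output, NOT a scalgo result: driven mainly by runoff, plus noise.
water_depth = 0.8 * inputs["runoff"] - 2.0 * inputs["drainage"] + rng.normal(0, 1, n_runs)

# Rank the inputs by the absolute value of their correlation with the output.
sensitivity = {name: spearman(values, water_depth) for name, values in inputs.items()}
ranking = sorted(sensitivity, key=lambda name: abs(sensitivity[name]), reverse=True)
```

With real results, the same ranking could be computed per rainfall event to see whether the influential variables change with rainfall intensity.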

In [11]:
outputs_scalgo = pd.read_excel("expected_results_scalgo.xlsx")

Here we can cluster the results and select only a few representative scenarios. I have clustering algorithms that can do that.

For now I select scenarios randomly to illustrate the concept, but we need to give some thought to the selection, especially for the rainfall events.

In [12]:
scalgo_representative_scenarios = outputs_scalgo[["ini_infiltration","drainage","runoff","rainfall_event","water_depth_critical_infra1"]].sample(5)
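
As an alternative to the random draw above, the clustering idea could be sketched as follows: group the runs by their outputs with k-means and keep the run closest to each cluster centre. This is only an illustration with plain NumPy on synthetic numbers. `kmeans_representatives` is a hypothetical helper, not the clustering algorithm referred to above.

```python
import numpy as np

def kmeans_representatives(values, n_clusters, n_iter=50, seed=0):
    """Cluster rows with plain k-means; return the row index closest to each centroid."""
    rng = np.random.default_rng(seed)
    # Standardise so that no single column dominates the distance.
    z = (values - values.mean(axis=0)) / values.std(axis=0)
    centroids = z[rng.choice(len(z), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        distances = np.linalg.norm(z[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        for k in range(n_clusters):
            if (labels == k).any():  # keep the old centroid if a cluster empties
                centroids[k] = z[labels == k].mean(axis=0)
    distances = np.linalg.norm(z[:, None, :] - centroids[None, :, :], axis=2)
    # Duplicates are dropped, so fewer than n_clusters indices may be returned.
    return sorted(set(distances.argmin(axis=0).tolist()))

# Synthetic stand-in for two scalgo outputs (e.g. water depth, flood extent) over 50 runs.
rng = np.random.default_rng(2)
fake_outputs = rng.normal(size=(50, 2))
representatives = kmeans_representatives(fake_outputs, n_clusters=5)
```

The returned row positions could then replace the `.sample(5)` call, for instance via `outputs_scalgo.iloc[representatives]`.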

### We select a few representative scenarios from the set above and from the set of rainfall events, and we combine them with a sample of scenarios from the transport model

In [13]:
limits_transport_model = limits.loc[["threshold","elas_demand","elas_switch"],:]
# Same scaling as for the flood model: unit-hypercube sample times range, plus minimum.
ranges = np.diff(limits_transport_model[['min','max']].values).T
scenarios_transport = lhs(len(limits_transport_model.index), samples=10, criterion="corr") * ranges + limits_transport_model['min'].values
scenarios_transport = pd.DataFrame(scenarios_transport, columns=limits_transport_model.index)

new candidate solution found with max,abs corrcoef = 0.9996933655798615


In [14]:
scenarios_transport

name,threshold,elas_demand,elas_switch
0,26.279374,-0.969242,-0.623846
1,48.233552,-0.838448,-1.6413
2,24.300924,-0.658684,-1.77108
3,40.662435,-1.806294,-1.082222
4,33.707947,-1.447188,-1.428699
5,22.258712,-1.948176,-1.308378
6,35.377287,-1.387693,-1.962909
7,42.993215,-1.194158,-1.210852
8,46.856264,-1.655583,-0.72519
9,30.355188,-0.608436,-0.854698


I combine the uncertainty on transport with the scalgo results and hypotheses.

In [15]:
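# Cross every representative scalgo scenario with every transport scenario.
# DataFrame.append was removed in pandas 2.0, so collect the pieces and concatenate once.
runs = []
for _, row in scalgo_representative_scenarios.iterrows():
    repeated = pd.DataFrame(len(scenarios_transport) * [row.values], columns=row.index)
    runs.append(pd.concat([scenarios_transport, repeated], axis=1))
full_transport_runs = pd.concat(runs)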

In [17]:
full_transport_runs.head(5)

Unnamed: 0,threshold,elas_demand,elas_switch,ini_infiltration,drainage,runoff,rainfall_event,water_depth_critical_infra1
0,26.279374,-0.969242,-0.623846,1.315946,3.887733,21.196971,2.0,
1,48.233552,-0.838448,-1.6413,1.315946,3.887733,21.196971,2.0,
2,24.300924,-0.658684,-1.77108,1.315946,3.887733,21.196971,2.0,
3,40.662435,-1.806294,-1.082222,1.315946,3.887733,21.196971,2.0,
4,33.707947,-1.447188,-1.428699,1.315946,3.887733,21.196971,2.0,


In [18]:
full_transport_runs.to_csv("all_simulations_visum.csv",index=False)

### We run visum for all these scenarios and then we can analyze the results and do the economic analysis

In [19]:
outputs_visum = pd.read_excel("expected_results_visum.xlsx")

In [20]:
outputs_visum

Unnamed: 0,threshold,elas_demand,elas_switch,ini_infiltration,drainage,runoff,rainfall_event,km of roads flooded,km of brt lane flooded,Hindered passenger trips,Hindered truck trips,BRT travellength km/day,BRT traveltime hours/day,Other passenger travellength km/day,Other passenger traveltime hours/day,Truck travellength km/day,Truck traveltime hours/day
0,24.928514,-1.650624,-1.092844,1.206559,2.364415,47.882976,14,,,,,,,,,,
1,27.91079,-1.940239,-1.568047,1.206559,2.364415,47.882976,14,,,,,,,,,,
2,37.900446,-1.216311,-1.16395,1.206559,2.364415,47.882976,14,,,,,,,,,,
3,43.137063,-0.878914,-0.696889,1.206559,2.364415,47.882976,14,,,,,,,,,,
4,46.215596,-0.720969,-1.738271,1.206559,2.364415,47.882976,14,,,,,,,,,,
5,24.928514,-1.650624,-1.092844,1.683215,1.063521,29.898387,10,,,,,,,,,,
6,27.91079,-1.940239,-1.568047,1.683215,1.063521,29.898387,10,,,,,,,,,,
7,37.900446,-1.216311,-1.16395,1.683215,1.063521,29.898387,10,,,,,,,,,,
8,43.137063,-0.878914,-0.696889,1.683215,1.063521,29.898387,10,,,,,,,,,,
9,46.215596,-0.720969,-1.738271,1.683215,1.063521,29.898387,10,,,,,,,,,,


Here we will of course need to add the uncertainty on the frequency and duration of flood events, on traffic growth, and possibly on the unit cost of transport, in the economic analysis. Traffic growth is not very interesting because, if I understand correctly, it has no effect on congestion in the model.

#### The economic model can easily be written as a short script, so that it can be run systematically under many scenarios to calculate expected annual losses

To be added later
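
Pending the actual model, the expected-annual-loss step could look like the sketch below. It assumes that each rainfall event class has an annual frequency and a loss per occurrence. `expected_annual_loss`, `loss_per_event` and all numbers are hypothetical placeholders, not results.

```python
def loss_per_event(hindered_trips, cost_per_trip, duration_multiplier=1.0):
    """Hypothetical loss model: hindered trips times unit cost, scaled by incident duration."""
    return hindered_trips * cost_per_trip * duration_multiplier

def expected_annual_loss(events):
    """Sum of frequency (events/year) times loss per event over all event classes."""
    return sum(e["frequency_per_year"] * e["loss_per_event"] for e in events)

# Placeholder inputs for illustration only.
events = [
    {"frequency_per_year": 2.0, "loss_per_event": loss_per_event(1000, 1.5)},
    {"frequency_per_year": 0.5, "loss_per_event": loss_per_event(4000, 1.5, duration_multiplier=2.0)},
]
eal = expected_annual_loss(events)
```

Running this function once per row of the visum outputs, with sampled values for frequency, duration and unit cost, would give a distribution of expected annual losses across scenarios.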