## Travel time analysis iterations

Notebook to analyse how many replications, what run length and how much cooldown time are required.

In [48]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import scipy.stats as st
import numpy as np

### The importance of cooldowns

A warmup or cooldown period is an essential part of agent-based and discrete-event modelling. A model should have either:
 - Time to reach a steady state (a warmup), if aggregated model variables are measured
 - Time to get all entities out of the system (a cooldown), if agent/entity variables are measured

In this case we are measuring agent-level variables (the travel time of each vehicle), so we should use a cooldown that lets all agents leave the system and report their data in the process.
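
To illustrate why a cooldown matters for agent-level measurements, here is a minimal sketch with synthetic data. It is not the actual model: the departure times, the exponential travel time distribution and all numbers are assumptions chosen only to show the effect. Vehicles that have not arrived when the model stops are never recorded, so a run without cooldown mostly sees the short trips.

```python
import numpy as np

rng = np.random.default_rng(42)

run_length = 12 * 60   # minutes in which vehicles depart (assumed)
n_vehicles = 10_000    # hypothetical number of vehicles

# Hypothetical departure times and "true" travel times, in minutes
departure = rng.uniform(0, run_length, size=n_vehicles)
travel_time = rng.exponential(scale=300, size=n_vehicles)
arrival = departure + travel_time


def mean_recorded_travel_time(cooldown_hours):
    """Mean travel time of the vehicles that arrive before the model stops."""
    end_time = run_length + cooldown_hours * 60
    arrived = arrival <= end_time
    return travel_time[arrived].mean()


no_cooldown = mean_recorded_travel_time(0)
long_cooldown = mean_recorded_travel_time(48)
true_mean = travel_time.mean()

print(f"No cooldown:   {no_cooldown:.1f} min")
print(f"48 h cooldown: {long_cooldown:.1f} min")
print(f"All vehicles:  {true_mean:.1f} min")
```

In this toy setup the recorded average without cooldown is lower than the average over all vehicles, because long trips are the ones cut off at the end of the run.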

In [49]:
# Read in travel time results and SourceSink data
cooldown = [0, 6]
results_1 = {}
for c in cooldown:
    results_1[c] = pd.read_csv(f"../experiments/iteration/scen_4_12_hours_5_reps_{c}h_cooldown.csv", index_col=0)
    results_1[c].drop("VehicleID", axis="columns", inplace=True)
source_sinks = pd.read_csv("../experiments/source_data.csv", index_col=0)

In [50]:
def proces_results(results, do_print=False, do_export=False):
    """Summarise travel times per experiment with a 95% confidence interval of the mean."""
    confidence = 0.95

    average = {}
    low_bound = {}
    high_bound = {}
    interval_range = {}
    economic_interval = {}

    for i in results.keys():
        results_list = results[i]["Travel_Time"].tolist()
        average[i] = np.mean(results_list)
        # The confidence level is passed positionally, because the keyword was
        # renamed from `alpha` to `confidence` in newer SciPy releases
        low_bound[i], high_bound[i] = st.norm.interval(confidence, loc=average[i], scale=st.sem(results_list))
        interval_range[i] = high_bound[i] - low_bound[i]
        # Width of the interval as a percentage of the average
        economic_interval[i] = interval_range[i] / average[i] * 100

        if do_print:
            print(f'Average travel time (95% confidence interval) for scenario {i}: {average[i]:.3f} ({low_bound[i]:.3f}, {high_bound[i]:.3f}), economic interval: {economic_interval[i]:.3f}%')

    df = pd.DataFrame({
        "Average (min)": average,
        "Low bound (min)": low_bound,
        "High bound (min)": high_bound,
        "Interval range (min)": interval_range,
        "Economic interval (%)": economic_interval})
    # Export only when explicitly requested
    if do_export:
        df.to_csv("../results/travel_times.csv", index_label="Scenario")
    return df
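
For reference, the quantities computed by the function above can be written out as follows. This is a restatement of the code, not an additional method: $x_1, \dots, x_n$ are the pooled travel times of one experiment, $\bar{x}$ their mean and $s$ their sample standard deviation (`st.sem` uses one delta degree of freedom by default). The symbols themselves are introduced here for illustration.

$$
\text{low},\ \text{high} = \bar{x} \mp z_{0.975}\,\frac{s}{\sqrt{n}}, \qquad z_{0.975} \approx 1.96
$$

$$
\text{Interval range} = \text{high} - \text{low} = 2\,z_{0.975}\,\frac{s}{\sqrt{n}}, \qquad
\text{Economic interval} = 100 \cdot \frac{\text{high} - \text{low}}{\bar{x}}
$$

Note that pooling all vehicles treats every travel time as an independent sample, so the interval mainly reflects the number of recorded vehicles.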

In [51]:
df1 = proces_results(results_1)
df1

Cooldown (h),Average (min),Low bound (min),High bound (min),Interval range (min),Economic interval (%)
0,200.917212,195.700219,206.134206,10.433987,5.193177
6,343.374288,337.449867,349.298708,11.848842,3.450707


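With no cooldown the average travel time is 201 minutes, while with 6 hours of cooldown it is 343 minutes. Without a cooldown, vehicles that are still on the road when the run ends are never recorded, which biases the average towards short trips.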

So let's do some more runs to see how much cooldown we need. This time we use a longer run time of 48 hours and test cooldowns from 0 to 48 hours in 6-hour increments.

In [52]:
cooldown = list(range(0,54,6))
results_2 = {}
for c in cooldown:
    results_2[c] = pd.read_csv(f"../experiments/iteration/scen_4_48_hours_5_reps_{c}h_cooldown.csv", index_col=0)
    results_2[c].drop("VehicleID", axis="columns", inplace=True)

df2 = proces_results(results_2)
df2

Cooldown (h),Average (min),Low bound (min),High bound (min),Interval range (min),Economic interval (%)
0,780.000939,775.246798,784.755081,9.508283,1.219009
6,858.374349,853.661335,863.087363,9.426028,1.098126
12,874.604985,870.158428,879.051543,8.893115,1.016815
18,953.234349,948.575589,957.893108,9.317519,0.977464
24,1030.740568,1026.227075,1035.254061,9.026986,0.875777
30,1037.395833,1032.732421,1042.059245,9.326823,0.899061
36,1011.840216,1007.365071,1016.315362,8.950291,0.884556
42,1101.74745,1096.806066,1106.688835,9.882769,0.897009
48,1111.363581,1106.495641,1116.231521,9.735879,0.87603


From the results above, the travel time has largely stabilized after a cooldown of 24 hours. Due to the low number of replications, there is still some noise in the results.

We will continue with a cooldown of 24 hours (24 * 60 = 1440 steps) for our further iteration on the model run parameters.

### Run length

Now that we have a proper cooldown period, the run length should introduce a lot less bias towards faster-arriving vehicles. Still, the run has to be long enough to produce a sufficient number of vehicles (samples) for statistically representative averages and other statistics.

We will try run lengths of 12 to 84 hours (3.5 days) in 12-hour increments, again with 5 replications (which will introduce some noise) and a 24-hour cooldown.

In [53]:
run_length = list(range(12,96,12))
results_3 = {}
for r in run_length:
    results_3[r] = pd.read_csv(f"../experiments/iteration/scen_4_{r}_hours_5_reps_24h_cooldown.csv", index_col=0)
    results_3[r].drop("VehicleID", axis="columns", inplace=True)

df3 = proces_results(results_3)
df3

Run length (h),Average (min),Low bound (min),High bound (min),Interval range (min),Economic interval (%)
12,859.030599,851.201972,866.859225,15.657253,1.822665
24,937.314113,931.326449,943.301778,11.975329,1.277622
36,941.747066,936.844918,946.649215,9.804297,1.041075
48,1030.740568,1026.227075,1035.254061,9.026986,0.875777
60,991.87601,987.970952,995.781067,7.810115,0.787408
72,987.811795,984.160696,991.462893,7.302198,0.73923
84,1040.204411,1036.681689,1043.727132,7.045443,0.677313


From the results above, the confidence interval decreases as the run length increases. From 48 hours, the 95% confidence interval is smaller than one percent of the average, which we will accept as small enough for this research.
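
The one-percent criterion can also be checked programmatically. The sketch below copies the "Economic interval (%)" column from the run-length table into a pandas Series and looks up the shortest run length that satisfies the threshold. The variable names are illustrative only.

```python
import pandas as pd

# Relative interval widths (%) per run length (h), copied from the table above
relative_width = pd.Series(
    {12: 1.822665, 24: 1.277622, 36: 1.041075, 48: 0.875777,
     60: 0.787408, 72: 0.739230, 84: 0.677313},
    name="Economic interval (%)",
)

threshold = 1.0  # acceptance threshold in percent, as stated in the text

# Shortest run length whose interval is narrower than the threshold
shortest_ok = relative_width[relative_width < threshold].index.min()
print(shortest_ok)  # 48
```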

However, while the confidence interval for each set of runs decreases, the average travel time doesn't converge yet between experiments. For that we need more replications in each experiment.

### Replications


Because different bridges break in each replication, every replication has a different average travel time. Using only 5 replications leaves a lot of variability. Because of that, we will vary the number of replications from 5 to 25 in increments of 5.
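
One way to make the variability between replications visible is to compute one average per replication and look at the spread of those averages. The sketch below uses synthetic data and assumes a `Replication` column that identifies the replication of each vehicle. Whether the real result files contain such a column, and what it is called, is an assumption.

```python
import numpy as np
import pandas as pd
import scipy.stats as st

rng = np.random.default_rng(0)

# Synthetic stand-in for one results file: 5 replications of 200 vehicles each.
# The "Replication" column name is hypothetical.
frames = []
for rep in range(5):
    rep_level = rng.normal(1000, 50)               # each replication has its own level
    times = rng.normal(rep_level, 100, size=200)   # travel times within the replication
    frames.append(pd.DataFrame({"Replication": rep, "Travel_Time": times}))
demo = pd.concat(frames, ignore_index=True)

# One mean per replication, and the standard error between those means
rep_means = demo.groupby("Replication")["Travel_Time"].mean()
between_rep_sem = st.sem(rep_means)

# Standard error when all vehicles are pooled, as in proces_results
pooled_sem = st.sem(demo["Travel_Time"])

print(rep_means.round(1))
print(f"SEM between replication means: {between_rep_sem:.2f}")
print(f"SEM of pooled vehicles:        {pooled_sem:.2f}")
```

When replications differ systematically, the standard error between replication means is typically larger than the pooled one, which is consistent with averages that shift between experiments even though each pooled interval is narrow.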

As determined above, we use a run length of 48 hours with 24 hour cooldown.

In [73]:
replications_suffix = ['', '_1', '_2', '_3', '_4']
results_4 = {}
for e, rs in enumerate(replications_suffix):
    df = pd.read_csv(f"../experiments/iteration/scen_4_48_hours_5_reps_24h_cooldown{rs}.csv", index_col=0)
    df.drop("VehicleID", axis="columns", inplace=True)
    if rs != '':
        r = (int(rs.replace('_', ''))+1)*5
        results_4[r] = pd.concat([results_4[r-5], df])
    else:
        results_4[5] = df
df4 = proces_results(results_4)
df4

Replications,Average (min),Low bound (min),High bound (min),Interval range (min),Economic interval (%)
5,1030.740568,1026.227075,1035.254061,9.026986,0.875777
10,1002.225594,999.069477,1005.381711,6.312234,0.629822
15,992.153112,989.587302,994.718922,5.131621,0.517221
20,991.532514,989.306867,993.758162,4.451295,0.448931
25,996.268943,994.261127,998.276759,4.015632,0.403067


As can be seen, from 15 replications and upwards, the 95% confidence interval is around 0.5% of the average or lower.
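
As a sanity check, the width of a confidence interval of the mean is expected to shrink roughly with one over the square root of the sample size. The sketch below compares the interval ranges from the replications table with that prediction, taking the 5-replication result as the baseline. It assumes the number of recorded vehicles grows proportionally with the number of replications.

```python
import math

# Interval ranges (min) per number of replications, copied from the table above
interval_range = {5: 9.026986, 10: 6.312234, 15: 5.131621, 20: 4.451295, 25: 4.015632}

base_reps = 5
predicted = {
    reps: interval_range[base_reps] * math.sqrt(base_reps / reps)
    for reps in interval_range
}

for reps, observed in interval_range.items():
    print(f"{reps:2d} replications: observed {observed:.2f} min, "
          f"1/sqrt(n) prediction {predicted[reps]:.2f} min")
```

The observed ranges follow the prediction closely. If the same scaling holds, 30 replications would give an interval range of roughly 3.7 minutes.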

Since replications can be run overnight, we will run double that number for each scenario: 30 replications each.