# BootComp Tutorial. Experimental Developments

This notebook provides a step-by-step guide to using BootComp to perform multiple comparisons of simulation output.

This notebook is also used to develop BootComp procedures related to common random numbers (CRN).

## Preamble

To help with management and housekeeping, BootComp is split across a number of Python modules. We need to import these into the notebook before we can run any analysis.

1. Bootstrap = core bootstrap routines for resampling and calculating test statistics.
2. BootIO = file input and output functions, as well as functions for displaying results on screen.
3. ConvFuncs = utility module containing functions for converting between data types.


In [None]:
import Bootstrap as bs
import BootIO as io
import ConvFuncs as cf

Bootstrap_crn = experimental routines, still in development, for handling CRN.

In [None]:
#DEV
import Bootstrap_crn as crn

We are also going to make use of a Python data science library called pandas and a plotting library called Matplotlib.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

## 1. Parameters and setup

In [None]:
N_BOOTS = 2000
INPUT_DATA = "data/real_scenarios.csv"

Scenarios need to be in a .csv file.  Each scenario needs to have the same number of replications/batch means.

In [None]:
help(crn.load_scenarios)

In [None]:
scenario_data = crn.load_scenarios(INPUT_DATA)
N_SCENARIOS = scenario_data.shape[1]
print("Loaded data. {0} scenarios".format(N_SCENARIOS))

The replications for the scenarios/competing systems are returned from load_scenarios as a NumPy array.
NumPy is a Python data science library. Manipulating a NumPy array is substantially more efficient than using
a standard Python list.

In [None]:
scenario_data.shape

In [None]:
df_peek = pd.DataFrame(scenario_data)
df_peek.head(5)

Before we run the bootstrap we need to set up a BootstrapArguments class. The arguments class is a container class that encapsulates all arguments for the bootstrap. Some of these arguments are simple values, e.g. the number of bootstrap resamples, and others are functions, e.g. the test statistic we will calculate on each bootstrap resample.

The ones we are going to use in this example are:

1. nboots = the number of bootstrap resamples to run.
2. nscenarios = the number of scenarios to compare.
3. point_estimate_func = a function that takes a set of bootstrap samples as a parameter and calculates a point estimate, e.g. the mean or standard deviation. The signature is func(data, args).
4. difference_func = a function that calculates the difference between two scenarios (e.g. a mean difference). The signature is func(first_scenario_data, second_scenario_data).
5. summary_func = how do you want to summarise the comparisons? Percentile CIs? Proportion Sj > Si? Or graphically? Specify a function with signature func(data, args).

For this example the test statistic is the mean difference, and the comparisons are summarised as the proportion of bootstrap resamples where Sj > Si.
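
The three function arguments can be pictured with the minimal sketch below. The names `mean_estimate`, `mean_difference` and `proportion_less_than` are hypothetical stand-ins written for illustration; they are not BootComp functions, and the real routines in the Bootstrap module may differ in detail (including the direction of the comparison).

```python
import numpy as np

# Hypothetical stand-ins for the three function arguments (not BootComp code).

def mean_estimate(data, args=None):
    """Point estimate of one bootstrap resample: the sample mean."""
    return float(np.mean(data))

def mean_difference(first_scenario_data, second_scenario_data):
    """Element-wise difference between the bootstrap point estimates of two scenarios."""
    return np.asarray(first_scenario_data) - np.asarray(second_scenario_data)

def proportion_less_than(differences, args=None):
    """Proportion of differences (first - second) that are negative, i.e. first < second."""
    return float(np.mean(np.asarray(differences) < 0))

# Four bootstrap point estimates for each of two toy scenarios
s1 = [10.0, 11.0, 9.0, 10.5]
s2 = [10.5, 10.0, 9.5, 11.0]

diffs = mean_difference(s1, s2)
prop = proportion_less_than(diffs)
print(prop)
```

In this toy example scenario 1 is smaller than scenario 2 in three of the four resamples.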

In [None]:
args =  bs.BootstrapArguments()

args.nboots = N_BOOTS
args.nscenarios = N_SCENARIOS

args.point_estimate_func = bs.bootstrap_mean
args.difference_func = bs.boot_mean_diff
args.summary_func = bs.proportion_x2_lessthan_x1

## 1.a Check if CRN has worked

If common random numbers are working correctly, the outputs of any two scenarios are positively correlated, so the variance of the differences between scenarios will be less than the sum of their variances.
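
The check rests on the identity Var(A - B) = Var(A) + Var(B) - 2Cov(A, B). A minimal NumPy sketch of the idea on synthetic data follows; the shared noise term that mimics CRN and all variable names are assumptions made for illustration, and this is not the implementation of variance_reduction_results.

```python
import numpy as np

rng = np.random.default_rng(42)
n_reps = 1000

# A shared noise term mimics the effect of common random numbers (assumption)
common = rng.normal(0.0, 2.0, size=n_reps)
a = 10.0 + common + rng.normal(0.0, 0.5, size=n_reps)
b = 11.0 + common + rng.normal(0.0, 0.5, size=n_reps)

var_of_diff = np.var(a - b, ddof=1)                   # variance of paired differences
sum_of_vars = np.var(a, ddof=1) + np.var(b, ddof=1)   # what independent sampling would give

# CRN has helped when the paired differences are less variable
crn_effective = var_of_diff < sum_of_vars
print(var_of_diff, sum_of_vars, crn_effective)
```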

In [None]:
help(crn.variance_reduction_results)

In [None]:
crn.variance_reduction_results(scenario_data)

## 2. Analysis Procedures

To generate a dataset containing resampled point estimates for each scenario, run the resample_all_scenarios function from the Bootstrap_crn module. It takes two parameters: (i) the scenario data and (ii) the number of bootstrap resamples.
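
One way such a resampler could preserve the CRN pairing is to draw a single set of replication indices per bootstrap and apply it to every scenario. The sketch below shows that idea under stated assumptions: rows are replications, columns are scenarios, the point estimate is the mean, and `resample_means_crn` is a hypothetical name. Whether resample_all_scenarios works this way is not confirmed here.

```python
import numpy as np

def resample_means_crn(scenario_data, n_boots, seed=None):
    """Bootstrap means using the same resampled rows for every scenario (hypothetical sketch)."""
    rng = np.random.default_rng(seed)
    n_reps, n_scenarios = scenario_data.shape
    # One set of replication indices per bootstrap, shared by all scenarios
    idx = rng.integers(0, n_reps, size=(n_boots, n_reps))
    # scenario_data[idx] has shape (n_boots, n_reps, n_scenarios)
    return scenario_data[idx].mean(axis=1)

# Toy data: 4 replications of 3 scenarios; columns differ by a constant
toy = np.arange(12, dtype=float).reshape(4, 3)
boot_means = resample_means_crn(toy, n_boots=5, seed=1)
print(boot_means.shape)
```

Because each bootstrap uses the same rows for all scenarios, the difference between two columns of the toy data is constant across resamples, which is the pairing that CRN relies on.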

### 2.1 Resampling

In [None]:
help(crn.resample_all_scenarios)

In [None]:
boot_data = crn.resample_all_scenarios(scenario_data, N_BOOTS)

In total, 20 * 2000 * 59 samples have been taken.

In [None]:
boot_data.shape

We can inspect the bootstrap samples more easily by converting the NumPy array to a pandas DataFrame.

In [None]:
df_boots = pd.DataFrame(boot_data, columns = [i+1 for i in range(N_SCENARIOS)])
df_boots.index = df_boots.index+1
df_boots.head(5)

### 2.2. Comparison Scenarios

In [None]:
help(bs.compare_scenarios_pairwise)

In [None]:
# cast from numpy array to list of lists for backward compatibility
boot_data2 = boot_data.T.tolist()
results = bs.compare_scenarios_pairwise(boot_data2, args) 

### 2.3. Printing Results

In [None]:
matrix = io.results_to_matrix(results)
io.insert_inverse_results(matrix, args.nscenarios)
df = cf.matrix_to_dataframe(matrix, io.scenario_headers(args.nscenarios))
df.style.applymap(io.colour_cells_by_proportion)

## 2.4 Ranking Systems

### 2.4.1 Rank by wins

To help identify the top systems found in experimentation, use Bootstrap.rank_systems_min and Bootstrap.rank_systems_max. These return a frequency table reporting the number (or proportion) of bootstrap resamples where system i was the best (minimum or maximum). Results exclude systems that never came top.
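
The idea of ranking by wins can be sketched with plain pandas. This is an illustration on toy data with assumed variable names, not the code behind rank_systems_min.

```python
import pandas as pd

# Toy bootstrap point estimates: rows are resamples, columns are system ids
boots = pd.DataFrame({
    1: [5.0, 4.0, 6.0, 5.0],
    2: [4.0, 5.0, 5.0, 4.0],
    3: [9.0, 9.0, 9.0, 9.0],
})

winners = boots.idxmin(axis=1)      # system with the smallest value in each resample
wins = winners.value_counts()       # systems that never win do not appear
proportions = wins / len(boots)     # share of resamples won by each system
print(wins)
```

System 3 never has the smallest value, so it is absent from the frequency table.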

In [None]:
help(bs.rank_systems_min)

In [None]:
ranks = bs.rank_systems_min(df_boots, args)
ranks

### 2.4.2 Top m

Alternatively, you can find the proportion of resamples where a system was among the best m systems.
For example: in what proportion of the 2000 bootstrap resamples was the result for system 1 among the smallest 5 of all the systems?

There are two functions:
1. Bootstrap.rank_systems_msmallest(DataFrame, BootstrapArguments, m)
2. Bootstrap.rank_systems_mlargest(DataFrame, BootstrapArguments, m)

Note that when m = 1 these functions are equivalent to those introduced in section 2.4.1

For this example the output measure relates to waiting times. We therefore use function 1 above, which is concerned with finding systems that consistently rank among the lowest 5 waiting times.
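
A minimal pandas sketch of the "best m" count is shown below. It uses toy data and m = 2, the variable names are assumptions, and it is not the implementation of rank_systems_msmallest.

```python
import pandas as pd

# Toy bootstrap point estimates: rows are resamples, columns are system ids
boots = pd.DataFrame({
    1: [5.0, 4.0, 6.0, 5.0],
    2: [4.0, 5.0, 5.0, 4.0],
    3: [9.0, 3.0, 9.0, 9.0],
})
m = 2

# Rank within each resample; rank 1 is the smallest value.
# With method="min", ties share the lower rank, so tied systems can push
# the number of systems counted in a resample above m.
ranks = boots.rank(axis=1, method="min")
in_top_m = ranks <= m
counts = in_top_m.sum(axis=0)       # resamples in which each system is among the m smallest
print(counts)
```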

In [None]:
msmallest = bs.rank_systems_msmallest(df_boots, args, 5)
msmallest

To visualise the results as a bar chart, call the plot() function of the pandas DataFrame.

In [None]:
# the DataFrame index (system id) is used for the x axis by default
ax = msmallest.plot(y='f_x', kind='bar')
ax.set_ylabel('frequency (within best 5 systems)')

plt.show()

### 2.4.3 Rerunning the comparisons with smaller subset

The ranking analysis from 2.4.2 provides a shortlist of the systems that consistently appear in the best m systems. This information can be used to compare only the subset of systems with the most promising outputs. First we need to do two things:
1. Get the system/scenario ids as a list.
2. Get the bootstrap resamples corresponding to this subset.
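
The two steps can be sketched with plain NumPy as follows. The toy array and the ids are made up for illustration; the layout (rows are resamples, columns are scenarios) is the one used for the bootstrap data earlier in this notebook.

```python
import numpy as np

# Toy bootstrap data: 3 resamples (rows) of 4 scenarios (columns)
boot = np.arange(12).reshape(3, 4)

# Step 1: one-based scenario ids of the shortlisted systems (made-up values)
ids = [2, 4]
zero_based = [i - 1 for i in ids]

# Step 2: keep only the shortlisted columns, then convert to a list of
# per-scenario lists, one inner list per shortlisted scenario
subset = boot[:, zero_based].T.tolist()
print(subset)
```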

In [None]:
subset_indexes = msmallest.index.values.tolist()
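# convert the one-based scenario ids to zero-based list indexes
subset_indexes_zero = [x - 1 for x in subset_indexes]
# select from the per-scenario list of lists (boot_data2), not the raw numpy array
subset = cf.subset_of_list(boot_data2, subset_indexes_zero)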
args.nscenarios = len(subset)
results = bs.compare_scenarios_pairwise(subset, args) 
matrix = io.results_to_matrix(results) 
io.insert_inverse_results(matrix, args.nscenarios)

In [None]:
df_subset = cf.matrix_to_dataframe(matrix, [str(i) for i in subset_indexes])
# reindex_axis was removed in pandas 1.0; reindex(..., axis=...) is the replacement
df_subset = df_subset.reindex(sorted(df_subset.columns), axis=1)
df_subset.reindex(sorted(df_subset.columns), axis=0).style.applymap(io.colour_cells_by_proportion)

## 2.5. Visualise the subset bootstrap samples

In [None]:
subset_indexes = msmallest.index.values.tolist()
ax = df_boots[subset_indexes].plot.box()