# HOW TO USE THIS TEMPLATE

1. Copy this template notebook and rename it to reflect the scenario results it describes.
2. Within the scenario results notebook, content _outside_ of angle brackets ("<", ">") should remain exactly as-is--this includes the code used to generate displayed results. Text _within_ these brackets must be replaced with the scenario-specific information noted.
3. **Delete this Markdown block** in the scenario results notebook before delivering the report!

# Summary

<A summary of the scenario group is provided here.>

# Contents

**Experimental Setup**: This covers the objective of the study and its methodology, describes the KPIs and success indicators (threshold inequalities), the protocol and environmental sweep parameters and their initial ranges, and finally summarizes the computational complexity of the simulations themselves.

**Adaptive Grid Results**: The evolution of the parameter selection process is presented as a visualization, showing the convergence of the protocol parameter ranges as different success criteria are achieved.

**Protocol Parameter Recommendations**: Based upon the adaptive grid results, the recommended parameter ranges are presented.

**Decision Tree and Parameter Importance**: Using the adaptive grid results, a machine-learning process is applied to infer the importance of different parameters on the associated KPI-based threshold inequalities. This provides a method of assessing whether one or more parameters are 'crucial' to success, in the sense that they have an outsized impact on the success criteria. This approach leverages decision trees that are fit to the results of the entire adaptive grid process.

**Parameter Impact on KPIs**: A density approach (histogram) can be used to assess the impact of protocol parameters on the KPIs of the scenario. The KPI densities are shown for each protocol parameter sweep value, providing a visual indication of the impact of the parameter on the density shape and location.

**Conclusion**: An overall assessment of the scenario results is provided, highlighting any problems, caveats, implications and possibilities for future/extended work.


# Experimental Setup

## Objective and Methodology

<The objective of the scenario group is stated here.>

- **System Goals Targeted**:
  <A list of associated system goals is provided here.>
- **Design**: <a brief statement of the scenario design is provided here.>
- **Testing Variables**:
  - Environmental:
      <A list of the environmental parameters is provided here, with each item prefaced by "Introduce"--each list item contains a definition and description.>
  - Protocol:
      <A list of the protocol parameters is provided here, with each item prefaced by "Assess"--each list item contains a definition and description.>

## KPIs

<A list of KPIs is provided here--each list item contains the KPI number, definition and brief description.>

## Threshold Inequalities

<A list of threshold inequalities is provided here--each list item contains the name of the inequality and the actual threshold values used in the scenarios. Each item's reference in the code, usually with a suffix `_success`, is given.>

## Sweep Parameters

### Swept Protocol Parameters

<An enumerated list of the protocol parameters is given, using their `name` in the code and any relevant abbreviation.>

### Swept Environmental Parameters

<An enumerated list of the environmental parameters is given, using their `name` in the code and any relevant abbreviation.>

## Initial Parameter Sweep Ranges

Protocol and Environmental parameters were initialized for the first adaptive grid search according to:
1. Values found in the specifications provided to BlockScience ([V1 Mechanism spreadsheet](https://docs.google.com/spreadsheets/d/1Gpnw3ZXNh9lWFgmrbrg8wTqVKPD2M8QqJdAiAALru38/edit?usp=sharing), [V1 Minting spreadsheet](https://docs.google.com/spreadsheets/d/1QYe6NzuiyimsXs5cT1BSM-UT1DtX_K38cOZsEFJOtdA/edit?usp=sharing)),
2. Discussions with the Pocket team, and
3. BlockScience best practice. 

### Environmental Parameters

| Full Name |  Sweep Variable Name | Sweep Values | Units |
| --- | --- | ---| ---|
<A list of table entries corresponding to initial ranges for the environmental parameters is given here.>

### Protocol Parameters

| Full Name |  Sweep Variable Name | Sweep Values | Units |
| --- | --- | ---| ---|
<A list of table entries corresponding to initial ranges for the protocol parameters is given here.>

## Computational Complexity

**Total number of parameter constellations**: <Show the total number derived here.>

**Total number of Monte Carlo runs per constellation**: <Show the number of MC runs per parameter constellation here.>

**Total number of experiments per adaptive grid**: <The product of the two immediately preceding numbers is given here.>

**Number of adaptive grid searches**: <The total number of adaptive grid searches performed is shown here.>

**Total number of parameter constellations evaluated**: <The product of the two immediately preceding numbers is here.>
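
As an illustration of how the quantities above relate, the following is a minimal sketch of the arithmetic. It assumes a full-factorial grid (every combination of swept values is simulated), and all parameter names and counts are hypothetical placeholders, not values from any scenario.

```python
from math import prod

# Hypothetical number of sweep values per parameter (illustrative only).
sweep_value_counts = {"param_a": 2, "param_b": 2, "param_c": 2, "env_x": 3}

# Assuming a full-factorial grid, the constellation count is the product
# of the number of values swept for each parameter.
constellations = prod(sweep_value_counts.values())

mc_runs_per_constellation = 5  # hypothetical
experiments_per_grid = constellations * mc_runs_per_constellation

adaptive_grid_searches = 6  # hypothetical
# Product of the two preceding quantities, as specified in the list above.
total_evaluated = experiments_per_grid * adaptive_grid_searches

print(constellations, experiments_per_grid, total_evaluated)
```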

# Adaptive Grid Results

In [None]:
import os

os.chdir("..")  # move to the parent directory (assumed to contain the `psuu` package)
from psuu import (
    load_all_kpi_comparison_data,
    load_scenario_kpi_comparison_data,
    threshold_comparison_plot,
    decision_tree_feature_importance_plot,
)

In [None]:
scenario_sweep_category = <INSERT STRING REPRESENTING SCENARIO GROUP HERE>
kpis = load_scenario_kpi_comparison_data(scenario_sweep_category)
threshold_comparison_plot(kpis[scenario_sweep_category])

# Protocol Parameter Recommendations

From the adaptive grid results, the recommended parameter ranges for the swept protocol parameters are:

| Full Name |  Abbreviation | Recommended Range | Units |
| --- | --- | ---| ---|
<A list of table entries corresponding to the final ranges for the protocol parameters is given here.>

# Decision Tree and Parameter Importance

## Decision Tree Classification

A decision tree is a machine-learning-based classifier. Given the simulation results, for each threshold inequality the tree recursively partitions the results (_samples_) into groups, according to sorting criteria based upon one or more of the protocol parameters of the simulation.

Each decision tree below corresponds to one of the threshold inequalities stated above. Where the decision tree is 'empty', the threshold inequality was either 1) always fulfilled during the simulations, or 2) never fulfilled during the simulations. In either case no sensitivity analysis can be performed, as the outcome of the threshold inequality does not vary across the parameter combinations that were swept.

The title of the decision tree includes the threshold inequality under scrutiny, in addition to a technical 'score' (usually "100%") and the number of simulation results used as the dataset. Within the decision tree presented, each **non-terminal** 'node' is labeled with the following information:

1. The sorting variable used and its cutoff value used for classification, in the form of `parameter_name <= x` where `x` is the cutoff value. Branches to the left of this node indicate satisfaction of this inequality, while branches to the right indicate violations, i.e. `parameter_name > x`.
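2. The Gini impurity of the simulation results at this node, which is the measure used to select the sorting criterion (a value of 0 means that all results at the node belong to a single class).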
3. The share of the simulation results considered at this node, expressed as a percentage "y" of the total ("samples = y%").
4. The breakdown of the simulation results considered into the left and right branches ("value = [p, 1-p]"), where "p" is the fraction of results that satisfy the `parameter_name <= x` constraint, and "1-p" the fraction satisfying `parameter_name > x`.
5. The classification of the majority of the simulation results at this node (note that this is not a final classification, as it appears in a non-terminal node, and can be arbitrary if the results are split equally across classes).

**Terminal** nodes ("leaves") represent the final classification of that proportion of the simulation results that arrive at the node, and have most of the same information as a non-terminal node, with the exception that there is no branching performed and hence no sorting variable displayed. Here the most important information is the classification (last line). 

Non-terminal and terminal nodes colored in blue correspond to the threshold inequality being met, and by following blue boxes from a terminal node up to the root of the tree, a set of `parameter_name <= x` and/or `parameter_name > x` sorting criteria can be chained together.

Upon successful classification, it is usual for the terminal node to have a breakdown "value = [1.0, 0.0]" or "value = [0.0, 1.0]", indicating that 100% of the remaining simulation results treated are either satisfying the threshold inequality under treatment (left value is 1.0), or not satisfying the threshold inequality (right value is 1.0).

For further information regarding the decision tree approach adopted here please see the [Decision Trees](https://scikit-learn.org/stable/modules/tree.html#) documentation from the `scikit-learn` library.
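
To make the node labels more concrete, the following is a minimal pure-Python sketch of how a Gini impurity and a `parameter_name <= x` cutoff can be computed for a single node. It is an illustration only, not the `scikit-learn` implementation used to produce the trees; the function names, parameter values and outcomes are all hypothetical.

```python
def gini_impurity(labels):
    """Gini impurity of boolean labels: 1 - p_true^2 - p_false^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    p_true = sum(labels) / n
    return 1.0 - p_true**2 - (1.0 - p_true) ** 2


def best_cutoff(values, labels):
    """Return (cutoff, weighted_gini) for the best `value <= cutoff` split.

    Returns None when only one distinct value exists, i.e. no split is possible.
    """
    distinct = sorted(set(values))
    # Candidate cutoffs are midpoints between adjacent distinct values.
    candidates = [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]
    best = None
    for cutoff in candidates:
        left = [lab for v, lab in zip(values, labels) if v <= cutoff]
        right = [lab for v, lab in zip(values, labels) if v > cutoff]
        weighted = (
            len(left) * gini_impurity(left) + len(right) * gini_impurity(right)
        ) / len(values)
        if best is None or weighted < best[1]:
            best = (cutoff, weighted)
    return best


# Hypothetical results: one swept parameter, and whether the
# threshold inequality was met in each simulation result.
param_values = [0.1, 0.1, 0.1, 0.9, 0.9, 0.9]
success = [True, True, True, False, False, False]

root_gini = gini_impurity(success)  # classes are evenly mixed at the root
cutoff, split_gini = best_cutoff(param_values, success)
print(f"root gini = {root_gini:.2f}")
print(f"split on param <= {cutoff:.2f} gives weighted gini = {split_gini:.2f}")
```

In this toy dataset the split separates the two classes completely, so both child nodes are pure (Gini impurity of 0) and would appear as terminal nodes.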

## Feature Importance

Below each non-empty decision tree is a bar graph indicating the relative importance of each swept protocol parameter ("feature") in determining the satisfaction of the threshold inequality. This leverages "random forests", a technique that averages over many decision trees, each fit to a different subset of the simulation results, and assesses the relative contribution of each protocol parameter to the branching of each tree. Roughly speaking, the more times a protocol parameter was used in the branching process, the higher its importance is to the threshold inequality--in other words, the protocol parameter carries a larger 'weight' in determining satisfaction or violation of the inequality, and so the inequality is more sensitive to the values of the parameter.

For further information regarding the random forest and feature importance approach adopted here please see the [Random Forest Classifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html) documentation from the `scikit-learn` library.
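
As a rough illustration of the idea, the sketch below fits a `scikit-learn` random forest to synthetic data and reads off its `feature_importances_` attribute, which `scikit-learn` computes from the impurity decrease contributed by each feature. The data, parameter names and forest settings are hypothetical, and it is an assumption that the plotting function used in this notebook obtains importances in a comparable way.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_results = 400

# Hypothetical swept protocol parameters, two sweep values each.
param_a = rng.choice([0.1, 0.9], size=n_results)
param_b = rng.choice([10.0, 20.0], size=n_results)
X = np.column_stack([param_a, param_b])

# Hypothetical outcome: the threshold inequality depends on param_a only.
y = (param_a <= 0.5).astype(int)

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Importances are normalized to sum to 1 across features.
importances = dict(zip(["param_a", "param_b"], forest.feature_importances_))
for name, value in importances.items():
    print(f"{name}: {value:.3f}")
```

Because the synthetic outcome depends only on `param_a`, that parameter should receive nearly all of the importance, which is the pattern a 'crucial' parameter would show in the bar graphs.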

In [None]:
decision_tree_feature_importance_plot(scenario_sweep_category)

# Parameter Impact on KPIs

The simulation results provide, for each combination of swept protocol parameter values ("parameter constellations"), a series of outcomes distinguished by different random realizations of stochastic ("Monte Carlo") distributions. Thus, each of the KPIs can be computed for each simulation result, and a _frequency distribution_ or _density_ of KPI values can be generated for each of the iterations of the adaptive grid approach. These densities can be further broken down by protocol parameter value, and the changes in the densities across different values can be visualized. This provides a method of assessing the sensitivity of the KPIs to the protocol parameters, both at the initial adaptive grid implementation, before optimization is performed, and at the end of the implementation, when the recommended parameter ranges have been achieved.

Each group of figures below corresponds to one of the protocol parameters. Each row in a figure group corresponds to a different KPI, and each column to the initial adaptive grid and final adaptive grid simulation results. Within a figure, the density corresponding to each swept value of the associated protocol parameter is presented (generally, the lower sweep value is displayed in blue, while the upper sweep value is in red, although this may not always be the case). 

By examining the changes in the shape of the KPI densities across sweep values and across the adaptive grid results, a qualitative visual assessment of how sensitive the KPI under scrutiny is to the swept protocol parameter can be made.
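
The density comparison described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions only: the KPI samples are synthetic, the helper `density_histogram` is hypothetical, and the plots in this notebook are produced by the `psuu` plotting functions rather than by this code.

```python
import random


def density_histogram(samples, edges):
    """Histogram heights normalized so the bars integrate to 1 over `edges`."""
    width = edges[1] - edges[0]  # equal-width bins assumed
    counts = [0] * (len(edges) - 1)
    for x in samples:
        # Clip so that the maximum value falls in the last bin.
        index = min(int((x - edges[0]) / width), len(counts) - 1)
        counts[index] += 1
    return [c / (len(samples) * width) for c in counts]


rng = random.Random(0)
# Hypothetical KPI outcomes (Monte Carlo runs) for two sweep values of one parameter.
kpi_by_sweep_value = {
    "lower": [rng.gauss(1.0, 0.2) for _ in range(500)],
    "upper": [rng.gauss(1.5, 0.2) for _ in range(500)],
}

# Shared bin edges so that the two densities are directly comparable.
all_values = [x for samples in kpi_by_sweep_value.values() for x in samples]
lo, hi = min(all_values), max(all_values)
n_bins = 20
edges = [lo + i * (hi - lo) / n_bins for i in range(n_bins + 1)]

densities = {
    name: density_histogram(samples, edges)
    for name, samples in kpi_by_sweep_value.items()
}
for name, density in densities.items():
    peak_bin = max(range(n_bins), key=density.__getitem__)
    peak_center = (edges[peak_bin] + edges[peak_bin + 1]) / 2
    print(f"{name} sweep value: density peaks near KPI = {peak_center:.2f}")
```

A clear shift in the location of the peak between sweep values, as in this synthetic example, is the kind of visual signal that suggests the KPI is sensitive to the parameter.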

In [None]:
from psuu.parameter_impact_plots import read_and_format_data, make_initial_vs_final_plot
merged_df = read_and_format_data(scenario_sweep_category=scenario_sweep_category)

In [None]:
latest_adaptive_grid = max(kpis[scenario_sweep_category])
for param_name in kpis[scenario_sweep_category][latest_adaptive_grid]['variable_params']:
    make_initial_vs_final_plot(
        df=merged_df,
        scenario_sweep_category=scenario_sweep_category,
        param_name="param_" + param_name,
        fig_width=30,
    )

# Conclusion

<A concluding paragraph is provided here, which can be used to highlight features of the results, caveats, problems, and directions for future scenario work.>