# Experiment README

## Table of Contents

* [Overview of Experiment Architecture](#Overview-of-Experiment-Architecture)
* [Experiment Workflow](#Experiment-Workflow)
    * [Modifying State Variables](#Modifying-State-Variables)
    * [Modifying System Parameters](#Modifying-System-Parameters)
    * [Executing Experiments](#Executing-Experiments)
    * [Post-processing and Analysing Results](#Post-processing-and-Analysing-Results)
    * [Visualizing Results](#Visualizing-Results)
* [Creating New, Customized Experiment Notebooks](#Creating-New,-Customized-Experiment-Notebooks)
    * Step 1: Select an experiment template
    * Step 2: Create a new notebook
    * Step 3: Customize the experiment
    * Step 4: Execute the experiment
* [Advanced Experiment-configuration & Simulation Techniques](#Advanced-Experiment-configuration-&-Simulation-Techniques)
    * [Setting Simulation Timesteps and Unit of Time `dt`](#Setting-Simulation-Timesteps-and-Unit-of-Time-dt)
    * [Changing the Mento1 -> Mento2 Upgrade Stage](#Changing-the-Mento1-->-Mento2-Upgrade-Stage)
    * [Performing Large-scale Experiments](#Performing-Large-scale-Experiments)

# Overview of Experiment Architecture

The experiment architecture is composed of the following four elements – the **model**, **default experiment**, **experiment templates**, and **experiment notebooks**:

1. The **model** is initialized with a default Initial State and set of System Parameters defined in the `model` module.
2. The **default experiment** – in the `experiments.default_experiment` module – is an experiment composed of a single simulation that uses the default cadCAD **model** Initial State and System Parameters. Additional default simulation execution settings such as the number of timesteps and runs are also set in the **default experiment**.
3. The **experiment templates** – in the `experiments.templates` module – contain pre-configured analyses based on the **default experiment**. Examples include... To be created!
4. The **experiment notebooks** perform various scenario analyses by importing existing **experiment templates**, optionally modifying the Initial State and System Parameters within the notebook, and then executing them.

# Experiment Workflow

If you just want to run (execute) existing experiment notebooks, simply open the respective notebook and execute all cells.

Depending on the chosen template and planned analysis, the required imports might differ slightly from the below standard dependencies:

In [1]:
# Import the setup module:
# * sets up the Python path
# * runs shared notebook-configuration methods, such as loading IPython modules
import setup

# External dependencies
import copy
import logging
import pandas as pd
import plotly.express as px
from pprint import pprint
import importlib as imp

# Project dependencies
import model.constants as constants
from experiments.run import run
from experiments.utils import display_code
import experiments.notebooks.visualizations as visualizations


In [2]:
imp.reload(setup)

The autotime extension is already loaded. To reload it, use:
  %reload_ext autotime
The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


<module 'setup' from '/home/bowd/Workspace/job/celo/mento2-model/experiments/notebooks/setup.py'>

time: 22.5 ms (started: 2022-02-02 15:44:58 +01:00)


We can then import the default experiment, and create a copy of the simulation object – we create a new copy for each analysis we'd like to perform:

In [3]:
import experiments.default_experiment as default_experiment
simulation_analysis_1 = copy.deepcopy(default_experiment.experiment.simulations[0])

time: 20.1 ms (started: 2022-02-02 15:44:58 +01:00)


We can use the `display_code` method to see the configuration of the default experiment before making changes:

In [4]:
display_code(default_experiment)  # In this example equivalent to display_code(simulation_analysis_1.)

time: 84.9 ms (started: 2022-02-02 15:44:58 +01:00)


As an alternative to modifying the default experiment in a notebook, as shown in the next section, we can also load predefined experiment templates:

In [5]:
import experiments.templates.monte_carlo_analysis as monte_carlo_analysis
simulation_analysis_2 = copy.deepcopy(monte_carlo_analysis.experiment.simulations[0])
display_code(monte_carlo_analysis)

time: 61.8 ms (started: 2022-02-02 15:44:58 +01:00)


## Modifying State Variables

To view the Initial State (radCAD model-configuration setting `initial_state`) of the State Variables, and the values they have been set to, we can inspect the dictionary as follows:

In [6]:
pprint(simulation_analysis_1.model.initial_state)

{'celo_price': 3.3843676128627402,
 'cusd_price': 1.0,
 'floating_supply': {'celo': 187391026.43773282, 'cusd': 59011440.89484415},
 'market_price': {'cusd_usd': 1},
 'mento_buckets': {'celo': 0.0, 'cusd': 0.0},
 'mento_rate': 3.3843676128627402,
 'reserve_account': {'account_id': 0, 'celo': 120000000.0, 'cusd': 0.0},
 'timestamp': None,
 'virtual_tanks': {'usd': 59011440.89484415}}
time: 12.1 ms (started: 2022-02-02 15:44:58 +01:00)


To modify the value of **State Variables** for a specific analysis, you need to select the relevant simulation and update the chosen model Initial State. For example, to update the `floating_supply` Initial State to `100e6` CELO and `123e5` cUSD:

In [7]:
simulation_analysis_1.model.initial_state.update({
    'floating_supply': {
        'celo': 100e6,
        'cusd': 123e5},
})

time: 19 ms (started: 2022-02-02 15:44:58 +01:00)


In [8]:
simulation_analysis_1.model.initial_state.update({
    'market_price': {
        'cusd_usd': 1},
})

time: 18.3 ms (started: 2022-02-02 15:44:59 +01:00)


Show updated initial `floating_supply`:

In [9]:
pprint(simulation_analysis_1.model.initial_state)

{'celo_price': 3.3843676128627402,
 'cusd_price': 1.0,
 'floating_supply': {'celo': 100000000.0, 'cusd': 12300000.0},
 'market_price': {'cusd_usd': 1},
 'mento_buckets': {'celo': 0.0, 'cusd': 0.0},
 'mento_rate': 3.3843676128627402,
 'reserve_account': {'account_id': 0, 'celo': 120000000.0, 'cusd': 0.0},
 'timestamp': None,
 'virtual_tanks': {'usd': 59011440.89484415}}
time: 18.8 ms (started: 2022-02-02 15:44:59 +01:00)
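
Note that `dict.update` replaces a top-level entry wholesale rather than merging nested dictionaries, which is why both `celo` and `cusd` are supplied in the `floating_supply` update above. A minimal sketch in plain Python, using a cut-down stand-in for the Initial State (the values are illustrative, not the model's defaults):

```python
import copy

# Cut-down stand-in for the model's Initial State (illustrative values only)
initial_state = {
    "floating_supply": {"celo": 187_000_000.0, "cusd": 59_000_000.0},
    "market_price": {"cusd_usd": 1},
}

# Top-level update: the whole nested dictionary is replaced,
# so the 'cusd' entry is lost
shallow = copy.deepcopy(initial_state)
shallow.update({"floating_supply": {"celo": 100e6}})
print(shallow["floating_supply"])

# Updating the nested dictionary itself keeps the sibling keys
nested = copy.deepcopy(initial_state)
nested["floating_supply"].update({"celo": 100e6})
print(nested["floating_supply"])
```

If only one nested value needs to change, updating the inner dictionary avoids accidentally dropping keys that the model expects to find.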


## Modifying System Parameters

To view the System Parameters (radCAD model-configuration setting `params`), and the values they have been set to, we can inspect the dictionary as follows:

In [10]:
pprint(simulation_analysis_1.model.params)

{'bucket_update_frequency_seconds': [300],
 'cusd_demand': [10000000],
 'date_irps': [datetime.datetime(2022, 3, 1, 0, 0)],
 'date_stability_providers': [datetime.datetime(2022, 10, 1, 0, 0)],
 'drift_market_price': [0],
 'dt': [1],
 'max_sell_amount': [0.1],
 'model': [<MarketPriceModel.GBM: 'gbm'>],
 'reserve_fraction': [0.01],
 'spread': [0.005],
 'volatility_market_price': [0.1]}
time: 20.8 ms (started: 2022-02-02 15:44:59 +01:00)


To modify the value of **System Parameters** for a specific analysis, you need to select the relevant simulation, and update the chosen model System Parameter (which is a list of values). For example, to update the `reserve_fraction` System Parameter to a sweep of two values, `0.001` and `0.1`:

In [11]:
simulation_analysis_1.model.params.update({
    "reserve_fraction": [0.001, 0.1],
})

time: 19.9 ms (started: 2022-02-02 15:44:59 +01:00)
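
Each System Parameter is a list, and each position in the list corresponds to one subset of the sweep. The sketch below illustrates that idea with a hypothetical helper, `expand_sweep`, which is not part of radCAD. It assumes that parameters with fewer values than the longest list reuse their last value; that is our understanding of the engine's behaviour, so check the radCAD documentation before relying on it.

```python
from typing import Any, Dict, List


def expand_sweep(params: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Hypothetical helper: build one parameter dictionary per subset.

    Assumption: shorter lists repeat their last value.
    """
    n_subsets = max(len(values) for values in params.values())
    return [
        {name: values[min(i, len(values) - 1)] for name, values in params.items()}
        for i in range(n_subsets)
    ]


# Two values for reserve_fraction, a single value for spread
params = {"reserve_fraction": [0.001, 0.1], "spread": [0.005]}
subsets = expand_sweep(params)

for index, subset in enumerate(subsets):
    print(f"subset {index}: {subset}")
```

Under this assumption, sweeping one parameter with two values yields two subsets, and every single-valued parameter keeps its value in both.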


## Executing Experiments

We can now execute our custom analysis and retrieve the post-processed Pandas DataFrame using the `run(...)` method:

In [12]:
df, exceptions = run(simulation_analysis_1)

2022-02-02 15:44:59,312 - root - INFO - Running experiment


time: 3.54 s (started: 2022-02-02 15:44:59 +01:00)


## Post-processing and Analysing Results

We can see that we had no exceptions for the single simulation we executed:

In [None]:
exceptions[0]['exception'] is None
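
When more than one simulation, run, or subset is executed, it can be worth checking every entry rather than only the first. A small sketch on a mocked list: only the `exception` key is taken from the cell above, and the `simulation`, `run`, and `subset` keys are assumptions made for illustration.

```python
# Mocked stand-in for the `exceptions` list returned by run(...).
# Only the 'exception' key appears in this README; the other keys are assumed.
mock_exceptions = [
    {"simulation": 0, "run": 1, "subset": 0, "exception": None},
    {"simulation": 0, "run": 1, "subset": 1, "exception": ValueError("example failure")},
]

# Collect every entry that recorded an exception
failed = [entry for entry in mock_exceptions if entry["exception"] is not None]
all_ok = not failed

for entry in failed:
    print(f"subset {entry['subset']} failed with {entry['exception']!r}")
print("all simulations succeeded:", all_ok)
```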

We can simply display the Pandas DataFrame to inspect the results. This DataFrame already has some default post-processing applied (see [experiments/post_processing.py](../post_processing.py)). For example, parameters that change in the parameter grid (if there are any) are attached as columns to the end of the DataFrame.

In [None]:
df[df.simulation==0]

In [None]:
# Show which reserve_fraction values were used in the grid
df.groupby('subset')['reserve_fraction'].unique()

We can also use Pandas for numerical analyses:

In [None]:
# Get the maximum mento_rate for each subset: in this example each reserve_fraction value used in the grid.
df.groupby('subset')['mento_rate'].max()
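
Several statistics can be computed per subset in one pass with a named aggregation. The sketch below runs on a small made-up DataFrame that only mimics the column names used above; the numbers are invented and are not model output.

```python
import pandas as pd

# Made-up data that mimics the shape of the post-processed results
toy_df = pd.DataFrame({
    "subset": [0, 0, 0, 1, 1, 1],
    "timestep": [0, 1, 2, 0, 1, 2],
    "reserve_fraction": [0.001] * 3 + [0.1] * 3,
    "mento_rate": [3.38, 3.41, 3.40, 3.38, 3.52, 3.47],
})

# One row per subset: the swept parameter value plus summary statistics
summary = toy_df.groupby("subset").agg(
    reserve_fraction=("reserve_fraction", "first"),
    mento_rate_max=("mento_rate", "max"),
    mento_rate_mean=("mento_rate", "mean"),
)
print(summary)
```

Keeping the swept parameter next to the statistics makes it easier to read the table without looking up which subset index maps to which value.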

## Visualizing Results

Once we have the results post-processed and in a Pandas DataFrame, we can use Plotly for plotting our results:

In [None]:
# Plot the mento_rate for each subset (each parameter grid combination)
px.line(df, x='timestep', y='mento_rate', facet_col='subset')

In [None]:
# Plot using visualizations predefined in the visualizations module
visualizations.plot_celo_price(df)

# Creating New, Customized Experiment Notebooks

If you want to create an entirely new analysis, you'll need to create a new experiment notebook, which entails the following steps:
* Step 1: Select an experiment template from the `experiments/templates/` directory to start from. If you'd like to create your own template, the [example_analysis.py](../templates/example_analysis.py) template gives an example of extending the default experiment to override default State Variables and System Parameters that you can copy.
* Step 2: Create a new notebook in the `experiments/notebooks/` directory, using the [template.ipynb](./template.ipynb) notebook as a guide, and import the experiment from the experiment template.
* Step 3: Customize the experiment for your specific analysis.
* Step 4: Execute your experiment, post-process and analyze the results, and create Plotly charts!

# Advanced Experiment-configuration & Simulation Techniques

## Setting Simulation Timesteps and Unit of Time `dt`

In [None]:
from experiments.simulation_configuration import TIMESTEPS, DELTA_TIME

The number of simulation timesteps, `TIMESTEPS`, is `simulation_time_seconds` divided by the product of `blocktime_seconds` and the number of blocks per timestep, `DELTA_TIME`.

`DELTA_TIME` sets how many blocks are simulated in each timestep. The default value is 1 block per timestep, which means we simulate on a per-block basis. If we don't need that granularity, we can set `DELTA_TIME` to a larger integer value for better performance:

```python
simulation_time_seconds = 365 * 24 * 60 * 60  # If we choose 1 year
TIMESTEPS = simulation_time_seconds // (constants.blocktime_seconds * DELTA_TIME)
```

In [None]:
TIMESTEPS
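
To get a feel for how `DELTA_TIME` changes the simulation size, here is the same arithmetic as a standalone sketch. The block time of 5 seconds is an assumed value chosen for illustration; the model reads the real value from `constants.blocktime_seconds`.

```python
# Assumed block time, for illustration only;
# the model itself uses constants.blocktime_seconds
BLOCKTIME_SECONDS = 5
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def timesteps_for(simulation_time_seconds: int, delta_time: int) -> int:
    """Number of timesteps when each timestep covers `delta_time` blocks."""
    return simulation_time_seconds // (BLOCKTIME_SECONDS * delta_time)


per_block = timesteps_for(SECONDS_PER_YEAR, delta_time=1)
per_twelve_blocks = timesteps_for(SECONDS_PER_YEAR, delta_time=12)

print(f"1 block per timestep:   {per_block} timesteps")
print(f"12 blocks per timestep: {per_twelve_blocks} timesteps")
```

Under that assumption, grouping 12 blocks into one timestep reduces a one-year simulation from 6,307,200 timesteps to 525,600.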

Finally, we set the simulation timesteps. Note that you may also have to update the environmental processes that depend on the number of timesteps, and override the relevant parameters:

In [None]:
simulation_analysis_1.timesteps = TIMESTEPS

## Changing the Mento1 -> Mento2 Upgrade Stage

The model operates over different Mento1 -> Mento2 upgrade stages. The default experiment operates in the "mento1" stage (so no upgrades).

`Stage` is an Enum; we can import it and see what options we have:

In [None]:
from model.types import Stage

The model is well documented, and we can view the Python docstring to see what a Stage is, and create a dictionary to view the Enum members:

In [None]:
print(Stage.__doc__)
{e.name: e.value for e in Stage}
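
For readers unfamiliar with Python Enums, the same inspection pattern can be tried on a standalone class. `ExampleStage` and its members are hypothetical and do not reflect the members of the model's `Stage`.

```python
from enum import Enum


class ExampleStage(Enum):
    """Hypothetical upgrade stages, for illustration only."""

    STAGE_A = 1
    STAGE_B = 2


# The docstring describes the Enum; the comprehension lists its members
print(ExampleStage.__doc__)
members = {e.name: e.value for e in ExampleStage}
print(members)

# A System Parameter is a list of values, so one stage is a one-element list
params_update = {"stage": [ExampleStage.STAGE_A]}
```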

The `Mento1` stage, for example, assumes the Mento1 setup without stability providers, IRPs, etc.

In [None]:
display_code(Stage)

As before, we can update the "stage" System Parameter to set the relevant Stage:

In [None]:
simulation_analysis_1.model.params.update({
    "stage": [Stage.Mento1]
})

## Performing Large-scale Experiments (NOT WORKING YET!)

When executing an experiment, we have three degrees of freedom - **simulations, runs, and subsets** (parameter sweeps).

We can have multiple simulations for a single experiment, multiple runs for every simulation, and we can have multiple subsets for every run. Remember that `simulation`, `run`, and `subset` are simply additional State Variables set by the radCAD engine during execution – we then use those State Variables to index the results for a specific dimension, e.g. simulation 1, run 5, and subset 2.

Each dimension has a generally accepted purpose:
* Simulations are used for A/B testing
* Runs are used for Monte Carlo analysis
* Subsets are used for parameter sweeps

In some cases, we break these "rules" to allow for more degrees of freedom or easier configuration.

One example of this is the `eth_price_eth_staked_grid_analysis` experiment template:

In [None]:
display_code(eth_price_eth_staked_grid_analysis)

Here, we create a grid of two State Variables – ETH price and ETH staked – using the `eth_price_process` and `eth_staked_process`.

Instead of sweeping the two System Parameters to create different subsets, we pre-generate all possible combinations of the two values first and use the specific `run` to index the data, i.e. for each run we get a new ETH price and ETH staked sample.

This keeps one degree of freedom available: the experimenter (you!) can still apply a parameter sweep on top of this analysis.
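
The idea of indexing a pre-generated grid by `run` can be sketched in plain Python. The sample values and the helper `sample_for_run` are hypothetical; only the 20 * 20 grid size and the one-based run index are taken from this README.

```python
import itertools

# Hypothetical sample values for the two State Variables (20 each)
price_samples = [100.0 * (i + 1) for i in range(20)]
staked_samples = [1e6 * (i + 1) for i in range(20)]

# Pre-generate every combination: 20 * 20 = 400 grid points
grid = list(itertools.product(price_samples, staked_samples))


def sample_for_run(run: int):
    """Return the (price, staked) pair for a run; runs are indexed from one."""
    return grid[run - 1]


print(len(grid))
print(sample_for_run(1), sample_for_run(400))
```

Because the grid is consumed through `run`, the `subset` dimension stays free for an ordinary parameter sweep.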

### Composing an Experiment Using **simulations, runs, and subsets**

In [None]:
from radcad import Experiment, Engine, Backend


# Create a new Experiment of three Simulations:
# * Simulation Analysis 1 has one run and two subsets – a parameter sweep of two values (reserve_fraction = [0.001, 0.1])
# * Simulation Analysis 2 has one run and one subset – a basic simulation configuration
# * Simulation Analysis 3 has 400 runs (20 * 20) and one subset – a parameter grid indexed using `run`
experiment = Experiment([simulation_analysis_1, simulation_analysis_2, simulation_analysis_3])

### Configuring the radCAD Engine for High Performance

To improve simulation performance for large-scale experiments, we can set the following settings using the radCAD `Engine`. Both Experiments and Simulations have the same `Engine`; when executing an `Experiment` we set these settings on the `Experiment` instance:

In [None]:
# Configure Experiment Engine
experiment.engine = Engine(
    # Use a single process; the overhead of creating multiple processes
    # for parallel-processing is only worthwhile when the Simulation runtime is long
    backend = Backend.SINGLE_PROCESS,
    # Disable System Parameter and State Variable deepcopy:
    # * Deepcopy prevents mutation of state at the cost of lower performance
    # * Disabling it leaves it up to the experimenter to follow Python best practices and avoid
    #   state mutation, e.g. by manually calling `copy` or `deepcopy` before
    #   performing mutating calculations when necessary
    deepcopy = False,
    # If we don't need the state history from individual substeps,
    # we can get rid of them for higher performance
    drop_substeps = True,
)

# Disable logging
# For large experiments, there is lots of logging. This can get messy...
logger = logging.getLogger()
logger.disabled = True

# Execute Experiment
raw_results = experiment.run()
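
With `deepcopy` disabled, a function that modifies a nested State Variable in place changes the state seen by everything else. The following plain-Python sketch shows the difference; the function names and the simplified state are hypothetical and do not use the radCAD policy or state-update signatures.

```python
import copy

# Simplified stand-in for a nested State Variable
state = {"reserve_account": {"celo": 120e6, "cusd": 0.0}}


def withdraw_mutating(current_state, amount):
    # Anti-pattern: works on the caller's nested dictionary directly
    account = current_state["reserve_account"]
    account["celo"] -= amount
    return account


def withdraw_safe(current_state, amount):
    # Copy first, then mutate only the copy
    account = copy.deepcopy(current_state["reserve_account"])
    account["celo"] -= amount
    return account


safe_result = withdraw_safe(state, 1e6)
celo_after_safe = state["reserve_account"]["celo"]  # original is untouched

withdraw_mutating(state, 1e6)
celo_after_mutating = state["reserve_account"]["celo"]  # original has changed

print(celo_after_safe, celo_after_mutating)
```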

### Indexing a Large-scale Experiment Dataset

In [None]:
# Create a Pandas DataFrame from the raw results
df = pd.DataFrame(experiment.results)
df

In [None]:
# Select each Simulation dataset
df_0 = df[df.simulation == 0]
df_1 = df[df.simulation == 1]
df_2 = df[df.simulation == 2]

datasets = [df_0, df_1, df_2]

# Determine size of Simulation datasets
for index, data in enumerate(datasets):
    runs = len(data.run.unique())
    subsets = len(data.subset.unique())
    timesteps = len(data.timestep.unique())
    
    print(f"Simulation {index} has {runs} runs * {subsets} subsets * {timesteps} timesteps = {runs * subsets * timesteps} rows")

In [None]:
# Indexing simulation 0, run 1 (indexed from one!), subset 1, timestep 1
df.query("simulation == 0 and run == 1 and subset == 1 and timestep == 1")