# Experiment Notebook: Compounding Yields for Pool Validators (Model Extension 5 - WIP)

# Table of Contents
* [Experiment Summary](#Experiment-Summary)
* [Experiment Assumptions](#Experiment-Assumptions)
* [Experiment Setup](#Experiment-Setup)
* [Analysis 1: Extended Time-domain](#Analysis-1:-Extended-Time-domain)
* [Analysis 2: Sweep of Pool Size](#Analysis-2:-Sweep-of-Pool-Size)

# Experiment Summary 

Each discrete validator requires a 32 ETH deposit when initialized. A validator's effective balance, the value used to calculate validator rewards, is capped at 32 ETH. Any rewards a validator earns above the 32 ETH requirement do not contribute to their yields until they accrue an additional 32 ETH and create another validator instance. This prevents a solo validator from reinvesting their yields to receive compound interest.

On the other hand, stakers who use validator pools (on exchanges, for example) can compound their returns: the pool combines the rewards of multiple validators to initialize another validator with 32 ETH. Pooling rewards and initializing a shared validator effectively yields compound interest for pool stakers, which can result in much higher yields than those of solo / distributed validators, especially over longer periods of time.
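To make the difference concrete, here is a minimal sketch (not the model's implementation). It assumes a hypothetical constant 5% annual yield and an idealized pool that can restake all rewards once per year; the function names are illustrative. At this yield a solo validator would need 20 years to accrue a second 32 ETH deposit, so over the horizons shown its rewards never compound.

```python
def solo_balance(principal_eth: float, annual_yield: float, years: int) -> float:
    """Rewards accrue on the fixed 32 ETH effective balance only (simple interest)."""
    return principal_eth * (1 + annual_yield * years)


def pooled_balance(principal_eth: float, annual_yield: float, years: int) -> float:
    """Idealized pool: all rewards are restaked once per year (compound interest)."""
    return principal_eth * (1 + annual_yield) ** years


ASSUMED_ANNUAL_YIELD = 0.05  # hypothetical constant yield, not a model output

for years in (1, 5, 10):
    solo = solo_balance(32, ASSUMED_ANNUAL_YIELD, years)
    pooled = pooled_balance(32, ASSUMED_ANNUAL_YIELD, years)
    print(f"{years:>2} years: solo {solo:.2f} ETH, pooled {pooled:.2f} ETH")
```

The gap between the two balances widens with the time horizon, which is why the analyses in this notebook focus on multi-year simulations.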

The following experiment notebook investigates ...


# Experiment Assumptions

* AVG Pool Size captures the ***initial, average pool size*** across all pool environments

* To ensure a consistent analysis of the effect of 'average pool size', new validators initialised externally to pools (i.e. from 'validator_process') assemble new pools as opposed to joining existing pools. Consequently, pool sizes grow only when new shared validators are initialized.

* Pooling begins simultaneously across all pools. Because pool sizes are captured as an average, new shared validator instances are initialised simultaneously across all pools. This leads to sudden 'jumps' in new shared validator instances as pools accrue the target stake amount at the same time. In reality, such jumps are unlikely to occur due to variations across validator pool environments.

* The current implementation assumes all pool validators engage in pool yield compounding. Once more data is available, the model could include a parameter for the fraction of validator pools that engage in pool compounding.
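The threshold behaviour behind these assumptions can be sketched as follows. This is a simplified illustration, not the model's code: it tracks a single pool of identical validators, and the per-validator reward of 0.5 ETH per timestep is a deliberately large, hypothetical value chosen to keep the arithmetic easy to follow. The function and variable names are assumptions.

```python
from typing import List

DEPOSIT_ETH = 32  # stake required to initialize one validator


def simulate_pool(initial_validators: int,
                  reward_per_validator_eth: float,
                  timesteps: int) -> List[int]:
    """Return the number of shared validators in one pool after each timestep."""
    shared_validators = 0
    pending_rewards_eth = 0.0
    history = []
    for _ in range(timesteps):
        # Shared validators earn rewards too, which is what compounds the yield
        active_validators = initial_validators + shared_validators
        pending_rewards_eth += active_validators * reward_per_validator_eth
        # Initialize as many shared validators as the pooled rewards can fund
        new_validators = int(pending_rewards_eth // DEPOSIT_ETH)
        shared_validators += new_validators
        pending_rewards_eth -= new_validators * DEPOSIT_ETH
        history.append(shared_validators)
    return history


for pool_size in (1, 10, 100):
    history = simulate_pool(pool_size, reward_per_validator_eth=0.5, timesteps=64)
    print(f"pool size {pool_size:>3}: {history[-1]} shared validators after 64 timesteps")
```

If every pool starts with the same (average) size, each one crosses the 32 ETH threshold at the same timestep, which is the source of the 'jumps' described above.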


# Experiment Setup

We begin with several experiment-notebook-level preparatory setup operations:

* Import relevant dependencies
* Import relevant experiment templates
* Create copies of experiments
* Configure and customize experiments 

In [None]:
# Import the setup module:
# * sets up the Python path
# * runs shared notebook configuration methods, such as loading IPython modules
import setup

import copy
import logging
import numpy as np
import pandas as pd
import plotly.express as px

import experiments.notebooks.visualizations as visualizations
from experiments.run import run
from experiments.utils import display_code
from model.types import Stage
from model.constants import epochs_per_day, epochs_per_week, epochs_per_month
from model.state_variables import validator_count_distribution
from model.system_parameters import pool_validator_indeces

In [None]:
# Enable/disable logging
logger = logging.getLogger()
logger.disabled = False

In [None]:
# Import experiment templates
import experiments.templates.time_domain_analysis as time_domain_analysis
import experiments.templates.pool_size_sweep_analysis as pool_size_sweep_analysis

# Analysis 1: Extended Time-domain

Simulate the model over a 10-year period and plot relevant metrics.
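As a rough check on the size of this run, the sketch below derives the number of weekly timesteps in a 10-year horizon. It assumes mainnet timing (12-second slots, 32 slots per epoch) and 30-day months; the constant names are illustrative and may not match the values defined in `model.constants`.

```python
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32
SECONDS_PER_DAY = 24 * 60 * 60

# One epoch lasts 384 seconds, which gives 225 epochs per day
EPOCHS_PER_DAY = SECONDS_PER_DAY // (SECONDS_PER_SLOT * SLOTS_PER_EPOCH)

# Assumption: a week is 7 days and a month is 30 days
EPOCHS_PER_WEEK = 7 * EPOCHS_PER_DAY
EPOCHS_PER_MONTH = 30 * EPOCHS_PER_DAY

simulation_months = 10 * 12
# Integer division drops any partial final timestep
timesteps = EPOCHS_PER_MONTH * simulation_months // EPOCHS_PER_WEEK
print(f"{timesteps} weekly timesteps over {simulation_months} months")
```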

In [None]:
# Fetch the time-domain analysis experiment
experiment = time_domain_analysis.experiment
experiment.engine.deepcopy = True
# Create a copy of the experiment simulation
simulation_1 = copy.deepcopy(experiment.simulations[0])

In [None]:
# Experiment configuration:
DELTA_TIME = epochs_per_week  # epochs per timestep 

SIMULATION_TIME_MONTHS = 10 * 12
TIMESTEPS = epochs_per_month * SIMULATION_TIME_MONTHS // DELTA_TIME

simulation_1.timesteps = TIMESTEPS

normal_adoption = simulation_1.model.params['validator_process'][0](_run=None, _timestep=None)

simulation_1.model.params.update({
    "dt": [DELTA_TIME], # (default: per week)
    "stage": [Stage.ALL],
    "avg_pool_size": [1, 10, 100, 1000], # AVG initial pool size
    "eth_price_process": [lambda _run, _timestep: 3000],
    'validator_process': [lambda _run, _timestep: normal_adoption * 1],
    
})

In [None]:
# Calculate initial number of pools (derived from 'avg_pool_size' parameter list)

avg_pool_size_list = simulation_1.model.params['avg_pool_size']
n_validator_environments = len(validator_count_distribution)
# One row per 'avg_pool_size' sample, one column per validator environment;
# non-pool environments keep zero pools
number_of_pools_list = np.zeros((len(avg_pool_size_list), n_validator_environments))

for i, avg_pool_size in enumerate(avg_pool_size_list):
    for y in range(n_validator_environments):
        if y in pool_validator_indeces:
            number_of_pools_list[i][y] = np.round(validator_count_distribution[y] / avg_pool_size)

simulation_1.model.params.update({"number_of_pools": number_of_pools_list})

In [None]:
# Experiment execution
df_1, exceptions = run(simulation_1)

## Visualizations

#### To Do:
* Create visualization plots in __init__.py
* Label all x-axis time as Date


### Validator Pools Metrics

In [None]:
# AVG Annualized Daily Profit Yields (%) per pool

px.line(
    df_1,
    x='timestamp',
    y=['diy_hardware_profit_yields_pct', 'diy_cloud_profit_yields_pct','pool_staas_pool_profit_yields_pct', 'pool_hardware_pool_profit_yields_pct', 'pool_cloud_pool_profit_yields_pct'],
    animation_frame='avg_pool_size',
    title='AVG Profit Yields (%) per pool'
)


In [None]:
# AVG Annualized Daily Profit Yields (%) per pool

px.line(
    df_1,
    x='timestamp',
    y=['diy_hardware_profit_yields_pct', 'pool_cloud_pool_profit_yields_pct'],
    animation_frame='avg_pool_size',
    title='AVG Profit Yields (%) per pool'
)


In [None]:
# Cumulative Profit Yields 

px.line(
    df_1,
    x='timestamp',
    y=['diy_hardware_cumulative_profit_yields_pct', 'diy_cloud_cumulative_profit_yields_pct', 'pool_staas_pool_cumulative_profit_yields_pct', 'pool_hardware_pool_cumulative_profit_yields_pct', 'pool_cloud_pool_cumulative_profit_yields_pct'],
    animation_frame='avg_pool_size',
    title='Cumulative Profit Yields (%) per pool'
)

In [None]:
# Cumulative Profit Yields 

px.line(
    df_1,
    x='timestamp',
    y=['diy_hardware_cumulative_profit_yields_pct', 'pool_cloud_pool_cumulative_profit_yields_pct'],
    animation_frame='avg_pool_size',
    title='Cumulative Profit Yields (%) per pool'
)

In [None]:
# Shared Validators per Pool

px.line(
    df_1,
    x='timestamp',
    y=['pool_staas_shared_validators_per_pool', 'pool_hardware_shared_validators_per_pool', 'pool_cloud_shared_validators_per_pool'],
    animation_frame='avg_pool_size'
)

In [None]:
# AVG ETH STAKED per pool

# px.line(
#     df_1,
#     x='timestamp',
#     y=['pool_staas_pool_eth_staked', 'pool_hardware_pool_eth_staked', 'pool_cloud_pool_eth_staked'],
#     animation_frame='avg_pool_size'
# )


In [None]:
# Pool size

# px.line(
#     df_1,
#     x='timestamp',
#     y=['pool_staas_pool_size', 'pool_hardware_pool_size', 'pool_cloud_pool_size'],
#     animation_frame='avg_pool_size'
# )


In [None]:
# AVG Profit (USD) per pool

# px.line(
#     df_1,
#     x='timestamp',
#     y=['pool_staas_pool_profit', 'pool_hardware_pool_profit', 'pool_cloud_pool_profit'],
#     animation_frame='avg_pool_size'
# )


## Environment-level Metrics

### Total Validators

In [None]:
# px.line(
#     df_1,
#     x='timestamp',
#     y=['pool_staas_validator_count', 'pool_hardware_validator_count', 'pool_cloud_validator_count', 'diy_hardware_validator_count', 'diy_cloud_validator_count'],
#     animation_frame='avg_pool_size'
# )

### Shared Validators

In [None]:
# px.line(
#     df_1,
#     x='timestamp',
#     y=['pool_staas_shared_validators', 'pool_hardware_shared_validators', 'pool_cloud_shared_validators', 'diy_hardware_shared_validators', 'diy_cloud_shared_validators'],
#     animation_frame='avg_pool_size'
# )

### ETH Staked

In [None]:
# px.line(
#     df_1,
#     x='timestamp',
#     y=['pool_staas_eth_staked', 'pool_hardware_eth_staked', 'pool_cloud_eth_staked', 'diy_hardware_eth_staked', 'staas_full_eth_staked'],
#     animation_frame='avg_pool_size'
# )

# Analysis 2: Sweep of Pool Size

Phase-space analysis showing metrics as a function of the average pool size for pool validators using pool compounding.

* To accurately account for the compounding of pool validator yields over time, we first simulate the model over the desired time horizon.
* Then, we perform a phase-space analysis at the desired timestep (e.g. at year 10).
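The two steps above amount to running a full time-domain simulation for each pool size and then taking a cross-section of the results at one point in time. The sketch below illustrates that slicing step on a toy DataFrame; the column names `timestep` and `pool_yields_pct`, the helper `slice_at_timestep`, and all values are hypothetical stand-ins rather than actual model outputs.

```python
import pandas as pd

# Toy stand-in for simulation results: one row per (pool size, timestep)
results = pd.DataFrame({
    "timestep": [1, 2, 3, 1, 2, 3],
    "avg_pool_size": [100, 100, 100, 10, 10, 10],
    "pool_yields_pct": [5.0, 5.4, 5.9, 5.0, 5.1, 5.2],
})


def slice_at_timestep(df: pd.DataFrame, timestep: int) -> pd.DataFrame:
    """Keep only the rows at the chosen timestep, ordered by pool size."""
    snapshot = df[df["timestep"] == timestep]
    return snapshot.sort_values("avg_pool_size").reset_index(drop=True)


snapshot = slice_at_timestep(results, timestep=3)
print(snapshot[["avg_pool_size", "pool_yields_pct"]])
```

Because the full time series is kept, the analysis point can be moved to a different timestep without re-running the simulation.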


In [None]:
# Analysis-specific setup

In [None]:
# Fetch the pool-size sweep analysis experiment
experiment = pool_size_sweep_analysis.experiment
experiment.engine.deepcopy = True 
# Create a copy of the experiment simulation
simulation_2 = copy.deepcopy(experiment.simulations[0])

In [None]:
# Experiment configuration 

# Note: to change the default 'initial AVG pool size' samples, 
# see 'pool_size_sweep_analysis.py' located in experiments/templates
DELTA_TIME = epochs_per_day  # epochs per timestep (determines compounding period)
SIMULATION_TIME_MONTHS = 5 * 12  # number of months
TIMESTEPS = epochs_per_month * SIMULATION_TIME_MONTHS // DELTA_TIME


normal_adoption = simulation_2.model.params['validator_process'][0](_run=None, _timestep=None)

simulation_2.model.params.update({
    "dt": [DELTA_TIME], # determines compounding period (default: per day)
    "validator_process": [lambda _run, _timestep: normal_adoption * 1], # New validators per epoch
    "stage": [Stage.ALL], 
    "eth_price_process": [lambda _run, _timestep: 3000],
    
})


# Set time horizon:
YEARS = 5
TIMESTEP_ANALYSIS = YEARS * 12 # convert years to months

In [None]:
# Experiment execution
df_2, exceptions = run(simulation_2)

## Visualizations

#### To Do
* Set time analysis point in labels / headings

In [None]:
# To plot a specific point in time without having to re-run the simulation, 
# set TIMESTEP_ANALYSIS below and re-run the following cells.

YEAR = 3
TIMESTEP_ANALYSIS = YEAR * 12  # convert year to month

In [None]:
#visualizations.plot_pool_profit_over_pool_size(df_2, TIMESTEP_ANALYSIS)

In [None]:
visualizations.plot_pool_profit_yields_over_pool_size(df_2, TIMESTEP_ANALYSIS)

In [None]:
#visualizations.plot_pool_cumulative_yields_over_pool_size(df_2, TIMESTEP_ANALYSIS, 2)