Building Bancor Simulations in Python --- A Step by Step Explainer
================================================================

Learn the fundamentals of Bancor protocol simulations.
------------------------------------------------------

![](https://miro.medium.com/max/1400/1*YGYDyx3QC29suZxcuZPwwg.png)


Table of Contents
=================
- [**Project Setup**](https://mike-w-casale.medium.com/e308858a5324#54d8)
- [**Datasets**](https://mike-w-casale.medium.com/e308858a5324#79d6)
- [**The Baseline Simulation**](https://mike-w-casale.medium.com/e308858a5324#f651)
- [**The Proposal Simulation**](https://mike-w-casale.medium.com/e308858a5324#640e)
- [**Summarizing the results**](https://mike-w-casale.medium.com/e308858a5324#5084)


This notebook was created as supporting material for the blog post "[Building Bancor Simulations in Python]()"

Project setup
=============

If you don't already have one, create a virtual environment (venv) using [these instructions](https://docs.python.org/3/library/venv.html). An alternative environment manager such as Conda will also work.

Install
-------

The **Bancor Research** Python library is available for Python 3.6+.

To install it in your virtual environment from [PyPI](https://pypi.org/project/bancor-research/), run this command in your terminal:

````{tab} PyPI
$ pip install --upgrade bancor-research
````

If you are running this command inside a Jupyter notebook, use this instead:

````{tab} PyPI
! pip install --upgrade bancor-research
````


Optional extra dependency for fetching refreshed Yahoo Finance API data:

````{tab} PyPI
pip install yfinance
````


Datasets
========

The following datasets have been curated and preprocessed to derive realistic simulation assumptions from real-world data wherever possible. You can download the data manually via the links below, or programmatically as demonstrated in the code sections that follow.

**Raw Datasets:**
- [**price_feeds.csv**](https://bancorml.s3.us-east-2.amazonaws.com/price_feeds.csv): token price feeds at hourly resolution from the Yahoo Finance API.
- [**historical_actions.csv**](https://bancorml.s3.us-east-2.amazonaws.com/historical_actions.csv): Trade, Deposit, and Withdrawal actions performed on Bancor v3 since its launch in April.
- [**fees_vs_liquidity.csv**](https://bancorml.s3.us-east-2.amazonaws.com/fees_vs_liquidity.csv): historical trading fees per token joined with on-curve liquidity snapshots.
- [**user_initial_balances.csv**](https://bancorml.s3.us-east-2.amazonaws.com/blog/user_initial_balances.csv): default balances for the global user wallet.

In addition to the raw data, the following [Data Profiling Reports](https://pandas-profiling.ydata.ai/docs/master/index.html) are available as HTML downloads:

**HTML Data Profiling Reports:**
- [**price_feeds.html**](https://bancorml.s3.us-east-2.amazonaws.com/blog/price_feeds.html)
- [**historical_actions.html**](https://bancorml.s3.us-east-2.amazonaws.com/blog/historical_actions.html)
- [**fees_vs_liquidity.html**](https://bancorml.s3.us-east-2.amazonaws.com/blog/fees_vs_liquidity.html)

**Import Core Dependencies**

In [2]:
import pandas as pd
import json
from bancor_simulator.v3.spec import get_vault_balance
from bancor_research.bancor_simulator.v3.spec.state import *

**Default Parameters:**

Referring to the original Bancor v3 specification in [BIP15](https://docs.google.com/document/d/11UeMYaI_1CWdf_Nu6veUO-vNB5uX-FcVRqTSU-ziDRk/edit), we define the following default parameters to be used throughout the baseline simulation and as the genesis state for the proposal simulation.

- **default_trading_fee**: The fee charged by the protocol to perform trades.

- **default_bnt_funding_limit**: The BancorDAO determines the available liquidity for trading by adjusting the "BNT funding limit" parameter.

- **default_bnt_min_liquidity**: The BancorDAO prescribes a liquidity threshold, denominated in BNT, that represents the minimum available liquidity that must be present before the protocol bootstraps the pool with BNT. This parameter is adjustable and was set to 20,000 BNT at the time of launch.

In [3]:
default_trading_fee = '1%'
default_bnt_funding_limit = 1000000
default_bnt_min_liquidity = 10000 # Note that we set this lower than the actual value to prevent automatic pool shutdown scenarios
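
The trading fee above is written as a percentage string. As a minimal sketch of what that string means numerically, the hypothetical helper `parse_percent` below (not part of the bancor-research API) converts `'1%'` into a `Decimal` fraction and applies it to an illustrative trade amount. How the library itself parses this value is not shown in this notebook.

```python
from decimal import Decimal


def parse_percent(value: str) -> Decimal:
    """Hypothetical helper: convert a string such as '1%' into a fraction (0.01)."""
    return Decimal(value.strip().rstrip('%')) / Decimal(100)


default_trading_fee = '1%'
fee_rate = parse_percent(default_trading_fee)

# Fee charged on an illustrative trade of 1,000 tokens at this rate.
fee_on_trade = Decimal('1000') * fee_rate

print(fee_rate, fee_on_trade)
```

Using `Decimal` rather than `float` avoids binary rounding error when fee amounts are accumulated over many trades.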

**Other Constants, Parameters, & Variables:**

In [4]:
# paths to data downloads
price_feeds_path = 'https://bancorml.s3.us-east-2.amazonaws.com/price_feeds-2.csv'
historical_actions_path = 'https://bancorml.s3.us-east-2.amazonaws.com/historical_actions.csv'
fees_earned_vs_slippage_path = 'https://bancorml.s3.us-east-2.amazonaws.com/fees_earned_vs_slippage-2.csv'
user_initial_balances_path = 'https://bancorml.s3.us-east-2.amazonaws.com/blog/user_initial_balances.csv'

# derived data distributions used to specify realistic monte-carlo scenario generation.
pool_freq_dist_path = 'https://bancorml.s3.us-east-2.amazonaws.com/blog/pool_freq_dist.json'
whitelisted_tokens_path = 'https://bancorml.s3.us-east-2.amazonaws.com/blog/whitelisted_tokens.json'
action_freq_dist_path = 'https://bancorml.s3.us-east-2.amazonaws.com/blog/action_freq_dist.json'

# simulation parameters
constant_multiplier = 5
n_rolling_days = 7
num_simulation_days = 60
num_hours_per_day = 24
arbitrage_percentage = .99
global_username = 'global user'
sample_date_cutoff = '01/08/22 00:00:00'

# mean values based on the pandas profiling reports
deposit_mean = 15410.150585
trade_mean = 18211.910227
withdraw_mean = 74761.482476
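
The `sample_date_cutoff` string is ambiguous on its own, because `01/08/22` could be read day-first or month-first. The notebook later parses it with the day-first format `'%d/%m/%y %H:%M:%S'`, so it denotes 1 August 2022. The short sketch below shows both readings side by side; the `month_first` variable exists only for comparison and is not used anywhere in the simulation.

```python
from datetime import datetime

sample_date_cutoff = '01/08/22 00:00:00'

# Day-first parsing, matching the format string used later in this notebook.
cutoff = datetime.strptime(sample_date_cutoff, '%d/%m/%y %H:%M:%S')

# Month-first parsing of the same string, shown only to highlight the ambiguity.
month_first = datetime.strptime(sample_date_cutoff, '%m/%d/%y %H:%M:%S')

print(cutoff.isoformat())       # 2022-08-01T00:00:00
print(month_first.isoformat())  # 2022-01-08T00:00:00
```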

**Load the Raw Datasets**

In [5]:
price_feeds = pd.read_csv(price_feeds_path)
historical_actions = pd.read_csv(historical_actions_path)
fees_earned_vs_slippage = pd.read_csv(fees_earned_vs_slippage_path)
user_initial_balances = pd.read_csv(user_initial_balances_path)

**Load the Probability Distributions & Whitelist**

In [6]:
def load_json(json_file, indx=None):
    report_file = pd.read_json(json_file, typ='series')
    df = pd.DataFrame([report_file])
    if indx is not None:
        return json.loads(df.to_json(orient='records'))[indx]
    else:
        return df.to_json(orient='records')

pool_freq_dist = load_json(pool_freq_dist_path, indx=0)
action_freq_dist = load_json(action_freq_dist_path, indx=0)
whitelisted_tokens = load_json(whitelisted_tokens_path, indx=0)

Note the `pool_freq_dist` dictionary structure for future reference.

In [7]:
# Probability distribution for tokens involved in actions.
pool_freq_dist

{'eth': 0.2770026543,
 'link': 0.2014224461,
 'bnt': 0.496052542,
 'wbtc': 0.0255223576}
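
As a quick sanity check, the weights above should sum to 1 and can be used directly for weighted sampling. The sketch below uses the standard library's `random.choices` with an arbitrary seed. It is only an illustration of drawing tokens from this distribution and is not necessarily how `MonteCarloGenerator` samples internally.

```python
import random

pool_freq_dist = {'eth': 0.2770026543, 'link': 0.2014224461,
                  'bnt': 0.496052542, 'wbtc': 0.0255223576}

# The weights should form a probability distribution.
total = sum(pool_freq_dist.values())

rng = random.Random(42)  # fixed seed, chosen arbitrarily for repeatability
tokens = list(pool_freq_dist)
weights = list(pool_freq_dist.values())
sample = rng.choices(tokens, weights=weights, k=10_000)

# Empirical frequencies should land close to the configured weights.
empirical = {tkn: sample.count(tkn) / len(sample) for tkn in tokens}
print(round(total, 6), empirical)
```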

Note the `action_freq_dist` dictionary structure for future reference.

In [8]:
# Probability distribution for actions
action_freq_dist

{'trade': 0.4209827809,
 'deposit': 0.3453685428,
 'withdraw_initiated': 0.1168243381,
 'withdraw_completed': 0.07527394,
 'withdraw_canceled': 0.02150684,
 'withdraw_pending': 0.0200435582}
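
The `transact` function defined later in this notebook turns three of these probabilities (`deposit`, `trade`, `withdraw_completed`) into cumulative cut points over the range `[0, num_timesteps]`, and splits `trade` into regular and arbitrage trades using `arbitrage_percentage`. The sketch below reproduces that arithmetic with the values used in this notebook (`num_timesteps = 1440`, `arbitrage_percentage = .99`). The `classify` function is a hypothetical helper that mirrors the `if`/`elif` chain. Draws above the last cut point match no branch, so roughly 16% of draws perform no action.

```python
action_freq_dist = {'trade': 0.4209827809, 'deposit': 0.3453685428,
                    'withdraw_initiated': 0.1168243381, 'withdraw_completed': 0.07527394,
                    'withdraw_canceled': 0.02150684, 'withdraw_pending': 0.0200435582}
num_timesteps = 1440
arbitrage_percentage = 0.99

# Cumulative cut points, computed the same way as in transact().
deposit_range = num_timesteps * action_freq_dist['deposit']
trade_range = deposit_range + num_timesteps * action_freq_dist['trade'] * (1 - arbitrage_percentage)
arb_range = trade_range + num_timesteps * action_freq_dist['trade'] * arbitrage_percentage
withdraw_range = arb_range + num_timesteps * action_freq_dist['withdraw_completed']


def classify(i):
    """Hypothetical helper: map a random draw to an action name, or None."""
    if i < deposit_range:
        return 'deposit'
    if i < trade_range:
        return 'trade'
    if i < arb_range:
        return 'arbitrage_trade'
    if i < withdraw_range:
        return 'withdrawal'
    return None  # no branch matches; no action is performed


no_action_share = 1 - withdraw_range / num_timesteps
print(deposit_range, trade_range, arb_range, withdraw_range, no_action_share)
```

The no-action share equals the combined weight of `withdraw_initiated`, `withdraw_canceled`, and `withdraw_pending`, which have no branch of their own.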

Note the `whitelisted_tokens` dictionary structure for future reference.

In [9]:
# Default whitelist structure
whitelisted_tokens

{'eth': {'trading_fee': '1%',
  'bnt_funding_limit': 1000000,
  'decimals': 18,
  'ep_vault_balance': 0},
 'link': {'trading_fee': '1%',
  'bnt_funding_limit': 1000000,
  'decimals': 18,
  'ep_vault_balance': 0},
 'bnt': {'trading_fee': '1%',
  'bnt_funding_limit': 1000000,
  'decimals': 18,
  'ep_vault_balance': 0},
 'wbtc': {'trading_fee': '1%',
  'bnt_funding_limit': 1000000,
  'decimals': 18,
  'ep_vault_balance': 0}}
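
Every token in the whitelist shares the same default settings, so an equivalent structure can be built from the default parameters defined earlier. This is a minimal sketch under that assumption; the notebook itself loads the whitelist from `whitelisted_tokens.json`, and the `token_names` list is introduced here only for illustration.

```python
default_trading_fee = '1%'
default_bnt_funding_limit = 1000000
token_names = ['eth', 'link', 'bnt', 'wbtc']  # illustrative list of pool tokens

# Build one independent settings dict per token so that later per-token
# changes do not leak into the other entries.
whitelisted_tokens = {
    tkn: {
        'trading_fee': default_trading_fee,
        'bnt_funding_limit': default_bnt_funding_limit,
        'decimals': 18,
        'ep_vault_balance': 0,
    }
    for tkn in token_names
}

print(whitelisted_tokens['eth'])
```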

In [10]:
from datetime import datetime

date_time_obj = datetime.strptime(sample_date_cutoff, '%d/%m/%y %H:%M:%S')
time = historical_actions.copy()
time['date'] = [datetime.strptime(str(i), '%Y-%m-%d') for i in pd.to_datetime(time['time']).dt.date.values]
time = time[time['date'] > date_time_obj]
mean_events_per_day = time.groupby('date').count()['event_name'].mean()
mean_events_per_hour = int(round(float(mean_events_per_day / num_hours_per_day)))

num_timesteps = num_simulation_days * num_hours_per_day
assert len(price_feeds) > num_timesteps, "the number of timesteps must be less than the length of the price feeds table"
simulation_actions_count = int(round(mean_events_per_day * num_simulation_days, 0))

print(f'mean_events_per_day={mean_events_per_day}')
print(f'mean_events_per_hour={mean_events_per_hour}')
print(f'simulation_actions_count={simulation_actions_count}')
print(f'num_timesteps={num_timesteps}')

mean_events_per_day=58.57142857142857
mean_events_per_hour=2
simulation_actions_count=3514
num_timesteps=1440
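
The printed values follow from simple arithmetic on the daily mean. The sketch below reproduces them from the printed `mean_events_per_day`, and also shows how many actions a single call to `transact` attempts given its nested loops (one outer pass per hour, `mean_events_per_hour` attempts per pass). The name `attempts_per_transact_call` is introduced here for illustration.

```python
mean_events_per_day = 58.57142857142857  # value printed above
num_hours_per_day = 24
num_simulation_days = 60

# 58.57 / 24 is about 2.44, which rounds down to 2 events per hour.
mean_events_per_hour = int(round(mean_events_per_day / num_hours_per_day))

# One timestep per simulated hour.
num_timesteps = num_simulation_days * num_hours_per_day

# Total number of actions implied by the daily mean over the whole run.
simulation_actions_count = int(round(mean_events_per_day * num_simulation_days))

# Attempts made by one call to transact(): 24 hourly passes x 2 events each.
attempts_per_transact_call = num_hours_per_day * mean_events_per_hour

print(mean_events_per_hour, num_timesteps, simulation_actions_count, attempts_per_transact_call)
```

Because the hourly rate is rounded before use, 48 attempts per simulated day is lower than the daily mean of about 58.6 events.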


The `transact(...)` function below contains the main logic, which specifies the frequency distributions of the Monte-Carlo transactions (deposit, trade, etc.).

In [11]:
def transact(self):
    """
    Specifies the Monte-Carlo transaction behavior (i.e., deposits, trades, etc...)
    in terms of their respective probability distributions.
    """

    latest_action = 'None'
    for _ in range(num_hours_per_day):
        if self.timestamp == 1:

            self.protocol.global_state.set_protocol_wallet_balance('bnt', '30000001')
            self.protocol.global_state.set_protocol_wallet_balance('eth', '30000001')
            self.protocol.global_state.set_protocol_wallet_balance('link', '30000001')
            self.protocol.global_state.set_protocol_wallet_balance('wbtc', '30000001')

            self.protocol.global_state.set_pooltoken_balance('bnt', '30000001')
            self.protocol.global_state.set_pooltoken_balance('eth', '30000001')
            self.protocol.global_state.set_pooltoken_balance('link', '30000001')
            self.protocol.global_state.set_pooltoken_balance('wbtc', '30000001')

            self.protocol.global_state.set_staked_balance('bnt', '30000001')
            self.protocol.global_state.set_staked_balance('eth', '30000001')
            self.protocol.global_state.set_staked_balance('link', '30000001')
            self.protocol.global_state.set_staked_balance('wbtc', '30000001')

            self.protocol.global_state.set_master_vault_balance('bnt', '3000001')
            self.protocol.global_state.set_master_vault_balance('eth', '3000001')
            self.protocol.global_state.set_master_vault_balance('link', '3000001')
            self.protocol.global_state.set_master_vault_balance('wbtc', '3000001')

            self.protocol.global_state.set_bnt_trading_liquidity('bnt', '10001')
            self.protocol.global_state.set_bnt_trading_liquidity('eth', '10001')
            self.protocol.global_state.set_bnt_trading_liquidity('link', '10001')
            self.protocol.global_state.set_bnt_trading_liquidity('wbtc', '10001')

            self.protocol.set_user_balance(f'user_{self.timestamp}', 'eth', '3000001', self.timestamp)
            self.protocol.set_user_balance(f'user_{self.timestamp}', 'bnt', '3000001', self.timestamp)
            self.protocol.set_user_balance(f'user_{self.timestamp}', 'wbtc', '3000001', self.timestamp)
            self.protocol.set_user_balance(f'user_{self.timestamp}', 'link', '3000001', self.timestamp)

            self.protocol.deposit('eth', '300000', f'user_{self.timestamp}', self.timestamp)
            self.protocol.deposit('link', '300000', f'user_{self.timestamp}', self.timestamp)
            self.protocol.deposit('bnt', '300000', f'user_{self.timestamp}', self.timestamp)
            self.protocol.deposit('wbtc', '300000', f'user_{self.timestamp}', self.timestamp)

        timestamp = self.timestamp

        for _ in range(mean_events_per_hour):
            self.timestamp += 1
            beginning_state = self.protocol.global_state.copy()
            try:
                self.latest_amt = None
                self.latest_tkn_name = None
                self.user_name = global_username

                i = self.random.randint(0, self.num_timesteps)

                deposit_range = self.num_timesteps * self.action_freq_dist['deposit']
                trade_range = deposit_range + self.num_timesteps * self.action_freq_dist['trade'] * (1 - arbitrage_percentage)
                arb_range = trade_range + self.num_timesteps * self.action_freq_dist['trade'] * arbitrage_percentage
                withdraw_completed = arb_range + self.num_timesteps * self.action_freq_dist['withdraw_completed']

                if i < deposit_range:
                    latest_action = "deposit"
                    self.perform_random_deposit()

                elif deposit_range <= i < trade_range:
                    latest_action = "trade"
                    self.perform_random_trade()

                elif trade_range <= i < arb_range:
                    latest_action = "arbitrage_trade"
                    self.perform_random_arbitrage_trade()

                elif arb_range <= i < withdraw_completed:
                    latest_action = "withdrawal"
                    self.perform_random_withdrawal()

                state = self.protocol.global_state
                state.timestamp = timestamp
                for tkn_name in self.whitelisted_tokens:
                    if not get_is_trading_enabled(state, tkn_name):
                        self.protocol.enable_trading(tkn_name=tkn_name, timestamp=timestamp)

                # The code below creates a new dataframe which collects the data we want to analyze
                df = {'timestamp': [timestamp], 'latest_action': [latest_action], 'latest_amt': [self.latest_amt],
                      'latest_tkn_name': [self.latest_tkn_name]}

                for tkn in self.whitelisted_tokens:
                    if tkn != 'bnt':
                        df[f'{tkn}_bnt_funding_limit'] = [get_bnt_funding_limit(state, tkn)]
                        df[f'{tkn}_vault_real'] = [get_master_vault_balance(state, tkn)]
                        df[f'{tkn}_is_trading_enabled'] = [get_is_trading_enabled(state, tkn)]
                        df[f'{tkn}_iloss'] = [self.iloss_realized[tkn][-1]]
                        df[f'{tkn}_fees_earned'] = [self.total_fees_earned[tkn][-1]]
                        df[f'{tkn}_staking'] = [get_staked_balance(state, tkn)]
                        df[f'{tkn}_surplus_real'] = [df[f'{tkn}_vault_real'][0] - df[f'{tkn}_staking'][0]]
                        df[f'{tkn}_tkn_trading_liquidity'] = [get_tkn_trading_liquidity(state, tkn)]
                        df[f'{tkn}_bnt_trading_liquidity'] = [get_bnt_trading_liquidity(state, tkn)]

                self.logger.append(pd.DataFrame(df))
            except Exception:
                # If the action fails, roll back to the state captured before it.
                self.protocol.global_state = beginning_state
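
`transact` copies `global_state` before each action and restores that copy if the action raises, so a failed action leaves the protocol state unchanged and is not logged. The sketch below isolates this snapshot-and-rollback pattern using a toy state object. `ToyState` and `risky_withdraw` are hypothetical stand-ins and are not part of the bancor-research library.

```python
import copy


class ToyState:
    """Hypothetical stand-in for the protocol's global state."""

    def __init__(self):
        self.balances = {'eth': 100}


def risky_withdraw(state, tkn, amount):
    """Mutates the state first, then fails if the balance went negative."""
    state.balances[tkn] -= amount
    if state.balances[tkn] < 0:
        raise ValueError('insufficient balance')


state = ToyState()
for amount in (30, 500, 20):
    snapshot = copy.deepcopy(state)  # taken before the action runs
    try:
        risky_withdraw(state, 'eth', amount)
    except Exception:
        state = snapshot  # discard the partial update

print(state.balances)  # {'eth': 50}
```

The second withdrawal fails after it has already modified the balance, and restoring the snapshot is what keeps the final balance at 50 rather than a negative value.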


Finally, we import the main logical class `MonteCarloGenerator` which we will use to run the simulations.

In [12]:
from bancor_research.scenario_generator import MonteCarloGenerator

Run the baseline simulation.

In [13]:
simulator = MonteCarloGenerator(
    cooldown_time=0,
    whitelisted_tokens= whitelisted_tokens,
    price_feed=price_feeds,
    user_initial_balances=user_initial_balances,
    simulation_actions_count=simulation_actions_count,
    num_timesteps=num_timesteps,
    num_simulation_days=num_simulation_days,
    pool_freq_dist=pool_freq_dist,
    action_freq_dist=action_freq_dist,
    bnt_min_liquidity=default_bnt_min_liquidity,
    deposit_mean=deposit_mean,
    trade_mean=trade_mean,
    withdraw_mean=withdraw_mean
)

baseline_output = simulator.run(transact)
baseline_output.to_csv('baseline_output.csv', index=False)
baseline_output

Unnamed: 0,timestamp,latest_action,latest_amt,latest_tkn_name,eth_bnt_funding_limit,eth_vault_real,eth_is_trading_enabled,eth_iloss,eth_fees_earned,eth_staking,...,link_bnt_trading_liquidity,wbtc_bnt_funding_limit,wbtc_vault_real,wbtc_is_trading_enabled,wbtc_iloss,wbtc_fees_earned,wbtc_staking,wbtc_surplus_real,wbtc_tkn_trading_liquidity,wbtc_bnt_trading_liquidity
0,1,withdrawal,,,1000000,3300001,False,0.000000,0.000000e+00,30300001,...,0,1000000,3300001,False,0,0.000000,30300001,-27000000,0,0
0,3,deposit,1190891.14616582007147371768951416015625,link,1000000,3300001,True,0.000000,0.000000e+00,30300001,...,40000,1000000,3300001,True,0,0.000000,30300001,-27000000,1.13063214690753087539338806419028157698517586...,20000
0,3,withdrawal,,,1000000,3300001,True,0.000000,0.000000e+00,30300001,...,40000,1000000,3300001,True,0,0.000000,30300001,-27000000,1.13063214690753087539338806419028157698517586...,20000
0,5,deposit,249479.45779123276588506996631622314453125,bnt,1000000,3300001,True,0.000000,0.000000e+00,30300001,...,40000,1000000,3300001,True,0,0.000000,30300001,-27000000,1.13063214690753087539338806419028157698517586...,20000
0,5,deposit,,,1000000,3300001,True,0.000000,0.000000e+00,30300001,...,40000,1000000,3300001,True,0,0.000000,30300001,-27000000,1.13063214690753087539338806419028157698517586...,20000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
0,2930,deposit,,,1000000,325792441.610165623943196234363268440338424670...,True,-91035.392107,2.003109e+06,254616109.023875636719120051025932512415388586...,...,631560.398239814658270464632595433488429575091...,1000000,30022166.9064246188373576752303256055328997677...,True,0,13727.052858,52120308.5324741419921863743853312249923941369...,-22098141.626049523154828699155005619459494369...,4915425.69254200666935169233327978605618915725...,118.570894051669753580554961796224658616733944...
0,2932,withdrawal,,,1000000,325792441.610165623943196234363268440338424670...,True,-91035.392107,2.003109e+06,254616109.023875636719120051025932512415388586...,...,631560.398239814658270464632595433488429575091...,1000000,30022166.9064246188373576752303256055328997677...,True,0,13727.052858,52120308.5324741419921863743853312249923941369...,-22098141.626049523154828699155005619459494369...,4915425.69254200666935169233327978605618915725...,118.570894051669753580554961796224658616733944...
0,2934,withdrawal,,,1000000,325792441.610165623943196234363268440338424670...,True,-91035.392107,2.003109e+06,254616109.023875636719120051025932512415388586...,...,631560.398239814658270464632595433488429575091...,1000000,30022166.9064246188373576752303256055328997677...,True,0,13727.052858,52120308.5324741419921863743853312249923941369...,-22098141.626049523154828699155005619459494369...,4915425.69254200666935169233327978605618915725...,118.570894051669753580554961796224658616733944...
0,2936,arbitrage_trade,,,1000000,325792441.610165623943196234363268440338424670...,True,-91035.392107,2.003109e+06,254616109.023875636719120051025932512415388586...,...,631560.398239814658270464632595433488429575091...,1000000,30022166.9064246188373576752303256055328997677...,True,0,13727.052858,52120308.5324741419921863743853312249923941369...,-22098141.626049523154828699155005619459494369...,4915425.69254200666935169233327978605618915725...,118.570894051669753580554961796224658616733944...
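
The surplus columns in this table are computed in `transact` as the vault balance minus the staked balance. As a worked check on the first row, the sketch below reproduces the WBTC surplus from the displayed vault and staking values. Those two values are consistent with the genesis balances set in `transact` (3,000,001 in the vault and 30,000,001 staked) plus the initial 300,000 deposit, although the deposit's internal accounting is not shown in this notebook.

```python
from decimal import Decimal

# Genesis balances set in transact(), plus the initial 300,000 deposit.
vault_real = Decimal('3000001') + Decimal('300000')   # 3300001, as in wbtc_vault_real
staking = Decimal('30000001') + Decimal('300000')     # 30300001, as in wbtc_staking

# Same formula as the `{tkn}_surplus_real` column.
surplus_real = vault_real - staking

print(vault_real, staking, surplus_real)  # 3300001 30300001 -27000000
```

The negative surplus at the start therefore reflects the artificial genesis balances rather than any trading activity.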


Now run the proposal simulation, taking care to include the special arguments which override the default behavior.


In [None]:
simulator = MonteCarloGenerator(
    cooldown_time=0,
    whitelisted_tokens= whitelisted_tokens,
    price_feed=price_feeds,
    user_initial_balances=user_initial_balances,
    simulation_actions_count=simulation_actions_count,
    num_timesteps=num_timesteps,
    num_simulation_days=num_simulation_days,
    pool_freq_dist=pool_freq_dist,
    action_freq_dist=action_freq_dist,
    bnt_min_liquidity=default_bnt_min_liquidity,
    deposit_mean=deposit_mean,
    trade_mean=trade_mean,
    withdraw_mean=withdraw_mean
)

proposal_output = simulator.run(transact,
                                mean_events_per_day,
                                num_hours_per_day,
                                n_rolling_days,
                                constant_multiplier,
                                is_proposal=True)
proposal_output.to_csv('proposal_output.csv', index=False)
proposal_output

Combine the results to analyze.

In [None]:
# add a label for the proposal results
proposal_output['type'] = ['proposal' for _ in range(len(proposal_output))]

# add a label for the baseline results
baseline_output['type'] = ['baseline' for _ in range(len(baseline_output))]

# concatenate the results
combined = pd.concat([proposal_output, baseline_output])

# Add back the original human-readable datetime for visualizations.
combined = combined.merge(price_feeds.rename({'index':'timestamp'}, axis=1)[['timestamp','time']], on=['timestamp'])

# Sort by datetime to visualize properly.
combined = combined.sort_values('time')

combined

**Data Visualization**

Check when trading is enabled. This serves as a cross-check for the charts that follow.

In [None]:
import plotly.express as px

fig = px.line(combined, x = "time", y = ['eth_is_trading_enabled'],
              color = "type",
              title='ETH Trading Enabled')
fig.show()

Check the ETH fees earned.

In [None]:
fig = px.line(combined,
              x = "time",
              y = 'eth_fees_earned',
              color = "type",
              title='ETH Fees Earned')
fig.show()

Check the ETH impermanent loss.

In [None]:
fig = px.line(combined, x = "time", y = ['eth_iloss'],
              color = "type",
              title='ETH IL')
fig.show()

Check the ETH BNT funding limit. This is the main modification made to the system.
The parameter should be static in the baseline simulation and dynamic in the proposal simulation.

In [None]:
fig = px.line(combined, x = "time", y = ['eth_bnt_funding_limit'],
              color = "type",
              title='ETH bnt funding limit')
fig.show()

Check the ETH off-curve trading liquidity.

In [None]:
fig = px.line(combined, x = "time", y = 'eth_tkn_trading_liquidity',
              color = "type",
              title='ETH off curve trading liquidity')
fig.show()

Check the ETH on-curve trading liquidity.

In [None]:
fig = px.line(combined, x = "time", y = 'eth_bnt_trading_liquidity',
              color = "type",
              title='ETH on curve trading liquidity')
fig.show()

Check the ETH surplus.

In [None]:
fig = px.line(combined, x = "time", y = 'eth_surplus_real',
              color='type',
              title='ETH surplus')
fig.show()

Validate the surplus by checking the vault and staking balances.

In [None]:
fig = px.line(combined, x = "time", y = 'eth_staking',
              color='type',
              title='ETH staking')
fig.show()

In [None]:
fig = px.line(combined, x = "time", y = 'eth_vault_real',
              color='type',
              title='ETH vault')
fig.show()