# Backtest: Binance OrderBook data

Tutorial for [NautilusTrader](https://nautilustrader.io/docs/), a high-performance algorithmic trading platform and event-driven backtester.

[View source on GitHub](https://github.com/nautechsystems/nautilus_trader/blob/develop/docs/tutorials/backtest_binance_orderbook.ipynb).

:::info
We are currently working on this tutorial.
:::

## Overview

This tutorial walks through setting up the data catalog and a `BacktestNode` to backtest an `OrderBookImbalance` strategy on order book data. This example requires you to bring your own Binance order book data.

## Prerequisites

- Python 3.11+ installed
- [JupyterLab](https://jupyter.org/) or similar installed (`pip install -U jupyterlab`)
- [NautilusTrader](https://pypi.org/project/nautilus_trader/) latest release installed (`pip install -U nautilus_trader`)

## Imports

We'll start with all of our imports for the remainder of this guide:

In [1]:
import os
import shutil
from decimal import Decimal
from pathlib import Path

import pandas as pd

from nautilus_trader.adapters.binance.loaders import BinanceOrderBookDeltaDataLoader
from nautilus_trader.backtest.node import BacktestDataConfig
from nautilus_trader.backtest.node import BacktestEngineConfig
from nautilus_trader.backtest.node import BacktestNode
from nautilus_trader.backtest.node import BacktestRunConfig
from nautilus_trader.backtest.node import BacktestVenueConfig
from nautilus_trader.config import ImportableStrategyConfig
from nautilus_trader.config import LoggingConfig
from nautilus_trader.core.datetime import dt_to_unix_nanos
from nautilus_trader.model import OrderBookDelta
from nautilus_trader.persistence.catalog import ParquetDataCatalog
from nautilus_trader.persistence.wranglers import OrderBookDeltaDataWrangler
from nautilus_trader.test_kit.providers import TestInstrumentProvider

## Loading data

In [2]:
# Path to your data directory, using ~/Downloads as an example
DATA_DIR = "~/Downloads"

In [3]:
data_path = Path(DATA_DIR).expanduser() / "Data" / "Binance"
raw_files = list(data_path.iterdir())
assert raw_files, f"Unable to find any data files in directory {data_path}"
raw_files

[PosixPath('/Users/sac/Downloads/Data/Binance/BTCUSDT_T_DEPTH_2022-11-01_depth_update.csv'),
 PosixPath('/Users/sac/Downloads/Data/Binance/BTCUSDT_T_DEPTH_2022-11-01_depth_snap.csv')]
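
The two files listed above play different roles: one holds the initial book snapshot and the other holds the subsequent updates. As a minimal sketch, assuming your files follow the same `_depth_snap.csv` / `_depth_update.csv` naming shown in the output, they can be separated by suffix. `split_depth_files` is a hypothetical helper for illustration, not part of NautilusTrader.

```python
from pathlib import Path


def split_depth_files(paths):
    """Group depth CSVs into snapshot and update files by file name suffix."""
    snaps = sorted(p for p in paths if p.name.endswith("_depth_snap.csv"))
    updates = sorted(p for p in paths if p.name.endswith("_depth_update.csv"))
    return snaps, updates


# File names taken from the directory listing above
example = [
    Path("BTCUSDT_T_DEPTH_2022-11-01_depth_update.csv"),
    Path("BTCUSDT_T_DEPTH_2022-11-01_depth_snap.csv"),
]
snaps, updates = split_depth_files(example)
print(snaps[0].name)
print(updates[0].name)
```

In the notebook you could pass `raw_files` instead of the hard-coded `example` list.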

In [4]:
# First we'll load the initial order book snapshot
path_snap = data_path / "BTCUSDT_T_DEPTH_2022-11-01_depth_snap.csv"
df_snap = BinanceOrderBookDeltaDataLoader.load(path_snap)
df_snap.head()

timestamp,instrument_id,action,side,price,size,order_id,flags,sequence
2022-11-01 00:00:00+00:00,BTCUSDT.BINANCE,ADD,BUY,19998.0,1.152253,0,32,0
2022-11-01 00:00:00+00:00,BTCUSDT.BINANCE,ADD,BUY,19996.0,0.66218,0,32,0
2022-11-01 00:00:00+00:00,BTCUSDT.BINANCE,ADD,BUY,19994.0,0.379546,0,32,0
2022-11-01 00:00:00+00:00,BTCUSDT.BINANCE,ADD,BUY,19992.0,0.24115,0,32,0
2022-11-01 00:00:00+00:00,BTCUSDT.BINANCE,ADD,BUY,19990.0,0.155155,0,32,0


In [5]:
# Then we'll load the order book updates; to save time, we limit this to 1 million rows
path_update = data_path / "BTCUSDT_T_DEPTH_2022-11-01_depth_update.csv"
nrows = 1_000_000
df_update = BinanceOrderBookDeltaDataLoader.load(path_update, nrows=nrows)
df_update.head()

timestamp,instrument_id,action,side,price,size,order_id,flags,sequence
2022-11-01 00:00:00+00:00,BTCUSDT.BINANCE,UPDATE,SELL,20122.917661,0.936128,0,0,0
2022-11-01 00:00:00.100000+00:00,BTCUSDT.BINANCE,UPDATE,BUY,20141.613117,0.983327,0,0,0
2022-11-01 00:00:00.200000+00:00,BTCUSDT.BINANCE,UPDATE,SELL,20381.981632,0.886866,0,0,0
2022-11-01 00:00:00.300000+00:00,BTCUSDT.BINANCE,UPDATE,BUY,20274.174317,0.589417,0,0,0
2022-11-01 00:00:00.400000+00:00,BTCUSDT.BINANCE,UPDATE,SELL,20644.973594,0.904972,0,0,0


### Process deltas using a wrangler

In [6]:
BTCUSDT_BINANCE = TestInstrumentProvider.btcusdt_binance()
wrangler = OrderBookDeltaDataWrangler(BTCUSDT_BINANCE)

deltas = wrangler.process(df_snap)
deltas += wrangler.process(df_update)
deltas.sort(key=lambda x: x.ts_init)  # Ensure data is non-decreasing by `ts_init`
deltas[:10]

[OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=CLEAR, order=BookOrder(side=NO_ORDER_SIDE, price=0, size=0, order_id=0), flags=0, sequence=0, ts_event=1667260800000000000, ts_init=1667260800000000000),
 OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=ADD, order=BookOrder(side=BUY, price=19998.00, size=1.152253, order_id=0), flags=32, sequence=0, ts_event=1667260800000000000, ts_init=1667260800000000000),
 OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=ADD, order=BookOrder(side=BUY, price=19996.00, size=0.662180, order_id=0), flags=32, sequence=0, ts_event=1667260800000000000, ts_init=1667260800000000000),
 OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=ADD, order=BookOrder(side=BUY, price=19994.00, size=0.379546, order_id=0), flags=32, sequence=0, ts_event=1667260800000000000, ts_init=1667260800000000000),
 OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=ADD, order=BookOrder(side=BUY, price=19992.00, size=0.241150, order_id=0), flags=32, sequence=0, ts_event
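
The sort in the cell above matters because the snapshot and the first update share the same timestamp. Python's `list.sort` is stable, so items with equal `ts_init` keep their original relative order, which is why the snapshot deltas stay ahead of the update that follows them. The sketch below illustrates this with a stand-in class; `FakeDelta` and `is_non_decreasing` are hypothetical names used only for this example and are not NautilusTrader types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeDelta:
    """Stand-in for an order book delta, carrying only the fields needed here."""

    action: str
    ts_init: int


def is_non_decreasing(items):
    """Return True if ts_init never goes backwards from one item to the next."""
    return all(a.ts_init <= b.ts_init for a, b in zip(items, items[1:]))


T0 = 1_667_260_800_000_000_000  # 2022-11-01 00:00:00 UTC in nanoseconds

snapshot = [FakeDelta("CLEAR", T0), FakeDelta("ADD", T0)]
updates = [FakeDelta("UPDATE", T0), FakeDelta("UPDATE", T0 + 100_000_000)]

merged = snapshot + updates
merged.sort(key=lambda d: d.ts_init)  # stable: ties keep their original order

print(is_non_decreasing(merged))
print([d.action for d in merged])
```

A similar check on the real `deltas` list is a cheap sanity test before writing to the catalog.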

### Set up data catalog

In [7]:
CATALOG_PATH = os.getcwd() + "/catalog"

# Clear if it already exists, then create fresh
if os.path.exists(CATALOG_PATH):
    shutil.rmtree(CATALOG_PATH)
os.mkdir(CATALOG_PATH)

# Create a catalog instance
catalog = ParquetDataCatalog(CATALOG_PATH)
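
The clear-and-recreate step above can also be written with `pathlib`. This is a minimal sketch of the same idea; `reset_dir` is a hypothetical helper name, and the demonstration runs in a temporary directory so it does not touch your working directory.

```python
import shutil
import tempfile
from pathlib import Path


def reset_dir(path):
    """Delete `path` if it exists, then recreate it empty and return it."""
    path = Path(path)
    if path.exists():
        shutil.rmtree(path)
    path.mkdir(parents=True)
    return path


# Demonstrate in a throwaway temp directory rather than the working directory
with tempfile.TemporaryDirectory() as tmp:
    catalog_dir = Path(tmp) / "catalog"
    reset_dir(catalog_dir)
    (catalog_dir / "stale.parquet").touch()  # simulate leftovers from an earlier run
    reset_dir(catalog_dir)
    leftover = list(catalog_dir.iterdir())

print(leftover)
```

Starting from an empty directory avoids mixing data from an earlier run into the new catalog.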

In [8]:
# Write instrument and order book deltas to catalog
catalog.write_data([BTCUSDT_BINANCE])
catalog.write_data(deltas)

In [9]:
# Confirm the instrument was written
catalog.instruments()

[CurrencyPair(id=BTCUSDT.BINANCE, raw_symbol=BTCUSDT, asset_class=CRYPTOCURRENCY, instrument_class=SPOT, quote_currency=USDT, is_inverse=False, price_precision=2, price_increment=0.01, size_precision=6, size_increment=0.000001, multiplier=1, lot_size=None, margin_init=0, margin_maint=0, maker_fee=0.001, taker_fee=0.001, info=None)]

In [10]:
# Explore the available data in the catalog
start = dt_to_unix_nanos(pd.Timestamp("2022-11-01", tz="UTC"))
end = dt_to_unix_nanos(pd.Timestamp("2022-11-04", tz="UTC"))

deltas = catalog.order_book_deltas(start=start, end=end)
print(len(deltas))
deltas[:10]

864041


[OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=CLEAR, order=BookOrder(side=NO_ORDER_SIDE, price=0.00, size=0.000000, order_id=0), flags=0, sequence=0, ts_event=1667260800000000000, ts_init=1667260800000000000),
 OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=ADD, order=BookOrder(side=BUY, price=19998.00, size=1.152253, order_id=0), flags=32, sequence=0, ts_event=1667260800000000000, ts_init=1667260800000000000),
 OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=ADD, order=BookOrder(side=BUY, price=19996.00, size=0.662180, order_id=0), flags=32, sequence=0, ts_event=1667260800000000000, ts_init=1667260800000000000),
 OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=ADD, order=BookOrder(side=BUY, price=19994.00, size=0.379546, order_id=0), flags=32, sequence=0, ts_event=1667260800000000000, ts_init=1667260800000000000),
 OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=ADD, order=BookOrder(side=BUY, price=19992.00, size=0.241150, order_id=0), flags=32, sequence=0
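
The `start` and `end` values passed to the catalog query are integer nanoseconds since the UNIX epoch, the same unit as the `ts_event` and `ts_init` fields in the output above. The sketch below reproduces that conversion with the standard library only, to make the unit concrete. `to_unix_nanos` and `from_unix_nanos` are hypothetical helpers written for this example; in the notebook itself, keep using `dt_to_unix_nanos`.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_nanos(dt):
    """Convert a timezone-aware datetime to integer nanoseconds since the UNIX epoch."""
    delta = dt - EPOCH
    # Integer arithmetic avoids float rounding at nanosecond scale
    return (delta.days * 86_400 + delta.seconds) * 1_000_000_000 + delta.microseconds * 1_000


def from_unix_nanos(nanos):
    """Convert nanoseconds back to a datetime (microsecond precision only)."""
    return EPOCH + timedelta(microseconds=nanos // 1_000)


start_ns = to_unix_nanos(datetime(2022, 11, 1, tzinfo=timezone.utc))
print(start_ns)
print(from_unix_nanos(start_ns).isoformat())
```

The first printed value matches the `ts_init` of the leading `CLEAR` delta shown above.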

## Configure backtest

In [11]:
instrument = catalog.instruments()[0]
book_type = "L2_MBP"  # Ensure data book type matches venue book type

data_configs = [
    BacktestDataConfig(
        catalog_path=CATALOG_PATH,
        data_cls=OrderBookDelta,
        instrument_id=instrument.id,
        # start_time=start,  # Run across all data
        # end_time=end,  # Run across all data
    ),
]

venues_configs = [
    BacktestVenueConfig(
        name="BINANCE",
        oms_type="NETTING",
        account_type="CASH",
        base_currency=None,
        starting_balances=["20 BTC", "100000 USDT"],
        book_type=book_type,  # <-- Venue's book type
    )
]

strategies = [
    ImportableStrategyConfig(
        strategy_path="nautilus_trader.examples.strategies.orderbook_imbalance:OrderBookImbalance",
        config_path="nautilus_trader.examples.strategies.orderbook_imbalance:OrderBookImbalanceConfig",
        config={
            "instrument_id": instrument.id,
            "book_type": book_type,
            "max_trade_size": Decimal("1.000"),
            "min_seconds_between_triggers": 1.0,
        },
    ),
]

# NautilusTrader currently exceeds the rate limit for Jupyter notebook logging (stdout output),
# which is why `log_level` is set to "ERROR". If you lower this level to see more logging,
# the notebook will hang during cell execution. A fix is being investigated that involves
# either raising the configured rate limits for Jupyter or throttling the log flushing
# from Nautilus.
# https://github.com/jupyterlab/jupyterlab/issues/12845
# https://github.com/deshaw/jupyterlab-limit-output
config = BacktestRunConfig(
    engine=BacktestEngineConfig(
        strategies=strategies,
        logging=LoggingConfig(log_level="ERROR"),
    ),
    data=data_configs,
    venues=venues_configs,
)

config

BacktestRunConfig(venues=[BacktestVenueConfig(name='BINANCE', oms_type='NETTING', account_type='CASH', starting_balances=['20 BTC', '100000 USDT'], base_currency=None, default_leverage=1.0, leverages=None, book_type='L2_MBP', routing=False, frozen_account=False, reject_stop_orders=True, support_gtd_orders=True, support_contingent_orders=True, use_position_ids=True, use_random_ids=False, use_reduce_only=True, bar_execution=True, bar_adaptive_high_low_ordering=False, trade_execution=False, modules=None)], data=[BacktestDataConfig(catalog_path='/Users/sac/dev/naut/catalog', data_cls=<class 'nautilus_trader.model.data.OrderBookDelta'>, catalog_fs_protocol=None, catalog_fs_storage_options=None, instrument_id=InstrumentId('BTCUSDT.BINANCE'), start_time=None, end_time=None, filter_expr=None, client_id=None, metadata=None, bar_spec=None, instrument_ids=None, bar_types=None)], engine=BacktestEngineConfig(environment=<Environment.BACKTEST: 'backtest'>, trader_id=TraderId('BACKTESTER-001'), insta
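
The strategy configured above is named for an imbalance between resting bid and ask size. The sketch below shows one generic way to quantify that at the top of the book. It is an illustration only: the actual trigger logic lives in the `nautilus_trader.examples.strategies.orderbook_imbalance` module and may differ. `top_of_book_imbalance` is a hypothetical function, and the sizes are made up.

```python
from decimal import Decimal


def top_of_book_imbalance(bid_size, ask_size):
    """Return (ratio, heavier_side), where ratio = smaller size / larger size.

    A ratio near 1 means the two sides are balanced; near 0 means one side dominates.
    """
    smaller, larger = sorted((bid_size, ask_size))
    if larger == 0:
        return None, None  # nothing resting on either side: nothing to measure
    if bid_size > ask_size:
        side = "BID"
    elif ask_size > bid_size:
        side = "ASK"
    else:
        side = None  # perfectly balanced
    return smaller / larger, side


# Hypothetical sizes, for illustration only
ratio, side = top_of_book_imbalance(Decimal("1.2"), Decimal("0.3"))
print(ratio, side)
```

A strategy built on such a measure would typically act only when the ratio crosses some threshold, then wait before acting again, which is the role a setting like `min_seconds_between_triggers` suggests.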

## Run the backtest

In [12]:
node = BacktestNode(configs=[config])

result = node.run()

In [13]:
result

[BacktestResult(trader_id='BACKTESTER-001', machine_id='Mac.lan', run_config_id='6daebef8f7584300be6fc965a368f598bf156e270139470efd2e45ac17dc1587', instance_id='e8ba107c-0210-4125-97b4-8a8f47e4f013', run_id='7ae752da-91d7-45db-aad8-0491f34fa0f8', run_started=1749260299845944000, run_finished=1749260313058076000, backtest_start=1667260800000000000, backtest_end=1667347199900000000, elapsed_time=86399.9, iterations=0, total_events=0, total_orders=0, total_positions=0, stats_pnls={'BTC': {'PnL (total)': 0.0, 'PnL% (total)': 0.0, 'Max Winner': 0.0, 'Avg Winner': 0.0, 'Min Winner': 0.0, 'Min Loser': 0.0, 'Avg Loser': 0.0, 'Max Loser': 0.0, 'Expectancy': 0.0, 'Win Rate': 0.0}, 'USDT': {'PnL (total)': 0.0, 'PnL% (total)': 0.0, 'Max Winner': 0.0, 'Avg Winner': 0.0, 'Min Winner': 0.0, 'Min Loser': 0.0, 'Avg Loser': 0.0, 'Max Loser': 0.0, 'Expectancy': 0.0, 'Win Rate': 0.0}}, stats_returns={'Returns Volatility (252 days)': nan, 'Average (Return)': nan, 'Average Loss (Return)': nan, 'Average Win 
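
The result carries two pairs of nanosecond timestamps: `run_started` / `run_finished` measure how long the backtest took to execute, while `backtest_start` / `backtest_end` bound the simulated period. A short sketch, using the values copied from the output above, makes the difference explicit.

```python
NANOS_PER_SECOND = 1_000_000_000

# Values copied from the `BacktestResult` output above
run_started = 1749260299845944000
run_finished = 1749260313058076000
backtest_start = 1667260800000000000
backtest_end = 1667347199900000000

# Real time spent executing the run
wall_clock_secs = (run_finished - run_started) / NANOS_PER_SECOND
# Length of the simulated period covered by the data
simulated_secs = (backtest_end - backtest_start) / NANOS_PER_SECOND

print(wall_clock_secs)
print(simulated_secs)
```

The simulated duration agrees with the `elapsed_time=86399.9` field in the result. Note also that this particular run reports `total_orders=0` and `total_positions=0`, so the order fills and positions reports in the following cells show no output.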

In [14]:
from nautilus_trader.backtest.engine import BacktestEngine
from nautilus_trader.model import Venue


engine: BacktestEngine = node.get_engine(config.id)

engine.trader.generate_order_fills_report()

In [15]:
engine.trader.generate_positions_report()

In [16]:
engine.trader.generate_account_report(Venue("BINANCE"))

Unnamed: 0,total,locked,free,currency,account_id,account_type,base_currency,margins,reported,info
2022-11-01 00:00:00+00:00,20.0,0.0,20.0,BTC,BINANCE-001,CASH,,[],True,{}
2022-11-01 00:00:00+00:00,100000.0,0.0,100000.0,USDT,BINANCE-001,CASH,,[],True,{}
