# Backtest: Binance OrderBook data

Tutorial for [NautilusTrader](https://nautilustrader.io/docs/), a high-performance algorithmic trading platform and event-driven backtester.

[View source on GitHub](https://github.com/nautechsystems/nautilus_trader/blob/develop/docs/tutorials/backtest_binance_orderbook.ipynb).

:::info
We are currently working on this article.
:::

## Overview

This tutorial walks through how to set up the data catalog and a `BacktestNode` to backtest an `OrderBookImbalance` strategy on order book data. This example requires you to bring your own Binance order book data.
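
To give a feel for what an imbalance signal might look at, here is a minimal sketch of one common notion of top-of-book imbalance: the ratio of the smaller to the larger resting size at the best bid and ask. This is an illustration only and is not taken from the `OrderBookImbalance` source; the function names and the ask size are hypothetical, while the bid size of 1.77 is the best bid from the snapshot loaded later in this tutorial.

```python
from decimal import Decimal


def imbalance_ratio(bid_size: Decimal, ask_size: Decimal) -> Decimal:
    """Return smaller / larger of the two top-of-book sizes (0 <= ratio <= 1)."""
    smaller = min(bid_size, ask_size)
    larger = max(bid_size, ask_size)
    if larger == 0:
        raise ValueError("both sides of the book are empty")
    return smaller / larger


def heavier_side(bid_size: Decimal, ask_size: Decimal) -> str:
    """Name the side holding more size at the top of the book."""
    if bid_size == ask_size:
        return "NONE"
    return "BID" if bid_size > ask_size else "ASK"


# 1.770 is the best bid size from the snapshot; 0.177 is a hypothetical ask size
ratio = imbalance_ratio(Decimal("1.770"), Decimal("0.177"))
side = heavier_side(Decimal("1.770"), Decimal("0.177"))
print(ratio, side)
```

A ratio close to 1 means the two sides are balanced, while a ratio close to 0 means one side dominates.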

## Prerequisites

- [NautilusTrader](https://pypi.org/project/nautilus_trader/) latest release installed (`pip install -U nautilus_trader`)
- [JupyterLab](https://jupyter.org/) or similar installed (`pip install -U jupyterlab`)
- Python 3.10+ installed

## Imports

We'll start with all of our imports for the remainder of this guide:

In [1]:
import os
import shutil
from decimal import Decimal
from pathlib import Path

import pandas as pd

from nautilus_trader.backtest.node import BacktestNode
from nautilus_trader.backtest.node import BacktestVenueConfig
from nautilus_trader.backtest.node import BacktestDataConfig
from nautilus_trader.backtest.node import BacktestRunConfig
from nautilus_trader.backtest.node import BacktestEngineConfig
from nautilus_trader.core.datetime import dt_to_unix_nanos
from nautilus_trader.config import ImportableStrategyConfig
from nautilus_trader.config import LoggingConfig
from nautilus_trader.examples.strategies.ema_cross import EMACross, EMACrossConfig
from nautilus_trader.model.data import OrderBookDelta
from nautilus_trader.persistence.loaders import BinanceOrderBookDeltaDataLoader
from nautilus_trader.persistence.wranglers import OrderBookDeltaDataWrangler
from nautilus_trader.persistence.catalog import ParquetDataCatalog
from nautilus_trader.test_kit.providers import TestInstrumentProvider

## Loading data

In [2]:
# Path to your data directory, using the user's ~/Downloads folder as an example
DATA_DIR = "~/Downloads"

In [3]:
data_path = Path(DATA_DIR).expanduser() / "Data" / "Binance"
# Local override used to produce the outputs below; point this at your own data directory
data_path = Path("/Users/minzzii/Documents/highfrequencytrading/nautilus_trader/tests/test_data/binance")
raw_files = list(data_path.iterdir())
assert raw_files, f"Unable to find any data files in directory {data_path}"
raw_files

[PosixPath('/Users/minzzii/Documents/highfrequencytrading/nautilus_trader/tests/test_data/binance/btcusdt-quotes.parquet'),
 PosixPath('/Users/minzzii/Documents/highfrequencytrading/nautilus_trader/tests/test_data/binance/btcusdt-trades.parquet'),
 PosixPath('/Users/minzzii/Documents/highfrequencytrading/nautilus_trader/tests/test_data/binance/btcusdt-instrument-repr.txt'),
 PosixPath('/Users/minzzii/Documents/highfrequencytrading/nautilus_trader/tests/test_data/binance/btcusdt-depth-snap.csv'),
 PosixPath('/Users/minzzii/Documents/highfrequencytrading/nautilus_trader/tests/test_data/binance/btcusdt-instrument.txt'),
 PosixPath('/Users/minzzii/Documents/highfrequencytrading/nautilus_trader/tests/test_data/binance/btcusdt-depth-update.csv'),
 PosixPath('/Users/minzzii/Documents/highfrequencytrading/nautilus_trader/tests/test_data/binance/ethusdt-trades.csv')]

In [4]:
# First we'll load the initial order book snapshot
path_snap = data_path / "btcusdt-depth-snap.csv"  # e.g. "BTCUSDT_T_DEPTH_2022-11-01_depth_snap.csv"
df_snap = BinanceOrderBookDeltaDataLoader.load(path_snap)
df_snap.head()

| timestamp | instrument_id | action | side | price | size | order_id | flags | sequence |
|---|---|---|---|---|---|---|---|---|
| 2022-11-01 23:49:39.146000+00:00 | BTCUSDT.BINANCE | ADD | BUY | 20377.0 | 1.77 | 0 | 32 | 2098021528332 |
| 2022-11-01 23:49:39.146000+00:00 | BTCUSDT.BINANCE | ADD | BUY | 20376.9 | 0.001 | 0 | 32 | 2098021528332 |
| 2022-11-01 23:49:39.146000+00:00 | BTCUSDT.BINANCE | ADD | BUY | 20376.8 | 0.009 | 0 | 32 | 2098021528332 |
| 2022-11-01 23:49:39.146000+00:00 | BTCUSDT.BINANCE | ADD | BUY | 20376.7 | 1.216 | 0 | 32 | 2098021528332 |
| 2022-11-01 23:49:39.146000+00:00 | BTCUSDT.BINANCE | ADD | BUY | 20376.6 | 0.011 | 0 | 32 | 2098021528332 |


In [5]:
# Then we'll load the order book updates; to save time, we limit this to 1 million rows
path_update = data_path / "btcusdt-depth-update.csv"  # e.g. "BTCUSDT_T_DEPTH_2022-11-01_depth_update.csv"
nrows = 1_000_000
df_update = BinanceOrderBookDeltaDataLoader.load(path_update, nrows=nrows)
df_update.head()

| timestamp | instrument_id | action | side | price | size | order_id | flags | sequence |
|---|---|---|---|---|---|---|---|---|
| 2022-11-01 23:59:59.939000+00:00 | BTCUSDT.BINANCE | DELETE | SELL | 20472.8 | 0.0 | 0 | 0 | 2098041696700 |
| 2022-11-01 23:59:59.939000+00:00 | BTCUSDT.BINANCE | DELETE | SELL | 20472.9 | 0.0 | 0 | 0 | 2098041696700 |
| 2022-11-01 23:59:59.939000+00:00 | BTCUSDT.BINANCE | DELETE | SELL | 20473.1 | 0.0 | 0 | 0 | 2098041696700 |
| 2022-11-01 23:59:59.939000+00:00 | BTCUSDT.BINANCE | DELETE | SELL | 20473.2 | 0.0 | 0 | 0 | 2098041696700 |
| 2022-11-01 23:59:59.939000+00:00 | BTCUSDT.BINANCE | UPDATE | SELL | 20473.3 | 0.001 | 0 | 0 | 2098041696700 |
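
The `action` column carries the meaning of each row. As a rough sketch of how `ADD`, `UPDATE`, `DELETE`, and `CLEAR` rows could be applied to a price-level (L2) book, the toy class below keeps one size per price on each side. `SimpleL2Book` is a hypothetical stand-in written for this illustration, not the NautilusTrader order book, and the single `SELL` `ADD` row is made up so that the `DELETE` has something to remove; the other rows come from the outputs above.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class SimpleL2Book:
    """Toy price-level book: one aggregated size per price and side."""

    bids: dict = field(default_factory=dict)
    asks: dict = field(default_factory=dict)

    def apply(self, action: str, side: str, price: str, size: str) -> None:
        if action == "CLEAR":
            # A snapshot starts by wiping whatever state existed before
            self.bids.clear()
            self.asks.clear()
            return
        levels = self.bids if side == "BUY" else self.asks
        price_key = Decimal(price)
        if action in ("ADD", "UPDATE"):
            levels[price_key] = Decimal(size)  # set the size at this price level
        elif action == "DELETE":
            levels.pop(price_key, None)  # remove the level if it is present
        else:
            raise ValueError(f"unknown action: {action}")


book = SimpleL2Book()
rows = [
    ("CLEAR", "NO_ORDER_SIDE", "0", "0"),
    ("ADD", "BUY", "20377.0", "1.77"),
    ("ADD", "BUY", "20376.9", "0.001"),
    ("ADD", "SELL", "20472.8", "0.5"),  # hypothetical level, not in the data above
    ("DELETE", "SELL", "20472.8", "0.0"),
    ("UPDATE", "SELL", "20473.3", "0.001"),
]
for row in rows:
    book.apply(*row)

best_bid = max(book.bids)
best_ask = min(book.asks)
print(best_bid, best_ask)
```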


### Process deltas using a wrangler

In [6]:
BTCUSDT_BINANCE = TestInstrumentProvider.btcusdt_binance()
wrangler = OrderBookDeltaDataWrangler(BTCUSDT_BINANCE)

deltas = wrangler.process(df_snap)
deltas += wrangler.process(df_update)
deltas.sort(key=lambda x: x.ts_init)  # Ensure data is non-decreasing by `ts_init`
deltas[:10]

[OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=CLEAR, order=BookOrder(side=NO_ORDER_SIDE, price=0, size=0, order_id=0), flags=0, sequence=2098021528332, ts_event=1667346579146000000, ts_init=1667346579146000000),
 OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=ADD, order=BookOrder(side=BUY, price=20377.00, size=1.770000, order_id=0), flags=32, sequence=2098021528332, ts_event=1667346579146000000, ts_init=1667346579146000000),
 OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=ADD, order=BookOrder(side=BUY, price=20376.90, size=0.001000, order_id=0), flags=32, sequence=2098021528332, ts_event=1667346579146000000, ts_init=1667346579146000000),
 OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=ADD, order=BookOrder(side=BUY, price=20376.80, size=0.009000, order_id=0), flags=32, sequence=2098021528332, ts_event=1667346579146000000, ts_init=1667346579146000000),
 OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=ADD, order=BookOrder(side=BUY, price=20376.70, size=1.216
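
The sort in the cell above relies on Python's `list.sort` being stable: deltas that share a `ts_init` keep their relative order, so the snapshot's `CLEAR` still precedes its `ADD` rows. The sketch below shows this with a hypothetical `Delta` stand-in (not the NautilusTrader `OrderBookDelta` class). The two timestamps are the snapshot and update times from the outputs above; the helper name is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delta:
    """Hypothetical stand-in holding only the fields needed here."""

    action: str
    ts_init: int


SNAPSHOT_TS = 1667346579146000000  # 2022-11-01 23:49:39.146 UTC
UPDATE_TS = 1667347199939000000  # 2022-11-01 23:59:59.939 UTC

snapshot = [Delta("CLEAR", SNAPSHOT_TS), Delta("ADD", SNAPSHOT_TS)]
updates = [Delta("UPDATE", UPDATE_TS), Delta("DELETE", UPDATE_TS)]

# Deliberately concatenate in the wrong order to show what the sort repairs
merged = updates + snapshot
merged.sort(key=lambda d: d.ts_init)  # stable: ties keep their original order


def is_non_decreasing(items: list) -> bool:
    """Check that ts_init never goes backwards."""
    return all(a.ts_init <= b.ts_init for a, b in zip(items, items[1:]))


print([d.action for d in merged], is_non_decreasing(merged))
```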

### Set up data catalog

In [7]:
CATALOG_PATH = os.getcwd() + "/catalog"

# Clear if it already exists, then create fresh
if os.path.exists(CATALOG_PATH):
    shutil.rmtree(CATALOG_PATH)
os.mkdir(CATALOG_PATH)

# Create a catalog instance
catalog = ParquetDataCatalog(CATALOG_PATH)
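
The same clear-and-recreate step can be written with `pathlib`. The sketch below runs inside a temporary directory so it is safe to try anywhere. `reset_dir` is a hypothetical helper name and `stale.parquet` is a made-up file used only to show that old contents are removed.

```python
import shutil
import tempfile
from pathlib import Path


def reset_dir(path: Path) -> Path:
    """Delete the directory if it exists, then create it empty."""
    if path.exists():
        shutil.rmtree(path)
    path.mkdir(parents=True)
    return path


with tempfile.TemporaryDirectory() as tmp:
    catalog_dir = Path(tmp) / "catalog"
    reset_dir(catalog_dir)
    (catalog_dir / "stale.parquet").write_text("old")  # simulate a previous run
    reset_dir(catalog_dir)
    remaining = list(catalog_dir.iterdir())

print(remaining)
```

Because `shutil.rmtree` deletes everything under the path, it is worth double-checking the path before pointing this at a real directory.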

In [8]:
# Write instrument and order book deltas to catalog
catalog.write_data([BTCUSDT_BINANCE])
catalog.write_data(deltas)

In [9]:
# Confirm the instrument was written
catalog.instruments()

[CurrencyPair(id=BTCUSDT.BINANCE, raw_symbol=BTCUSDT, asset_class=CRYPTOCURRENCY, instrument_class=SPOT, quote_currency=USDT, is_inverse=False, price_precision=2, price_increment=0.01, size_precision=6, size_increment=0.000001, multiplier=1, lot_size=None, margin_init=0, margin_maint=0, maker_fee=0.001, taker_fee=0.001, info=None)]

In [10]:
# Explore the available data in the catalog
start = dt_to_unix_nanos(pd.Timestamp("2022-11-01", tz="UTC"))
end = dt_to_unix_nanos(pd.Timestamp("2022-11-04", tz="UTC"))

deltas = catalog.order_book_deltas(start=start, end=end)
print(len(deltas))
deltas[:10]

201


[OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=CLEAR, order=BookOrder(side=NO_ORDER_SIDE, price=0.00, size=0.000000, order_id=0), flags=0, sequence=2098021528332, ts_event=1667346579146000000, ts_init=1667346579146000000),
 OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=ADD, order=BookOrder(side=BUY, price=20377.00, size=1.770000, order_id=0), flags=32, sequence=2098021528332, ts_event=1667346579146000000, ts_init=1667346579146000000),
 OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=ADD, order=BookOrder(side=BUY, price=20376.90, size=0.001000, order_id=0), flags=32, sequence=2098021528332, ts_event=1667346579146000000, ts_init=1667346579146000000),
 OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=ADD, order=BookOrder(side=BUY, price=20376.80, size=0.009000, order_id=0), flags=32, sequence=2098021528332, ts_event=1667346579146000000, ts_init=1667346579146000000),
 OrderBookDelta(instrument_id=BTCUSDT.BINANCE, action=ADD, order=BookOrder(side=BUY, price=20376.70, 
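
The `start` and `end` values, like `ts_event` and `ts_init` in the output, are integer nanoseconds since the Unix epoch. As a standard-library sketch of that conversion (not the `dt_to_unix_nanos` implementation; `to_unix_nanos` is a hypothetical name), the code below uses integer arithmetic to avoid floating-point rounding and reproduces the snapshot timestamp shown above.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_nanos(dt: datetime) -> int:
    """Convert a timezone-aware datetime to integer nanoseconds since the epoch."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    delta = dt - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


start_ns = to_unix_nanos(datetime(2022, 11, 1, tzinfo=timezone.utc))
end_ns = to_unix_nanos(datetime(2022, 11, 4, tzinfo=timezone.utc))

# Timestamp of the snapshot rows: 2022-11-01 23:49:39.146 UTC
snapshot_ns = to_unix_nanos(
    datetime(2022, 11, 1, 23, 49, 39, 146_000, tzinfo=timezone.utc)
)

print(start_ns, snapshot_ns, start_ns <= snapshot_ns <= end_ns)
```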

## Configure backtest

In [11]:
instrument = catalog.instruments()[0]
book_type = "L2_MBP"  # Ensure data book type matches venue book type

data_configs = [
    BacktestDataConfig(
        catalog_path=CATALOG_PATH,
        data_cls=OrderBookDelta,
        instrument_id=instrument.id,
        # start_time=start,  # Run across all data
        # end_time=end,  # Run across all data
    )
]

venues_configs = [
    BacktestVenueConfig(
        name="BINANCE",
        oms_type="NETTING",
        account_type="CASH",
        base_currency=None,
        starting_balances=["20 BTC", "100000 USDT"],
        book_type=book_type,  # <-- Venue's book type
    )
]

strategies = [
    ImportableStrategyConfig(
        strategy_path="nautilus_trader.examples.strategies.orderbook_imbalance:OrderBookImbalance",
        config_path="nautilus_trader.examples.strategies.orderbook_imbalance:OrderBookImbalanceConfig",
        config=dict(
            instrument_id=instrument.id,
            book_type=book_type,
            max_trade_size=Decimal("1.000"),
            min_seconds_between_triggers=1.0,
        ),
    ),
]

# NautilusTrader currently exceeds the rate limit for Jupyter notebook logging (stdout output),
# which is why `log_level` is set to "ERROR". If you lower this level to see more logging,
# the notebook will hang during cell execution. A fix is being investigated that involves
# either raising the configured rate limits for Jupyter or throttling the log flushing
# from Nautilus.
# https://github.com/jupyterlab/jupyterlab/issues/12845
# https://github.com/deshaw/jupyterlab-limit-output
config = BacktestRunConfig(
    engine=BacktestEngineConfig(
        strategies=strategies,
        logging=LoggingConfig(log_level="ERROR"),
    ),
    data=data_configs,
    venues=venues_configs,
)

config

BacktestRunConfig(venues=[BacktestVenueConfig(name='BINANCE', oms_type='NETTING', account_type='CASH', starting_balances=['20 BTC', '100000 USDT'], base_currency=None, default_leverage=1.0, leverages=None, book_type='L2_MBP', routing=False, frozen_account=False, bar_execution=True, reject_stop_orders=True, support_gtd_orders=True, support_contingent_orders=True, use_position_ids=True, use_random_ids=False, use_reduce_only=True, modules=None)], data=[BacktestDataConfig(catalog_path='/Users/minzzii/Documents/highfrequencytrading/nautilus_trader/docs/tutorials/catalog', data_cls=<class 'nautilus_trader.model.data.OrderBookDelta'>, catalog_fs_protocol=None, catalog_fs_storage_options=None, instrument_id=InstrumentId('BTCUSDT.BINANCE'), start_time=None, end_time=None, filter_expr=None, client_id=None, metadata=None, bar_spec=None, batch_size=10000)], engine=BacktestEngineConfig(environment=<Environment.BACKTEST: 'backtest'>, trader_id=TraderId('BACKTESTER-001'), instance_id=None, cache=None

## Run the backtest

In [12]:
node = BacktestNode(configs=[config])

result = node.run()

In [13]:
result

[BacktestResult(trader_id='BACKTESTER-001', machine_id='minzzii-MacBookPro-2.local', run_config_id='2ea2e3095a4b1b3e0edc8698a1e6e5b3e62bc916025d46acdf6c9bb66645ed7e', instance_id='e60d8140-ec09-482e-a5d7-dd8b08315f2e', run_id='d711aa1b-9ecc-4aa0-92a7-9ded8fe807d8', run_started=1724846688402318000, run_finished=1724846688432878000, backtest_start=1667346579146000000, backtest_end=1667347199939000000, elapsed_time=620.793, iterations=0, total_events=0, total_orders=0, total_positions=0, stats_pnls={'BTC': {'PnL (total)': 0.0, 'PnL% (total)': 0.0, 'Max Winner': 0.0, 'Avg Winner': 0.0, 'Min Winner': 0.0, 'Min Loser': 0.0, 'Avg Loser': 0.0, 'Max Loser': 0.0, 'Expectancy': 0.0, 'Win Rate': 0.0}, 'USDT': {'PnL (total)': 0.0, 'PnL% (total)': 0.0, 'Max Winner': 0.0, 'Avg Winner': 0.0, 'Min Winner': 0.0, 'Min Loser': 0.0, 'Avg Loser': 0.0, 'Max Loser': 0.0, 'Expectancy': 0.0, 'Win Rate': 0.0}}, stats_returns={'Returns Volatility (252 days)': nan, 'Average (Return)': nan, 'Average Loss (Return)':
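
The timestamps in this result can be cross-checked with plain integer arithmetic. In the output above, `elapsed_time` appears to match the simulated span between `backtest_start` and `backtest_end` rather than the wall-clock duration between `run_started` and `run_finished`. The sketch below copies the four values from the output; the variable names are assumptions for illustration.

```python
from decimal import Decimal

NANOS_PER_SECOND = 1_000_000_000

# Values copied from the BacktestResult output above
backtest_start = 1667346579146000000
backtest_end = 1667347199939000000
run_started = 1724846688402318000
run_finished = 1724846688432878000

# Simulated time covered by the data
simulated_ns = backtest_end - backtest_start
simulated_seconds = Decimal(simulated_ns) / Decimal(NANOS_PER_SECOND)
minutes, seconds = divmod(simulated_ns // NANOS_PER_SECOND, 60)

# Wall-clock time the run took on the machine
wall_clock_ns = run_finished - run_started

print(simulated_seconds, minutes, seconds, wall_clock_ns)
```

The simulated span covers the time from the snapshot at 23:49:39.146 to the last update at 23:59:59.939.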

In [14]:
from nautilus_trader.backtest.engine import BacktestEngine
from nautilus_trader.model.identifiers import Venue

engine: BacktestEngine = node.get_engine(config.id)

report = engine.trader.generate_order_fills_report()
print(report)

Empty DataFrame
Columns: []
Index: []


In [15]:
report = engine.trader.generate_positions_report()
print(report)

Empty DataFrame
Columns: []
Index: []


In [16]:
engine.trader.generate_account_report(Venue("BINANCE"))

| | total | locked | free | currency | account_id | account_type | base_currency | margins | reported | info |
|---|---|---|---|---|---|---|---|---|---|---|
| 2022-11-01 23:49:39.146000+00:00 | 20.0 | 0.0 | 20.0 | BTC | BINANCE-001 | CASH | | [] | True | {} |
| 2022-11-01 23:49:39.146000+00:00 | 100000.0 | 0.0 | 100000.0 | USDT | BINANCE-001 | CASH | | [] | True | {} |
