# Integrating Custom Data

By combining your custom data with the feed data (order book and trades), you can enhance your strategy while harnessing the full potential of hftbacktest.

## Accessing Spot Price

In this example, we'll combine the spot BTCUSDT mid-price with the USDM-Futures BTCUSDT feed data. This lets you estimate a fair value for the futures that takes the underlying price into account.

The spot data is used only on the local side, so it must carry a local timestamp. In your backtesting logic, you then need to find the most recent spot record that is no later than the current timestamp.
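
The point-in-time lookup described here can be sketched on its own with `np.searchsorted`. This is only an illustration: the helper name `latest_before` and the sample array are hypothetical, and it assumes the rows are sorted by local timestamp in ascending order.

```python
import numpy as np

def latest_before(spot, current_timestamp):
    """Return the latest spot mid whose local timestamp is <= current_timestamp."""
    # side='right' yields the number of rows with timestamp <= current_timestamp.
    idx = np.searchsorted(spot[:, 0], current_timestamp, side='right')
    # No row is available yet if every timestamp is later than the current one.
    return spot[idx - 1, 1] if idx > 0 else np.nan

# Hypothetical data: each row is (local_timestamp, mid).
sample_spot = np.array([
    [100.0, 61000.0],
    [200.0, 61010.0],
    [300.0, 61020.0],
])

print(latest_before(sample_spot, 250))  # 61010.0
print(latest_before(sample_spot, 300))  # 61020.0
print(latest_before(sample_spot, 50))   # nan
```

Inside a backtest loop, where the current timestamp only moves forward, advancing a row cursor as the notebook does avoids repeating the search from scratch on every step.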

The following cell processes the raw spot feed into an array whose rows hold a local timestamp and the spot mid-price.

In [1]:
import numpy as np
import gzip
import json

spot = np.full((100_000, 2), np.nan, np.float64)
i = 0

with gzip.open('spot/btcusdt_20240809.gz', 'r') as f:
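    # Iterating over the file yields one raw line at a time and stops at EOF.
    for line in f:
        line = line.decode().strip()
        # The first 19 characters are the local timestamp in nanoseconds;
        # the JSON payload starts after a one-character separator.
        local_timestamp = int(line[:19])

        obj = json.loads(line[20:])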
        if obj['stream'] == 'btcusdt@bookTicker':
            data = obj['data']
            mid = (float(data['b']) + float(data['a'])) / 2.0
            spot[i] = [local_timestamp, mid]
            i += 1
            
spot = spot[:i]
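
One caveat about the array above: it stores the nanosecond timestamp as `float64`, which cannot represent every 19-digit integer exactly. The sketch below shows the size of the rounding and one possible alternative, a structured array with an integer timestamp field. The timestamp value and the field names `local_timestamp` and `mid` are assumptions made for illustration, not part of the original notebook.

```python
import numpy as np

# Hypothetical nanosecond timestamp, chosen only for illustration.
ts = 1723161602500000123

# float64 has a 53-bit significand, so values near 1.7e18 are spaced 256 ns apart.
as_float = np.float64(ts)
rounding_error_ns = int(as_float) - ts
print('rounding error (ns):', rounding_error_ns)

# A structured array keeps the timestamp as an exact 64-bit integer.
spot_dtype = np.dtype([('local_timestamp', 'i8'), ('mid', 'f8')])
spot_exact = np.zeros(1, spot_dtype)
spot_exact[0] = (ts, 61688.0)
print('exact:', int(spot_exact['local_timestamp'][0]) == ts)
```

An error of at most 128 ns is likely negligible next to the 10 ms latencies configured in this example, so the `float64` layout is a reasonable simplification here. If you switch to a structured array, the lookup code would need to read the fields by name rather than by column index.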

The following function prints the futures mid-price, the spot mid-price, and the basis, using the latest point-in-time spot record that is no later than the current timestamp.

In [2]:
from numba import njit
from hftbacktest import BacktestAsset, HashMapMarketDepthBacktest

out_dtype = np.dtype([('timestamp', 'i8'), ('mid_price', 'f8'), ('spot_mid_price', 'f8')])

@njit
def print_basis(hbt, spot):
    out = np.empty(1_000_000, out_dtype)

    t = 0
    spot_row = 0
    
    # Checks every 1 second (elapse takes nanoseconds); prints every 10th check.
    while hbt.elapse(1_000_000_000) == 0:
        # Finds the latest spot mid value.
        while spot_row < len(spot) and spot[spot_row, 0] <= hbt.current_timestamp:
            spot_row += 1
        spot_mid_price = spot[spot_row - 1, 1] if spot_row > 0 else np.nan

        depth = hbt.depth(0)
        
        mid_price = (depth.best_bid + depth.best_ask) / 2.0
        basis = mid_price - spot_mid_price

        if t % 10 == 0:
            print(
                'current_timestamp:',
                hbt.current_timestamp,
                'futures_mid:',
                round(mid_price, 2),
                ', spot_mid:',
                round(spot_mid_price, 2),
                ', basis:',
                round(basis, 2)
            )

        out[t].timestamp = hbt.current_timestamp
        out[t].mid_price = mid_price
        out[t].spot_mid_price = spot_mid_price
        t += 1
        
    return out[:t]

asset = (
    BacktestAsset()
        .data(['usdm/btcusdt_20240809.npz'])
        .initial_snapshot('usdm/btcusdt_20240808_eod.npz')
        .linear_asset(1.0) 
        .constant_latency(10_000_000, 10_000_000)
        .risk_adverse_queue_model() 
        .no_partial_fill_exchange()
        .trading_value_fee_model(0.0002, 0.0007)
        .tick_size(0.1)
        .lot_size(0.001)
)

hbt = HashMapMarketDepthBacktest([asset])

out = print_basis(hbt, spot)

_ = hbt.close()

current_timestamp: 1723161602500000000 futures_mid: 61659.85 , spot_mid: 61688.0 , basis: -28.14
current_timestamp: 1723161612500000000 futures_mid: 61713.95 , spot_mid: 61727.8 , basis: -13.85
current_timestamp: 1723161622500000000 futures_mid: 61713.45 , spot_mid: 61728.94 , basis: -15.5
current_timestamp: 1723161632500000000 futures_mid: 61666.05 , spot_mid: 61690.08 , basis: -24.02
current_timestamp: 1723161642500000000 futures_mid: 61638.45 , spot_mid: 61661.5 , basis: -23.06
current_timestamp: 1723161652500000000 futures_mid: 61632.05 , spot_mid: 61663.98 , basis: -31.93
current_timestamp: 1723161662500000000 futures_mid: 61578.15 , spot_mid: 61600.0 , basis: -21.85
current_timestamp: 1723161672500000000 futures_mid: 61524.25 , spot_mid: 61562.0 , basis: -37.74
current_timestamp: 1723161682500000000 futures_mid: 61552.45 , spot_mid: 61570.0 , basis: -17.54
current_timestamp: 1723161692500000000 futures_mid: 61593.05 , spot_mid: 61606.0 , basis: -12.96
current_timestamp: 172316170

In [3]:
import polars as pl
import holoviews as hv

df = pl.DataFrame(out).with_columns(
    pl.from_epoch('timestamp', time_unit='ns').alias('timestamp')
)

hv.extension('bokeh')



Although the sample covers only a short period, the basis appears to be mean-reverting. This may offer statistical arbitrage opportunities, particularly if you are eligible for rebates or zero fees.

In [4]:
df = df.with_columns(
    ((df['mid_price'] - df['spot_mid_price']) / df['mid_price'] * 10000).alias('basis_bp')
)

# Convert the Polars DataFrame to a Pandas DataFrame for plotting
pd_df = df.to_pandas()

# Initialize Holoviews with the Bokeh backend
hv.extension('bokeh')

# Create the plot using Holoviews
plot = hv.Curve(pd_df, 'timestamp', 'basis_bp', label='Basis BP')

# Customize the plot's appearance
plot.opts(
    xlabel='Timestamp',
    ylabel='Basis BP',
    title='Basis BP Over Time',
    width=800,
    height=400,
    show_legend=True  # This enables the legend
)

# Display the plot
plot
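
The mean-reversion observation above can be checked roughly by fitting an AR(1) coefficient to the basis series. The sketch below is not part of the original notebook: the helper `ar1_coefficient` is hypothetical, and it runs on synthetic data rather than on the backtest output. It also assumes the series contains no NaN values.

```python
import numpy as np

def ar1_coefficient(x):
    """Least-squares slope of x[t] on x[t-1] after removing the mean.

    Assumes x contains no NaN values.
    """
    x = np.asarray(x, dtype=np.float64)
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1]))

# Synthetic AR(1) series standing in for the basis (illustration only).
rng = np.random.default_rng(0)
phi_true = 0.9
basis = np.empty(5000)
basis[0] = 0.0
for k in range(1, len(basis)):
    basis[k] = phi_true * basis[k - 1] + rng.normal()

phi = ar1_coefficient(basis)
# Number of steps for a deviation to decay by half; meaningful only for 0 < phi < 1.
half_life = np.log(0.5) / np.log(phi)
print('phi:', round(phi, 3), 'half-life (steps):', round(half_life, 1))
```

A coefficient clearly below 1 is consistent with mean reversion, and the half-life gives a rough sense of how quickly deviations decay. To apply this to the notebook's data, you could pass `df['basis_bp'].to_numpy()` after dropping any rows where the spot price was not yet available.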