# Hummingbot simple backtest

Assume we quote continuously at a given spread away from the best bid and ask. For each simulated fill, measure short-term profitability against the mid price 1,000 trades later, and compare the results across spread settings.

We use sample market data from [crypto-lake.com](https://crypto-lake.com/#data) for the FTRB-USDT market on AscendEX.
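The measurement described above can be illustrated on a tiny made-up dataset before running it on real data. This is a minimal sketch, not the notebook's actual cell: the helper name `simulate_fills`, the toy prices, and the horizon of 1 trade (instead of 1,000) are assumptions chosen to keep the arithmetic easy to follow.

```python
import pandas as pd

def simulate_fills(df: pd.DataFrame, spread_bps: float, horizon: int) -> pd.DataFrame:
    """Mark simulated fills and their markout against the mid `horizon` trades later."""
    out = df.copy()
    out['mid'] = (out.bid_0_price + out.ask_0_price) / 2
    out['future_price'] = out.mid.shift(-horizon)
    # Our quotes sit spread_bps outside the best bid and ask
    quote_bid = out.bid_0_price * (1 - spread_bps * 0.0001)
    quote_ask = out.ask_0_price * (1 + spread_bps * 0.0001)
    out['fill_sign'] = 0
    out.loc[quote_bid > out.price, 'fill_sign'] = 1   # trade printed below our bid: we buy
    out.loc[quote_ask < out.price, 'fill_sign'] = -1  # trade printed above our ask: we sell
    # Move of the mid price in our favour, in basis points (0 when there is no fill)
    out['gross_bps'] = 10_000 * (out.future_price - out.mid) / out.mid * out.fill_sign
    return out

# Hypothetical toy data: four trades with the prevailing top of book
toy = pd.DataFrame({
    'price':       [99.0, 100.5, 101.0, 99.5],
    'bid_0_price': [99.9, 100.4, 99.9, 99.4],
    'ask_0_price': [100.1, 100.6, 100.1, 99.6],
})
sim = simulate_fills(toy, spread_bps=50, horizon=1)
print(sim[['price', 'mid', 'future_price', 'fill_sign', 'gross_bps']])
```

On this toy data the first trade fills our bid and the third fills our ask. In both cases the mid then moves about 50 bps in our favour, so each fill shows a gross markout of roughly 50 bps before fees.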

In [None]:
import datetime
import math

import numpy as np
import pandas as pd
import cufflinks as cf

import lakeapi

cf.go_offline()
lakeapi.use_sample_data(anonymous_access=True)

In [None]:
# Parameters
symbol = 'FTRB-USDT'
exchange = 'ASCENDEX'
tick_size_decimals = 6
fees_bps = 16 # ascendex vip2, maker order, altcoin

start = datetime.datetime(2022, 9, 1)
end = datetime.datetime(2022, 10, 15)

## Data

In [None]:
def load_data(table: str):
	print('Loading', table)
	return lakeapi.load_data(
		table = table,
		start = start,
		end = end,
		symbols = [symbol],
		exchanges = [exchange],
		drop_partition_cols = True,
	).sort_values('received_time')

# Load l1 data = top of the order book
l1 = load_data('level_1')
# Load trades
trades = load_data('trades')

In [None]:
# Merge trades and l1 depth data
df = pd.merge_asof(
	left = trades.rename(columns = {'origin_time': 'trade_origin_time', 'received_time': 'trade_received_time'}),
	right = l1.rename(columns = {'received_time': 'depth_received_time'}),
	left_on = 'trade_received_time',
	right_on = 'depth_received_time',
	tolerance = pd.Timedelta(minutes = 120),
)
df['mid'] = (df.bid_0_price + df.ask_0_price) / 2
df.head(3)
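To see what the as-of merge does, here is a small sketch on made-up timestamps (the frames `trades_toy` and `l1_toy` are hypothetical). With the default backward direction, each trade picks up the most recent book snapshot received at or before it, and the `tolerance` leaves the book columns empty when that snapshot is older than 120 minutes.

```python
import pandas as pd

# Hypothetical trades, sorted by receive time
trades_toy = pd.DataFrame({
    'trade_received_time': pd.to_datetime([
        '2022-09-01 00:00:05', '2022-09-01 00:00:12', '2022-09-01 03:00:00',
    ]),
    'price': [1.00, 1.01, 1.02],
})
# Hypothetical top-of-book snapshots, sorted by receive time
l1_toy = pd.DataFrame({
    'depth_received_time': pd.to_datetime([
        '2022-09-01 00:00:00', '2022-09-01 00:00:10',
    ]),
    'bid_0_price': [0.99, 1.00],
    'ask_0_price': [1.01, 1.02],
})

merged = pd.merge_asof(
    left = trades_toy,
    right = l1_toy,
    left_on = 'trade_received_time',
    right_on = 'depth_received_time',
    tolerance = pd.Timedelta(minutes = 120),
)
print(merged)
```

The first two trades get the snapshots from 00:00:00 and 00:00:10. The third trade arrives almost three hours after the last snapshot, so its bid and ask are NaN and it can never count as a fill in the backtest.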

In [None]:
# df['spread'] = df.ask_0_price - df.bid_0_price
# df['spread_bps'] = (df.spread / df.mid) * 10_000 * coeff_pct / 100.

In [None]:
# def hbot_spread_density(spread, spread_max):
# 	return math.exp(-2*spread/spread_max) if spread <= spread_max else 0

## Backtest

In [None]:
results = []
# Show details of 100bps spread variant in the plot below
spread_to_plot = 100
df_to_plot = None
for spread_bps in range(20, 200, 10):
# for coeff_pct in range(50, 1500, 50):
	# df['spread_bps'] = (df.spread / df.mid) * 10_000 * coeff_pct / 100.
	# df['spread_bps'] = df['spread_bps'].clip(40, 400)
	df['spread_bps'] = spread_bps
	df['future_price'] = df.mid.shift(-1000)
	df['quote_bid'] = (df.bid_0_price * (1 - df.spread_bps * 0.0001)).round(tick_size_decimals)
	df['quote_ask'] = (df.ask_0_price * (1 + df.spread_bps * 0.0001)).round(tick_size_decimals)
	df['fill_sign'] = 0
	df.loc[df.quote_bid > df.price, 'fill_sign'] = 1
	df.loc[df.quote_ask < df.price, 'fill_sign'] = -1
	df['gross_profitability_short_bps'] = 10_000 * (df.future_price - df.mid) / df.mid * df.fill_sign 
	df['net_profitability_short_bps'] = (df['gross_profitability_short_bps'] - fees_bps) * df.fill_sign.abs()
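	# Count only fills with a known future price (the last 1000 rows have none), so the average is not diluted
	fill_count = df.gross_profitability_short_bps.where(df.fill_sign != 0).count()
	results.append((spread_bps, fill_count, df.gross_profitability_short_bps.sum() / fill_count))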
	# results.append((df.spread_bps.mean(), df.fill_sign.abs().sum(), df.gross_profitability_short_bps.mean()))
	if spread_to_plot == spread_bps:
		df_to_plot = df.copy()

result_df = pd.DataFrame(results, columns = ['spread_bps', 'fill_count', 'gross_profitability_short_bps'])
result_df['net_profitability_short_bps'] = result_df.gross_profitability_short_bps - fees_bps
result_df['total_gross_profit_bps'] = result_df.gross_profitability_short_bps * result_df.fill_count
result_df['total_net_profit_bps'] = result_df.net_profitability_short_bps * result_df.fill_count
result_df

### Plots

In [None]:
# Plot trades and the spread
df_to_plot.set_index('trade_received_time')[['bid_0_price', 'ask_0_price', 'price']][:3000].iplot(yTitle = 'price')

In [None]:
# df_to_plot.set_index('trade_received_time').net_profitability_short_bps.replace(0, np.nan).iplot(kind = 'hist', bins = 20)
# df_to_plot.set_index('trade_received_time').net_profitability_short_bps.cumsum().iplot()

## More data

The cell below lists the data available in the anonymous sample repository. Uncomment it to explore:

In [None]:
# for table in ('level_1', 'trades', 'book', 'candles'):
# 	available_data = pd.DataFrame(lakeapi.list_data(table = table))
# 	print(table)
# 	display(available_data[['exchange', 'symbol', 'dt']].groupby(['exchange', 'symbol']).aggregate({'dt': ['first', 'last']}))

## Conclusion

The average reward on FTRB-USDT is currently 358% p.a. at an average spread of 40 bps, which is roughly 30% per month. In this model scenario with a 40 bps spread, you would lose about 30% monthly on trading, so the rewards and the trading losses roughly cancel and leave you with almost exactly zero profit.
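The arithmetic behind that statement can be checked in a few lines. This sketch assumes the annual reward rate is converted to a monthly rate by simple division by 12 (no compounding) and that rewards and trading P&L are measured on the same capital. Both are simplifying assumptions, not something the backtest establishes.

```python
# Figures quoted in the text
reward_apr_pct = 358.0            # average reward, percent per year
trading_pnl_monthly_pct = -30.0   # approximate trading loss at 40 bps spread, percent per month

# Assumption: simple (non-compounded) conversion from yearly to monthly
reward_monthly_pct = reward_apr_pct / 12
net_monthly_pct = reward_monthly_pct + trading_pnl_monthly_pct

print(round(reward_monthly_pct, 1), round(net_monthly_pct, 1))  # 29.8 -0.2
```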

With a wider spread such as 100 bps, you can achieve better profitability of about +10% monthly, at the cost of fewer rewards. This looks much better, but note that this backtest is far from a complete implementation of Hummingbot trading logic: it gives a good idea of the profitability, but it is not very realistic.

You can add more logic to your quoting strategy to increase profitability, or to tighten the spread and earn more rewards. I also suggest you improve this notebook and open a pull request on GitHub to share it with the community. If you do, I will check your code for mistakes and may offer tips for further improvement.

### Follow ups

- How stable are the results over time? How can they be stabilized?
- How much reward would you earn at each spread? Which spread maximizes trading profit plus rewards?
- How does inventory balancing affect profitability? Can we simulate it?
- How does the lag between quoting and fill affect profitability? Can we simulate it?
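As one possible starting point for the first question, per-fill profitability can be grouped by calendar day to see how much it varies. This is only a sketch: the helper `daily_profitability` is hypothetical, and it assumes a frame shaped like `df_to_plot`, with `trade_received_time`, `fill_sign` and `net_profitability_short_bps` columns. The toy numbers are made up.

```python
import pandas as pd

def daily_profitability(df: pd.DataFrame) -> pd.DataFrame:
    """Aggregate net per-fill profitability (in bps) by calendar day."""
    # Keep only simulated fills whose future price, and hence profitability, is known
    fills = df[df.fill_sign != 0].dropna(subset = ['net_profitability_short_bps'])
    day = fills.trade_received_time.dt.floor('D')
    daily = fills.groupby(day).net_profitability_short_bps.agg(['count', 'mean', 'sum'])
    return daily.rename(columns = {
        'count': 'fill_count', 'mean': 'mean_net_bps', 'sum': 'total_net_bps',
    })

# Hypothetical toy input shaped like df_to_plot
toy_fills = pd.DataFrame({
    'trade_received_time': pd.to_datetime([
        '2022-09-01 10:00', '2022-09-01 12:00', '2022-09-02 09:00', '2022-09-02 10:00',
    ]),
    'fill_sign': [1, -1, 0, 1],
    'net_profitability_short_bps': [34.0, -10.0, 0.0, 20.0],
})
daily = daily_profitability(toy_fills)
print(daily)
```

A large day-to-day spread in `mean_net_bps` relative to its average would suggest that results for a single spread setting depend heavily on the sample period.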