<a href="https://colab.research.google.com/github/gingerchien/QuantHub/blob/main/vectorbt_for_beginners.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook compiles material from **Chad Thackray's Vectorbt for beginners - Full Python Course** (linked below), along with a few of my own ideas from working through the course:

https://www.youtube.com/watch?v=JOdEZMcvyac&list=PLKCjdQRzJEHzUc09EhgHD_to93yLnjT0P


# What is Vectorbt?

According to their website (https://vectorbt.dev/):
vectorbt is a Python package for quantitative analysis that takes a novel approach to backtesting: it operates entirely on pandas and NumPy objects, and is accelerated by Numba to analyze any data at speed and scale. This allows for testing of many thousands of strategies in seconds.

In contrast to other backtesters, vectorbt represents complex data as (structured) NumPy arrays. This enables superfast computation using vectorized operations with NumPy and non-vectorized but dynamically compiled operations with Numba. It also integrates Plotly and Jupyter Widgets to display complex charts and dashboards akin to Tableau right in the Jupyter notebook. Due to high performance, vectorbt can process large amounts of data even without GPU and parallelization and enables the user to interact with data-hungry widgets without significant delays.

# 0. Quickstart

* Documentation: https://vectorbt.dev/api/indicators/basic/
* Github: https://github.com/polakowo/vectorbt

In [6]:
#check python version
!python3 --version

Python 3.10.12


## Install Libraries

In [3]:
# ##install vectorbt
# !pip3 install vectorbt

# ##install Yahoo Finance
# !pip3 install yfinance

In [4]:
import vectorbt as vbt
import yfinance as yf
import datetime

## Download Data

In [5]:
end_date = datetime.datetime.now()
start_date= end_date - datetime.timedelta(days=7)
print(start_date)

#btc_price = vbt.YFData.download(['BTC-USD', 'ETH-USD'], interval='1m', start=start_date, end=end_date ,missing_index='drop').get('Close')
btc_price = vbt.YFData.download('BTC-USD', missing_index='drop').get('Close')

2023-12-14 22:23:14.053813


In [6]:
print(btc_price.tail())
print(btc_price.head())
print(type(btc_price))

Date
2023-12-17 00:00:00+00:00    41364.664062
2023-12-18 00:00:00+00:00    42623.539062
2023-12-19 00:00:00+00:00    42270.527344
2023-12-20 00:00:00+00:00    43652.250000
2023-12-21 00:00:00+00:00    43995.789062
Freq: D, Name: Close, dtype: float64
Date
2014-09-17 00:00:00+00:00    457.334015
2014-09-18 00:00:00+00:00    424.440002
2014-09-19 00:00:00+00:00    394.795990
2014-09-20 00:00:00+00:00    408.903992
2014-09-21 00:00:00+00:00    398.821014
Freq: D, Name: Close, dtype: float64
<class 'pandas.core.series.Series'>


## Technical Indicators

In [7]:
#rsi = vbt.RSI.run(btc_price, window=[14,21]) #pass a list to test several windows at once
rsi = vbt.RSI.run(btc_price, window=14)
print(rsi.rsi) #to extract the value, call rsi.rsi

Date
2014-09-17 00:00:00+00:00          NaN
2014-09-18 00:00:00+00:00          NaN
2014-09-19 00:00:00+00:00          NaN
2014-09-20 00:00:00+00:00          NaN
2014-09-21 00:00:00+00:00          NaN
                               ...    
2023-12-17 00:00:00+00:00    55.392152
2023-12-18 00:00:00+00:00    52.656277
2023-12-19 00:00:00+00:00    41.267414
2023-12-20 00:00:00+00:00    49.587286
2023-12-21 00:00:00+00:00    53.110773
Freq: D, Name: (14, Close), Length: 3383, dtype: float64
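
For intuition about what the RSI values above represent, here is a minimal pandas sketch of an RSI built on simple rolling averages. The function name `rsi_sketch` and the toy prices are made up for illustration, and vectorbt's own implementation (smoothing method, NaN handling) may differ, so the numbers are not expected to match `vbt.RSI` exactly.

```python
import pandas as pd

def rsi_sketch(close: pd.Series, window: int) -> pd.Series:
    """Hypothetical helper: RSI from simple rolling means of gains and losses."""
    delta = close.diff()                 # price change from one bar to the next
    gains = delta.clip(lower=0)          # positive changes, 0 otherwise
    losses = (-delta).clip(lower=0)      # magnitude of negative changes, 0 otherwise
    avg_gain = gains.rolling(window).mean()
    avg_loss = losses.rolling(window).mean()
    rs = avg_gain / avg_loss             # relative strength (inf when there are no losses)
    return 100 - 100 / (1 + rs)

toy_close = pd.Series([1.0, 2.0, 3.0, 2.0, 3.0, 4.0])  # made-up prices
toy_rsi = rsi_sketch(toy_close, window=2)
print(toy_rsi)
```

The first `window` values are NaN because the rolling averages need a full window of price changes, which is the same reason the real output above starts with NaN rows.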


### Turning the Indicator into True/False Signals for Processing with Vectorbt

In [8]:
entries = rsi.rsi_crossed_below(30)
#print(entries.to_string())

In [9]:
exits = rsi.rsi_crossed_above(70)
#print(exits.to_string())
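
To make the idea of a "cross" concrete, the sketch below reproduces the logic in plain pandas: a bar is flagged only when the value moves from one side of the threshold to the other, not on every bar spent beyond it. The helper names and the toy RSI values are made up for illustration, and vectorbt's own handling of equality and NaN values may differ in detail.

```python
import pandas as pd

def crossed_below(series: pd.Series, threshold: float) -> pd.Series:
    """Hypothetical helper: True on the bar where the series drops below the threshold."""
    prev = series.shift(1)
    return (series < threshold) & (prev >= threshold)

def crossed_above(series: pd.Series, threshold: float) -> pd.Series:
    """Hypothetical helper: True on the bar where the series rises above the threshold."""
    prev = series.shift(1)
    return (series > threshold) & (prev <= threshold)

toy_rsi_values = pd.Series([45.0, 31.0, 28.0, 25.0, 33.0, 72.0, 75.0, 68.0])  # made-up values
entries_sketch = crossed_below(toy_rsi_values, 30)
exits_sketch = crossed_above(toy_rsi_values, 70)
print(pd.DataFrame({"rsi": toy_rsi_values, "entry": entries_sketch, "exit": exits_sketch}))
```

Only one entry and one exit are flagged here, even though the toy RSI stays below 30 and above 70 for two bars each.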

In [10]:
#test with portfolio pf
pf = vbt.Portfolio.from_signals(btc_price, entries, exits)

#check what's going on
print(pf.stats())

Start                          2014-09-17 00:00:00+00:00
End                            2023-12-21 00:00:00+00:00
Period                                3383 days 00:00:00
Start Value                                        100.0
End Value                                      79.460413
Total Return [%]                              -20.539587
Benchmark Return [%]                         9520.056158
Max Gross Exposure [%]                             100.0
Total Fees Paid                                      0.0
Max Drawdown [%]                               89.870408
Max Drawdown Duration                 2175 days 00:00:00
Total Trades                                          37
Total Closed Trades                                   37
Total Open Trades                                      0
Open Trade PnL                                       0.0
Win Rate [%]                                   59.459459
Best Trade [%]                                 37.937622
Worst Trade [%]                

In [11]:
print(pf.total_return()) #can extract individual items from stats()

-0.2053958686215688
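
The value above is the same quantity as "Total Return [%]" in `pf.stats()`, expressed as a fraction instead of a percentage. A quick arithmetic check, using only the start and end values printed in the stats output (rounded as shown there):

```python
# Values copied from the pf.stats() output above
start_value = 100.0
end_value = 79.460413

# Total return as a fraction: end value relative to start value, minus 1
total_return = end_value / start_value - 1
print(total_return)        # fraction, comparable to pf.total_return()
print(total_return * 100)  # percentage, comparable to "Total Return [%]"
```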


In [12]:
#plot the results; this is the simplest plot in vectorbt
pf.plot().show()

# 1. Custom Indicators using the Indicator Factory
* Reference: https://vectorbt.dev/api/indicators/factory/#naive-approach

In [13]:
end_time = datetime.datetime.now()
start_time= end_time - datetime.timedelta(days=3)
btc_price = vbt.YFData.download('BTC-USD', missing_index='drop', start= start_time, end=end_time, interval='1m').get('Close')

In [14]:
print(btc_price)

Datetime
2023-12-18 22:24:00+00:00    42573.015625
2023-12-18 22:25:00+00:00    42545.664062
2023-12-18 22:26:00+00:00    42585.726562
2023-12-18 22:27:00+00:00    42581.765625
2023-12-18 22:28:00+00:00    42549.281250
                                 ...     
2023-12-21 22:16:00+00:00    43942.902344
2023-12-21 22:17:00+00:00    43943.816406
2023-12-21 22:19:00+00:00    43974.316406
2023-12-21 22:20:00+00:00    43992.187500
2023-12-21 22:21:00+00:00    43981.074219
Name: Close, Length: 4093, dtype: float64


## Define Custom Indicator

In [15]:
import numpy as np
def custom_indicator(close, rsi_window=14, ma_window=50):
  rsi = vbt.RSI.run(close, window=rsi_window).rsi.to_numpy() #.rsi extracts the indicator values; convert to a NumPy array so they can be turned into signals below
  ma = vbt.MA.run(close, window=ma_window).ma.to_numpy() #.ma extracts the moving average values, also as a NumPy array
  #print(rsi)
  #print(ma)
  #create signal
  trend = np.where(rsi>70, -1, 0)
  trend = np.where((rsi<30) & (close<ma) , 1, trend) #use the element-wise & operator here; Python's `and` does not work on arrays. For OR, use |
  #print(trend)
  return trend

#create indicator object so Vectorbt can recognize it for any custom indicators
indicator = vbt.IndicatorFactory(
    class_name='Combination',
    short_name='comb',
    input_names=['close'],
    param_names=['rsi_window', 'ma_window'],
    output_names=['value']
).from_apply_func(custom_indicator, rsi_window=14, ma_window=50) #point it to the custom indicator function and provide default parameter values

#run and get indicator values
res = indicator.run(btc_price, rsi_window=21, ma_window=50)
#print(res.value)
entries = res.value == 1.0
exits = res.value == -1.0

# https://vectorbt.dev/api/portfolio/base/#custom-metrics
pf = vbt.Portfolio.from_signals(btc_price, entries, exits, seed=42, freq='m')
print(pf.stats()) #if vectorbt cannot parse the frequency of the close series, it will not report durations in time units or any metrics that require annualization, and it will throw a bunch of warnings

Start                         2023-12-18 22:24:00+00:00
End                           2023-12-21 22:21:00+00:00
Period                                  2 days 20:13:00
Start Value                                       100.0
End Value                                    101.719236
Total Return [%]                               1.719236
Benchmark Return [%]                           3.307397
Max Gross Exposure [%]                            100.0
Total Fees Paid                                     0.0
Max Drawdown [%]                               2.841064
Max Drawdown Duration                   1 days 04:48:00
Total Trades                                         30
Total Closed Trades                                  30
Total Open Trades                                     0
Open Trade PnL                                      0.0
Win Rate [%]                                  76.666667
Best Trade [%]                                 0.542534
Worst Trade [%]                               -1

In [16]:
pf.plot().show()
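
The signal logic inside `custom_indicator` can be tried in isolation. The sketch below uses small made-up arrays to show how the two `np.where` calls layer on top of each other: the first marks exits with -1, the second overwrites with 1 wherever both entry conditions hold, and everything else stays 0.

```python
import numpy as np

# Made-up values for illustration only
rsi = np.array([75.0, 25.0, 25.0, 50.0, np.nan])
close = np.array([10.0, 9.0, 11.0, 10.0, 10.0])
ma = np.array([10.0, 10.0, 10.0, 10.0, 10.0])

# Step 1: -1 where RSI is overbought, 0 elsewhere
trend = np.where(rsi > 70, -1, 0)

# Step 2: 1 where RSI is oversold AND price is below its moving average.
# & combines the two boolean arrays element by element; `and` would raise
# an error because the truth value of a whole array is ambiguous.
trend = np.where((rsi < 30) & (close < ma), 1, trend)

print(trend)
```

The third element stays 0 because the RSI is oversold but the price is above the moving average, and the NaN element stays 0 because every comparison with NaN is False.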

## What if we want a 5 min indicator on the 1 min data?

In [17]:
import numpy as np

end_time = datetime.datetime.now()
start_time= end_time - datetime.timedelta(days=3)
btc_price = vbt.YFData.download(['BTC-USD', 'ETH-USD'], missing_index='drop', start= start_time, end=end_time, interval='1m').get('Close')

# https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.align.html
# broadcast_axis=0 is interchangeable with 'index'; join='right' keeps only the keys of the right object (close is the right object, rsi is the left one)
def custom_indicator2(close, rsi_window=14, ma_window=50):
  close_5m = close.resample('5min').last() #convert 1m candles to 5m ('5min' replaces the '5T' alias, which is deprecated in recent pandas)
  rsi = vbt.RSI.run(close_5m, window=rsi_window).rsi
  rsi, _ = rsi.align(close, broadcast_axis=0, method='ffill', join='right') #align the 5m RSI to the 1m index; the 4 minutes in between would be NaN, so forward fill closes the gaps

  #convert everything to numpy
  close = close.to_numpy()
  rsi = rsi.to_numpy()

  ma = vbt.MA.run(close, window=ma_window).ma.to_numpy()
  #print(rsi)
  #print(ma)
  #create signal
  trend = np.where(rsi>70, -1, 0)
  trend = np.where((rsi<30) & (close<ma) , 1, trend) #use the element-wise & operator here; Python's `and` does not work on arrays. For OR, use |
  #print(trend)
  return trend

#create indicator object so Vectorbt can recognize it for any custom indicators
indicator2 = vbt.IndicatorFactory(
    class_name='Combination',
    short_name='comb',
    input_names=['close'],
    param_names=['rsi_window', 'ma_window'],
    output_names=['value']
).from_apply_func(custom_indicator2, rsi_window=14, ma_window=50, keep_pd=True) #keep_pd=True keeps the inputs as pandas objects instead of converting them to NumPy arrays

#run and get indicator values
res = indicator2.run(btc_price, rsi_window=21, ma_window=50)
#print(res.value)
entries = res.value == 1.0
exits = res.value == -1.0

# https://vectorbt.dev/api/portfolio/base/#custom-metrics
pf = vbt.Portfolio.from_signals(btc_price, entries, exits, seed=42, freq='m')
print(pf.stats()) #if vectorbt cannot parse the frequency of the close series, it will not report durations in time units or any metrics that require annualization, and it will throw a bunch of warnings


Symbols have mismatching index. Dropping missing data points.



Start                         2023-12-18 22:24:00+00:00
End                           2023-12-21 22:21:00+00:00
Period                                  2 days 18:12:00
Start Value                                       100.0
End Value                                     98.235464
Total Return [%]                              -1.764536
Benchmark Return [%]                           2.273138
Max Gross Exposure [%]                            100.0
Total Fees Paid                                     0.0
Max Drawdown [%]                               4.111337
Max Drawdown Duration                   1 days 17:00:30
Total Trades                                        3.5
Total Closed Trades                                 3.5
Total Open Trades                                   0.0
Open Trade PnL                                      0.0
Win Rate [%]                                       37.5
Best Trade [%]                                  0.23966
Worst Trade [%]                               -1


Object has multiple columns. Aggregating using <function mean at 0x79a4f45d5cf0>. Pass column to select a single column/group.
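
The resample-then-forward-fill step in `custom_indicator2` is easier to follow on a tiny example. The sketch below uses ten made-up 1-minute closes and, as a swapped-in simplification, `reindex(..., method="ffill")` instead of `align`; it is meant to show the shape of the result, not to replace the code above.

```python
import numpy as np
import pandas as pd

# Ten made-up 1-minute closes: 100, 101, ..., 109
idx = pd.date_range("2023-12-18 22:00", periods=10, freq="min")
close = pd.Series(np.arange(100.0, 110.0), index=idx)

# Downsample to 5-minute bars, keeping the last close of each bar.
# With pandas defaults, each bar is labeled by its starting timestamp.
close_5m = close.resample("5min").last()

# Stretch the 5-minute values back onto the 1-minute index,
# forward filling the minutes between bar labels.
aligned = close_5m.reindex(close.index, method="ffill")

print(pd.DataFrame({"close_1m": close, "close_5m_aligned": aligned}))
```

Note that in this toy example the 5-minute value labeled 22:00 is the close from 22:04, so after forward filling, the 22:00 row already carries information from four minutes later. Whether this affects the indicator above is worth checking before trusting the backtest results.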



# 2. Hyperparameter Optimization

In [18]:
end_time = datetime.datetime.now()
start_time= end_time - datetime.timedelta(days=3)
btc_price = vbt.YFData.download(['BTC-USD', 'ETH-USD'], missing_index='drop', start= start_time, end=end_time, interval='1m').get('Close')

def custom_indicator2(close, rsi_window=14, ma_window=50, entry = 30, exit=70):
  close_5m = close.resample('5min').last()
  rsi = vbt.RSI.run(close_5m, window=rsi_window).rsi
  rsi, _ = rsi.align(close, broadcast_axis=0, method='ffill', join='right')

  #convert everything to numpy
  close = close.to_numpy()
  rsi = rsi.to_numpy()

  ma = vbt.MA.run(close, window=ma_window).ma.to_numpy()
  #print(rsi)
  #print(ma)
  #create signal
  trend = np.where(rsi>exit, -1, 0)
  trend = np.where((rsi<entry) & (close<ma) , 1, trend)
  #print(trend)
  return trend

#create indicator object so Vectorbt can recognize it for any custom indicators
indicator2 = vbt.IndicatorFactory(
    class_name='Combination',
    short_name='comb',
    input_names=['close'],
    param_names=['rsi_window', 'ma_window', 'entry', 'exit'],
    output_names=['value']
).from_apply_func(custom_indicator2, rsi_window=14, ma_window=50, entry=30, exit=70, keep_pd=True)

#run and get indicator values
res = indicator2.run(btc_price, rsi_window=np.arange(10,40,step=3, dtype=int), ma_window=np.arange(20,200,step=15, dtype=int), entry=np.arange(10,40,step=4, dtype=int),exit=np.arange(60,85,step=4, dtype=int), param_product=True) #param_product=True runs every combination of the parameter values

entries = res.value == 1.0
exits = res.value == -1.0

# https://vectorbt.dev/api/portfolio/base/#custom-metrics
pf = vbt.Portfolio.from_signals(btc_price, entries, exits, seed=42, freq='m')
returns = pf.total_return()#print out everything by using .to_string()
print(returns.max(), returns.idxmax()) #print out the max combination of returns

print(pf.stats()) #if vectorbt cannot parse the frequency of the close series, it will not report durations in time units or any metrics that require annualization, and it will throw a bunch of warnings


Symbols have mismatching index. Dropping missing data points.



0.0498419918599636 (25, 20, 22, 84, 'BTC-USD')
Start                         2023-12-18 22:24:00+00:00
End                           2023-12-21 22:22:00+00:00
Period                                  2 days 18:13:00
Start Value                                       100.0
End Value                                     99.732749
Total Return [%]                              -0.267251
Benchmark Return [%]                           2.332336
Max Gross Exposure [%]                             75.0
Total Fees Paid                                     0.0
Max Drawdown [%]                               3.312095
Max Drawdown Duration         1 days 20:27:30.767857142
Total Trades                                   3.382515
Total Closed Trades                            3.266146
Total Open Trades                              0.116369
Open Trade PnL                                 0.157175
Win Rate [%]                                  58.992323
Best Trade [%]                                 0.707419
W


Object has multiple columns. Aggregating using <function mean at 0x79a4f45d5cf0>. Pass column to select a single column/group.
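
To get a feel for how large this grid is, the sketch below rebuilds the four parameter ranges from the cell above and counts their Cartesian product with `itertools.product`. It assumes one backtest column per combination per symbol, which is what the five-level result index suggests; the variable names are made up for illustration.

```python
import itertools
import numpy as np

# Same ranges as passed to indicator2.run above
rsi_windows = np.arange(10, 40, step=3, dtype=int)   # 10 values
ma_windows = np.arange(20, 200, step=15, dtype=int)  # 12 values
entry_levels = np.arange(10, 40, step=4, dtype=int)  # 8 values
exit_levels = np.arange(60, 85, step=4, dtype=int)   # 7 values

# Every combination of the four parameters
combos = list(itertools.product(rsi_windows, ma_windows, entry_levels, exit_levels))

n_symbols = 2  # BTC-USD and ETH-USD
n_columns = len(combos) * n_symbols  # assumed: one column per combination per symbol

print(len(combos), n_columns)
```

With thousands of columns, `pf.stats()` has to average across all of them, which is why fractional values such as "Total Trades 3.38" appear in the output above.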



## To Show the Other Columns, Select Them Explicitly

In [19]:
# print(returns.to_string()) #print out everything by using .to_string()
# returns = returns[returns.index.isin(['ETH-USD'], level='symbol')]

# print(returns.max(), returns.idxmax()) #print out the max combination of returns
# #print(returns.to_string()) #print out everything by using .to_string()
# print(pf.stats()) #if vectorbt could not parse the frequency of the close, it will not return any duration in time units or metrics that requires annualization and throw a bunch of warnings
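
The filtering idea in the commented-out cell can be tried on a small stand-in for `returns`. The sketch below builds a made-up Series with a two-level index and keeps only the rows for one symbol; the level names mirror the ones vectorbt printed above, but the numbers are invented.

```python
import pandas as pd

# Made-up returns indexed by (comb_entry, symbol)
toy_index = pd.MultiIndex.from_product(
    [[10, 14], ["BTC-USD", "ETH-USD"]],
    names=["comb_entry", "symbol"],
)
toy_returns = pd.Series([0.01, -0.02, 0.03, 0.005], index=toy_index)

# Keep only the rows whose 'symbol' level is ETH-USD
eth_only = toy_returns[toy_returns.index.get_level_values("symbol") == "ETH-USD"]

# Best return and the parameter combination that produced it
print(eth_only.max(), eth_only.idxmax())
```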

## Creating Heatmaps

In [20]:
#plotly is used in vectorbt, x_level and y_level can be any parameters
fig = returns.vbt.heatmap(x_level='comb_rsi_window', y_level='comb_entry',
                          slider_level='comb_ma_window')#adding the slider, can use it to sort symbol best combo
fig.show()




## Group by Index Levels to Aggregate When There Are Many Parameters

In [21]:
#Group the returns by the exit and entry values (and symbol), with mean as the aggregation function
#For each entry/exit pair, the results of the remaining parameters are averaged
#This leaves only 3 index levels; everything else is averaged out
returns = returns.groupby(level=['comb_exit', 'comb_entry', 'symbol']).mean()
print(returns.to_string())
print(returns.max())
print(returns.idxmax())

comb_exit  comb_entry  symbol 
60         10          BTC-USD    0.000054
                       ETH-USD    0.000174
           14          BTC-USD    0.001014
                       ETH-USD   -0.001570
           18          BTC-USD    0.000496
                       ETH-USD   -0.004912
           22          BTC-USD   -0.000451
                       ETH-USD   -0.021145
           26          BTC-USD   -0.006729
                       ETH-USD   -0.028494
           30          BTC-USD   -0.010485
                       ETH-USD   -0.031855
           34          BTC-USD   -0.010198
                       ETH-USD   -0.038085
           38          BTC-USD   -0.005172
                       ETH-USD   -0.037322
64         10          BTC-USD    0.000662
                       ETH-USD    0.000581
           14          BTC-USD    0.001485
                       ETH-USD   -0.001959
           18          BTC-USD    0.001401
                       ETH-USD   -0.004958
           22          

In [22]:
fig = returns.vbt.heatmap(
    x_level='comb_entry',
    y_level='comb_exit',
    slider_level='symbol',
)
fig.show()
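
The `groupby(level=...).mean()` step that feeds this heatmap can be checked by hand on a tiny made-up Series. In the sketch below, the `comb_rsi_window` level is averaged away, leaving one value per (entry, symbol) pair; the level names follow the ones above, and the numbers are invented.

```python
import pandas as pd

# Made-up returns indexed by (comb_rsi_window, comb_entry, symbol)
toy_index = pd.MultiIndex.from_product(
    [[10, 13], [30, 34], ["BTC-USD", "ETH-USD"]],
    names=["comb_rsi_window", "comb_entry", "symbol"],
)
toy_returns = pd.Series([0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08], index=toy_index)

# Average over every level that is NOT listed, here comb_rsi_window
averaged = toy_returns.groupby(level=["comb_entry", "symbol"]).mean()
print(averaged)
```

For example, the (30, BTC-USD) entry is the mean of 0.01 and 0.05, the two rows that share that entry level and symbol.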

## Create Volume Graph with Multiple Parameters

In [23]:
#create indicator object so Vectorbt can recognize it for any custom indicators
indicator2 = vbt.IndicatorFactory(
    class_name='Combination',
    short_name='comb',
    input_names=['close'],
    param_names=['rsi_window', 'ma_window', 'entry', 'exit'],
    output_names=['value']
).from_apply_func(custom_indicator2, rsi_window=14, ma_window=50, entry=30, exit=70, keep_pd=True)

#run and get indicator values
res = indicator2.run(btc_price, rsi_window=np.arange(10,40,step=3, dtype=int),
                     #ma_window=np.arange(20,200,step=15, dtype=int),
                     entry=np.arange(10,40,step=4, dtype=int),
                     exit=np.arange(60,85,step=4, dtype=int), param_product=True) #param_product=True runs every combination of the parameter values

entries = res.value == 1.0
exits = res.value == -1.0

# https://vectorbt.dev/api/portfolio/base/#custom-metrics
pf = vbt.Portfolio.from_signals(btc_price, entries, exits, seed=42, freq='m')
returns = pf.total_return()#print out everything by using .to_string()
print(returns.max(), returns.idxmax()) #print out the max combination of returns

fig = returns.vbt.volume(
    x_level='comb_entry',
    y_level='comb_exit',
    z_level='comb_rsi_window',
    slider_level='symbol',
)

fig.show()

0.0498419918599636 (25, 22, 84, 'BTC-USD')


# 3. Optimization Techniques to Speedup Backtesting
* Use TA-Lib directly (through the vbt wrapper), as it is implemented in C and runs fast
* Use @njit to compile functions to machine code for faster execution




In [24]:
# #Google Colab runs on Linux environment, so download TA-Lib C library first
# !wget http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz
# !tar -xzvf ta-lib-0.4.0-src.tar.gz
# %cd ta-lib
# !./configure --prefix=/usr
# !make
# !make install
# %cd ..
# !rm -rf ta-lib ta-lib-0.4.0-src.tar.gz

# #Then install the Python wrapper using pip
# !pip install TA-Lib


In [25]:
import pandas as pd
import talib
import datetime
import numpy as np
from numba import njit #njit comes from Numba, not NumPy

#copied code from above to optimize
end_time = datetime.datetime.now()
start_time= end_time - datetime.timedelta(days=3)

#save btc_price to CSV for faster reloading; the YFData download can then be commented out for later optimization runs
btc_price = vbt.YFData.download(['BTC-USD', 'ETH-USD'], missing_index='drop', start= start_time, end=end_time, interval='1m').get('Close')
btc_price.to_csv('data.csv')
btc_price = pd.read_csv('data.csv')
btc_price['Datetime'] = pd.to_datetime(btc_price['Datetime'])
btc_price.set_index('Datetime', inplace=True)
print(btc_price)

# #narrow down to the BTC-USD column, only if we use TA-Lib directly without the vbt wrapper
# btc_price = btc_price['BTC-USD']
# print(btc_price)

#can use a vbt wrapper with TA-Lib
RSI = vbt.IndicatorFactory.from_talib('RSI')

#njit compiles the function to machine code so that it executes faster
@njit
def produce_signal(rsi, entry, exit):
  trend = np.where(rsi>exit, -1, 0)
  trend = np.where((rsi<entry), 1, trend)
  return trend


#revised and took out close_5m and ma to help make the example easier to run
def custom_indicator2(close, rsi_window=14, entry = 30, exit=70):
  #replace with talib function wrapper from vbt, this should run a bit quicker
  #rsi = vbt.RSI.run(close, window=rsi_window).rsi
  rsi = RSI.run(close,rsi_window).real.to_numpy()  #with the vbt wrapper, name the output to extract (.real); convert to NumPy so njit can be used
  return produce_signal(rsi, entry, exit)

#Removed all ma related items
indicator2 = vbt.IndicatorFactory(
    class_name='Combination',
    short_name='comb',
    input_names=['close'],
    param_names=['rsi_window', 'entry', 'exit'],
    output_names=['value']
).from_apply_func(custom_indicator2, rsi_window=14, entry=30, exit=70, to_2d=False) #set to_2d=False before using TA-Lib, since TA-Lib can't process 2 columns

#run and get indicator values
res = indicator2.run(btc_price, rsi_window=np.arange(10,40,step=5, dtype=int), entry=np.arange(10,40,step=3, dtype=int),exit=np.arange(60,85,step=3, dtype=int), param_product=True) #param_product=True runs every combination of the parameter values

entries = res.value == 1.0
exits = res.value == -1.0

# https://vectorbt.dev/api/portfolio/base/#custom-metrics
pf = vbt.Portfolio.from_signals(btc_price, entries, exits, seed=42, freq='m')
returns = pf.total_return()#print out everything by using .to_string()
print(returns.max(), returns.idxmax()) #print out the max combination of returns
#print(pf.stats()) #if vectorbt could not parse the frequency of the close, it will not return any duration in time units or metrics that requires annualization and throw a bunch of warnings


Symbols have mismatching index. Dropping missing data points.



                                BTC-USD      ETH-USD
Datetime                                            
2023-12-18 22:29:00+00:00  42508.500000  2219.254395
2023-12-18 22:30:00+00:00  42509.453125  2217.471924
2023-12-18 22:31:00+00:00  42479.878906  2212.724854
2023-12-18 22:32:00+00:00  42454.968750  2211.706299
2023-12-18 22:33:00+00:00  42467.015625  2213.386230
...                                 ...          ...
2023-12-21 22:22:00+00:00  43995.789062  2248.189941
2023-12-21 22:23:00+00:00  43983.839844  2247.550293
2023-12-21 22:24:00+00:00  44009.128906  2248.982422
2023-12-21 22:26:00+00:00  43992.410156  2248.053711
2023-12-21 22:27:00+00:00  43994.917969  2248.126709

[3972 rows x 2 columns]
0.05325968490814901 (15, 34, 81, 'BTC-USD')
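
As a variation on `produce_signal`, the sketch below writes the same signal rule as an explicit loop, which is the style Numba typically compiles well. The function name `produce_signal_loop` is made up for illustration, and the `try`/`except` fallback is only there so the example still runs (uncompiled) where Numba is not installed. It assumes a 1-D array of RSI values.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs without Numba (no speedup in that case)
    def njit(func):
        return func

@njit
def produce_signal_loop(rsi, entry, exit_level):
    """Hypothetical loop version: 1 = entry, -1 = exit, 0 = no signal."""
    out = np.zeros(rsi.shape[0], dtype=np.int64)
    for i in range(rsi.shape[0]):
        if rsi[i] < entry:          # oversold: entry signal takes priority
            out[i] = 1
        elif rsi[i] > exit_level:   # overbought: exit signal
            out[i] = -1
    return out

toy_signal = produce_signal_loop(np.array([25.0, 50.0, 75.0, np.nan]), 30.0, 70.0)
print(toy_signal)
```

The first call includes the one-time compilation cost, so any timing comparison should be made on a second call.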


# 4. Graphing/Dashboarding

# 5. Order Types

# 6. Avoid Over-fitting