<a href="https://colab.research.google.com/github/ahsank/StockML/blob/main/Backtest.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Backtesting
=======================

This notebook uses the *backtesting.py* Python framework for [backtesting](https://www.investopedia.com/terms/b/backtesting.asp) trading strategies. See the [Quick Start User Guide](https://github.com/kernc/backtesting.py/blob/master/doc/examples/Quick%20Start%20User%20Guide.ipynb) for an introduction.


## Data
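The DataFrame should ideally be indexed with a _datetime index_ (convert it with [`pd.to_datetime()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.to_datetime.html));
otherwise a simple range index will do. backtesting.py expects the price columns to be named `Open`, `High`, `Low`, `Close` and, optionally, `Volume`, which is why the loading code below title-cases the column names.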
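As a minimal sketch of that conversion, assume a hypothetical frame whose dates arrive as plain strings in a `Date` column. The column name and the prices are made up for illustration and are not part of the data loaded in this notebook.

```python
import pandas as pd

# Hypothetical price data with dates stored as plain strings.
raw = pd.DataFrame({
    "Date": ["2024-04-17", "2024-04-18", "2024-04-19"],
    "Close": [500.55, 499.52, 495.16],
})

# Parse the strings into timestamps and use them as the index.
raw["Date"] = pd.to_datetime(raw["Date"])
prices = raw.set_index("Date")

print(type(prices.index).__name__)  # DatetimeIndex
```

Data that already comes back date-indexed from the loader does not need this step.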

In [None]:
!pip install backtesting

In [None]:
!pip install yahoo_fin

In [446]:
from yahoo_fin import stock_info
tickers = ['ARKK', 'SPY', 'XLE', 'QQQ']
num_days = 1200  # roughly 4.8 years of trading days (about 252 per year)
dfs = {}
for ticker in tickers:
  df = stock_info.get_data(ticker, start_date='2016-01-01')
  df.columns = map(str.title, df.columns)
  # df['Unadjusted'] = df.Close
  # df.Close = df.Adjclose
  # df.drop('Adjclose', axis=1, inplace=True)
  # df = df.tail(num_days)
  dfs[ticker] = df


In [489]:
from yahoo_fin import stock_info
tickers = ['ARKK', 'SPY', 'XLE', 'QQQ', 'NVDA', 'MSFT', 'PTON', 'ETSY', 'COIN']
num_days = 1200  # roughly 4.8 years of trading days (about 252 per year)
dfs = {}
for ticker in tickers:
  df = stock_info.get_data(ticker, start_date='2016-01-01')
  df.columns = map(str.title, df.columns)
  df['Unadjusted'] = df.Close
  df.Close = df.Adjclose
  df.drop('Adjclose', axis=1, inplace=True)
  df = df.tail(num_days)
  dfs[ticker] = df

In [None]:
# Use Adjclose, Doesn't work
# spy.Close = spy.Adjclose
# spy.drop('Adjclose', axis=1, inplace=True)

In [431]:
import pandas as pd


def SMA(values, n):
    """
    Return simple moving average of `values`, at
    each step taking into account `n` previous values.
    """
    return pd.Series(values).rolling(n).mean()
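To make the rolling mean concrete, here is a plain-Python sketch of the same calculation. The helper name `sma_plain` is hypothetical, and `None` stands in for the `NaN` values that pandas reports until `n` values are available.

```python
def sma_plain(values, n):
    """Simple moving average without pandas; None until n values exist."""
    out = []
    for i in range(len(values)):
        if i + 1 < n:
            # Not enough history yet for a full window.
            out.append(None)
        else:
            window = values[i + 1 - n : i + 1]
            out.append(sum(window) / n)
    return out

print(sma_plain([1, 2, 3, 4, 5], 3))  # [None, None, 2.0, 3.0, 4.0]
```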

In [432]:
def ToSeries(values):
  return pd.Series(values)

In [464]:
from backtesting import Strategy
from backtesting.lib import crossover


class SmaCross(Strategy):
    # Define the two MA lags as *class variables*
    # for later optimization
    n1 = 5  # 10
    n2 = 200  # 20
    # Set to True to also buy on the very first bar.
    # (A flag named `init` would be shadowed by the init() method.)
    buy_at_start = False

    def init(self):
        # Precompute the two moving averages
        self.sma1 = self.I(SMA, self.data.Close, self.n1)
        self.sma2 = self.I(SMA, self.data.Close, self.n2)

    def next(self):
        if self.buy_at_start:
            self.buy()
            self.buy_at_start = False
        # If sma1 crosses above sma2, buy the asset
        if crossover(self.sma1, self.sma2):
            # self.position.close()
            self.buy()

        # Else, if sma1 crosses below sma2, close any open position
        elif crossover(self.sma2, self.sma1):
            self.position.close()
            # self.sell()
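The crossover test used by this strategy can be sketched in plain Python. The helper `crossed_above` is hypothetical and assumes that a crossover means the first series was below the second on the previous bar and is above it on the current one. It is an illustration, not the library's own code.

```python
def crossed_above(a, b):
    """True if series a was below b on the previous bar and is above it now."""
    if len(a) < 2 or len(b) < 2:
        return False  # need at least two bars to detect a cross
    return a[-2] < b[-2] and a[-1] > b[-1]

fast = [9.0, 10.0, 12.0]   # made-up fast moving average
slow = [11.0, 11.0, 11.0]  # made-up slow moving average

print(crossed_above(fast, slow))  # True
print(crossed_above(slow, fast))  # False
```

Swapping the arguments, as the strategy does in its `elif` branch, detects the downward cross.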

In [512]:
class SmaCross1(Strategy):
    # Define the two MA lags as *class variables*
    # for later optimization
    n1 = 5  # 10
    n2 = 200  # 20
    # Set to True to also buy on the very first bar.
    # (A flag named `init` would be shadowed by the init() method.)
    buy_at_start = False
    lastClose = 0  # bar count at the most recent exit from a long position
    daydiff = 20   # minimum number of bars to wait after an exit before buying again

    def init(self):
        # Precompute the two moving averages
        self.sma1 = self.I(SMA, self.data.Close, self.n1)
        self.sma2 = self.I(SMA, self.data.Close, self.n2)

    def next(self):
        if self.buy_at_start:
            self.buy()
            self.buy_at_start = False
        # If sma1 crosses above sma2 and the waiting period since the
        # last exit has passed, buy the asset
        if crossover(self.sma1, self.sma2):
            # print(self.sma1.size, self.lastClose)
            if self.sma1.size > self.lastClose + self.daydiff:
                # self.position.close()
                self.buy()

        # Else, if sma1 crosses below sma2, remember when the long
        # position was exited and close it
        elif crossover(self.sma2, self.sma1):
            if self.position.is_long:
                self.lastClose = self.sma1.size
            self.position.close()
            # self.sell()
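The waiting-period check in this strategy compares the current bar count with the bar count recorded at the last exit. A small sketch of that comparison follows; the helper name `cooldown_elapsed` and the bar numbers are made up for illustration.

```python
def cooldown_elapsed(current_bar, last_exit_bar, cooldown_bars=20):
    """True once more than `cooldown_bars` bars have passed since the last exit."""
    return current_bar > last_exit_bar + cooldown_bars

# With an exit at bar 100, bar 120 is still inside the waiting period
# and bar 121 is the first bar on which a new buy is allowed.
print(cooldown_elapsed(120, 100))  # False
print(cooldown_elapsed(121, 100))  # True
```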

In [450]:
class AboveSma(Strategy):
    # Define the two MA lags as *class variables*
    # for later optimization
    n1 = 5
    n2 = 200
    n3 = 100 # Should be above 1% SMA

    def init(self):
        # Precompute the two moving averages
        self.sma1 = self.I(SMA, self.data.Close, self.n1)
        self.sma2 = self.I(SMA, self.data.Close, self.n2)
        self.Close = self.I(ToSeries, self.data.Close)

    def next(self):
        # If price crosses above sma1 and sma2, close any existing
        # short trades, and buy the asset
        if crossover(self.Close, self.sma1) and crossover(self.Close, self.sma2):
            # self.position.close()
            self.buy()

        # Else, if price crosses below sma1 and sma2, close any existing
        # long trades
        elif crossover(self.sma1, self.Close) and crossover(self.sma2, self.Close):
            self.position.close()
            # self.sell()

In [None]:
class CautiousSma(Strategy):
    # Define the two MA lags as *class variables*
    # for later optimization
    n1 = 5
    n2 = 200

    def init(self):
        # Precompute the two moving averages
        self.sma1 = self.I(SMA, self.data.Close, self.n1)
        self.sma2 = self.I(SMA, self.data.Close, self.n2)
        self.Close = self.I(ToSeries, self.data.Close)

    def next(self):
        # If price crosses above sma1 and sma2, close any existing
        # short trades, and buy the asset
        if crossover(self.Close, self.sma1) and crossover(self.Close, self.sma2):
            self.position.close()
            self.buy()
        # Else, if price crosses below sma1 or sma2, close any existing
        # long trades
        elif crossover(self.sma1, self.Close) or crossover(self.sma2, self.Close):
            self.position.close()
            # self.sell()

In [442]:
def HighRange(values, m, n=0):
    """
    Return the highest value over the `m` bars ending `n` bars ago.
    """
    return pd.Series(values).shift(n).rolling(m).max()

In [443]:
def LowRange(values, m, n=0):
    """
    Return the lowest value over the `m` bars ending `n` bars ago.
    """
    return pd.Series(values).shift(n).rolling(m).min()
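To see how the `m` and `n` arguments interact, the sketch below applies the same shift-then-roll expression to a handful of made-up prices. The function `high_range` is a local copy written for this example.

```python
import pandas as pd

def high_range(values, m, n=0):
    # Shift the series back by n bars, then take an m-bar rolling maximum.
    return pd.Series(values).shift(n).rolling(m).max()

closes = [1, 5, 3, 2, 4, 6]  # made-up prices

recent = high_range(closes, 2)      # max over the latest 2 bars
earlier = high_range(closes, 2, 2)  # max over the 2 bars before those

print(recent.tolist())   # [nan, 5.0, 5.0, 3.0, 4.0, 6.0]
print(earlier.tolist())  # [nan, nan, nan, 5.0, 5.0, 3.0]
```

With `m = n`, the two calls describe back-to-back windows, for example the current period and the one immediately before it.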

In [None]:
dfs['SPY']

| | Open | High | Low | Close | Volume | Ticker | Unadjusted |
|---|---|---|---|---|---|---|---|
| 2016-01-04 | 200.490005 | 201.029999 | 198.589996 | 174.043228 | 222353500 | SPY | 201.020004 |
| 2016-01-05 | 201.399994 | 201.899994 | 200.050003 | 174.337585 | 110845800 | SPY | 201.360001 |
| 2016-01-06 | 198.339996 | 200.059998 | 197.600006 | 172.138428 | 152112600 | SPY | 198.820007 |
| 2016-01-07 | 195.330002 | 197.440002 | 193.589996 | 168.008560 | 213436100 | SPY | 194.050003 |
| 2016-01-08 | 195.190002 | 195.850006 | 191.580002 | 166.164444 | 209817200 | SPY | 191.919998 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-04-17 | 506.049988 | 506.220001 | 499.119995 | 500.549988 | 75910300 | SPY | 500.549988 |
| 2024-04-18 | 501.980011 | 504.130005 | 498.559998 | 499.519989 | 74548100 | SPY | 499.519989 |
| 2024-04-19 | 499.440002 | 500.459991 | 493.859985 | 495.160004 | 102129100 | SPY | 495.160004 |
| 2024-04-22 | 497.829987 | 502.380005 | 495.429993 | 499.720001 | 67763400 | SPY | 499.720001 |


In [517]:
class AboveSmaAndLY(Strategy):
    # Define the two MA lags as *class variables*
    # for later optimization
    n1 = 5
    n2 = 200
    n3 = 200
    n4 = 100
    useyh = False
    daydiff = 20
    lastClose = 0

    def init(self):
        # Precompute the two moving averages
        self.sma1 = self.I(SMA, self.data.Close, self.n1)
        self.sma2 = self.I(SMA, self.data.Close, self.n2)
        self.Close = self.I(ToSeries, self.data.Close)
        self.YHigh = self.I(HighRange, self.data.Close, self.n3)
        self.LYHigh = self.I(HighRange, self.data.Close, self.n3, self.n3)
        self.YLow = self.I(LowRange, self.data.Close, self.n3)
        self.LYLow = self.I(LowRange, self.data.Close, self.n3, self.n3)

    def next(self):
        close = self.Close[-1]
        closeadj = close*self.n4/100.0
        # If price is above both SMAs, the waiting period since the last exit
        # has passed, and (when useyh is set) price clears the range threshold, buy
        if close > self.sma1[-1] and \
              close > self.sma2[-1] and \
              self.Close.size > self.lastClose + self.daydiff and \
              (close > (self.YLow[-1] + self.LYHigh[-1] + self.YHigh[-1])/3 or not self.useyh):
              # self.position.close()
              # print(self.Close[-1])
              self.buy()

        # Else, if the n4-scaled price is below both SMAs (and, when useyh is
        # set, below the average of the two range highs), close any open position
        elif (self.sma1[-1] > closeadj and \
            self.sma2[-1] > closeadj and \
            ((self.LYHigh[-1] + self.YHigh[-1])/2 > closeadj or not self.useyh)):
            # print(self.position.pl_pct)
            if self.position.is_long:
              self.lastClose = self.Close.size
            self.position.close()
            # self.sell()

In [522]:
from backtesting import Backtest

strategy = AboveSmaAndLY
bt = Backtest(dfs['COIN'], strategy, cash=10_000, commission=0)
stats = bt.run()
stats

Start                     2021-04-14 00:00:00
End                       2024-04-23 00:00:00
Duration                   1105 days 00:00:00
Exposure Time [%]                   32.152231
Equity Final [$]                 30615.019882
Equity Peak [$]                  36376.618851
Return [%]                         206.150199
Buy & Hold Return [%]              -28.295967
Return (Ann.) [%]                   44.778063
Volatility (Ann.) [%]               72.115122
Sharpe Ratio                         0.620925
Sortino Ratio                        1.651856
Calmar Ratio                         1.209061
Max. Drawdown [%]                  -37.035416
Avg. Drawdown [%]                  -13.075365
Max. Drawdown Duration      161 days 00:00:00
Avg. Drawdown Duration       33 days 00:00:00
# Trades                                    4
Win Rate [%]                             25.0
Best Trade [%]                     265.990181
Worst Trade [%]                    -14.899213
Avg. Trade [%]                    

## Backtesting

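Run the selected strategy on every ticker and collect the resulting statistics. See
[`Backtest`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Backtest) for the available options.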


In [523]:
from backtesting import Backtest

cols = ['Start', 'End', 'Duration', 'Exposure Time [%]', 'Equity Final [$]',
       'Equity Peak [$]', 'Return [%]', 'Buy & Hold Return [%]',
       'Return (Ann.) [%]', 'Volatility (Ann.) [%]', 'Sharpe Ratio',
       'Sortino Ratio', 'Calmar Ratio', 'Max. Drawdown [%]',
       'Avg. Drawdown [%]', 'Max. Drawdown Duration', 'Avg. Drawdown Duration',
       '# Trades', 'Win Rate [%]', 'Best Trade [%]', 'Worst Trade [%]',
       'Avg. Trade [%]', 'Max. Trade Duration', 'Avg. Trade Duration',
       'Profit Factor', 'Expectancy [%]', 'SQN', ]
starr = {}
bts = {}
for ticker in dfs.keys():
  tmpbt = Backtest(dfs[ticker], strategy, cash=10_000, commission=0)
  tmpstats = tmpbt.run()
  starr[ticker] = tmpstats
  bts[ticker] = tmpbt




In [524]:
statsdf = pd.DataFrame(starr).T[cols]
statsdf.mean()

Start                     2019-10-04 10:40:00
End                       2024-04-23 00:00:00
Duration                   1662 days 13:20:00
Exposure Time [%]                   33.257068
Equity Final [$]                 19301.502714
Equity Peak [$]                  23461.610826
Return [%]                          93.015027
Buy & Hold Return [%]                249.8933
Return (Ann.) [%]                   10.070382
Volatility (Ann.) [%]               25.709183
Sharpe Ratio                         0.379535
Sortino Ratio                        0.817973
Calmar Ratio                         0.421818
Max. Drawdown [%]                  -32.961447
Avg. Drawdown [%]                   -14.13938
Max. Drawdown Duration      740 days 13:20:00
Avg. Drawdown Duration      272 days 00:00:00
# Trades                             6.111111
Win Rate [%]                        27.561728
Best Trade [%]                      85.754489
Worst Trade [%]                    -12.331429
Avg. Trade [%]                    



[`Backtest.plot()`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Backtest.plot)
method provides the same insights in a more visual form.

In [459]:
bt.plot()



## Optimization

Optimize the two moving-average parameters, `n1` and `n2`, by calling
[`Backtest.optimize()`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Backtest.optimize).


In [460]:
%%time

# stats = bt.optimize(n1=range(5, 30, 5),
#                     n2=range(200, 250, 10),
#                     n3 = range(100, 110, 1),
#                     maximize='Equity Final [$]',
#                     constraint=lambda param: param.n1 < param.n2)
stats = bt.optimize(n1=range(5, 30, 5),
                    n2=range(20, 250, 10),
                    maximize='Equity Final [$]',
                    constraint=lambda param: param.n1 < param.n2)
stats


CPU times: user 195 ms, sys: 75.3 ms, total: 271 ms
Wall time: 9.94 s


Start                     2016-01-04 00:00:00
End                       2024-04-23 00:00:00
Duration                   3032 days 00:00:00
Exposure Time [%]                    72.77512
Equity Final [$]                  21939.16806
Equity Peak [$]                  22858.937378
Return [%]                         119.391681
Buy & Hold Return [%]              151.392894
Return (Ann.) [%]                    9.936608
Volatility (Ann.) [%]               12.282316
Sharpe Ratio                         0.809017
Sortino Ratio                         1.22591
Calmar Ratio                         0.643375
Max. Drawdown [%]                  -15.444507
Avg. Drawdown [%]                   -1.371044
Max. Drawdown Duration      752 days 00:00:00
Avg. Drawdown Duration       23 days 00:00:00
# Trades                                   10
Win Rate [%]                             50.0
Best Trade [%]                       44.34491
Worst Trade [%]                      -3.67344
Avg. Trade [%]                    

Check `stats['_strategy']` to see which parameter values the optimizer selected.

In [461]:
stats._strategy

<Strategy SmaCross(n1=5,n2=200)>

In [527]:
bts['NVDA'].plot(plot_volume=False, plot_pl=False)



In the optimization run above, the optimized strategy returned about 119% _on in-sample data_, which still trails the roughly 151% return of simple
[buy & hold](https://en.wikipedia.org/wiki/Buy_and_hold) over the same period.
In real-life optimization, moreover, do **take steps to avoid
[overfitting](https://en.wikipedia.org/wiki/Overfitting)**.
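One common precaution is to tune parameters on an earlier slice of the data and then judge them on a later slice that the optimizer never saw. The sketch below shows only the chronological split. The helper `split_in_out_of_sample`, the 75% fraction, and the prices are assumptions made for illustration.

```python
def split_in_out_of_sample(bars, in_sample_fraction=0.75):
    """Split chronologically ordered bars into (in_sample, out_of_sample)."""
    if not 0 < in_sample_fraction < 1:
        raise ValueError("in_sample_fraction must be between 0 and 1")
    cut = int(len(bars) * in_sample_fraction)
    # Earlier bars are used for tuning, later bars are held back for checking.
    return bars[:cut], bars[cut:]

closes = [100, 101, 99, 102, 104, 103, 105, 107]  # made-up prices
in_sample, out_of_sample = split_in_out_of_sample(closes)

print(len(in_sample), len(out_of_sample))  # 6 2
```

The split is by position rather than at random, so the held-back bars always come after the bars used for tuning.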

## Trade data

In addition to backtest statistics returned by
[`Backtest.run()`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Backtest.run)
shown above, you can look into _individual trade returns_ and the changing _equity curve_ and _drawdown_ by inspecting the last few, internal keys in the result series.

In [None]:
stats.tail()

Expectancy [%]                                            14.702499
SQN                                                        1.231193
_strategy                                AboveSmaAndLY(n1=5,n2=190)
_equity_curve                        Equity  DrawdownPct Drawdow...
_trades               Size  EntryBar  ExitBar  EntryPrice   Exit...
dtype: object

The columns should be self-explanatory.

In [None]:
starr['NVDA']['_equity_curve']  # Contains equity/drawdown curves. DrawdownDuration is only defined at ends of DD periods.

| | Equity | DrawdownPct | DrawdownDuration |
|---|---|---|---|
| 2016-01-04 | 10000.000000 | 0.000000 | NaT |
| 2016-01-05 | 10000.000000 | 0.000000 | NaT |
| 2016-01-06 | 10000.000000 | 0.000000 | NaT |
| 2016-01-07 | 10000.000000 | 0.000000 | NaT |
| 2016-01-08 | 10000.000000 | 0.000000 | NaT |
| ... | ... | ... | ... |
| 2024-04-17 | 252202.130066 | 0.115400 | NaT |
| 2024-04-18 | 254110.143982 | 0.108708 | NaT |
| 2024-04-19 | 228697.137390 | 0.197844 | NaT |
| 2024-04-22 | 238651.135193 | 0.162931 | NaT |


In [529]:
stats = starr['NVDA']
stats['_trades'].tail(50)  # Contains individual trade data

| | Size | EntryBar | ExitBar | EntryPrice | ExitPrice | PnL | ReturnPct | EntryTime | ExitTime | Duration |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 77 | 414 | 639 | 128.404999 | 220.119995 | 7062.054718 | 0.714263 | 2021-03-10 | 2022-01-28 | 324 days |
| 1 | 70 | 660 | 664 | 242.910004 | 228.169998 | -1031.800385 | -0.060681 | 2022-03-01 | 2022-03-07 | 6 days |
| 2 | 86 | 859 | 862 | 185.309998 | 168.639999 | -1433.619843 | -0.089957 | 2022-12-13 | 2022-12-16 | 3 days |
| 3 | 85 | 883 | 1199 | 170.360001 | 807.950012 | 54195.150986 | 3.742604 | 2023-01-19 | 2024-04-23 | 460 days |
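Per-trade figures such as the win rate can be recomputed from the `ReturnPct` column. The sketch below hard-codes the four NVDA values listed above so that it runs on its own. In the notebook the same numbers would come from `stats['_trades']`, and the variable names here are illustrative.

```python
# ReturnPct values of the four NVDA trades, stored as fractions (0.71 = 71%).
return_pct = [0.714263, -0.060681, -0.089957, 3.742604]

# A trade counts as a win when its return is positive.
wins = [r for r in return_pct if r > 0]
win_rate = 100 * len(wins) / len(return_pct)

best_trade = max(return_pct)
worst_trade = min(return_pct)

print(win_rate)  # 50.0
print(best_trade, worst_trade)
```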


Learn more by exploring further
[examples](https://kernc.github.io/backtesting.py/doc/backtesting/index.html#tutorials)
or find more framework options in the
[full API reference](https://kernc.github.io/backtesting.py/doc/backtesting/index.html#header-submodules).