## Mean Reversion for a Singular Index - Testing Notebook

We will build a program that trades a mean reversion strategy on SPY, an ETF that tracks the S&P 500. The main purpose is to build familiarity with how the different libraries interact and how an optimized strategy can be backtested.

This will be done in three steps:
1. import data using openbb
2. build sk-learn model to optimize mean reversion strategy
3. walk-forward backtesting on historical data using vectorbt

In other programs, we will add a fourth step (train sk-learn model with inputs from ta-lib indicators), but the ta-lib indicators are not needed for the mean reversion strategy.
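
Step 3 relies on walk-forward testing: fit on one window, evaluate on the window that immediately follows, then slide both forward. The sketch below shows one way to generate such windows by row position. The helper name `walk_forward_splits`, the window sizes, and the synthetic price table are illustrative assumptions, not part of this notebook's pipeline.

```python
import numpy as np
import pandas as pd

def walk_forward_splits(n_rows, train_size, test_size):
    """Yield (train_positions, test_positions) for rolling walk-forward windows.

    Hypothetical helper: each test window starts right after its training
    window, and both slide forward by one test window per step.
    """
    start = 0
    while start + train_size + test_size <= n_rows:
        train_idx = np.arange(start, start + train_size)
        test_idx = np.arange(start + train_size, start + train_size + test_size)
        yield train_idx, test_idx
        start += test_size  # slide forward so test windows never overlap

# Synthetic stand-in for the price table (made-up values, not SPY data).
dates = pd.bdate_range("2017-01-02", periods=1000)
frame = pd.DataFrame({"Close": np.linspace(200.0, 380.0, len(dates))}, index=dates)

splits = list(walk_forward_splits(len(frame), train_size=500, test_size=100))
for train_idx, test_idx in splits:
    train, test = frame.iloc[train_idx], frame.iloc[test_idx]
    # Every training row precedes every test row, so no future data leaks in.
    print(train.index[0].date(), train.index[-1].date(), "->",
          test.index[0].date(), test.index[-1].date())
```

Fixed-size rolling windows are only one option. scikit-learn's `TimeSeriesSplit` gives expanding training windows instead, which may suit a strategy that should keep learning from all past data.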

### Import Libraries

In [42]:
from openbb_terminal.sdk import openbb
import talib
import vectorbt as vbt
import numpy as np
import pandas as pd
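import sklearn.linear_model  # a bare "import sklearn" is not guaranteed to load the linear_model submodule used below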

### Import Data Using OpenBB

In [33]:
# grab data

ohlcv = openbb.stocks.load(symbol="SPY", start_date="2016-03-18", end_date="2023-01-10")
print(ohlcv)
print(type(ohlcv))

INFO:openbb_terminal.stocks.stocks_helper:START
INFO:openbb_terminal.stocks.stocks_helper:{"INPUT": {"start_date": "2016-03-18", "interval": "1440", "end_date": "2023-01-10", "prepost": "False", "source": "YahooFinance", "weekly": "False", "monthly": "False", "verbose": "True", "symbol": "SPY", "chart": "False"}, "VIRTUAL_PATH": "stocks.load", "CHART": false}


INFO:openbb_terminal.stocks.stocks_helper:END


                  Open        High         Low       Close   Adj Close  \
date                                                                     
2016-03-18  178.956873  179.491544  178.632569  179.140945  179.140945   
2016-03-21  178.869229  179.631788  178.632568  179.395126  179.395126   
2016-03-22  178.597496  179.885966  178.430971  179.298706  179.298706   
2016-03-23  178.904332  179.097165  177.940166  178.115479  178.115479   
2016-03-24  177.054844  178.071597  176.826956  178.036530  178.036530   
...                ...         ...         ...         ...         ...   
2023-01-04  378.973683  381.644056  375.828598  379.547333  379.547333   
2023-01-05  377.529702  377.648380  374.602204  375.215393  375.215393   
2023-01-06  378.409930  384.977055  375.245076  383.819885  383.819885   
2023-01-09  386.084751  389.378214  383.414408  383.602295  383.602295   
2023-01-10  382.999018  386.361689  382.029765  386.292450  386.292450   

               Volume  Dividends  Sto

In [41]:
# build the 200-day rolling mean of the close and the forward price change used as the target

closes = ohlcv["Close"]

data = pd.DataFrame()

data['Close'] = closes
data['Rolling Average'] = closes.rolling(window=200).mean()
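# shift(-7) looks 7 rows (trading days) ahead, which is longer than one 5-day trading week
data['Close After Week'] = closes.shift(periods=-7)
data['Change'] = data['Close After Week'] - data['Close']
# keep 2017 through 2022; this drops the 2016 rows used to warm up the 200-day mean
data = data['2017-01-01':'2023-01-01']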

print(data)

                 Close  Rolling Average  Close After Week     Change
date                                                                
2017-01-03  200.629669       188.405286        201.778748   1.149078
2017-01-04  201.823288       188.517427        202.241943   0.418655
2017-01-05  201.662933       188.629248        201.529312  -0.133621
2017-01-06  202.384460       188.750593        201.974731  -0.409729
2017-01-09  201.716400       188.868993        201.226517  -0.489883
...                ...              ...               ...        ...
2022-12-23  378.706665       392.922739        375.215393  -3.491272
2022-12-27  377.213226       392.764674        383.819885   6.606659
2022-12-28  372.525238       392.598109        383.602295  11.077057
2022-12-29  379.230835       392.420448        386.292450   7.061615
2022-12-30  378.231873       392.191807               NaN        NaN

[1510 rows x 4 columns]
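
To make the target construction concrete, here is a minimal sketch on a toy series. The prices are made up and the column name `Close After` is chosen for illustration; only the `shift(periods=-7)` offset matches the cell above. It shows that the target looks seven rows ahead and that the last seven rows have no target, which is why the final row of the table above is `NaN`.

```python
import pandas as pd

# Ten consecutive trading rows with made-up prices 100..109 (illustrative only).
toy = pd.DataFrame({"Close": [float(p) for p in range(100, 110)]})

horizon = 7  # same offset as shift(periods=-7) in the cell above
toy["Close After"] = toy["Close"].shift(periods=-horizon)
toy["Change"] = toy["Close After"] - toy["Close"]

# Rows without a future close cannot be used as training targets.
labelled = toy.dropna()

print(toy)
print(len(labelled))
```

Any row whose target is `NaN` has to be dropped before fitting, since scikit-learn estimators such as `Lasso` reject `NaN` targets.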


### Build sk-learn Mean Reversion Model

In [46]:
# Takes the closes and rolling averages for a time period. The goal is to find optimal trading boundaries, deviation, stop loss, and profit cap (incomplete: for now it only regresses the forward change on the two inputs).

# Idea (end of Wednesday): add columns to the feature matrix to represent trading boundaries (a value of 100 when the close is too far above or below the rolling average) and the stop loss / profit cap (the price change over the previous three weeks).

def mean_reversion_training(prices):
    X = prices[["Close","Rolling Average"]].values
    y = prices["Change"].values

    lasso_model = sklearn.linear_model.Lasso(alpha=1.0)  # alpha is the L1 regularization strength
    lasso_model.fit(X, y)

    return lasso_model
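
The idea noted in the comments above (a boundary flag plus a trailing price change) could be sketched as follows. This is only one possible reading of that note: the function name `add_boundary_features`, the 5% band, the 15-row lookback (roughly three trading weeks), and the column names `Boundary Flag` and `Prior Change` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def add_boundary_features(prices, band=0.05, lookback=15, flag_value=100.0):
    """Return a copy of `prices` with two extra, hypothetical feature columns.

    Boundary Flag: flag_value when the close is more than `band` (as a
    fraction) above or below the rolling average, otherwise 0.
    Prior Change: close minus the close `lookback` rows earlier, so it
    uses past data only.
    """
    out = prices.copy()
    deviation = out["Close"] / out["Rolling Average"] - 1.0
    out["Boundary Flag"] = np.where(deviation.abs() > band, flag_value, 0.0)
    out["Prior Change"] = out["Close"] - out["Close"].shift(lookback)
    return out

# Tiny made-up example; a short lookback keeps the output readable.
demo = pd.DataFrame({
    "Close": [100.0, 102.0, 110.0, 94.0, 101.0],
    "Rolling Average": [100.0, 100.0, 100.0, 100.0, 100.0],
})
features = add_boundary_features(demo, band=0.05, lookback=2)
print(features)
```

The first `lookback` rows of `Prior Change` are `NaN` and would need to be dropped before the columns are passed to `mean_reversion_training`, which would also need its feature list extended to include them.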
    

### Backtest sk-learn Model with VectorBT

In [51]:
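# first-phase test (incomplete): train on 2017-2018, then predict January-May 2019

train = data.loc['2017-01-01':'2018-12-31']
test = data.loc['2019-01-01':'2019-05-30']

model = mean_reversion_training(train)
pred = model.predict(test[["Close", "Rolling Average"]].values)
real = test["Change"].values

print(pred)
print(real)

# reduce both series to direction only: +1 for a rise, -1 otherwise
pred_sim = np.where(pred > 0, 1, -1)
real_sim = np.where(real > 0, 1, -1)

# pred * real is positive when the predicted and actual changes share a sign
results = pred * real
# +1 for each correct direction call, -1 for each wrong one
results_sim = pred_sim * real_sim

total_result = np.sum(results)
total_result_sim = np.sum(results_sim)  # correct calls minus wrong calls

print(total_result)
print(total_result_sim)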

[ 1.66113742e+00  2.14969829e+00  1.47382399e+00  1.30817990e+00
  1.11088390e+00  1.01257193e+00  9.37280148e-01  9.29359799e-01
  1.05970395e+00  8.16862816e-01  7.66015408e-01  6.03784495e-01
  3.16453547e-01  6.13148043e-01  5.68920412e-01  5.58303403e-01
  3.75260747e-01  5.42207973e-01  5.71236415e-01  2.28657768e-01
  3.60236459e-02  2.58630609e-02 -1.29845651e-01 -2.22586336e-01
 -1.91490609e-01  2.37404727e-02 -1.81110831e-03 -1.26093100e-02
 -2.97587368e-01 -3.69135592e-01 -3.16915585e-01 -5.60872762e-01
 -5.97761978e-01 -6.41085414e-01 -5.57348625e-01 -6.96716381e-01
 -7.25643157e-01 -7.07200297e-01 -6.95723100e-01 -6.51475851e-01
 -7.93163611e-01 -7.07384672e-01 -6.74543263e-01 -5.34002137e-01
 -3.41809883e-01 -2.95778216e-01 -6.22201357e-01 -7.07237418e-01
 -8.57825606e-01 -8.41139320e-01 -9.53059028e-01 -1.03530530e+00
 -1.03870734e+00 -9.66504143e-01 -1.22742307e+00 -7.72694293e-01
 -7.54327105e-01 -9.25117753e-01 -8.02532735e-01 -8.89156474e-01
 -1.03431934e+00 -1.30959