
# Introduction
This notebook demonstrates how to train a neural network to predict whether the next trading day will favor momentum or reversion risk exposure.

# Get Label
To train and test the model, you need labels. Let the label be 1 when the next day favors momentum factor exposure over reversal factor exposure and 0 otherwise. You can get the daily performance of the momentum and reversal factors from the [Kenneth R. French Data Library](https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html). Specifically, the following CSV files provide the factors you need:
 - [Momentum Factor (Mom) [Daily]](https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Momentum_Factor_daily_CSV.zip)
 - [Short-Term Reversal Factor (ST Rev) [Daily]](https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_ST_Reversal_Factor_daily_CSV.zip)

The following code block shows how to load the files from the QuantConnect CDN and calculate the labels.

In [1]:
import plotly.graph_objects as go

def get_ff_factor(qb, file_name, column_name):
    df = pd.read_csv(
        f"https://cdn.quantconnect.com/ff/{file_name}", 
        skiprows=13, skipfooter=1, engine='python', index_col=0)
    factor = df[column_name] / 100
    # Each index day represents the day of the return, not EoD.
    factor.index = pd.to_datetime(factor.index, format='%Y%m%d') 
    # Move index to EoD to align with technical indicators.
    factor.index += timedelta(1) 
    return factor 

qb = QuantBook()
momentum = get_ff_factor(qb, "F-F_Momentum_Factor_daily.csv", "Mom   ")
reversion = get_ff_factor(qb, "F-F_ST_Reversal_Factor_daily.csv", "ST_Rev")
# Label 1 = Momentum regime; Label 0 = Reversion regime.
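# Align on common dates first. Comparing against NaN yields False (not NaN),
# so a later dropna() wouldn't remove days that are missing a factor.
momentum, reversion = momentum.align(reversion, join='inner')
label = (momentum > reversion).astype(int)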
# The label is the regime of the following day, not the day that just 
# passed.
label = label.shift(-1).dropna().astype(int)
label_sma = label.rolling(30).mean()
# Plot the last 3 years.
period = 3*252
layout = dict(
    title="Labels<br><sup>Regime changes occur when one factor becomes "
        + "more profitable than the other factor</sup>", 
    xaxis_title="Date", 
    yaxis_title="Label (0=Reversion; 1=Momentum)"
)
go.Figure(
    [
        go.Scatter(
            x=label.index[-period:], y=label.iloc[-period:], name='Label'
        ),
        go.Scatter(
            x=label_sma.index[-period:], y=label_sma.iloc[-period:], 
            name='30-Day SMA'
        )
    ],
    layout
).show()
# Plot March 2020.
idx = label[
    (datetime(2020, 3, 1) <= label.index) & 
    (label.index <= datetime(2020, 4, 1))
].index
go.Figure(
    [
        go.Scatter(
            x=label.loc[idx].index, y=label.loc[idx], name='Label'
        ),
        go.Scatter(
            x=label_sma.loc[idx].index, y=label_sma.loc[idx], name='30-Day SMA'
        )
    ],    
    layout
).show()
print(label, "\n")
print(label.value_counts())
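
To make the one-day shift concrete, here is a minimal sketch on made-up factor returns. The five values per series are hypothetical and are not Fama-French data. It shows how comparing the two series gives the same-day regime and how `shift(-1)` turns that into a label for the following day.

```python
import pandas as pd

# Hypothetical daily factor returns (not real Fama-French data).
dates = pd.date_range("2024-01-01", periods=5, freq="D")
toy_momentum = pd.Series([0.010, -0.002, 0.004, -0.006, 0.003], index=dates)
toy_reversion = pd.Series([0.002, 0.001, -0.003, 0.005, 0.001], index=dates)

# 1 when momentum beat reversion on that same day, 0 otherwise.
same_day_regime = (toy_momentum > toy_reversion).astype(int)

# Shift back one row so each date carries the regime of the NEXT day.
# The last date has no next day, so it drops out.
next_day_label = same_day_regime.shift(-1).dropna().astype(int)

print(pd.DataFrame({"same_day": same_day_regime, "next_day": next_day_label}))
```

The last row has no `next_day` value, which is why the labeled series is one observation shorter than the factor series.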

# Get Factors
The next step is to gather the factors. Let's use the VIX closing price and the following indicators of SPY, an ETF that tracks the S&P 500 Index:
 - Relative strength index
 - Average true range
 - Standard deviation of daily returns

After you gather all the factors, standardize them so you can use them as input to the neural network.

In [2]:
spy_symbol = qb.add_equity("SPY", Resolution.DAILY).symbol

# Define the parameters.
period = 21
start_date = datetime(1990, 1, 1)
end_date = datetime(2024, 1, 1)

# Get the RSI indicator data.
rsi = qb.indicator(
    RelativeStrengthIndex(period), spy_symbol, start_date, end_date
)['relativestrengthindex']
rsi.name = 'rsi'

# Get the ATR indicator (normalized by SMA of price) data.
atr = qb.indicator(
    AverageTrueRange(period, MovingAverageType.SIMPLE), 
    spy_symbol, start_date, end_date
)['averagetruerange']
atr /= qb.indicator(
    SimpleMovingAverage(period), spy_symbol, start_date, end_date
)['simplemovingaverage']
atr.name = 'atr'

# Get the STD indicator (STD of daily returns) data.
std = qb.history(spy_symbol, start_date, end_date).loc[spy_symbol][
    'close'].pct_change().dropna().rolling(period).std().dropna()
std.name = 'std'

# Get the VIX factor data.
vix_symbol = qb.add_data(CBOE, "VIX", Resolution.DAILY).symbol
vix = qb.history(vix_symbol, start_date, end_date).loc[vix_symbol]['close']
vix.name = "vix"

# Combine factors into a matrix.
factors = pd.concat([rsi, atr, std, vix], axis=1).dropna()

# Standardize factors for the neural network.
from sklearn.preprocessing import StandardScaler
factors = pd.DataFrame(
    StandardScaler().fit_transform(factors.values),  
    columns=factors.columns, index=factors.index
)

# Display the factors.
period = 252
go.Figure(
    [
        go.Scatter(
            x=factors.index[-period:], y=factors[f].iloc[-period:], name=f
        ) 
        for f in factors.columns
    ],
    dict(
        title="Factor Values<br><sup>The factors shown here are diverse "
            + "yet standardized to a similar scale</sup>", 
        xaxis_title="Date", yaxis_title="Value"
    )
).show()
factors
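
As a rough illustration of what standardization does, the sketch below reproduces the z-score calculation by hand with NumPy. It assumes `StandardScaler` with its default settings, which subtracts each column's mean and divides by its population standard deviation. The raw values are hypothetical.

```python
import numpy as np

# Hypothetical raw factor matrix: rows are days, columns are two factors
# on very different scales (for example, an RSI-like and a VIX-like value).
raw = np.array([
    [30.0, 12.0],
    [50.0, 18.0],
    [70.0, 30.0],
])

# Column-wise z-score. np.std defaults to ddof=0 (population std).
z = (raw - raw.mean(axis=0)) / raw.std(axis=0)

print(z.round(3))
print("column means:", z.mean(axis=0).round(6))
print("column stds: ", z.std(axis=0).round(6))
```

After the transformation, every column has mean 0 and standard deviation 1, so no single factor dominates the network's inputs because of its units.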

# Create Test and Train Datasets

The following code block aligns the factors and labels, then splits them up into testing and training datasets.

In [3]:
from sklearn.model_selection import train_test_split

idx = factors.index.intersection(label.index).sort_values()
X = factors.loc[idx].values
y = label.loc[idx].values

# shuffle=False keeps the rows in chronological order, so the test set
# follows the training set in time.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, shuffle=False
)
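
Note that the `StandardScaler` in the previous cell is fit on the full history, so the test rows influence the scaling statistics. One possible alternative, sketched below on synthetic data, is to split chronologically first and compute the mean and standard deviation from the training rows only. The array `raw` and the 67% cut-off are illustrative assumptions, not part of the notebook's pipeline.

```python
import numpy as np

# Synthetic stand-in for an unscaled factor matrix (100 days, 2 factors).
rng = np.random.default_rng(0)
raw = rng.normal(loc=[50.0, 20.0], scale=[10.0, 5.0], size=(100, 2))

# Chronological split: the earliest rows train, the latest rows test.
split = int(len(raw) * 0.67)
train, test = raw[:split], raw[split:]

# Estimate scaling statistics on the training rows only...
mean, std = train.mean(axis=0), train.std(axis=0)

# ...and apply the same statistics to both sets.
train_z = (train - mean) / std
test_z = (test - mean) / std

print(train_z.shape, test_z.shape)
print("test means (not exactly 0):", test_z.mean(axis=0).round(3))
```

The test set is no longer exactly zero-mean, which is expected: it is scaled with information that was available at training time.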

# Create, Train, and Test the Model

Now that you have the factors and labels, you can build the neural network, train it, and then evaluate its accuracy.

In [4]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from keras.utils import set_random_seed
from sklearn.metrics import accuracy_score

set_random_seed(0)

# Build the neural network model.
model = keras.Sequential(
    [
        layers.Input(shape=(X.shape[1],)),
        layers.Dense(8, activation='relu'),
        layers.Dense(1, activation='sigmoid')
    ]
)

# Compile the model.
model.compile(
    optimizer='adam', loss='binary_crossentropy', metrics=['accuracy']
)

# Train the model.
model.fit(X_train, y_train, epochs=100, verbose=0)

# Evaluate the predictions.
y_hat = (model.predict(X_train, verbose=0) > 0.5).astype(int)
print(f"In-sample accuracy: {accuracy_score(y_train, y_hat)}")

y_hat = (model.predict(X_test, verbose=0) > 0.5).astype(int)
print(f"Out-of-sample accuracy: {accuracy_score(y_test, y_hat)}")
print(f"OOS prediction counts: {np.unique(y_hat, return_counts=True)[1]}")
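
An accuracy figure is easier to interpret next to a naive baseline. The sketch below uses hypothetical labels and predictions to compare a model's accuracy with the accuracy of always predicting the most common class. The arrays are made up for illustration and are not outputs of the model above.

```python
import numpy as np

# Hypothetical true labels and model predictions for ten days.
y_true = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])
y_pred = np.array([1, 1, 1, 1, 0, 1, 0, 0, 1, 1])

# Share of days where the prediction matches the label.
accuracy = (y_true == y_pred).mean()

# Baseline: always predict the most frequent class in y_true.
majority_class = np.bincount(y_true).argmax()
baseline = (y_true == majority_class).mean()

print(f"Model accuracy:    {accuracy:.2f}")
print(f"Majority baseline: {baseline:.2f}")
```

If the out-of-sample accuracy is not clearly above the majority-class rate, the model may simply be predicting the dominant regime.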

# Plot Out of Sample Results
Let's produce a plot to show the predicted regimes over the out-of-sample dataset.

In [5]:
import plotly.graph_objects as go

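# Index the predictions with the dates of the test rows. `idx` holds the 
# dates shared by the factors and labels, in the same order as X and y.
predictions = pd.Series(y_hat.flatten(), index=idx[-len(y_test):])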
go.Figure(
    go.Scatter(
        x=predictions.index, y=predictions.values, name='Predictions'
    ),
    dict(
        title="Predictions<br><sup>The prediction changes show the "
            + "model detects favorable times for each regime.</sup>", 
        xaxis_title="Date", yaxis_title="Prediction"
    )
).show()
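
Besides the plot, a couple of summary numbers can describe how the predicted regime behaves. The sketch below, on a hypothetical prediction series, counts how often the prediction switches and what share of days is classified as momentum. The series `toy_predictions` is an assumption for illustration only.

```python
import pandas as pd

# Hypothetical daily regime predictions (1 = momentum, 0 = reversion).
dates = pd.date_range("2024-01-01", periods=8, freq="D")
toy_predictions = pd.Series([1, 1, 0, 0, 0, 1, 1, 1], index=dates)

# A switch is any day whose prediction differs from the previous day's.
switches = int(toy_predictions.diff().abs().sum())

# Share of days predicted as the momentum regime.
share_momentum = toy_predictions.mean()

print(f"Regime switches: {switches}")
print(f"Share of momentum days: {share_momentum:.3f}")
```

A high switch count relative to the sample length would imply frequent rebalancing in any strategy built on the predictions.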

Now let's use these predicted regimes to simulate a trading strategy. The strategy is long SPY when the model predicts that the regime favors momentum over reversion. Otherwise, it is long TLT.

In [6]:
from backtestlib import rough_daily_backtest

portfolio_weights = pd.DataFrame({
    spy_symbol: predictions,
    qb.add_equity("TLT", Resolution.DAILY).symbol: 1 - predictions
})
rough_daily_backtest(qb, portfolio_weights)
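
To show the idea behind the weight matrix, here is a minimal sketch with hypothetical predictions and returns. It is not the implementation of `rough_daily_backtest`, whose internals aren't shown here. The ticker strings stand in for the `Symbol` objects, and applying each day's weights to the following day's returns is an assumption made to avoid look-ahead.

```python
import pandas as pd

dates = pd.date_range("2024-01-01", periods=4, freq="D")

# Hypothetical predictions: 1 = hold SPY, 0 = hold TLT.
toy_predictions = pd.Series([1, 0, 0, 1], index=dates)
toy_weights = pd.DataFrame({
    "SPY": toy_predictions,
    "TLT": 1 - toy_predictions,
})

# Hypothetical daily asset returns.
toy_returns = pd.DataFrame({
    "SPY": [0.010, -0.020, 0.005, 0.000],
    "TLT": [0.002, 0.004, -0.001, 0.003],
}, index=dates)

# Weights set at the end of day t earn the returns of day t+1.
# The first day has no prior weights, so drop it.
strategy_returns = (toy_weights.shift(1) * toy_returns).sum(axis=1).iloc[1:]

print(toy_weights)
print(strategy_returns)
```

Each row of the weight matrix sums to 1, so the portfolio is always fully invested in exactly one of the two assets.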