# **Multilayer Perceptron: Market Timing in AAPL Stock**

**Goal**: Design a strategy that times the return of Apple stock using an MLP network that predicts whether the future return of AAPL is positive or negative.

**1. Data**

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import yfinance as yf
import tensorflow as tf

from sklearn import metrics
from sklearn.model_selection import train_test_split

In [None]:
df = yf.download("AAPL", start="1980-01-01", end="2022-04-11")

df["Ret"] = df["Close"].pct_change()
df.reset_index(inplace=True)
name = "Ret"
df.tail()

**Inputs and Outputs**

In [None]:
# "Ret" holds decimal daily returns (from pct_change), so compound them directly over each window
df["Ret25_i"] = df[name].rolling(25).apply(lambda x: np.prod(1 + x) - 1)
df["Ret60_i"] = df[name].rolling(60).apply(lambda x: np.prod(1 + x) - 1)
df["Ret90_i"] = df[name].rolling(90).apply(lambda x: np.prod(1 + x) - 1)
df["Ret120_i"] = df[name].rolling(120).apply(lambda x: np.prod(1 + x) - 1)
df["Ret240_i"] = df[name].rolling(240).apply(lambda x: np.prod(1 + x) - 1)

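# Keep "Date" (column 0, used later for the backtest) and drop the raw price/volume columns.
# "Adj Close" is only present in some yfinance versions, hence errors="ignore".
df = df.drop(
    columns=["Open", "Close", "High", "Low", "Adj Close", "Volume"], errors="ignore"
)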

df = df.dropna()
df.tail(10)
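
As a sanity check on the rolling features, the sketch below compounds a short window of daily returns and compares the result with their simple sum. The three returns are made-up numbers in decimal form (the form `pct_change()` produces), not AAPL data.

```python
import numpy as np

# Hypothetical daily returns in decimal form (0.01 = 1%); not AAPL data.
daily = np.array([0.01, -0.02, 0.03])

# Compounded return over the window: prod(1 + r) - 1
compounded = np.prod(1 + daily) - 1

# The simple sum only approximates the compounded return.
simple_sum = daily.sum()

print(round(compounded, 6), round(simple_sum, 6))
```

The two numbers are close for short windows and small returns, and drift apart as the window grows.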

**Defining the output: Classification**

- We define the output labels from the return of Apple stock over the next 120 trading days. The aim is to predict whether, at a given time t, the return of AAPL from t to t+120 days will be positive or negative. A zero return, although unlikely, is also classified as negative.
- Therefore, we first compute, for each time t, the 120-day forward return. Since this is a classification task, we then convert it to a binary output variable: 0 for a negative 120-day return and 1 for a positive one.

In [None]:
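df["Ret120"] = df["Ret120_i"].shift(-120)  # return from t to t+120
# Drop the last 120 rows first: their forward return is unknown (NaN),
# and NaN > 0 would otherwise be silently labelled as 0.
df = df.dropna(subset=["Ret120"]).copy()
df["Output"] = (df["Ret120"] > 0).astype(int)
del df["Ret120"]
df.tail()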
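
The forward-looking label can be illustrated on a tiny example. This is a sketch with hypothetical prices and a horizon of 2 days instead of 120; the variable names are illustrative only.

```python
import numpy as np

# Hypothetical prices; horizon h = 2 stands in for the 120-day horizon.
prices = np.array([100.0, 102.0, 99.0, 101.0, 103.0])
h = 2

# Return from t to t+h, available only for the first len(prices) - h dates.
fwd_ret = prices[h:] / prices[:-h] - 1

# 1 if the forward return is strictly positive, else 0 (zero counts as negative).
labels = (fwd_ret > 0).astype(int)

print(labels)
```

The last `h` dates have no label because their future return has not been observed yet, which is why the corresponding rows must be dropped rather than labelled.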

In [None]:
df.describe()

**2. Train-Test Samples and Scaling**

- We devote the last 20% of the observations to testing and use the remaining 80% to train the model.

In [None]:
ts = int(0.2 * len(df))  # Number of observations in the test sample
split_time = len(df) - ts  # Row index at which the test sample starts
test_time = df.iloc[split_time:, 0:1].values  # Keep the test sample dates
Ret_vector = df.iloc[split_time:, 1:2].values
df.tail()

- Next, we use sklearn to formally define the input and output matrices for training (X_train and y_train) and test (X_test and y_test).

In [None]:
Xdf, ydf = df.iloc[:, 2:-1], df.iloc[:, -1]
X = Xdf.astype("float32")
y = ydf.astype("float32")

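# shuffle=False keeps the time order, so the test set is the last `ts` rows
# and lines up with test_time and Ret_vector defined above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=ts, shuffle=False
)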

print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
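
Because the observations are ordered in time, the test sample should be the most recent block of rows. A minimal sketch of such a chronological split on a hypothetical 10-row array:

```python
import numpy as np

data = np.arange(10)            # stand-in for 10 time-ordered observations
n_test = int(0.2 * len(data))   # 20% of the rows go to the test sample
split = len(data) - n_test      # first index of the test sample

train, test = data[:split], data[split:]
print(train, test)
```

Every training observation precedes every test observation, so the model is never evaluated on dates that come before the data it was fitted on.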

**3. Model Training**

**Scaling**: The inputs are returns built from prices that adjust for any dividends, so no further scaling is applied here.

**Model and Training**: We use 3 hidden layers with 25, 15, and 10 units respectively, and a final single-unit output layer.
- All hidden layers use a ReLU activation function, whereas the output layer uses a sigmoid activation function.
- **Dropout layer**: added after each hidden layer. Dropout randomly sets some units of a hidden layer to zero during training. We set n_dropout = 0.2 to shut down 20% of the units in the layer.
- **Loss function**: binary cross-entropy, which is the negative log-likelihood of the labels under the predicted probabilities.
- **Metric**: binary accuracy.
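
To make the loss and the metric concrete, here is a small NumPy sketch that evaluates both by hand. The labels and probabilities are hypothetical, and `binary_cross_entropy` is an illustrative helper, not the Keras implementation.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean negative Bernoulli log-likelihood of the labels."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 so the logs stay finite.
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    log_lik = y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob)
    return float(-np.mean(log_lik))

y_true = [1, 0, 1, 1]          # hypothetical labels
y_prob = [0.9, 0.2, 0.6, 0.4]  # hypothetical sigmoid outputs

loss = binary_cross_entropy(y_true, y_prob)

# Binary accuracy: share of predictions on the right side of the 0.5 threshold.
accuracy = float(np.mean((np.asarray(y_prob) > 0.5) == np.asarray(y_true)))

print(loss, accuracy)
```

The last observation is misclassified (probability 0.4 for a true label of 1), which lowers accuracy to 0.75 and contributes the largest term to the loss.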

In [None]:
tf.keras.backend.clear_session() # We clear the backend to reset the random seed process
tf.random.set_seed(1234) # A random seed so that results obtained are somewhat replicable.

act_fun = 'relu'
hp_units = 25
hp_units_2 = 15
hp_units_3 = 10
n_dropout = 0.2

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(units=hp_units, activation=act_fun))
model.add(tf.keras.layers.Dropout(n_dropout))
model.add(tf.keras.layers.Dense(units=hp_units_2, activation=act_fun))
model.add(tf.keras.layers.Dropout(n_dropout))
model.add(tf.keras.layers.Dense(units=hp_units_3, activation=act_fun))
model.add(tf.keras.layers.Dropout(n_dropout))
model.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))

hp_lr = 1e-5 # learning rate
adam = tf.keras.optimizers.Adam(learning_rate=hp_lr)

model.compile(optimizer=adam, loss="binary_crossentropy", metrics=["accuracy"])

**4. Validation and Callbacks (Early Stopping)**

In [None]:
es = tf.keras.callbacks.EarlyStopping(
    monitor="val_accuracy",  # metric names are case-sensitive
    mode="max",
    verbose=1,
    patience=20,
    restore_best_weights=True,
)

**Classification on imbalanced data: class_weight**

- There is always a possibility that one of the labels we are trying to predict is underrepresented in the training sample.
- Ideally, the model should give a heavier weight to underrepresented labels so that they are not overlooked in future predictions.
- **class_weight**: a weight passed to Keras for each class in the sample, so that the model focuses on a particular class more than its share of the sample alone would imply.

In [None]:
class_weight = {0: (np.mean(y_train) / 0.5) * 1.2, 1: 1.0}
print(class_weight)
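
To see what the formula above produces, the sketch below applies it to a hypothetical label vector in which 75% of the observations are 1. The labels are made up; `sample_weight` is only there to show the weight each observation would effectively receive in the loss.

```python
import numpy as np

# Hypothetical labels: 6 positive (1) and 2 negative (0) observations.
y = np.array([1, 1, 1, 0, 1, 1, 0, 1], dtype="float32")

up_share = float(np.mean(y))  # share of the positive class, here 0.75

# Same formula as in the cell above: the rarer class 0 is up-weighted.
class_weight = {0: (up_share / 0.5) * 1.2, 1: 1.0}

# Per-observation weight implied by the class weights.
sample_weight = np.where(y == 0, class_weight[0], class_weight[1])

print(class_weight)
```

With a 75% share of positives, each negative observation counts roughly 1.8 times as much as a positive one when the loss is averaged.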

- Finally, we train the model:

In [None]:
history = model.fit(
    X_train,
    y_train,
    validation_split=0.2,
    epochs=500,
    batch_size=32,
    verbose=2,
    callbacks=[es],
    class_weight=class_weight,
)

In [None]:
model.summary()

**5. Financial Performance of the model**

In [None]:
y_prob = model.predict(X_test)
y_pred = np.where(y_prob > 0.5, 1, 0)

# evaluate() returns [loss, accuracy] for the metrics set in compile()
loss, acc = model.evaluate(X_test, y_test)
print("Model accuracy in test: ", acc)

The confusion matrix:

In [None]:
cm = metrics.confusion_matrix(y_test, y_pred)
plt.figure(figsize=(9,9))
ax = plt.subplot()
sns.heatmap(cm, annot=True, fmt="g", ax=ax)

ax.set_xlabel("Predicted labels")
ax.set_ylabel("True labels")
ax.set_title("Confusion Matrix")
ax.xaxis.set_ticklabels(["DOWN", "UP"])
ax.yaxis.set_ticklabels(["DOWN", "UP"]);
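
The confusion matrix can also be summarised numerically. The sketch below uses a hypothetical 2x2 matrix in the layout returned by `metrics.confusion_matrix` (rows are true labels, columns are predicted labels) and derives accuracy, precision, and recall for the UP class.

```python
import numpy as np

# Hypothetical counts: rows = true (DOWN, UP), columns = predicted (DOWN, UP).
cm = np.array([[30, 20],
               [10, 40]])

tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()   # share of correct predictions
precision_up = tp / (tp + fp)     # of the predicted UPs, how many were UP
recall_up = tp / (tp + fn)        # of the true UPs, how many were caught

print(accuracy, precision_up, recall_up)
```

For a timing strategy, precision on the UP class indicates how often a long signal was followed by a positive return, which accuracy alone does not show.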

**Trading Strategy based on Model Predictions**

- We backtest the strategy on the test sample.

In [None]:
df_predictions = pd.DataFrame(
    {
        "Date": test_time.flatten(),
        "Pred": y_pred.flatten(),
        "Ret": Ret_vector.flatten(),
    }
)

df_predictions.tail()

In [None]:
df_predictions.Date = pd.to_datetime(df_predictions.Date)
df = df_predictions
df.tail()

The trading strategy takes a long position (+1) if the model's prediction is higher than 0.5, and a short position (-1) otherwise. We backtest 3 trading strategies:
- A long/short strategy that takes a long or short position when the model prediction indicates so.
- A long-only strategy that goes to cash (return = 0) when the model predicts a negative 120-day return.
- A buy-and-hold strategy that buys the stock at the beginning of the test period and holds it until the end of the period.

In [None]:
df["Positions"] = np.where(df["Pred"] > 0.5, 1, -1)
df["Strat_ret"] = df["Positions"].shift(1) * df["Ret"]
df["Positions_L"] = df["Positions"].shift(1)
# Use .loc instead of chained indexing so the assignment reliably modifies df
df.loc[df["Positions_L"] == -1, "Positions_L"] = 0
df["Strat_ret_L"] = df["Positions_L"] * df["Ret"]
df["CumRet"] = df["Strat_ret"].expanding().apply(lambda x: np.prod(1 + x) - 1)
df["CumRet_L"] = df["Strat_ret_L"].expanding().apply(lambda x: np.prod(1 + x) - 1)
df["bhRet"] = df["Ret"].expanding().apply(lambda x: np.prod(1 + x) - 1)

Final_Return_L = np.prod(1 + df["Strat_ret_L"]) - 1
Final_Return = np.prod(1 + df["Strat_ret"]) - 1
Buy_Return = np.prod(1 + df["Ret"]) - 1

print("Strat Return Long Only =", Final_Return_L * 100, "%")
print("Strat Return =", Final_Return * 100, "%")
print("Buy and Hold Return =", Buy_Return * 100, "%")
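
The three strategies can be checked by hand on a toy example. This is a sketch with four hypothetical daily returns and predictions; it assumes a flat position on the first day, which matches the effect of the `shift(1)` above (the first strategy return is missing and is skipped by the product).

```python
import numpy as np

ret = np.array([0.02, -0.01, 0.03, -0.02])  # hypothetical daily returns
pred = np.array([1, 0, 1, 1])               # hypothetical model predictions

pos = np.where(pred > 0.5, 1, -1)           # +1 long, -1 short
# Trade on the previous day's signal; no position on the first day.
pos_lag = np.concatenate(([0], pos[:-1]))

ls_ret = pos_lag * ret                      # long/short strategy
lo_ret = np.where(pos_lag == 1, ret, 0.0)   # long-only: cash instead of short

def total_return(r):
    """Compound a sequence of period returns into one total return."""
    return float(np.prod(1 + r) - 1)

ls_total = total_return(ls_ret)
lo_total = total_return(lo_ret)
bh_total = total_return(ret)

print(ls_total, lo_total, bh_total)
```

In this toy sample the signals are poorly timed, so both strategies lose money while buy-and-hold gains about 1.9%. Lagging the position by one day is what prevents the strategy from trading on information it would not yet have.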

In [None]:
fig = plt.figure(figsize=(12, 6))
ax = plt.gca()
df.plot(x="Date", y="bhRet", label="Buy&Hold", ax=ax)
df.plot(x="Date", y="CumRet_L", label="Strat Only Long", ax=ax)
df.plot(x="Date", y="CumRet", label="Strat Long/Short", ax=ax)
plt.xlabel("Date")
plt.ylabel("Cumulative Returns")
plt.grid()
plt.show()

df.describe()