# **Multilayer Perceptron: Market Timing in AAPL Stock**

**Goal**: Design a strategy that times the return of Apple stock (AAPL) using an MLP network that predicts whether the future return of AAPL will be positive or negative.

**1. Data**

In [35]:
import numpy as np 
import pandas as pd 
import yfinance as yf 

In [36]:
df = yf.download("AAPL", start="1980-01-01", end="2022-04-11")

df["Ret"] = df["Close"].pct_change()
df.reset_index(inplace=True)
name = "Ret"
df.tail()

[*********************100%***********************]  1 of 1 completed


Price,Date,Close,High,Low,Open,Volume,Ret
Ticker,Unnamed: 1_level_1,AAPL,AAPL,AAPL,AAPL,AAPL,Unnamed: 7_level_1
10415,2022-04-04,175.787811,175.837071,171.847264,171.975337,76468400,0.023693
10416,2022-04-05,172.458023,175.649871,171.827536,174.861759,73401800,-0.018942
10417,2022-04-06,169.276093,171.049342,167.601363,169.798214,89058800,-0.01845
10418,2022-04-07,169.581451,170.78332,167.325495,168.616022,77594700,0.001804
10419,2022-04-08,167.56192,169.226804,166.685149,169.226804,76575500,-0.011909


**Inputs and Outputs**

In [37]:
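# Rolling cumulative returns over 25, 60, 90, 120, and 240 trading days.
# Note: "Ret" comes from pct_change() and is a decimal fraction, while the x / 100 term
# treats it as a percentage, so each value is close to the sum of the daily returns in the
# window rather than the compounded return (np.prod(1 + x) - 1).
df["Ret25_i"] = df[name].rolling(25).apply(lambda x: 100 * (np.prod(1 + x / 100) - 1))
df["Ret60_i"] = df[name].rolling(60).apply(lambda x: 100 * (np.prod(1 + x / 100) - 1))
df["Ret90_i"] = df[name].rolling(90).apply(lambda x: 100 * (np.prod(1 + x / 100) - 1))
df["Ret120_i"] = df[name].rolling(120).apply(lambda x: 100 * (np.prod(1 + x / 100) - 1))
df["Ret240_i"] = df[name].rolling(240).apply(lambda x: 100 * (np.prod(1 + x / 100) - 1))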

del df["Date"]
del df["Open"]
del df["Close"]
del df["High"]
del df["Low"]
del df["Volume"]

df = df.dropna()
df.tail(10)

Price,Ret,Ret25_i,Ret60_i,Ret90_i,Ret120_i,Ret240_i
Ticker,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
10410,0.005037,0.053177,-0.002813,0.1685,0.239295,0.300388
10411,0.019134,0.09015,0.019857,0.171149,0.252152,0.322116
10412,-0.006649,0.109387,-0.011795,0.135913,0.23638,0.310361
10413,-0.017776,0.074899,-0.016879,0.10112,0.22129,0.305399
10414,-0.001718,0.060206,0.008005,0.09647,0.220199,0.300737
10415,0.023693,0.082275,0.048402,0.117761,0.253071,0.336226
10416,-0.018942,0.074953,0.028462,0.09551,0.238331,0.299128
10417,-0.01845,0.035893,0.00989,0.108755,0.199571,0.277636
10418,0.001804,0.03968,-0.005089,0.088668,0.193851,0.281901
10419,-0.011909,0.046183,-0.019567,0.045156,0.170093,0.276003
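
As a sanity check on the rolling formula, the sketch below compares it with plain compounding on made-up decimal returns. Since `pct_change()` yields decimal fractions rather than percentages, the `x / 100` form stays close to the simple sum of the daily returns, while `np.prod(1 + x) - 1` gives the compounded return. The toy series, the 3-day window, and the helper names `as_written` and `compounded` are illustrative assumptions, not part of the notebook.

```python
import numpy as np
import pandas as pd

# Made-up daily returns in decimal form (0.10 means +10%)
ret = pd.Series([0.10, -0.05, 0.20, 0.10])

def as_written(x):
    # Formula used in the cell above; it treats x as a percentage
    return 100 * (np.prod(1 + x / 100) - 1)

def compounded(x):
    # Compounded return for decimal inputs
    return np.prod(1 + x) - 1

window = 3  # toy window standing in for 25, 60, ..., 240
approx = ret.rolling(window).apply(as_written, raw=True)
exact = ret.rolling(window).apply(compounded, raw=True)
simple_sum = ret.rolling(window).sum()

comparison = pd.DataFrame(
    {"as_written": approx, "compounded": exact, "sum": simple_sum}
)
print(comparison)
```

On inputs of this size the gap between the two definitions is small for short windows but grows with the window length and with the size of the daily moves.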


**Defining the output: Classification**

- The output label is based on the return of Apple stock over the next 120 trading days. The aim is to predict whether, at a given time t, the return of AAPL from t to t+120 will be positive or negative. A zero return, although unlikely, is also classified as negative.
- Therefore, we first compute, for each time t, the 120-day-ahead return. Since this is a classification task, we then convert it to a binary output variable: 0 for a negative 120-day return and 1 for a positive one.
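
In symbols, one way to write this target, assuming daily simple returns $r_t$ that compound over the horizon (the notation is introduced here for illustration and does not appear in the notebook):

```latex
R_{t \to t+120} = \prod_{i=1}^{120} \left(1 + r_{t+i}\right) - 1,
\qquad
y_t =
\begin{cases}
1 & \text{if } R_{t \to t+120} > 0, \\
0 & \text{otherwise.}
\end{cases}
```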

In [38]:
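# Ret120_i at t+120 covers days t+1..t+120, so shifting by -120 aligns the future return with time t
df["Ret120"] = df["Ret120_i"].shift(-120)
# Caution: NaN > 0 evaluates to False, so the last 120 rows (whose future return is unknown)
# are labelled 0 and are not removed by the dropna() below
df["Output"] = df["Ret120"] > 0
df["Output"] = df["Output"].astype(int)
del df["Ret120"]
df = df.dropna()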
df.tail()

Price,Ret,Ret25_i,Ret60_i,Ret90_i,Ret120_i,Ret240_i,Output
Ticker,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
10415,0.023693,0.082275,0.048402,0.117761,0.253071,0.336226,0
10416,-0.018942,0.074953,0.028462,0.09551,0.238331,0.299128,0
10417,-0.01845,0.035893,0.00989,0.108755,0.199571,0.277636,0
10418,0.001804,0.03968,-0.005089,0.088668,0.193851,0.281901,0
10419,-0.011909,0.046183,-0.019567,0.045156,0.170093,0.276003,0
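
The tail above shows `Output = 0` for the most recent rows, whose 120-day-ahead return is not yet observable. The sketch below reproduces the mechanism on a made-up series with a 2-step horizon: comparing `NaN > 0` yields `False`, so unknown labels silently become 0 unless they are masked first. The series, the horizon `h`, and the variable names are illustrative assumptions.

```python
import pandas as pd

h = 2  # toy horizon standing in for the 120-day horizon
toy = pd.DataFrame({"RetH_i": [0.03, -0.01, 0.02, 0.04, -0.02]})

# Future h-step return: NaN for the last h rows
future = toy["RetH_i"].shift(-h)

# Labelling as in the cell above: NaN > 0 is False, so the last h rows become 0
naive = (future > 0).astype(int)

# Alternative: keep only rows whose future return is known, then label
known = future.notna()
masked = (future[known] > 0).astype(int)

print(naive.tolist())
print(masked.tolist())
```

Masking first shortens the sample by `h` rows but avoids training or testing on labels that were never observed.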


In [39]:
df.describe()

Price,Ret,Ret25_i,Ret60_i,Ret90_i,Ret120_i,Ret240_i,Output
Ticker,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
count,10180.0,10180.0,10180.0,10180.0,10180.0,10180.0,10180.0
mean,0.00118,0.029372,0.070766,0.105697,0.139707,0.275046,0.704322
std,0.028192,0.141406,0.226664,0.279427,0.321383,0.447276,0.456369
min,-0.518692,-0.940299,-1.151643,-1.040123,-1.048316,-0.902589,0.0
25%,-0.012825,-0.04853,-0.048529,-0.045875,-0.043797,-0.048386,0.0
50%,0.0,0.035273,0.083604,0.120599,0.159032,0.294689,1.0
75%,0.014601,0.115021,0.210867,0.265228,0.315018,0.565963,1.0
max,0.33228,0.882471,0.892261,1.141607,1.30116,1.782617,1.0


**2. Train-Test Samples and Scaling**

- We hold out the last 20% of observations for testing and use the first 80% to train the model.

In [None]:
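ts = int(0.2 * len(df))  # Number of observations in the test sample
split_time = len(df) - ts  # Rows from this position onward belong to the test sample
# "Date" was deleted above, so column 0 is "Ret" and column 1 is "Ret25_i"
test_time = df.iloc[split_time:, 0:1].values  # Test-sample values of column 0 ("Ret"), not dates
Ret_vector = df.iloc[split_time:, 1:2].values  # Test-sample values of column 1 ("Ret25_i")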
df.tail()

Price,Ret,Ret25_i,Ret60_i,Ret90_i,Ret120_i,Ret240_i,Output
Ticker,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
10415,0.023693,0.082275,0.048402,0.117761,0.253071,0.336226,0
10416,-0.018942,0.074953,0.028462,0.09551,0.238331,0.299128,0
10417,-0.01845,0.035893,0.00989,0.108755,0.199571,0.277636,0
10418,0.001804,0.03968,-0.005089,0.088668,0.193851,0.281901,0
10419,-0.011909,0.046183,-0.019567,0.045156,0.170093,0.276003,0
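
The split above is chronological: the last 20% of rows form the test sample, with no shuffling. A minimal sketch of the same index arithmetic on a made-up frame, where the column names and the 10-row length are assumptions for illustration:

```python
import numpy as np
import pandas as pd

# Made-up frame with rows already in time order
toy = pd.DataFrame({"x": np.arange(10), "Output": [0, 1] * 5})

ts = int(0.2 * len(toy))       # number of test rows
split_time = len(toy) - ts     # position of the first test row

train = toy.iloc[:split_time]  # earliest 80% of rows
test = toy.iloc[split_time:]   # latest 20% of rows

print(len(train), len(test))
```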


- Next, we use sklearn to formally define the input and output matrices for training (X_train and y_train) and test (X_test and y_test).
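
One possible way to build these matrices is sketched below on synthetic data. It assumes the target sits in a column named `Output`, uses `train_test_split` with `shuffle=False` to keep time order, and fits `StandardScaler` on the training rows only so that no test-sample statistics enter the scaling. The synthetic frame, the feature names, and the choice of scaler are assumptions; the notebook's own cell may differ.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the feature frame (hypothetical columns f1..f3)
rng = np.random.default_rng(0)
toy = pd.DataFrame(rng.normal(size=(100, 3)), columns=["f1", "f2", "f3"])
toy["Output"] = (toy["f1"] > 0).astype(int)

X = toy.drop(columns="Output").values
y = toy["Output"].values

# shuffle=False keeps the last 20% of rows as the test sample
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

# Fit the scaler on training data only, then apply it to both samples
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

print(X_train_s.shape, X_test_s.shape)
```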