# Stock Returns: Neural Network

Our objective in this chapter is to predict stock returns using dense feed-forward neural networks.  Specifically, we will try to predict the daily returns of MSFT from the returns of various correlated assets including stock indices, currencies, and other stocks.

We will begin by reviewing our previous work, in which we used linear regression and nearest neighbors to predict MSFT returns.

## Import Packages

Let's begin by importing the initial packages that we will need.

In [1]:
import numpy as np
import pandas as pd
import yfinance as yf
import sklearn
import sklearn.metrics  # used later for mean_squared_error and r2_score

## Read-In Data

Next, let's read in our data.  We will start with the stocks, whose data we will get from Yahoo.

In [2]:
stock_tickers = ['MSFT', 'IBM', 'GOOGL'] # define tickers
df_stock = yf.download(
    stock_tickers, start='2005-01-01', end='2021-07-31', auto_adjust=False,
)
df_stock = df_stock['Adj Close'] # select only the adjusted close price
df_stock.columns = df_stock.columns.str.lower() # clean-up column names
df_stock.rename_axis('trade_date', inplace=True) # clean-up index name
df_stock.rename_axis('', axis=1, inplace=True) # clean-up column-axis name
df_stock



trade_date,googl,ibm,msft
2005-01-03,5.038074,50.237442,18.454880
2005-01-04,4.834026,49.697788,18.523895
2005-01-05,4.809422,49.595024,18.482485
2005-01-06,4.686148,49.440849,18.461790
2005-01-07,4.817872,49.225010,18.406578
...,...,...,...
2021-07-26,133.116882,114.338181,279.157227
2021-07-27,130.996506,114.322174,276.733154
2021-07-28,135.161743,113.537315,276.424042
2021-07-29,134.847443,113.665459,276.694397


Next, we'll grab the currency data, also from Yahoo.

In [3]:
currency_tickers = ['JPY=X', 'GBPUSD=X']
df_currency = yf.download(
    currency_tickers, start='2005-01-01', end='2021-07-31',
    auto_adjust=False, ignore_tz=True
)
df_currency = df_currency['Adj Close']
df_currency.columns = df_currency.columns.str.lower()
df_currency.rename_axis('trade_date', inplace=True)
df_currency.rename_axis('', axis=1, inplace=True)
df_currency



trade_date,gbpusd=x,jpy=x
2005-01-03,1.904617,102.739998
2005-01-04,1.883594,104.339996
2005-01-05,1.885512,103.930000
2005-01-06,1.876490,104.889999
2005-01-07,1.871293,104.889999
...,...,...
2021-07-26,1.375781,110.543999
2021-07-27,1.382915,110.302002
2021-07-28,1.388272,109.806000
2021-07-29,1.390685,109.890999


Finally, we'll grab the index data from Yahoo.

In [4]:
index_tickers = ['SPY', 'DIA', '^VIX'] 
df_index = yf.download(
    index_tickers, start='2005-01-01', end='2021-07-31', auto_adjust=False
)
df_index = df_index['Adj Close']
df_index.columns = df_index.columns.str.lower().str.replace('^', '', regex=False) # strip the literal '^' from '^vix'
df_index.rename_axis('trade_date', inplace=True)
df_index.rename_axis('', axis=1, inplace=True)
df_index



trade_date,dia,spy,vix
2005-01-03,67.762741,82.074020,14.080000
2005-01-04,67.118706,81.071114,13.980000
2005-01-05,66.746117,80.511688,14.090000
2005-01-06,66.954491,80.921036,13.580000
2005-01-07,66.828247,80.805092,13.490000
...,...,...,...
2021-07-26,326.679871,416.758026,17.580000
2021-07-27,325.945435,414.858612,19.360001
2021-07-28,324.774048,414.688477,18.309999
2021-07-29,326.131317,416.408386,17.700001


## Join and Clean Data

Now we can join together our price data and convert it into returns and differences (for VIX, as these are more stationary).  Notice that we are implicitly adding a time series component to our regression by adding lagged `msft` returns as a feature.

In [5]:
df_data = \
    (
    df_stock
        .merge(df_index, how='left', left_index=True, right_index=True) # join index data
        .merge(df_currency, how='left', left_index=True, right_index=True) # join currency data
        .dropna()
        .assign(msft = lambda df: df['msft'].pct_change())   # percent change
        .assign(msft_lag_0 = lambda df: df['msft'].shift(0)) # current-day return
        .assign(msft_lag_1 = lambda df: df['msft'].shift(1)) # previous-day return
        .assign(ibm = lambda df: df['ibm'].pct_change())     # percent change
        .assign(googl = lambda df: df['googl'].pct_change()) # percent change
        .assign(spy = lambda df: df['spy'].pct_change())     # percent change
        .assign(dia = lambda df: df['dia'].pct_change())     # percent change
        .assign(vix = lambda df: df['vix'].diff())           # absolute change
        .assign(dexjpus = lambda df: df['jpy=x'].pct_change()) # percent change
        .assign(dexusuk = lambda df: df['gbpusd=x'].pct_change()) # percent change
        .dropna()
    )
df_data

trade_date,googl,ibm,msft,dia,spy,vix,gbpusd=x,jpy=x,msft_lag_0,msft_lag_1,dexjpus,dexusuk
2005-01-05,-0.005090,-0.002068,-0.002236,-0.005551,-0.006900,0.110001,1.885512,103.930000,-0.002236,0.003740,-0.003929,0.001018
2005-01-06,-0.025632,-0.003109,-0.001120,0.003122,0.005084,-0.510000,1.876490,104.889999,-0.001120,-0.002236,0.009237,-0.004785
2005-01-07,0.028109,-0.004366,-0.002991,-0.001886,-0.001433,-0.090000,1.871293,104.889999,-0.002991,-0.001120,0.000000,-0.002769
2005-01-10,0.006242,-0.001044,0.004874,0.003401,0.004728,-0.260000,1.876912,104.169998,0.004874,-0.002991,-0.006864,0.003003
2005-01-11,-0.007793,-0.007108,-0.002612,-0.006402,-0.006891,-0.040000,1.878605,103.419998,-0.002612,0.004874,-0.007200,0.000902
...,...,...,...,...,...,...,...,...,...,...,...,...
2021-07-26,0.007668,0.010117,-0.002140,0.002396,0.002455,0.379999,1.375781,110.543999,-0.002140,0.012336,0.003668,-0.001168
2021-07-27,-0.015929,-0.000140,-0.008684,-0.002248,-0.004558,1.780001,1.382915,110.302002,-0.008684,-0.002140,-0.002189,0.005186
2021-07-28,0.031797,-0.006865,-0.001117,-0.003594,-0.000410,-1.050001,1.388272,109.806000,-0.001117,-0.008684,-0.004497,0.003873
2021-07-29,-0.002325,0.001129,0.000978,0.004179,0.004147,-0.609999,1.390685,109.890999,0.000978,-0.001117,0.000774,0.001738
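
The three transformations used above (`pct_change`, `diff`, and `shift`) are easier to follow on a tiny example. The sketch below uses made-up prices and a made-up volatility index, and the names `df_toy` and `df_toy_returns` are illustrative only.

```python
import pandas as pd

# Toy prices and a toy volatility index (made-up numbers, not market data).
df_toy = pd.DataFrame(
    {"price": [100.0, 102.0, 99.96], "vol_index": [15.0, 14.5, 16.0]},
    index=pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-06"]),
)

df_toy_returns = (
    df_toy
    .assign(ret=lambda df: df["price"].pct_change())       # p_t / p_{t-1} - 1
    .assign(ret_lag_1=lambda df: df["ret"].shift(1))       # previous day's return
    .assign(vol_change=lambda df: df["vol_index"].diff())  # v_t - v_{t-1}
)
print(df_toy_returns)
```

The first row has no previous price, so its return is `NaN`, and the lagged return is `NaN` for the first two rows. This is why the pipeline above ends with a second `.dropna()`.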


## Training Set and Testing Set

We'll train our models on data prior to 2016, and then we'll use data from 2016 onward for testing.  So let's separate out these two subsets of data.

In [6]:
df_train = df_data.query('trade_date < "2016-01-01"')
df_train

trade_date,googl,ibm,msft,dia,spy,vix,gbpusd=x,jpy=x,msft_lag_0,msft_lag_1,dexjpus,dexusuk
2005-01-05,-0.005090,-0.002068,-0.002236,-0.005551,-0.006900,0.110001,1.885512,103.930000,-0.002236,0.003740,-0.003929,0.001018
2005-01-06,-0.025632,-0.003109,-0.001120,0.003122,0.005084,-0.510000,1.876490,104.889999,-0.001120,-0.002236,0.009237,-0.004785
2005-01-07,0.028109,-0.004366,-0.002991,-0.001886,-0.001433,-0.090000,1.871293,104.889999,-0.002991,-0.001120,0.000000,-0.002769
2005-01-10,0.006242,-0.001044,0.004874,0.003401,0.004728,-0.260000,1.876912,104.169998,0.004874,-0.002991,-0.006864,0.003003
2005-01-11,-0.007793,-0.007108,-0.002612,-0.006402,-0.006891,-0.040000,1.878605,103.419998,-0.002612,0.004874,-0.007200,0.000902
...,...,...,...,...,...,...,...,...,...,...,...,...
2015-12-24,-0.003474,-0.002093,-0.002687,-0.003356,-0.001650,0.170000,1.487697,120.934998,-0.002687,0.008491,-0.000785,0.003615
2015-12-28,0.021414,-0.004629,0.005029,-0.001370,-0.002285,1.170000,1.493206,120.231003,0.005029,-0.002687,-0.005821,0.003703
2015-12-29,0.014983,0.015769,0.010724,0.011430,0.010672,-0.830000,1.489403,120.349998,0.010724,0.005029,0.000990,-0.002547
2015-12-30,-0.004610,-0.003148,-0.004244,-0.006668,-0.007088,1.210001,1.482228,120.528999,-0.004244,0.010724,0.001487,-0.004817


In [7]:
df_test = df_data.query('trade_date >= "2016-01-01"')
df_test

trade_date,googl,ibm,msft,dia,spy,vix,gbpusd=x,jpy=x,msft_lag_0,msft_lag_1,dexjpus,dexusuk
2016-01-04,-0.023869,-0.012135,-0.012257,-0.015518,-0.013979,2.490002,1.473709,120.310997,-0.012257,-0.014740,-0.001154,-0.005541
2016-01-05,0.002752,-0.000735,0.004562,0.000583,0.001691,-1.360001,1.471410,119.467003,0.004562,-0.012257,-0.007015,-0.001560
2016-01-06,-0.002889,-0.005006,-0.018165,-0.014294,-0.012614,1.250000,1.467394,119.101997,-0.018165,0.004562,-0.003055,-0.002729
2016-01-07,-0.024140,-0.017090,-0.034783,-0.023559,-0.023991,4.400000,1.462994,118.610001,-0.034783,-0.018165,-0.004131,-0.002999
2016-01-08,-0.013617,-0.009258,0.003067,-0.010427,-0.010977,2.020000,1.462694,117.540001,0.003067,-0.034783,-0.009021,-0.000205
...,...,...,...,...,...,...,...,...,...,...,...,...
2021-07-26,0.007668,0.010117,-0.002140,0.002396,0.002455,0.379999,1.375781,110.543999,-0.002140,0.012336,0.003668,-0.001168
2021-07-27,-0.015929,-0.000140,-0.008684,-0.002248,-0.004558,1.780001,1.382915,110.302002,-0.008684,-0.002140,-0.002189,0.005186
2021-07-28,0.031797,-0.006865,-0.001117,-0.003594,-0.000410,-1.050001,1.388272,109.806000,-0.001117,-0.008684,-0.004497,0.003873
2021-07-29,-0.002325,0.001129,0.000978,0.004179,0.004147,-0.609999,1.390685,109.890999,0.000978,-0.001117,0.000774,0.001738


## Training Linear Regression and K-Nearest Neighbors

In order to train our models, we first put our training features into `X_train` and our training labels into `y_train`.

In [8]:
X_train = df_train.drop(columns=['msft'])[0:len(df_train)-1]
X_train

trade_date,googl,ibm,dia,spy,vix,gbpusd=x,jpy=x,msft_lag_0,msft_lag_1,dexjpus,dexusuk
2005-01-05,-0.005090,-0.002068,-0.005551,-0.006900,0.110001,1.885512,103.930000,-0.002236,0.003740,-0.003929,0.001018
2005-01-06,-0.025632,-0.003109,0.003122,0.005084,-0.510000,1.876490,104.889999,-0.001120,-0.002236,0.009237,-0.004785
2005-01-07,0.028109,-0.004366,-0.001886,-0.001433,-0.090000,1.871293,104.889999,-0.002991,-0.001120,0.000000,-0.002769
2005-01-10,0.006242,-0.001044,0.003401,0.004728,-0.260000,1.876912,104.169998,0.004874,-0.002991,-0.006864,0.003003
2005-01-11,-0.007793,-0.007108,-0.006402,-0.006891,-0.040000,1.878605,103.419998,-0.002612,0.004874,-0.007200,0.000902
...,...,...,...,...,...,...,...,...,...,...,...
2015-12-23,0.001799,0.004422,0.010344,0.012384,-1.030001,1.482338,121.029999,0.008491,0.009484,-0.001452,-0.004936
2015-12-24,-0.003474,-0.002093,-0.003356,-0.001650,0.170000,1.487697,120.934998,-0.002687,0.008491,-0.000785,0.003615
2015-12-28,0.021414,-0.004629,-0.001370,-0.002285,1.170000,1.493206,120.231003,0.005029,-0.002687,-0.005821,0.003703
2015-12-29,0.014983,0.015769,0.011430,0.010672,-0.830000,1.489403,120.349998,0.010724,0.005029,0.000990,-0.002547


Notice that the label we are predicting is the *next* day `msft` return; the features we are using to predict are the *current* day returns of the various correlated assets. 

In [9]:
y_train = df_train[['msft']][1:len(df_train)]
y_train

trade_date,msft
2005-01-06,-0.001120
2005-01-07,-0.002991
2005-01-10,0.004874
2005-01-11,-0.002612
2005-01-12,0.001870
...,...
2015-12-24,-0.002687
2015-12-28,0.005029
2015-12-29,0.010724
2015-12-30,-0.004244
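
The one-day offset between features and labels can be checked on a small frame. This is a minimal sketch with made-up returns; `df_toy`, `X_toy`, and `y_toy` are illustrative names, and `.iloc` slicing is used here as an equivalent of the `[0:len(df)-1]` and `[1:len(df)]` slices above.

```python
import pandas as pd

# Made-up daily returns for illustration only.
df_toy = pd.DataFrame(
    {"feature": [0.01, -0.02, 0.03, 0.00], "target": [0.005, -0.010, 0.020, -0.004]},
    index=pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-06", "2021-01-07"]),
)

X_toy = df_toy[["feature"]].iloc[:-1]  # all days except the last: what we know today
y_toy = df_toy[["target"]].iloc[1:]    # all days except the first: what happens tomorrow

# Row i of X_toy is paired with row i of y_toy by position, not by date.
pairs = list(zip(X_toy.index.date, y_toy.index.date))
for feature_date, label_date in pairs:
    print(feature_date, "->", label_date)
```

Because scikit-learn works on the underlying arrays, the differing date indexes of the two frames do not matter; only the row order does.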


### Linear Regression

Let's first fit a simple linear regression to our training data.

In [10]:
from sklearn.linear_model import LinearRegression
linear_regression = LinearRegression()
linear_regression.fit(X_train, y_train)

Recall that the `.score()` of a `LinearRegression` gives the $R^2$.

In [11]:
linear_regression.score(X_train, y_train)

0.017959946658375414
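
As a reminder of what this number means, $R^2 = 1 - SS_{res}/SS_{tot}$, where the baseline is a model that always predicts the mean of the labels. The sketch below computes it by hand on made-up values; `r_squared` is a hypothetical helper, not part of scikit-learn.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

y_toy = np.array([0.01, -0.02, 0.03, -0.01])  # made-up returns

r2_mean = r_squared(y_toy, np.full_like(y_toy, y_toy.mean()))  # predict the mean
r2_perfect = r_squared(y_toy, y_toy)                           # perfect predictions
r2_bad = r_squared(y_toy, -y_toy)                              # wrong sign every day
print(r2_mean, r2_perfect, r2_bad)
```

Predicting the mean scores zero, a perfect model scores one, and a model that does worse than the mean scores below zero. Negative values are therefore possible, especially out of sample.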

We can also examine the coefficients of our model.

In [12]:
np.round(linear_regression.coef_, 3)

array([[ 0.002, -0.016,  0.214, -0.328,  0.   , -0.003,  0.   ,  0.012,
        -0.048, -0.002, -0.002]])

### KNN

Next, let's fit a KNN model to our training data.  As you can see, the in-sample $R^2$ is higher for KNN than for linear regression.

In [13]:
from sklearn.neighbors import KNeighborsRegressor
knn = KNeighborsRegressor(n_neighbors=10)
knn.fit(X_train, y_train)
knn.score(X_train, y_train)

0.1212395470121359
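
To see why the in-sample score flatters KNN, it helps to write the prediction rule out. The sketch below assumes uniform weights and Euclidean distance, which are the `KNeighborsRegressor` defaults; `knn_predict` is a hypothetical helper and the data are made up.

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k):
    """Average the labels of the k training points closest to x_query."""
    distances = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(distances)[:k]
    return y_train[nearest].mean()

X_toy = np.array([[0.0], [1.0], [2.0], [10.0]])
y_toy = np.array([0.0, 1.0, 2.0, 10.0])

# Querying a training point: it is its own nearest neighbor (distance 0).
pred_in_sample = knn_predict(X_toy, y_toy, X_toy[0], k=1)

# Querying a new point: the two closest training points are 1.0 and 2.0.
pred_new = knn_predict(X_toy, y_toy, np.array([1.4]), k=2)
print(pred_in_sample, pred_new)
```

When scoring on the training set, every point counts among its own neighbors, so part of each prediction is simply the label being predicted. That advantage disappears on unseen data.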

#### Mean-Squared Error

Another goodness-of-fit metric is the mean squared error.  As you can see, the two models are close on this metric.

In [14]:
sklearn.metrics.mean_squared_error(y_train, linear_regression.predict(X_train))

0.000294029286668637

In [15]:
sklearn.metrics.mean_squared_error(y_train, knn.predict(X_train))

0.00026310669128558063

## Testing Linear Regression and K-Nearest Neighbors

Let's now test the models on the data from 2016 onward.

In [16]:
X_test = df_test.drop(columns=['msft'])[0:len(df_test)-1]
X_test

trade_date,googl,ibm,dia,spy,vix,gbpusd=x,jpy=x,msft_lag_0,msft_lag_1,dexjpus,dexusuk
2016-01-04,-0.023869,-0.012135,-0.015518,-0.013979,2.490002,1.473709,120.310997,-0.012257,-0.014740,-0.001154,-0.005541
2016-01-05,0.002752,-0.000735,0.000583,0.001691,-1.360001,1.471410,119.467003,0.004562,-0.012257,-0.007015,-0.001560
2016-01-06,-0.002889,-0.005006,-0.014294,-0.012614,1.250000,1.467394,119.101997,-0.018165,0.004562,-0.003055,-0.002729
2016-01-07,-0.024140,-0.017090,-0.023559,-0.023991,4.400000,1.462994,118.610001,-0.034783,-0.018165,-0.004131,-0.002999
2016-01-08,-0.013617,-0.009258,-0.010427,-0.010977,2.020000,1.462694,117.540001,0.003067,-0.034783,-0.009021,-0.000205
...,...,...,...,...,...,...,...,...,...,...,...
2021-07-23,0.035769,0.004477,0.006633,0.010288,-0.490000,1.377390,110.139999,0.012336,0.016845,-0.001043,0.004365
2021-07-26,0.007668,0.010117,0.002396,0.002455,0.379999,1.375781,110.543999,-0.002140,0.012336,0.003668,-0.001168
2021-07-27,-0.015929,-0.000140,-0.002248,-0.004558,1.780001,1.382915,110.302002,-0.008684,-0.002140,-0.002189,0.005186
2021-07-28,0.031797,-0.006865,-0.003594,-0.000410,-1.050001,1.388272,109.806000,-0.001117,-0.008684,-0.004497,0.003873


In [17]:
y_test = df_test[['msft']][1:len(df_test)]
y_test

trade_date,msft
2016-01-05,0.004562
2016-01-06,-0.018165
2016-01-07,-0.034783
2016-01-08,0.003067
2016-01-11,-0.000573
...,...
2021-07-26,-0.002140
2021-07-27,-0.008684
2021-07-28,-0.001117
2021-07-29,0.000978


In terms of $R^2$, the `LinearRegression` performs better than KNN on the testing data.

In [18]:
linear_regression.score(X_test, y_test)

0.03710052545948106

In [19]:
knn.score(X_test, y_test)

-0.022694775311824067

On the testing data, the models are again quite similar from a mean squared error perspective.

In [20]:
sklearn.metrics.mean_squared_error(y_test, linear_regression.predict(X_test))

0.00028192086585344305

In [21]:
sklearn.metrics.mean_squared_error(y_test, knn.predict(X_test))

0.0002994279301037974
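
The two metrics are linked by $R^2 = 1 - \text{MSE}/\text{Var}(y)$, so the reported test figures can be cross-checked against each other. The short calculation below plugs in the numbers printed above and backs out the variance of the test labels implied by each model; the variable names are illustrative.

```python
# Test-set figures copied from the outputs above.
r2_linear, mse_linear = 0.03710052545948106, 0.00028192086585344305
r2_knn, mse_knn = -0.022694775311824067, 0.0002994279301037974

# R^2 = 1 - MSE / Var(y)  =>  Var(y) = MSE / (1 - R^2)
var_from_linear = mse_linear / (1.0 - r2_linear)
var_from_knn = mse_knn / (1.0 - r2_knn)

print(f"{var_from_linear:.6e}  {var_from_knn:.6e}")
```

Both models imply the same label variance of about 0.000293, which is a daily standard deviation of roughly 1.7%. Because that variance is so small, every model's MSE sits near 0.0003 and the differences only show up clearly in $R^2$.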

## Neural Network

Let's now fit our first neural network.  We begin by importing some additional packages and functions.

In [22]:
import random
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam


In order to make our results reproducible, we'll use the following user-defined function to set the various seeds for random number generation.  (This doesn't seem to fully work here, although it did in previous chapters; I'm not sure what the difference is.)

In [23]:
def set_seeds(seed=100):
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)

In [24]:
set_seeds()
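
The effect of re-seeding can be verified directly for the Python and NumPy generators. The sketch below leaves TensorFlow out so that it runs anywhere; `set_basic_seeds` is a hypothetical stand-in for `set_seeds`.

```python
import random
import numpy as np

def set_basic_seeds(seed=100):
    # Hypothetical stand-in for set_seeds(), without the TensorFlow call.
    random.seed(seed)
    np.random.seed(seed)

set_basic_seeds()
first = (random.random(), np.random.rand(3))

set_basic_seeds()
second = (random.random(), np.random.rand(3))

print(first[0] == second[0], np.array_equal(first[1], second[1]))
```

Identical draws after re-seeding confirm that these two generators are under control. Any remaining run-to-run variation would then come from TensorFlow itself. Recent TensorFlow versions provide `tf.keras.utils.set_random_seed` and `tf.config.experimental.enable_op_determinism`, which may be worth trying, though whether they are available depends on the installed version.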

Now we can build and compile our model.

In [25]:
model = Sequential()
model.add(Dense(units=128, input_dim=len(X_train.columns), activation='relu'))
model.add(Dense(units=128, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])
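
To make the architecture concrete, here is a minimal NumPy sketch of the forward pass that this three-layer network computes. The weights are random placeholders rather than trained values, and `forward` and `relu` are illustrative helpers; Keras handles initialization and training itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 11  # number of columns in X_train above

# Randomly initialized placeholder weights; Keras would learn these in fit().
W1, b1 = rng.normal(scale=0.1, size=(n_features, 128)), np.zeros(128)
W2, b2 = rng.normal(scale=0.1, size=(128, 128)), np.zeros(128)
W3, b3 = rng.normal(scale=0.1, size=(128, 1)), np.zeros(1)

def relu(z):
    return np.maximum(z, 0.0)

def forward(X):
    h1 = relu(X @ W1 + b1)   # first hidden layer, 128 units
    h2 = relu(h1 @ W2 + b2)  # second hidden layer, 128 units
    return h2 @ W3 + b3      # linear output: one predicted return per row

X_batch = rng.normal(size=(32, n_features))  # one batch of 32 made-up rows
y_hat = forward(X_batch)
n_params = sum(p.size for p in (W1, b1, W2, b2, W3, b3))
print(y_hat.shape, n_params)
```

The network has 18,177 trainable parameters, compared with the 12 (11 coefficients plus an intercept) of the linear regression fitted earlier.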

Let's fit our model.

In [26]:
%%time
model.fit(X_train, y_train, epochs=50, batch_size=32, verbose=0);

CPU times: user 5.76 s, sys: 286 ms, total: 6.04 s
Wall time: 4.23 s


As we can see from our two metrics, this baseline neural network performs better than `LinearRegression` and k-nearest neighbors.

In [27]:
sklearn.metrics.r2_score(y_test, model.predict(X_test))



0.061010003089904785

In [28]:
sklearn.metrics.mean_squared_error(y_test, model.predict(X_test))



0.0002749205786398829

## Normalization

Next, let's normalize our data and refit.

In [29]:
mu = X_train.mean()
std = X_train.std()

In [30]:
X_train_scaled = (X_train - mu) / std
X_test_scaled = (X_test - mu) / std
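
The important detail in this step is that the test features are scaled with the training mean and standard deviation, never their own. The sketch below illustrates this on synthetic data with one return-like column and one price-level-like column. The numbers are made up, and the shift in the level column between the two periods is an assumption chosen to show the effect.

```python
import numpy as np

rng = np.random.default_rng(0)

# Column 0 behaves like a return; column 1 behaves like a price level
# whose average drifts upward between the training and test periods.
X_tr = rng.normal(loc=[0.0, 100.0], scale=[0.01, 5.0], size=(500, 2))
X_te = rng.normal(loc=[0.0, 110.0], scale=[0.01, 5.0], size=(100, 2))

mu = X_tr.mean(axis=0)
std = X_tr.std(axis=0, ddof=1)  # ddof=1 matches the pandas default

X_tr_scaled = (X_tr - mu) / std
X_te_scaled = (X_te - mu) / std  # training statistics reused on the test set

print(X_tr_scaled.mean(axis=0).round(3), X_te_scaled.mean(axis=0).round(3))
```

The scaled training columns are centered at zero by construction, but a level-like column that has drifted is no longer centered in the test set. Since the feature set above still contains the raw `jpy=x` and `gbpusd=x` levels alongside the returns, this kind of drift may be worth keeping in mind when interpreting the results that follow.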

In [31]:
set_seeds()

In [32]:
model = Sequential()
model.add(Dense(units=128, input_dim=len(X_train_scaled.columns), activation='relu'))
model.add(Dense(units=128, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])

In [33]:
%%time
model.fit(X_train_scaled, y_train, epochs=50, batch_size=32, verbose=0);

CPU times: user 5.63 s, sys: 269 ms, total: 5.9 s
Wall time: 4.17 s


This seems to have a drastically negative impact on model performance.  So we won't use normalization as we proceed.

In [34]:
sklearn.metrics.r2_score(y_test, model.predict(X_test_scaled))



-1.2055394649505615

In [35]:
sklearn.metrics.mean_squared_error(y_test, model.predict(X_test_scaled))



0.0006457450377929333

## Dropout

In this section, we apply dropout regularization, which randomly zeroes a fraction of each hidden layer's outputs during training.

In [36]:
from tensorflow.keras.layers import Dropout  # same namespace as the other Keras imports

In [37]:
set_seeds()

In [38]:
model = Sequential()
model.add(Dense(units=128, input_dim=len(X_train.columns), activation='relu'))
model.add(Dropout(rate=0.3, seed=100))
model.add(Dense(units=128, activation='relu'))
model.add(Dropout(rate=0.3, seed=100))
model.add(Dense(1))
model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])
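
What a `Dropout(rate=0.3)` layer does can be sketched in a few lines of NumPy. This is a simplified illustration of inverted dropout, not Keras's actual implementation, and `dropout` is a hypothetical helper.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest."""
    if not training:
        return activations  # at prediction time the layer passes inputs through
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(100)
h = np.ones((1000, 128))  # stand-in for a hidden layer's output

h_train = dropout(h, rate=0.3, rng=rng)
h_infer = dropout(h, rate=0.3, rng=rng, training=False)
print((h_train == 0).mean().round(3), h_train.mean().round(3))
```

About 30% of the units are zeroed on each training pass, and the survivors are scaled by 1/0.7 so that the expected activation is unchanged.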

In [39]:
%%time
model.fit(X_train, y_train, epochs=50, batch_size=32, verbose=0);

CPU times: user 6.34 s, sys: 242 ms, total: 6.58 s
Wall time: 4.66 s


With dropout regularization, the model performs worse on the testing data than linear regression and the baseline neural network, although it still edges out nearest neighbors.

In [40]:
sklearn.metrics.r2_score(y_test, model.predict(X_test))



-0.0013687610626220703

In [41]:
sklearn.metrics.mean_squared_error(y_test, model.predict(X_test))



0.00029318401815150996

## Regularization

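Now let's try `l2` (ridge-style) regularization.  Note that the model below passes `l2` as an `activity_regularizer`, so the penalty applies to each layer's outputs rather than to its weights (which would be `kernel_regularizer`).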

In [42]:
from tensorflow.keras.regularizers import l2  # same namespace as the other Keras imports

In [43]:
set_seeds()

In [44]:
model = Sequential()
model.add(Dense(units=128, input_dim=len(X_train.columns), activation='relu', activity_regularizer=l2(0.0005)))
model.add(Dense(units=128, activation='relu', activity_regularizer=l2(0.0005)))
model.add(Dense(1))
model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])
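
An `l2` penalty adds a multiple of the sum of squares of some quantity to the loss. The sketch below computes that term by hand for a made-up weight matrix and a made-up matrix of layer outputs, to show how the penalized quantity changes depending on where the regularizer is attached. `l2_penalty` is a hypothetical helper, and any additional scaling Keras applies internally is not modeled.

```python
import numpy as np

def l2_penalty(values, lam):
    """lam times the sum of squared entries."""
    return lam * np.sum(np.square(values))

lam = 0.0005  # same coefficient as in the model above

weights = np.array([[0.5, -1.0], [2.0, 0.0]])      # made-up layer weights
activations = np.array([[0.0, 3.0], [1.0, 0.0]])   # made-up ReLU outputs

weight_penalty = l2_penalty(weights, lam)        # what kernel_regularizer targets
activity_penalty = l2_penalty(activations, lam)  # what activity_regularizer targets
print(weight_penalty, activity_penalty)
```

Penalizing weights shrinks the coefficients as in ridge regression, whereas penalizing activations discourages large layer outputs. The two can behave quite differently.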

In [45]:
%%time
model.fit(X_train, y_train, epochs=50, batch_size=32, verbose=0);

CPU times: user 6.05 s, sys: 318 ms, total: 6.37 s
Wall time: 4.31 s


Using `l2` doesn't improve on the baseline neural network, although the model still outperforms linear regression and nearest neighbors on the testing data.

In [46]:
sklearn.metrics.r2_score(y_test, model.predict(X_test))



0.04668128490447998

In [47]:
sklearn.metrics.mean_squared_error(y_test, model.predict(X_test))



0.0002791157822826718

## Classification

Finally, let's recast this as a classification problem where we are simply trying to predict gains and losses.  First we have to change our labels to binary outcomes.

In [48]:
y_train_classification = np.where(y_train['msft'] > 0, 1, 0)
y_test_classification = np.where(y_test['msft'] > 0, 1, 0)

In [49]:
set_seeds()

In [50]:
model = Sequential()
model.add(Dense(units=128, input_dim=len(X_train.columns), activation='relu'))
model.add(Dense(units=64, activation='relu'))
model.add(Dense(units=32, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])

In [51]:
%%time
model.fit(X_train, y_train_classification, epochs=50, batch_size=32, verbose=0);

CPU times: user 6.3 s, sys: 335 ms, total: 6.63 s
Wall time: 4.7 s


The binary classification is right about 47% of the time.

In [52]:
model.evaluate(X_test, y_test_classification)



[0.7166569232940674, 0.4657142758369446]
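
The first number returned by `model.evaluate` is the binary cross-entropy loss, and it has a useful reference point. A classifier that outputs a probability of 0.5 for every day scores $\ln 2 \approx 0.693$ regardless of the labels. The sketch below checks this with a hypothetical `binary_cross_entropy` helper on made-up labels.

```python
import math

def binary_cross_entropy(y_true, p_pred):
    """Mean of -[y*log(p) + (1-y)*log(1-p)], with clipping for numerical safety."""
    eps = 1e-12
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

labels = [1, 0, 1, 1]  # made-up outcomes
coin_flip_loss = binary_cross_entropy(labels, [0.5] * len(labels))

model_test_loss = 0.7166569232940674  # first value from model.evaluate above
print(coin_flip_loss, model_test_loss > coin_flip_loss)
```

The model's test loss is above the coin-flip reference, so its predicted probabilities are, on average, less informative than a constant 0.5.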

Guessing that MSFT will rise every day is right about 55% of the time.

In [53]:
y_test_classification.mean()

0.5514285714285714
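
Comparing against a constant guess is a standard sanity check for any classifier. The sketch below wraps it in a hypothetical helper, `majority_baseline_accuracy`, demonstrates it on made-up labels, and then compares the two accuracy figures reported above.

```python
import numpy as np

def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the more common class."""
    labels = np.asarray(labels)
    up_share = labels.mean()
    return max(up_share, 1.0 - up_share)

toy_labels = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 0])  # 6 up days out of 10
baseline = majority_baseline_accuracy(toy_labels)

# Figures copied from the outputs above.
model_accuracy = 0.4657142758369446
always_up_accuracy = 0.5514285714285714
shortfall = always_up_accuracy - model_accuracy
print(baseline, round(shortfall, 4))
```

The classifier trails the always-up guess by more than eight percentage points on the testing data.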