## Benchmark model

This notebook builds a simple moving average model to serve as a benchmark for my DeepAR-based model (to be tuned in the next notebook).

### Hyperparameters

DeepAR is the model of choice for this project.
This model expects the input data to be already split into train and test sets.
A big part of the model design has to be done by looking closely at the data.
More specifically, it means defining these two data-related hyperparameters:
* Context length
* Prediction length

### Prediction length

This is the length of the forecast horizon, in trading days. A very short horizon would be of little significance.
A longer horizon could be interesting from an application point of view, but it can be challenging in terms of model performance.
The cell below lists candidate horizons of 10, 20 and 50 trading days; the splits in this notebook use 20.

### Context length

Context length can be either:
* designed on patterns or seasonality observed in the data, if any is present;
* chosen as a fixed value. This will be my choice, and it will be the same as the moving average window, so that the same reference metrics apply to both this model and the benchmark model.

To explore this second option, we will refer to what we've found during the EDA stage.

In [138]:
prediction_length = [10, 20, 50]
context_length = [10, 20, 50]
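
To make the two hyperparameters concrete, here is a minimal sketch of how the tail of a series divides into a context window (what the model sees) and a prediction window (what it has to forecast). The helper `last_window` and the toy series are hypothetical and are not part of this project's code.

```python
import numpy as np


def last_window(series, context_length, prediction_length):
    """Return the final context window and the prediction window that follows it."""
    series = np.asarray(series)
    needed = context_length + prediction_length
    if len(series) < needed:
        raise ValueError("series is shorter than context + prediction length")
    # The context is the block of points right before the forecast horizon.
    context = series[-needed:-prediction_length]
    # The target is the horizon itself: the last `prediction_length` points.
    target = series[-prediction_length:]
    return context, target


# Toy series of 100 "prices" (0.0, 1.0, ..., 99.0), for illustration only.
prices = np.arange(100.0)
context, target = last_window(prices, context_length=20, prediction_length=20)
print(len(context), len(target))
```

With equal context and prediction lengths, the model looks back exactly as far as it forecasts ahead, which is also the window the moving average benchmark uses.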

### Train, test and validation split

All time series will be trimmed to start at `data_start`, defined in `1.ExploratoryDataAnalysis.ipy`. According to the DeepAR documentation, the training time series should be the entire time series less the prediction length.
The validation length will be exactly equal to the prediction length.
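
The project's `train_test_valid_split` lives in `data.data_utils` and is not shown here. As a rough sketch of what such a split could look like, the hypothetical `split_by_horizon` below assumes that the validation set is the last `prediction_length` rows, the test set is everything before it, and the training set is the test set less another `prediction_length` rows. That assumption is consistent with the lengths printed in this notebook, but it is not taken from the real implementation.

```python
import pandas as pd


def split_by_horizon(df, prediction_length):
    """Hypothetical stand-in for train_test_valid_split (assumed behaviour)."""
    # Validation: the final horizon, never seen during training or testing.
    valid = df.iloc[-prediction_length:]
    # Test: the whole series up to the validation horizon.
    test = df.iloc[:-prediction_length]
    # Train: the test series with its own last horizon removed.
    train = df.iloc[:-2 * prediction_length]
    return train, test, valid


# Toy frame with the same number of rows as the stock data in this notebook.
dates = pd.bdate_range("2004-08-19", periods=4188)
toy = pd.DataFrame({"Adj Close": range(4188)}, index=dates)

train, test, valid = split_by_horizon(toy, prediction_length=20)
print(len(toy), len(train), len(test), len(valid))
```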

In [139]:
from data.data_utils import train_test_valid_split

#### IBM Stock train, test and validation split

In [140]:
df_ibm = df_ibm.loc[data_start:].copy()

In [141]:
df_ibm_train, df_ibm_test, df_ibm_valid = train_test_valid_split(df_ibm, prediction_length=prediction_length[1])
print(len(df_ibm), len(df_ibm_train), len(df_ibm_test), len(df_ibm_valid))

4188 4148 4168 20


In [142]:
df_ibm_train.head()

Date,Adj Close,Close,High,Low,Open,Volume,Dividends,Stock Splits,10_ac_ma,20_ac_ma,50_ac_ma,10_ac_bb_u,10_ac_bb_l,20_ac_bb_u,20_ac_bb_l,50_ac_bb_u,50_ac_bb_l
2004-08-19,54.979382,84.889999,85.349998,84.449997,84.75,4704500,0.18,0.0,54.397126,54.958798,55.913001,55.525647,53.268605,56.500281,53.417314,59.046566,52.779437
2004-08-20,55.212502,85.25,85.25,84.519997,84.519997,4501400,0.18,0.0,54.511759,54.977556,55.852774,55.720874,53.302644,56.521944,53.433168,58.920109,52.785438
2004-08-23,54.82391,84.650002,85.449997,84.650002,85.230003,4260600,0.18,0.0,54.582999,54.969128,55.779992,55.771068,53.39493,56.515013,53.423244,58.766151,52.793832
2004-08-24,54.862782,84.709999,85.150002,84.349998,85.0,2710400,0.18,0.0,54.564866,54.937116,55.713028,55.726906,53.402826,56.462835,53.411398,58.625949,52.800108
2004-08-25,55.095955,85.07,85.269997,84.550003,85.0,4405600,0.18,0.0,54.654243,54.917732,55.644653,55.829713,53.478772,56.42394,53.411524,58.447571,52.841736


In [143]:
df_ibm_test.head()

Date,Adj Close,Close,High,Low,Open,Volume,Dividends,Stock Splits,10_ac_ma,20_ac_ma,50_ac_ma,10_ac_bb_u,10_ac_bb_l,20_ac_bb_u,20_ac_bb_l,50_ac_bb_u,50_ac_bb_l
2004-08-19,54.979382,84.889999,85.349998,84.449997,84.75,4704500,0.18,0.0,54.397126,54.958798,55.913001,55.525647,53.268605,56.500281,53.417314,59.046566,52.779437
2004-08-20,55.212502,85.25,85.25,84.519997,84.519997,4501400,0.18,0.0,54.511759,54.977556,55.852774,55.720874,53.302644,56.521944,53.433168,58.920109,52.785438
2004-08-23,54.82391,84.650002,85.449997,84.650002,85.230003,4260600,0.18,0.0,54.582999,54.969128,55.779992,55.771068,53.39493,56.515013,53.423244,58.766151,52.793832
2004-08-24,54.862782,84.709999,85.150002,84.349998,85.0,2710400,0.18,0.0,54.564866,54.937116,55.713028,55.726906,53.402826,56.462835,53.411398,58.625949,52.800108
2004-08-25,55.095955,85.07,85.269997,84.550003,85.0,4405600,0.18,0.0,54.654243,54.917732,55.644653,55.829713,53.478772,56.42394,53.411524,58.447571,52.841736


In [144]:
df_ibm_valid.head()

Date,Adj Close,Close,High,Low,Open,Volume,Dividends,Stock Splits,10_ac_ma,20_ac_ma,50_ac_ma,10_ac_bb_u,10_ac_bb_l,20_ac_bb_u,20_ac_bb_l,50_ac_bb_u,50_ac_bb_l
2021-03-11,127.139999,127.139999,128.639999,126.779999,128.089996,5145000,,,122.93,121.901001,122.538418,129.00611,116.85389,126.870732,116.931269,129.440326,115.63651
2021-03-12,127.610001,127.610001,127.68,126.610001,127.190002,4009600,,,123.798,122.236,122.647268,129.814133,117.781868,127.793029,116.678972,129.695417,115.599118
2021-03-15,128.580002,128.580002,128.75,127.540001,127.769997,3420600,,,124.582001,122.625,122.76486,130.864455,118.299547,128.812263,116.437738,130.010065,115.519656
2021-03-16,128.240005,128.240005,128.520004,127.339996,128.279999,4630400,,,125.373001,123.033501,122.845259,131.255195,119.490808,129.578942,116.488059,130.243966,115.446553
2021-03-17,129.029999,129.029999,129.490005,127.489998,128.460007,4244800,,,126.040001,123.486501,122.979746,131.916395,120.163607,130.383825,116.589177,130.580128,115.379365


#### Apple Inc. Stock train, test and validation split

In [145]:
df_aapl = df_aapl.loc[data_start:].copy()

In [146]:
df_aapl_train, df_aapl_test, df_aapl_valid = train_test_valid_split(df_aapl, prediction_length=prediction_length[1])
print(len(df_aapl), len(df_aapl_train), len(df_aapl_test), len(df_aapl_valid))

4188 4148 4168 20


In [147]:
df_aapl_train.head()

Date,Adj Close,Close,High,Low,Open,Volume,Dividends,Stock Splits,10_ac_ma,20_ac_ma,50_ac_ma,10_ac_bb_u,10_ac_bb_l,20_ac_bb_u,20_ac_bb_l,50_ac_bb_u,50_ac_bb_l
2004-08-19,0.472366,0.548393,0.568929,0.542143,0.562679,388920000.0,0.0,2.0,0.473627,0.481141,0.484422,0.491197,0.456057,0.504857,0.457425,0.519835,0.449008
2004-08-20,0.47375,0.55,0.553393,0.544464,0.548393,316780800.0,0.0,2.0,0.475196,0.481218,0.484606,0.488983,0.461409,0.504821,0.457614,0.519691,0.449521
2004-08-23,0.478057,0.555,0.558393,0.546429,0.551071,254660000.0,0.0,2.0,0.476395,0.481079,0.484711,0.488652,0.464139,0.504725,0.457433,0.519683,0.449738
2004-08-24,0.491439,0.570536,0.570536,0.556964,0.558214,374136000.0,0.0,2.0,0.477057,0.48071,0.485274,0.491797,0.462317,0.503401,0.458019,0.519742,0.450806
2004-08-25,0.508359,0.590179,0.591964,0.566607,0.569107,505618400.0,0.0,2.0,0.480195,0.48131,0.486,0.504872,0.455517,0.506265,0.456355,0.520858,0.451141


In [148]:
df_aapl_test.head()

Date,Adj Close,Close,High,Low,Open,Volume,Dividends,Stock Splits,10_ac_ma,20_ac_ma,50_ac_ma,10_ac_bb_u,10_ac_bb_l,20_ac_bb_u,20_ac_bb_l,50_ac_bb_u,50_ac_bb_l
2004-08-19,0.472366,0.548393,0.568929,0.542143,0.562679,388920000.0,0.0,2.0,0.473627,0.481141,0.484422,0.491197,0.456057,0.504857,0.457425,0.519835,0.449008
2004-08-20,0.47375,0.55,0.553393,0.544464,0.548393,316780800.0,0.0,2.0,0.475196,0.481218,0.484606,0.488983,0.461409,0.504821,0.457614,0.519691,0.449521
2004-08-23,0.478057,0.555,0.558393,0.546429,0.551071,254660000.0,0.0,2.0,0.476395,0.481079,0.484711,0.488652,0.464139,0.504725,0.457433,0.519683,0.449738
2004-08-24,0.491439,0.570536,0.570536,0.556964,0.558214,374136000.0,0.0,2.0,0.477057,0.48071,0.485274,0.491797,0.462317,0.503401,0.458019,0.519742,0.450806
2004-08-25,0.508359,0.590179,0.591964,0.566607,0.569107,505618400.0,0.0,2.0,0.480195,0.48131,0.486,0.504872,0.455517,0.506265,0.456355,0.520858,0.451141


In [149]:
df_aapl_valid.head()

Date,Adj Close,Close,High,Low,Open,Volume,Dividends,Stock Splits,10_ac_ma,20_ac_ma,50_ac_ma,10_ac_bb_u,10_ac_bb_l,20_ac_bb_u,20_ac_bb_l,50_ac_bb_u,50_ac_bb_l
2021-03-11,121.959999,121.959999,123.209999,121.260002,122.540001,102753600.0,,,121.717,125.474,130.376266,127.811144,115.622856,136.351576,114.596423,143.001325,117.751208
2021-03-12,121.029999,121.029999,121.169998,119.160004,120.400002,87963400.0,,,121.693999,124.768999,130.103491,127.797538,115.590461,134.806803,114.731196,142.93763,117.269352
2021-03-15,123.989998,123.989998,124.0,120.419998,121.410004,92403800.0,,,121.313999,124.2,129.912881,126.050901,116.577097,132.909929,115.49007,142.82275,117.003013
2021-03-16,125.57,125.57,127.220001,124.720001,125.699997,114740000.0,,,121.358999,123.818999,129.774441,126.262225,116.455772,131.476156,116.161843,142.71983,116.829052
2021-03-17,124.760002,124.760002,125.860001,122.339996,124.050003,111437500.0,,,121.628999,123.515,129.685303,126.980638,116.27736,130.446924,116.583076,142.707511,116.663094


#### Amazon Stock train, test and validation split

In [150]:
df_amzn = df_amzn.loc[data_start:].copy()

In [151]:
df_amzn_train, df_amzn_test, df_amzn_valid = train_test_valid_split(df_amzn, prediction_length=prediction_length[1])
print(len(df_amzn), len(df_amzn_train), len(df_amzn_test), len(df_amzn_valid))

4188 4148 4168 20


In [152]:
df_amzn_train.head()

Date,Adj Close,Close,High,Low,Open,Volume,Dividends,Stock Splits,10_ac_ma,20_ac_ma,50_ac_ma,10_ac_bb_u,10_ac_bb_l,20_ac_bb_u,20_ac_bb_l,50_ac_bb_u,50_ac_bb_l
2004-08-19,38.630001,38.630001,40.48,36.349998,40.259998,12696100.0,,,37.106001,37.652,44.9828,39.862539,34.349462,40.417963,34.886038,57.693587,32.272014
2004-08-20,39.509998,39.509998,39.91,38.110001,38.459999,6790800.0,,,37.508,37.6285,44.7682,40.386916,34.629084,40.318115,34.938885,57.479021,32.057379
2004-08-23,39.450001,39.450001,40.0,39.110001,39.889999,5532600.0,,,37.921,37.662,44.5584,40.58155,35.26045,40.42762,34.89638,57.267085,31.849715
2004-08-24,39.049999,39.049999,39.93,38.32,39.720001,7640400.0,,,38.116,37.659,44.3544,40.794891,35.43711,40.418129,34.899871,57.083144,31.625656
2004-08-25,40.299999,40.299999,40.490002,38.16,39.060001,7254800.0,,,38.49,37.7755,44.1582,41.246567,35.733433,40.776113,34.774887,56.827117,31.489283


In [153]:
df_amzn_test.head()

Date,Adj Close,Close,High,Low,Open,Volume,Dividends,Stock Splits,10_ac_ma,20_ac_ma,50_ac_ma,10_ac_bb_u,10_ac_bb_l,20_ac_bb_u,20_ac_bb_l,50_ac_bb_u,50_ac_bb_l
2004-08-19,38.630001,38.630001,40.48,36.349998,40.259998,12696100.0,,,37.106001,37.652,44.9828,39.862539,34.349462,40.417963,34.886038,57.693587,32.272014
2004-08-20,39.509998,39.509998,39.91,38.110001,38.459999,6790800.0,,,37.508,37.6285,44.7682,40.386916,34.629084,40.318115,34.938885,57.479021,32.057379
2004-08-23,39.450001,39.450001,40.0,39.110001,39.889999,5532600.0,,,37.921,37.662,44.5584,40.58155,35.26045,40.42762,34.89638,57.267085,31.849715
2004-08-24,39.049999,39.049999,39.93,38.32,39.720001,7640400.0,,,38.116,37.659,44.3544,40.794891,35.43711,40.418129,34.899871,57.083144,31.625656
2004-08-25,40.299999,40.299999,40.490002,38.16,39.060001,7254800.0,,,38.49,37.7755,44.1582,41.246567,35.733433,40.776113,34.774887,56.827117,31.489283


In [154]:
df_amzn_valid.head()

Date,Adj Close,Close,High,Low,Open,Volume,Dividends,Stock Splits,10_ac_ma,20_ac_ma,50_ac_ma,10_ac_bb_u,10_ac_bb_l,20_ac_bb_u,20_ac_bb_l,50_ac_bb_u,50_ac_bb_l
2021-03-11,3113.590088,3113.590088,3131.780029,3082.929932,3104.01001,2770800.0,,,3050.265991,3139.507471,3201.778574,3177.936381,2922.595602,3371.442633,2907.572308,3420.540848,2983.0163
2021-03-12,3089.48999,3089.48999,3098.97998,3045.5,3075.0,2418500.0,,,3049.921997,3130.875476,3197.128374,3177.099067,2922.744928,3356.355728,2905.395224,3415.344096,2978.912652
2021-03-15,3081.679932,3081.679932,3082.23999,3032.090088,3074.570068,2913600.0,,,3043.476001,3121.073975,3193.044971,3154.48489,2932.467112,3336.497587,2905.650362,3412.123715,2973.966226
2021-03-16,3091.860107,3091.860107,3128.909912,3075.860107,3104.969971,2510100.0,,,3043.209009,3112.219482,3189.743574,3153.683712,2932.734305,3316.31076,2908.128205,3409.865425,2969.621724
2021-03-17,3135.72998,3135.72998,3173.050049,3070.219971,3073.219971,3100900.0,,,3056.282007,3103.573987,3188.725576,3177.115509,2935.448505,3286.146141,2921.001832,3409.37636,2968.074792


#### Alphabet Inc. Stock train, test and validation split

In [155]:
df_googl = df_googl.loc[data_start:].copy()

In [156]:
df_googl_train, df_googl_test, df_googl_valid = train_test_valid_split(df_googl, prediction_length=prediction_length[1])
print(len(df_googl), len(df_googl_train), len(df_googl_test), len(df_googl_valid))

4188 4148 4168 20


In [157]:
df_googl_train.head()

Date,Adj Close,Close,High,Low,Open,Volume,Dividends,Stock Splits,10_ac_ma,20_ac_ma,50_ac_ma,10_ac_bb_u,10_ac_bb_l,20_ac_bb_u,20_ac_bb_l,50_ac_bb_u,50_ac_bb_l
2004-08-19,50.220219,50.220219,52.082081,48.028027,50.050049,44659096.0,0.0,1.998,52.432933,52.692943,63.586987,55.781053,49.084813,56.933168,48.452718,88.331593,38.842381
2004-08-20,54.209209,54.209209,54.594597,50.300301,50.555557,22834343.0,0.0,1.998,52.432933,52.692943,63.586987,55.781053,49.084813,56.933168,48.452718,88.331593,38.842381
2004-08-23,54.754753,54.754753,56.796799,54.579578,55.430431,18256126.0,0.0,1.998,52.432933,52.692943,63.586987,55.781053,49.084813,56.933168,48.452718,88.331593,38.842381
2004-08-24,52.487488,52.487488,55.855858,51.836838,55.675674,15247337.0,0.0,1.998,52.432933,52.692943,63.586987,55.781053,49.084813,56.933168,48.452718,88.331593,38.842381
2004-08-25,53.053055,53.053055,54.054054,51.991993,52.532532,9188602.0,0.0,1.998,52.432933,52.692943,63.586987,55.781053,49.084813,56.933168,48.452718,88.331593,38.842381


In [158]:
df_googl_test.head()

Date,Adj Close,Close,High,Low,Open,Volume,Dividends,Stock Splits,10_ac_ma,20_ac_ma,50_ac_ma,10_ac_bb_u,10_ac_bb_l,20_ac_bb_u,20_ac_bb_l,50_ac_bb_u,50_ac_bb_l
2004-08-19,50.220219,50.220219,52.082081,48.028027,50.050049,44659096.0,0.0,1.998,52.432933,52.692943,63.586987,55.781053,49.084813,56.933168,48.452718,88.331593,38.842381
2004-08-20,54.209209,54.209209,54.594597,50.300301,50.555557,22834343.0,0.0,1.998,52.432933,52.692943,63.586987,55.781053,49.084813,56.933168,48.452718,88.331593,38.842381
2004-08-23,54.754753,54.754753,56.796799,54.579578,55.430431,18256126.0,0.0,1.998,52.432933,52.692943,63.586987,55.781053,49.084813,56.933168,48.452718,88.331593,38.842381
2004-08-24,52.487488,52.487488,55.855858,51.836838,55.675674,15247337.0,0.0,1.998,52.432933,52.692943,63.586987,55.781053,49.084813,56.933168,48.452718,88.331593,38.842381
2004-08-25,53.053055,53.053055,54.054054,51.991993,52.532532,9188602.0,0.0,1.998,52.432933,52.692943,63.586987,55.781053,49.084813,56.933168,48.452718,88.331593,38.842381


In [159]:
df_googl_valid.head()

Date,Adj Close,Close,High,Low,Open,Volume,Dividends,Stock Splits,10_ac_ma,20_ac_ma,50_ac_ma,10_ac_bb_u,10_ac_bb_l,20_ac_bb_u,20_ac_bb_l,50_ac_bb_u,50_ac_bb_l
2021-03-11,2100.540039,2100.540039,2111.27002,2056.449951,2058.219971,1384200.0,,,2048.305005,2065.24552,1940.304001,2114.850599,1981.759411,2136.785671,1993.705369,2227.839478,1652.768525
2021-03-12,2050.0,2050.0,2077.610107,2032.420044,2076.409912,1690000.0,,,2051.114001,2063.30802,1946.148801,2115.027085,1987.200918,2134.264426,1992.351614,2230.40103,1661.896572
2021-03-15,2054.439941,2054.439941,2054.98999,2027.790039,2044.97998,1308400.0,,,2049.592004,2061.278516,1952.5126,2112.254882,1986.929127,2130.720402,1991.836629,2231.788099,1673.237101
2021-03-16,2083.889893,2083.889893,2113.679932,2059.290039,2065.98999,1592800.0,,,2051.532996,2059.938013,1959.137598,2117.367638,1985.698353,2126.331087,1993.544939,2234.752349,1683.522846
2021-03-17,2082.219971,2082.219971,2099.0,2044.119995,2068.469971,1292400.0,,,2058.613989,2058.118005,1966.259397,2120.374761,1996.853218,2119.54805,1996.687961,2235.631077,1696.887717


### Metrics computation

Computing metrics on the benchmark model will give me a good reference to evaluate the deep learning model after training.

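As can be presumed, the first `n-1` values of an `n`-day moving average computed on the validation data alone would be `NaN`, so they should be excluded from the validation set to keep misleading values out of the metrics evaluation (the commented-out cells below slice from `n-1` for this reason). In the data printed above, the moving average columns of the validation set contain no `NaN` values, so the active cells use the full validation set.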
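
A small sketch of where those `NaN` values come from, assuming the moving average is computed with pandas' `rolling(...).mean()` (the column names in this notebook suggest a rolling window, but the actual feature code is not shown here). The toy prices are made up.

```python
import pandas as pd

n = 5  # toy window, shorter than the 10/20/50 used in the notebook
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0])

# A window of n points is incomplete for the first n-1 rows, so they are NaN.
ma = prices.rolling(window=n).mean()
n_missing = int(ma.isna().sum())
print(n_missing)

# Drop the incomplete rows from both series before computing any metric,
# so that targets and predictions stay aligned.
aligned = pd.DataFrame({"price": prices, "ma": ma}).dropna()
print(len(aligned))
```

Dropping the rows from a shared frame, rather than from each series separately, keeps the two inputs of every metric the same length.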

In [160]:
from sklearn.metrics import mean_absolute_error, mean_squared_error, mean_absolute_percentage_error, r2_score
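
For reference, the four metrics imported above follow the standard definitions below. The notation is introduced here and is not taken from the notebook: $y_t$ is the adjusted close on day $t$, $\hat{y}_t$ the moving average value on the same day, $\bar{y}$ the mean of $y_t$ over the validation window, and $N$ the number of validation points.

$$
\begin{aligned}
\mathrm{MAE} &= \frac{1}{N}\sum_{t=1}^{N}\lvert y_t-\hat{y}_t\rvert \\
\mathrm{RMSE} &= \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(y_t-\hat{y}_t\right)^2} \\
\mathrm{MAPE} &= \frac{1}{N}\sum_{t=1}^{N}\frac{\lvert y_t-\hat{y}_t\rvert}{\lvert y_t\rvert} \\
R^2 &= 1-\frac{\sum_{t=1}^{N}\left(y_t-\hat{y}_t\right)^2}{\sum_{t=1}^{N}\left(y_t-\bar{y}\right)^2}
\end{aligned}
$$

Note that scikit-learn returns MAPE as a fraction, so a value of 0.04 corresponds to 4%.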

#### IBM stock

In [161]:
#n = 10
#ma_str = str(n)+'_ac_ma'

In [162]:
#ibm_ma_mse_loss = mean_squared_error(df_ibm_valid.iloc[n-1:]['Adj Close'], df_ibm_valid.iloc[n-1:][ma_str])

In [163]:
#print(ibm_ma_mse_loss)

In [164]:
n = 20
ma_str = str(n)+'_ac_ma'

##### Mean Absolute Error

In [165]:
ibm_ma_mae_loss = mean_absolute_error(df_ibm_valid.iloc[:]['Adj Close'], df_ibm_valid[:][ma_str])

In [166]:
print(ibm_ma_mae_loss)

5.286773891448974


##### Root Mean Squared Error

In [167]:
ibm_ma_rmse_loss = mean_squared_error(df_ibm_valid.iloc[:]['Adj Close'], df_ibm_valid[:][ma_str], squared=False)

In [168]:
print(ibm_ma_rmse_loss)

5.496420822131089


##### Mean Absolute Percentage Error

In [169]:
ibm_ma_map_loss = mean_absolute_percentage_error(df_ibm_valid.iloc[:]['Adj Close'], df_ibm_valid[:][ma_str])

In [170]:
print(ibm_ma_map_loss)

0.04008156845030956


##### R<sup>2</sup> score

In [171]:
ibm_ma_r2_score = r2_score(df_ibm_valid.iloc[:]['Adj Close'], df_ibm_valid[:][ma_str])

In [172]:
print(ibm_ma_r2_score)

-2.350974874566425
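
A negative R<sup>2</sup> like the one above means that, on this window, the moving average fits the data worse than a constant prediction equal to the mean of the validation targets would. The sketch below reproduces the effect on made-up numbers with a hand-written `r2` helper (hypothetical, written to follow the usual definition rather than copied from scikit-learn): a forecast with the right shape but a constant offset still scores below zero.

```python
import numpy as np


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


y_true = np.array([10.0, 11.0, 12.0, 13.0])

# A forecast that tracks the trend but lags behind by a constant 2.0.
lagging = y_true - 2.0
# The trivial baseline R^2 compares against: the mean of the targets.
mean_only = np.full(len(y_true), y_true.mean())

r2_lagging = r2(y_true, lagging)
r2_mean = r2(y_true, mean_only)
print(r2_lagging, r2_mean)
```

A moving average lags a trending price in a similar way, which is one plausible reason for scores at or below zero over a short 20-day window.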


In [173]:
#n = 50
#ma_str = str(n)+'_ac_ma'

In [174]:
#ibm_ma_mse_loss = mean_squared_error(df_ibm_valid.iloc[n-1:]['Adj Close'], df_ibm_valid[n-1:][ma_str])

In [175]:
#print(ibm_ma_mse_loss)

#### Apple Inc. stock

In [176]:
#n = 10
#ma_str = str(n)+'_ac_ma'

In [177]:
#aapl_ma_mse_loss = mean_squared_error(df_aapl_valid.iloc[n-1:]['Adj Close'], df_aapl_valid.iloc[n-1:][ma_str])

In [178]:
#print(aapl_ma_mse_loss)

In [179]:
n = 20
ma_str = str(n)+'_ac_ma'

##### Mean Absolute Error

In [180]:
aapl_ma_mae_loss = mean_absolute_error(df_aapl_valid.iloc[:]['Adj Close'], df_aapl_valid[:][ma_str])

In [181]:
print(aapl_ma_mae_loss)

2.2630758857727047


##### Root Mean Squared Error

In [182]:
aapl_ma_rmse_loss = mean_squared_error(df_aapl_valid.iloc[:]['Adj Close'], df_aapl_valid[:][ma_str], squared=False)

In [183]:
print(aapl_ma_rmse_loss)

2.886362190246315


##### Mean Absolute Percentage Error

In [184]:
aapl_ma_map_loss = mean_absolute_percentage_error(df_aapl_valid.iloc[:]['Adj Close'], df_aapl_valid[:][ma_str])

In [185]:
print(aapl_ma_map_loss)

0.018194131528907812


##### R<sup>2</sup> score

In [186]:
aapl_ma_r2_score = r2_score(df_aapl_valid.iloc[:]['Adj Close'], df_aapl_valid[:][ma_str])

In [187]:
print(aapl_ma_r2_score)

-0.07186631604828109


In [188]:
#n = 50
#ma_str = str(n)+'_ac_ma'

In [189]:
#aapl_ma_mse_loss = mean_squared_error(df_aapl_valid.iloc[n-1:]['Adj Close'], df_aapl_valid[n-1:][ma_str])

In [190]:
#print(aapl_ma_mse_loss)

#### Amazon.com stock

In [191]:
#n = 10
#ma_str = str(n)+'_ac_ma'

In [192]:
#amzn_ma_mse_loss = mean_squared_error(df_amzn_valid.iloc[n-1:]['Adj Close'], df_amzn_valid.iloc[n-1:][ma_str])

In [193]:
#print(amzn_ma_mse_loss)

In [194]:
n = 20
ma_str = str(n)+'_ac_ma'

##### Mean Absolute Error

In [195]:
amzn_ma_mae_loss = mean_absolute_error(df_amzn_valid.iloc[:]['Adj Close'], df_amzn_valid[:][ma_str])

In [196]:
print(amzn_ma_mae_loss)

55.70976806640624


##### Root Mean Squared Error

In [197]:
amzn_ma_rmse_loss = mean_squared_error(df_amzn_valid.iloc[:]['Adj Close'], df_amzn_valid[:][ma_str], squared=False)

In [198]:
print(amzn_ma_rmse_loss)

76.28737528727864


##### Mean Absolute Percentage Error

In [199]:
amzn_ma_map_loss = mean_absolute_percentage_error(df_amzn_valid.iloc[:]['Adj Close'], df_amzn_valid[:][ma_str])

In [200]:
print(amzn_ma_map_loss)

0.017491064096042102


##### R<sup>2</sup> score

In [201]:
amzn_ma_r2_score = r2_score(df_amzn_valid.iloc[:]['Adj Close'], df_amzn_valid[:][ma_str])

In [202]:
print(amzn_ma_r2_score)

-0.03974293382322225


In [203]:
#n = 50
#ma_str = str(n)+'_ac_ma'

In [204]:
#amzn_ma_mse_loss = mean_squared_error(df_amzn_valid.iloc[n-1:]['Adj Close'], df_amzn_valid[n-1:][ma_str])

In [205]:
#print(amzn_ma_mse_loss)

#### Alphabet Inc. stock

In [206]:
#n = 10
#ma_str = str(n)+'_ac_ma'

In [207]:
#googl_ma_mse_loss = mean_squared_error(df_googl_valid.iloc[n-1:]['Adj Close'], df_googl_valid.iloc[n-1:][ma_str])

In [208]:
#print(googl_ma_mse_loss)

In [209]:
n = 20
ma_str = str(n)+'_ac_ma'

##### Mean Absolute Error

In [210]:
googl_ma_mae_loss = mean_absolute_error(df_googl_valid.iloc[:]['Adj Close'], df_googl_valid[:][ma_str])

In [211]:
print(googl_ma_mae_loss)

47.56863250732422


##### Root Mean Squared Error

In [212]:
googl_ma_rmse_loss = mean_squared_error(df_googl_valid.iloc[:]['Adj Close'], df_googl_valid[:][ma_str], squared=False)

In [213]:
print(googl_ma_rmse_loss)

73.62119287083105


##### Mean Absolute Percentage Error

In [214]:
googl_ma_map_loss = mean_absolute_percentage_error(df_googl_valid.iloc[:]['Adj Close'], df_googl_valid[:][ma_str])

In [215]:
print(googl_ma_map_loss)

0.021909751965629425


##### R<sup>2</sup> score

In [216]:
googl_ma_r2_score = r2_score(df_googl_valid.iloc[:]['Adj Close'], df_googl_valid[:][ma_str])

In [217]:
print(googl_ma_r2_score)

0.03824285908737435


In [218]:
#n = 50
#ma_str = str(n)+'_ac_ma'

In [219]:
#googl_ma_mse_loss = mean_squared_error(df_googl_valid.iloc[n-1:]['Adj Close'], df_googl_valid[n-1:][ma_str])

In [220]:
#print(googl_ma_mse_loss)

#### Volatility

In [221]:
print(volatility(df_ibm_valid['Adj Close'], n))

0.47449886713725287


In [222]:
print(volatility(df_aapl_valid['Adj Close'], n))

0.40907922979750494


In [223]:
print(volatility(df_amzn_valid['Adj Close'], n))

294.5952685896044


In [224]:
print(volatility(df_googl_valid['Adj Close'], n))

296.6106082052079


As expected, the loss increases as the moving average is computed on larger windows.
We can also observe that Amazon.com and Alphabet Inc. have greater absolute losses (MAE and RMSE), which also corresponds to their higher volatility.
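
To compare the four stocks side by side, the sketch below collects the 20-day results printed above (rounded to four decimals) into a small table and ranks the tickers by RMSE and by MAPE. The `results` frame is only an illustration and is not an object used elsewhere in this notebook.

```python
import pandas as pd

# Values copied (rounded) from the metric outputs printed in this notebook.
results = pd.DataFrame(
    {
        "MAE": [5.2868, 2.2631, 55.7098, 47.5686],
        "RMSE": [5.4964, 2.8864, 76.2874, 73.6212],
        "MAPE": [0.0401, 0.0182, 0.0175, 0.0219],
    },
    index=["IBM", "AAPL", "AMZN", "GOOGL"],
)

# Ranking by an absolute metric is dominated by the price level of each stock.
by_rmse = results.sort_values("RMSE", ascending=False).index.tolist()
# Ranking by a relative metric removes the price scale.
by_mape = results.sort_values("MAPE", ascending=False).index.tolist()

print(by_rmse)
print(by_mape)
```

The two rankings differ, so when comparing across tickers a relative metric such as MAPE is the more informative one, while MAE and RMSE remain useful for comparing models on the same ticker.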

Now I'll initialize a list of moving average series on the validation data, to be used in future model comparison:

In [225]:
df_valid_ma = [df_ibm_valid[:][ma_str], df_aapl_valid[:][ma_str], df_amzn_valid[:][ma_str], df_googl_valid[:][ma_str]]