
# 🛠 10. Time series fundamentals and Milestone Project 3: BitPredict 💰📈 Exercise Solutions 

1. Does scaling the data help for univariate/multivariate data? (e.g. getting all of the values between 0 & 1)
    * Try doing this for a univariate model (e.g. model_1) and a multivariate model (e.g. model_6) and see if it affects model training or evaluation results.
2. Get the most up-to-date data on Bitcoin, train a model & see how it goes (our data goes up to May 18 2021).
    * You can download the Bitcoin historical data for free from [coindesk.com/price/bitcoin](https://www.coindesk.com/price/bitcoin) by clicking “Export Data” -> “CSV”.
3. For most of our models we used WINDOW_SIZE=7, but is there a better window size?
    * Set up a series of experiments to find whether or not there’s a better window size.
    * For example, you might train 10 different models with HORIZON=1 but with window sizes ranging from 2-12.
4. Create a windowed dataset just like the ones we used for model_1 using [tf.keras.preprocessing.timeseries_dataset_from_array()](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/timeseries_dataset_from_array) and retrain model_1 using the recreated dataset.
5. For our multivariate modelling experiment, we added the Bitcoin block reward size as an extra feature to make our time series multivariate.
    * Are there any other features you think you could add?
    * If so, try them out. How do they affect the model?
6. Make prediction intervals for future forecasts. To do so, one way would be to train an ensemble model on all of the data, make future forecasts with it and calculate the prediction intervals of the ensemble just like we did for model_8.
7. For future predictions, try retraining a model every time a new prediction is made: make a prediction, retrain a model on the predictions, make another prediction, and so on. Plot the results. How do they look compared to the future predictions where a model wasn’t retrained for every forecast (model_9)?
8. Throughout this notebook, we’ve only tried algorithms we’ve handcrafted ourselves. But it’s worth seeing how a purpose-built forecasting algorithm goes.
    * Try out one of the extra algorithms listed in the modelling experiments part, such as:
        * [Facebook’s Kats library](https://github.com/facebookresearch/Kats) - there are many models in here, remember the machine learning practitioner’s motto: experiment, experiment, experiment.
        * [LinkedIn’s Greykite library](https://github.com/linkedin/greykite)


## Downloading the data and preprocessing it 

In [1]:
# Download Bitcoin historical data from GitHub 
!wget https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/BTC_USD_2013-10-01_2021-05-18-CoinDesk.csv

# Import with pandas 
import pandas as pd
df = pd.read_csv("/content/BTC_USD_2013-10-01_2021-05-18-CoinDesk.csv", 
                 parse_dates=["Date"], 
                 index_col=["Date"]) # parse the date column (tell pandas column 1 is a datetime)
df.head()

--2021-09-11 06:09:45--  https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/BTC_USD_2013-10-01_2021-05-18-CoinDesk.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 178509 (174K) [text/plain]
Saving to: ‘BTC_USD_2013-10-01_2021-05-18-CoinDesk.csv’


2021-09-11 06:09:46 (73.5 MB/s) - ‘BTC_USD_2013-10-01_2021-05-18-CoinDesk.csv’ saved [178509/178509]



Date,Currency,Closing Price (USD),24h Open (USD),24h High (USD),24h Low (USD)
2013-10-01,BTC,123.65499,124.30466,124.75166,122.56349
2013-10-02,BTC,125.455,123.65499,125.7585,123.63383
2013-10-03,BTC,108.58483,125.455,125.66566,83.32833
2013-10-04,BTC,118.67466,108.58483,118.675,107.05816
2013-10-05,BTC,121.33866,118.67466,121.93633,118.00566


In [2]:
# Only want closing price for each day 
bitcoin_prices = pd.DataFrame(df["Closing Price (USD)"]).rename(columns={"Closing Price (USD)": "Price"})
bitcoin_prices.head() , bitcoin_prices.shape

(                Price
 Date                 
 2013-10-01  123.65499
 2013-10-02  125.45500
 2013-10-03  108.58483
 2013-10-04  118.67466
 2013-10-05  121.33866, (2787, 1))

In [3]:
# Get the data in array 
import numpy as np
import tensorflow as tf 
from tensorflow.keras import layers

timesteps = bitcoin_prices.index.to_numpy()
prices = bitcoin_prices['Price'].to_numpy()

# Instantiating the sklearn MinMaxScaler
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler() 

In [4]:
# Create function to view NumPy arrays as windows 

def get_labelled_windows(x , horizon):
  return x[:, :-horizon] ,x[: , -horizon:]


def make_windows_scaled(x, window_size=7, horizon=1):
  """
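  Turns a 1D array into a 2D array of sequential windows of window_size, after min-max scaling the values to [0, 1].
  Note: the scaler is fit on the whole series, so test-period values influence the scaling.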
  """
  scaler.fit(np.expand_dims(x , axis =1))
  scaled_x = scaler.transform(np.expand_dims(x , axis = 1))
  scaled_x = np.squeeze(scaled_x)
  
  window_step = np.expand_dims(np.arange(window_size+horizon), axis=0)
  window_indexes = window_step + np.expand_dims(np.arange(len(scaled_x)-(window_size+horizon-1)), axis=0).T # create 2D array of windows of size window_size
  windowed_array = scaled_x[window_indexes]
  windows, labels = get_labelled_windows(windowed_array, horizon=horizon)

  return windows, labels


# Make the splits 
def make_train_test_splits(windows , labels , test_split = 0.2):
  split_size = int(len(windows) * (1 - test_split))
  train_windows = windows[:split_size]
  train_labels = labels[:split_size]
  test_windows = windows[split_size:]
  test_labels = labels[split_size:]

  return train_windows ,  test_windows ,train_labels,  test_labels
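
The index arithmetic inside `make_windows_scaled` can be hard to picture. The sketch below uses a hypothetical toy array `toy` instead of the Bitcoin prices and leaves out the scaling step. It shows how broadcasting a row of offsets against a column of start positions produces one row of indexes per window.

```python
import numpy as np

toy = np.arange(10, 20)  # hypothetical series: 10, 11, ..., 19
window_size, horizon = 3, 1

# Row vector of offsets inside one window + horizon: [[0, 1, 2, 3]]
window_step = np.expand_dims(np.arange(window_size + horizon), axis=0)

# Column vector of window start positions: [[0], [1], ..., [6]]
starts = np.expand_dims(np.arange(len(toy) - (window_size + horizon - 1)), axis=0).T

# Broadcasting (7, 1) + (1, 4) gives a (7, 4) grid of indexes into toy
window_indexes = starts + window_step
windowed = toy[window_indexes]

# Split each row into the window (first 3 values) and the label (last value)
windows, labels = windowed[:, :-horizon], windowed[:, -horizon:]

print(windows[0], labels[0])    # first window and its label
print(windows[-1], labels[-1])  # last window and its label
```

Each row of `window_indexes` is a run of consecutive positions, so fancy indexing with it slices every window in a single vectorised step.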

### 1. Does scaling the data help for univariate/multivariate data? (e.g. getting all of the values between 0 & 1)
* Try doing this for a univariate model (e.g. model_1) and a multivariate model (e.g. model_6) and see if it affects model training or evaluation results.

In [5]:
# Model 1 (Horizon = 1 , Window_size = 7)
HORIZON = 1 
WINDOW_SIZE = 7 


full_windows , full_labels = make_windows_scaled(prices , window_size = WINDOW_SIZE , horizon = HORIZON)
full_windows.shape , full_labels.shape

((2780, 7), (2780, 1))

In [6]:
# Look at a few examples of the scaled windows and labels
for i in range(3):
  print(f'Window: {full_windows[i]} --> Label {full_labels[i]}')

Window: [0.00023831 0.00026677 0.         0.00015955 0.00020168 0.00019087
 0.0002089 ] --> Label [0.00022847]
Window: [0.00026677 0.         0.00015955 0.00020168 0.00019087 0.0002089
 0.00022847] --> Label [0.00024454]
Window: [0.         0.00015955 0.00020168 0.00019087 0.0002089  0.00022847
 0.00024454] --> Label [0.00027478]


In [7]:
# Making train and test splits 
train_windows , test_windows , train_labels , test_labels = make_train_test_splits(full_windows , full_labels)
len(train_windows), len(test_windows), len(train_labels), len(test_labels)

(2224, 556, 2224, 556)

In [8]:
# Building the Model 1 
tf.random.set_seed(42)

# Construct the model 
model_1 = tf.keras.Sequential([
  layers.Dense(128, activation= 'relu') ,
  layers.Dense(HORIZON , activation = 'linear')
])

# Compiling the model 
model_1.compile(loss = 'mae' , 
                optimizer = tf.keras.optimizers.Adam() , 
                metrics = ['mae'])

# Fit the model 
model_1.fit(x = train_windows , 
            y = train_labels , 
            epochs = 100 , batch_size = 128 , verbose = 0 , 
            validation_data = (test_windows , test_labels))

<keras.callbacks.History at 0x7f0d9732ae10>

In [9]:
# Evaluate the model on test data 
model_1.evaluate(test_windows , test_labels)



[0.00931711308658123, 0.00931711308658123]
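
The MAE reported above is in min-max scaled units, so it cannot be compared directly with a dollar-valued MAE from an unscaled model. Because min-max scaling is a linear map, an absolute error in scaled units converts back to the original units by multiplying by the fitted range (max minus min). The sketch below checks this on hypothetical toy numbers; `min_max_scale` and `unscale_mae` are illustrative helpers, not functions from the notebook.

```python
import numpy as np

def min_max_scale(values, data_min, data_max):
    """Map values linearly so that data_min -> 0 and data_max -> 1."""
    return (values - data_min) / (data_max - data_min)

def unscale_mae(scaled_mae, data_min, data_max):
    """Convert an MAE measured on min-max scaled data back to original units."""
    return scaled_mae * (data_max - data_min)

# Hypothetical toy prices and predictions (in USD)
prices_toy = np.array([100.0, 150.0, 120.0, 300.0, 250.0])
preds_toy = np.array([110.0, 140.0, 130.0, 280.0, 260.0])

lo, hi = prices_toy.min(), prices_toy.max()

# MAE computed directly in USD
mae_usd = np.mean(np.abs(prices_toy - preds_toy))

# MAE computed after scaling both arrays with the same min and max
mae_scaled = np.mean(np.abs(min_max_scale(prices_toy, lo, hi)
                            - min_max_scale(preds_toy, lo, hi)))

# Multiplying by the range recovers the USD figure
mae_recovered = unscale_mae(mae_scaled, lo, hi)
print(mae_usd, mae_scaled, mae_recovered)
```

With the notebook's `scaler`, the range would be `scaler.data_max_ - scaler.data_min_` for the fitted column.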

In [10]:
model_1_preds = tf.squeeze(model_1.predict(test_windows))

Now let's do the same for the multivariate data, using Model 6 (price windows plus the block reward feature).

In [11]:
# Block reward values
block_reward_1 = 50 # 3 January 2009 
block_reward_2 = 25 # 28 November 2012 
block_reward_3 = 12.5 # 9 July 2016
block_reward_4 = 6.25 # 11 May 2020

# Block reward dates (datetime form of the above date stamps)
block_reward_2_datetime = np.datetime64("2012-11-28")
block_reward_3_datetime = np.datetime64("2016-07-09")
block_reward_4_datetime = np.datetime64("2020-05-11")

# Get date indexes for when to add in different block dates
block_reward_2_days = (block_reward_3_datetime - bitcoin_prices.index[0]).days
block_reward_3_days = (block_reward_4_datetime - bitcoin_prices.index[0]).days
block_reward_2_days, block_reward_3_days

# Add block_reward column
bitcoin_prices_block = bitcoin_prices.copy()
bitcoin_prices_block["block_reward"] = None

# Set values of block_reward column (it's the last column hence -1 indexing on iloc)
bitcoin_prices_block.iloc[:block_reward_2_days, -1] = block_reward_2
bitcoin_prices_block.iloc[block_reward_2_days:block_reward_3_days, -1] = block_reward_3
bitcoin_prices_block.iloc[block_reward_3_days:, -1] = block_reward_4
bitcoin_prices_block.head()

Date,Price,block_reward
2013-10-01,123.65499,25
2013-10-02,125.455,25
2013-10-03,108.58483,25
2013-10-04,118.67466,25
2013-10-05,121.33866,25
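
As an alternative to computing integer row offsets for `iloc`, the block reward can be assigned directly from the dates. This is a sketch on a hypothetical five-day date range around the July 2016 halving, using `np.select`. It yields a float array from the start rather than an object column initialised with `None`.

```python
import numpy as np

# Halving dates taken from the cell above
block_reward_3_datetime = np.datetime64("2016-07-09")
block_reward_4_datetime = np.datetime64("2020-05-11")

# Hypothetical stand-in for the DataFrame's date index
dates = np.arange("2016-07-07", "2016-07-12", dtype="datetime64[D]")

# np.select uses the first condition that is True for each date;
# dates on or after the last halving fall through to the default
block_reward = np.select(
    [dates < block_reward_3_datetime, dates < block_reward_4_datetime],
    [25.0, 12.5],
    default=6.25,
)
print(block_reward)
```

On the real data, `dates` would be `bitcoin_prices.index.to_numpy()` and the result could be assigned straight to the `block_reward` column.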


In [12]:
# Make a copy of the Bitcoin historical data with block reward feature
bitcoin_prices_windowed = bitcoin_prices_block.copy()

# Add windowed columns
for i in range(WINDOW_SIZE): 
  bitcoin_prices_windowed[f"Price+{i+1}"] = bitcoin_prices_windowed["Price"].shift(periods=i+1)
bitcoin_prices_windowed.head(10)

Date,Price,block_reward,Price+1,Price+2,Price+3,Price+4,Price+5,Price+6,Price+7
2013-10-01,123.65499,25,,,,,,,
2013-10-02,125.455,25,123.65499,,,,,,
2013-10-03,108.58483,25,125.455,123.65499,,,,,
2013-10-04,118.67466,25,108.58483,125.455,123.65499,,,,
2013-10-05,121.33866,25,118.67466,108.58483,125.455,123.65499,,,
2013-10-06,120.65533,25,121.33866,118.67466,108.58483,125.455,123.65499,,
2013-10-07,121.795,25,120.65533,121.33866,118.67466,108.58483,125.455,123.65499,
2013-10-08,123.033,25,121.795,120.65533,121.33866,118.67466,108.58483,125.455,123.65499
2013-10-09,124.049,25,123.033,121.795,120.65533,121.33866,118.67466,108.58483,125.455
2013-10-10,125.96116,25,124.049,123.033,121.795,120.65533,121.33866,118.67466,108.58483


In [108]:
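# Let's create X & y, remove the NaNs and convert to float32 to prevent TensorFlow errors
# (the recorded output below includes a day_of_week column because this cell was re-run
# after that column had been added to bitcoin_prices_windowed)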
X = bitcoin_prices_windowed.dropna().drop("Price", axis=1).astype(np.float32) 
y = bitcoin_prices_windowed.dropna()["Price"].astype(np.float32)
X.head()

Date,block_reward,Price+1,Price+2,Price+3,Price+4,Price+5,Price+6,Price+7,day_of_week
2013-10-08,25.0,121.794998,120.655327,121.338661,118.67466,108.584831,125.455002,123.654991,1.0
2013-10-09,25.0,123.032997,121.794998,120.655327,121.338661,118.67466,108.584831,125.455002,2.0
2013-10-10,25.0,124.049004,123.032997,121.794998,120.655327,121.338661,118.67466,108.584831,3.0
2013-10-11,25.0,125.961159,124.049004,123.032997,121.794998,120.655327,121.338661,118.67466,4.0
2013-10-12,25.0,125.279663,125.961159,124.049004,123.032997,121.794998,120.655327,121.338661,5.0


In [14]:
# Scale the features and the target (the same scaler is re-fit on y, so afterwards it holds y's min and max)
X_scaled = scaler.fit_transform(X)
y_scaled = scaler.fit_transform(np.expand_dims(y , axis = 1))
y_scaled = np.squeeze(y_scaled)

In [15]:
# Make train and test sets
split_size = int(len(X) * 0.8)
X_train, y_train = X_scaled[:split_size], y_scaled[:split_size]
X_test, y_test = X_scaled[split_size:], y_scaled[split_size:]
len(X_train), len(y_train), len(X_test), len(y_test)

(2224, 2224, 556, 556)

In [16]:
# Building a Multivariate time series model and fitting it
tf.random.set_seed(42)

model_6 = tf.keras.Sequential([
  layers.Dense(128 , activation= 'relu'), 
  layers.Dense(HORIZON)
])

model_6.compile(loss = 'mae' , 
                optimizer = tf.keras.optimizers.Adam())

model_6.fit(X_train , y_train , 
          epochs = 100 ,
          verbose = 0 , batch_size = 128, 
          validation_data = (X_test , y_test))

<keras.callbacks.History at 0x7f0d92b95690>

In [17]:
# Evaluate the model 6 
model_6.evaluate(X_test , y_test)



0.07352113723754883
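
Both scaled experiments above fit the scaler on the full series, which lets test-period prices influence the scaling. One common alternative, sketched here with plain NumPy on a hypothetical toy series, is to take the min and max from the training split only and reuse them for the test split. `fit_min_max` and `apply_min_max` are illustrative helpers, not part of the notebook.

```python
import numpy as np

def fit_min_max(train_values):
    """Return the min and max of the training data only."""
    return float(np.min(train_values)), float(np.max(train_values))

def apply_min_max(values, data_min, data_max):
    """Scale values with a previously fitted min and max."""
    values = np.asarray(values, dtype=np.float64)
    return (values - data_min) / (data_max - data_min)

# Hypothetical upward-trending series, split 80/20 in time order
series = np.array([10.0, 12.0, 11.0, 15.0, 14.0, 20.0, 25.0, 30.0, 28.0, 40.0])
split = int(len(series) * 0.8)
train, test = series[:split], series[split:]

train_min, train_max = fit_min_max(train)
train_scaled = apply_min_max(train, train_min, train_max)
test_scaled = apply_min_max(test, train_min, train_max)

print(train_scaled.min(), train_scaled.max())  # training data spans [0, 1]
print(test_scaled)                             # test values may exceed 1
```

With a trending series like Bitcoin, test values scaled this way can land well outside [0, 1], which is worth keeping in mind when judging whether scaling helps.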

### 2. Get the most up to date data on Bitcoin, train a model & see how it goes (our data goes up to May 18 2021).

In [19]:
# Load the latest CSV exported from CoinDesk
df_updated = pd.read_csv('/content/BTC_USD_2014-11-02_2021-09-09-CoinDesk.csv' , 
                 parse_dates = ['Date'] , 
                 index_col = ['Date'])

bitcoin_prices_updated = pd.DataFrame(df_updated["Closing Price (USD)"]).rename(columns={"Closing Price (USD)": "Price"})
bitcoin_prices_updated.head(10) , bitcoin_prices_updated.shape

(                Price
 Date                 
 2014-11-02  325.22633
 2014-11-03  331.60083
 2014-11-04  324.71833
 2014-11-05  332.45666
 2014-11-06  336.58500
 2014-11-07  346.77500
 2014-11-08  344.81166
 2014-11-09  343.06500
 2014-11-10  358.50166
 2014-11-11  368.07666, (2503, 1))

In [20]:
prices_updated = bitcoin_prices_updated['Price'].to_numpy()

In [21]:
def make_windows(x, window_size=7, horizon=1):
  """
  Turns a 1D array into a 2D array of sequential windows of window_size.
  """
  
  window_step = np.expand_dims(np.arange(window_size+horizon), axis=0)
  window_indexes = window_step + np.expand_dims(np.arange(len(x)-(window_size+horizon-1)), axis=0).T # create 2D array of windows of size window_size
  windowed_array = x[window_indexes]
  windows, labels = get_labelled_windows(windowed_array, horizon=horizon)

  return windows, labels

In [22]:
full_windows , full_labels = make_windows(prices_updated)
len(full_windows), len(full_labels)

(2496, 2496)

In [23]:
# Look at a few example windows and labels (raw, unscaled prices this time)
for i in range(3):
  print(f'Window: {full_windows[i]} --> Label {full_labels[i]}')

Window: [325.22633 331.60083 324.71833 332.45666 336.585   346.775   344.81166] --> Label [343.065]
Window: [331.60083 324.71833 332.45666 336.585   346.775   344.81166 343.065  ] --> Label [358.50166]
Window: [324.71833 332.45666 336.585   346.775   344.81166 343.065   358.50166] --> Label [368.07666]


In [24]:
# Making train and test splits
train_windows ,  test_windows ,train_labels,  test_labels =  make_train_test_splits(full_windows , full_labels)

len(train_windows) ,  len(test_windows) , len(train_labels),  len(test_labels) 

(1996, 500, 1996, 500)

Now we build the same Model 1 on the updated CoinDesk data.

In [25]:
# Building the Model 1 with the updated data
tf.random.set_seed(42)

# Construct the model 
model_1 = tf.keras.Sequential([
  layers.Dense(128, activation= 'relu') ,
  layers.Dense(HORIZON , activation = 'linear')
])

# Compiling the model 
model_1.compile(loss = 'mae' , 
                optimizer = tf.keras.optimizers.Adam() , 
                metrics = ['mae'])

# Fit the model 
model_1.fit(x = train_windows , 
            y = train_labels , 
            epochs = 100 , batch_size = 128 , verbose = 0 , 
            validation_data = (test_windows , test_labels))

<keras.callbacks.History at 0x7f0d973cb210>

In [26]:
# Evaluating the model 
model_1.evaluate(test_windows , test_labels)



[884.0137939453125, 884.0137939453125]
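
An MAE of about 884 is easier to interpret next to a baseline. One common reference for HORIZON=1 is the naive forecast, which predicts that the next price equals the last price in the window. The helper `naive_forecast_mae` below is a hypothetical name, shown here on toy windows only.

```python
import numpy as np

def naive_forecast_mae(windows, labels):
    """MAE of predicting each label as the last value of its window (horizon 1)."""
    windows = np.asarray(windows, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.float64).reshape(len(windows), -1)
    naive_preds = windows[:, -1:]  # last value of every window, shape (n, 1)
    return float(np.mean(np.abs(labels[:, :1] - naive_preds)))

# Hypothetical toy windows and labels
toy_windows = np.array([[1.0, 2.0, 3.0],
                        [2.0, 3.0, 5.0],
                        [3.0, 5.0, 4.0]])
toy_labels = np.array([[5.0], [4.0], [6.0]])

baseline_mae = naive_forecast_mae(toy_windows, toy_labels)
print(baseline_mae)
```

Calling it as `naive_forecast_mae(test_windows, test_labels)` on the updated data would give the figure the dense model needs to beat.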

### 3. For most of our models we used WINDOW_SIZE=7, but is there a better window size?

* Set up a series of experiments to find whether or not there’s a better window size.
* For example, you might train 10 different models with HORIZON=1 but with window sizes ranging from 2-12.

In [27]:
# Writing an evaluation function based on the preds and targets 
def evaluate_preds(y_true , y_pred):

  # Casting the values to float32 
  y_true = tf.cast(y_true , tf.float32)
  y_pred = tf.cast(y_pred , tf.float32)


  # Calculate the metrics 
  mae = tf.keras.metrics.mean_absolute_error(y_true , y_pred)
  mse = tf.keras.metrics.mean_squared_error(y_true , y_pred)
  rmse = tf.sqrt(mse)
  mape = tf.keras.metrics.mean_absolute_percentage_error(y_true , y_pred)
  
  # For longer horizons, aggregate the per-sample metrics into a single value by averaging
  if mae.ndim > 0:
    mae = tf.reduce_mean(mae)
    mse = tf.reduce_mean(mse)
    rmse = tf.reduce_mean(rmse)
    mape = tf.reduce_mean(mape)

  return {'mae' : mae.numpy() , 
          'mse': mse.numpy() , 
          'rmse': rmse.numpy() , 
          'mape': mape.numpy() }

In [28]:
# Loop over the window sizes and train one model per size

# 10 models with window sizes 2 through 11 (range(2, 12) excludes 12); results are stored in a list
model_results_list = []

from tqdm import tqdm
for size in tqdm(range(2,12)):
  HORIZON = 1 
  WINDOW_SIZE = size

  # Making window and labels 
  full_windows , full_labels = make_windows(prices, window_size= WINDOW_SIZE , horizon= HORIZON)
  

  # Splitting the data in train and test
  train_windows ,  test_windows ,train_labels,  test_labels = make_train_test_splits(full_windows , full_labels)


  # Building a simple dense model
  inputs = layers.Input(shape = (WINDOW_SIZE ,) , name = 'Input_layer')  # named inputs to avoid shadowing the built-in input()
  x = layers.Dense(128 , activation= 'relu')(inputs)
  output = layers.Dense(HORIZON , activation= 'linear')(x)

  # Packing into a model 
  model = tf.keras.Model(inputs , output , name = f'model_windowed_{size}')

  # Compiling and fitting the model 
  model.compile(loss = 'mae' , optimizer = 'adam' , metrics = ['mae'])

  model.fit(train_windows , train_labels , 
            epochs = 100 , verbose = 0 , 
            batch_size = 128 , 
            validation_data = (test_windows , test_labels))
  

  # Making predictions 
  preds_ = model.predict(test_windows)
  y_preds = tf.squeeze(preds_)

  results = evaluate_preds(tf.squeeze(test_labels) , y_preds)
  model_results_list.append(results)
 

100%|██████████| 10/10 [00:51<00:00,  5.18s/it]


In [29]:
# Results of the 10 models, in order of window size (2 through 11)
model_results_list

[{'mae': 569.1506, 'mape': 2.5277913, 'mse': 1159396.5, 'rmse': 1076.7528},
 {'mae': 565.61, 'mape': 2.5274613, 'mse': 1139677.6, 'rmse': 1067.5569},
 {'mae': 623.44543, 'mape': 2.8342037, 'mse': 1266272.0, 'rmse': 1125.2875},
 {'mae': 565.61096, 'mape': 2.5193882, 'mse': 1157946.9, 'rmse': 1076.0793},
 {'mae': 575.82715, 'mape': 2.5713227, 'mse': 1181491.9, 'rmse': 1086.9645},
 {'mae': 624.65094, 'mape': 2.8559413, 'mse': 1279308.5, 'rmse': 1131.0652},
 {'mae': 673.8483, 'mape': 3.1560009, 'mse': 1401001.9, 'rmse': 1183.6393},
 {'mae': 661.0718, 'mape': 3.0542161, 'mse': 1343192.1, 'rmse': 1158.9617},
 {'mae': 583.37537, 'mape': 2.6407204, 'mse': 1204248.0, 'rmse': 1097.3823},
 {'mae': 701.84753, 'mape': 3.2888033, 'mse': 1443914.4, 'rmse': 1201.6299}]
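
To read the list above more easily, the MAE values can be paired with their window sizes and ranked. This sketch copies the recorded MAE values by hand and assumes they are in loop order, i.e. window sizes 2 through 11.

```python
# Window sizes in the order the loop trained them
window_sizes = list(range(2, 12))

# MAE values copied from the recorded results above
mae_per_window = [569.1506, 565.61, 623.44543, 565.61096, 575.82715,
                  624.65094, 673.8483, 661.0718, 583.37537, 701.84753]

# Sort (mae, window_size) pairs so the lowest MAE comes first
ranked = sorted(zip(mae_per_window, window_sizes))
best_mae, best_window = ranked[0]

print(f"Lowest MAE {best_mae:.2f} at WINDOW_SIZE={best_window}")
for mae, size in ranked[:3]:
    print(size, mae)
```

In this run the lowest MAE belongs to window size 3, with window size 5 almost identical. Each size was trained only once with random initialisation, so gaps this small are probably within run-to-run noise. Repeating each size several times and averaging would give a firmer answer.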

### 4. Create a windowed dataset just like the ones we used for model_1 using [tf.keras.preprocessing.timeseries_dataset_from_array()](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/timeseries_dataset_from_array) and retrain model_1 using the recreated dataset.

In [30]:
WINDOW_SIZE = 7 
HORIZON = 1

In [33]:
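# Caution: targets[i] is paired with the window that starts at data[i], so with targets = prices
# each label is the first value of its own window (visible in the printed batches), not the next price.
# For next-step labels, pass data = prices[:-HORIZON] and targets = prices[WINDOW_SIZE:] with sequence_stride = 1.
ds = tf.keras.utils.timeseries_dataset_from_array(
    data = prices , targets = prices , sequence_length = WINDOW_SIZE , sequence_stride = HORIZON, 
    batch_size = 128
)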

In [34]:
train_size , test_size = int(0.8 * len(ds)) ,int(0.2 * len(ds))

In [35]:
train_ds = ds.take(train_size)
test_ds = ds.skip(train_size).take(test_size)

In [36]:
for x , y in train_ds.take(1):
  print(x[:2] , y[:2])

tf.Tensor(
[[123.65499 125.455   108.58483 118.67466 121.33866 120.65533 121.795  ]
 [125.455   108.58483 118.67466 121.33866 120.65533 121.795   123.033  ]], shape=(2, 7), dtype=float64) tf.Tensor([123.65499 125.455  ], shape=(2,), dtype=float64)


In [37]:
for x , y in test_ds.take(1):
  print(x[:2] , y[:2])

tf.Tensor(
[[10302.19071368 10301.65965169 10231.42151196 10168.28770938
  10223.5055788  10138.33520522  9984.52051597]
 [10301.65965169 10231.42151196 10168.28770938 10223.5055788
  10138.33520522  9984.52051597 10031.86670899]], shape=(2, 7), dtype=float64) tf.Tensor([10302.19071368 10301.65965169], shape=(2,), dtype=float64)


In [38]:
len(test_ds) + len(train_ds)

21

In [39]:
len(test_ds) , test_size

(4, 4)

In [40]:
# Rebuild Model 1 and train it on the tf.data windowed dataset
tf.random.set_seed(42)

# Building a simple dense model
input = layers.Input(shape = (WINDOW_SIZE ,) , name = 'Input_layer' , dtype = tf.float32)
x = layers.Dense(128 , activation= 'relu')(input)
output = layers.Dense(HORIZON , activation= 'linear')(x)

# Packing into a model 
model = tf.keras.Model(input , output)

# Compiling the model 
model.compile(loss = 'mae' , 
                optimizer = tf.keras.optimizers.Adam() , 
                metrics = ['mae'])

# Fit the model 
model.fit(train_ds ,
          epochs = 100 , verbose = 0 , 
            validation_data = test_ds)

<keras.callbacks.History at 0x7f0d9679fbd0>

In [41]:
model.evaluate(test_ds)



[643.4305419921875, 643.4305419921875]
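
With `timeseries_dataset_from_array`, the target at position `i` is paired with the window that starts at position `i`. To get next-step labels, the usual pattern is therefore to shift the targets by the window length, for example `data = prices[:-HORIZON]` and `targets = prices[WINDOW_SIZE:]`. The NumPy sketch below only emulates that pairing rule on a hypothetical toy series, to show the difference between the two choices of targets. The helper `pair_windows_with_targets` is not part of TensorFlow.

```python
import numpy as np

def pair_windows_with_targets(data, targets, sequence_length):
    """Pair the window starting at data[i] with targets[i] (emulation of the rule)."""
    n = min(len(data) - sequence_length + 1, len(targets))
    windows = np.stack([data[i:i + sequence_length] for i in range(n)])
    return windows, np.asarray(targets[:n])

toy_prices = np.arange(100.0, 110.0)  # hypothetical series: 100, 101, ..., 109
window_size, horizon = 7, 1

# Unshifted targets: every label is just the first value of its own window
same_windows, same_targets = pair_windows_with_targets(
    toy_prices, toy_prices, window_size)

# Shifted targets: every label is the value that follows its window
next_windows, next_targets = pair_windows_with_targets(
    toy_prices[:-horizon], toy_prices[window_size:], window_size)

print(same_windows[0], same_targets[0])
print(next_windows[0], next_targets[0])
```

A model trained on the unshifted pairing only learns to reproduce a value it has already seen, so its MAE is not comparable with the earlier `model_1` results.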

### 5. For our multivariate modelling experiment, we added the Bitcoin block reward size as an extra feature to make our time series multivariate.

  * Are there any other features you think you could add?
  * If so, try it out, how do these affect the model?

In [42]:
df

Date,Currency,Closing Price (USD),24h Open (USD),24h High (USD),24h Low (USD)
2013-10-01,BTC,123.654990,124.304660,124.751660,122.563490
2013-10-02,BTC,125.455000,123.654990,125.758500,123.633830
2013-10-03,BTC,108.584830,125.455000,125.665660,83.328330
2013-10-04,BTC,118.674660,108.584830,118.675000,107.058160
2013-10-05,BTC,121.338660,118.674660,121.936330,118.005660
...,...,...,...,...,...
2021-05-14,BTC,49764.132082,49596.778891,51448.798576,46294.720180
2021-05-15,BTC,50032.693137,49717.354353,51578.312545,48944.346536
2021-05-16,BTC,47885.625255,49926.035067,50690.802950,47005.102292
2021-05-17,BTC,45604.615754,46805.537852,49670.414174,43868.638969


In [45]:
import datetime 

In [81]:
df.index[0]

Timestamp('2013-10-01 00:00:00')

In [87]:
df['day_of_week'] = df.index.dayofweek
df.head(10)

Date,Currency,Closing Price (USD),24h Open (USD),24h High (USD),24h Low (USD),day_of_week
2013-10-01,BTC,123.65499,124.30466,124.75166,122.56349,1
2013-10-02,BTC,125.455,123.65499,125.7585,123.63383,2
2013-10-03,BTC,108.58483,125.455,125.66566,83.32833,3
2013-10-04,BTC,118.67466,108.58483,118.675,107.05816,4
2013-10-05,BTC,121.33866,118.67466,121.93633,118.00566,5
2013-10-06,BTC,120.65533,121.33866,121.85216,120.5545,6
2013-10-07,BTC,121.795,120.65533,121.99166,120.43199,0
2013-10-08,BTC,123.033,121.795,123.64016,121.35066,1
2013-10-09,BTC,124.049,123.033,124.7835,122.59266,2
2013-10-10,BTC,125.96116,124.049,128.01683,123.81966,3
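
Feeding `day_of_week` to a model as the integers 0 to 6 implies that Sunday (6) and Monday (0) are far apart. One optional alternative, not used in this notebook, is a sine/cosine encoding that places the days on a circle. `encode_day_of_week` is a hypothetical helper name.

```python
import numpy as np

def encode_day_of_week(days):
    """Map day indexes (Monday=0 ... Sunday=6) to points on the unit circle."""
    angles = 2.0 * np.pi * np.asarray(days, dtype=np.float64) / 7.0
    return np.stack([np.sin(angles), np.cos(angles)], axis=-1)

# Monday, Tuesday and Sunday
encoded = encode_day_of_week([0, 1, 6])

# Tuesday and Sunday are both one day from Monday, so the distances match
dist_mon_tue = np.linalg.norm(encoded[0] - encoded[1])
dist_mon_sun = np.linalg.norm(encoded[0] - encoded[2])
print(encoded)
print(dist_mon_tue, dist_mon_sun)
```

The encoded feature has two columns, so a model input that consumes it would need shape `(2,)` rather than `(1,)`.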


In [112]:
# Defining the hyper parameters 
HORIZON = 1 
WINDOW_SIZE = 7 

bitcoin_prices_windowed['day_of_week'] = bitcoin_prices_windowed.index.dayofweek

In [138]:
# Getting three kinds of data (univariate , multivariate and the day of week)

# Univariate data 
full_windows , full_labels = make_windows_scaled(prices)
train_windows , test_windows , train_labels , test_labels = make_train_test_splits(full_windows , full_labels)

# Multivariate data 
X = bitcoin_prices_windowed.dropna().drop('Price' , axis = 1).astype(np.float32)
X_scaled = scaler.fit_transform(X)
y = bitcoin_prices_windowed.dropna()['Price'].astype(np.float32)

# Day of week 
day_of_week = bitcoin_prices_windowed.dropna()['day_of_week'].to_list()

In [139]:
# Checking the shapes 
print(full_windows.shape , full_labels.shape)
print(X.shape , y.shape)
print(len(day_of_week))

(2780, 7) (2780, 1)
(2780, 9) (2780,)
2780


In [140]:
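# Splitting the multivariate features and the day_of_week values into train and test splits 
# (note: this slices the unscaled X, not X_scaled, and X already contains a day_of_week column)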
split_size = int(len(X) * 0.8)
train_block_rewards , test_block_rewards = X[:split_size] , X[split_size:]
train_days , test_days = day_of_week[:split_size] , day_of_week[split_size:]
 
len(train_block_rewards), len(train_days) , len(test_block_rewards) , len(test_days)

(2224, 2224, 556, 556)

In [141]:
# Building a performant dataset for train and test 

train_data_tribid = tf.data.Dataset.from_tensor_slices((train_windows , 
                                                        train_block_rewards , 
                                                        train_days))

train_labels_tribid = tf.data.Dataset.from_tensor_slices(train_labels)

# The test/val split 
test_data_tribid = tf.data.Dataset.from_tensor_slices((test_windows , 
                                                       test_block_rewards , 
                                                       test_days))

test_labels_tribid = tf.data.Dataset.from_tensor_slices(test_labels)

# Zipping the data and labels into one complete dataset 
tribid_train_ds = tf.data.Dataset.zip((train_data_tribid , train_labels_tribid))
tribid_test_ds = tf.data.Dataset.zip((test_data_tribid , test_labels_tribid))

# Applying prefetch and batching the dataset 
tribid_train_ds = tribid_train_ds.batch(128).prefetch(tf.data.AUTOTUNE)
tribid_test_ds = tribid_test_ds.batch(128).prefetch(tf.data.AUTOTUNE)

tribid_train_ds ,tribid_test_ds

(<PrefetchDataset shapes: (((None, 7), (None, 9), (None,)), (None, 1)), types: ((tf.float64, tf.float32, tf.int32), tf.float64)>,
 <PrefetchDataset shapes: (((None, 7), (None, 9), (None,)), (None, 1)), types: ((tf.float64, tf.float32, tf.int32), tf.float64)>)

In [145]:
# Building a tribid (three-input) model: price windows, block reward features and day of week

input_windows = layers.Input(shape = (7,) , dtype=tf.float64 , name='Window Inputs')
exp_layer_1 = layers.Lambda(lambda x: tf.expand_dims(x , axis = 1))(input_windows)
conv1 = layers.Conv1D(filters= 32 , kernel_size=5 , padding='causal' , activation= 'relu')(exp_layer_1)
window_model = tf.keras.Model(input_windows , conv1 , name = 'Windowed model')

input_blocks = layers.Input(shape = (9,) , dtype= tf.float32 , name ='Block rewards input')
exp_layer_2 = layers.Lambda(lambda x: tf.expand_dims(x , axis = 1))(input_blocks)
conv2 = layers.Conv1D(filters = 32 , kernel_size= 5 , activation= 'relu' , padding = 'causal')(exp_layer_2)
block_model = tf.keras.Model(input_blocks , conv2 , name = 'Block rewards model')


# Use expand dims to match the same shape output (None , 1 , 128)
# whereas without expand dims it would be (None , 128)
input_days = layers.Input(shape= (1,) , dtype = tf.int32 , name ='Days of week Input')
exp_layer_3 = layers.Lambda(lambda x: tf.expand_dims(x , axis = 1))(input_days)
dense = layers.Dense(128 , activation= 'relu')(exp_layer_3)
days_model = tf.keras.Model(input_days , dense , name = 'Days Model')

# Concatenating the inputs 
concat = layers.Concatenate(name = 'combined_outputs' )([window_model.output , 
                                                           block_model.output , 
                                                           days_model.output])

# Creating the output layer 
dropout = layers.Dropout(0.4)(concat)
output_layer = layers.Dense(1 , activation = 'linear')(dropout)

# Putting everything into a model 
tribid_model = tf.keras.Model(inputs = [window_model.input , 
                                        block_model.input , 
                                        days_model.input] , 
                              outputs = output_layer)
tribid_model.summary()



Model: "model_7"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
Window Inputs (InputLayer)      [(None, 7)]          0                                            
__________________________________________________________________________________________________
Block rewards input (InputLayer [(None, 9)]          0                                            
__________________________________________________________________________________________________
Days of week Input (InputLayer) [(None, 1)]          0                                            
__________________________________________________________________________________________________
lambda_26 (Lambda)              (None, 1, 7)         0           Window Inputs[0][0]              
____________________________________________________________________________________________

In [147]:
# Compiling and fitting the model 
tribid_model.compile(loss = 'mae' , 
                     optimizer = 'adam' , metrics = ['mae'])

# Fitting the model 
tribid_model.fit(tribid_train_ds , 
                 epochs = 20,  
                 validation_data = tribid_test_ds , verbose = 2)

Epoch 1/20
18/18 - 1s - loss: 29.1395 - mae: 29.1395 - val_loss: 12.8604 - val_mae: 12.8604
Epoch 2/20
18/18 - 0s - loss: 5.7451 - mae: 5.7451 - val_loss: 7.9342 - val_mae: 7.9342
Epoch 3/20
18/18 - 0s - loss: 1.1152 - mae: 1.1152 - val_loss: 3.3295 - val_mae: 3.3295
Epoch 4/20
18/18 - 0s - loss: 0.4115 - mae: 0.4115 - val_loss: 0.4594 - val_mae: 0.4594
Epoch 5/20
18/18 - 0s - loss: 0.8504 - mae: 0.8504 - val_loss: 1.9727 - val_mae: 1.9727
Epoch 6/20
18/18 - 0s - loss: 1.0429 - mae: 1.0429 - val_loss: 0.7200 - val_mae: 0.7200
Epoch 7/20
18/18 - 0s - loss: 0.8757 - mae: 0.8757 - val_loss: 2.0799 - val_mae: 2.0799
Epoch 8/20
18/18 - 0s - loss: 0.2225 - mae: 0.2225 - val_loss: 0.0813 - val_mae: 0.0813
Epoch 9/20
18/18 - 0s - loss: 0.1704 - mae: 0.1704 - val_loss: 0.1398 - val_mae: 0.1398
Epoch 10/20
18/18 - 0s - loss: 0.1544 - mae: 0.1544 - val_loss: 0.0908 - val_mae: 0.0908
Epoch 11/20
18/18 - 0s - loss: 0.1410 - mae: 0.1410 - val_loss: 0.0821 - val_mae: 0.0821
Epoch 12/20
18/18 - 0s - l

<keras.callbacks.History at 0x7f0d8be37250>

In [163]:
# Evaluating the model 
tribid_model.evaluate(tribid_test_ds)



[0.029597144573926926, 0.029597144573926926]

### 6. Make prediction intervals for future forecasts. To do so, one way would be to train an ensemble model on all of the data, make future forecasts with it and calculate the prediction intervals of the ensemble just like we did for model_8.

**Things to do**
- Train an ensemble model on the whole data. 
- Make one dataset (no test/validation split), which will be used to produce future Bitcoin forecasts. 
- Make a function that will take the number of iterations and different loss functions to train the model with. 
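
The interval calculation for an ensemble can be sketched as follows. It assumes the spread of the ensemble members' predictions is roughly Gaussian, so that about 95% of values fall within 1.96 standard deviations of the mean. The array `ensemble_preds` is hypothetical toy data with shape (number of models, number of forecast steps), and `get_upper_lower` is an illustrative helper name.

```python
import numpy as np

def get_upper_lower(preds, z=1.96):
    """Return (lower, mean, upper) bounds across ensemble members for each step."""
    preds = np.asarray(preds, dtype=np.float64)
    mean = preds.mean(axis=0)         # average forecast per time step
    interval = z * preds.std(axis=0)  # z standard deviations per time step
    return mean - interval, mean, mean + interval

# Hypothetical forecasts from 3 models for 3 future time steps
ensemble_preds = np.array([[100.0, 110.0, 120.0],
                           [102.0, 108.0, 125.0],
                           [ 98.0, 112.0, 115.0]])

lower, mean, upper = get_upper_lower(ensemble_preds)
print(lower)
print(mean)
print(upper)
```

With trained models, `ensemble_preds` would hold one row of future forecasts per ensemble member. The bounds only capture disagreement between members, so they are likely to understate the true uncertainty of the forecast.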


In [239]:
# Make one whole dataset from the windowed price data (no train/test split)

X_all = bitcoin_prices_windowed.drop(['Price' , 'block_reward' , 'day_of_week'] , axis = 1).dropna().to_numpy()
y_all = bitcoin_prices_windowed.dropna()['Price'].to_numpy()

whole_ds = tf.data.Dataset.from_tensor_slices((X_all , y_all))
whole_ds = whole_ds.batch(128).prefetch(tf.data.AUTOTUNE)
whole_ds

<PrefetchDataset shapes: ((None, 7), (None,)), types: (tf.float64, tf.float64)>

In [172]:
# Creating the function 

def get_ensemble_models(horizon = HORIZON , 
                        dataset = whole_ds , 
                        num_iter = 10 , 
                        num_epochs = 100 , 
                        loss_fns = ['mae' , 'mse' , 'mape']):
  

  # Make an empty list to hold the trained ensemble models 
  ensemble_models = []

  # Create num_iter models per loss function 
  for i in range(num_iter):
    for loss_functions in loss_fns:
      print(f'Optimizing model by reducing: {loss_functions} for {num_epochs} epochs, model number: {i}')

      model = tf.keras.Sequential([
          layers.Dense(128 , kernel_initializer='he_normal' , activation= 'relu'),
          layers.Dense(128 , kernel_initializer= 'he_normal', activation= 'relu'),
          layers.Dense(horizon)  # use the function argument rather than the global HORIZON
      ])

      # Compiling the model 
      model.compile(loss = loss_functions , 
                    optimizer = 'adam' , metrics = ['mae' , 'mse'])
      
      # Fit the model 
      model.fit(dataset , 
                epochs = num_epochs , 
                verbose = 0,
                callbacks=[tf.keras.callbacks.EarlyStopping(monitor="loss",
                                                            patience=200,
                                                            restore_best_weights=True),
                           tf.keras.callbacks.ReduceLROnPlateau(monitor="loss",
                                                                patience=100,
                                                                verbose=1)])
      
      ensemble_models.append(model)

  return ensemble_models

In [242]:
# Running the above function 
ensemble_models = get_ensemble_models(num_iter=5 , num_epochs= 1000 , horizon = 1)

Optimizing model by reducing: mae for 1000 epochs, model number: 0

Epoch 00229: ReduceLROnPlateau reducing learning rate to 0.00010000000474974513.

Epoch 00351: ReduceLROnPlateau reducing learning rate to 1.0000000474974514e-05.
Optimizing model by reducing: mse for 1000 epochs, model number: 0

Epoch 00146: ReduceLROnPlateau reducing learning rate to 0.00010000000474974513.

Epoch 00429: ReduceLROnPlateau reducing learning rate to 1.0000000474974514e-05.
Optimizing model by reducing: mape for 1000 epochs, model number: 0

Epoch 00157: ReduceLROnPlateau reducing learning rate to 0.00010000000474974513.

Epoch 00259: ReduceLROnPlateau reducing learning rate to 1.0000000474974514e-05.

Epoch 00380: ReduceLROnPlateau reducing learning rate to 1.0000000656873453e-06.
Optimizing model by reducing: mae for 1000 epochs, model number: 1

Epoch 00313: ReduceLROnPlateau reducing learning rate to 0.00010000000474974513.

Epoch 00440: ReduceLROnPlateau reducing learning rate to 1.000000047497451

In [243]:
# Making future forecasts 
def make_future_forecast(values , model_list , into_future , window_size):

  future_forecast = []
  last_window = values[-window_size:]

  for _ in range(into_future): 
    for model in model_list:
    
      future_pred = model.predict(tf.expand_dims(last_window, axis= 0))
      #future_pred = model.predict(last_window)
      print(f'Predicing on: \n {last_window} --> Prediction: {tf.squeeze(future_pred).numpy()}\n')

      future_forecast.append(tf.squeeze(future_pred).numpy())

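      # Update the last window. Note: this runs after every model's prediction, so each
      # model predicts on a window that already contains the previous models' outputs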
      last_window = np.append(last_window , future_pred)[-window_size:]
  return future_forecast
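
In the function above, `last_window` is shared by all models, so the returned list has `into_future * len(model_list)` chained values rather than one forecast per model. If the goal is prediction intervals, it may be easier to give every model its own window. The following is a minimal sketch of that idea, not the notebook's method. To keep it runnable without TensorFlow, each "model" is assumed to be a plain callable that maps a window to one number; the names `make_ensemble_forecasts`, `last_value` and `window_mean` are hypothetical. For a Keras model with a horizon of 1, a wrapper along the lines of `lambda w: float(tf.squeeze(model.predict(tf.expand_dims(w, axis=0))))` should fit the same interface.

```python
from typing import Callable, List, Sequence

# Assumption: a "predictor" is any callable mapping a window of past values
# to a single next-step prediction (a stand-in for a trained model).
Predictor = Callable[[Sequence[float]], float]


def make_ensemble_forecasts(values: Sequence[float],
                            predictors: List[Predictor],
                            into_future: int,
                            window_size: int) -> List[List[float]]:
    """Return one forecast (a list of `into_future` values) per predictor."""
    all_forecasts = []
    for predict in predictors:
        # Every predictor starts from the same last window of real data
        window = list(values[-window_size:])
        forecast = []
        for _ in range(into_future):
            pred = float(predict(window))
            forecast.append(pred)
            # Slide the window forward using only this predictor's own output
            window = (window + [pred])[-window_size:]
        all_forecasts.append(forecast)
    return all_forecasts


# Toy stand-ins for trained models (illustrative only)
def last_value(window):
    return window[-1]


def window_mean(window):
    return sum(window) / len(window)


prices = [1.0, 2.0, 3.0, 4.0]
forecasts = make_ensemble_forecasts(prices, [last_value, window_mean],
                                    into_future=3, window_size=2)
print(forecasts)  # one inner list per predictor
```

The result has one row per model and one column per future timestep, which is the layout needed to take a mean and spread across models at each step.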

In [244]:
# Getting the future forecast 
future_forecast = make_future_forecast(y_all , ensemble_models , into_future= 14 , window_size = 7 )

Predicing on: 
 [56573.5554719  52147.82118698 49764.1320816  50032.69313676
 47885.62525472 45604.61575361 43144.47129086] --> Prediction: 56227.828125

Predicing on: 
 [52147.82118698 49764.1320816  50032.69313676 47885.62525472
 45604.61575361 43144.47129086 56227.828125  ] --> Prediction: 49828.5390625

Predicing on: 
 [49764.1320816  50032.69313676 47885.62525472 45604.61575361
 43144.47129086 56227.828125   49828.5390625 ] --> Prediction: 50025.203125

Predicing on: 
 [50032.69313676 47885.62525472 45604.61575361 43144.47129086
 56227.828125   49828.5390625  50025.203125  ] --> Prediction: 50599.34765625

Predicing on: 
 [47885.62525472 45604.61575361 43144.47129086 56227.828125
 49828.5390625  50025.203125   50599.34765625] --> Prediction: 48159.0078125

Predicing on: 
 [45604.61575361 43144.47129086 56227.828125   49828.5390625
 50025.203125   50599.34765625 48159.0078125 ] --> Prediction: 46780.62109375

Predicing on: 
 [43144.47129086 56227.828125   49828.5390625  50025.20312
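
The exercise also asks for prediction intervals. One common approach, sketched below under the assumption that the ensemble's predictions at each timestep are roughly Gaussian, is to take the mean across models and add or subtract 1.96 standard deviations for an approximate 95% interval. The sketch assumes the forecasts are arranged as one list per model (which is not how `future_forecast` above is laid out). The function name `ensemble_prediction_intervals` and the numbers are made up for illustration.

```python
import statistics


def ensemble_prediction_intervals(forecasts, z=1.96):
    """Compute (lower, mean, upper) per timestep from per-model forecasts.

    forecasts: list of equal-length lists, one list per ensemble member.
    z: number of standard deviations (1.96 is roughly 95% for a Gaussian).
    """
    lower, mean, upper = [], [], []
    # zip(*forecasts) regroups the values timestep by timestep
    for step_preds in zip(*forecasts):
        m = statistics.fmean(step_preds)
        # Population standard deviation across the ensemble members
        s = statistics.pstdev(step_preds)
        mean.append(m)
        lower.append(m - z * s)
        upper.append(m + z * s)
    return lower, mean, upper


# Made-up forecasts from two hypothetical models over two future timesteps
toy_forecasts = [[10.0, 12.0],
                 [14.0, 12.0]]
lower, mean, upper = ensemble_prediction_intervals(toy_forecasts)
print(lower, mean, upper)
```

Where the models agree exactly the interval collapses to a single point, which shows that this kind of interval only reflects disagreement between ensemble members, not all sources of forecast uncertainty.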

### 7. For future predictions, try to make a prediction, retrain a model on the predictions, make a prediction, retrain a model, make a prediction, retrain a model, make a prediction (retrain a model each time a new prediction is made). Plot the results, how do they look compared to the future predictions where a model wasn’t retrained for every forecast (model_9)?
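
One possible way to structure this exercise is sketched below. It is only an outline under stated assumptions: `fit_fn` is a hypothetical function that "trains" on the series so far and returns a callable predictor, and `fit_mean_step` is a toy stand-in so the loop runs without TensorFlow. In the notebook, `fit_fn` would instead build windows from the series and fit a fresh Keras model.

```python
def forecast_with_retraining(values, fit_fn, into_future, window_size):
    """Forecast step by step, retraining after every new prediction."""
    series = list(values)
    forecast = []
    for _ in range(into_future):
        # Retrain on everything seen so far, including earlier predictions
        predict = fit_fn(series, window_size)
        pred = float(predict(series[-window_size:]))
        forecast.append(pred)
        # The new prediction becomes training data for the next round
        series.append(pred)
    return forecast


def fit_mean_step(series, window_size):
    """Toy 'training': learn the average one-step change in the series.

    window_size is unused here, but a windowed model would need it.
    """
    steps = [b - a for a, b in zip(series, series[1:])]
    avg_step = sum(steps) / len(steps)
    return lambda window: window[-1] + avg_step


forecast = forecast_with_retraining([1.0, 2.0, 4.0], fit_mean_step,
                                    into_future=3, window_size=2)
print(forecast)
```

Because each retraining step treats earlier predictions as if they were real observations, errors can compound, which is worth keeping in mind when comparing the plot against the forecasts from a model that was not retrained.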


### 8. Throughout this notebook, we’ve only tried algorithms we’ve handcrafted ourselves. But it’s worth seeing how a purpose-built forecasting algorithm goes.

* Try out one of the extra algorithms listed in the modelling experiments part such as:
	*  [Facebook’s Kats library](https://github.com/facebookresearch/Kats)  - there are many models in here, remember the machine learning practitioner’s motto: experiment, experiment, experiment.
	*  [LinkedIn’s Greykite library](https://github.com/linkedin/greykite) 
