<a href="https://colab.research.google.com/github/L4ncelot1024/Learn_Deep_Learning_Le_Wagon/blob/main/Day3/01_Time_Series.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Time series forecasting: Weather Forecast

This tutorial is an introduction to time series forecasting using Recurrent Neural Networks (RNNs). This is covered in two parts: first, you will forecast a univariate time series, then you will forecast a multivariate time series.

In [None]:
%tensorflow_version 2.x
# Select TensorFlow 2.x (this magic only works in Colab)

In [None]:
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf

import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd

mpl.rcParams['figure.figsize'] = (8, 6)
mpl.rcParams['axes.grid'] = False

## Data
This tutorial uses a [weather time series dataset](https://www.bgc-jena.mpg.de/wetter/) recorded by the [Max-Planck-Institute for Biogeochemistry](https://www.bgc-jena.mpg.de/index.php/Main/HomePage).

This dataset contains __14__ different features such as air temperature, atmospheric pressure, and humidity. These were collected every 10 minutes, beginning in 2003. For efficiency, you will use only the data collected between 2009 and 2016. This section of the dataset was prepared by François Chollet for his book [Deep Learning with Python](https://www.manning.com/books/deep-learning-with-python).

In [None]:
zip_path = tf.keras.utils.get_file(
    origin='https://storage.googleapis.com/tensorflow/tf-keras-datasets/jena_climate_2009_2016.csv.zip',
    fname='jena_climate_2009_2016.csv.zip',
    extract=True)
csv_path, _ = os.path.splitext(zip_path)

In [None]:
df = pd.read_csv(csv_path)

Let's take a glance at the data.

In [None]:
df.head()

Questions about the data:

- How often are the measurements recorded?

<details>
<summary markdown='span'>View solution
</summary>
As you can see above, an observation is recorded __every 10 minutes__. This means that a single hour contains 6 observations and a single day contains 144 (6 × 24) observations.
</details>

- Suppose we want to predict the temperature 6 hours in the future from the last 5 days of measurements. How many observations should one training input contain?

<details>
<summary markdown='span'>View solution
</summary>
To make this prediction, you would create a window containing the last 720 (5 × 144) observations to train the model. Many such configurations are possible, making this dataset a good one to experiment with.
</details>
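
The arithmetic behind these two answers can be checked in a few lines. This is only an illustrative sketch; the variable names are hypothetical and are not used elsewhere in the notebook.

```python
# Sampling arithmetic for a series recorded every 10 minutes.
# All names here are illustrative only.
MINUTES_PER_OBSERVATION = 10

obs_per_hour = 60 // MINUTES_PER_OBSERVATION  # 6 observations per hour
obs_per_day = 24 * obs_per_hour               # 144 observations per day

history_size = 5 * obs_per_day                # 5 days of history
target_offset = 6 * obs_per_hour              # 6 hours into the future

print(obs_per_hour, obs_per_day, history_size, target_offset)  # 6 144 720 36
```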

### Data Extraction

The function below returns the above described windows of time for the model to train on.

- The parameter `history_size` is the number of time steps of the past window of information to use.

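- The parameter `target_size` is how far into the future (in number of time steps) the model needs to learn to predict. The label is the observation at offset `target_size` from the first time step after the window, so `target_size = 0` means predicting the very next value.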

In [None]:
def extract_data_labels(dataset, start_index, end_index, history_size, target_size):
  data = []
  labels = []

  start_index = start_index + history_size
  if end_index is None:
    end_index = len(dataset) - target_size

  for i in range(start_index, end_index):
    indices = range(i-history_size, i)
    # Reshape data from (history_size,) to (history_size, 1)
    data.append(np.reshape(dataset[indices], (history_size, 1)))
    labels.append(dataset[i+target_size])
  return np.array(data), np.array(labels)
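
To see what these windows look like, the sketch below applies the same windowing logic to a toy series of ten values. `make_windows` is a hypothetical, condensed re-implementation written only for this illustration; it is not part of the notebook.

```python
import numpy as np

def make_windows(series, history_size, target_size):
    """Condensed, illustrative version of the windowing logic above."""
    data, labels = [], []
    for i in range(history_size, len(series) - target_size):
        # The window covers the history_size steps that end just before i
        data.append(series[i - history_size:i].reshape(history_size, 1))
        # The label sits target_size steps after position i
        labels.append(series[i + target_size])
    return np.array(data), np.array(labels)

toy_series = np.arange(10, dtype=float)
x_toy, y_toy = make_windows(toy_series, history_size=3, target_size=0)

print(x_toy.shape, y_toy.shape)    # (7, 3, 1) (7,)
print(x_toy[0].ravel(), y_toy[0])  # [0. 1. 2.] 3.0
```

With `target_size=0`, the label is the observation that immediately follows the window.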

Here we split the data into training and validation sets with a temporal split, since we want the model to be good at predicting the future.

So the first 300,000 rows of the data will be the training dataset, and the remaining part will be the validation dataset. This amounts to ~2100 days worth of training data.

In [None]:
TRAIN_SPLIT = 300000
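
As a rough check of the figure quoted above, and to illustrate what a temporal split means, here is a small sketch. The toy array and the names `OBS_PER_DAY`, `toy` and `toy_split` are assumptions made for illustration only.

```python
import numpy as np

TRAIN_SPLIT = 300000
OBS_PER_DAY = 144  # one observation every 10 minutes

print(TRAIN_SPLIT / OBS_PER_DAY)  # roughly 2083 days

# Temporal split on a toy series: every validation point comes
# strictly after every training point, with no shuffling.
toy = np.arange(10)
toy_split = 7
toy_train, toy_val = toy[:toy_split], toy[toy_split:]
print(toy_train, toy_val)  # [0 1 2 3 4 5 6] [7 8 9]
```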

We set the seed for reproducibility.

In [None]:
tf.random.set_seed(13)

## Part 1: Forecast a univariate time series
First, you will train a model using only a single feature (temperature), and use it to make predictions for that value in the future.

Let's first extract only the temperature from the dataset.

In [None]:
# Here we extract our univariate data, and we set the time as index to keep the order
uni_data = df['T (degC)']
uni_data.index = df['Date Time']
uni_data.head()

Let's observe how this data looks across time.

In [None]:
uni_data.plot(subplots=True)

### Standardisation

In [None]:
# We convert the data into a np.ndarray
uni_data = uni_data.values

It is important to normalize features before training a neural network. A common way to do so is by subtracting the mean and dividing by the standard deviation of each feature.

In [None]:
# TODO: normalize your input features

In [None]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()

uni_data_train = uni_data[:TRAIN_SPLIT]
scaler.fit(uni_data_train[:, np.newaxis])

uni_data_scaled = scaler.transform(uni_data[:, np.newaxis])

<details>
<summary markdown='span'>View solution
</summary>

```python
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()

uni_data_train = uni_data[:TRAIN_SPLIT]
scaler.fit(uni_data_train[:, np.newaxis])

uni_data_scaled = scaler.transform(uni_data[:, np.newaxis])
```
</details>
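
For intuition, the same transformation can be written by hand with NumPy. This is a minimal sketch on made-up numbers, assuming the statistics come from the training portion only; `toy`, `toy_split` and the other names are hypothetical.

```python
import numpy as np

toy = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
toy_split = 4  # the first 4 values play the role of the training set

# Statistics are computed on the training part only
train_mean = toy[:toy_split].mean()
train_std = toy[:toy_split].std()  # population standard deviation (ddof=0)

# The same mean and standard deviation are applied to the whole series
toy_scaled = (toy - train_mean) / train_std

print(toy_scaled[:toy_split].mean(), toy_scaled[:toy_split].std())
```

After scaling, the training part has mean 0 and standard deviation 1 (up to floating-point error), while later values are expressed relative to the training statistics.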

Let's now create the data for the univariate model. First, we want to predict the next time step given a history of 20 steps.

In [None]:
univariate_past_history = 20
univariate_future_target = 0

x_train_uni, y_train_uni = extract_data_labels(uni_data_scaled, 0, TRAIN_SPLIT,
                                           univariate_past_history,
                                           univariate_future_target)
x_val_uni, y_val_uni = extract_data_labels(uni_data_scaled, TRAIN_SPLIT, None,
                                       univariate_past_history,
                                       univariate_future_target)

This is what the `extract_data_labels` function returns.

In [None]:
print('Single window of past history')
print(x_train_uni[0])
print('\n Target temperature to predict')
print(y_train_uni[0])

Now that the data has been created, let's take a look at a single example. The information given to the network is given in blue, and it must predict the value at the red cross.

In [None]:
def create_time_steps(length):
  # Negative time steps for the history: -length, ..., -1
  return list(range(-length, 0))

In [None]:
def show_plot(plot_data, names, delta, title):
  labels = ['History', 'True Future'] + names
  m = len(plot_data)
  marker = ['.-', 'rx']
  for i in range(m-2):
    marker.append('o')
  time_steps = create_time_steps(plot_data[0].shape[0])
  if delta:
    future = delta
  else:
    future = 0

  plt.title(title)
  for i, x in enumerate(plot_data):
    if i:
      plt.plot(future, plot_data[i], marker[i], markersize=10,
               label=labels[i])
    else:
      plt.plot(time_steps, plot_data[i].flatten(), marker[i], label=labels[i])
  plt.legend()
  plt.xlim([time_steps[0], (future+5)*2])
  plt.xlabel('Time-Step')
  return plt

In [None]:
show_plot([x_train_uni[0], y_train_uni[0]], [], 0, 'Sample Example')

### Baseline
Before proceeding to train a model, let's first set a simple baseline. Given an input point, the baseline method looks at all the history and predicts the next point to be the average of the last 20 observations.

In [None]:
# TODO: define a baseline which simply computes the mean of the history
# [EXTRA]: if you have other ideas for a simple baseline, you can add them here
def baseline(history):
  pass



<details>
<summary markdown='span'>View solution
</summary>

```python
def baseline(history):
  return np.mean(history)
```
</details>
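
As one possible answer to the [EXTRA] prompt, the sketch below compares the mean baseline with a "last value" baseline on a toy series. `last_value_baseline` and `baseline_mae` are hypothetical helpers introduced here for illustration; they are not part of the notebook.

```python
import numpy as np

def mean_baseline(history):
    # Predict the average of the window
    return np.mean(history)

def last_value_baseline(history):
    # history has shape (history_size, 1); predict the most recent observation
    return history[-1, 0]

def baseline_mae(windows, labels, predict):
    # Mean absolute error of a baseline over a set of windows
    predictions = np.array([predict(w) for w in windows])
    return np.mean(np.abs(predictions - labels))

# Toy, steadily increasing series: 0, 1, ..., 9
series = np.arange(10, dtype=float)
windows = np.array([series[i - 3:i] for i in range(3, 10)])[..., np.newaxis]
labels = series[3:10]

print(baseline_mae(windows, labels, mean_baseline))        # 2.0
print(baseline_mae(windows, labels, last_value_baseline))  # 1.0
```

On a trending series, the last observation is closer to the next value than the window average is, which is why the second baseline scores better in this toy case.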

In [None]:
show_plot([x_train_uni[0], y_train_uni[0], baseline(x_train_uni[0])], ['Baseline Prediction'], 0,
           'Baseline Prediction Example')

Let's see if you can beat this baseline using a recurrent neural network.

### Recurrent neural network

Recall:

A Recurrent Neural Network (RNN) is a type of neural network well-suited to time series data. RNNs process a time series step-by-step, maintaining an internal state summarizing the information they've seen so far.
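
The step-by-step processing can be sketched in NumPy. Note that this is a plain recurrent cell with random weights, shown only to illustrate how an internal state is carried along the sequence; it is not the LSTM layer used in this notebook, and all names and sizes are assumptions.

```python
import numpy as np

def rnn_forward(inputs, w_x, w_h, b):
    """Run a plain recurrent cell over a sequence and return the final state."""
    state = np.zeros(w_h.shape[0])
    for x_t in inputs:  # one time step at a time
        # The new state mixes the current input with the previous state
        state = np.tanh(w_x @ x_t + w_h @ state + b)
    return state

rng = np.random.default_rng(0)
units, n_features, n_steps = 8, 1, 20
inputs = rng.normal(size=(n_steps, n_features))
w_x = rng.normal(size=(units, n_features))
w_h = rng.normal(size=(units, units))
b = np.zeros(units)

final_state = rnn_forward(inputs, w_x, w_h, b)
print(final_state.shape)  # (8,)
```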

Here, you will use a specialized RNN layer called Long Short-Term Memory ([LSTM](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/layers/LSTM)).

In [None]:
x_train_uni.shape

In [None]:
# TODO: create a Network with 1 LSTM hidden layer and compile it

<details>
<summary markdown='span'>Hints
</summary>
Which loss should you use?
</details>

<details>
<summary markdown='span'>View solution
</summary>

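```python
simple_lstm_model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(8, input_shape=x_train_uni.shape[-2:]),
    tf.keras.layers.Dense(1)
])

# Track MSE as a metric so that it can be plotted alongside the loss later on
simple_lstm_model.compile(optimizer='adam', loss='mae', metrics=['mse'])
```
</details>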

Let's make a sample prediction, to check the output of the model and make sure it goes through.

In [None]:
# TODO: predict on one random sample of the validation set

<details>
<summary markdown='span'>View solution
</summary>

```python
print(simple_lstm_model.predict(x_val_uni[:1]).shape)
```
</details>

Let's train the model now. Due to the large size of the dataset, in the interest of saving time, each epoch will only run for 200 steps, instead of the complete training data as normally done.
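
To get a feel for how much data one shortened epoch covers, here is a back-of-the-envelope sketch. The batch size of 256 is an assumption made for illustration (this notebook does not state it); the 200 steps, the 300,000-row split and the history of 20 steps come from the text above.

```python
STEPS_PER_EPOCH = 200     # from the text above
ASSUMED_BATCH_SIZE = 256  # assumption: not specified in this notebook
TRAIN_SPLIT = 300000
HISTORY_SIZE = 20

# One training window is produced for each index from HISTORY_SIZE to TRAIN_SPLIT - 1
n_train_windows = TRAIN_SPLIT - HISTORY_SIZE
samples_per_epoch = STEPS_PER_EPOCH * ASSUMED_BATCH_SIZE

print(n_train_windows)                                 # 299980
print(samples_per_epoch)                               # 51200
print(round(samples_per_epoch / n_train_windows, 3))   # 0.171
```

Under that assumed batch size, each epoch would see roughly a sixth of the available training windows.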

In [None]:
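# TODO: fit your model here using the following settings
EVALUATION_INTERVAL = 200  # steps per epoch, as stated above
BATCH_SIZE = 256           # assumed value: not specified in this notebook
EPOCHS = 10                # assumed value: not specified in this notebook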

<details>
<summary markdown='span'>View solution
</summary>

```python
simple_lstm_model.fit(x_train_uni, y_train_uni, batch_size=BATCH_SIZE,
                      epochs=EPOCHS, steps_per_epoch=EVALUATION_INTERVAL,
                      shuffle=True,
                      validation_data=(x_val_uni, y_val_uni), validation_steps=50)
```
</details>

In [None]:
history_df = pd.DataFrame(simple_lstm_model.history.history)
history_df['epochs'] = history_df.index
history_df

In [None]:
# TODO: plot the loss and metric curves on both the train and validation set


<details>
<summary markdown='span'>View solution
</summary>

```python
fig, axes = plt.subplots(2, 1, figsize=(14, 12))

for i, metric in enumerate(['loss', 'mse']):
  ax = axes[i]
  history_df.plot('epochs', f'{metric}', color='g', label='train', ax=ax)
  history_df.plot('epochs', f'val_{metric}', color='r', label='val', ax=ax)
  ax.set_ylabel(metric)
plt.show()
```
</details>

#### Predict using the simple LSTM model
Now that you have trained your simple LSTM, let's try and make a few predictions.

In [None]:
x_val_uni.shape

In [None]:
# Here we plot the predictions of the LSTM and the baseline on 3 random samples
ids = np.random.randint(0, len(x_val_uni), 3)
for i in ids:
  x, y = x_val_uni[i], y_val_uni[i]
  plot = show_plot([x, y, baseline(x),
                    simple_lstm_model.predict(x[np.newaxis, :])], ['Baseline Prediction', 'LSTM Prediction'],
                   0, 'Simple LSTM model')
  plot.show()

This looks better than the baseline. Now that you have seen the basics, let's move on to part two, where you will work with a multivariate time series.

## Part 2: Forecast a multivariate time series

The original dataset contains 14 features. For simplicity, this section considers only three of the original fourteen. The features used are air temperature, atmospheric pressure, and air density. 

To use more features, add their names to this list.

In [None]:
features_considered = ['p (mbar)', 'T (degC)', 'rho (g/m**3)']

In [None]:
features = df[features_considered]
features.index = df['Date Time']
features.head()

Let's have a look at how each of these features varies across time.

In [None]:
features.plot(subplots=True)

### Standardisation

As mentioned, the first step will be to normalize the dataset using the mean and standard deviation of the training data.

In [None]:
# TODO: normalize your data

<details>
<summary markdown='span'>View solution
</summary>

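```python
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()

dataset = features.values
dataset_train = dataset[:TRAIN_SPLIT]
# Fit on the training rows only to avoid leaking validation statistics
scaler.fit(dataset_train)

# Transform the full dataset so that validation windows can be extracted too
dataset_scaled = scaler.transform(dataset)
```
</details>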

### Single step model
In a single step setup, the model learns to predict a single point in the future based on some history provided.

The function below performs the same windowing task as above; however, here it samples the past observations based on the step size given.

In [None]:
def extract_data_labels_multi(dataset, target, start_index, end_index, history_size,
                      target_size, step, single_step=False):
  '''
  Extract features and label from the dataset, sampling the data in the index
  range (start_index, end_index), with a step size of step.
  It uses a number of time steps defined by history_size for the features and
  return information after target_size timesteps for the label.
  '''
  data = []
  labels = []

  start_index = start_index + history_size
  if end_index is None:
    end_index = len(dataset) - target_size

  for i in range(start_index, end_index):
    indices = range(i-history_size, i, step)
    data.append(dataset[indices])

    if single_step:
      labels.append(target[i+target_size])
    else:
      labels.append(target[i:i+target_size])

  return np.array(data), np.array(labels)

In this tutorial, the network is shown data from the last five (5) days, i.e. 720 observations at the original 10-minute resolution. This window is subsampled to one observation per hour, since a drastic change is not expected within 60 minutes. Thus, 120 observations represent the history of the last five days. For the single step prediction model, the label for a datapoint is the temperature 12 hours into the future. To create this label, the temperature after 72 (12 × 6) observations is used.
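
The numbers in this paragraph can be verified with the same `range` expression that the windowing function uses. This is a small illustrative sketch; `i` is simply the first position at which a full window is available.

```python
past_history = 720   # 5 days of 10-minute observations
future_target = 72   # 12 hours ahead (12 * 6 observations)
STEP = 6             # keep one observation per hour

i = past_history     # first position at which a full window is available
indices = range(i - past_history, i, STEP)

print(len(indices))                    # 120
print(list(indices)[:3], indices[-1])  # [0, 6, 12] 714
print(i + future_target)               # 792, the row used as the single-step label
```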

In [None]:
past_history = 720
future_target = 72
STEP = 6

x_train_single, y_train_single = extract_data_labels_multi(dataset_scaled, dataset_scaled[:, 1], 0,
                                                   TRAIN_SPLIT, past_history,
                                                   future_target, STEP,
                                                   single_step=True)
x_val_single, y_val_single = extract_data_labels_multi(dataset_scaled, dataset_scaled[:, 1],
                                               TRAIN_SPLIT, None, past_history,
                                               future_target, STEP,
                                               single_step=True)

In [None]:
x_train_single.shape

In [None]:
y_train_single.shape

Let's look at a single data-point.


In [None]:
print('Single window of past history : {}'.format(x_train_single[0].shape))

In [None]:
# TODO: build a Keras NN with one hidden layer of LSTM for the single step task

<details>
<summary markdown='span'>View solution
</summary>

```python
single_step_model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(32, input_shape=x_train_single.shape[-2:]),
    tf.keras.layers.Dense(1)
])
single_step_model.compile(optimizer=tf.keras.optimizers.RMSprop(), loss='mae', metrics=['mse'])
```
</details>

Let's check out a sample prediction.

In [None]:
# TODO: check that your model can predict on an input sample

<details>
<summary markdown='span'>View solution
</summary>

```python
print(single_step_model.predict(x_val_single[:1]).shape)
```
</details>

In [None]:
# TODO: Fit your model with the same settings as previously
single_step_history = single_step_model.fit(x_train_single, y_train_single,
                                            epochs=EPOCHS, batch_size=BATCH_SIZE,
                                            steps_per_epoch=EVALUATION_INTERVAL,
                                            shuffle=True,
                                            validation_data=(x_val_single, y_val_single),
                                            validation_steps=50)

In [None]:
# TODO: plot metrics of train and val

<details>
<summary markdown='span'>View solution
</summary>

```python
history_df = pd.DataFrame(single_step_history.history)
history_df['epochs'] = history_df.index
history_df

fig, axes = plt.subplots(2, 1, figsize=(14, 12))

for i, metric in enumerate(['loss', 'mse']):
  ax = axes[i]
  history_df.plot('epochs', f'{metric}', color='g', label='train', ax=ax)
  history_df.plot('epochs', f'val_{metric}', color='r', label='val', ax=ax)
  ax.set_ylabel(metric)
plt.show()
```
</details>

#### Predict a single step future
Now that the model is trained, let's make a few sample predictions. The model is given the history of three features over the past five days, sampled every hour (120 data points). Since the goal is to predict the temperature, the plot only displays the past temperature. The prediction is made 12 hours into the future (hence the gap between the history and the prediction).

In [None]:
x_val_single[0].shape

In [None]:
ids = np.random.randint(0, len(x_val_single), 3)
for i in ids:
  x, y = x_val_single[i], y_val_single[i]
  plot = show_plot([x[:, 1], y, single_step_model.predict(x[np.newaxis, :])],
                   ['Single Step LSTM'],
                    12,
                   'Single Step Prediction')
  plot.show()

In [None]:
# [EXTRA]: Include more features in your model and/or tune the architecture to improve your predictions!

### Multi-Step model
In a multi-step prediction model, given a past history, the model needs to learn to predict a range of future values. Thus, unlike a single step model, where only a single future point is predicted, a multi-step model predicts a sequence of future values.

For the multi-step model, the training data again consists of recordings __over the past 5 days__ sampled __every hour__. However, here, the model needs to learn to predict the temperature __for the next 12 hours__. Since an observation is taken every 10 minutes, the output is __72 predictions__. For this task, the dataset needs to be prepared accordingly, so the first step is to create it again, but with a different target window.
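
The difference between the two kinds of label can be seen on a toy array. This sketch mirrors the two indexing expressions used in `extract_data_labels_multi`, with a short horizon of 5 instead of 72; `target` is a made-up stand-in for the temperature column.

```python
import numpy as np

target = np.arange(100)  # stand-in for the temperature column
i = 10                   # position right after the history window
target_size = 5          # toy horizon (the notebook uses 72)

single_step_label = target[i + target_size]   # one value, target_size steps ahead
multi_step_label = target[i:i + target_size]  # the next target_size values

print(single_step_label)  # 15
print(multi_step_label)   # [10 11 12 13 14]
```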

In [None]:
# TODO: create the train and val dataset with this new setting
# (use the extract_data_labels_multi function defined before)

<details>
<summary markdown='span'>View solution
</summary>

```python
future_target = 72
# Use the standardized data, as for the single step model
x_train_multi, y_train_multi = extract_data_labels_multi(dataset_scaled, dataset_scaled[:, 1], 0,
                                                 TRAIN_SPLIT, past_history,
                                                 future_target, STEP)
x_val_multi, y_val_multi = extract_data_labels_multi(dataset_scaled, dataset_scaled[:, 1],
                                             TRAIN_SPLIT, None, past_history,
                                             future_target, STEP)
```
</details>

Let's check out a sample data-point.

In [None]:
print('Single window of past history : {}'.format(x_train_multi[0].shape))
print('\n Target temperature to predict : {}'.format(y_train_multi[0].shape))

Plotting a sample data-point.

In [None]:
def multi_step_plot(history, true_future, prediction):
  plt.figure(figsize=(12, 6))
  num_in = create_time_steps(len(history))
  num_out = len(true_future)

  plt.plot(num_in, np.array(history[:, 1]), label='History')
  plt.plot(np.arange(num_out)/STEP, np.array(true_future), 'bo',
           label='True Future')
  if prediction.any():
    plt.plot(np.arange(num_out)/STEP, np.array(prediction), 'ro',
             label='Predicted Future')
  plt.legend(loc='upper left')
  plt.show()

In this plot and subsequent similar plots, the x-axis is in hours: the history is sampled every hour, while the future values are spaced 10 minutes apart.
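
The x-axis values used by the plotting function can be reproduced as follows. This is only a sketch of the two axis computations, assuming 120 history points and 72 future points as in this section.

```python
import numpy as np

STEP = 6
history_len = 120  # hourly samples over 5 days
num_out = 72       # 10-minute samples over 12 hours

history_axis = np.arange(-history_len, 0)  # -120, ..., -1 (hours before the forecast)
future_axis = np.arange(num_out) / STEP    # 0, 1/6, ..., 71/6 (hours after it)

print(history_axis[0], history_axis[-1])  # -120 -1
print(future_axis[:3])
```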

In [None]:
multi_step_plot(x_train_multi[0], y_train_multi[0], np.array([0]))

Since the task here is a bit more complicated than the previous task, the model now consists of two LSTM layers. Finally, since 72 predictions are made, the dense layer outputs 72 predictions.

In [None]:
x_train_multi.shape[-2:]

In [None]:
# TODO: build and compile the model

In [None]:
multi_step_model.summary()

<details>
<summary markdown='span'>View solution
</summary>

```python
multi_step_model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(32, return_sequences=True, input_shape=x_train_multi.shape[-2:]),
    tf.keras.layers.LSTM(16, activation='relu'),
    tf.keras.layers.Dense(72)
])

multi_step_model.compile(optimizer=tf.keras.optimizers.RMSprop(clipvalue=1.0), loss='mae', metrics=['mse'])
```
</details>

Let's see how the model predicts before it trains.

In [None]:
# TODO: check your model is able to predict on one sample

<details>
<summary markdown='span'>View solution
</summary>

```python
print(multi_step_model.predict(x_train_multi[:1]).shape)
```
</details>

In [None]:
# TODO: fit your model with the same settings as before


<details>
<summary markdown='span'>View solution
</summary>

```python
multi_step_history = multi_step_model.fit(x_train_multi, y_train_multi,
                                          epochs=EPOCHS, batch_size=BATCH_SIZE,
                                          steps_per_epoch=EVALUATION_INTERVAL,
                                          shuffle=True,
                                          validation_data=(x_val_multi, y_val_multi),
                                          validation_steps=50)
```
</details>

In [None]:
history_df = pd.DataFrame(multi_step_history.history)
history_df['epochs'] = history_df.index
history_df

fig, axes = plt.subplots(2, 1, figsize=(14, 12))
 
for i, metric in enumerate(['loss', 'mse']):
  ax = axes[i]
  history_df.plot('epochs', f'{metric}', color='g', label='train', ax=ax)
  history_df.plot('epochs', f'val_{metric}', color='r', label='val', ax=ax)
  ax.set_ylabel(metric)
plt.show()

<details>
<summary markdown='span'>View solution
</summary>

```python
history_df = pd.DataFrame(multi_step_history.history)
history_df['epochs'] = history_df.index
history_df

fig, axes = plt.subplots(2, 1, figsize=(14, 12))

for i, metric in enumerate(['loss', 'mse']):
  ax = axes[i]
  history_df.plot('epochs', f'{metric}', color='g', label='train', ax=ax)
  history_df.plot('epochs', f'val_{metric}', color='r', label='val', ax=ax)
  ax.set_ylabel(metric)
plt.show()
```
</details>

#### Predict a multi-step future
Let's now have a look at how well your network has learnt to predict the future.

In [None]:
multi_step_model.predict(x[np.newaxis, :]).shape

In [None]:
ids = np.random.randint(0, len(x_val_multi), 3)
for i in ids:
  x, y = x_val_multi[i], y_val_multi[i]
  multi_step_plot(x, y, multi_step_model.predict(x[np.newaxis, :])[0])