# Forecasting Next Time Step

In [1]:
import numpy as np

In [2]:
'''This function creates as many time series as requested (via the batch_size argument), each of length n_steps, and there is just one
value per time step in each series (i.e., all series are univariate). The function returns a NumPy array of shape [batch size, time steps, 1], where each series is the sum of two sine waves of fixed amplitudes but random frequencies and phases, plus a bit of noise.'''

def generate_time_series(batch_size, n_steps):
    freq1, freq2, offsets1, offsets2 = np.random.rand(4, batch_size, 1)
    time = np.linspace(0, 1, n_steps)
    series = 0.5 * np.sin((time - offsets1) * (freq1 * 10 + 10))  #  wave 1
    series += 0.2 * np.sin((time - offsets2) * (freq2 * 20 + 20)) # +wave 2
    series += 0.1 * (np.random.rand(batch_size, n_steps) - 0.5)   # +noise
    return series[..., np.newaxis].astype(np.float32)
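
The output shape described above comes from NumPy broadcasting: each random parameter has shape [batch size, 1] and `time` has shape [n_steps], so combining them gives [batch size, n_steps]. A minimal sketch of that step alone, with small sizes (3 and 5) assumed purely for readability:

```python
import numpy as np

batch_size, n_steps = 3, 5                         # illustrative sizes, not the notebook's
freq, offsets = np.random.rand(2, batch_size, 1)   # each has shape (3, 1)
time = np.linspace(0, 1, n_steps)                  # shape (5,)

# (5,) combined with (3, 1) broadcasts to (3, 5): one row per series, one column per step
wave = 0.5 * np.sin((time - offsets) * (freq * 10 + 10))
series = wave[..., np.newaxis].astype(np.float32)  # append the feature axis

print(wave.shape)    # (3, 5)
print(series.shape)  # (3, 5, 1)
```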

In [3]:
n_steps = 50
series = generate_time_series(10000, n_steps + 1)
X_train, y_train = series[:7000, :n_steps], series[:7000, -1]
X_valid, y_valid = series[7000:9000, :n_steps], series[7000:9000, -1]
X_test, y_test = series[9000:, :n_steps], series[9000:, -1]

# X_train contains 7,000 time series (i.e., its shape is [7000, 50, 1]). Since we want to forecast a single value for each series, the targets are column vectors (e.g., y_train has a shape of [7000, 1]).
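
The shapes of all six arrays can be checked without generating real data. The sketch below uses a zero-filled placeholder with the same shape that `generate_time_series(10000, n_steps + 1)` returns. The placeholder is an assumption made only to keep the example self-contained.

```python
import numpy as np

n_steps = 50
# Placeholder with the same shape as generate_time_series(10000, n_steps + 1)
series = np.zeros((10000, n_steps + 1, 1), dtype=np.float32)

X_train, y_train = series[:7000, :n_steps], series[:7000, -1]
X_valid, y_valid = series[7000:9000, :n_steps], series[7000:9000, -1]
X_test, y_test = series[9000:, :n_steps], series[9000:, -1]

# Inputs keep the time axis; targets are the single value at the last step
for name, arr in [("X_train", X_train), ("y_train", y_train),
                  ("X_valid", X_valid), ("y_valid", y_valid),
                  ("X_test", X_test), ("y_test", y_test)]:
    print(name, arr.shape)
```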

## Methods

In [4]:
import tensorflow as tf
import keras

### 1) Naive Forecasting
Predict that the next value equals the last observed value in each series.

In [5]:
y_pred = X_valid[:, -1]
np.mean(keras.losses.mean_squared_error(y_valid, y_pred))

0.02037423
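
To see what this baseline computes, here is a small NumPy-only sketch on two made-up series (the numbers are assumptions chosen so the arithmetic is easy to follow). Each forecast is simply the last input value, and the score is the mean of the squared errors.

```python
import numpy as np

# Two toy series of 4 steps each, shape (batch, steps, 1); values are made up
X = np.array([[[0.1], [0.2], [0.3], [0.4]],
              [[0.5], [0.4], [0.3], [0.2]]], dtype=np.float32)
y_true = np.array([[0.5], [0.1]], dtype=np.float32)   # actual next values

y_naive = X[:, -1]                      # last observed value, shape (2, 1)
mse = np.mean((y_true - y_naive) ** 2)  # both errors are 0.1 in magnitude
print(round(float(mse), 4))  # 0.01
```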

### 2) Linear Regression

In [6]:
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[50, 1]),
    keras.layers.Dense(1)
])

In [7]:
model.compile(optimizer='adam', loss='mse')

In [8]:
model.fit(X_train, y_train, epochs=20)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x1d10ff4a3e0>

In [9]:
model.evaluate(X_valid, y_valid)



0.0038358382880687714

### 3) Single Simple RNN

In [10]:
model = keras.models.Sequential([
    keras.layers.SimpleRNN(1, input_shape=[None, 1])
])

'''The model contains a single layer with a single neuron (units=1). We do not need to specify the length of the input sequence, since an RNN can process any number of time steps (hence the first input dimension is None). By default, the SimpleRNN layer uses the hyperbolic tangent activation function.'''

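To make the "single neuron, any sequence length" point concrete, here is a minimal NumPy sketch of the recurrence a one-unit simple RNN computes, h_t = tanh(w_x * x_t + w_h * h_(t-1) + b). The weights are arbitrary illustrative constants, not values Keras would learn, and `simple_rnn_1unit` is a hypothetical helper name.

```python
import numpy as np

def simple_rnn_1unit(x, w_x=0.8, w_h=0.5, b=0.0):
    """Run a one-unit tanh RNN over x of shape (batch, steps, 1).

    Returns only the last output, shape (batch, 1). The weights are
    illustrative constants, not trained values.
    """
    h = np.zeros((x.shape[0], 1), dtype=x.dtype)   # initial hidden state
    for t in range(x.shape[1]):                    # works for any number of steps
        h = np.tanh(w_x * x[:, t] + w_h * h + b)
    return h

short_batch = np.random.rand(4, 10, 1).astype(np.float32)
long_batch = np.random.rand(4, 75, 1).astype(np.float32)
print(simple_rnn_1unit(short_batch).shape)  # (4, 1)
print(simple_rnn_1unit(long_batch).shape)   # (4, 1)
```

The same function handles 10 or 75 time steps, and because of the tanh every output lies strictly between -1 and 1.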

In [11]:
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, y_train, epochs=20)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x1d11028f1c0>

In [12]:
model.evaluate(X_valid, y_valid)



0.012219041585922241

### 4) Multilayered Simple RNN

In [13]:
'''Set return_sequences=True for all recurrent layers (except the last one, if you only care about the last output). If you don’t, they will output a 2D array (containing only the output of the last time step) instead of a 3D array (containing outputs for all time steps), and the next recurrent layer will complain that you are not feeding it sequences in the expected 3D format.'''

model = keras.models.Sequential([
    keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[None,1]),    # no of neurons = 20
    keras.layers.SimpleRNN(20, return_sequences=True),
    keras.layers.SimpleRNN(1)
])
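
The effect of `return_sequences` on output shapes can be sketched without Keras. The toy layer below is an assumption-laden stand-in (random untrained weights, no bias, hypothetical name `rnn_layer`); it only illustrates why intermediate layers must return 3D arrays.

```python
import numpy as np

def rnn_layer(x, units, return_sequences=False, seed=0):
    """Toy tanh recurrent layer with random, untrained weights.

    x has shape (batch, steps, features).
    """
    rng = np.random.default_rng(seed)
    w_x = rng.normal(scale=0.1, size=(x.shape[-1], units))
    w_h = rng.normal(scale=0.1, size=(units, units))
    h = np.zeros((x.shape[0], units))
    outputs = []
    for t in range(x.shape[1]):
        h = np.tanh(x[:, t] @ w_x + h @ w_h)
        outputs.append(h)
    # 3D (all steps) if return_sequences is set, else 2D (last step only)
    return np.stack(outputs, axis=1) if return_sequences else h

x = np.zeros((32, 50, 1))
seq = rnn_layer(x, 20, return_sequences=True)
last = rnn_layer(x, 20)
print(seq.shape)                # (32, 50, 20)
print(last.shape)               # (32, 20)
print(rnn_layer(seq, 1).shape)  # (32, 1)
```

Only `seq` still has a time axis, so only it can serve as input to a further recurrent layer.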

In [14]:
'''The final SimpleRNN(1) layer has a single unit, so its hidden state is a single number. The RNN will mostly use the hidden states of the other recurrent layers to carry over the information it needs from time step to time step, and it will not use the final layer’s hidden state very much. Moreover, since a SimpleRNN layer uses the tanh activation function by default, the predicted values must lie within the range –1 to 1. But what if you want to use another activation function?
For both these reasons, replace the output layer with a Dense layer. Also make sure to remove return_sequences=True from the second (now last) recurrent layer.'''

model = keras.models.Sequential([
    keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[None,1]),
    keras.layers.SimpleRNN(20),
    keras.layers.Dense(1)
])

In [15]:
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, y_train, epochs=20)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x1d11357e6b0>

In [16]:
model.evaluate(X_valid, y_valid)



0.0026811794377863407

## Conclusion

Validation MSE (lower is better): Stacked RNN (0.0027) < Linear Regression (0.0038) < Single RNN (0.0122) < Naive Forecasting (0.0204)

# Forecasting Several Time Steps Ahead
E.g., we want to predict the next 10 values.

## Methods

### 1) Multilayered Simple RNN with single output
Use the model we already trained, make it predict the next value, then add that value to the inputs (acting as if this predicted value had actually occurred), and use the model again to predict the following value, and so on.

In [17]:
series = generate_time_series(1, n_steps + 10)              # generate actual data up to 10 steps ahead
X_new, Y_new = series[:, :n_steps], series[:, n_steps:]     # X = the first n_steps values, Y = the following 10 values
X = X_new
for step_ahead in range(10):
    y_pred_one = model.predict(X[:, step_ahead:])[:, np.newaxis, :]     # explained below
    X = np.concatenate([X, y_pred_one], axis=1)     # axis=1 specifies that concatenation should be done along the second axis, which represents the time steps.

# X[:, step_ahead:] keeps every instance and drops the first step_ahead time steps, so the model always sees the most recent n_steps values.
# model.predict returns an array of shape (num_instances, 1); [:, np.newaxis, :] inserts a time axis as the second dimension, giving y_pred_one the shape (num_instances, 1, num_features).
# That 3D shape matches X, which np.concatenate needs in order to append the prediction along the time axis.
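
The shape bookkeeping in this loop can be checked with a stand-in predictor. In the sketch below, `predict_next` is a hypothetical replacement for `model.predict` (it just averages the last three values); the batch of 2 series is likewise an assumption for illustration.

```python
import numpy as np

def predict_next(x):
    """Hypothetical stand-in for model.predict: returns the mean of the
    last 3 steps, shape (batch, 1)."""
    return x[:, -3:, 0].mean(axis=1, keepdims=True)

n_steps, horizon = 50, 10
X = np.random.rand(2, n_steps, 1).astype(np.float32)   # (batch, steps, features)

for step_ahead in range(horizon):
    window = X[:, step_ahead:]                         # always the latest n_steps values
    y_next = predict_next(window)[:, np.newaxis, :]    # (2, 1) -> (2, 1, 1)
    X = np.concatenate([X, y_next], axis=1)            # grow along the time axis

Y_pred = X[:, n_steps:]
print(X.shape)       # (2, 60, 1)
print(Y_pred.shape)  # (2, 10, 1)
```

Each forecast becomes part of the input window for the next one, which is why errors can accumulate as the horizon grows.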



In [18]:
Y_pred = X[:, n_steps:]

In [19]:
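# model.evaluate expects model inputs as its first argument, so score the 10 forecasts against the 10 actual values directly
np.mean(keras.losses.mean_squared_error(Y_new, Y_pred))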

### 2) Multilayered Simple RNN with 10 outputs
Predict all 10 next values at once. We can still use a sequence-to-vector model, but it will output 10 values instead of 1. However, we first need to change the targets to be vectors containing the next 10 values

In [20]:
series = generate_time_series(10000, n_steps + 10)
X_train, Y_train = series[:7000, :n_steps], series[:7000, -10:, 0] # (#instances, #time steps, #features / dimensionality)
X_valid, Y_valid = series[7000:9000, :n_steps], series[7000:9000, -10:, 0]
X_test, Y_test = series[9000:, :n_steps], series[9000:, -10:, 0]

# The final index 0 in series[:, -10:, 0] selects the first (and only) feature, which drops the feature axis.
# For example, series[9000:, -10:, 0] takes the last 10 time steps of every instance from index 9000 onward, giving an array of shape [1000, 10].
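
A tiny example makes the slicing visible. The sizes (5 input steps, 3 target steps) and the index-valued series are assumptions chosen so that the selected values are easy to read.

```python
import numpy as np

n_steps, horizon = 5, 3                       # small illustrative sizes
# Two series whose values are just their time index, shape (2, 8, 1)
series = np.tile(np.arange(n_steps + horizon, dtype=np.float32), (2, 1))[..., np.newaxis]

X = series[:, :n_steps]          # inputs keep the feature axis: (2, 5, 1)
Y = series[:, -horizon:, 0]      # indexing with 0 drops it:     (2, 3)

print(X.shape, Y.shape)  # (2, 5, 1) (2, 3)
print(Y[0])              # [5. 6. 7.]
```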

In [21]:
model = keras.models.Sequential([
    keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[None, 1]),
    keras.layers.SimpleRNN(20),
    keras.layers.Dense(10)
])

In [22]:
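model.compile(optimizer='adam', loss='mse')     # train on the 10-value targets before predicting
model.fit(X_train, Y_train, epochs=20)
Y_pred = model.predict(X_new)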



### 3) Multilayer RNN with 10 outputs using TimeDistributed layer
At time step 0 the model will output a vector containing the forecasts for time steps 1 to 10, then at time step 1 the model will forecast time steps 2 to 11, and so on. So each target must be a sequence of the same length as the input sequence, containing a 10-dimensional vector at each step.

In [23]:
Y = np.empty((10000, n_steps, 10))  # shape: (instances, time steps, 10); each target is a sequence of 10-dimensional vectors
for step_ahead in range(1, 10 + 1):
    Y[:, :, step_ahead - 1] = series[:, step_ahead:step_ahead + n_steps, 0]
Y_train = Y[:7000]
Y_valid = Y[7000:9000]
Y_test = Y[9000:]

# series[:, step_ahead:step_ahead + n_steps, 0] This selects a slice from the series dataset. The : before , indicates that all rows of series are included. step_ahead:step_ahead + n_steps specifies the range of columns (time steps) to select from step_ahead until step_ahead + n_steps. 0 at the end selects the values of the first feature or column.
# The resulting Y array will have dimensions (num_instances, n_steps, 10), where num_instances represents the number of instances in the dataset. Each slice of Y at Y[:, :, step_ahead - 1] will contain the target values for the corresponding step ahead.
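
The same loop on a toy series shows what ends up in each target vector. The sizes (4 input steps, horizon of 3) and the index-valued series are assumptions for readability.

```python
import numpy as np

n_steps, horizon = 4, 3                       # small illustrative sizes
# One series whose value at each step equals its time index, shape (1, 7, 1)
series = np.arange(n_steps + horizon, dtype=np.float32).reshape(1, -1, 1)

Y = np.empty((1, n_steps, horizon), dtype=np.float32)
for step_ahead in range(1, horizon + 1):
    Y[:, :, step_ahead - 1] = series[:, step_ahead:step_ahead + n_steps, 0]

print(Y[0, 0])   # [1. 2. 3.]  targets at input step 0
print(Y[0, -1])  # [4. 5. 6.]  targets at the last input step
```

At input step t the target vector holds the values at steps t+1 through t+horizon.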

In [24]:
'''To turn the model into a sequence-to-sequence model, set return_sequences=True in all recurrent layers (even the last one) and apply the output Dense layer at every time step. Keras offers a TimeDistributed layer for this very purpose: it wraps any layer (e.g., a Dense layer) and applies it at every time step of its input sequence. It does so by reshaping the inputs so that each time step is treated as a separate instance (i.e., it reshapes the inputs from [batch size, time steps, input dimensions] to [batch size × time steps, input dimensions]; in this example, the number of input dimensions is 20 because the previous SimpleRNN layer has 20 units), then it runs the Dense layer, and finally it reshapes the outputs back to sequences (i.e., it reshapes the outputs from [batch size × time steps, output dimensions] to [batch size, time steps, output dimensions]; in this example, the number of output dimensions is 10, since the Dense layer has 10 units).'''

model = keras.models.Sequential([
    keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[None, 1]),
    keras.layers.SimpleRNN(20, return_sequences=True),
    keras.layers.TimeDistributed(keras.layers.Dense(10))
])

# We could replace the last layer with just Dense(10), since a Dense layer applied to a 3D input acts on the last axis at every time step.
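
The reshape, apply, reshape-back procedure described above can be sketched in NumPy. The kernel `W` and bias `b` below are random stand-ins for a Dense layer's weights (an assumption for illustration); the sketch also shows that applying the same weights directly on the last axis gives an identical result.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, steps, in_dim, out_dim = 4, 50, 20, 10
x = rng.normal(size=(batch, steps, in_dim))      # stands in for the last RNN layer's output
W = rng.normal(size=(in_dim, out_dim))           # stand-in Dense kernel
b = rng.normal(size=out_dim)                     # stand-in Dense bias

# TimeDistributed-style: fold time into the batch axis, apply, unfold
flat = x.reshape(batch * steps, in_dim)          # (200, 20)
y_td = (flat @ W + b).reshape(batch, steps, out_dim)

# Applying the same weights directly on the last axis of the 3D input
y_direct = x @ W + b                             # (4, 50, 10)

print(y_td.shape)                   # (4, 50, 10)
print(np.allclose(y_td, y_direct))  # True
```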

In [27]:
'''All outputs are needed during training, but only the output at the last time step is useful for predictions and for evaluation. So although we will rely on the MSE over all the outputs for training, we will use a custom metric for evaluation, to only compute the MSE over the output at the last time step'''

def last_time_step_mse(Y_true, Y_pred):
    return keras.metrics.mean_squared_error(Y_true[:, -1], Y_pred[:, -1])

optimizer = keras.optimizers.Adam(learning_rate=0.01)
model.compile(loss="mse", optimizer=optimizer, metrics=[last_time_step_mse])
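
A NumPy analogue of the custom metric shows how it differs from the training loss. `last_time_step_mse_np` is a hypothetical name, the data are made up, and unlike the Keras metric (which Keras averages over the batch itself) this version returns a single scalar.

```python
import numpy as np

def last_time_step_mse_np(Y_true, Y_pred):
    """NumPy analogue of the metric: MSE over the final time step only."""
    return np.mean((Y_true[:, -1] - Y_pred[:, -1]) ** 2)

# Shape (batch, time steps, 10 forecasts per step); values are made up
Y_true = np.zeros((2, 3, 10))
Y_pred = np.zeros((2, 3, 10))
Y_pred[:, 0] = 5.0     # large error at the first step: ignored by the metric
Y_pred[:, -1] = 0.5    # error of 0.5 at the last step

print(np.mean((Y_true - Y_pred) ** 2))        # MSE over all steps, dominated by step 0
print(last_time_step_mse_np(Y_true, Y_pred))  # 0.25
```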