# Week 2: Deep Neural Networks for Time Series
## Preparing features and labels
We first have to divide our data into features and labels. In this case the feature is a fixed number of consecutive values from the series, and the label is the value that follows them.
We'll call the number of values that we treat as our feature the **window size**: we take a window of the data and train an ML model to predict the next value.
So, for example, if we take our time series data 30 days at a time, we'll use 30 values as the feature and the next value as the label. Then, over time, we'll train a neural network to map the 30 feature values to the single label.
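The windowing idea can be sketched in plain Python before bringing in `tf.data`. This is only an illustration: `make_windows` is a hypothetical helper, not part of TensorFlow, and a window size of 4 is assumed instead of 30 to keep the output readable.

```python
def make_windows(series, window_size):
    """Return (features, label) pairs: window_size values and the value after them."""
    pairs = []
    # The last usable window starts at len(series) - window_size - 1,
    # because every window needs one more value to serve as its label.
    for start in range(len(series) - window_size):
        features = series[start:start + window_size]
        label = series[start + window_size]
        pairs.append((features, label))
    return pairs


series = list(range(10))
pairs = make_windows(series, window_size=4)
for features, label in pairs:
    print(features, "->", label)
```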
Create some data for us:

In [1]:
import tensorflow as tf
dataset = tf.data.Dataset.range(10) # Make a range of 10 values
for val in dataset:
    print(val.numpy())



0
1
2
3
4
5
6
7
8
9


In [33]:
dataset = tf.data.Dataset.range(10)
# expand dataset using windowing:
dataset = dataset.window(5, shift=1, drop_remainder=True) # size of the window and how much to shift each time
# drop_remainder - will truncate the data and will give us only windows of 5 items
for window_dataset in dataset:
    for val in window_dataset:
        print(val.numpy(), end=" ")
    print()

0 1 2 3 4 
1 2 3 4 5 
2 3 4 5 6 
3 4 5 6 7 
4 5 6 7 8 
5 6 7 8 9 


In [34]:
# Flatten each window (itself a small dataset) into a single tensor of 5 values:
dataset = dataset.flat_map(lambda window: window.batch(5))
for window in dataset:
    print(window.numpy())

[0 1 2 3 4]
[1 2 3 4 5]
[2 3 4 5 6]
[3 4 5 6 7]
[4 5 6 7 8]
[5 6 7 8 9]


In [35]:
# Split data into features and labels:
dataset = dataset.map(lambda window: (window[:-1], window[-1:]))
for x,y in dataset:
    print(x.numpy(), y.numpy())

[0 1 2 3] [4]
[1 2 3 4] [5]
[2 3 4 5] [6]
[3 4 5 6] [7]
[4 5 6 7] [8]
[5 6 7 8] [9]


In [36]:
# Shuffle data before training:
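dataset = dataset.shuffle(buffer_size=10) # a buffer at least as large as the dataset (6 windows here) gives a uniform shuffle
# Batch some data:
dataset = dataset.batch(2).prefetch(1) # batch the data in sets of two; prefetch(1) prepares the next batch while the current one is in use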
for x,y in dataset:
    print("x = ", x.numpy())
    print("y = ", y.numpy())


x =  [[1 2 3 4]
 [5 6 7 8]]
y =  [[5]
 [9]]
x =  [[2 3 4 5]
 [0 1 2 3]]
y =  [[6]
 [4]]
x =  [[3 4 5 6]
 [4 5 6 7]]
y =  [[7]
 [8]]
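
The effect of the shuffle buffer size can be illustrated with a rough plain-Python sketch. It assumes the behaviour described in the `tf.data` documentation (fill a buffer, emit a randomly chosen element, replace it with the next input element); `buffered_shuffle` is a hypothetical helper and not TensorFlow's actual implementation.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Shuffle a stream of items using a buffer of at most buffer_size elements."""
    rng = random.Random(seed)
    buffer = []
    output = []
    for item in items:
        if len(buffer) == buffer_size:
            # Buffer is full: emit one random element to make room for the new item.
            output.append(buffer.pop(rng.randrange(len(buffer))))
        buffer.append(item)
    # Input is exhausted: drain what is left in the buffer in random order.
    while buffer:
        output.append(buffer.pop(rng.randrange(len(buffer))))
    return output


print(buffered_shuffle(range(10), buffer_size=1, seed=0))   # buffer of 1: order is unchanged
print(buffered_shuffle(range(10), buffer_size=3, seed=0))   # small buffer: only local shuffling
print(buffered_shuffle(range(10), buffer_size=10, seed=0))  # buffer holds everything: full shuffle
```

Under these assumptions, a buffer smaller than the dataset only mixes nearby elements, which is why the buffer size is chosen to be at least the number of items when the data fits in memory.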


## Feeding windowed dataset into a neural network
We start with a helper function that splits a data series into windows.

In [None]:
def windowed_dataset(series, window_size, batch_size, shuffle_buffer):
    dataset = tf.data.Dataset.from_tensor_slices(series) # create a dataset from series
    dataset = dataset.window(window_size + 1, shift=1, drop_remainder=True) # slice data up into appropriate windows
    dataset = dataset.flat_map(lambda window: window.batch(window_size + 1)) # flatten data into chunks of 'window_size+1'
    dataset = dataset.shuffle(shuffle_buffer).map(lambda window: (window[:-1], window[-1])) # shuffle the windows, then split each one into X (all but the last value) and Y (the last value)
    dataset = dataset.batch(batch_size).prefetch(1) # create data batches; the result must be assigned, because dataset methods return a new dataset
    return dataset
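
To get a feel for how much data this function produces, the arithmetic can be sketched in plain Python. The helpers `count_windows` and `count_batches` are hypothetical names introduced only for this illustration; the sketch assumes `drop_remainder=True` for windowing and the default `batch()` behaviour of keeping the final partial batch.

```python
import math


def count_windows(series_length, window_size):
    """Number of complete windows of window_size features plus 1 label, with shift=1."""
    return max(series_length - (window_size + 1) + 1, 0)


def count_batches(num_windows, batch_size):
    """Number of batches when the last, possibly smaller, batch is kept."""
    return math.ceil(num_windows / batch_size)


# Values assumed from the examples in these notes: 1000 training points,
# window_size = 20, batch_size = 32.
num_windows = count_windows(series_length=1000, window_size=20)
num_batches = count_batches(num_windows, batch_size=32)
print(num_windows, num_batches)
```

With these numbers the series yields 980 (features, label) pairs, grouped into 31 batches per epoch.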



## Single layer neural network
Before we can train, we have to split our dataset into training and validation sets:
```jupyterpython
split_time = 1000
time_train = time[:split_time]
x_train = series[:split_time]
time_valid = time[split_time:]
x_valid = series[split_time:]
```

Code to perform a simple linear regression:

In [None]:
import numpy as np

window_size = 20
batch_size = 32
shuffle_buffer_size = 1000

dataset = windowed_dataset(x_train, window_size, batch_size, shuffle_buffer_size) # train only on the training split
l0 = tf.keras.layers.Dense(1, input_shape=[window_size])
model = tf.keras.models.Sequential([l0])

model.compile(loss="mse", optimizer=tf.keras.optimizers.SGD(learning_rate=1e-6, momentum=0.9)) # 'lr' is a deprecated alias of 'learning_rate'
model.fit(dataset, epochs=100, verbose=0)

# Inspect layer weights (W and b):
print("Layer weights {}".format(l0.get_weights()))

# To see X values:
print(series[1:21])
# To predict Y values:
model.predict(series[1:21][np.newaxis]) # np.newaxis adds a batch dimension, giving the shape (1, window_size) that the model expects

# To plot forecast on every point on a time series:
forecast = []
for t in range(len(series) - window_size): # 't' avoids shadowing the 'time' array used for the train/validation split
    forecast.append(model.predict(series[t:t + window_size][np.newaxis])) # predict the value that follows each window of the series

forecast = forecast[split_time-window_size:]
results = np.array(forecast)[:, 0, 0]

# Measure MAE:
tf.keras.metrics.mean_absolute_error(x_valid, results).numpy()
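
The slicing `forecast[split_time-window_size:]` is easy to get wrong, so here is a plain-Python sketch of the alignment and of the MAE calculation. It uses a toy series and a naive "repeat the last value" predictor in place of the trained model; `rolling_forecast` and `mean_absolute_error` are hypothetical helpers written for this illustration.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rolling_forecast(series, window_size, predict):
    """forecast[i] is the prediction for series[i + window_size]."""
    return [predict(series[t:t + window_size])
            for t in range(len(series) - window_size)]


series = [float(v) for v in range(12)]  # toy data: 0.0, 1.0, ..., 11.0
window_size, split_time = 3, 8

# Stand-in for model.predict: forecast the last value seen in the window.
forecast = rolling_forecast(series, window_size, predict=lambda window: window[-1])

# The prediction for series[split_time] sits at index split_time - window_size.
valid_forecast = forecast[split_time - window_size:]
x_valid = series[split_time:]

print(len(valid_forecast) == len(x_valid))
print(mean_absolute_error(x_valid, valid_forecast))
```

Because the toy series rises by 1 at every step, the naive predictor is always off by exactly 1, which makes the alignment easy to verify.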

## Deep neural network training, tuning and prediction
A simple 3-layer NN will look similar to a single layer network:

In [None]:
dataset = windowed_dataset(x_train, window_size, batch_size, shuffle_buffer_size)

# 10-10-1 neurons network
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, input_shape=[window_size], activation="relu"),
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dense(1)
])

model.compile(loss="mse", optimizer=tf.keras.optimizers.SGD(learning_rate=1e-6, momentum=0.9)) # 'lr' is a deprecated alias of 'learning_rate'
model.fit(dataset, epochs=100, verbose=0)

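# Recompute the forecast with the new model, then calculate MAE:
forecast = [model.predict(series[t:t + window_size][np.newaxis]) for t in range(len(series) - window_size)]
results = np.array(forecast[split_time - window_size:])[:, 0, 0]
tf.keras.metrics.mean_absolute_error(x_valid, results).numpy()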

As a result, the MAE is slightly better, but still close to that of the single-layer network.
It would be better to pick a good learning rate than to hardcode one. For this we can use a callback (LearningRateScheduler), as used previously in the course:

In [None]:
import numpy as np
import matplotlib.pyplot as plt
dataset = windowed_dataset(x_train, window_size, batch_size, shuffle_buffer_size)

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, input_shape=[window_size], activation="relu"),
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dense(1)
])

# LearningRateScheduler is called as a callback at the beginning of each epoch.
# It sets the learning rate to a value computed from the epoch number.
lr_schedule = tf.keras.callbacks.LearningRateScheduler(lambda epoch: 1e-8 * 10**(epoch / 20))
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-8, momentum=0.9)
model.compile(loss="mse", optimizer=optimizer)
history = model.fit(dataset, epochs=100, callbacks=[lr_schedule]) # the learning rate is recalculated for every epoch

# Plot loss against learning rate (one learning rate per epoch):
lrs = 1e-8 * (10 ** (np.arange(100) / 20))
plt.semilogx(lrs, history.history["loss"])
plt.axis([1e-8, 1e-3, 0, 300])
plt.show()

# Plot training loss per epoch:
loss = history.history['loss']
epochs = range(len(loss))
plt.plot(epochs, loss, 'b', label='Training Loss')
plt.legend()
plt.show()
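
To see which learning rates this schedule actually sweeps through, the formula can be evaluated on its own in plain Python. `scheduled_lr` is a hypothetical name for the same expression that is passed to LearningRateScheduler above; no TensorFlow is needed for this sketch.

```python
import math


def scheduled_lr(epoch, base_lr=1e-8, epochs_per_decade=20):
    """Learning rate that grows by a factor of 10 every epochs_per_decade epochs."""
    return base_lr * 10 ** (epoch / epochs_per_decade)


# Print the learning rate at a few epochs of the 100-epoch run.
for epoch in (0, 20, 40, 60, 80, 99):
    print(f"epoch {epoch:3d}: lr = {scheduled_lr(epoch):.2e}")
```

The rate starts at 1e-8 and climbs one order of magnitude every 20 epochs, staying just under 1e-3 at the final epoch, which matches the x-axis range used in the plot. A reasonable choice is a learning rate from the region where the loss is low and still stable, before it starts to become erratic.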