In [3]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras

> Recurrent Neural Networks
>
> In this chapter we will first look at the fundamental concepts underlying RNNs and how to train them using backpropagation through time, then we will use them to forecast a time series.

# Recurrent Neurons and Layers
- A recurrent neural network looks very much like a feedforward neural network, except it also has connections pointing backward
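- At each time step t (also called a frame), a recurrent neuron receives the input x(t) as well as its own output from the previous time step, y(t–1).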
- Each recurrent neuron has two sets of weights: one for the inputs x(t) and the other for the outputs of the previous time step, y(t–1).
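
The two sets of weights can be made concrete with a small NumPy sketch of a layer of recurrent neurons processing one sequence. The names `rnn_forward`, `W_x`, `W_y` and `b`, the choice of `tanh` as activation, and the all-zeros initial output are assumptions made for illustration; this is not the Keras implementation.

```python
import numpy as np

def rnn_forward(X, W_x, W_y, b):
    """Run a layer of recurrent neurons over one sequence.

    X   : (n_steps, n_inputs) input sequence
    W_x : (n_inputs, n_neurons) weights applied to the inputs x(t)
    W_y : (n_neurons, n_neurons) weights applied to the previous outputs y(t-1)
    b   : (n_neurons,) bias
    """
    n_steps = X.shape[0]
    n_neurons = W_x.shape[1]
    y_prev = np.zeros(n_neurons)              # assumed initial output: all zeros
    outputs = np.empty((n_steps, n_neurons))
    for t in range(n_steps):
        # Each step combines the current input with the previous output
        y_prev = np.tanh(X[t] @ W_x + y_prev @ W_y + b)
        outputs[t] = y_prev
    return outputs

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 1))      # 5 time steps, 1 input feature
W_x = rng.normal(size=(1, 3))    # 3 recurrent neurons
W_y = rng.normal(size=(3, 3))
b = np.zeros(3)

outputs = rnn_forward(X, W_x, W_y, b)
print(outputs.shape)
```

The same weights `W_x`, `W_y` and `b` are reused at every time step; only the input and the previous output change.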

## Memory Cells
- Since the output of a recurrent neuron at time step t is a function of all the inputs from previous time steps, you could say it has a form of memory.
    - This makes Y(t) a function of all the inputs since time t = 0 (that is, X(0), X(1), …, X(t)).
- A part of a neural network that preserves some state across time steps is called a memory cell (or simply a cell)
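
A quick way to see this memory effect is to change only the very first input of a sequence and check that the last output still changes. The sketch below uses a single recurrent neuron with hand-picked weights (`w_x`, `w_y` and `b` are arbitrary illustrative values, not learned ones).

```python
import numpy as np

def run_neuron(x, w_x=0.5, w_y=0.9, b=0.0):
    """Return the outputs of one recurrent neuron for the input sequence x."""
    y = 0.0                      # assumed initial state
    ys = []
    for x_t in x:
        y = np.tanh(w_x * x_t + w_y * y + b)
        ys.append(y)
    return np.array(ys)

x = np.zeros(10)
x_changed = x.copy()
x_changed[0] = 1.0               # change only the input at time step 0

y = run_neuron(x)
y_changed = run_neuron(x_changed)

# The final outputs differ even though the inputs at steps 1..9 are identical
print(y[-1], y_changed[-1])
```

The influence of the first input shrinks at every step here, which hints at why a simple cell like this one has only a short-term memory.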


## Input and Output Sequences
- 

## Training RNNs

- To train an RNN, the trick is to unroll it through time and then simply use regular backpropagation (see Figure 15-5). This strategy is called backpropagation through time (BPTT).
- Note that the gradients flow backward through all the outputs used by the cost function, not just through the final output.
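
As a rough illustration of BPTT, the sketch below unrolls a single-neuron RNN, computes a loss over all of its outputs, and propagates gradients backward through the unrolled steps. The function names, the `tanh` activation, and the mean squared error over every time step are assumptions chosen for this example; the result is compared against a finite-difference estimate as a sanity check.

```python
import numpy as np

def forward(params, x):
    """Unroll the neuron through time; h[0] is the initial state."""
    w_x, w_h, b = params
    h = np.zeros(len(x) + 1)
    for t in range(len(x)):
        h[t + 1] = np.tanh(w_x * x[t] + w_h * h[t] + b)
    return h

def loss_fn(params, x, y):
    """Mean squared error over the outputs of all time steps."""
    h = forward(params, x)
    return np.mean((h[1:] - y) ** 2)

def bptt(params, x, y):
    """Gradient of loss_fn with respect to (w_x, w_h, b)."""
    w_x, w_h, b = params
    h = forward(params, x)
    T = len(x)
    grads = np.zeros(3)
    carry = 0.0                  # gradient flowing back from step t+1
    for t in reversed(range(T)):
        # Gradient from this step's own loss term plus the later steps
        dh = 2.0 * (h[t + 1] - y[t]) / T + carry
        da = dh * (1.0 - h[t + 1] ** 2)          # through tanh
        grads += da * np.array([x[t], h[t], 1.0])
        carry = da * w_h                         # through the recurrent weight
    return grads

rng = np.random.default_rng(42)
x = rng.normal(size=8)
y = rng.normal(size=8)
params = np.array([0.3, -0.5, 0.1])

analytic = bptt(params, x, y)

# Central finite differences as an independent check
eps = 1e-6
numeric = np.array([
    (loss_fn(params + eps * np.eye(3)[i], x, y)
     - loss_fn(params - eps * np.eye(3)[i], x, y)) / (2 * eps)
    for i in range(3)
])
print(np.allclose(analytic, numeric, atol=1e-6))
```

Because every output contributes to the loss, each step adds its own term to `dh` before the gradient is passed further back through `carry`.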

# Forecasting a Time Series
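- A time series with a single value per time step is univariate; one with several values per time step is a multivariate time series.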

In [8]:
def generate_time_series(batch_size, n_steps):
    """Return `batch_size` noisy series of length `n_steps`, shape [batch_size, n_steps, 1]."""
    # Random frequencies and phase offsets, each of shape [batch_size, 1]
    freq1, freq2, offsets1, offsets2 = np.random.rand(4, batch_size, 1)
    time = np.linspace(0, 1, n_steps)
    series = 0.5 * np.sin((time - offsets1) * (freq1 * 10 + 10))   # wave 1
    series += 0.2 * np.sin((time - offsets2) * (freq2 * 20 + 20))  # + wave 2
    series += 0.1 * (np.random.rand(batch_size, n_steps) - 0.5)    # + noise
    # `...` keeps all existing axes; np.newaxis appends a trailing feature axis
    return series[..., np.newaxis].astype(np.float32)

In [9]:
n_steps = 50
series = generate_time_series(10000, n_steps + 1)
X_train, y_train = series[:7000, :n_steps], series[:7000, -1]
X_valid, y_valid = series[7000:9000, :n_steps], series[7000:9000, -1]
X_test, y_test = series[9000:, :n_steps], series[9000:, -1]

In [10]:
X_train.shape

(7000, 50, 1)

## Baseline Metrics

In [12]:
# Linear baseline: flatten the 50 time steps and predict with a single Dense unit
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[50, 1]),
    keras.layers.Dense(1)
])

In [15]:
model.compile(loss='mse')

In [19]:
model.fit(X_train, y_train, epochs = 20, validation_data = (X_valid, y_valid))

Train on 7000 samples, validate on 2000 samples
Epoch 1/20 ... Epoch 20/20


<tensorflow.python.keras.callbacks.History at 0x14c098a1608>

In [20]:
y_pred = model.predict(X_valid)
np.mean(keras.losses.mean_squared_error(y_valid, y_pred))

0.003033986
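
Another simple reference point is naive forecasting, which predicts that the next value equals the last observed one. The sketch below is self-contained and assumes the same generator and validation split as above; because the data is random, the printed error will vary with the seed and is not a result from this notebook.

```python
import numpy as np

def generate_time_series(batch_size, n_steps):
    # Same generator as in the notebook, repeated so this block runs on its own
    freq1, freq2, offsets1, offsets2 = np.random.rand(4, batch_size, 1)
    time = np.linspace(0, 1, n_steps)
    series = 0.5 * np.sin((time - offsets1) * (freq1 * 10 + 10))
    series += 0.2 * np.sin((time - offsets2) * (freq2 * 20 + 20))
    series += 0.1 * (np.random.rand(batch_size, n_steps) - 0.5)
    return series[..., np.newaxis].astype(np.float32)

np.random.seed(42)               # arbitrary seed, only for repeatability
n_steps = 50
series = generate_time_series(10000, n_steps + 1)
X_valid, y_valid = series[7000:9000, :n_steps], series[7000:9000, -1]

# Naive forecast: repeat the last value of each input window
y_pred_naive = X_valid[:, -1]
naive_mse = np.mean((y_valid - y_pred_naive) ** 2)
print(naive_mse)
```

A trained model is only worth keeping if it beats both this naive forecast and the linear baseline.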

## Implementing a Simple RNN

In [21]:
# One recurrent layer with a single neuron; input_shape=[None, 1] accepts sequences of any length
model = keras.models.Sequential([
    keras.layers.SimpleRNN(1, input_shape=[None, 1])
])

In [25]:
optimizer = keras.optimizers.Adam(learning_rate=0.005)  # `lr` is a deprecated alias
model.compile(loss='mse', optimizer=optimizer)

In [26]:
model.fit(X_train, y_train, epochs = 20, validation_data = (X_valid, y_valid))

Train on 7000 samples, validate on 2000 samples
Epoch 1/20 ... Epoch 20/20


<tensorflow.python.keras.callbacks.History at 0x14c0af04688>

In [27]:
y_pred = model.predict(X_valid)
np.mean(keras.losses.mean_squared_error(y_valid, y_pred))

0.011219648

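- When a time series is fed to an RNN, only one time step of data is input at a time rather than the whole sequence, so this SimpleRNN has only three parameters (w_x, w_y, bias).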
- Note that for each neuron, a linear model has one parameter per input and per time step, plus a bias term (in the
simple linear model we used, that’s a total of 51 parameters). In contrast, for each
recurrent neuron in a simple RNN, there is just one parameter per input and per hidden
state dimension (in a simple RNN, that’s just the number of recurrent neurons in
the layer), plus a bias term. In this simple RNN, that’s a total of just three parameters.
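
The two parameter counts can be checked with a few lines of arithmetic. The helper names below are hypothetical and only encode the counting rule described above; they do not call Keras.

```python
def dense_param_count(n_inputs, n_units):
    """One weight per input for each unit, plus one bias per unit."""
    return n_inputs * n_units + n_units

def simple_rnn_param_count(n_inputs, n_units):
    """Per unit: one weight per input, one per hidden state dimension, one bias."""
    return n_units * (n_inputs + n_units + 1)

# Linear model on the flattened window: 50 time steps x 1 feature, 1 output unit
linear_params = dense_param_count(50 * 1, 1)

# Simple RNN with 1 input feature and 1 recurrent neuron
rnn_params = simple_rnn_param_count(1, 1)

print(linear_params, rnn_params)  # 51 3
```

The RNN count does not depend on the sequence length, because the same weights are reused at every time step.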

## Deep RNNs
