[toc]

# TensorFlow 2 SimpleRNN

## Signature

In [1]:
tf.keras.layers.SimpleRNN(
    units, activation='tanh', use_bias=True, kernel_initializer='glorot_uniform',
    recurrent_initializer='orthogonal', bias_initializer='zeros',
    kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None,
    activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None,
    bias_constraint=None, dropout=0.0, recurrent_dropout=0.0,
    return_sequences=False, return_state=False, go_backwards=False, stateful=False,
    unroll=False, **kwargs
)


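Points to note:
1. `activation` already defaults to `tanh`, so there is no need to pass it explicitly.
2. SimpleRNN has two dropout settings: `dropout` (applied to the inputs) and `recurrent_dropout` (applied to the recurrent state). A standalone `Dropout()` layer can only reproduce the effect of `dropout`; it cannot apply dropout to the recurrent state the way `recurrent_dropout` does.
3. TensorFlow defaults to batch-major input, i.e. the input has shape `[batch_size, seq_len, feature_size]`, whereas PyTorch's recurrent layers default to time-major input (`batch_first=False`).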
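
Notes 2 and 3 can be illustrated with a minimal sketch. It assumes the data arrives in time-major layout and uses `tf.transpose` to convert it; the dropout rates of 0.2 and the variable names are arbitrary choices for illustration, not values from the original notebook.

```python
import tensorflow as tf

batch_size, seq_len, feature_size = 2, 20, 10

# Time-major data: [seq_len, batch_size, feature_size].
x_time_major = tf.random.normal(shape=(seq_len, batch_size, feature_size))

# Swap the first two axes to get the batch-major layout SimpleRNN expects.
x_batch_major = tf.transpose(x_time_major, perm=[1, 0, 2])

# Both dropout settings are passed to the layer itself (rates are illustrative).
rnn = tf.keras.layers.SimpleRNN(5, dropout=0.2, recurrent_dropout=0.2)

# Dropout masks are only applied when the layer is called with training=True.
out_train = rnn(x_batch_major, training=True)

# With training=False both dropout settings are inactive.
out_eval_1 = rnn(x_batch_major, training=False)
out_eval_2 = rnn(x_batch_major, training=False)

print(x_batch_major.shape)  # (2, 20, 10)
print(out_train.shape)      # (2, 5)
# Largest difference between the two inference-mode calls.
print(float(tf.reduce_max(tf.abs(out_eval_1 - out_eval_2))))
```

Because dropout is disabled in inference mode, the two `training=False` calls should agree, while the `training=True` call generally differs from them.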

## Usage example

By default, the layer returns only the output of the last timestep, as shown in the figure:

![](https://gitee.com/EdwardElric_1683260718/picture_bed/raw/master/img/20200818130355.png)

In [3]:
import tensorflow as tf

batch_size = 2
seq_len = 20
feature_size = 10
x = tf.random.normal(shape=(batch_size, seq_len, feature_size))
rnn = tf.keras.layers.SimpleRNN(5)
output = rnn(x)
print(output.shape)

(2, 5)


## return_state

With `return_state=True`, the layer returns the hidden state of the last timestep in addition to the output of the last timestep.

![](https://gitee.com/EdwardElric_1683260718/picture_bed/raw/master/img/20200818141330.png)

For SimpleRNN, the output of the last timestep is in fact the hidden state of the last timestep.

In [5]:
import tensorflow as tf

batch_size = 2
seq_len = 20
feature_size = 10
x = tf.random.normal(shape=(batch_size, seq_len, feature_size))
rnn = tf.keras.layers.SimpleRNN(5, return_state=True)
output, hidden_state = rnn(x)
print(output)
print(hidden_state)
print(output == hidden_state)  # hidden_state is identical to output

tf.Tensor(
[[-0.9258157  -0.50056565  0.4142676   0.89130914 -0.76337886]
 [ 0.32924217  0.92086834 -0.86574733 -0.80070317  0.98537374]], shape=(2, 5), dtype=float32)
tf.Tensor(
[[-0.9258157  -0.50056565  0.4142676   0.89130914 -0.76337886]
 [ 0.32924217  0.92086834 -0.86574733 -0.80070317  0.98537374]], shape=(2, 5), dtype=float32)
tf.Tensor(
[[ True  True  True  True  True]
 [ True  True  True  True  True]], shape=(2, 5), dtype=bool)
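
The equality follows from the recurrence the layer computes. The sketch below replays that recurrence by hand with NumPy, assuming the default settings (`tanh` activation, bias enabled, zero initial state), and compares the result with what the layer returns. The variable names are illustrative, and the tolerance of `1e-4` is an arbitrary allowance for float32 rounding.

```python
import numpy as np
import tensorflow as tf

batch_size, seq_len, feature_size, units = 2, 20, 10, 5
x = tf.random.normal(shape=(batch_size, seq_len, feature_size))

rnn = tf.keras.layers.SimpleRNN(units, return_state=True)
output, hidden_state = rnn(x)

# For SimpleRNN, get_weights() returns the input kernel, the recurrent
# kernel and the bias, in that order.
kernel, recurrent_kernel, bias = rnn.get_weights()

# Replay the recurrence h_t = tanh(x_t W + h_{t-1} U + b) from a zero state.
x_np = x.numpy()
h = np.zeros((batch_size, units), dtype=np.float32)
for t in range(seq_len):
    h = np.tanh(x_np[:, t, :] @ kernel + h @ recurrent_kernel + bias)

# The hand-computed final state should match both values returned by the layer.
print(np.allclose(h, output.numpy(), atol=1e-4))
print(np.allclose(h, hidden_state.numpy(), atol=1e-4))
```

Since the layer has no separate output projection, the value emitted at each timestep is the hidden state itself, which is why `output` and `hidden_state` coincide.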


## return_sequences

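By default, only the result of the last timestep is returned. Setting `return_sequences=True` makes the layer return the results of all timesteps. This is needed when several RNN layers are stacked along the depth direction, because each following layer expects a full sequence as input.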

![](https://gitee.com/EdwardElric_1683260718/picture_bed/raw/master/img/20200818130909.png)

In [None]:
import tensorflow as tf

batch_size = 2
seq_len = 20
feature_size = 10
x = tf.random.normal(shape=(batch_size, seq_len, feature_size))
rnn = tf.keras.layers.SimpleRNN(5, return_sequences=True)
output = rnn(x)
print(output.shape)
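
As a rough sketch of the stacking case, the first layer below returns the full sequence so that the second layer receives a 3-D input. The layer sizes (8 and 5) and the variable names are arbitrary choices for illustration.

```python
import tensorflow as tf

batch_size, seq_len, feature_size = 2, 20, 10
x = tf.random.normal(shape=(batch_size, seq_len, feature_size))

# The lower layer must return every timestep so the upper layer gets a sequence.
first = tf.keras.layers.SimpleRNN(8, return_sequences=True)
# The upper layer keeps the default and returns only the last timestep.
second = tf.keras.layers.SimpleRNN(5)

sequence_output = first(x)
final_output = second(sequence_output)

print(sequence_output.shape)  # (2, 20, 8)
print(final_output.shape)     # (2, 5)
```

Without `return_sequences=True` on the first layer, its output would have shape `(2, 8)`, which is not a valid sequence input for the second layer.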

## return_state + return_sequences

![](https://gitee.com/EdwardElric_1683260718/picture_bed/raw/master/img/20200818131403.png)

In [7]:
import tensorflow as tf

batch_size = 2
seq_len = 20
feature_size = 10
x = tf.random.normal(shape=(batch_size, seq_len, feature_size))
rnn = tf.keras.layers.SimpleRNN(5, return_state=True, return_sequences=True)
output, hidden_state = rnn(x)
print(output.shape)
print(hidden_state.shape)  # return_sequences does not affect hidden_state
print(output[:, -1, :] == hidden_state)  # hidden_state equals the last timestep of output for every sample in the batch

(2, 20, 5)
(2, 5)
tf.Tensor(
[[ True  True  True  True  True]
 [ True  True  True  True  True]], shape=(2, 5), dtype=bool)
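
One way to see what the returned state represents is to process the sequence in two halves and hand the state of the first half to the second call. This sketch goes slightly beyond the original notebook: it relies on the `initial_state` call argument of Keras RNN layers, and the split point of 10 and the tolerance of `1e-4` are arbitrary assumptions.

```python
import numpy as np
import tensorflow as tf

batch_size, seq_len, feature_size = 2, 20, 10
x = tf.random.normal(shape=(batch_size, seq_len, feature_size))
rnn = tf.keras.layers.SimpleRNN(5, return_state=True, return_sequences=True)

# Reference: process the whole sequence in one call.
full_output, full_state = rnn(x)

# Process the first half, then continue from its final hidden state.
first_output, first_state = rnn(x[:, :10, :])
second_output, second_state = rnn(x[:, 10:, :], initial_state=[first_state])

# Stitch the two halves back together along the time axis.
stitched = tf.concat([first_output, second_output], axis=1)

print(stitched.shape)  # (2, 20, 5)
print(np.allclose(stitched.numpy(), full_output.numpy(), atol=1e-4))
print(np.allclose(second_state.numpy(), full_state.numpy(), atol=1e-4))
```

If the two comparisons hold, the hidden state carries everything the layer needs to resume the sequence, which is the role it plays when it is passed between layers or calls.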


# References

[tf.keras.layers.SimpleRNN  |  TensorFlow Core v2.2.0](https://www.tensorflow.org/api_docs/python/tf/keras/layers/SimpleRNN)