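# Recurrent Neural Networks (RNN)
The fully connected and convolutional networks covered earlier handle each input independently during both training and prediction: every input is taken on its own, processed, and mapped to an output. Consecutive inputs have no relationship with one another, so the inputs can be presented in any order.

Some tasks, however, need consecutive inputs to be related. Understanding a sentence, for example, takes more than looking at each word in isolation, and the same holds for processing video.

A recurrent neural network adds a recurrence mechanism: a signal passed from one neuron to another does not vanish immediately but persists, which lets earlier inputs influence later ones. An RNN connects the neurons inside the hidden layer to one another, so the hidden layer's input includes not only the output of the previous layer but also the hidden layer's own output from the previous time step.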
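
The recurrence described above can be sketched in a few lines of NumPy. This is a minimal illustration of a plain (non-LSTM) recurrent layer, not the model built later in this notebook; the function name `rnn_forward`, the weight names, and the toy values are all assumptions made for the example.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h):
    """Run a plain recurrent layer over one sequence and return every hidden state.

    xs:   [n_steps, n_input], one row per time step
    W_xh: [n_input, n_hidden], input-to-hidden weights
    W_hh: [n_hidden, n_hidden], hidden-to-hidden (recurrent) weights
    b_h:  [n_hidden], hidden bias
    """
    h = np.zeros(W_hh.shape[0])  # hidden state before the first step
    states = []
    for x_t in xs:
        # The new state depends on the current input AND the previous state.
        h = np.tanh(x_t @ W_xh + h @ W_hh + b_h)
        states.append(h)
    return np.stack(states)

# Toy, hand-picked values (hypothetical): 5 time steps, 3 inputs, 4 hidden units.
n_steps, n_input, n_hidden = 5, 3, 4
xs = np.ones((n_steps, n_input))
W_xh = np.full((n_input, n_hidden), 0.1)
W_hh = 0.5 * np.eye(n_hidden)
b_h = np.zeros(n_hidden)

states = rnn_forward(xs, W_xh, W_hh, b_h)

# Change only the FIRST input; the LAST hidden state still changes,
# because each state is fed back into the next step.
xs_changed = xs.copy()
xs_changed[0] += 1.0
states_changed = rnn_forward(xs_changed, W_xh, W_hh, b_h)

print(states.shape)  # (5, 4)
print(np.abs(states[-1] - states_changed[-1]).max() > 0)  # True
```

A feed-forward layer applied to each row separately would give identical outputs for the last four rows in both runs; the recurrent connection is what carries the change forward.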

In [1]:
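# Requires TensorFlow 1.x: tensorflow.examples.tutorials and tf.contrib are not available in TensorFlow 2.x
import tensorflow as tf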
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
print("number of train data is %d"%(mnist.train.num_examples))
print("number of test data is %d"%(mnist.test.num_examples))
trainimg=mnist.train.images
trainlabel=mnist.train.labels
testimg=mnist.test.images
testlabel=mnist.test.labels
print("MNIST ready")

Extracting MNIST_data\train-images-idx3-ubyte.gz
Extracting MNIST_data\train-labels-idx1-ubyte.gz
Extracting MNIST_data\t10k-images-idx3-ubyte.gz
Extracting MNIST_data\t10k-labels-idx1-ubyte.gz
number of train data is 55000
number of test data is 10000
MNIST ready


### Building the neural network model
The LSTM network model consists of an input layer, a fully connected layer, an LSTM layer, and an output layer.
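
A shape walk-through may make the first two layers easier to follow. The sketch below uses NumPy with random values instead of TensorFlow, purely to trace how a batch is reshaped on its way to the LSTM layer. The sizes (28 steps of 28 pixels, 128 hidden units) match the parameters in the next cell; the batch of 4 and the variable names are assumptions for the example.

```python
import numpy as np

batch, n_steps, n_input, n_hidden = 4, 28, 28, 128  # batch of 4 is arbitrary

# Input layer: each 28x28 image is read as 28 time steps of 28 pixels.
images = np.random.rand(batch, n_steps, n_input)

# Fully connected layer: fold the time axis into the batch axis so that a
# single matmul applies the same weights to every row of every image.
W_hidden = np.random.randn(n_input, n_hidden)
b_hidden = np.full(n_hidden, 0.1)
flat = images.reshape(-1, n_input)          # [batch * n_steps, n_input]
projected = flat @ W_hidden + b_hidden      # [batch * n_steps, n_hidden]

# Restore the time axis; this is the sequence the LSTM layer consumes.
lstm_inputs = projected.reshape(-1, n_steps, n_hidden)

print(flat.shape)         # (112, 28)
print(projected.shape)    # (112, 128)
print(lstm_inputs.shape)  # (4, 28, 128)
```

Because the reshape keeps rows in order, `lstm_inputs[b, t]` is still the projection of row `t` of image `b`.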

In [5]:
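# RNN hyperparameters
# Size of each input vector (one 28-pixel image row per time step)
n_input=28
# Number of time steps (the 28 rows of an image)
n_steps=28
# Number of hidden units
n_hidden=128
# Number of outputs; this is a classification problem with 10 classes
n_classes=10
batch_size=100

x=tf.placeholder(tf.float32,[None,n_steps,n_input])
y=tf.placeholder(tf.float32,[None,n_classes])
# Initialize each layer's weights randomly and its biases to a constant 0.1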
weights={
    'hidden':tf.Variable(tf.random_normal([n_input,n_hidden])),
    'out':tf.Variable(tf.random_normal([n_hidden,n_classes]))
}
biases={
    'hidden':tf.Variable(tf.constant(0.1,shape=([n_hidden,]))),
    'out':tf.Variable(tf.constant(0.1,shape=([n_classes,])))
}
def RNN(_X,_weights,_biases):
    # Flatten to [batch_size*n_steps, n_input] so one matmul covers every time step
    _X=tf.reshape(_X,[-1,n_input])
    # Input layer to hidden layer: a plain fully connected projection
    X_in=tf.matmul(_X,_weights['hidden'])+_biases['hidden']
    # Restore the sequence shape [batch_size, n_steps, n_hidden]
    X_in=tf.reshape(X_in,[-1,n_steps,n_hidden])
    # Define the LSTM cell; the variables it uses are created automatically
    lstm_cell=tf.contrib.rnn.BasicLSTMCell(n_hidden,forget_bias=1.0,state_is_tuple=True)
    # Initialize the LSTM state to all zeros; batch_size fixes the number of rows per batch
    init_state=lstm_cell.zero_state(batch_size,dtype=tf.float32)
    # Run the recurrence with dynamic_rnn, which returns a pair (outputs, state):
    # outputs: the RNN output tensor. With time_major == False (the default) its shape is
    #   [batch_size, max_time, cell.output_size]; with time_major == True it is
    #   [max_time, batch_size, cell.output_size].
    # state: the final state. For most cells its shape is [batch_size, cell.state_size].
    #   For an LSTM cell with state_is_tuple=True it is an LSTMStateTuple (c, h),
    #   each of shape [batch_size, n_hidden].
    # The initial state must be passed by keyword as initial_state, because the third
    # positional argument of dynamic_rnn is sequence_length.
    outputs,final_state=tf.nn.dynamic_rnn(lstm_cell,X_in,initial_state=init_state,time_major=False)
    # Output layer: final_state[1] is h, the hidden state after the last time step
    results=tf.matmul(final_state[1],_weights['out'])+_biases['out']
    return results

pred=RNN(x,weights,biases)
# Define the loss and the optimizer: softmax cross-entropy loss, optimized with Adam
cost=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred,labels=y))
optimizer=tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost)
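
To make the LSTM layer less of a black box, here is a rough NumPy sketch of a standard LSTM step unrolled over a short sequence. It is an illustration under stated assumptions, not TensorFlow's implementation: the helper names (`sigmoid`, `lstm_step`), the gate ordering inside `kernel`, and the toy sizes are all chosen for the example. It shows what `forget_bias=1.0` does and why the second element of the final state (`h`) can feed the output layer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, c_prev, h_prev, kernel, bias, forget_bias=1.0):
    """One LSTM step for a batch.

    x_t:    [batch, n_in]      input at this time step
    c_prev: [batch, n_hidden]  previous cell state
    h_prev: [batch, n_hidden]  previous hidden state
    kernel: [n_in + n_hidden, 4 * n_hidden]
    bias:   [4 * n_hidden]
    """
    gates = np.concatenate([x_t, h_prev], axis=1) @ kernel + bias
    # Assumed ordering for this sketch: input gate, candidate, forget gate, output gate.
    i, j, f, o = np.split(gates, 4, axis=1)
    # forget_bias is added before the sigmoid, nudging the forget gate toward
    # "remember" at the start of training.
    c = sigmoid(f + forget_bias) * c_prev + sigmoid(i) * np.tanh(j)
    h = sigmoid(o) * np.tanh(c)
    return c, h

rng = np.random.default_rng(0)
batch, n_steps, n_in, n_hidden = 2, 3, 5, 4
xs = rng.normal(size=(batch, n_steps, n_in))
kernel = rng.normal(scale=0.1, size=(n_in + n_hidden, 4 * n_hidden))
bias = np.zeros(4 * n_hidden)

# Zero initial state, one row per example in the batch.
c = np.zeros((batch, n_hidden))
h = np.zeros((batch, n_hidden))
outputs = []
for t in range(n_steps):
    c, h = lstm_step(xs[:, t, :], c, h, kernel, bias)
    outputs.append(h)
outputs = np.stack(outputs, axis=1)  # [batch, n_steps, n_hidden]

final_state = (c, h)
print(outputs.shape)                                       # (2, 3, 4)
print(np.array_equal(outputs[:, -1, :], final_state[1]))   # True
```

The last check is the point of the sketch: the hidden state `h` in the final state is the same tensor as the output at the last time step, so either one summarizes the whole sequence for the classifier.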

### Training and evaluating the model

In [7]:
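# Evaluate the model: a prediction is correct when its arg-max class matches the label
correct_pred=tf.equal(tf.argmax(pred,1),tf.argmax(y,1))
accuracy=tf.reduce_mean(tf.cast(correct_pred,tf.float32))
# Initialize the variables
init=tf.global_variables_initializer()
# Start the session
with tf.Session() as sess:
    sess.run(init)
    # Report accuracy every 100 steps
    display_step=100
    step=0
    # Number of training examples to process (one pass over the training set)
    train_iters=int(mnist.train.num_examples)
    # Keep iterating until that many examples have been seen
    while step*batch_size<train_iters:
        # Draw the next batch of batch_size samples
        batch_xs,batch_ys=mnist.train.next_batch(batch_size)
        # Reshape the flat images to [batch_size, n_steps, n_input] to match the placeholder
        batch_xs=batch_xs.reshape((batch_size,n_steps,n_input))
        sess.run(optimizer,{x:batch_xs,y:batch_ys})
        # Print the accuracy on the current batch at regular intervals
        if step % display_step==0:
            acc=sess.run(accuracy,{x:batch_xs,y:batch_ys,})
            print('step %d,training accuracy %g'%(step,acc))
        step+=1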
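
Because the graph builds the LSTM's zero state for exactly `batch_size` rows, every feed has to contain exactly that many examples, and that applies to evaluation as well. One way to score a larger set is to average the accuracy over consecutive full batches. The NumPy sketch below shows only that bookkeeping, on hand-made toy data; `batch_accuracy` and `mean_accuracy` are hypothetical helper names, and in the real graph each batch score would come from `sess.run(accuracy, ...)`.

```python
import numpy as np

def batch_accuracy(logits, one_hot_labels):
    """Fraction of rows whose arg-max class matches the one-hot label."""
    return np.mean(np.argmax(logits, axis=1) == np.argmax(one_hot_labels, axis=1))

def mean_accuracy(logits, one_hot_labels, batch_size):
    """Average accuracy over consecutive full batches of exactly batch_size rows.

    Trailing rows that do not fill a whole batch are skipped, mirroring a
    graph that only accepts inputs of one fixed batch size.
    """
    n_batches = len(logits) // batch_size
    accs = [
        batch_accuracy(logits[k * batch_size:(k + 1) * batch_size],
                       one_hot_labels[k * batch_size:(k + 1) * batch_size])
        for k in range(n_batches)
    ]
    return float(np.mean(accs))

# Toy data: 6 examples, 3 classes; stand-ins for model outputs and labels.
labels = np.eye(3)[[0, 1, 2, 0, 1, 2]]
logits = np.array([
    [2.0, 0.1, 0.1],   # predicts 0, correct
    [0.1, 2.0, 0.1],   # predicts 1, correct
    [2.0, 0.1, 0.1],   # predicts 0, wrong (label is 2)
    [2.0, 0.1, 0.1],   # predicts 0, correct
    [0.1, 0.1, 2.0],   # predicts 2, wrong (label is 1)
    [0.1, 0.1, 2.0],   # predicts 2, correct
])

print(mean_accuracy(logits, labels, batch_size=3))  # two batches, 2/3 correct in each
print(mean_accuracy(logits, labels, batch_size=4))  # one full batch of 4; the last 2 rows are skipped
```

When the set size is a multiple of the batch size, as with 10,000 MNIST test images and a batch size of 100, no rows are skipped and the average equals the accuracy over the full set.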


step 0,training accuracy 0.13
test accuracy 0.1009
step 100,training accuracy 0.89
test accuracy 0.8939
step 200,training accuracy 0.92
test accuracy 0.9215
step 300,training accuracy 0.94
test accuracy 0.9363
step 400,training accuracy 0.95
test accuracy 0.9435
step 500,training accuracy 0.94
test accuracy 0.9507
step 600,training accuracy 0.99
test accuracy 0.9516
step 700,training accuracy 0.98
test accuracy 0.9484
step 800,training accuracy 0.92
test accuracy 0.9461
step 900,training accuracy 0.97
test accuracy 0.9529
step 1000,training accuracy 0.93
test accuracy 0.9518
