# Recurrent Neural Networks

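This section shows how to build a simple recurrent neural network (RNN) in TensorFlow that classifies the handwritten digits of the MNIST dataset into the 10 classes 0-9.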

## Environment
1. Python 3.6.1
1. TensorFlow 1.4.0
1. Jupyter 4.3.0


- Import the required libraries

In [2]:
import tensorflow as tf
from tensorflow.contrib import rnn
from tensorflow.examples.tutorials.mnist import input_data

- Load the MNIST data

In [4]:
mnist = input_data.read_data_sets("data/", one_hot=True)

Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting data/t10k-labels-idx1-ubyte.gz


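## Building the model
First set the training hyperparameters: the learning rate, the training budget, and the batch size.
- Learning rate 0.001, `training_iters` = 100000 (counted in training samples rather than update steps, because the training loop runs while `batch_size * step < training_iters`), and `batch_size` = 128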

In [5]:
lr = 0.001
training_iters = 100000
batch_size = 128
display_step = 10
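
Because `training_iters` is compared against `batch_size * step`, the number of optimizer updates is much smaller than 100000. The following is a minimal sketch that replays only the loop condition, without any training, to count the updates; the names `num_updates` and `samples_seen` are introduced here for illustration.

```python
# Values copied from the hyperparameter cell.
training_iters = 100000
batch_size = 128

# Replay the loop condition `batch_size * step < training_iters`.
step = 1
num_updates = 0
while batch_size * step < training_iters:
    num_updates += 1
    step += 1

# Each update consumes one mini-batch.
samples_seen = num_updates * batch_size
print(num_updates, samples_seen)
```

With `display_step = 10`, progress is printed once every 10 of these updates.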

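To classify images with an RNN, treat each image as a sequence of pixels. MNIST images are 28x28 pixels, so each image is read row by row: the sequence has 28 time steps (one per row), and the input at each step is one row of 28 pixels.
- Set the network parameters: 28 inputs per step, 28 steps, 128 hidden units, and 10 output classes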

In [6]:
n_input = 28
n_step = 28
n_hidden = 128
n_classes = 10
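
The row-by-row reading described above amounts to a row-major reshape. The sketch below uses NumPy and a synthetic stand-in for one flattened image (`flat_image` is an assumption for illustration, not real MNIST data) to show how 784 values become 28 steps of 28 inputs.

```python
import numpy as np

n_input = 28   # pixels per row, fed at each time step
n_step = 28    # rows per image, i.e. number of time steps

# Stand-in for one flattened MNIST image (784 values).
flat_image = np.arange(n_step * n_input, dtype=np.float32)

# Row-major reshape: sequence[t] is row t of the image.
sequence = flat_image.reshape(n_step, n_input)

print(sequence.shape)
print(sequence[1, :3])
```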

Define the input placeholders, the weights, and the biases.

In [7]:
x = tf.placeholder(tf.float32, [None, n_step, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])

weights = {
    # (28,128)
    'in': tf.Variable(tf.random_normal([n_input,n_hidden])),
    # (128,10)
    'out': tf.Variable(tf.random_normal([n_hidden,n_classes]))
}

biases = {
    # (128,)
    'in': tf.Variable(tf.constant(0.1,shape=[n_hidden,])),
    # (10,)
    'out': tf.Variable(tf.constant(0.1,shape=[n_classes,]))
}
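
To make the shapes in the comments concrete, here is a NumPy sketch of how an input-layer weight of shape (28, 128) and a bias of shape (128,) act on a batch of sequences. The batch size of 4 and the random values are arbitrary assumptions; the point is only the shape bookkeeping and that the bias is added to the matrix product.

```python
import numpy as np

batch, n_step, n_input, n_hidden = 4, 28, 28, 128  # batch of 4 is arbitrary

rng = np.random.RandomState(0)
x = rng.standard_normal((batch, n_step, n_input)).astype(np.float32)
w_in = rng.standard_normal((n_input, n_hidden)).astype(np.float32)
b_in = np.full((n_hidden,), 0.1, dtype=np.float32)

# Flatten batch and time so one matrix product handles every row of every image.
x_flat = x.reshape(-1, n_input)                  # (batch * n_step, n_input)
projected = x_flat @ w_in + b_in                 # bias added after the product
x_in = projected.reshape(-1, n_step, n_hidden)   # (batch, n_step, n_hidden)

print(x_flat.shape, x_in.shape)
```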

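## Defining the RNN model
- Reshape the input X to X ==> (128 batch * 28 steps, 28 inputs)
- Use a basic LSTM recurrent cell
- Return the class scores (logits) for the sequence; the softmax is applied later inside the loss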

In [8]:
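def rnn_model(X, weights, biases):
    # Flatten batch and time: X ==> (128 batch * 28 steps, 28 inputs)
    X = tf.reshape(X, [-1, n_input])
    # Input projection: X_in ==> (128 batch * 28 steps, 128 hidden).
    # The bias is added to the matmul result, not to the weight matrix.
    X_in = tf.matmul(X, weights['in']) + biases['in']
    # Restore the time axis: X_in ==> (128 batch, 28 steps, 128 hidden)
    X_in = tf.reshape(X_in, [-1, n_step, n_hidden])

    # Use a basic LSTM cell; its state is a tuple (c_state, h_state)
    lstm_cell = rnn.BasicLSTMCell(n_hidden, forget_bias=1.0,
                                  state_is_tuple=True)
    # The zero state fixes the batch dimension, so every fed batch must
    # contain exactly batch_size examples
    init_state = lstm_cell.zero_state(batch_size, dtype=tf.float32)
    outputs, final_state = tf.nn.dynamic_rnn(lstm_cell, X_in,
                                             initial_state=init_state,
                                             time_major=False)
    # final_state[1] is h_state, the cell output at the last time step
    results = tf.matmul(final_state[1], weights['out']) + biases['out']
    return results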
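
For readers who want to see what the LSTM cell computes at each time step, the following NumPy sketch unrolls a standard LSTM update over 28 steps. It is an illustration under stated assumptions, not TensorFlow's implementation: `lstm_step`, `kernel`, and `bias` are hypothetical names, the weights are random, and the gate order (input, candidate, forget, output) is the one I understand `BasicLSTMCell` to use.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, c_prev, h_prev, kernel, bias, forget_bias=1.0):
    """One LSTM time step; kernel has shape (n_in + n_hidden, 4 * n_hidden)."""
    gates = np.concatenate([x_t, h_prev], axis=1) @ kernel + bias
    i, j, f, o = np.split(gates, 4, axis=1)  # input, candidate, forget, output
    c_new = c_prev * sigmoid(f + forget_bias) + sigmoid(i) * np.tanh(j)
    h_new = np.tanh(c_new) * sigmoid(o)
    return c_new, h_new

batch, n_step, n_in, n_hidden = 2, 28, 128, 128  # batch of 2 is arbitrary
rng = np.random.RandomState(0)
x_in = rng.standard_normal((batch, n_step, n_in))
kernel = rng.standard_normal((n_in + n_hidden, 4 * n_hidden)) * 0.1
bias = np.zeros(4 * n_hidden)

# Start from a zero state and feed one row of the projected image per step.
c = np.zeros((batch, n_hidden))
h = np.zeros((batch, n_hidden))
for t in range(n_step):
    c, h = lstm_step(x_in[:, t, :], c, h, kernel, bias)

print(h.shape)  # the final h plays the role of final_state[1]
```

Only the final hidden state is passed to the output layer, so the classification depends on the whole image as summarized after the last row.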

Define the loss function and the optimizer; the optimizer is `AdamOptimizer`.

In [9]:
pred = rnn_model(x,weights,biases)
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(lr).minimize(cost)
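
As a rough illustration of what the loss computes, here is a NumPy sketch of softmax cross-entropy averaged over a batch. The function `softmax_cross_entropy` and the toy `logits` and `labels` are assumptions made for this example; the TensorFlow op is the one actually used in training.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Per-example cross-entropy between one-hot labels and softmax(logits)."""
    # Subtracting the row maximum avoids overflow in exp without changing the result.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels * log_probs).sum(axis=1)

logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

per_example = softmax_cross_entropy(logits, labels)
cost = per_example.mean()  # mirrors tf.reduce_mean over the batch
print(per_example, cost)
```

The second example has uniform logits over three classes, so its loss is log(3) regardless of the label.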

Define the model's predictions and how accuracy is computed.

In [10]:
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
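
The same accuracy computation can be traced by hand in NumPy. In this small sketch the `pred` scores and one-hot `y` labels are made-up values for four examples and three classes.

```python
import numpy as np

pred = np.array([[0.1, 2.0, 0.3],
                 [1.5, 0.2, 0.1],
                 [0.0, 0.1, 3.0],
                 [0.9, 0.8, 0.1]])
y = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 1, 0],
              [1, 0, 0]])

# The predicted class is the index of the largest score; compare with the label index.
correct_pred = np.argmax(pred, axis=1) == np.argmax(y, axis=1)
# Casting booleans to floats and averaging gives the fraction of correct predictions.
accuracy = correct_pred.astype(np.float32).mean()
print(correct_pred, accuracy)
```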

## Training and evaluating the model

In [11]:
tf.summary.scalar('accuracy', accuracy)
tf.summary.scalar('loss', cost)
summaries = tf.summary.merge_all()

with tf.Session() as sess:
    train_writer = tf.summary.FileWriter('logs/', sess.graph)
    init = tf.global_variables_initializer()
    sess.run(init)
    step = 1
    while batch_size * step < training_iters:
        batch_x, batch_y = mnist.train.next_batch(batch_size)
        batch_x = batch_x.reshape(batch_size, n_step, n_input)
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
        if step % display_step == 0:
            acc, loss = sess.run(
                [accuracy, cost], feed_dict={x: batch_x,
                                             y: batch_y})
            print("Iteration " + str(step * batch_size) + ", Minibatch Loss= " + \
                  "{:.6f}".format(loss) + ", Training Accuracy= " + \
                  "{:.5f}".format(acc))
        if step % 100 == 0:
            s = sess.run(summaries, feed_dict={x: batch_x, y: batch_y})
            train_writer.add_summary(s, global_step=step)

        step += 1
    print("Optimization Finished!")


Iteration 1280, Minibatch Loss= 1.301216, Training Accuracy= 0.55469
Iteration 2560, Minibatch Loss= 1.010427, Training Accuracy= 0.66406
Iteration 3840, Minibatch Loss= 0.788430, Training Accuracy= 0.70312
Iteration 5120, Minibatch Loss= 0.667551, Training Accuracy= 0.78906
Iteration 6400, Minibatch Loss= 0.595816, Training Accuracy= 0.78125
Iteration 7680, Minibatch Loss= 0.350965, Training Accuracy= 0.88281
Iteration 8960, Minibatch Loss= 0.499101, Training Accuracy= 0.79688
Iteration 10240, Minibatch Loss= 0.482088, Training Accuracy= 0.82812
Iteration 11520, Minibatch Loss= 0.504503, Training Accuracy= 0.83594
Iteration 12800, Minibatch Loss= 0.271221, Training Accuracy= 0.91406
Iteration 14080, Minibatch Loss= 0.464995, Training Accuracy= 0.86719
Iteration 15360, Minibatch Loss= 0.322582, Training Accuracy= 0.89844
Iteration 16640, Minibatch Loss= 0.347899, Training Accuracy= 0.88281
Iteration 17920, Minibatch Loss= 0.394192, Training Accuracy= 0.88281
Iteration 19200, Minibatch 
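
The log above reports accuracy on training mini-batches only. Because the initial LSTM state is built with a fixed `batch_size`, any held-out evaluation would also have to be fed in batches of exactly that size. The sketch below shows one way to average a metric over full-size batches; `evaluate_in_full_batches` and `accuracy_fn` are hypothetical names, and the stand-in callback exists only so the example runs without TensorFlow.

```python
def evaluate_in_full_batches(num_examples, batch_size, accuracy_fn):
    """Average accuracy_fn(start, end) over consecutive full-size batches.

    accuracy_fn is a hypothetical callback; inside a session it could wrap
    sess.run(accuracy, feed_dict=...) on examples[start:end].
    A trailing partial batch is skipped because the graph expects exactly
    batch_size examples per feed.
    """
    scores = []
    for start in range(0, num_examples - batch_size + 1, batch_size):
        scores.append(accuracy_fn(start, start + batch_size))
    if not scores:
        raise ValueError("need at least one full batch")
    return sum(scores) / len(scores)

# Stand-in callback that records the slices it was asked to score.
seen = []
def fake_accuracy(start, end):
    seen.append((start, end))
    return 0.5

result = evaluate_in_full_batches(num_examples=300, batch_size=128,
                                  accuracy_fn=fake_accuracy)
print(result, seen)
```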

Code adapted from [https://github.com/nlintz/TensorFlow-Tutorials/blob/master/07_lstm.py](https://github.com/nlintz/TensorFlow-Tutorials/blob/master/07_lstm.py)