[View in Colaboratory](https://colab.research.google.com/github/JozeeLin/google-tensorflow-exercise/blob/master/Bidirectional_LSTM_Classifier.ipynb)

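The main goal of Bidirectional Recurrent Neural Networks (Bi-RNN) is to increase the amount of information available to an RNN. An ordinary MLP, for example, places restrictions on properties such as input length. An RNN can handle time-series data of variable length, but it cannot use information that lies in the future of a given input. A Bi-RNN removes this limitation: for any input in the sequence, it can use both the preceding (past) and the following (future) data. The idea behind the implementation is simple: two recurrent networks that run in opposite time directions are connected to the same output. With this structure, the output layer has access to past and future information at the same time.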

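Bi-RNNs are particularly useful when context matters. In handwriting recognition, for example, knowing the words immediately before and after the word being recognized makes recognition considerably easier. Similarly, when reading an article, we sometimes need the context that follows to work out the exact meaning of a sentence. **For problems such as language modeling, a Bi-RNN may not be suitable: the goal there is to predict the next word from the preceding text, so information from the following text must not be passed to the model.**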

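For many classification problems, such as handwriting recognition, machine translation, and protein structure prediction, using a Bi-RNN can substantially improve model performance.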

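**Baidu also uses Bi-RNNs in its speech recognition to take the surrounding context into account, which substantially improved the accuracy of its model.**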

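The core of the Bi-RNN architecture is to split an ordinary unidirectional RNN into two directions: one running forward in time and one running backward against it. The output at the current time step can then use information from both directions, unlike an ordinary RNN, which has to wait until later time steps before any future information becomes available.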
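
To make the two-direction idea concrete, here is a minimal sketch in plain Python. It is not the notebook's TensorFlow model: it uses a scalar tanh RNN with made-up weights (`w_in`, `w_rec`) that are shared between the two directions for brevity, whereas a real Bi-RNN would normally learn separate weights for each direction. The helper names `rnn_pass` and `bi_rnn` are hypothetical.

```python
import math

def rnn_pass(inputs, w_in, w_rec):
    """Run a scalar tanh RNN over `inputs`; return the hidden state at each step."""
    h, states = 0.0, []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states

def bi_rnn(inputs, w_in=0.5, w_rec=0.8):
    """Pair the forward and backward states at every time step (toy weights)."""
    forward = rnn_pass(inputs, w_in, w_rec)
    # Run over the reversed sequence, then flip back so index t lines up with input t.
    backward = rnn_pass(inputs[::-1], w_in, w_rec)[::-1]
    return list(zip(forward, backward))

seq_a = [1.0, 0.0, 0.0, 0.0]
seq_b = [1.0, 0.0, 0.0, 5.0]  # differs from seq_a only at the last step

out_a, out_b = bi_rnn(seq_a), bi_rnn(seq_b)
# The forward state at t=0 only sees the past, so it is identical for both sequences.
print(out_a[0][0] == out_b[0][0])
# The backward state at t=0 has already seen the last input, so it differs.
print(out_a[0][1] != out_b[0][1])
```

Changing only the final input leaves the forward state at the first step untouched but changes the backward state there, which is the sense in which a Bi-RNN output can "see" the future.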

## The code in this section comes from the open-source TensorFlow-Examples implementation

In [1]:
import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('/tmp/data/', one_hot=True)

Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
Instructions for updating:
Please write your own downloading logic.
Instructions for updating:
Please use urllib or similar directly.
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting /tmp/data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Instructions for updating:
Please use tf.one_hot on tensors.
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from t

In [0]:
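learning_rate = 0.01   # step size for the optimizer
max_samples = 400000   # stop once this many training samples have been processed
batch_size = 128       # samples per minibatch
display_step = 10      # print loss and accuracy every 10 steps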
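
As a quick sanity check on these settings, the following sketch works out how many optimizer steps the training loop later in this notebook performs, given that it runs while `step*batch_size < max_samples` starting from `step = 1`. The training-set size of 55,000 is an assumption (the usual default train split of this MNIST helper), and the names `num_steps`, `num_log_lines`, and `approx_epochs` are introduced here only for illustration.

```python
max_samples = 400000
batch_size = 128
display_step = 10

# Largest step that still satisfies step * batch_size < max_samples.
num_steps = (max_samples - 1) // batch_size
# A log line is printed whenever step is a multiple of display_step.
num_log_lines = num_steps // display_step
# The "Iter" value shown on the first log line.
first_logged = display_step * batch_size

# Assumption: 55,000 training images in mnist.train.
train_size = 55000
approx_epochs = num_steps * batch_size / train_size

print(num_steps, num_log_lines, first_logged, round(approx_epochs, 2))
# → 3124 312 1280 7.27
```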

In [0]:
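n_input = 28     # features per time step: one 28-pixel image row
n_steps = 28     # time steps per sequence: 28 rows per image
n_hidden = 256   # hidden units in each LSTM direction
n_classes = 10   # digits 0-9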

In [0]:
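# Input sequences of shape [batch, n_steps, n_input]; labels are one-hot vectors.
x = tf.placeholder('float',[None, n_steps, n_input])
y = tf.placeholder('float', [None, n_classes])

# Output layer; 2*n_hidden because the forward and backward outputs are concatenated.
weights = tf.Variable(tf.random_normal([2*n_hidden, n_classes]))
biases = tf.Variable(tf.random_normal([n_classes]))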
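
The first dimension of `weights` is `2*n_hidden` because the output layer receives the forward and backward LSTM outputs side by side. As a rough estimate of model size, the sketch below counts trainable parameters under the assumption of a standard LSTM layout: four gates, each with weights over the concatenated input and hidden state plus a bias vector. The helper `lstm_params` is hypothetical and is not part of TensorFlow.

```python
n_input, n_hidden, n_classes = 28, 256, 10

def lstm_params(input_size, hidden_size):
    # Four gates, each with weights over [input, hidden] plus a bias vector.
    return 4 * ((input_size + hidden_size) * hidden_size + hidden_size)

per_direction = lstm_params(n_input, n_hidden)
# Dense output layer on the concatenated forward/backward outputs.
output_layer = 2 * n_hidden * n_classes + n_classes
total = 2 * per_direction + output_layer

print(per_direction, output_layer, total)
# → 291840 5130 588810
```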

In [0]:
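def BiRNN(x, weights, biases):
  # Convert [batch, n_steps, n_input] into a list of n_steps tensors of shape [batch, n_input].
  x = tf.transpose(x, [1,0,2])
  x = tf.reshape(x, [-1, n_input])
  x = tf.split(x, n_steps)
  
  # One LSTM cell per direction.
  lstm_fw_cell = tf.contrib.rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)
  lstm_bw_cell = tf.contrib.rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)
  
  # Each element of outputs concatenates the forward and backward outputs for one time step.
  outputs, _, _ = tf.contrib.rnn.static_bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, x, dtype=tf.float32)
  
  # Classify from the output at the last time step.
  return tf.matmul(outputs[-1], weights)+biases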
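
The transpose, reshape, and split at the top of `BiRNN` turn a batch-major tensor into a list with one entry per time step. The sketch below imitates that re-layout with nested Python lists standing in for tensors. The sizes are toy values rather than the notebook's 28 x 28, and the names `batch` and `steps` are chosen only for this illustration.

```python
batch_size, n_steps, n_input = 2, 3, 4  # toy sizes for illustration

# batch[b][t] is the feature vector of example b at time step t.
batch = [[[b * 100 + t * 10 + i for i in range(n_input)]
          for t in range(n_steps)]
         for b in range(batch_size)]

# Swap the batch and time axes: steps[t][b] is example b's vector at step t.
steps = [list(per_step) for per_step in zip(*batch)]

print(len(steps), len(steps[0]), len(steps[0][0]))
# → 3 2 4
print(steps[2][1] == batch[1][2])
# → True
```

Each element of `steps` plays the role of one `[batch, n_input]` tensor in the list passed to the static bidirectional RNN.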

In [6]:
# Logits for each class, shape [batch, n_classes]
pred = BiRNN(x, weights, biases)

# Mean softmax cross-entropy over the batch
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred,labels=y))

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Fraction of samples whose predicted class matches the label
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

init = tf.global_variables_initializer()

Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See tf.nn.softmax_cross_entropy_with_logits_v2.
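
For readers who want to see what the `cost` and `accuracy` nodes compute, here is a small plain-Python sketch of mean softmax cross-entropy and argmax accuracy on a toy batch. It illustrates the standard definitions and is not TensorFlow's implementation; the logits and labels are invented, and the helper names `softmax_cross_entropy` and `argmax` are hypothetical.

```python
import math

def softmax_cross_entropy(logits, one_hot):
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # Cross-entropy is minus the log-probability assigned to the true class.
    return -sum(t * (z - log_sum_exp) for z, t in zip(logits, one_hot))

def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])

# Toy batch with 3 classes (made-up numbers).
logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0], [1.0, 1.5, 0.0]]
labels = [[1, 0, 0], [0, 0, 1], [1, 0, 0]]

losses = [softmax_cross_entropy(z, t) for z, t in zip(logits, labels)]
cost = sum(losses) / len(losses)
accuracy = sum(argmax(z) == argmax(t) for z, t in zip(logits, labels)) / len(labels)

print(round(accuracy, 4))
# → 0.6667
```

The third toy example is misclassified (its largest logit is at index 1 while the label is class 0), so two of three predictions are correct.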



In [7]:
with tf.Session() as sess:
  sess.run(init)
  
  step = 1
  while step*batch_size < max_samples:
    batch_x, batch_y = mnist.train.next_batch(batch_size)
    
    batch_x = batch_x.reshape((batch_size, n_steps, n_input))
    
    sess.run(optimizer, feed_dict={x:batch_x,y:batch_y})
    
    if step%display_step == 0:
      acc = sess.run(accuracy, feed_dict={x:batch_x, y:batch_y})
      loss = sess.run(cost, feed_dict={x:batch_x, y:batch_y})
      print('Iter '+str(step*batch_size)+", Minibatch Loss= "+ \
           "{:.6}".format(loss)+", Training Accuracy= "+\
           "{:.5f}".format(acc))
      
    step += 1
  print('Optimization Finished!')
  
  # Run prediction on all test data in mnist.test.images and print the accuracy
  test_len = 10000
  test_data = mnist.test.images[:test_len].reshape((-1, n_steps, n_input))
  test_label = mnist.test.labels[:test_len]
  print('Testing Accuracy:', sess.run(accuracy, feed_dict={x:test_data, y:test_label}))

Iter 1280, Minibatch Loss= 2.35805, Training Accuracy= 0.25000
Iter 2560, Minibatch Loss= 1.42496, Training Accuracy= 0.52344
Iter 3840, Minibatch Loss= 1.02555, Training Accuracy= 0.62500
Iter 5120, Minibatch Loss= 1.08392, Training Accuracy= 0.64844
Iter 6400, Minibatch Loss= 0.559484, Training Accuracy= 0.82031
Iter 7680, Minibatch Loss= 0.545343, Training Accuracy= 0.78125
Iter 8960, Minibatch Loss= 0.369744, Training Accuracy= 0.86719
Iter 10240, Minibatch Loss= 0.210627, Training Accuracy= 0.93750
Iter 11520, Minibatch Loss= 0.117498, Training Accuracy= 0.97656
Iter 12800, Minibatch Loss= 0.25528, Training Accuracy= 0.92188
Iter 14080, Minibatch Loss= 0.391438, Training Accuracy= 0.85938
Iter 15360, Minibatch Loss= 0.239144, Training Accuracy= 0.92969
Iter 16640, Minibatch Loss= 0.408407, Training Accuracy= 0.91406
Iter 17920, Minibatch Loss= 0.288785, Training Accuracy= 0.92969
Iter 19200, Minibatch Loss= 0.33451, Training Accuracy= 0.88281
Iter 20480, Minibatch Loss= 0.141443, 

Iter 71680, Minibatch Loss= 0.0882015, Training Accuracy= 0.96875
Iter 72960, Minibatch Loss= 0.0570838, Training Accuracy= 0.98438
Iter 74240, Minibatch Loss= 0.0510431, Training Accuracy= 0.98438
Iter 75520, Minibatch Loss= 0.070563, Training Accuracy= 0.98438
Iter 76800, Minibatch Loss= 0.0549392, Training Accuracy= 0.98438
Iter 78080, Minibatch Loss= 0.0593532, Training Accuracy= 0.97656
Iter 79360, Minibatch Loss= 0.0308864, Training Accuracy= 0.99219
Iter 80640, Minibatch Loss= 0.0529998, Training Accuracy= 0.98438
Iter 81920, Minibatch Loss= 0.0374699, Training Accuracy= 0.98438
Iter 83200, Minibatch Loss= 0.0418591, Training Accuracy= 0.98438
Iter 84480, Minibatch Loss= 0.0196275, Training Accuracy= 0.99219
Iter 85760, Minibatch Loss= 0.0818101, Training Accuracy= 0.97656
Iter 87040, Minibatch Loss= 0.0266132, Training Accuracy= 1.00000
Iter 88320, Minibatch Loss= 0.0238477, Training Accuracy= 0.99219
Iter 89600, Minibatch Loss= 0.122139, Training Accuracy= 0.98438
Iter 90880, 

Iter 140800, Minibatch Loss= 0.0906404, Training Accuracy= 0.97656
Iter 142080, Minibatch Loss= 0.0152759, Training Accuracy= 1.00000
Iter 143360, Minibatch Loss= 0.0147332, Training Accuracy= 1.00000
Iter 144640, Minibatch Loss= 0.0120303, Training Accuracy= 1.00000
Iter 145920, Minibatch Loss= 0.0508728, Training Accuracy= 0.98438
Iter 147200, Minibatch Loss= 0.0271843, Training Accuracy= 0.99219
Iter 148480, Minibatch Loss= 0.0478371, Training Accuracy= 0.97656
Iter 149760, Minibatch Loss= 0.0282381, Training Accuracy= 0.99219
Iter 151040, Minibatch Loss= 0.0684309, Training Accuracy= 0.98438
Iter 152320, Minibatch Loss= 0.0509333, Training Accuracy= 0.97656
Iter 153600, Minibatch Loss= 0.0499943, Training Accuracy= 0.99219
Iter 154880, Minibatch Loss= 0.0148821, Training Accuracy= 1.00000
Iter 156160, Minibatch Loss= 0.0341163, Training Accuracy= 0.98438
Iter 157440, Minibatch Loss= 0.0124518, Training Accuracy= 1.00000
Iter 158720, Minibatch Loss= 0.0400689, Training Accuracy= 0.9

Iter 209920, Minibatch Loss= 0.065367, Training Accuracy= 0.97656
Iter 211200, Minibatch Loss= 0.0199244, Training Accuracy= 0.98438
Iter 212480, Minibatch Loss= 0.0925493, Training Accuracy= 0.98438
Iter 213760, Minibatch Loss= 0.0178739, Training Accuracy= 0.99219
Iter 215040, Minibatch Loss= 0.031884, Training Accuracy= 0.99219
Iter 216320, Minibatch Loss= 0.082046, Training Accuracy= 0.97656
Iter 217600, Minibatch Loss= 0.0370816, Training Accuracy= 0.98438
Iter 218880, Minibatch Loss= 0.0438505, Training Accuracy= 0.99219
Iter 220160, Minibatch Loss= 0.0814698, Training Accuracy= 0.98438
Iter 221440, Minibatch Loss= 0.0224496, Training Accuracy= 0.99219
Iter 222720, Minibatch Loss= 0.0621614, Training Accuracy= 0.98438
Iter 224000, Minibatch Loss= 0.0166266, Training Accuracy= 1.00000
Iter 225280, Minibatch Loss= 0.0276492, Training Accuracy= 0.98438
Iter 226560, Minibatch Loss= 0.029827, Training Accuracy= 0.98438
Iter 227840, Minibatch Loss= 0.0630167, Training Accuracy= 0.98438

Iter 279040, Minibatch Loss= 0.028718, Training Accuracy= 0.99219
Iter 280320, Minibatch Loss= 0.00773497, Training Accuracy= 1.00000
Iter 281600, Minibatch Loss= 0.00762161, Training Accuracy= 1.00000
Iter 282880, Minibatch Loss= 0.0124966, Training Accuracy= 1.00000
Iter 284160, Minibatch Loss= 0.053494, Training Accuracy= 0.98438
Iter 285440, Minibatch Loss= 0.021671, Training Accuracy= 0.99219
Iter 286720, Minibatch Loss= 0.0218721, Training Accuracy= 0.99219
Iter 288000, Minibatch Loss= 0.00344791, Training Accuracy= 1.00000
Iter 289280, Minibatch Loss= 0.0242906, Training Accuracy= 1.00000
Iter 290560, Minibatch Loss= 0.00761749, Training Accuracy= 1.00000
Iter 291840, Minibatch Loss= 0.00223829, Training Accuracy= 1.00000
Iter 293120, Minibatch Loss= 0.03785, Training Accuracy= 0.98438
Iter 294400, Minibatch Loss= 0.0755082, Training Accuracy= 0.97656
Iter 295680, Minibatch Loss= 0.0110098, Training Accuracy= 1.00000
Iter 296960, Minibatch Loss= 0.013355, Training Accuracy= 1.00

Iter 348160, Minibatch Loss= 0.00358348, Training Accuracy= 1.00000
Iter 349440, Minibatch Loss= 0.0240266, Training Accuracy= 0.99219
Iter 350720, Minibatch Loss= 0.00203763, Training Accuracy= 1.00000
Iter 352000, Minibatch Loss= 0.0100538, Training Accuracy= 1.00000
Iter 353280, Minibatch Loss= 0.00763917, Training Accuracy= 1.00000
Iter 354560, Minibatch Loss= 0.00892685, Training Accuracy= 1.00000
Iter 355840, Minibatch Loss= 0.00209191, Training Accuracy= 1.00000
Iter 357120, Minibatch Loss= 0.0255446, Training Accuracy= 0.99219
Iter 358400, Minibatch Loss= 0.0222952, Training Accuracy= 0.98438
Iter 359680, Minibatch Loss= 0.0182557, Training Accuracy= 0.99219
Iter 360960, Minibatch Loss= 0.0224101, Training Accuracy= 0.99219
Iter 362240, Minibatch Loss= 0.0176188, Training Accuracy= 1.00000
Iter 363520, Minibatch Loss= 0.0240097, Training Accuracy= 1.00000
Iter 364800, Minibatch Loss= 0.0422063, Training Accuracy= 0.99219
Iter 366080, Minibatch Loss= 0.0250617, Training Accuracy