# References:
* https://github.com/chiphuyen/stanford-tensorflow-tutorials/blob/master/examples/11_char_rnn.py
* https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/recurrent_network.ipynb
* https://docs.google.com/presentation/d/1QydMhsGFeUzDYZr7dV0tt9WJryleOoKUyJth4ftRDbA/edit#slide=id.g1ce324a1fd_0_403

In this section, we look at RNNs and LSTMs in TensorFlow.

Besides the Keras module, TensorFlow provides a set of recurrent cells in the `tf.nn.rnn_cell` module:
* **BasicRNNCell**: the most basic RNN cell
* **RNNCell**: the abstract object that represents an RNN cell
* **BasicLSTMCell**: a basic LSTM recurrent network cell (https://arxiv.org/pdf/1409.2329.pdf)
* **LSTMCell**: an LSTM cell with optional peephole connections, cell clipping, and a projection layer
* **GRUCell**: a Gated Recurrent Unit (GRU) cell

For more details, see the TensorFlow documentation.

In this section, we build an LSTM network to classify handwritten digits from the MNIST dataset. The same can be done with a plain RNN. The dataset contains 60,000 samples for training and 10,000 samples for testing. Each digit image has a fixed size of 28x28 pixels, with values normalized to the range 0 to 1. For simplicity, each image has been flattened into a 1-D numpy array of 784 features (28 * 28).
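As a minimal sketch of the flattening described above, the snippet below uses a randomly generated array in place of a real MNIST digit. The `image` array and its values are assumptions made only for illustration. It shows that flattening row by row places pixel (row `r`, column `c`) at index `r * 28 + c`.

```python
import numpy as np

# Hypothetical stand-in for one MNIST digit: 28x28 values in [0, 1).
rng = np.random.default_rng(0)
image = rng.random((28, 28), dtype=np.float32)

# Flatten the image row by row into a single vector of 784 features.
flat = image.reshape(-1)

# Pixel (row r, column c) ends up at index r * 28 + c of the flat vector.
r, c = 5, 17
same_pixel = flat[r * 28 + c] == image[r, c]

print(flat.shape)   # (784,)
print(same_pixel)   # True
```

Because the flattening is row-major, reshaping the 784-element vector back to `(28, 28)` recovers the original image exactly.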

![digit](./images/digits.png)

To classify with an LSTM recurrent network, we treat each row of an image as one step in a sequence of pixels. Because MNIST images are 28 * 28 px, each sample becomes a sequence of 28 timesteps, and each timestep carries the 28 pixel values of one row.
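The row-as-timestep idea can be sketched with NumPy alone. The `batch` array below is a hypothetical placeholder (consecutive integers rather than real digits), chosen so that it is easy to see which pixels land in which timestep.

```python
import numpy as np

timesteps, num_input = 28, 28

# Hypothetical batch of 4 flattened images (placeholder values, not real digits).
batch = np.arange(4 * 784, dtype=np.float32).reshape(4, 784)

# (batch, 784) -> (batch, timesteps, num_input): one image row per timestep.
sequences = batch.reshape(-1, timesteps, num_input)

# Slicing along axis 1 yields one (batch, num_input) array per timestep,
# a NumPy analogue of unstacking the time axis.
steps = [sequences[:, t, :] for t in range(timesteps)]

print(sequences.shape)              # (4, 28, 28)
print(len(steps), steps[0].shape)   # 28 (4, 28)
```

Timestep `t` of a sample holds pixels `t * 28` through `t * 28 + 27` of its flattened vector, which is row `t` of the original image.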

In [1]:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz


In [2]:
# Training parameters
learning_rate = 0.001
training_steps = 10000
batch_size = 128
display_step = 200

# Network parameters
num_input = 28 # features per timestep: one image row (img shape: 28*28)
timesteps = 28 # number of timesteps: one per image row
num_hidden = 128 # number of hidden units in the LSTM cell
num_classes = 10 # MNIST digit classes (0-9)

# tf Graph input
X = tf.placeholder(tf.float32, [None, timesteps, num_input])
Y = tf.placeholder(tf.float32, [None, num_classes])

In [3]:
# define weights
weights = {
    'out': tf.Variable(tf.random_normal([num_hidden, num_classes]))
}

biases = {
    'out': tf.Variable(tf.random_normal([num_classes]))
}
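To get a feel for the model size, the arithmetic below counts trainable parameters for the hyperparameters above. It is a sketch that assumes a standard LSTM cell without peepholes or projection (four gates, each with a weight matrix over the concatenated input and previous hidden state plus a bias), followed by the `out` layer.

```python
num_input, num_hidden, num_classes = 28, 128, 10

# Each of the 4 gates has weights over [input, previous hidden state] plus a bias.
lstm_params = 4 * ((num_input + num_hidden) * num_hidden + num_hidden)

# Output projection: weights['out'] and biases['out'].
out_params = num_hidden * num_classes + num_classes

print(lstm_params, out_params, lstm_params + out_params)  # 80384 1290 81674
```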

In [4]:
from tensorflow.contrib import rnn
def LSTM(x, weights, biases):
    # prepare data shape to match the static RNN requirements
    # current data input shape: (batch_size, timesteps, num_input)
    # required shape: list of 'timesteps' tensors of shape (batch_size, num_input)
    
    # unstack along axis 1 to get a list of 'timesteps' tensors of shape (batch_size, num_input)
    x = tf.unstack(x, timesteps, 1)
    
    # define a lstm cell with tensorflow
    lstm_cell = rnn.BasicLSTMCell(num_hidden, forget_bias=1.0)
    
    # get LSTM cell output
    outputs, states = rnn.static_rnn(lstm_cell, x, dtype=tf.float32)
    
    # linear activation, using rnn inner loop last output
    return tf.matmul(outputs[-1], weights['out']) + biases['out']
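To make the recurrence less of a black box, here is a NumPy sketch of what a basic LSTM cell computes at each timestep and how it is unrolled over the 28 rows. The gate ordering, the placement of `forget_bias`, the random parameters, and the helper names `sigmoid` and `lstm_step` are assumptions for illustration. This is not the library's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, kernel, bias, forget_bias=1.0):
    """One LSTM step.

    Shapes: x_t (B, I), h_prev and c_prev (B, H), kernel (I + H, 4H), bias (4H,).
    """
    z = np.concatenate([x_t, h_prev], axis=1) @ kernel + bias
    # input gate, candidate values, forget gate, output gate
    i, j, f, o = np.split(z, 4, axis=1)
    c = c_prev * sigmoid(f + forget_bias) + sigmoid(i) * np.tanh(j)
    h = np.tanh(c) * sigmoid(o)
    return h, c

# Hypothetical sizes and random parameters, for illustration only.
batch, timesteps, num_input, num_hidden = 2, 28, 28, 128
rng = np.random.default_rng(0)
kernel = rng.normal(scale=0.1, size=(num_input + num_hidden, 4 * num_hidden))
bias = np.zeros(4 * num_hidden)
x = rng.random((batch, timesteps, num_input))

h = np.zeros((batch, num_hidden))
c = np.zeros((batch, num_hidden))
for t in range(timesteps):  # unrolled over time, one image row per step
    h, c = lstm_step(x[:, t, :], h, c, kernel, bias)

print(h.shape)  # (2, 128)
```

The final `h` plays the role of `outputs[-1]` in the function above: a fixed-size summary of the whole image that the output layer maps to 10 class scores.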

In [5]:
logits = LSTM(X, weights, biases)
prediction = tf.nn.softmax(logits)

# define loss and optimizer
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
train_op = optimizer.minimize(loss_op)

# evaluate model: fraction of samples whose predicted class matches the label
correct_pred = tf.equal(tf.argmax(prediction, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# initialize the variables
init = tf.global_variables_initializer()
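The loss and accuracy ops above can be mirrored in plain NumPy to see what they compute. The sketch below uses hypothetical logits and one-hot labels for three samples. The helper names `softmax`, `cross_entropy`, and `accuracy_of` are made up for this example.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels):
    # Mean over the batch of -sum(label * log(probability)).
    probs = softmax(logits)
    return float(np.mean(-np.sum(labels * np.log(probs), axis=1)))

def accuracy_of(logits, labels):
    # Softmax preserves the ordering, so argmax of logits equals argmax of probabilities.
    return float(np.mean(np.argmax(logits, axis=1) == np.argmax(labels, axis=1)))

# Hypothetical logits for 3 samples and 3 classes, with one-hot labels.
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.1, 3.0],
                   [1.0, 2.0, 0.0]])
labels = np.array([[1, 0, 0],
                   [0, 0, 1],
                   [1, 0, 0]], dtype=np.float64)

loss = cross_entropy(logits, labels)
acc = accuracy_of(logits, labels)
print(round(acc, 3))  # 0.667
```

The third sample is misclassified (predicted class 1, true class 0), so two of three predictions are correct.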


In [6]:
with tf.Session() as sess:
    # run initializer
    sess.run(init)
    
    for step in range(1, training_steps+1):
        batch_x, batch_y = mnist.train.next_batch(batch_size)
        # reshape data to get 28 seq of 28 elements
        batch_x = batch_x.reshape((batch_size, timesteps, num_input))
        # run optimization op (backprop)
        sess.run(train_op, feed_dict={X: batch_x, Y: batch_y})
        if step % display_step == 0 or step == 1:
            # calculate batch loss and accuracy
            loss, acc = sess.run([loss_op, accuracy], feed_dict={X: batch_x, Y: batch_y})
            print("Step {}, Minibatch loss={:.4f}, Training accuracy = {:.3f}".format(step, loss, acc))
    print("Optimization Finished!")
    
    # calculate accuracy for 128 mnist test images
    test_len = 128
    test_data = mnist.test.images[:test_len].reshape((-1, timesteps, num_input))
    test_label = mnist.test.labels[:test_len]
    print("Testing accuracy:", sess.run(accuracy, feed_dict={X: test_data, Y: test_label}))

Step 1, Minibatch loss=2.4365, Training accuracy = 0.172
Step 200, Minibatch loss=1.9859, Training accuracy = 0.398
Step 400, Minibatch loss=1.8468, Training accuracy = 0.445
Step 600, Minibatch loss=1.8170, Training accuracy = 0.383
Step 800, Minibatch loss=1.7426, Training accuracy = 0.383
Step 1000, Minibatch loss=1.5940, Training accuracy = 0.477
Step 1200, Minibatch loss=1.5104, Training accuracy = 0.516
Step 1400, Minibatch loss=1.5210, Training accuracy = 0.516
Step 1600, Minibatch loss=1.4317, Training accuracy = 0.555
Step 1800, Minibatch loss=1.2745, Training accuracy = 0.633
Step 2000, Minibatch loss=1.3996, Training accuracy = 0.531
Step 2200, Minibatch loss=1.2006, Training accuracy = 0.641
Step 2400, Minibatch loss=1.2262, Training accuracy = 0.586
Step 2600, Minibatch loss=1.1255, Training accuracy = 0.695
Step 2800, Minibatch loss=1.1237, Training accuracy = 0.664
Step 3000, Minibatch loss=1.1135, Training accuracy = 0.625
Step 3200, Minibatch loss=1.0435, Training accuracy = 0.734
Step