# Dynamic RNN
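Goal: decide whether a sequence is linear, i.e. a binary sequence classification problem. Sequence lengths are not fixed. The LSTM model is many-to-one: the input is a sequence and the output is a fixed-length vector, similar to the earlier example that used an RNN for handwritten digit recognition. The difference is that <font color="red">the input sequences in this example have different lengths</font>, i.e. different `timestep` values; each sequence is updated for a number of steps equal to its own length (= `timestep`).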

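### 1. Generating data: the ToySequenceData class
Generates a set of sequences of random length; by default the shortest is 3 and the longest is 20. Sequences shorter than the maximum are padded with 0 up to the maximum length. The class method `next` returns data in batches: `batch_x`, `batch_y`, and `batch_seqlen`.
For example, with batch_size = 2: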

$ batch_x

> [[[0.375], [0.417], [0.675], [0.212], [0.799], [0.713], [0.489], [0.181], [0.757], [0.97], [0.755], [0.357], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0]], [[0.426], [0.787], [0.323], [0.295], [0.572], [0.696], [0.651], [0.344], [0.156], [0.855], [0.933], [0.505], [0.533], [0.881], [0.055], [0.318], [0.532], [0.196], [0.54], [0.0]]] # shape (2, 20, 1)

$ batch_y
> [[0.0, 1.0], [0.0, 1.0]]    # label is [1.0, 0.0] for a linear sequence, [0.0, 1.0] for a non-linear one

$ batch_seqlen

> [12, 19]  # true (unpadded) length of each sequence
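
The padding scheme described here can be sketched in a few lines of plain Python. This is a minimal illustration and not the class used in this notebook; the helper name `pad_batch` and its parameters are hypothetical.

```python
def pad_batch(sequences, max_seq_len=20, pad_value=0.0):
    """Pad each sequence to max_seq_len and record its true length.

    `pad_batch` is a hypothetical helper for illustration only.
    """
    batch_x, batch_seqlen = [], []
    for seq in sequences:
        if len(seq) > max_seq_len:
            raise ValueError("sequence longer than max_seq_len")
        # One feature per timestep, so every value is wrapped in a list
        padded = [[float(v)] for v in seq]
        padded += [[pad_value] for _ in range(max_seq_len - len(seq))]
        batch_x.append(padded)
        batch_seqlen.append(len(seq))
    return batch_x, batch_seqlen


batch_x, batch_seqlen = pad_batch([[0.1, 0.2, 0.3], [0.5, 0.9, 0.4, 0.7, 0.2]])
print(len(batch_x), len(batch_x[0]), len(batch_x[0][0]))  # 2 20 1
print(batch_seqlen)  # [3, 5]
```

The padded data has a uniform shape, while `batch_seqlen` keeps the information needed to ignore the padding later.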

### 2. Network model: dynamicRNN(x, seqlen, weights, biases)
```
Input nodes: 1
LSTM layer nodes: 64
Output nodes: 2
```
Compared with the previous model, this one takes an additional `seqlen` argument.

![Model diagram](http://ogtxggxo6.bkt.clouddn.com/ls.png?imageslim)
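
To make the role of `seqlen` concrete, here is a toy sketch in plain Python of length-aware unrolling for a single sequence. It follows the behavior documented for `static_rnn` when `sequence_length` is given: past the true length, the output is zero and the state is carried over unchanged. The running-sum `toy_cell` and the function `unroll_with_length` are hypothetical stand-ins, not an LSTM and not TensorFlow code.

```python
def toy_cell(x_t, state):
    """Hypothetical stand-in for an RNN cell: a running sum."""
    new_state = state + x_t
    return new_state, new_state  # (output, new_state)


def unroll_with_length(inputs, seq_len):
    """Unroll over all padded steps, but stop updating after seq_len."""
    state = 0.0
    outputs = []
    for t, x_t in enumerate(inputs):
        if t < seq_len:
            out, state = toy_cell(x_t, state)
        else:
            out = 0.0  # padded step: zero output, state is left untouched
        outputs.append(out)
    return outputs, state


padded = [1.0, 2.0, 3.0, 0.0, 0.0]  # true length 3, padded to 5
outputs, final_state = unroll_with_length(padded, seq_len=3)
print(outputs)      # [1.0, 3.0, 6.0, 0.0, 0.0]
print(final_state)  # 6.0
```

The last useful output sits at index `seq_len - 1`, not at the final padded step, which is why the model has to select outputs by length.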

In [1]:
# -*- coding: utf-8 -*-
'''
A Dynamic Recurrent Neural Network (LSTM) implementation example using
TensorFlow library. This example is using a toy dataset to classify linear
sequences. The generated sequences have variable length.

Long Short Term Memory paper: http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf
'''
from __future__ import print_function

import tensorflow as tf
import random

# ====================
#  TOY DATA GENERATOR
# ====================
class ToySequenceData(object):
    """ Generate sequences of data with dynamic length.
    This class generates samples for training:
    - Class 0: linear sequences (e.g. [0, 1, 2, 3,...])
    - Class 1: random sequences (e.g. [1, 3, 10, 7,...])

    NOTICE:
    We have to pad each sequence to reach 'max_seq_len' for TensorFlow
    consistency (we cannot feed a numpy array with inconsistent
    dimensions). The dynamic calculation will then be performed thanks to
    the 'seqlen' attribute that records every actual sequence length.
    """
    def __init__(self, n_samples=1000, max_seq_len=20, min_seq_len=3,
                 max_value=1000):
        self.data = []
        self.labels = []
        self.seqlen = []
        for _ in range(n_samples):
            # Random sequence length ('seq_len' avoids shadowing the builtin len)
            seq_len = random.randint(min_seq_len, max_seq_len)
            # Monitor sequence length for TensorFlow dynamic calculation
            self.seqlen.append(seq_len)
            # Add a random or linear int sequence (50% prob)
            if random.random() < .5:
                # Generate a linear sequence
                rand_start = random.randint(0, max_value - seq_len)
                s = [[float(v)/max_value] for v in
                     range(rand_start, rand_start + seq_len)]
                # Pad sequence for dimension consistency
                s += [[0.] for _ in range(max_seq_len - seq_len)]
                self.data.append(s)
                self.labels.append([1., 0.])
            else:
                # Generate a random sequence
                s = [[float(random.randint(0, max_value))/max_value]
                     for _ in range(seq_len)]
                # Pad sequence for dimension consistency
                s += [[0.] for _ in range(max_seq_len - seq_len)]
                self.data.append(s)
                self.labels.append([0., 1.])
        self.batch_id = 0

    def next(self, batch_size):
        """ Return a batch of data. When dataset end is reached, start over.
        """
        if self.batch_id == len(self.data):
            self.batch_id = 0
        batch_data = (self.data[self.batch_id:min(self.batch_id +
                                                  batch_size, len(self.data))])
        batch_labels = (self.labels[self.batch_id:min(self.batch_id +
                                                  batch_size, len(self.data))])
        batch_seqlen = (self.seqlen[self.batch_id:min(self.batch_id +
                                                  batch_size, len(self.data))])
        self.batch_id = min(self.batch_id + batch_size, len(self.data))
        return batch_data, batch_labels, batch_seqlen
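
One consequence of how `next` slices the data is that the last batch of each pass can be smaller than `batch_size`. The sketch below reproduces only the index arithmetic of `next`, using 1000 samples and a batch size of 128 as in this notebook; the function `batch_bounds` is a hypothetical helper written for illustration.

```python
def batch_bounds(n_samples, batch_size, n_calls):
    """Return the (start, end) slice bounds that successive calls would use."""
    bounds = []
    batch_id = 0
    for _ in range(n_calls):
        if batch_id == n_samples:  # end of data reached: start over
            batch_id = 0
        end = min(batch_id + batch_size, n_samples)
        bounds.append((batch_id, end))
        batch_id = end
    return bounds


bounds = batch_bounds(n_samples=1000, batch_size=128, n_calls=9)
sizes = [end - start for start, end in bounds]
print(sizes)  # [128, 128, 128, 128, 128, 128, 128, 104, 128]
```

With these settings every eighth call returns 104 samples, so a minibatch accuracy measured on such a step is a multiple of 1/104 (for example, 0.53846 is approximately 56/104).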

In [2]:

# ==========
#   MODEL
# ==========

# Parameters
learning_rate = 0.01
training_iters = 2000000
batch_size = 128
display_step = 200

# Network Parameters
seq_max_len = 20 # Sequence max length
n_hidden = 64 # hidden layer num of features
n_classes = 2 # linear sequence or not

trainset = ToySequenceData(n_samples=1000, max_seq_len=seq_max_len)
testset = ToySequenceData(n_samples=500, max_seq_len=seq_max_len)

# tf Graph input
x = tf.placeholder("float", [None, seq_max_len, 1])
y = tf.placeholder("float", [None, n_classes])
# A placeholder for indicating each sequence length
seqlen = tf.placeholder(tf.int32, [None])

# Define weights
weights = {
    'out': tf.Variable(tf.random_normal([n_hidden, n_classes]))
}
biases = {
    'out': tf.Variable(tf.random_normal([n_classes]))
}


def dynamicRNN(x, seqlen, weights, biases):

    # Prepare data shape to match `rnn` function requirements
    # Current data input shape: (batch_size, n_steps, n_input)
    # Required shape: 'n_steps' tensors list of shape (batch_size, n_input)

    # Unstack along axis 1 to get a list of 'n_steps' tensors of shape
    # (batch_size, n_input); here n_steps = seq_max_len = 20
    x = tf.unstack(x, seq_max_len, 1)

    # Define a lstm cell with tensorflow
    lstm_cell = tf.contrib.rnn.BasicLSTMCell(n_hidden)

    # Get lstm cell output, providing 'sequence_length' will perform dynamic
    # calculation.
    outputs, states = tf.contrib.rnn.static_rnn(lstm_cell, x, dtype=tf.float32,
                                sequence_length=seqlen)  # this is the difference from the basic LSTM example

    # In the previous example, defining seqlen as [28, 28, 28, ...] would
    # give the same result.
    # When performing dynamic calculation, we must retrieve the last
    # dynamically computed output, i.e., if a sequence length is 10, we need
    # to retrieve the 10th output.
    # However TensorFlow doesn't support advanced indexing yet, so we build
    # a custom op that for each sample in batch size, get its length and
    # get the corresponding relevant output.

    # 'outputs' is a list of output at every timestep, we pack them in a Tensor
    # and change back dimension to [batch_size, n_step, n_hidden]
    outputs = tf.stack(outputs)  # list -> tensor of shape (n_step, batch_size, n_hidden)
    outputs = tf.transpose(outputs, [1, 0, 2])  # now (batch_size, n_step, n_hidden)

    # Hack to build the indexing and retrieve the right output.
    batch_size = tf.shape(outputs)[0]  # dynamic batch size
    # Start indices for each sample
    index = tf.range(0, batch_size) * seq_max_len + (seqlen - 1)
    # Indexing: flatten 'outputs' from (batch_size, n_step, n_hidden) to
    # (batch_size * n_step, n_hidden). In the flattened tensor, consecutive
    # samples start seq_max_len (20) rows apart, and adding (seqlen - 1) lands
    # on the last valid step, e.g. the 10th output for a sequence of length 10.
    outputs = tf.gather(tf.reshape(outputs, [-1, n_hidden]), index)
    # Linear activation, using outputs computed above
    return tf.matmul(outputs, weights['out']) + biases['out']

pred = dynamicRNN(x, seqlen, weights, biases)

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred,1), tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.global_variables_initializer()
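
The flat-index trick used in `dynamicRNN` (reshape, then gather at `b * seq_max_len + (seqlen - 1)`) can be checked on a tiny example. The following is a plain-Python sketch of the same arithmetic on nested lists, not TensorFlow code; `last_valid_outputs` is a hypothetical name.

```python
def last_valid_outputs(outputs, seqlen):
    """Select outputs[b][seqlen[b] - 1] for every sample b via a flat index.

    outputs: nested list of shape (batch_size, max_len, n_hidden)
    seqlen:  list of true lengths, one per sample
    """
    max_len = len(outputs[0])
    # Equivalent of reshaping to (batch_size * max_len, n_hidden)
    flat = [row for sample in outputs for row in sample]
    # Sample b starts at row b * max_len; its last valid step is seqlen[b] - 1
    index = [b * max_len + (n - 1) for b, n in enumerate(seqlen)]
    return index, [flat[i] for i in index]


# Two samples, max_len = 4, n_hidden = 2; each row encodes [sample, step]
outputs = [[[b, t] for t in range(4)] for b in range(2)]
index, picked = last_valid_outputs(outputs, seqlen=[3, 1])
print(index)   # [2, 4]
print(picked)  # [[0, 2], [1, 0]]
```

Each picked row carries its own sample number and the step `seqlen - 1`, which confirms that the index lands on the last valid output of every sequence.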

  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "


In [3]:
# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    step = 1
    # Keep training until reach max iterations
    while step * batch_size < training_iters:
        batch_x, batch_y, batch_seqlen = trainset.next(batch_size)
        # Run optimization op (backprop)
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y,
                                       seqlen: batch_seqlen})
        if step % display_step == 0:
            # Calculate batch accuracy
            acc = sess.run(accuracy, feed_dict={x: batch_x, y: batch_y,
                                                seqlen: batch_seqlen})
            # Calculate batch loss
            loss = sess.run(cost, feed_dict={x: batch_x, y: batch_y,
                                             seqlen: batch_seqlen})
            print("Iter " + str(step*batch_size) + ", Minibatch Loss= " + \
                  "{:.6f}".format(loss) + ", Training Accuracy= " + \
                  "{:.5f}".format(acc))
        step += 1
    print("Optimization Finished!")

    # Calculate accuracy
    test_data = testset.data
    test_label = testset.labels
    test_seqlen = testset.seqlen
    print("Testing Accuracy:", \
        sess.run(accuracy, feed_dict={x: test_data, y: test_label,
                                      seqlen: test_seqlen}))
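
For reading the loss values, it helps to know what a softmax cross-entropy computes per sample. The sketch below is a plain-Python re-derivation for a single example (the function names are illustrative, not TensorFlow's). With two classes and equal logits the loss is ln 2 ≈ 0.693, so minibatch losses near 0.69 indicate chance-level predictions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, one_hot_label):
    """Cross-entropy between a one-hot label and softmax(logits)."""
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(one_hot_label, probs))


# Equal logits: the model has no preference between the two classes
chance_loss = cross_entropy([0.0, 0.0], [1.0, 0.0])
# A confident and correct prediction gives a much smaller loss
confident_loss = cross_entropy([4.0, 0.0], [1.0, 0.0])
print(round(chance_loss, 4))         # 0.6931, i.e. ln 2
print(confident_loss < chance_loss)  # True
```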



Iter 25600, Minibatch Loss= 0.688734, Training Accuracy= 0.53846
Iter 51200, Minibatch Loss= 0.686731, Training Accuracy= 0.50962
Iter 76800, Minibatch Loss= 0.685059, Training Accuracy= 0.54808
Iter 102400, Minibatch Loss= 0.683110, Training Accuracy= 0.57692
Iter 128000, Minibatch Loss= 0.680413, Training Accuracy= 0.57692
Iter 153600, Minibatch Loss= 0.676176, Training Accuracy= 0.64423
Iter 179200, Minibatch Loss= 0.668738, Training Accuracy= 0.70192
Iter 204800, Minibatch Loss= 0.653560, Training Accuracy= 0.73077
Iter 230400, Minibatch Loss= 0.614810, Training Accuracy= 0.73077
Iter 256000, Minibatch Loss= 0.531648, Training Accuracy= 0.78846
Iter 281600, Minibatch Loss= 0.467554, Training Accuracy= 0.82692
Iter 307200, Minibatch Loss= 0.444708, Training Accuracy= 0.83654
Iter 332800, Minibatch Loss= 0.431049, Training Accuracy= 0.82692
Iter 358400, Minibatch Loss= 0.420798, Training Accuracy= 0.82692
Iter 384000, Minibatch Loss= 0.412461, Training Accuracy= 0.83654
Iter 409600, 