## Sequence Classification Problem

We will define a simple sequence classification problem to explore bidirectional LSTMs.

Each input is a sequence of random values between 0 and 1.

A binary label (0 or 1) is associated with each input value. Initially, the output values are all 0. Once the cumulative sum of the input values reaches the threshold, the output flips from 0 to 1 and stays at 1 for the rest of the sequence.

The threshold is one quarter of the sequence length.

For example, below is a sequence of 10 input timesteps (X):

```python
0.63144003 0.29414551 0.91587952 0.95189228 0.32195638 0.60742236 0.83895793 0.18023048 0.84762691 0.29165514
```

In this case the threshold is `2.5`. The cumulative sum first passes it at the fourth timestep (about 2.79), so the corresponding classification output (y) is:

```python
0 0 0 1 1 1 1 1 1 1
```
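
The labelling rule can be checked directly on the example above. This is a minimal sketch using NumPy's `cumsum`; the variable names are illustrative, and it applies the same comparison against the threshold as the `get_sequence` function in this section.

```python
import numpy as np

# Example input sequence from above (10 timesteps)
X = np.array([0.63144003, 0.29414551, 0.91587952, 0.95189228, 0.32195638,
              0.60742236, 0.83895793, 0.18023048, 0.84762691, 0.29165514])

# The threshold is one quarter of the sequence length
limit = len(X) / 4.0

# The label is 1 from the first timestep whose running total reaches the threshold
y = (np.cumsum(X) >= limit).astype(int)

print(limit)
print(y)
```

Because every input value is non-negative, the running total never decreases, so the label cannot flip back to 0 once it has become 1.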

In [13]:
import random as rand
from random import random
from numpy import array
from numpy import cumsum

In [17]:
# create a sequence classification instance
def get_sequence(sequence_length):
    # create a sequence of random numbers in [0, 1)
    X = array([random() for _ in range(sequence_length)])
    # calculate cut-off value to change class values
    limit = sequence_length / 4.0
    # determine the class outcome for each item in cumulative sequence
    y = array([0 if x < limit else 1 for x in cumsum(X)])
    # reshape input and output data to be suitable for LSTMs
    X = X.reshape(1, sequence_length, 1)
    y = y.reshape(1, sequence_length, 1)
    
    return X, y

# create n examples with random sequence lengths from 5 to 14
# (randrange excludes the stop value)
def get_examples(n):
    X_list = []
    y_list = []
    sequence_length_list = []
    for _ in range(n):
        sequence_length = rand.randrange(5, 15)
        X, y = get_sequence(sequence_length)
        X_list.append(X)
        y_list.append(y)
        sequence_length_list.append(sequence_length)
    
    return X_list, y_list, sequence_length_list

In [21]:
x_train, y_train, sequence_length_train = get_examples(100)
x_test, y_test, sequence_length_test = get_examples(30)
print(x_train[0])
print(y_train[0])
print(sequence_length_train[0])

[[[ 0.46206663]
  [ 0.09480065]
  [ 0.9923607 ]
  [ 0.20633283]
  [ 0.50758544]
  [ 0.58507118]
  [ 0.3267947 ]]]
[[[0]
  [0]
  [0]
  [1]
  [1]
  [1]
  [1]]]
7
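
Because the examples have different lengths, they cannot be stacked into one array as they are. One common approach, sketched here as an assumption rather than as the method this notebook goes on to use, is to zero-pad every example to the longest length in the batch and keep the true lengths alongside. The helper name `pad_batch` and the toy inputs are hypothetical; the input shapes match what `get_sequence` returns.

```python
import numpy as np

def pad_batch(X_list, y_list, lengths):
    """Zero-pad variable-length examples into dense batch arrays.

    Assumes each X and each y has shape (1, length, 1),
    as returned by get_sequence.
    """
    batch_size = len(X_list)
    max_len = max(lengths)
    X_batch = np.zeros((batch_size, max_len, 1), dtype=np.float32)
    y_batch = np.zeros((batch_size, max_len), dtype=np.int32)
    for i, (X, y, n) in enumerate(zip(X_list, y_list, lengths)):
        # copy the real timesteps; the remaining positions stay zero
        X_batch[i, :n, 0] = X.reshape(-1)
        y_batch[i, :n] = y.reshape(-1)
    return X_batch, y_batch, np.array(lengths, dtype=np.int32)

# Toy examples of length 3 and 2 (hand-written, not generated)
X_list = [np.array([0.2, 0.9, 0.4]).reshape(1, 3, 1),
          np.array([0.5, 0.8]).reshape(1, 2, 1)]
y_list = [np.array([0, 1, 1]).reshape(1, 3, 1),
          np.array([1, 1]).reshape(1, 2, 1)]

X_batch, y_batch, lengths = pad_batch(X_list, y_list, [3, 2])
print(X_batch.shape, y_batch.shape, lengths)
```

The padded positions carry a label of 0, so the true lengths must be passed to the model so that padding is ignored rather than treated as data.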


In [None]:
# https://stackoverflow.com/questions/39808336/tensorflow-bidirectional-dynamic-rnn-none-values-error
# https://www.tensorflow.org/api_docs/python/tf/nn/bidirectional_dynamic_rnn
# https://guillaumegenthial.github.io/sequence-tagging-with-tensorflow.html

In [None]:
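# bidirectional lstm + CRF
# requires TensorFlow 1.x (tf.placeholder and tf.contrib were removed in 2.x)
import tensorflow as tf

learning_rate = 0.001
training_epochs = 100
input_size = 1
batch_size = 32
num_units = 128 # the number of units in the LSTM cell
number_of_classes = 2 # labels are integer class ids (0 or 1), not one-hot

input_data = tf.placeholder(tf.float32, [None, None, input_size], name="input_data") # shape = (batch, time, input_size)
labels = tf.placeholder(tf.int32, shape=[None, None], name="labels") # shape = (batch, time)
seq_len = tf.placeholder(tf.int32, [None], name="seq_len") # true length of each sequence, shape = (batch,)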

lstm_fw_cell = tf.nn.rnn_cell.LSTMCell(num_units, forget_bias=1.0, state_is_tuple=True)
lstm_bw_cell = tf.nn.rnn_cell.LSTMCell(num_units, forget_bias=1.0, state_is_tuple=True)
(output_fw, output_bw), states = tf.nn.bidirectional_dynamic_rnn(cell_fw=lstm_fw_cell, 
                                                  cell_bw=lstm_bw_cell, 
                                                  inputs=input_data,
                                                  sequence_length=seq_len, 
                                                  dtype=tf.float32)

# A Bi-LSTM produces two separate outputs (forward and backward), so concatenate them along the feature axis
outputs = tf.concat([output_fw, output_bw], axis=2)

# fully connected layer
W = tf.get_variable(name="W", shape=[2 * num_units, number_of_classes],
                dtype=tf.float32)

b = tf.get_variable(name="b", shape=[number_of_classes], dtype=tf.float32,
                initializer=tf.zeros_initializer())

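outputs_flat = tf.reshape(outputs, [-1, 2 * num_units])
pred = tf.matmul(outputs_flat, W) + b
# restore the (batch, time, classes) layout; the time dimension is only known at run time
max_time = tf.shape(outputs)[1]
scores = tf.reshape(pred, [-1, max_time, number_of_classes])

# linear-chain CRF; seq_len masks the padded timesteps
log_likelihood, transition_params = tf.contrib.crf.crf_log_likelihood(scores, labels, seq_len)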

loss = tf.reduce_mean(-log_likelihood)

# training
optimizer = tf.train.AdamOptimizer(learning_rate)
train_op = optimizer.minimize(loss)
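
The CRF layer learns a transition matrix in addition to the per-timestep scores, so predicting labels means finding the best-scoring label sequence as a whole rather than taking an argmax at each timestep. TensorFlow 1.x provided `tf.contrib.crf.viterbi_decode` for this. The NumPy version below is an illustrative sketch of the same idea, not the library's implementation; the function name and the toy scores are assumptions made for the example.

```python
import numpy as np

def viterbi_decode_sketch(scores, transitions):
    """Return the highest-scoring tag sequence for one unpadded example.

    scores:      (seq_len, num_tags) per-timestep tag scores
    transitions: (num_tags, num_tags); transitions[i, j] is the score
                 of moving from tag i to tag j
    """
    seq_len, num_tags = scores.shape
    trellis = np.zeros((seq_len, num_tags))
    backpointers = np.zeros((seq_len, num_tags), dtype=np.int32)
    trellis[0] = scores[0]
    for t in range(1, seq_len):
        # candidate[i, j]: best path ending in tag i at t-1, then moving to tag j
        candidate = trellis[t - 1][:, None] + transitions
        backpointers[t] = candidate.argmax(axis=0)
        trellis[t] = scores[t] + candidate.max(axis=0)
    # trace back from the best final tag
    best_path = [int(trellis[-1].argmax())]
    for t in range(seq_len - 1, 0, -1):
        best_path.append(int(backpointers[t, best_path[-1]]))
    best_path.reverse()
    return best_path, float(trellis[-1].max())

# Toy scores for 3 timesteps and 2 tags; moving from tag 1 back to tag 0
# is heavily penalised, mirroring labels that never return to 0
toy_scores = np.array([[2.0, 0.0],
                       [0.6, 1.0],
                       [0.0, 2.0]])
toy_transitions = np.array([[0.5, 0.0],
                            [-5.0, 0.5]])

path, score = viterbi_decode_sketch(toy_scores, toy_transitions)
print(path, score)
```

In a real evaluation loop, each example's scores would first be sliced to its true length so that padded timesteps do not take part in decoding.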