In [1]:
import numpy as np
import tensorflow as tf

We will use TensorFlow to build the RNN and NumPy to prepare our own data. The code uses the TensorFlow 1.x API (tf.placeholder, tf.contrib).

In [2]:
num_examples = 2**10
seq_length = 10
                  
sequences = np.random.randint(2, size=(num_examples,seq_length)).astype('float32')

print("shape of input data:",sequences.shape)
print("first element:", sequences[0])

shape of input data: (1024, 10)
first element: [ 0.  0.  1.  0.  1.  0.  1.  1.  0.  1.]


Our objective is to classify sequences of length 10. For this purpose we create a dataset of 1024 (2^10) example arrays. Each array consists of 0's and 1's chosen at random, and the number of 1's determines the array's class. Because the array length is 10, there are exactly 11 possible classes: the count can be anything from 0 (all zeroes) to 10.
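To see why there are exactly 11 classes, and how unevenly they occur, here is a small sketch using only the standard library. It assumes each element is 0 or 1 with equal probability, which is what np.random.randint(2, ...) produces; the names class_probs and expected_counts are made up for this illustration.

```python
from math import comb

seq_length = 10
num_examples = 2**10

# Probability that a random 10-element sequence contains exactly k ones
# (binomial distribution with p = 0.5).
class_probs = [comb(seq_length, k) / 2**seq_length for k in range(seq_length + 1)]

# Expected number of examples per class in a dataset of 1024 sequences.
expected_counts = [p * num_examples for p in class_probs]

for k, n in enumerate(expected_counts):
    print(k, "ones:", n, "examples expected")
```

Under this assumption the classes 0 and 10 are expected only about once each, so the network sees very few examples of the extreme classes.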

In [3]:
target_classes = []
for seq in sequences:
    # Count the ones in this sequence; the count is its class label.
    target = (seq == 1).sum()
    target_classes.append(target)

target_classes = np.asarray(target_classes)
target_classes

array([5, 4, 6, ..., 5, 8, 8])

For a supervised learning method we need to know the answers, meaning the correct class of each training sequence. So we count the number of 1's in each array and collect the counts in a 1D array.
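The counting loop above can also be written without a Python loop. This is only a sketch of an alternative, not a change to the notebook; it builds its own random data with a fixed seed, and loop_counts and vector_counts are illustrative names.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed so the sketch is repeatable
sequences = rng.randint(2, size=(1024, 10)).astype('float32')

# Loop version: count the ones in each sequence separately.
loop_counts = np.asarray([(seq == 1).sum() for seq in sequences])

# Vectorized version: sum each row, then cast to integers
# so the result can be used as class indices.
vector_counts = sequences.sum(axis=1).astype('int64')

print(np.array_equal(loop_counts, vector_counts))
```

Summing works here because the arrays contain only 0's and 1's, so the row sum equals the number of 1's.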

Next we encode the label array with 1-hot encoding, which is the usual way to represent class labels for a classifier. One reason is that our network will not output a single class label; it outputs a probability for every class, and a 1-hot vector has the same form. A prediction might look like the made-up probabilities below, which sum to 1:

In [4]:
sample = np.random.exponential(2,10)
sample /= sample.sum()
sample

array([ 0.13317494,  0.01547636,  0.0434029 ,  0.05370335,  0.05800456,
        0.01255314,  0.07310165,  0.34522375,  0.07085784,  0.19450151])

I think you can see why 1-hot encoding is useful in this case: the label has the same form as the prediction. With 1-hot encoding we say "this is 100% an apple and 0% a banana" instead of just "this is an apple". Now let's see how we can encode our label array.

In [5]:
np.eye(11)

array([[ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.]])

The call above just creates an identity matrix of the given size. But if you look carefully, its rows are the 1-hot codes for the classes 0 to 10: row k has a 1 in position k. These are our 1-hot encoded class labels; we only need to pick the right row for each training label.

In [6]:
target_classes = np.eye(11)[target_classes]
target_classes[0]

array([ 0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.])

That's it! Indexing the identity matrix with the integer label array picks one row per label. This is why Python is awesome.
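Here is the same trick on a tiny example, together with the reverse direction. The three labels are hypothetical values chosen for illustration.

```python
import numpy as np

labels = np.array([5, 0, 10])      # hypothetical class labels
one_hot = np.eye(11)[labels]       # row i of the identity matrix is the code for class i

print(one_hot.shape)               # (3, 11)

# The position of the single 1 in each row recovers the original label.
decoded = one_hot.argmax(axis=1)
print(decoded)
```

The argmax step is the same operation the training code later uses to turn predicted probabilities back into class labels.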

In [7]:
trainX  = sequences[:1000]
trainY = target_classes[:1000]
testX = sequences[1000:]
testY = target_classes[1000:]

We've split our data into two parts: the first 1,000 examples are for training and the remaining 24 are for testing. We are done preparing the data. Now let's create our RNN model.

In [8]:
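batch_size = 100   # only used to set the number of training steps per epoch
n_hidden = 128     # number of hidden units in the LSTM cell
n_chunks = 28      # not used by the code shown here
# Inputs: [batch, time steps, features per step]
x = tf.placeholder("float32", [None, None, seq_length])
# 1-hot class labels, one row of 11 entries per example
y = tf.placeholder("int32", [None, 11])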

We need two placeholders: x for the input and y for the class labels. 'n_hidden' is the number of hidden units in the LSTM cell (the size of its state), not the number of hidden layers.
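To make the role of n_hidden concrete, here is a minimal NumPy sketch of a single step of a basic LSTM cell. It is an illustration under assumptions, not TensorFlow's implementation: the function lstm_step, the gate ordering, and the weight initialisation are all choices made for this sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, kernel, bias, forget_bias=1.0):
    """One step of a basic LSTM cell for a batch of inputs (illustrative)."""
    # Concatenate the input with the previous hidden state and
    # compute the pre-activations of all four gates in one matmul.
    z = np.concatenate([x_t, h_prev], axis=1) @ kernel + bias
    # Assumed ordering: input gate, candidate, forget gate, output gate.
    i, j, f, o = np.split(z, 4, axis=1)
    c = sigmoid(f + forget_bias) * c_prev + sigmoid(i) * np.tanh(j)
    h = sigmoid(o) * np.tanh(c)
    return h, c

input_dim, n_hidden, batch = 10, 128, 4
rng = np.random.RandomState(0)
kernel = rng.normal(scale=0.1, size=(input_dim + n_hidden, 4 * n_hidden))
bias = np.zeros(4 * n_hidden)

h = np.zeros((batch, n_hidden))   # hidden state
c = np.zeros((batch, n_hidden))   # cell state
x_t = rng.randint(2, size=(batch, input_dim)).astype('float32')

h, c = lstm_step(x_t, h, c, kernel, bias)
print(h.shape, c.shape)   # both (4, 128): one n_hidden-sized vector per example
```

Whatever the input size, the cell's output at each step is a vector of n_hidden values, which is why the output layer's weight matrix has n_hidden rows.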

In [9]:
weights = tf.Variable(tf.random_normal([n_hidden, 11]))
biases = tf.Variable(tf.random_normal([11]))

trainX = trainX.reshape(1,1000,seq_length)
trainX.shape

(1, 1000, 10)

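We've defined our weight and bias variables for the output layer, which maps the 128 LSTM outputs to 11 class scores. We've also reshaped trainX into a single batch of shape (1, 1000, 10) to match the 3D input placeholder.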

In [10]:
def RNN(x, weights, biases):
    
    # Define a lstm cell with tensorflow
    lstm_cell = tf.contrib.rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)

    # Get lstm cell output
    outputs, states = tf.nn.dynamic_rnn(lstm_cell, x, dtype=tf.float32)

    # Linear activation, using rnn inner loop last output
    return tf.matmul(outputs[-1], weights) + biases

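'outputs[-1]' selects the last entry along the first axis of 'outputs'. Note that tf.nn.dynamic_rnn returns outputs in [batch, time, n_hidden] order by default, so with our single batch of shape (1, 1000, 10) this is a (1000, n_hidden) matrix: one row per input array, which the final matmul turns into 1000 rows of 11 class scores. In other words, the LSTM steps through the 1000 arrays one after another, and each 10-element array is a single input vector.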
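The effect of the indexing is easy to check with a plain NumPy array of the same shape. The zero-filled array below is only a stand-in for the real outputs tensor, assuming the default [batch, time, n_hidden] layout.

```python
import numpy as np

batch, time_steps, n_hidden = 1, 1000, 128

# Stand-in for the RNN outputs: [batch, time, n_hidden].
outputs = np.zeros((batch, time_steps, n_hidden))

last_batch_element = outputs[-1]     # indexes the batch axis
last_time_step = outputs[:, -1]      # indexes the time axis

print(last_batch_element.shape)      # (1000, 128)
print(last_time_step.shape)          # (1, 128)
```

With a batch of size 1, indexing with [-1] simply drops the batch axis and keeps the output of every time step.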

In [11]:
def train(x):
    pred = RNN(x,weights,biases)
    output = pred
    softmax = tf.nn.softmax(output)
    index_of_max_prob = tf.argmax(softmax, 1)
    correct_labels =  tf.argmax(y, 1)
    
    cost = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(logits=pred,labels=y) )
    optimizer = tf.train.AdamOptimizer().minimize(cost)
    
    hm_epochs = 64
    with tf.variable_scope('training'):
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            print("Before training|Prediction for first 10 sequence:",index_of_max_prob.eval({x:trainX[0,0:10].reshape(1,10,10)}))
            for epoch in range(hm_epochs):
                epoch_loss = 0
                for _ in range(int(num_examples/batch_size)):
                    epoch_x, epoch_y = trainX,trainY
                    _, c = sess.run([optimizer, cost], feed_dict={x: epoch_x, y: epoch_y})
                    epoch_loss += c

                print('Epoch', epoch, 'completed out of',hm_epochs,'loss:',epoch_loss)

            correct = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
            accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
            print('Accuracy:',accuracy.eval({x:trainX, y:trainY}))
            
            print("After training|Prediction for first 10 sequence:",index_of_max_prob.eval({x:trainX[0,0:10].reshape(1,10,10)}))
            print("Correct labels for first 10 sequence",correct_labels.eval({y:trainY[:10]}))

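Here we train our network and make predictions with it. Each epoch runs 10 optimization steps (num_examples / batch_size, rounded down), and every step feeds the whole training set, so the printed loss is the sum of 10 full-batch losses. Accuracy is measured on the training data.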
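For intuition about the loss values, here is a NumPy sketch of the softmax cross-entropy and the argmax-based accuracy. The helper functions and the three example labels are written for this illustration; they mirror the idea of the TensorFlow calls but are not their implementation.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, one_hot_labels):
    # Mean over examples of -log(probability assigned to the true class).
    probs = softmax(logits)
    return -np.mean(np.sum(one_hot_labels * np.log(probs), axis=1))

n_classes = 11
labels = np.eye(n_classes)[[5, 4, 6]]        # three hypothetical 1-hot labels
uniform_logits = np.zeros((3, n_classes))    # a model with no preference at all

loss = cross_entropy(uniform_logits, labels)
print(loss, np.log(n_classes))               # the two values agree

# Accuracy: fraction of rows where the most probable class is the true class.
predicted = softmax(uniform_logits).argmax(axis=1)
accuracy = np.mean(predicted == labels.argmax(axis=1))
print(accuracy)
```

A model that spreads its probability evenly over 11 classes has a loss of ln(11), roughly 2.4 per step. Summed over the 10 steps of one epoch that is about 24, which is in the same range as the loss printed for epoch 0.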

In [12]:
train(x)

Before training|Prediction for first 10 sequence: [ 5 10  2  2  2 10 10 10  5 10]
Epoch 0 completed out of 64 loss: 23.3513219357
Epoch 1 completed out of 64 loss: 19.1296386719
Epoch 2 completed out of 64 loss: 18.3281830549
Epoch 3 completed out of 64 loss: 17.6780289412
Epoch 4 completed out of 64 loss: 16.9752991199
Epoch 5 completed out of 64 loss: 16.1118475199
Epoch 6 completed out of 64 loss: 15.2795728445
Epoch 7 completed out of 64 loss: 14.5651035309
Epoch 8 completed out of 64 loss: 13.9359779358
Epoch 9 completed out of 64 loss: 13.3870819807
Epoch 10 completed out of 64 loss: 12.9158462286
Epoch 11 completed out of 64 loss: 12.9056702852
Epoch 12 completed out of 64 loss: 12.3442387581
Epoch 13 completed out of 64 loss: 11.9593622684
Epoch 14 completed out of 64 loss: 11.6067547798
Epoch 15 completed out of 64 loss: 11.3422005177
Epoch 16 completed out of 64 loss: 11.0622347593
Epoch 17 completed out of 64 loss: 10.7403671741
Epoch 18 completed out of 64 loss: 10.43338215