How to implement a deep bidirectional LSTM? #1629

Closed
udani969 opened this issue Feb 3, 2016 · 18 comments

@udani969

udani969 commented Feb 3, 2016

I am trying to implement an LSTM-based speech recognizer. So far I have been able to set up a bidirectional LSTM (I think it is working as a bidirectional LSTM) by following the example in the Merge layer. Now I want to add another bidirectional LSTM layer, which would make it a deep bidirectional LSTM. But I am unable to figure out how to connect the output of the previously merged two layers into a second set of LSTM layers. I don't know whether this is possible with Keras. I hope someone can help me with this.

The code for my single-layer bidirectional LSTM is as follows:

from keras.models import Sequential
from keras.layers import Activation, LSTM, Merge, TimeDistributedDense
from keras.optimizers import SGD

left = Sequential()
left.add(LSTM(output_dim=hidden_units, init='uniform', inner_init='uniform',
               forget_bias_init='one', return_sequences=True, activation='tanh',
               inner_activation='sigmoid', input_shape=(99, 13)))
right = Sequential()
right.add(LSTM(output_dim=hidden_units, init='uniform', inner_init='uniform',
               forget_bias_init='one', return_sequences=True, activation='tanh',
               inner_activation='sigmoid', input_shape=(99, 13), go_backwards=True))

model = Sequential()
model.add(Merge([left, right], mode='sum'))

model.add(TimeDistributedDense(nb_classes))
model.add(Activation('softmax'))

sgd = SGD(lr=0.1, decay=1e-5, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
print("Train...")
model.fit([X_train, X_train], Y_train, batch_size=1, nb_epoch=nb_epoches, validation_data=([X_test, X_test], Y_test), verbose=1, show_accuracy=True)

Dimensions of my x and y values are as follows.

(100, 'train sequences')
(20, 'test sequences')
('X_train shape:', (100, 99, 13))
('X_test shape:', (20, 99, 13))
('y_train shape:', (100, 99, 11))
('y_test shape:', (20, 99, 11))

@farizrahman4u
Contributor

#1282 will help. It works only for Theano, though.

@farizrahman4u
Contributor

Or you could simply use the following fork function to make 2 copies of your merged layer:

def fork(model, n=2):
    # Return n Sequential models that all contain `model` as their first layer,
    # so the same merged output can feed several downstream branches.
    forks = []
    for i in range(n):
        f = Sequential()
        f.add(model)
        forks.append(f)
    return forks

left = Sequential()
left.add(LSTM(output_dim=hidden_units, init='uniform', inner_init='uniform',
               forget_bias_init='one', return_sequences=True, activation='tanh',
               inner_activation='sigmoid', input_shape=(99, 13)))
right = Sequential()
right.add(LSTM(output_dim=hidden_units, init='uniform', inner_init='uniform',
               forget_bias_init='one', return_sequences=True, activation='tanh',
               inner_activation='sigmoid', input_shape=(99, 13), go_backwards=True))

model = Sequential()
model.add(Merge([left, right], mode='sum'))

# Add second bidirectional LSTM layer

left, right = fork(model)

left.add(LSTM(output_dim=hidden_units, init='uniform', inner_init='uniform',
               forget_bias_init='one', return_sequences=True, activation='tanh',
               inner_activation='sigmoid'))

right.add(LSTM(output_dim=hidden_units, init='uniform', inner_init='uniform',
               forget_bias_init='one', return_sequences=True, activation='tanh',
               inner_activation='sigmoid',  go_backwards=True))

# Rest of the stuff as it is

model = Sequential()
model.add(Merge([left, right], mode='sum'))

model.add(TimeDistributedDense(nb_classes))
model.add(Activation('softmax'))

sgd = SGD(lr=0.1, decay=1e-5, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)
print("Train...")
model.fit([X_train, X_train], Y_train, batch_size=1, nb_epoch=nb_epoches, validation_data=([X_test, X_test], Y_test), verbose=1, show_accuracy=True)

It would be better to use the Bidirectional wrapper or the Graph for this sort of stuff.

@udani969
Author

udani969 commented Feb 3, 2016

Wow, it worked. I used the fork method because some checks were not successful with the wrapper approach. I only just got it to work. Thanks a lot for the support.

@udani969 udani969 closed this as completed Feb 3, 2016
@talentlei

talentlei commented May 3, 2016

@farizrahman4u I used your code above and got a model, but when I load the model and test it, I get the following error:

File "BLSTM_NER.py", line 1058, in
test()
File "BLSTM_NER.py", line 1038, in test
ner.rnn_test(resfile,model_file,weights)
File "BLSTM_NER.py", line 943, in rnn_test
out = model.predict([self.X_test,self.X_test],batch_size=batch_size)
File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 693, in predict
return self._predict_loop(self._predict, X, batch_size, verbose)[0]
File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 356, in _predict_loop
batch_outs = f(ins_batch)
File "/usr/local/lib/python2.7/dist-packages/keras/backend/theano_backend.py", line 448, in call
return self.function(*inputs)
File "/home/cl/download/Theano/theano/compile/function_module.py", line 845, in call
self.inv_finder[c]))
TypeError: Missing required input: <TensorType(float32, 3D)>

My test code is as follows:

    print "load model"
       model = model_from_json(open(my_model).read())
       model.load_weights(weights)
       print "load model finish" 
       out = model.predict([self.X_test,self.X_test],batch_size=batch_size)

Why am I getting this error? Can you help me? Thanks!

@Windy-Ground

https://github.com/fchollet/keras/blob/master/examples/imdb_bidirectional_lstm.py

@vinayakumarr

I was trying @farizrahman4u's example of a deep bidirectional LSTM on my dataset, which has 50000 rows and 20 columns (19 features and 1 class label), and

X_train = sequence.pad_sequences(X_train, maxlen=100)
X_test = sequence.pad_sequences(X_test, maxlen=100)

I am getting the following error. I know it is because of the dimensions/shape passed to the model.fit function, but I don't know how to resolve it.
[screenshot of the error]

@farizrahman4u
Contributor

The problem is with the shape of your input data. The error message is pretty clear: the LSTM needs 3D data, but you are providing 2D. The example I provided above is obsolete; use the functional API instead.
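
For reference, a minimal sketch (not code from this issue) of a two-layer bidirectional LSTM written with the Keras 1.x functional API, using the shapes from the original post (99 timesteps, 13 features, 11 classes); hidden_units is an illustrative placeholder:

from keras.models import Model
from keras.layers import Input, LSTM, Dense, TimeDistributed, merge

hidden_units = 128  # illustrative size, not from the original post
nb_classes = 11     # per-timestep classes, matching the y_train shape (100, 99, 11)

inputs = Input(shape=(99, 13))

# First bidirectional layer: one LSTM forwards, one backwards, outputs summed.
fwd_1 = LSTM(hidden_units, return_sequences=True)(inputs)
bwd_1 = LSTM(hidden_units, return_sequences=True, go_backwards=True)(inputs)
layer_1 = merge([fwd_1, bwd_1], mode='sum')

# Second bidirectional layer stacked on the merged output.
fwd_2 = LSTM(hidden_units, return_sequences=True)(layer_1)
bwd_2 = LSTM(hidden_units, return_sequences=True, go_backwards=True)(layer_1)
layer_2 = merge([fwd_2, bwd_2], mode='sum')

# Per-timestep softmax for many-to-many labelling.
outputs = TimeDistributed(Dense(nb_classes, activation='softmax'))(layer_2)

model = Model(input=inputs, output=outputs)
model.compile(loss='categorical_crossentropy', optimizer='sgd')

Note that with a raw go_backwards=True LSTM, the backward output may still be in reversed time order depending on the Keras version (a point raised further down this thread); the built-in Bidirectional wrapper takes care of that detail.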

@9thDimension

9thDimension commented Aug 7, 2016

@farizrahman4u When you say "functional API", what do you mean exactly?

I saw this syntax here:
model.add(Bidirectional(LSTM(10, input_shape=(5, 10), return_sequences=True)))
But I don't know which package to import the Bidirectional() class from

and this syntax here:
backwards = LSTM(64, go_backwards=True)(embedded)
But then I'm not exactly sure how to make a multi-layer bidirectional LSTM (use the forking approach you described above on Feb 3rd?)

P.S. I want many-to-many sequence labelling, so where do I need to put the return_sequences=True flags?

@farizrahman4u
Contributor

Google for the Keras functional API. The Bidirectional wrapper is from my seq2seq library.

@9thDimension

9thDimension commented Aug 7, 2016

@farizrahman4u Oh it's part of the seq2seq library I see.

Is this the correct usage to make a 2-layer bidirectional LSTM to output a category prediction for every input character?

Input chars are 43-dimensional, and there are 5 possible output categories.

from keras.models import Sequential
from keras.layers import Activation, LSTM, Merge, TimeDistributedDense
from keras.optimizers import SGD

def fork(model, n=2):
    forks = []
    for i in range(n):
        f = Sequential()
        f.add(model)
        forks.append(f)
    return forks

# First bidirectional LSTM layer

forward = Sequential()
forward.add(LSTM(output_dim=512, input_shape=(50, 43), return_sequences=True))
backward = Sequential()
backward.add(LSTM(output_dim=512, input_shape=(50, 43), return_sequences=True, go_backwards=True))

model = Sequential()
model.add(Merge([forward, backward], mode='concat'))


# Second bidirectional LSTM layer

forward_2, backward_2 = fork(model)

forward_2.add(LSTM(output_dim=512, input_shape=(50, 512), return_sequences=True))
backward_2.add(LSTM(output_dim=512, input_shape=(50, 512), return_sequences=True, go_backwards=True))

model = Sequential()
model.add(Merge([forward_2, backward_2], mode='concat'))


# Softmax decision layer

model.add(TimeDistributedDense(output_dim=5))
model.add(Activation('softmax'))


# Optimizer function

sgd = SGD(lr=0.1, decay=1e-5, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)

print("Train...")
model.fit([X_train, X_train], Y_train, batch_size=1, nb_epoch=nb_epoches, validation_data=([X_test, X_test], Y_test), verbose=1, show_accuracy=True)

Also, for this type of architecture, do the inputs have to "overlap" like so:

x_0 = [0, 1, 2, 3, 4], y_0 = [A, B, C, D, E]
x_1 = [1, 2, 3, 4, 5], y_1 = [B, C, D, E, F]
x_2 = [2, 3, 4, 5, 6], y_2 = [C, D, E, F, G]

or not overlap like so:

x_0 = [0, 1, 2, 3, 4],      y_0 = [A, B, C, D, E]
x_1 = [5, 6, 7, 8, 9],      y_1 = [F, G, H, I, J]
x_2 = [10, 11, 12, 13, 14], y_2 = [K, L, M, N, O] 

@vinayakumarr

@farizrahman4u Before posting this I already knew the error I am getting is because of a dimension problem. I have a training dataset of size 390321 with 23 classes and a test dataset of 20000 (I also have the correct labels, which have 40). I am loading the train, test and correct-label datasets and trying to apply a deep bidirectional stateful LSTM.

train dataset size is 390321 × 41 (40 features plus 1 class label)
test dataset size is 20000 × 40
corrected label size is 20000 × 1

How do I reshape the dimensions and apply a deep bidirectional stateful LSTM?

@strin
Contributor

strin commented Sep 9, 2016

@farizrahman4u @9thDimension when running an LSTM in the reverse direction, shouldn't the output correspond to input_n, input_{n-1}, input_{n-2}, ..., input_1? In that case, when concatenating with the output from the forward direction, shouldn't we reverse it?

@farizrahman4u
Contributor

@strin I have added the Bidirectional wrapper to Keras; see the bidirectional LSTM example.

@williamjqk

The official manual can be referenced here: https://keras.io/layers/wrappers/#bidirectional
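
For later readers, a minimal sketch of a stacked bidirectional model using that built-in Bidirectional wrapper (layer sizes are illustrative; the input shape and class count follow the original post):

from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed, Bidirectional

nb_classes = 11  # per-timestep classes, as in the original post

model = Sequential()
# First bidirectional layer; return_sequences=True so the next layer receives the full sequence.
model.add(Bidirectional(LSTM(64, return_sequences=True), input_shape=(99, 13)))
# Second bidirectional layer stacked directly on top.
model.add(Bidirectional(LSTM(64, return_sequences=True)))
# Per-timestep softmax for many-to-many labelling.
model.add(TimeDistributed(Dense(nb_classes, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='sgd')

The wrapper runs the wrapped layer forwards and backwards and merges the two outputs (concatenation by default), so there is no need to build the forward and backward branches by hand.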

@grafael

grafael commented Aug 31, 2017

I'm afraid the Bidirectional wrapper will not work with the Keras functional API.
Any help with this sort of thing:

main_input = Input(shape=(100,), dtype='int32', name='main_input')
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)
lstm = LSTM(32)(x)
bidirectional = Bidirectional()(lstm)  # how should Bidirectional be instantiated?

@jojonki

jojonki commented Oct 28, 2017

@grafael

How about this? Bidirectional takes a layer as its first argument:
bidirectional = Bidirectional(LSTM(32))(x)
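
A hedged sketch completing the functional-API example above (the Dense output layer, loss, and optimizer are illustrative assumptions, not part of the original comment):

from keras.models import Model
from keras.layers import Input, Embedding, LSTM, Dense, Bidirectional

main_input = Input(shape=(100,), dtype='int32', name='main_input')
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)
# The wrapper takes the recurrent layer as its first argument and is then
# called on the incoming tensor like any other layer.
bidirectional = Bidirectional(LSTM(32))(x)
output = Dense(1, activation='sigmoid')(bidirectional)  # illustrative output head

model = Model(inputs=main_input, outputs=output)
model.compile(loss='binary_crossentropy', optimizer='adam')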

@ylmeng

ylmeng commented Oct 28, 2017

Doesn't the 'go_backwards' option reverse the output order too? So model.add(Merge([left, right], mode='sum')) does not make sense (you would have to flip one of them before adding)?

@Ap1075

Ap1075 commented May 15, 2018

@ylmeng
Yes, it is handled automatically. You don't have to flip it before merging, as far as I know.
