How to implement a deep bidirectional LSTM? #1629
Comments
#1282 will help. It works only for Theano, though. |
Or you could simply use the following fork function to make 2 copies of your merged layer (Keras 1 syntax; imports added for completeness):

```python
from keras.models import Sequential
from keras.layers import LSTM, Merge, TimeDistributedDense, Activation
from keras.optimizers import SGD

def fork(model, n=2):
    # make n Sequential wrappers that all share the same underlying model
    forks = []
    for i in range(n):
        f = Sequential()
        f.add(model)
        forks.append(f)
    return forks

# First bidirectional LSTM layer: one forward pass, one backward pass
left = Sequential()
left.add(LSTM(output_dim=hidden_units, init='uniform', inner_init='uniform',
              forget_bias_init='one', return_sequences=True, activation='tanh',
              inner_activation='sigmoid', input_shape=(99, 13)))
right = Sequential()
right.add(LSTM(output_dim=hidden_units, init='uniform', inner_init='uniform',
               forget_bias_init='one', return_sequences=True, activation='tanh',
               inner_activation='sigmoid', input_shape=(99, 13), go_backwards=True))

model = Sequential()
model.add(Merge([left, right], mode='sum'))

# Add second bidirectional LSTM layer
left, right = fork(model)
left.add(LSTM(output_dim=hidden_units, init='uniform', inner_init='uniform',
              forget_bias_init='one', return_sequences=True, activation='tanh',
              inner_activation='sigmoid'))
right.add(LSTM(output_dim=hidden_units, init='uniform', inner_init='uniform',
               forget_bias_init='one', return_sequences=True, activation='tanh',
               inner_activation='sigmoid', go_backwards=True))

# Rest of the stuff as it is
model = Sequential()
model.add(Merge([left, right], mode='sum'))
model.add(TimeDistributedDense(nb_classes))
model.add(Activation('softmax'))

sgd = SGD(lr=0.1, decay=1e-5, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)

print("Train...")
model.fit([X_train, X_train], Y_train, batch_size=1, nb_epoch=nb_epoches,
          validation_data=([X_test, X_test], Y_test), verbose=1,
          show_accuracy=True)
```
|
It would be better to use the Bidirectional wrapper or the |
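As a hedged sketch of that suggestion: in current tf.keras syntax the same two-layer, sum-merged bidirectional model might look like the following. `hidden_units` and `nb_classes` are assumed values (the thread never fixes `hidden_units`; `nb_classes = 11` matches the `(99, 11)` label shape quoted later in the thread).

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Bidirectional, TimeDistributed, Dense

hidden_units = 64   # assumed; the thread never states the value
nb_classes = 11     # matches the (99, 11) label shape quoted in the thread

model = Sequential([
    Input(shape=(99, 13)),
    # merge_mode='sum' mirrors Merge(..., mode='sum') in the recipe above
    Bidirectional(LSTM(hidden_units, return_sequences=True), merge_mode='sum'),
    Bidirectional(LSTM(hidden_units, return_sequences=True), merge_mode='sum'),
    # one softmax prediction per timestep, for many-to-many labelling
    TimeDistributed(Dense(nb_classes, activation='softmax')),
])
model.compile(loss='categorical_crossentropy', optimizer='sgd')
print(model.output_shape)  # (None, 99, 11)
```

Note that the wrapper creates and merges the backward copy itself, so no `fork`, `go_backwards`, or explicit `Merge` is needed.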
Wow, it worked. I used the fork method, because some checks failed under the wrapper approach. I only just got it to work. Thanks a lot for the support. |
@farizrahman4u I used your code above and got a model, but when I load the model and test it, I get the following error:
My test code is as follows:
How did I get this error? Can you help me? Thanks~ |
I was trying @farizrahman4u's example of a deep bidirectional LSTM on my dataset, which has 50000 rows and 20 columns (19 features and 1 class label), with X_train = sequence.pad_sequences(X_train, maxlen=100). I am getting the following error. I know it is because of the dimension shape in the model.fit call, but I don't know how to resolve it. |
The problem is the shape of your input data. The error message is pretty clear: an LSTM needs 3-D data, but you are providing 2-D. The example I provided above is obsolete; use the functional API instead. |
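As a minimal sketch of the 2-D vs. 3-D complaint (array names are assumptions, not from the thread): an LSTM expects `(samples, timesteps, features)`, so a flat feature table needs an explicit time axis, even a one-step one.

```python
import numpy as np

# stand-in for the (50000, 20) dataset after dropping the class-label column
X_train = np.random.rand(50000, 19)

# give every sample a length-1 time axis: (samples, timesteps=1, features)
X_train_3d = X_train.reshape(X_train.shape[0], 1, X_train.shape[1])
print(X_train_3d.shape)  # (50000, 1, 19)
```

A one-step sequence is rarely what you actually want for an RNN, but it is the smallest change that satisfies the shape check; windowing consecutive rows into longer sequences is the usual alternative.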
@farizrahman4u When you say "functional API", what do you mean exactly? I saw this syntax here: and this syntax here: P.S. I want many-to-many sequence labelling, so where do I need to put the |
Google for the Keras functional API. The Bidirectional wrapper is from my seq2seq library. |
@farizrahman4u Oh, it's part of the seq2seq library, I see. Is this the correct usage to make a 2-layer bidirectional LSTM that outputs a category prediction for every input character? Input chars are 43-dimensional, and there are 5 possible output categories.
Also, for this type of architecture, do the inputs have to "overlap" like so:
or not overlap like so:
|
@farizrahman4u Before posting, I already knew the error I'm getting is due to a dimension problem. I have a training set of 390321 rows with 23 classes and a test set of 20000 rows, for which I also have the correct labels. The training set is 390321 x 41 (40 features plus one class-label column). How do I reshape the data and apply a deep bidirectional stateful LSTM to it? |
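One common convention for getting such a table into `(samples, timesteps, features)` shape (an assumption here, not something the thread prescribes) is a sliding window of T consecutive rows, labelling each window with the class of its last row:

```python
import numpy as np

def make_windows(X, y, T=100):
    """Stack T consecutive rows into one sequence; label each window with
    the class of its last row (one assumed convention among several)."""
    Xw = np.stack([X[i:i + T] for i in range(len(X) - T + 1)])
    yw = y[T - 1:]
    return Xw, yw

# small stand-in for the 390321 x 41 table: 40 feature columns + label column
X = np.random.rand(500, 40)
y = np.random.randint(0, 23, size=500)
Xw, yw = make_windows(X, y, T=100)
print(Xw.shape, yw.shape)  # (401, 100, 40) (401,)
```

For a stateful LSTM the additional constraint is that the batch size must be fixed and consecutive batches must continue the same sequences, so the window layout has to respect batch boundaries as well.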
@farizrahman4u @9thDimension When running an LSTM in the reverse direction, shouldn't the outputs correspond to input_n, input_{n-1}, input_{n-2}, ..., input_1? In that case, when concatenating with the output from the forward direction, shouldn't we reverse it first? |
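A toy sketch of the alignment question, with a cumulative sum standing in for the recurrence: if the backward pass emits its outputs in reversed time order (as the question suggests), they must be flipped before a step-wise merge so that step t of both directions refers to the same input_t.

```python
import numpy as np

def toy_rnn(x):
    # stand-in for an RNN: each output depends on all inputs seen so far
    return np.cumsum(x, axis=0)

x = np.array([1.0, 2.0, 3.0, 4.0])
fwd = toy_rnn(x)             # aligned with input_1 .. input_n
bwd = toy_rnn(x[::-1])       # aligned with input_n .. input_1
bwd_aligned = bwd[::-1]      # flip so step t lines up with input_t again
bidir = np.stack([fwd, bwd_aligned], axis=-1)  # per-step concatenation
print(fwd)          # [ 1.  3.  6. 10.]
print(bwd_aligned)  # [10.  9.  7.  4.]
```

The Keras Bidirectional wrapper performs this flip internally, which is one reason it is preferable to hand-wiring go_backwards plus a merge.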
@strin I have added the Bidirectional wrapper to Keras; see the bidirectional LSTM example. |
The official manual can be referenced here: https://keras.io/layers/wrappers/#bidirectional |
I'm afraid that the Bidirectional wrapper will not work with the Keras functional API:

```python
main_input = Input(shape=(100,), dtype='int32', name='main_input')
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)
lstm = LSTM(32)(x)
bidirectional = Bidirectional()(lstm)  # how should Bidirectional be instantiated?
```
|
How about this? Bidirectional takes a layer as its first argument. |
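A sketch of that usage in current tf.keras syntax (shapes taken from the snippet above; `input_length` dropped, as it is not needed here):

```python
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Embedding, LSTM, Bidirectional

main_input = Input(shape=(100,), dtype='int32', name='main_input')
x = Embedding(input_dim=10000, output_dim=512)(main_input)
# the layer to run in both directions goes in as the first argument;
# the wrapper creates and manages the backward copy itself
bidirectional = Bidirectional(LSTM(32))(x)  # default merge_mode='concat' -> 64 units
model = Model(main_input, bidirectional)
print(model.output_shape)  # (None, 64)
```

For per-timestep (many-to-many) labelling, the wrapped LSTM would additionally need `return_sequences=True`.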
Doesn't the go_backwards option reverse the output order too? So |
@ylmeng |
I am trying to implement an LSTM-based speech recognizer. So far I have set up a bidirectional LSTM (I think it is working as one) by following the Merge layer example. Now I want to add another bidirectional LSTM layer, making it a deep bidirectional LSTM, but I cannot figure out how to feed the output of the previously merged two layers into a second set of LSTM layers. I don't know whether this is possible with Keras. I hope someone can help me.
The code for my single-layer bidirectional LSTM is as follows:
The dimensions of my x and y values are as follows:
100 train sequences
20 test sequences
X_train shape: (100, 99, 13)
X_test shape: (20, 99, 13)
y_train shape: (100, 99, 11)
y_test shape: (20, 99, 11)