
LSTM Autoencoder #1401

Closed
dpappas opened this issue Jan 4, 2016 · 18 comments

dpappas commented Jan 4, 2016

Hello everyone, and happy new year!

I am trying to create an LSTM autoencoder, as shown in the image below.

[Image: the_cat_sat_autoencoder]

The encoder consumes the input "the cat sat",
and creates a vector depicted as the big red arrow.

The decoder takes this vector and tries to reconstruct the sequence
given the position in the sentence.

I would like to save this vector (big red arrow) to use it on another model.

The code I have written so far is the following:

from keras.layers import containers
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, AutoEncoder
import numpy as np
from keras.layers.recurrent import LSTM

train_x = [
    [[1, 3], [1, 3]],
    [[2, 4], [2, 4]],
    [[3, 5], [3, 5]]
]
train_x = np.array(train_x)

encoder = containers.Sequential([LSTM(output_dim=5, input_dim=2, activation='tanh', return_sequences=True)])
decoder = containers.Sequential([LSTM(output_dim=2, input_dim=5, activation='tanh', return_sequences=True)])
autoencoder = Sequential()
autoencoder.add(AutoEncoder(encoder=encoder, decoder=decoder, output_reconstruction=False))
autoencoder.compile(loss='mean_squared_error', optimizer='sgd')
autoencoder.fit(train_x,train_x, nb_epoch=10)

It is not clear to me whether the code above does what I am asking for.
If I do not use return_sequences=True, it yields an error.

Should I use a graph model to do exactly what I am asking for?

Thank you in advance for your help.

fchollet commented Jan 4, 2016

What you posted does do what your figure describes.

You don't need an AutoEncoder layer to achieve this. You could simply do:

m = Sequential()
m.add(LSTM(5, input_dim=2, return_sequences=True))
m.add(LSTM(5, return_sequences=True))

Also don't train RNNs with SGD. Use RMSprop instead.
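
For reference, a complete runnable version of that suggestion might look like the sketch below (Keras 1.x-era Sequential API, matching the rest of this thread). Note that, unlike the snippet above, the second LSTM is given 2 units here so the reconstruction matches the 2-dim input when fitting on (train_x, train_x), and RMSprop is used as advised:

from keras.models import Sequential
from keras.layers.recurrent import LSTM
import numpy as np

train_x = np.array([
    [[1, 3], [1, 3]],
    [[2, 4], [2, 4]],
    [[3, 5], [3, 5]],
])

m = Sequential()
m.add(LSTM(5, input_dim=2, return_sequences=True))  # encoder: 2-dim inputs -> 5-dim hidden states
m.add(LSTM(2, return_sequences=True))                # decoder: back to the 2-dim input space
m.compile(loss='mean_squared_error', optimizer='rmsprop')
m.fit(train_x, train_x, nb_epoch=10)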

dpappas commented Jan 5, 2016

Thank you very much.

I assume I can then save the trained weights of the first LSTM using
m.layers[0].get_weights()

lemuriandezapada commented:

In my experience Adam does better than RMSprop

dpappas commented Jan 15, 2016

Hello again

I would like to get the outputs of the first layer of the following model:


from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import containers
from keras.layers.core import Dense, AutoEncoder, TimeDistributedDense, Activation
from keras.optimizers import RMSprop
from keras.utils import np_utils
from keras.layers.recurrent import LSTM

import numpy as np

LSTM_size_1 = 5

data = [
    [[1, 3]],
    [[2, 4]],
    [[3, 5]],
    [[4, 6]],
    [[5, 7]],
    [[6, 8]],
    [[7, 9]],
    [[8, 10]],
    [[9, 11]]
]

data = np.array(data)

in_dim = data.shape[-1]
m = Sequential()
m.add(LSTM(LSTM_size_1, input_dim=in_dim, return_sequences=True))
m.add(LSTM(in_dim, return_sequences=True))
m.add(Activation('linear'))
m.compile(loss='mse', optimizer='RMSprop')
m.fit(data,data, nb_epoch=2)

When I type

m.layers[0].get_output()

I get

DimShuffle{1,0,2}.0

Is there a way to get the final output as a numpy array, instead of the weights?

I believe the output is something like this:

O_t = σ(W_o · X_t + U_o · h_{t-1} + V_o · C_t + b_o)
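
(For reference, that expression is the output gate; in the standard LSTM formulation the layer's actual output additionally combines the gate with the cell state, h_t = O_t ⊙ tanh(C_t).)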

Thank you in advance

jgc128 commented Jan 15, 2016

Hi,

You can save the weights of the first LSTM, create a separate model with only one LSTM layer, and set the weights of that LSTM to your saved weights. After that you can use the predict method to get the output of the first LSTM.
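
For the model posted earlier in this thread, that might look roughly like the sketch below (old Sequential API; embedder and first_layer_output are illustrative names, and m, LSTM_size_1, in_dim and data refer to the snippet above):

encoder_weights = m.layers[0].get_weights()  # trained weights of the first LSTM

embedder = Sequential()
embedder.add(LSTM(LSTM_size_1, input_dim=in_dim, return_sequences=True))
embedder.compile(loss='mse', optimizer='rmsprop')
embedder.layers[0].set_weights(encoder_weights)  # copy the trained weights into the new layer

# predict() returns a numpy array of shape (samples, timesteps, LSTM_size_1)
first_layer_output = embedder.predict(data)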

dpappas commented Jan 16, 2016

Truthfully, this is not what I want.

I do not want to use the trained LSTM
as an input to another neural net.

I want to use the output of the LSTM as an embedding.

So I do not want 4 matrices (the trained weights of the LSTM),
but 1 matrix with dimensionality n×1, where n is the number of nodes in the LSTM.

This matrix is the output of the 1st LSTM, which was used as input to the
second LSTM, as shown with the red arrow in the picture.

jgc128 commented Jan 16, 2016

If you feed the same data to this new network with one LSTM layer, you will get exactly what you want as the result of the predictions. You can save these results and use them anywhere you want.

dpappas commented Jan 18, 2016

If there are 100 instances, there will be 100 autoencoders.

I want an autoencoder to overtrain on a specific instance and extract an embedding.

Think of it as compressing all the information of a text into a vector of size 10.

I want to use these 100 embeddings as input to another network (size: 100 × 10).

I cannot connect all LSTMs at the same time and feed the original data once more.

Nor could I connect one LSTM at a time.

I just want the output of the 1st layer as a numpy array.

How can I get it?

lemuriandezapada commented:

http://keras.io/faq/#how-can-i-visualize-the-output-of-an-intermediate-layer
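
For the model above, the approach from that FAQ entry boils down to compiling a backend function from the model input to the first layer's output. A sketch (the exact incantation differs between Keras versions; get_first_layer_output is an illustrative name):

from keras import backend as K

# Backend function mapping the model input to the first LSTM's output.
# (In older Keras versions the output tensor is m.layers[0].get_output(train=False) instead.)
get_first_layer_output = K.function([m.layers[0].input], [m.layers[0].output])

# Numpy array of shape (samples, timesteps, LSTM_size_1).
first_layer_output = get_first_layer_output([data])[0]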

dpappas commented Jan 18, 2016

Thank you.

This is what I was searching for!

dpappas closed this as completed Jan 18, 2016
MdAsifKhan commented:

@dpappas, I am also facing the same issue. I tried the above link and I get the attribute error
'LSTM' object has no attribute 'initial weights'
Could you please post the exact snippet of what you did?

GUR9000 commented Dec 24, 2016

@fchollet, using "return_sequences=True" does NOT produce what is described in the figure!
This will cause the "decoder" LSTM layer to use the output vector of the encoder at each time step instead of only the single final vector after processing the whole sequence (as shown in the figure). Or am I mistaken here?

ypxie commented May 3, 2017

@GUR9000 I think you are right. At every time step, the decoder needs to take the output from the previous step of the decoder, rather than the output of the encoder.

ScientiaEtVeritas commented:

I want to build an LSTM autoencoder.
My data looks like this, with shape (1200, 10, 5), which is (training_size, timesteps, input_dim):

array([[[0, 0, 0, 1, 0],
        [0, 0, 0, 0, 1],
        ...
        [0, 1, 0, 0, 0]],
        [[0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        ...
        [0, 0, 0, 0, 0]]])

Code:

from keras.layers.recurrent import LSTM
from keras.models import Sequential

model = Sequential()

encoded = LSTM(3, input_shape=(10,5))
model.add(encoded)

decoded = LSTM((10,5), return_sequences=True)
model.add(decoded)

But for the decoded step it returns ValueError: Input 0 is incompatible with layer lstm_5: expected ndim=3, found ndim=2.

Thank you in advance.

dpappas commented Jun 19, 2017

@ScientiaEtVeritas

Your encoded LSTM returns only the last output of the LSTM;
you need to change the encoded line to

encoded = LSTM(3, input_shape=(10, 5), return_sequences=True)

Finally, your decoded LSTM needs a proper number of units for the LSTM.
You are giving a tuple as the size; you need to change it to

decoded = LSTM(10, return_sequences=True)

or change the number 10 to the size you want.

I suggest you read the documentation and some explanation of LSTMs,
for example Understanding LSTM Networks.

ScientiaEtVeritas commented Jun 19, 2017

@dpappas: Thank you for your answer.
But for clarification, and as mentioned before, using return_sequences=True is not what is shown in the picture. My goal is actually a single, final vector that represents the whole sequence as well as possible. I don't think that is the case when using return_sequences=True.
There is another issue I found that explains why, and why return_sequences=False for the encoder is the actual way to go: #5138 (at the bottom).

dpappas commented Jun 19, 2017

@ScientiaEtVeritas

You are right.
I have not managed to do that with Keras.
You could do it with TensorFlow using seq2seq.

Maybe in Keras you could do it with the step function of the LSTM, or with a callback, to use the output of the decoder from the previous timestep as input at the next timestep.

NeilYager commented Aug 23, 2017

@dpappas @ScientiaEtVeritas

Perhaps this has the desired effect:

from keras.models import Sequential
from keras.layers import LSTM, RepeatVector

timesteps, input_dim = 10, 5

model = Sequential()
model.add(LSTM(3, input_shape=(timesteps, input_dim)))  # encode the whole sequence into a single 3-dim vector
model.add(RepeatVector(timesteps))                      # feed that vector to the decoder at every timestep
model.add(LSTM(input_dim, return_sequences=True))       # the last layer needs input_dim units to fit on (data, data)
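
Once a model along those lines is trained, the single encoding vector (the big red arrow in the original figure) can be read out by wrapping the encoder layer in a second model. This is only a sketch, assuming the Keras 2 functional API; encoder_model, embeddings and data (a float array of shape (samples, timesteps, input_dim)) are illustrative names, not from this thread:

from keras.models import Model

model.compile(loss='mse', optimizer='rmsprop')
model.fit(data, data, epochs=50)  # reconstruct the input sequences from the 3-dim code

# Read-out model that shares the trained encoder layer and stops at its output.
encoder_model = Model(inputs=model.inputs, outputs=model.layers[0].output)
embeddings = encoder_model.predict(data)  # shape (samples, 3): one vector per sequence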
