
LSTM: How to feed the output back to the input? #4068

Closed
wgmao opened this issue Oct 14, 2016 · 12 comments

Comments

@wgmao commented Oct 14, 2016

from keras.models import Sequential
from keras.layers import LSTM, TimeDistributed, Dense, Activation

model = Sequential()
model.add(LSTM(512, input_dim=4, return_sequences=True))
model.add(TimeDistributed(Dense(4)))
model.add(Activation('softmax'))

The input here is the one-hot representation of a string, and the dictionary size is set to 4; in other words, there are four types of chars in this string. The output is the probability distribution over the next char.

If the length of the input sequence is 1, the output dimension is 4 by 1. I just wonder whether I could feed the output back to the input and get an output sequence of arbitrary length (illustrated as follows). It may not be reasonable to plug the probabilities back in, but I just want to know whether this one-to-many structure can be implemented in Keras. Thanks.

Example:

input1 -(LSTM)-> output1
output1 -(LSTM)-> output2
output2 -(LSTM)-> output3

We could get a 4 by 3 output in the end.
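The loop described above can be sketched in pure NumPy with a toy stand-in for the trained model (the random weight matrix `W` and the softmax `step` function are made up for illustration; the real LSTM would take their place):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the trained model: any function mapping a
# length-4 vector to a length-4 probability vector (here, softmax
# of a random linear map).
W = rng.normal(size=(4, 4))

def step(x):
    z = W @ x
    e = np.exp(z - z.max())
    return e / e.sum()

x = np.eye(4)[0]          # one-hot first char
outputs = []
for _ in range(3):        # feed each output back as the next input
    x = step(x)
    outputs.append(x)

result = np.stack(outputs, axis=1)   # shape (4, 3): the "4 by 3 output"
```

Each column is a probability vector over the 4-char dictionary, one per generated step.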

@kgrm commented Oct 16, 2016

You have to do it in an external loop, as in

import numpy as np

seq = [np.random.rand(4)]
for i in range(n_iter):
    pred = model.predict(np.array(seq)[np.newaxis])  # shape (1, timesteps, 4)
    seq.append(pred[0, -1])  # feed the last-step prediction back in

@wgmao (Author) commented Oct 16, 2016

Thanks, kgrm. But what about the training process? If you add an external loop, I don't think model.fit() will work.

@kgrm commented Oct 16, 2016

It will, but you'll need to add a Masking layer to train it on arbitrary-length (e.g., zero-padded) sequences.
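As a rough illustration of what zero-padding plus masking amounts to (pure NumPy rather than the Keras Masking API; the example sequences are invented):

```python
import numpy as np

# Two made-up variable-length one-hot sequences over a 4-char dictionary.
seqs = [np.eye(4)[[0, 2]],        # length 2
        np.eye(4)[[1, 3, 0]]]     # length 3

# Zero-pad to a common length, as Keras expects for batched training.
max_len = max(len(s) for s in seqs)
batch = np.zeros((len(seqs), max_len, 4))
for i, s in enumerate(seqs):
    batch[i, :len(s)] = s

# Masking(mask_value=0.0) then skips timesteps whose features are all zero:
mask = ~np.all(batch == 0.0, axis=-1)
```

The padded timesteps contribute nothing to the loss, so sequences of different lengths can share one batch.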

@EderSantana (Contributor)

Actually, I think he will have to write his own custom layer to do that. See this DreamyRNN for example: https://github.com/commaai/research/blob/master/models/layers.py#L334-L397
It takes n frames as input and outputs n+m frames, where the last m frames are generated by feeding outputs back as input.

@kgrm commented Oct 16, 2016

That's not the case; you just have to reframe and rearrange your training data accordingly for the (n+1)-th-step prediction task.
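A minimal sketch of that reframing, assuming a made-up training string over the 4-char dictionary: each timestep's target is simply the next char, so inputs and targets are the same one-hot array shifted by one.

```python
import numpy as np

# Hypothetical training string encoded as char indices (dictionary size 4).
ids = np.array([0, 2, 1, 3, 0, 1])
onehot = np.eye(4)[ids]          # shape (6, 4)

# Next-step prediction pairs: input chars 0..n-2, target chars 1..n-1.
X = onehot[:-1][np.newaxis]      # shape (1, 5, 4)
y = onehot[1:][np.newaxis]       # shape (1, 5, 4)
```

Training on (X, y) with model.fit() teaches the network the one-step transition, which the external prediction loop then unrolls.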

@wgmao (Author) commented Oct 17, 2016

Thanks, EderSantana.

To kgrm: As I expect, only the first char is the input. In this scenario, I don't think the external loop will construct the right output, as illustrated below:

1st char -(LSTM)-> output1 (one set of LSTM parameters applied)
output1 -(LSTM)-> output2 (two sets of LSTM parameters composed with each other)

It's this composition that makes things complicated. Thanks for your reply.

@stale stale bot added the stale label May 23, 2017
@stale bot commented May 23, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs, but feel free to re-open it if needed.

@stale stale bot closed this as completed Jun 22, 2017
@gokhanntosun

I have the same problem. Is there any way around this issue?

@wgmao (Author) commented Jan 14, 2019

I switched to tensorflow and wrote everything from scratch.

@AtrCheema

@wgmao Could you please share the TensorFlow code you wrote here? Thanks.

@wgmao (Author) commented Dec 2, 2019

I referred to https://github.com/LantaoYu/SeqGAN/blob/e2b52fb6309851b14765290e8a972ccac09f1bec/target_lstm.py to write customized recurrent layers.
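For readers who can't open the link: the output-feedback recurrence used in that style of code can be sketched by hand (a plain tanh RNN cell with random, untrained weights here, purely for illustration; the referenced file implements a full LSTM in TensorFlow):

```python
import numpy as np

rng = np.random.default_rng(1)

V, H = 4, 8                           # vocab size, hidden size (made up)
Wx = rng.normal(scale=0.1, size=(H, V))
Wh = rng.normal(scale=0.1, size=(H, H))
Wo = rng.normal(scale=0.1, size=(V, H))

def generate(first_id, n_steps):
    """Unroll the recurrence, feeding each output back as the next input."""
    h = np.zeros(H)
    x = np.eye(V)[first_id]
    out = [first_id]
    for _ in range(n_steps):
        h = np.tanh(Wx @ x + Wh @ h)  # state update
        logits = Wo @ h
        nxt = int(np.argmax(logits))  # greedy choice of next char
        out.append(nxt)
        x = np.eye(V)[nxt]            # output becomes the next input
    return out

tokens = generate(0, 5)
```

Writing the loop yourself (in NumPy, raw TensorFlow, or a custom layer) is what makes the output-as-next-input wiring possible, since stock Keras layers only consume a fixed input sequence.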

@br3xk commented Jun 25, 2022

@wgmao May I know what changes you made to the referenced file (https://github.com/LantaoYu/SeqGAN/blob/e2b52fb6309851b14765290e8a972ccac09f1bec/target_lstm.py) to use the output of the prior step as the input of the next step? I am new to coding and LSTMs. It would be a great help if you could describe them specifically.
