Doubt about Stacked LSTMs #46

Closed
gabrevaya opened this issue Aug 30, 2022 · 0 comments

Hi! First of all, thanks a lot for FluxArchitectures.jl! :)

This is more a question than an issue, and it probably stems from my own misunderstanding. While checking the documentation, I realized that I might have been using a wrong implementation of a Stacked LSTM in my own code. However, I'm confused by your current implementation.

I was comparing the description of the Stacked LSTM from your blog with your current implementation, and I don't understand why you are not using the HiddenRecur anymore. It now seems like you are chaining the LSTMs in the regular way, matching the inner dimensions but not feeding the previous layer's hidden states and memories into the inner cells.
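
If I understand correctly, the current version behaves like a plain Chain of ordinary Flux LSTMs, roughly like this sketch (Flux 0.13 syntax; the layer sizes here are made up just for illustration):

using Flux

# Plain stacking: each layer only ever sees the output h of the layer
# below it; every layer's memory cell c stays internal to that layer.
model = Chain(LSTM(10 => 20), LSTM(20 => 20), LSTM(20 => 5))

x = rand(Float32, 10)   # a single input sample
y = model(x)            # the output of the top LSTM only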

Also, in

function (m::StackedLSTMCell)(x)
	out = m.chain(x)  # run the chained LSTM layers
	m.state = out     # stores only the final output, not any (h, c)
	return out
end

you are saving only the output in the state of the StackedLSTMCell, so I don't understand how this fixes the Flux limitation that you describe in your blog (Flux.jl's standard setup only allows feeding the output of one cell as the new input to the next). Or has Flux's behavior changed in the meantime, fixing this issue?
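
For context, this is essentially how Recur is defined in Flux's source (as of Flux 0.13), which is where that limitation comes from:

function (m::Recur)(x)
	# m.state is the (h, c) tuple for an LSTM cell, but only the
	# output y is returned, so a Chain of Recur layers never hands
	# (h, c) on to the next layer
	m.state, y = m.cell(m.state, x)
	return y
end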
