Doubt about Stacked LSTMs #46

Closed
gabrevaya opened this issue Aug 30, 2022 · 0 comments

Hi! First of all, thanks a lot for FluxArchitectures.jl! :)

This is more a question than an issue, and it probably stems from my own misunderstanding. While checking the documentation, I realized that I might have been using a wrong implementation of a Stacked LSTM in my own code. However, I'm confused by your current implementation.

I was comparing the description of the Stacked LSTM from your blog with your current implementation, and I don't understand why you are not using the HiddenRecur anymore. It now seems like you are chaining the LSTMs in the regular way, matching the inner dimensions but not feeding the previous layer's hidden states and memories into the inner cells.
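
If I understand correctly, the current version behaves like a plain Chain of ordinary Flux LSTMs, roughly like this sketch (Flux 0.13 syntax; the layer sizes here are made up just for illustration):

using Flux

# Plain stacking: each layer only ever sees the output h of the layer
# below it; every layer's memory cell c stays internal to that layer.
model = Chain(LSTM(10 => 20), LSTM(20 => 20), LSTM(20 => 5))

x = rand(Float32, 10)   # a single input sample
y = model(x)            # the output of the top LSTM only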

Also, in

function (m::StackedLSTMCell)(x)
	out = m.chain(x)  # run the chained LSTM layers
	m.state = out     # stores only the final output, not any (h, c)
	return out
end

you are saving only the output in the state of the StackedLSTMCell, so I don't understand how this fixes the Flux limitation that you describe in your blog (Flux.jl's standard setup only allows feeding the output of one cell as the new input to the next). Or has Flux's behavior changed in the meantime, fixing this issue?
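
For context, this is essentially how Recur is defined in Flux's source (as of Flux 0.13), which is where that limitation comes from:

function (m::Recur)(x)
	# m.state is the (h, c) tuple for an LSTM cell, but only the
	# output y is returned, so a Chain of Recur layers never hands
	# (h, c) on to the next layer
	m.state, y = m.cell(m.state, x)
	return y
end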
