Questions about the implementation #2
Comments
Thanks for the comments. Yes, you are right: there is a mistake on line 193, and it should be `rnn_h_enc`. We are still working on the latest version of the code. As for whether the last hidden states/memories should be copied, I think it is a subtle detail that is worth exploring.
Thanks for the answers. I am still a bit confused about the propagation of error gradients from the first hidden state/memory of the decoder to the last hidden state/memory of the encoder. From what I can tell, during the forward pass the initial hidden state/memory of the decoder is not copied from the last hidden state/memory of the encoder but is initialized as a random vector (evaluation, training). I agree that whether these vectors should be copied is more of a research question and should definitely be explored. What I don't quite understand is the apparent discrepancy between the forward and backward passes in the SNLI-attention/LSTMN.lua file.
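To illustrate the discrepancy: a backward pass must mirror its forward pass, so copying decoder-state gradients into the encoder is only correct if the forward pass actually copied the encoder state into the decoder. Here is a minimal Python sketch of this point (not the repo's Lua/Torch code; the scalar model and the names `forward`, `grad_w_e`, `fd` are made up for illustration), checked against finite differences:

```python
def forward(w_e, w_d, x, copy_state):
    """Toy encoder-decoder. Encoder state: h_enc = w_e * x.
    Decoder initial state h0 is either copied from h_enc or a
    fixed 'random' initialization (hypothetical stand-in: 0.7)."""
    h_enc = w_e * x
    h0 = h_enc if copy_state else 0.7
    y = w_d * h0          # decoder output
    return 0.5 * y * y    # loss

def grad_w_e(w_e, w_d, x, copy_state):
    """Manual backward pass that mirrors forward()."""
    h_enc = w_e * x
    h0 = h_enc if copy_state else 0.7
    y = w_d * h0
    dL_dy = y
    dL_dh0 = dL_dy * w_d
    # The gradient-copy step (analogous to copying drnn_*_dec[1] into
    # drnn_*_enc[max_length+1]) is only valid when the forward pass
    # copied the state; otherwise the encoder receives no gradient here.
    dL_dh_enc = dL_dh0 if copy_state else 0.0
    return dL_dh_enc * x

def fd(w_e, w_d, x, copy_state, eps=1e-6):
    """Finite-difference gradient of the loss w.r.t. w_e."""
    return (forward(w_e + eps, w_d, x, copy_state)
            - forward(w_e - eps, w_d, x, copy_state)) / (2 * eps)

for copy_state in (True, False):
    g = grad_w_e(1.3, -0.8, 0.5, copy_state)
    assert abs(g - fd(1.3, -0.8, 0.5, copy_state)) < 1e-5
```

With `copy_state=False`, the loss does not depend on `w_e` at all, so the finite-difference gradient is zero; copying the decoder-state gradient into the encoder in that case would produce a gradient the forward computation never created.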
Ah sorry, I see what you mean. There is clearly a problem here in this implementation, and I will fix it.
Thanks for sharing the research code along with the paper. I am reading the code in parallel with the paper, and I find it very useful for understanding details that are not explicitly mentioned in the paper. I am not sure if this is the right place to ask, but I have two questions regarding the implementation.

1. The module on line 193 takes `{rnn_alpha, rnn_h_dec}` as input. I don't quite understand why `rnn_alpha` is part of the input. Shouldn't the input be the hidden state vectors for the source and target sequences, i.e. `{rnn_h_enc, rnn_h_dec}`?
2. Why does the backward pass copy the gradients `drnn_c_dec[1]` and `drnn_h_dec[1]` to the gradients `drnn_c_enc[max_length+1]` and `drnn_h_enc[max_length+1]` on lines 222-223 of SNLI-attention/LSTMN.lua? After reading the paper and the rest of the implementation, I have the impression that the initial hidden state and memory vectors of the decoder are random vectors and do not depend on the final hidden state and memory vectors of the encoder.