I have a question about the weight matrix dimensions.
In the SRUCell code, I found k = 4 if n_in != out_size else 3.
But in the paper there are only three weight matrices: W, Wf, and Wr.
I also found that n_in is not equal to out_size when the layer number is 0, but I don't understand why k = 4. What are the weights other than W, Wf, and Wr?
k is the number of weight matrices, and therefore matrix multiplications, in one cell. Normally k = 3, as described in the paper. But when the input and output sizes don't match, the input is multiplied by an additional weight matrix in the highway connection so that its dimension matches the output.
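To make the rule above concrete, here is a minimal sketch in plain Python. It only mirrors the expression quoted in the question (k = 4 if n_in != out_size else 3). The helper names num_projections and fused_weight_shape are hypothetical, and the fused shape (n_in, out_size * k) is an assumption about how the k matrices could be stored side by side, not a statement about the actual SRU source.

```python
def num_projections(n_in: int, out_size: int) -> int:
    """Number of weight matrices applied to the input in one cell.

    Three matrices (W, Wf, Wr) are always needed. A fourth is added when
    the input must be projected to out_size for the highway connection.
    """
    return 4 if n_in != out_size else 3


def fused_weight_shape(n_in: int, out_size: int) -> tuple:
    """Shape of a single fused weight holding all k matrices.

    Assumption for illustration: the k matrices, each of shape
    (n_in, out_size), are concatenated along the output dimension.
    """
    k = num_projections(n_in, out_size)
    return (n_in, out_size * k)


if __name__ == "__main__":
    # First layer: embedding size differs from hidden size, so k = 4.
    print(num_projections(300, 512), fused_weight_shape(300, 512))
    # Later layers: input size equals output size, so k = 3.
    print(num_projections(512, 512), fused_weight_shape(512, 512))
```

With these example sizes, the first layer needs the extra projection because a 300-dimensional input cannot be added directly to a 512-dimensional output in the highway connection, while later layers can reuse the input as-is.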
Similar to nn.LSTM and nn.GRU, there will be two sub-RNN modules for two directions when bidirectional=True. The output of each sub-RNN is then concatenated into the final output, and hence the actual output dimension is n_out*2. With the highway connection, this becomes:
Hi @taolei87 ,
Below is the init code:
Below is the case when n_in is not equal to out_size:
Thanks