
MultiRNNCell implementation in Network.py is bugged. Possible solution included. #37

Closed
NekoApocalypse opened this issue Apr 4, 2018 · 1 comment


NekoApocalypse commented Apr 4, 2018

As mentioned here: https://stackoverflow.com/questions/47371608/cannot-stack-lstm-with-multirnncell-and-dynamic-rnn

If MultiRNNCell is implemented like this:

```python
cell_forward = tf.nn.rnn_cell.MultiRNNCell([gru_cell_forward] * settings.num_layers)
cell_backward = tf.nn.rnn_cell.MultiRNNCell([gru_cell_backward] * settings.num_layers)
```

The RNN cells inside the MultiRNNCell will all be the same object as the first cell, so every layer reuses the first cell's weights. This means that if the input size (word_embedding_size + pos_embedding_size*2 in this case) does not match gru_size, and num_layers > 1, TensorFlow will raise an error.

To be specific, TensorFlow builds the first GRU cell's weights to map the input to gru_size hidden units, and because the list repeats that same cell, every layer expects input of the original input size. When the second cell receives the first layer's output, which has size gru_size, the shapes no longer match and TensorFlow raises an error.
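For illustration, here is a minimal sketch that reproduces the mismatch (hypothetical sizes, TensorFlow 1.x API; depending on the version it surfaces as a dimension-mismatch or a variable-reuse error):

```python
import tensorflow as tf

input_size = 110   # hypothetical: word_embedding_size + pos_embedding_size * 2
gru_size = 230     # hypothetical hidden size, deliberately != input_size
num_layers = 2

gru_cell = tf.nn.rnn_cell.GRUCell(gru_size)
# Every layer is the *same* cell object, whose weights get built for
# input_size-dimensional input.
stacked = tf.nn.rnn_cell.MultiRNNCell([gru_cell] * num_layers)

inputs = tf.placeholder(tf.float32, [None, 50, input_size])
# Fails: layer 2 receives gru_size-dimensional input but reuses weights
# that were created for input_size-dimensional input.
outputs, state = tf.nn.dynamic_rnn(stacked, inputs, dtype=tf.float32)
```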

We can fix this by making the first GRUCell different from the others:

```python
cell_forward = tf.nn.rnn_cell.MultiRNNCell(
    [tf.nn.rnn_cell.GRUCell(gru_size)] + [gru_cell_forward] * (settings.num_layers - 1))
```
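Note that `[gru_cell_forward] * (settings.num_layers - 1)` still repeats one cell object across the remaining layers; the shapes now agree (those layers all map gru_size to gru_size), but they share weights. A minimal sketch of the more common alternative, building a fresh cell per layer (assuming the same settings object):

```python
# One fresh GRUCell per layer: each layer gets its own weights, and the
# first layer's input dimension is inferred from the actual input tensor.
cell_forward = tf.nn.rnn_cell.MultiRNNCell(
    [tf.nn.rnn_cell.GRUCell(gru_size) for _ in range(settings.num_layers)])
cell_backward = tf.nn.rnn_cell.MultiRNNCell(
    [tf.nn.rnn_cell.GRUCell(gru_size) for _ in range(settings.num_layers)])
```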

Or, we can add a dense layer to map the input to the size of the GRU cells, as sketched below.
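A minimal sketch of that alternative (hypothetical tensor names; `inputs_forward` stands in for the concatenated word and position embeddings):

```python
# Hypothetical: project the [batch, time, input_size] embeddings to gru_size,
# so that every stacked layer sees gru_size-dimensional input.
projected = tf.layers.dense(inputs_forward, gru_size)
outputs, state = tf.nn.dynamic_rnn(cell_forward, projected, dtype=tf.float32)
```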

THUCSTHanxu13 (Member) commented

We have reconstructed our package and toolkit. It now provides more neural models and selective mechanisms. You can just check the new code.
