About net topology and activation function in eesen #65
We use the standard activation functions for the (BI)LSTM; you can find more details in Section 2 of the following paper. The cell and hidden output dimensions are the same. The dimension can be set by changing "lstm_cell_dim" in run_ctc_xxx.sh; by default it is 320.
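As a minimal illustration of the point above, here is a single step of a vanilla LSTM in numpy, using the standard activations (logistic sigmoid for the gates, tanh for the cell candidate and output). The weight layout makes it explicit why the cell and hidden output share one dimension; the shapes and variable names here are illustrative, not Eesen's actual code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One step of a vanilla LSTM (no peepholes, for brevity).
    Gates use the logistic sigmoid; cell candidate and output use
    tanh. W has shape (4*dim, input_dim + dim), so the cell state
    and the hidden output necessarily share the same dimension."""
    dim = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0 * dim:1 * dim])   # input gate
    f = sigmoid(z[1 * dim:2 * dim])   # forget gate
    o = sigmoid(z[2 * dim:3 * dim])   # output gate
    g = np.tanh(z[3 * dim:4 * dim])   # cell candidate
    c = f * c_prev + i * g            # new cell state
    h = o * np.tanh(c)                # new hidden output
    return h, c

# lstm_cell_dim = 320, matching the default in run_ctc_xxx.sh;
# input_dim is an arbitrary illustrative feature dimension.
dim, input_dim = 320, 120
rng = np.random.default_rng(0)
W = rng.standard_normal((4 * dim, input_dim + dim)) * 0.01
b = np.zeros(4 * dim)
h, c = lstm_step(rng.standard_normal(input_dim),
                 np.zeros(dim), np.zeros(dim), W, b)
print(h.shape, c.shape)  # both (320,)
```

Because h is produced as o * tanh(c), its dimension is tied to that of c; separating them requires an extra projection, which is what the later comment on affine transforms is about.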
Thanks a lot!
When you said "sub-sampling", I assume you meant sub-sampling the frames. This is hard because we pad the utterances to the same length within a batch: your sub-sampling layer would need to figure out which frames are padding and do the backpropagation accordingly. You may simply try inserting an affine transform between two LSTM layers, which projects the LSTM outputs into a presumably lower-dimensional vector. But this may introduce training instability, as the projection layer is not recurrent. In its vanilla definition, an LSTM has the same cell and output dimension; Google's LSTM with a recurrent projection layer is not supported.
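The affine-transform workaround described above can be sketched as follows. This is a hypothetical numpy illustration of the idea only; in Eesen the component would be configured in the network prototype, and all names and dimensions here (a 320-dimensional LSTM output projected to 128) are assumptions for the example.

```python
import numpy as np

def affine_project(H, W, b):
    """Apply a (non-recurrent) affine transform to per-frame LSTM
    outputs, projecting each frame to a lower dimension before it
    is fed to the next LSTM layer."""
    return H @ W.T + b

T, lstm_dim, proj_dim = 50, 320, 128   # illustrative sizes
rng = np.random.default_rng(1)
H = rng.standard_normal((T, lstm_dim))           # first LSTM layer's outputs
W = rng.standard_normal((proj_dim, lstm_dim)) * 0.01
b = np.zeros(proj_dim)
P = affine_project(H, W, b)                      # input to the next LSTM layer
print(P.shape)  # (50, 128)
```

Note the projection acts independently on each frame, which is exactly why it carries no recurrent state and can destabilize training, unlike the recurrent projection in Google's LSTMP architecture mentioned above.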
Thanks! |
Hi, Yajie
What activation function is used in the BILSTM model in eesen? How can I change the activation function? And how can I set the cell dimension and the recurrent (output) dimension separately in eesen?
Thanks!