A Doubt In LocationSensitiveAttention #50
Comments
It helps convergence. You can see how the alignment improves when adding cumulative attention states in the 4th comment of this issue: keithito/tacotron#170 (comment). I have run a series of ablation studies to confirm it.
Hello @StevenZYj, thanks for reaching out! As stated by @begeekmyfriend, attention weights cumulation is a must for getting proper alignments. This was actually stated in the paper as well: while the entire attention mechanism is only described in a few words there, the references the authors provide gave us a sense of what they were talking about. What confirms the cumulation approach, I believe, is that in our ablation studies (by the way, great work @begeekmyfriend, you never cease to impress!) we found that without weights cumulation the decoder tends to repeat or ignore some subsequences.
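To make the mechanism concrete, here is a minimal NumPy sketch of one decoder step of location-sensitive attention with cumulative weights. All names and shapes (`location_sensitive_attention_step`, `attn_dim`, the random toy data) are illustrative assumptions, not the repository's actual API; the point is only that the convolutional location features are computed from the sum of all past alignments rather than the previous step's alignments alone.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def location_sensitive_attention_step(query, memory, cum_alignments,
                                      W_query, W_memory, W_location,
                                      filters, v):
    """One step of hybrid (content + location) attention.

    query:          (dec_dim,)      current decoder cell output
    memory:         (T, enc_dim)    encoder outputs
    cum_alignments: (T,)            sum of all previous alignments
    filters:        (n_filt, width) 1-D location convolution kernels
    """
    # Location features: convolve the CUMULATIVE alignments, so the
    # energy function sees which encoder steps were already attended to.
    f = np.stack([np.convolve(cum_alignments, k, mode="same")
                  for k in filters], axis=1)          # (T, n_filt)

    # Additive (Bahdanau-style) energies with an extra location term.
    energies = np.tanh(query @ W_query +              # (attn_dim,)
                       memory @ W_memory +            # (T, attn_dim)
                       f @ W_location) @ v            # (T,)

    alignments = softmax(energies)                    # (T,)
    context = alignments @ memory                     # (enc_dim,)

    # The cumulative-weights trick: the next attention state is the
    # running sum, which "fills in" positions the decoder has covered.
    return context, alignments, cum_alignments + alignments


# Toy run: 3 decoder steps over a 6-step encoder sequence.
rng = np.random.default_rng(0)
T, enc_dim, dec_dim, attn_dim, n_filt, width = 6, 8, 8, 16, 4, 3
memory = rng.normal(size=(T, enc_dim))
W_q = rng.normal(size=(dec_dim, attn_dim))
W_m = rng.normal(size=(enc_dim, attn_dim))
W_l = rng.normal(size=(n_filt, attn_dim))
filters = rng.normal(size=(n_filt, width))
v = rng.normal(size=attn_dim)

cum = np.zeros(T)
for step in range(3):
    query = rng.normal(size=dec_dim)
    ctx, align, cum = location_sensitive_attention_step(
        query, memory, cum, W_q, W_m, W_l, filters, v)
    print(f"step {step}: alignments={np.round(align, 3)}")
```

Without the running sum, the location features only reflect the last step, and nothing discourages the decoder from attending to the same encoder positions again (repeats) or skipping over them (ignored subsequences).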
@begeekmyfriend @Rayhane-mamah Thanks a lot! My bad, the point is actually covered in the Tacotron 2 paper.
Hi Rayhane, I'm currently looking into your LocationSensitiveAttention class and don't understand the reason for using cumulate_weights when calculating the next state. I can't find any reference to it in the original paper.
By the way, your work is fantastic :D Appreciate it a lot.
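For readers landing on this question: in TensorFlow-style attention mechanisms, whatever is returned as the next state is fed back in as the previous alignments at the following decoder step, so the `cumulate_weights` flag effectively reduces to one branch. A hedged sketch of what that toggle amounts to (variable names illustrative, not necessarily the repository's exact code):

```python
# Inside the attention mechanism's state update (sketch):
# `alignments` are this step's softmax weights; `previous_alignments`
# is whatever was returned as next_state at the previous step.
if cumulate_weights:
    next_state = alignments + previous_alignments  # running sum of weights
else:
    next_state = alignments                        # plain Bahdanau behavior
```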