Attention weights with partial flat line (non-english) #137
Comments
I think the problem is the representation of your symbols as text. You should get good alignment within 10k steps. If you don't, it means something has gone wrong. Yes, warm starting helps. I don't know what's happening, but yes, you can do inference with only 1 flow.

I guess so. I will try changing my symbols to ASCII and will come back with an update soon.

@Bahm9919 Especially
Or anything else, surprisingly?

I see your issue, will answer there.
Hi, I have been trying to train this model with a Thai dataset (1 speaker, ~5 hours).
After ~80k steps (batch size = 1, ~31 epochs), the attention weights turn out like this.
Is it normal to see partial flat lines like this? All the issues I looked through only show an entirely flat line or just a straight diagonal...
Or am I being too impatient? It's just 80k steps after all.
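One way to make "partial flat lines" concrete is to follow the peak of the attention map over decoder steps and report stretches where it stops advancing. The sketch below is a minimal illustration in plain Python, not part of the Flowtron code: the function names, the toy attention values, and the `min_len` threshold are all assumptions. Some repetition is expected, since one symbol usually spans several frames, so only unusually long runs would be suspicious.

```python
def attended_positions(attn):
    """For each decoder step (row), return the text index with the largest weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in attn]


def flat_runs(positions, min_len=3):
    """Return (start_step, end_step, text_index) for every run of at least
    min_len consecutive decoder steps whose peak stays on one text index."""
    runs = []
    start = 0
    for step in range(1, len(positions) + 1):
        # Close the current run at the end of the sequence or when the peak moves.
        if step == len(positions) or positions[step] != positions[start]:
            if step - start >= min_len:
                runs.append((start, step - 1, positions[start]))
            start = step
    return runs


# Toy attention map: 8 decoder steps x 4 text positions (hypothetical values).
attn = [
    [0.9, 0.1, 0.0, 0.0],
    [0.2, 0.8, 0.0, 0.0],
    [0.1, 0.7, 0.2, 0.0],
    [0.1, 0.6, 0.3, 0.0],
    [0.0, 0.6, 0.4, 0.0],
    [0.0, 0.1, 0.9, 0.0],
    [0.0, 0.0, 0.2, 0.8],
    [0.0, 0.0, 0.1, 0.9],
]

positions = attended_positions(attn)
print(positions)             # [0, 1, 1, 1, 1, 2, 3, 3]
print(flat_runs(positions))  # [(1, 4, 1)]
```

With a real model, `attn` would be the attention matrix exported for one utterance, and `min_len` would need tuning to the typical number of frames per symbol.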
Here's some additional info
![image](https://user-images.githubusercontent.com/43643389/137118802-9aaba686-e542-454e-936e-e316a76941e0.png)
(Is this even correct?)
![image](https://user-images.githubusercontent.com/43643389/137084726-be9627a3-931d-42b6-aa82-00480ac043c7.png)
The above result comes from warm starting the model from `flowtron_ljs.pt` with the `flow=1` config file (`speaker_embedding.weight` ignored).

Things I have done
- Also ignored `embedding.weight`, since it has a different shape during warm start.

Additional Questions
- As I understand it, training goes in three steps: first, `flow=1` until the attention aligns; second, the same but with `flow=2`; third, turn `attn_prior` off so that it attends to the speaker. What is the sign to look for during the third step? How do I know whether the model has attended?
- `ctc_loss` starts at 10k iterations. Do I need to change this? Does starting it earlier or later affect anything?
- Does warm starting from `flowtron_ljs` really help in learning a different language? I'm wondering which parts it helps with. The decoder?

Thank you for reading. I would really appreciate any answers or suggestions.
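The warm start described in this post, where `speaker_embedding.weight` and `embedding.weight` are left out because the new symbol set changes their shapes, can be sketched roughly as below. This is a generic illustration, not the repository's own warm-start routine: `filter_warmstart_state`, the `Param` stand-in, the `encoder.lstm.weight` name, and every shape are hypothetical. Only the `.shape` attribute is used, so the same idea would apply to real tensors.

```python
from collections import namedtuple

# Stand-in for a tensor: only the .shape attribute matters in this sketch.
Param = namedtuple("Param", ["shape"])


def filter_warmstart_state(pretrained, target, ignore=()):
    """Split pretrained entries into those safe to load and those skipped.

    An entry is skipped if its name is listed in `ignore`, is missing from
    the target model, or has a different shape in the target model.
    """
    kept, skipped = {}, []
    for name, value in pretrained.items():
        mismatch = (
            name in ignore
            or name not in target
            or tuple(value.shape) != tuple(target[name].shape)
        )
        if mismatch:
            skipped.append(name)
        else:
            kept[name] = value
    return kept, skipped


# Hypothetical shapes: the new symbol set changes the text embedding size.
pretrained = {
    "speaker_embedding.weight": Param((1, 128)),
    "embedding.weight": Param((185, 512)),
    "encoder.lstm.weight": Param((1024, 512)),
}
target = {
    "speaker_embedding.weight": Param((1, 128)),
    "embedding.weight": Param((90, 512)),
    "encoder.lstm.weight": Param((1024, 512)),
}

kept, skipped = filter_warmstart_state(
    pretrained, target, ignore=("speaker_embedding.weight",)
)
print(sorted(kept))  # ['encoder.lstm.weight']
print(skipped)       # ['speaker_embedding.weight', 'embedding.weight']
```

With PyTorch, the `kept` dictionary could then be passed to `model.load_state_dict(kept, strict=False)`, so that the skipped layers keep their fresh initialization.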