Spectrogram Loss Value is NaN #73
Comments
Have you added your data in those pipelines, is the dataset cache properly created, and are you using the pretrained models? What exactly is the configuration that you are running?
In this section I can't find the …
The way you integrated the data into the pipeline looks good, I don't see an issue there. For LJSpeech, data cleaning with the scorer should not be necessary, because the data is already pretty clean, so I suspect that the problem is not in the data, but that there is a mistake somewhere else. The hyperparameters are meant for testing, not necessarily for getting good results, but the loss should not become NaN even with the settings of the integration test. You're right about the acoustic_model missing; that part of the documentation is outdated. I will fix it with the next version. The acoustic model is now detected and loaded automatically.

IMS-Toucan/Utility/corpus_preparation.py, Line 40 in 1c581e0
I'm not sure where the problem lies, but you could try using this pipeline instead of the testing pipeline. The documentation is pretty outdated; I have been very sick for a long time recently and am still recovering, so everything is a bit behind at the moment. When I'm better I'll get back to updating the docs and prepare a new release.
I hope you get better soon.
And I'll try this thing and report again.
The result is still the same.
Is the loss NaN already at the first step? Or does it turn to NaN over time?
The loss is NaN at the first step.
Then it really sounds like there is a bad datapoint in the dataset that causes this problem, maybe a complete mismatch of text and audio. Have you checked for your subset of LJSpeech that the texts and audios you are using actually match? Maybe there was a small mistake somewhere and the index of text and audio has shifted. If everything seems alright with the data and there are no obvious mismatches of text and audio, have you tried the scorer again? Was there still a problem that kept you from using it with the pretrained multilingual model?
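One cheap sanity check along those lines: verify that every row of the transcript file still points at an audio file that exists. This is a minimal stdlib sketch assuming the standard LJSpeech layout (a pipe-separated `metadata.csv` next to a `wavs/` directory); the function name and paths are illustrative, not part of IMS-Toucan:

```python
from pathlib import Path

def find_mismatches(metadata_path, wav_dir):
    """Return the utterance IDs from metadata.csv whose audio file is
    missing, so a shifted text/audio index shows up immediately."""
    missing = []
    for line in Path(metadata_path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # LJSpeech rows look like: LJ001-0001|raw text|normalized text
        file_id = line.split("|", 1)[0]
        if not (Path(wav_dir) / f"{file_id}.wav").exists():
            missing.append(file_id)
    return missing
```

If this returns anything for the 1000-sample subset, the subset was probably cut inconsistently between the text and audio sides.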
After I used another computer, I didn't have this problem anymore. I don't know what the cause was; the data I use is exactly the same.
Would be interesting to know what caused this, but I'm happy to hear that it works now!
I'm trying to do some training and found that the spectrogram loss is NaN. After reading the FAQ again at https://github.com/DigitalPhonetics/IMS-Toucan#faq-:~:text=Loss%20turns%20to,use%20for%20TTS. I found that I should try using the scorer. I do it like this:

python3 run_training_pipeline.py integration_test --gpu_id 0
python3 run_scorer.py

but even now the result is still NaN, and I can't find the file best.py. Is this step correct? I'm trying to run this using 1000 LJSpeech samples. What should I do so that the spectrogram loss value is not NaN? For information, I'm using batch size 8 and lr=0.001.
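One cheap way to narrow this down is to guard the update step and log which batch first produces a non-finite loss. A minimal stdlib sketch; `safe_step` is a hypothetical helper, not part of IMS-Toucan:

```python
import math

def safe_step(loss_value, apply_update):
    """Apply the optimizer update only when the loss is finite.
    Returns False when the step was skipped because the loss
    was NaN or infinite, so the offending batch can be logged."""
    if not math.isfinite(loss_value):
        return False
    apply_update()
    return True
```

Called with the scalar loss of each batch, the first batch for which this returns False points directly at the datapoint (or hyperparameter setting) that blows up.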