Fix readme typo #1

Open
wants to merge 1 commit into
base: master
2 changes: 1 addition & 1 deletion README.md
@@ -105,7 +105,7 @@ You may have to change tfr_dir and model_dir to work on your settings.
- For fp16 settings, you need 1 week to train 1M steps with 4 V100 GPUs.
- I haven't tried fp32 training, so there might be some issues to train high quality models.
- As fp16 training is not robust enough (at now), I usually train FiLM enabled model and unabled model consequently and choose one which survives.
-- For a single speaker dataset(LJ Speech dataset), trained model vocoding quality is good enough compared to mel-spectrogram condtioned one.
+- For a single speaker dataset(LJ Speech dataset), trained model vocoding quality is good enough compared to mel-spectrogram conditioned one.
- For multi-speaker dataset(VCTK Corpus), disentangling between speaker identity and local condition does not work well (at now). I am investigating reasons though.
- The next step would be training Text-to-LatentCodes model(as Transformer) so that fully TTS is possible.
- If you're interested in this project, please improve models with me!