Hi, I have tested multi-GPU training and it works as long as you don't use the batch weighted sampler and have use accelerate set to true. I did not encounter any issues with LJSpeech. But when I try to train on a larger dataset (LibriTTS), I get NCCL watchdog timeout errors. I have tried setting os.environ["NCCL_BLOCKING_WAIT"] = "1", but without success. How can I disable the timeout? Precomputing the phonemes takes almost 45 minutes.
P.S. The LibriTTS formatter doesn't allow training to continue if audio files are missing from the dataset. I have made some modifications to it; if you want, I can open a PR.
Thank you!
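
For context on the timeout question, here is a minimal sketch of how the collective timeout is usually raised in plain PyTorch: by passing a `timeout` to `torch.distributed.init_process_group`. As far as I know, `NCCL_BLOCKING_WAIT` changes how the timeout is enforced rather than removing it. The helper names `nccl_timeout_for` and `init_process_group_with_timeout` are hypothetical, and whether the trainer used here exposes this setting is an assumption I have not verified. With Accelerate, the equivalent would presumably be passing `InitProcessGroupKwargs(timeout=...)` through `kwargs_handlers`.

```python
from datetime import timedelta


def nccl_timeout_for(precompute_minutes: float, safety_factor: float = 2.0) -> timedelta:
    """Return a collective timeout comfortably longer than the slowest step.

    Hypothetical helper: `precompute_minutes` is how long one rank may keep
    the others waiting (e.g. phoneme precomputation), and `safety_factor`
    adds headroom on top of that.
    """
    if precompute_minutes <= 0 or safety_factor < 1.0:
        raise ValueError("precompute_minutes must be > 0 and safety_factor >= 1")
    return timedelta(minutes=precompute_minutes * safety_factor)


def init_process_group_with_timeout(timeout: timedelta) -> None:
    """Initialise the NCCL process group with a custom timeout.

    Assumes the script was started by a distributed launcher that sets
    RANK, WORLD_SIZE, MASTER_ADDR and MASTER_PORT.
    """
    # Imported lazily so the helper above works without PyTorch installed.
    import torch.distributed as dist

    dist.init_process_group(backend="nccl", timeout=timeout)


if __name__ == "__main__":
    # About 45 minutes of phoneme precomputation, doubled for headroom.
    timeout = nccl_timeout_for(45)
    print(timeout)  # 1:30:00
```

The timeout has to be set before the process group is created, so this only helps if you control that call or the framework forwards the value.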

Category: Q&A
Labels: Models: VITS (anything related to VITS/YourTTS/Fairseq models)