
How to train a new WaveGlow model for a different language? #189

Closed

EuphoriaCelestial opened this issue Apr 9, 2020 · 17 comments

Comments
@EuphoriaCelestial

As the title says, I would like to know how to train a new model using another dataset, which has the same structure as the LJ Speech dataset. What modifications need to be made for a different language?

@lqniunjunlper

Just prepare your language-specific wav files for training.
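For reference, the repo's train/test filelists are plain text files with one wav path per line. A minimal sketch (the helper name and the 5% split fraction are my own, not from the repo) that builds such filelists from a folder of recordings:

```python
import random
from pathlib import Path

def make_filelists(wav_dir, train_list="train_files.txt",
                   test_list="test_files.txt", test_fraction=0.05, seed=0):
    """Split the .wav files under wav_dir into train/test filelists,
    one path per line, which is the format WaveGlow's data loader reads."""
    wavs = sorted(str(p) for p in Path(wav_dir).glob("*.wav"))
    random.Random(seed).shuffle(wavs)  # deterministic shuffle before splitting
    n_test = max(1, int(len(wavs) * test_fraction))
    Path(test_list).write_text("\n".join(wavs[:n_test]) + "\n")
    Path(train_list).write_text("\n".join(wavs[n_test:]) + "\n")
    return len(wavs) - n_test, n_test
```

Point `"training_files"` and `"test_files"` in config.json at the two output files.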

@Adizbek

Adizbek commented Nov 11, 2020

@EuphoriaCelestial have you succeeded in training your own model for your own language?

@EuphoriaCelestial
Author

> @EuphoriaCelestial have you succeeded in training your own model for your own language?

Yes, I have successfully trained my model, but there are some errors in the output audio; I am still working on it.

@Adizbek

Adizbek commented Nov 12, 2020

@EuphoriaCelestial thanks for the response. Can you tell me how you achieved this? Is there any wiki or documentation to follow?

@EuphoriaCelestial
Author

> @EuphoriaCelestial thanks for the response. Can you tell me how you achieved this? Is there any wiki or documentation to follow?

Actually, I just followed the steps in the README file; it's pretty simple if your audio matches the default sample rate, bit rate, ...
I had trained a Tacotron 2 model before, so my dataset was already pre-processed.
You can try training your own model with the LJ Speech dataset first to understand the workflow, then try with your own dataset.
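As a quick sanity check on that "matches the default sample rate" point, here is a stdlib-only sketch (the helper name and defaults are my own; the stock config.json expects 22050 Hz, 16-bit mono PCM) to flag mismatched files before training:

```python
import wave

def check_wav(path, expected_rate=22050, expected_width=2):
    """Return a list of mismatches between a PCM wav file and the
    expected format; an empty list means the file matches."""
    problems = []
    with wave.open(path, "rb") as w:
        if w.getframerate() != expected_rate:
            problems.append(f"sample rate {w.getframerate()} Hz != {expected_rate} Hz")
        if w.getsampwidth() != expected_width:
            problems.append(f"{8 * w.getsampwidth()}-bit samples != {8 * expected_width}-bit")
        if w.getnchannels() != 1:
            problems.append(f"{w.getnchannels()} channels != mono")
    return problems
```

Run it over every path in your filelists and resample anything it reports before starting a run.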

@Adizbek

Adizbek commented Nov 15, 2020

> Actually, I just followed the steps in the README file; it's pretty simple if your audio matches the default sample rate, bit rate, ...
> I had trained a Tacotron 2 model before, so my dataset was already pre-processed.
> You can try training your own model with the LJ Speech dataset first to understand the workflow, then try with your own dataset.

Finally, I installed the project successfully with all its dependencies, and I've synthesized a voice with a pre-trained model.

But I found out that training a model is pretty time-consuming. I have a 9th-gen Core i7, 16 GB of RAM, and a GPU with 6 GB of VRAM; I understand that is rather weak hardware for training. What do you recommend for training a model? Any free cloud solutions?

Can you tell how long it takes to train a model, for instance, on your hardware?

Thanks

@EuphoriaCelestial
Author

> But I found out that training a model is pretty time-consuming. I have a 9th-gen Core i7, 16 GB of RAM, and a GPU with 6 GB of VRAM; I understand that is rather weak hardware for training. What do you recommend for training a model? Any free cloud solutions?
>
> Can you tell how long it takes to train a model, for instance, on your hardware?

Yeah, it will take a long time. I have an RTX 2080 Ti with 11 GB of VRAM and it takes a few days. I have not tried any cloud solutions yet, so I can't give any advice.

@Ctibor67

Unable to run train.py:

```
File "train.py", line 188, in <module>
    train(num_gpus, args.rank, args.group_name, **train_config)
File "train.py", line 90, in train
    optimizer)
File "train.py", line 45, in load_checkpoint
    optimizer.load_state_dict(checkpoint_dict['optimizer'])
File "C:\Python37\lib\site-packages\torch\optim\optimizer.py", line 124, in load_state_dict
    raise ValueError("loaded state dict contains a parameter group "
ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's group
```

Can you help me, please?

@CookiePPP

CookiePPP commented Nov 30, 2020

Just comment out that line (the `optimizer.load_state_dict` call); you can add it back the next time you generate a checkpoint.
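A sketch of that workaround, assuming `load_checkpoint` in train.py looks roughly like the traceback suggests and that the checkpoint stores a state dict under `'model'` (the exact body in your copy may differ):

```python
import torch

def load_checkpoint(checkpoint_path, model, optimizer):
    """Load a WaveGlow training checkpoint, skipping the optimizer state.

    Restoring optimizer state from a checkpoint whose parameter groups
    don't match the current optimizer raises the ValueError above, so
    that call is commented out; restore it once you are resuming from
    your own checkpoints."""
    checkpoint_dict = torch.load(checkpoint_path, map_location="cpu")
    iteration = checkpoint_dict["iteration"]
    # optimizer.load_state_dict(checkpoint_dict["optimizer"])  # <- the failing line
    model.load_state_dict(checkpoint_dict["model"])
    return model, optimizer, iteration
```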

@Adizbek

Adizbek commented Nov 30, 2020

@EuphoriaCelestial, I've installed it successfully. How many hours of audio (at minimum) are needed to generate correctly pronounced audio?

@EuphoriaCelestial
Author

> @EuphoriaCelestial, I've installed it successfully. How many hours of audio (at minimum) are needed to generate correctly pronounced audio?

I don't know the minimum; I haven't tested that because it would take a lot of time and effort.
I have 19 hours of train + test audio in total.

@Ctibor67

Ctibor67 commented Dec 2, 2020

And one more question: I changed

```
"checkpoint_path": "checkpoints/waveglow_256channels_universal_v5.pt"
```

to

```
"checkpoint_path": "checkpoints/waveglow_80000.pt"
```

and started

```
python train.py -c config.json
```

but the iteration starts again from 1. How do I start the iteration from 80000?

@EuphoriaCelestial
Author

> ... but the iteration starts again from 1. How do I start the iteration from 80000?

Why is your checkpoint saved with .pt?
Maybe you exported a model from the checkpoint, in which case it will start from iteration 0; that's normal.
My checkpoints during training have no extension.
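For what it's worth, a sketch of the distinction (the helper name is my own): train.py resumes from the `'iteration'` value stored inside a training checkpoint dict, while a model exported for inference has no such field, so training starts over from 0:

```python
import torch

def resume_point(checkpoint_path):
    """Return the iteration a run would resume from, assuming train.py
    reads an 'iteration' key out of the checkpoint dict; a file exported
    for inference lacks that key, so training restarts at 0."""
    ckpt = torch.load(checkpoint_path, map_location="cpu")
    if isinstance(ckpt, dict) and "iteration" in ckpt:
        return ckpt["iteration"]
    return 0
```

Inspecting your waveglow_80000.pt this way would show whether it still carries the training state.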

@Ctibor67

Ctibor67 commented Dec 4, 2020

I thought it should have the .pt suffix, so I renamed it. How else do I create a .pt file from a checkpoint?
How do I proceed then? I tried copying my waveglow_150000 into the Tacotron2 folder, ran inference.ipynb, and swapped waveglow_256channels_universal_v5.pt for my waveglow_150000, but instead of speech there is only a crackle.

@EuphoriaCelestial
Author

> I thought it should have the .pt suffix, so I renamed it. How else do I create a .pt file from a checkpoint?

The inference code can load the checkpoint directly, so I don't have to export a .pt file from the checkpoint.

> How do I proceed then? I tried copying my waveglow_150000 into the Tacotron2 folder, ran inference.ipynb, and swapped waveglow_256channels_universal_v5.pt for my waveglow_150000, but instead of speech there is only a crackle.

It should work if you rename it to waveglow_150000.pt; that's weird.

@Ctibor67

Ctibor67 commented Dec 5, 2020

When I train WaveGlow, this message appears with each new epoch:

```
Epoch: 7
C:\myprojects\tacotron2\waveglow\mel2samp.py:57: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ..\torch\csrc\utils\tensor_numpy.cpp:141.)
  return torch.from_numpy(data).float(), sampling_rate
```

Is this OK, or is this error causing WaveGlow not to work?
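That UserWarning is generally harmless (PyTorch is just noting the array is read-only), but if you want to silence it, the usual fix is to copy the array before wrapping it. A sketch of the affected return statement in mel2samp.py, rewritten with a copy (the function name here is illustrative):

```python
import numpy as np
import torch

def audio_to_tensor(data, sampling_rate):
    """Variant of the mel2samp.py return statement that copies the
    (possibly non-writeable) NumPy array first, silencing the warning."""
    return torch.from_numpy(np.copy(data)).float(), sampling_rate
```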

@EuphoriaCelestial
Author

I've never encountered this problem before; maybe you should create a new issue so the maintainers can help.
