Add duration predictor training #10

Closed
ductho9799 wants to merge 1 commit

Conversation

ductho9799

Hello p0p4k! Your repository is awesome. I trained VITS2 with your code on my private data. I have implemented duration predictor training code. You can test it.

@p0p4k
Owner

p0p4k commented Aug 17, 2023

> I have implemented duration predictor training code. You can test it.

Hi, I will check and review the code ASAP.

> I trained VITS2 with your code on my private data.

How are the results? Can you share some samples? No need to share the weights, just wav samples if possible, to see the output quality. Thanks!

@ductho9799
Author

ductho9799 commented Aug 17, 2023

I haven't had time to experiment with LJSpeech yet; so far I have only tested with my private Vietnamese dataset. On that dataset, VITS2 with the trained duration predictor gives better results than VITS.
Here are some samples generated by VITS and VITS2, along with the ground truth:
VITS: vits
VITS2: vits2
Human: gt

@p0p4k
Owner

p0p4k commented Aug 17, 2023

Thanks for the samples. They do sound good. Can I ask if you transferred VITS-1 weights to VITS-2 or trained VITS-2 from scratch?

@ductho9799
Author

ductho9799 commented Aug 17, 2023

I trained VITS-2 from scratch. Here is my config: vits-2-configs.json. I trained it on 4 RTX 3090 GPUs (24 GB VRAM each).

@p0p4k
Owner

p0p4k commented Aug 17, 2023

Interesting! Can I add your samples to the README of this repo? I would still advise adding the discriminator and training the model.
Also, it would be great if you could turn on the other flags and check for any improvement in the output. Thanks!

@ductho9799
Author

Thanks for your suggestions. I'm planning to train VITS-2 with the LJSpeech dataset next week. I will send you the checkpoint of LJSpeech and generated samples.

@p0p4k
Owner

p0p4k commented Aug 21, 2023

Hi, I updated the code with 2 discriminators; please check it if you are interested.

@ductho9799
Author

Thank you so much for adding the new discriminators. I will test and train with them, and I'll share the results with you as soon as possible.

@egorsmkv

@ductho9799 Hello. What improved in the speech? I'm curious whether it was just pronunciation or other characteristics of the voice as well.

@ductho9799
Author

ductho9799 commented Aug 22, 2023

@p0p4k @egorsmkv Hello, I trained a version of VITS-2 on the LJSpeech dataset. I have shared the weights, config, and audio samples in VITS-2. Can you help me evaluate the quality of VITS-2 on the LJSpeech dataset?

I trained VITS-2 for 390 epochs and trained the duration predictor for 200 epochs.

@p0p4k
Owner

p0p4k commented Aug 22, 2023

@ductho9799 Please change the access settings of your Drive file. Thanks.

@ductho9799
Author

Yes, please try again.

@p0p4k
Owner

p0p4k commented Aug 22, 2023

Thanks for sharing the checkpoints! The samples don't sound bad! Can you train the latest code with the duration discriminator and the HiFi-GAN discriminator (multi-period discriminator) with nosdp?

@p0p4k
Owner

p0p4k commented Aug 22, 2023

I am booting a cloud GPU right now to train as well. I want to check whether the duration discriminator is working (no NaN or inf values, etc.).
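
A minimal sketch of the kind of sanity check described above, assuming the training loop exposes its per-step losses as plain floats in a dict. The names `check_losses_finite`, `loss_dur_disc`, and `loss_gen` are hypothetical and are not taken from this repository:

```python
import math


def check_losses_finite(losses):
    """Return the names of losses that are NaN or +/-inf (empty list if all are finite)."""
    return [name for name, value in losses.items() if not math.isfinite(value)]


# Hypothetical per-step loss values; in a real run these would come from the training loop.
step_losses = {"loss_dur_disc": 0.73, "loss_gen": float("nan")}

bad = check_losses_finite(step_losses)
if bad:
    print(f"non-finite losses at this step: {bad}")
```

With PyTorch tensors, the same idea would likely apply after calling `.item()` on each loss, or by using `torch.isfinite` directly on the tensor.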

@ductho9799
Author

> Thanks for sharing the checkpoints! Samples sound not bad! Can you train the latest code with duration discriminator and HIFIGan Discriminator (multiperiod disc) with nosdp?

I can train this config at the weekend.

@p0p4k
Owner

p0p4k commented Aug 22, 2023

If the training works well, I will share the checkpoints so you can continue training from them; otherwise I will try to fix the code before the weekend.

@p0p4k
Owner

p0p4k commented Aug 23, 2023

@ductho9799 The checkpoints are in the README on the main page. Good luck!

@kingkong135

> I trained VITS-2 from-scratch. Here is my configs: vits-2-configs.json. I trained it on 4 RTX 3090 24 GVRAM

@ductho9799 Can you share your symbols.py file? I trained on the Infore dataset, but the results are not good. I used a config like yours. :(( All of my config, model, and train.log files are in drive. Can you give me some advice? Thank you very much.

@ngocson1804

@ductho9799 have you tried with an external embedding extractor?

@HuuHuy227

> @ductho9799 have you tried with an external embedding extractor?

Did you mean Bert-VITS2?

6 participants