Add Talromur2 recipe #4680
Conversation
Reduced the FastSpeech 2 batch size to facilitate single-GPU training. Reduced the VITS batch size to prevent OOM failures in a 4-GPU setup.
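(For reference, such a reduction is typically a one-line edit in the training YAML. The sketch below is a hypothetical stand-in — the key name `batch_bins` and the file contents are illustrative assumptions, not the exact values changed in this PR.)

```shell
#!/usr/bin/env bash
# Hypothetical sketch: halve an integer batch setting in a training config.
# 'batch_bins' and the file contents are stand-ins, not this PR's exact keys.
cfg=$(mktemp)
printf 'batch_bins: 4000000\n' > "${cfg}"
# Rewrite the matching line with half the original value; pass other lines through.
awk '/^batch_bins:/ {print $1, int($2 / 2); next} {print}' "${cfg}" > "${cfg}.small"
cat "${cfg}.small"   # batch_bins: 2000000
```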
Could you merge the latest master to fix the CI?
Codecov Report
@@ Coverage Diff @@
## master #4680 +/- ##
==========================================
- Coverage 80.32% 80.31% -0.02%
==========================================
Files 527 527
Lines 46311 46311
==========================================
- Hits 37200 37193 -7
- Misses 9111 9118 +7
Sorry for the late review, and thank you for adding a great recipe!
It looks almost perfect; I just left minor comments. Could you address them?
.gitignore
Outdated
tools/anaconda
tools/ice-g2p*
tools/fairseq*
tools/featbin*
Please add a line break.
# TODO(G-Thor) add alignment download option

cd "${cwd}"
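(The `cd "${cwd}"` at the end of the snippet returns to a directory saved earlier in the script; a minimal sketch of that save/restore pattern, with `/tmp/demo_dl` as a stand-in for the recipe's download directory:)

```shell
#!/usr/bin/env bash
# Save the current directory, work elsewhere, then return.
# /tmp/demo_dl is a stand-in for the recipe's download directory.
cwd=$(pwd)
mkdir -p /tmp/demo_dl
cd /tmp/demo_dl
# ... download / extraction work would happen here ...
cd "${cwd}"
echo "back in: $(pwd)"
```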
Please add a line break.
if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
log "stage 2: utils/subset_data_dir.sh"

./local/split_train_dev_test.py --data_dir "data/${full_set}" --train_dir "data/${train_set}" --dev_dir "data/${dev_set}" --test_dir "data/${eval_set}"
Suggested change:
-./local/split_train_dev_test.py --data_dir "data/${full_set}" --train_dir "data/${train_set}" --dev_dir "data/${dev_set}" --test_dir "data/${eval_set}"
+./local/split_train_dev_test.py \
+    --data_dir "data/${full_set}" \
+    --train_dir "data/${train_set}" \
+    --dev_dir "data/${dev_set}" \
+    --test_dir "data/${eval_set}"
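(The split form is the same command byte for byte: a trailing backslash continues the line, as long as nothing — not even a space — follows it. A quick check of the equivalence, using `printf` as a stand-in command:)

```shell
#!/usr/bin/env bash
# A backslash-newline continuation is parsed as a single command,
# so the split form behaves exactly like the one-liner.
one_line=$(printf '%s %s' a b)
split=$(printf '%s %s' \
    a \
    b)
[ "${one_line}" = "${split}" ] && echo "identical"
```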
for dset in train dev eval1; do
utils/copy_data_dir.sh data/"${dset}"{,_phn};
${train_cmd} --gpu 1 --num-threads 1 data/"${dset}_phn/log/conversion.log" ./pyscripts/utils/convert_text_to_phn.py --nj 1 --g2p g2p_is data/"${dset}"{,_phn}/text;
I think `--num-threads 1` is the default.
Suggested change:
-${train_cmd} --gpu 1 --num-threads 1 data/"${dset}_phn/log/conversion.log" ./pyscripts/utils/convert_text_to_phn.py --nj 1 --g2p g2p_is data/"${dset}"{,_phn}/text;
+${train_cmd} --gpu 1 data/"${dset}_phn/log/conversion.log" \
+    ./pyscripts/utils/convert_text_to_phn.py \
+    --nj 1 \
+    --g2p g2p_is \
+    data/"${dset}"{,_phn}/text
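(The `data/"${dset}"{,_phn}` idiom in these commands relies on bash brace expansion: the empty alternative and `_phn` each produce one path, so a single expression yields both the source and destination directories. A small demonstration:)

```shell
#!/usr/bin/env bash
# Brace expansion runs before parameter expansion, so
# data/"${dset}"{,_phn} becomes two words: data/train data/train_phn
dset=train
paths=$(echo data/"${dset}"{,_phn})
echo "${paths}"   # data/train data/train_phn
```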
for dset in train dev eval1; do
utils/copy_data_dir.sh data/"${dset}"{,_phn};
${train_cmd} --gpu 1 --num-threads 1 data/"${dset}_phn/log/conversion.log" ./pyscripts/utils/convert_text_to_phn.py --nj 1 --g2p g2p_is data/"${dset}"{,_phn}/text;
# srun --gres=gpu:1 ./pyscripts/utils/convert_text_to_phn.py --nj 1 --g2p g2p_is --cleaner tacotron data/"${dset}"{,_phn}/text;
Suggested change (remove the commented-out line):
-# srun --gres=gpu:1 ./pyscripts/utils/convert_text_to_phn.py --nj 1 --g2p g2p_is --cleaner tacotron data/"${dset}"{,_phn}/text;
utils/copy_data_dir.sh data/"${dset}"{,_phn};
${train_cmd} --gpu 1 --num-threads 1 data/"${dset}_phn/log/conversion.log" ./pyscripts/utils/convert_text_to_phn.py --nj 1 --g2p g2p_is data/"${dset}"{,_phn}/text;
# srun --gres=gpu:1 ./pyscripts/utils/convert_text_to_phn.py --nj 1 --g2p g2p_is --cleaner tacotron data/"${dset}"{,_phn}/text;
done
Please add a line break.
egs2/talromur2/tts1/run.sh
Outdated
# --valid_set "${valid_set}" \
# --test_sets "${test_sets}" \
# --expdir "${expdir}" \
# --srctexts "data/${train_set}/text" \
Please add a line break.
--ngpu 1 \
--expdir "$expdir" \
--train_config ./conf/tuning/train_xvector_tacotron2.yaml \
--inference_model valid.loss.ave_5best.pth
Please add a line break.
@@ -0,0 +1,45 @@
#!/bin/bash
This is just my preference, but could you use lower case? VITS -> vits
This pull request is now in conflict :(
Hi @G-Thor, we want to merge your great recipe.
Also replaced data.sh with data_multi_speaker.sh, since that is used for this multi-speaker dataset.
Hi @kan-bayashi, thanks for your review. I've been on leave and waiting for updates to the official corpus repo. I also went ahead and changed the installation of ice-g2p to use the official PyPI version of that package rather than my personal fork, since my suggested changes to that project have been merged. I hope it is okay to apply this change here, but if it isn't, just let me know and I'll open a separate PR for that.
LGTM
Thank you for your great recipe :)
This adds a recipe for the Talrómur 2 multi-speaker corpus. I've trained an x-vector conditioned Tacotron 2 using this recipe with decent results.
The commit history is a bit messy so feel free to squash if you decide to merge this PR.
I'm open to any and all comments on how to improve this recipe.