
Add Talromur2 recipe #4680

Merged — 43 commits merged into espnet:master on Nov 8, 2022

Conversation

@G-Thor (Contributor) commented Sep 30, 2022

This adds a recipe for the Talrómur 2 multi-speaker corpus. I've trained an x-vector conditioned Tacotron 2 using this recipe with decent results.

The commit history is a bit messy so feel free to squash if you decide to merge this PR.

I'm open to any and all comments on how to improve this recipe.
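For readers unfamiliar with espnet2 recipes, an invocation of this recipe might look like the sketch below. The recipe directory name and the flags other than the training config path (which is referenced later in this thread) are assumptions, not taken from this PR:

```sh
# Illustrative sketch only: the directory name and flags are assumptions,
# not confirmed by this PR.
cd egs2/talromur2/tts1
./run.sh \
    --ngpu 1 \
    --use_xvector true \
    --train_config conf/tuning/train_xvector_tacotron2.yaml
```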

Reduced batch size of fastspeech2 to facilitate 1-gpu training
Reduced VITS batch size to prevent OOM failures in 4-GPU setup
@sw005320 sw005320 added this to the v.202209 milestone Sep 30, 2022
@sw005320 sw005320 added Recipe TTS Text-to-speech labels Sep 30, 2022
@kan-bayashi (Member)

Could you merge the latest master to fix the CI?

@kan-bayashi kan-bayashi modified the milestones: v.202209, v.202211 Oct 4, 2022
@codecov — codecov bot commented Oct 4, 2022

Codecov Report

Merging #4680 (e52cfd8) into master (b221db0) will decrease coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #4680      +/-   ##
==========================================
- Coverage   80.32%   80.31%   -0.02%     
==========================================
  Files         527      527              
  Lines       46311    46311              
==========================================
- Hits        37200    37193       -7     
- Misses       9111     9118       +7     
Flag                     | Coverage Δ
test_integration_espnet1 | 66.23% <ø> (-0.14%) ⬇️
test_integration_espnet2 | 48.96% <ø> (ø)
test_python              | 68.56% <ø> (ø)
test_utils               | 23.30% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files                                        | Coverage Δ
...pnet/nets/pytorch_backend/transformer/optimizer.py | 86.11% <0.00%> (-2.78%) ⬇️
espnet/asr/asr_utils.py                               | 75.65% <0.00%> (-0.88%) ⬇️
...et/nets/pytorch_backend/e2e_asr_mix_transformer.py | 84.50% <0.00%> (-0.47%) ⬇️
espnet/tts/pytorch_backend/tts.py                     | 78.33% <0.00%> (-0.30%) ⬇️


@kan-bayashi (Member) left a comment

Sorry for the late review and thank you for adding a great recipe!
It looks almost perfect but I just left minor comments. Could you reflect them?

.gitignore Outdated
tools/anaconda
tools/ice-g2p*
tools/fairseq*
tools/featbin*

Please add a line break.


# TODO(G-Thor) add alignment download option

cd "${cwd}"

Please add a line break.

if [ ${stage} -le 1 ] && [ ${stop_stage} -ge 1 ]; then
log "stage 2: utils/subset_data_dir.sh"

./local/split_train_dev_test.py --data_dir "data/${full_set}" --train_dir "data/${train_set}" --dev_dir "data/${dev_set}" --test_dir "data/${eval_set}"

Suggested change
- ./local/split_train_dev_test.py --data_dir "data/${full_set}" --train_dir "data/${train_set}" --dev_dir "data/${dev_set}" --test_dir "data/${eval_set}"
+ ./local/split_train_dev_test.py \
+     --data_dir "data/${full_set}" \
+     --train_dir "data/${train_set}" \
+     --dev_dir "data/${dev_set}" \
+     --test_dir "data/${eval_set}"
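For intuition, here is a minimal sketch of what a split script like this might do internally. The function below is a hypothetical illustration, not the recipe's actual local/split_train_dev_test.py, and the split sizes are arbitrary:

```python
import random

def split_train_dev_test(utt_ids, dev_size, test_size, seed=0):
    """Reproducibly shuffle utterance IDs, then carve off dev/test subsets."""
    ids = sorted(utt_ids)              # deterministic starting order
    random.Random(seed).shuffle(ids)   # fixed seed -> same split every run
    test = ids[:test_size]
    dev = ids[test_size:test_size + dev_size]
    train = ids[test_size + dev_size:]
    return train, dev, test

train, dev, test = split_train_dev_test([f"utt{i:03d}" for i in range(100)], 10, 10)
print(len(train), len(dev), len(test))  # 80 10 10
```

A real split script would additionally filter the Kaldi data-dir files (text, wav.scp, utt2spk) down to each subset, e.g. via utils/subset_data_dir.sh.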


for dset in train dev eval1; do
utils/copy_data_dir.sh data/"${dset}"{,_phn};
${train_cmd} --gpu 1 --num-threads 1 data/"${dset}_phn/log/conversion.log" ./pyscripts/utils/convert_text_to_phn.py --nj 1 --g2p g2p_is data/"${dset}"{,_phn}/text;

I think --num-threads 1 is the default.

Suggested change
- ${train_cmd} --gpu 1 --num-threads 1 data/"${dset}_phn/log/conversion.log" ./pyscripts/utils/convert_text_to_phn.py --nj 1 --g2p g2p_is data/"${dset}"{,_phn}/text;
+ ${train_cmd} --gpu 1 data/"${dset}_phn/log/conversion.log" \
+     ./pyscripts/utils/convert_text_to_phn.py \
+     --nj 1 \
+     --g2p g2p_is \
+     data/"${dset}"{,_phn}/text
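Conceptually, this conversion step rewrites each line of a Kaldi-style text file ("utt_id transcript") with the transcript replaced by its phoneme sequence. A toy illustration follows; the stand-in g2p function is purely hypothetical (the recipe itself uses ice-g2p via --g2p g2p_is):

```python
def convert_text_to_phn(lines, g2p):
    """Rewrite 'utt_id transcript' lines as 'utt_id phoneme sequence'."""
    out = []
    for line in lines:
        utt_id, text = line.split(maxsplit=1)
        phones = g2p(text)                    # g2p returns a list of phoneme strings
        out.append(f"{utt_id} {' '.join(phones)}")
    return out

# Stand-in g2p for illustration only: one "phoneme" per character, spaces dropped.
toy_g2p = lambda s: [c for c in s.lower() if not c.isspace()]
result = convert_text_to_phn(["utt001 halló heimur"], toy_g2p)
print(result)  # ['utt001 h a l l ó h e i m u r']
```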

for dset in train dev eval1; do
utils/copy_data_dir.sh data/"${dset}"{,_phn};
${train_cmd} --gpu 1 --num-threads 1 data/"${dset}_phn/log/conversion.log" ./pyscripts/utils/convert_text_to_phn.py --nj 1 --g2p g2p_is data/"${dset}"{,_phn}/text;
# srun --gres=gpu:1 ./pyscripts/utils/convert_text_to_phn.py --nj 1 --g2p g2p_is --cleaner tacotron data/"${dset}"{,_phn}/text;

Suggested change
- # srun --gres=gpu:1 ./pyscripts/utils/convert_text_to_phn.py --nj 1 --g2p g2p_is --cleaner tacotron data/"${dset}"{,_phn}/text;
utils/copy_data_dir.sh data/"${dset}"{,_phn};
${train_cmd} --gpu 1 --num-threads 1 data/"${dset}_phn/log/conversion.log" ./pyscripts/utils/convert_text_to_phn.py --nj 1 --g2p g2p_is data/"${dset}"{,_phn}/text;
# srun --gres=gpu:1 ./pyscripts/utils/convert_text_to_phn.py --nj 1 --g2p g2p_is --cleaner tacotron data/"${dset}"{,_phn}/text;
done

Please add a line break.

# --valid_set "${valid_set}" \
# --test_sets "${test_sets}" \
# --expdir "${expdir}" \
# --srctexts "data/${train_set}/text" \

Please add a line break.

--ngpu 1 \
--expdir "$expdir" \
--train_config ./conf/tuning/train_xvector_tacotron2.yaml \
--inference_model valid.loss.ave_5best.pth

Please add a line break.

@@ -0,0 +1,45 @@
#!/bin/bash

This is just my preference, but could you use lower case? VITS -> vits

@mergify — mergify bot commented Oct 9, 2022

This pull request is now in conflict :(

@mergify mergify bot added the conflicts label Oct 9, 2022
@kan-bayashi (Member)

Hi @G-Thor, we want to merge your great recipe.
You may be busy with ICASSP, but could you address my review comments?

Also replaced data.sh with data_multi_speaker.sh.
- since that is used for this multi-speaker dataset
@mergify mergify bot added the Installation label Nov 7, 2022
@G-Thor (Contributor, Author) commented Nov 7, 2022

Hi @kan-bayashi, thanks for your review. I've been on leave and waiting for updates to the official corpus repo.
I've now applied your suggested changes as well as removing some unused code.

I also went ahead and changed the installation of ice-g2p to use the official PyPI version of that package rather than my personal fork, since my suggested changes to that project have been merged. I hope it is okay to apply this change here; if not, just let me know and I'll open a separate PR for it.

@mergify mergify bot removed the conflicts label Nov 7, 2022
@kan-bayashi (Member) left a comment

LGTM
Thank you for your great recipe :)

@kan-bayashi kan-bayashi merged commit 45ae496 into espnet:master Nov 8, 2022

4 participants