Adding MFA scripts #4557

iamanigeeit · 2022-08-04T15:04:05Z

From #4521: this commit adds the scripts needed to convert Montreal Forced Aligner (MFA) output to ESPnet formats.

mfa.sh automates the entire conversion process
tts.sh has been modified to pass arguments to data.sh for MFA
data.sh has been edited to produce required text and durations files

To create .lab files for other datasets, define the relevant function in mfa_format.py.

…The recipe `mfa.sh` details the entire conversion process, and `tts.sh` and `data.sh` (for LJSpeech only) have been edited to produce required text and durations files.

kan-bayashi

Sorry for the late review, and thank you for adding these cool features!
Could you address my comments and fix the CI errors?

kan-bayashi · 2022-08-08T12:42:53Z

egs2/TEMPLATE/tts1/tts.sh

-        local/data.sh ${local_data_opts}
+        local_data_opts+="--token_type ${token_type} --g2p ${g2p}"
+        local/data.sh "${local_data_opts}"


Please do not modify this part; it may affect all of the recipes.
You can set local_data_opts in run.sh instead.
Also, ${local_data_opts} is different from "${local_data_opts}": the quoted form is passed as a single argument, so it is not parsed into separate options.
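
The quoting difference can be checked with a small sketch. `count_args` is a hypothetical helper written only for this illustration; it is not part of ESPnet.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from ESPnet): prints how many arguments it received.
count_args() {
    echo "$#"
}

local_data_opts="--token_type phn --g2p none"

# Unquoted: the shell splits the value on whitespace, giving four arguments.
# shellcheck disable=SC2086
unquoted=$(count_args ${local_data_opts})

# Quoted: the whole string arrives as ONE argument, so an option parser
# would not see the individual flags.
quoted=$(count_args "${local_data_opts}")

echo "unquoted: ${unquoted} argument(s)"   # prints "unquoted: 4 argument(s)"
echo "quoted:   ${quoted} argument(s)"     # prints "quoted:   1 argument(s)"
```

When appending with `+=`, the new text also needs a leading space if the variable may already hold options; otherwise the first new flag is glued to the previous value.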

kan-bayashi · 2022-08-08T12:49:50Z

egs2/ljspeech/tts1/local/data.sh

 log "$0 $*"
+token_type=''
+g2p=''
 . utils/parse_options.sh


How about adding a use_mfa option?

Suggested change

-log "$0 $*"
-token_type=''
-g2p=''
-. utils/parse_options.sh
+use_mfa=false
+token_type=phn
+g2p=g2p_en_no_space
+log "$0 $*"
+. utils/parse_options.sh
+if "${use_mfa}"; then
+    if [ ${token_type} != "phn" ]; then
+        echo "ERROR: token_type must be phn when use_mfa=true."
+        exit 1
+    fi
+    if [ ${g2p} != "none" ]; then
+        echo "ERROR: g2p must be none when use_mfa=true."
+        exit 1
+    fi
+fi

Also, could you add an MFA installation check inside the if block?
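
A minimal sketch of such an installation check, under the assumption that Montreal Forced Aligner is exposed on PATH as the `mfa` command. `check_tool_installed` is a hypothetical helper name, not an existing ESPnet utility.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the named command is on PATH,
# otherwise prints an error to stderr and returns 1.
check_tool_installed() {
    local tool="$1"
    if ! command -v "${tool}" > /dev/null 2>&1; then
        echo "ERROR: ${tool} is not installed." >&2
        return 1
    fi
}

use_mfa=false  # mirrors the option proposed in the suggested change

if "${use_mfa}"; then
    # Assumption: the MFA command-line tool is named `mfa`.
    check_tool_installed mfa || exit 1
fi
```

Keeping the check inside the `use_mfa` branch means recipes that do not use MFA are unaffected when the tool is absent.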

kan-bayashi · 2022-08-08T12:50:21Z

egs2/ljspeech/tts1/local/data.sh

-        <(cut -d "|" -f 1 < ${db_root}/LJSpeech-1.1/metadata.csv) \
-        <(cut -d "|" -f 3 < ${db_root}/LJSpeech-1.1/metadata.csv) \
-        > ${text}
+    if [ "${token_type}" = 'phn' ] && [ -z "${g2p}" ]; then  # text should be phonemes!


Suggested change

-    if [ "${token_type}" = 'phn' ] && [ -z "${g2p}" ]; then  # text should be phonemes!
+    if "${use_mfa}"; then  # text should be phonemes!

kan-bayashi · 2022-08-08T12:51:22Z

egs2/ljspeech/tts1/run_mfa.sh

+    --n_shift "${n_shift}" \
+    --token_type phn \
+    --cleaner none \
+    --g2p none \


Suggested change

-    --g2p none \
+    --g2p none \
+    --local_data_opts "--token_type phn --g2p none --use_mfa true" \

kan-bayashi · 2022-08-08T12:53:38Z

egs2/ljspeech/tts1/run_mfa.sh

+#!/usr/bin/env bash
+# Set bash to 'debug' mode, it will exit on :
+# -e 'error', -u 'undefined variable', -o ... 'error in pipeline', -x 'print commands',


Please add comments describing this script, focusing especially on how it differs from run.sh.

Also, if you make a separate script, I think it is better to set all of the options in this script so that training can be run by just executing ./run_mfa.sh.

kan-bayashi · 2022-08-08T13:01:11Z

egs2/TEMPLATE/asr1/scripts/utils/mfa.sh

+set -e
+
+# If you are not using LJSpeech, be sure to define how your dataset is processed in mfa_format.py
+# Three
+
+#!/usr/bin/env bash
+# conda config --append channels conda-forge
+# conda install montreal-forced-aligner


Suggested change

-set -e
-
-# If you are not using LJSpeech, be sure to define how your dataset is processed in mfa_format.py
-# Three
-
-#!/usr/bin/env bash
-# conda config --append channels conda-forge
-# conda install montreal-forced-aligner
+#!/usr/bin/env bash
+# Generate MFA alignments
+# You need to install the following tools to run this script:
+# $ conda config --append channels conda-forge
+# $ conda install montreal-forced-aligner
+# If you are not using LJSpeech, be sure to define how your
+# dataset is processed in `scripts/utils/mfa_format.py`.
+set -e

kan-bayashi · 2022-08-08T13:05:49Z

egs2/TEMPLATE/asr1/scripts/utils/mfa.sh

+
+
+# Run the below yourself
+
+#./run_mfa.sh --stage 0 --stop_stage 0
+#./run_mfa.sh --stage 1 --stop_stage 1
+#./run_mfa.sh --stage 2 --stop_stage 2
+#./run_mfa.sh --stage 3 --stop_stage 3
+#./run_mfa.sh --stage 4 --stop_stage 4
+#./run_mfa.sh --stage 5 --stop_stage 5 \
+#    --train_config conf/tuning/train_fastspeech2.yaml \
+#    --teacher_dumpdir data \
+#    --tts_stats_dir data/stats \
+#    --write_collected_feats true
+#./run_mfa.sh --stage 6 --stop_stage 6 \
+#    --train_config conf/tuning/train_fastspeech2.yaml \
+#    --teacher_dumpdir data \
+#    --tts_stats_dir data/stats
+
+
+


Suggested change

-# Run the below yourself
-
-#./run_mfa.sh --stage 0 --stop_stage 0
-#./run_mfa.sh --stage 1 --stop_stage 1
-#./run_mfa.sh --stage 2 --stop_stage 2
-#./run_mfa.sh --stage 3 --stop_stage 3
-#./run_mfa.sh --stage 4 --stop_stage 4
-#./run_mfa.sh --stage 5 --stop_stage 5 \
-#    --train_config conf/tuning/train_fastspeech2.yaml \
-#    --teacher_dumpdir data \
-#    --tts_stats_dir data/stats \
-#    --write_collected_feats true
-#./run_mfa.sh --stage 6 --stop_stage 6 \
-#    --train_config conf/tuning/train_fastspeech2.yaml \
-#    --teacher_dumpdir data \
-#    --tts_stats_dir data/stats
+echo "Successfully finished generating MFA alignments."
+# NOTE(iamanigeeit): If you want to train FastSpeech2 with the alignments,
+# please check `egs2/ljspeech/tts1/run_mfa.sh` as an example.

kan-bayashi · 2022-08-08T13:07:38Z

egs2/TEMPLATE/asr1/scripts/utils/mfa_format.py

+'''
+This script converts files to/from MFA format for use in ESPnet TTS.
+If you wish to add functions to create .lab files for MFA, add them like this:
+    `def make_labs_[dataset]:
+         ...`
+'''


Suggested change

-'''
-This script converts files to/from MFA format for use in ESPnet TTS.
-If you wish to add functions to create .lab files for MFA, add them like this:
-    `def make_labs_[dataset]:
-         ...`
-'''
+#!/usr/bin/env python3
+"""Convert files to/from MFA format for use in ESPnet TTS.
+If you wish to add functions to create .lab files for MFA, add them like this:
+    def make_labs_[dataset]:
+        ...
+"""

mergify · 2022-08-29T08:16:33Z

This pull request is now in conflict :(

mergify · 2022-09-23T19:56:45Z

This pull request is now in conflict :(

iamanigeeit · 2022-11-14T11:43:47Z

@kan-bayashi Sorry for the delay because of ICASSP submission. I will pull the latest version of ESPnet and fix my PR this week!

Conflicts:
    README.md
    espnet2/asr/encoder/wav2vec2_encoder.py
    espnet2/asr/frontend/s3prl.py
    espnet2/asr/postencoder/hugging_face_transformers_postencoder.py
    espnet2/gan_tts/jets/jets.py
    setup.py
    test/espnet2/asr/frontend/test_s3prl.py

iamanigeeit · 2022-11-29T12:01:16Z

Hello @kan-bayashi

I have made the requested changes, but I can't submit the PR because I have modified other files for my own work on iamanigeeit:master.

I tried to fork espnet again, but I am only allowed to fork once. When I create a new branch iamanigeeit:espnet/mfa and try to create a new PR from there, I get

Can't create a new pull request: Although you appear to have the correct authorization credentials, the espnet organization has enabled OAuth App access restrictions, meaning that data access to third-parties is limited.

What should I do?

kan-bayashi · 2022-11-29T13:14:06Z

Thank you for your modification, @iamanigeeit.

Can't create a new pull request: Although you appear to have the correct authorization credentials, the espnet organization has enabled OAuth App access restrictions, meaning that data access to third-parties is limited.

I've never encountered this kind of error. Let me confirm your procedure.

1. Make a new branch and cherry-pick your commits locally.
2. Push the branch to your fork, iamanigeeit/espnet.
3. Open the PR from the GitHub web page.

Is that correct?
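
The procedure above can be tried out locally with the sketch below, which uses a throwaway repository. It assumes `git` is installed; all branch and file names (`upstream-master`, `fork-master`, `mfa`, `unrelated.txt`) are illustrative stand-ins, not the names used in this PR.

```shell
#!/usr/bin/env bash
# Demonstrates branch + cherry-pick in a temporary repository.
# The function body runs in a subshell so `cd` and `set -e` do not leak out.
demo_cherry_pick() (
    set -e
    repo=$(mktemp -d)
    cd "${repo}"
    git init -q .
    git config user.name "Example"
    git config user.email "example@example.com"

    # Stand-in for the upstream master branch.
    echo base > README.md
    git add README.md
    git commit -q -m "base"
    git branch -M upstream-master

    # Stand-in for a fork's master that also carries unrelated work.
    git checkout -q -b fork-master
    echo other > unrelated.txt
    git add unrelated.txt
    git commit -q -m "unrelated work"
    echo mfa > mfa.sh
    git add mfa.sh
    git commit -q -m "MFA scripts"
    mfa_commit=$(git rev-parse HEAD)

    # Step 1: new branch from the clean upstream state, then cherry-pick
    # only the commit that belongs in the PR.
    git checkout -q -b mfa upstream-master
    git cherry-pick "${mfa_commit}" > /dev/null

    # Step 2 (real fork only): git push origin mfa
    # Step 3 (real fork only): open the PR from the GitHub web page.

    ls  # lists the files on the new branch
)

demo_cherry_pick
```

The new branch should contain `mfa.sh` but not `unrelated.txt`, which is what keeps the unrelated work out of the PR.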

mergify · 2022-11-30T03:15:33Z

This pull request is now in conflict :(

iamanigeeit · 2022-12-02T07:59:19Z

Thanks @kan-bayashi

Creating a new branch and making the PR from GitHub instead of PyCharm seems to work. The new commit is at #4801

iamanigeeit added 3 commits August 4, 2022 22:49

This commit adds scripts necessary to convert MFA to ESPnet formats. …

f21b9df

…The recipe `mfa.sh` details the entire conversion process, and `tts.sh` and `data.sh` (for LJSpeech only) have been edited to produce required text and durations files.

This commit adds scripts necessary to convert MFA to ESPnet formats. …

bbb7959

…The recipe `mfa.sh` details the entire conversion process, and `tts.sh` and `data.sh` (for LJSpeech only) have been edited to produce required text and durations files.

Merge branch 'espnet:master' into master

9a5ef5e

mergify bot added the ESPnet2 label Aug 4, 2022

sw005320 added this to the v.202209 milestone Aug 4, 2022

sw005320 requested a review from kan-bayashi August 4, 2022 15:15

kan-bayashi requested changes Aug 8, 2022

kan-bayashi added New Features TTS Text-to-speech labels Aug 8, 2022

Working version of SNIPER

de189f2

mergify bot added conflicts ESPnet1 ASR Automatic speech recogntion MT Machine translation Installation labels Aug 29, 2022

mergify bot removed the conflicts label Sep 23, 2022

mergify bot added the conflicts label Sep 23, 2022

kan-bayashi modified the milestones: v.202209, v.202211 Oct 4, 2022

Final version of SNIPER after submission

77e5cc6

mergify bot added the README label Nov 14, 2022

iamanigeeit added 5 commits November 14, 2022 20:25

Final version of SNIPER after submission

20bd312

update readme

38efb18

Corrected mistakes in previous version

f7a37e7

Update to figures

dcb02e7

mergify bot removed the conflicts label Nov 29, 2022

Changes requested by kan_bayashi for PR into espnet/main.

f10f911

mergify bot added the conflicts label Nov 30, 2022

This was referenced Dec 2, 2022

Adding MFA scripts #4800

Closed

Adding MFA scripts for LJSpeech #4801

Merged

iamanigeeit closed this Dec 2, 2022

mergify bot removed the conflicts label Dec 2, 2022

Fhrozen mentioned this pull request Dec 4, 2022

Generating MFA aligments #4803

Merged
