enh_s2t joint model #4226

simpleoier · 2022-03-31T15:02:14Z

Support joint models that combine speech enhancement with a downstream task (ASR, SLU, or ST).

Future PR (TODO)

- Pass the category info instead of using utterance ids
- Upload pre-trained models
- MixIT

mergify · 2022-04-01T19:44:02Z

This pull request is now in conflict :(

mergify · 2022-04-12T22:31:17Z

This pull request is now in conflict :(

Emrys365

I finished my second-round review, except for two s3prl scripts as I am not familiar with them.

Emrys365 · 2022-04-12T16:09:36Z

egs2/TEMPLATE/enh_asr1/enh_asr.sh

+use_k2=false      # Whether to use k2 based decoder
+batch_size=1
+inference_tag=    # Suffix to the result dir for decoding.
+inference_config= # Config for decoding.


After #4251 is merged, would you consider also applying the updates here?

~~But it might take additional efforts to merge the updated espnet2/bin/enh_inference.py script into espnet2/bin/enh_s2t_inference.py.~~

Expand to see the diff in #4251

diff --git a/egs2/TEMPLATE/enh1/enh.sh b/egs2/TEMPLATE/enh1/enh.sh
index fcb4f324f..0afc31881 100755
--- a/egs2/TEMPLATE/enh1/enh.sh
+++ b/egs2/TEMPLATE/enh1/enh.sh
@@ -78,6 +78,8 @@ download_model=
 # Evaluation related
 scoring_protocol="STOI SDR SAR SIR SI_SNR"
 ref_channel=0
+inference_tag= # Prefix to the result dir for ENH inference.
+inference_enh_config= # Config for enhancement.
 score_with_asr=false
 asr_exp="" # asr model for scoring WER
 lm_exp="" # lm model for scoring WER
@@ -151,8 +153,9 @@ Options:
     --init_param # pretrained model path and module name (default="${init_param}")

     # Enhancement related
-    --inference_args # Arguments for enhancement in the inference stage (default="${inference_args}")
-    --inference_model # Enhancement model path for inference (default="${inference_model}").
+    --inference_args # Arguments for enhancement in the inference stage (default="${inference_args}")
+    --inference_model # Enhancement model path for inference (default="${inference_model}").
+    --inference_enh_config # Configuration file for overwriting some model attributes during SE inference. (default="${inference_enh_config}")

     # Evaluation related
     --scoring_protocol # Metrics to be used for scoring (default="${scoring_protocol}")
@@ -247,6 +250,14 @@ if [ -n "${speed_perturb_factors}" ]; then
     enh_exp="${enh_exp}_sp"
 fi

+if [ -z "${inference_tag}" ]; then
+    if [ -n "${inference_enh_config}" ]; then
+        inference_tag="$(basename "${inference_enh_config}" .yaml)"
+    else
+        inference_tag=enhanced
+    fi
+fi
+
 # ========================== Main stages start from here. ==========================

 if ! "${skip_data_prep}"; then
@@ -614,7 +625,7 @@ if ! "${skip_eval}"; then

         for dset in "${valid_set}" ${test_sets}; do
             _data="${data_feats}/${dset}"
-            _dir="${enh_exp}/enhanced_${dset}"
+            _dir="${enh_exp}/${inference_tag}_${dset}"
             _logdir="${_dir}/logdir"
             mkdir -p "${_logdir}"

@@ -646,6 +657,7 @@ if ! "${skip_eval}"; then
                     --data_path_and_name_and_type "${_data}/${_scp},speech_mix,${_type}" \
                     --key_file "${_logdir}"/keys.JOB.scp \
                     --train_config "${enh_exp}"/config.yaml \
+                    ${inference_enh_config:+--inference_config "$inference_enh_config"} \
                     --model_file "${enh_exp}"/"${inference_model}" \
                     --output_dir "${_logdir}"/output.JOB \
                     ${_opts} ${inference_args}
@@ -686,7 +698,7 @@ if ! "${skip_eval}"; then
             if "${score_obs}"; then
                 _dir="${data_feats}/${dset}/scoring"
             else
-                _dir="${enh_exp}/enhanced_${dset}/scoring"
+                _dir="${enh_exp}/${inference_tag}_${dset}/scoring"
             fi

             _logdir="${_dir}/logdir"
@@ -713,7 +725,7 @@ if ! "${skip_eval}"; then
                     # To compute the score of observation, input original wav.scp
                     _inf_scp+="--inf_scp ${data_feats}/${dset}/wav.scp "
                 else
-                    _inf_scp+="--inf_scp ${enh_exp}/enhanced_${dset}/spk${spk}.scp "
+                    _inf_scp+="--inf_scp ${enh_exp}/${inference_tag}_${dset}/spk${spk}.scp "
                 fi
             done

@@ -749,7 +761,7 @@ if ! "${skip_eval}"; then
             ./scripts/utils/show_enh_score.sh "${_dir}/../.." > "${_dir}/../../RESULTS.md"
         done
         log "Evaluation result for observation: ${data_feats}/RESULTS.md"
-        log "Evaluation result for enhancement: ${enh_exp}/enhanced/RESULTS.md"
+        log "Evaluation result for enhancement: ${enh_exp}/RESULTS.md"

     fi
 else
@@ -808,7 +820,7 @@ if "${score_with_asr}"; then
                     # Using same wav.scp for all speakers
                     cp "${_data}/wav.scp" "${_ddir}/wav.scp"
                 else
-                    cp "${enh_exp}/enhanced_${dset}/scoring/wav_spk${spk}" "${_ddir}/wav.scp"
+                    cp "${enh_exp}/${inference_tag}_${dset}/scoring/wav_spk${spk}" "${_ddir}/wav.scp"
                 fi
                 cp data/${dset}/text_spk${spk} ${_ddir}/text
                 cp ${_data}/{spk2utt,utt2spk,utt2num_samples,feats_type} ${_ddir}
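For readers skimming the diff quoted above, the two behaviours it introduces can be tried in isolation. The following is only a minimal sketch: the helper names `resolve_inference_tag` and `count_config_args` are hypothetical (enh.sh inlines this logic rather than defining functions), and the config path in the usage lines is made up for illustration.

```shell
# Sketch of the result-directory tag logic from the #4251 diff.
# resolve_inference_tag is a hypothetical helper name, not part of enh.sh.
resolve_inference_tag() {
    local inference_tag="$1"          # explicit tag, may be empty
    local inference_enh_config="$2"   # optional YAML config path, may be empty
    if [ -z "${inference_tag}" ]; then
        if [ -n "${inference_enh_config}" ]; then
            # Reuse the config file name without its .yaml suffix.
            inference_tag="$(basename "${inference_enh_config}" .yaml)"
        else
            # Fall back to the previously hard-coded directory prefix.
            inference_tag=enhanced
        fi
    fi
    printf '%s\n' "${inference_tag}"
}

# ${var:+word} expands to word only when var is set and non-empty, so the
# --inference_config flag disappears entirely when no config is given.
# count_config_args (hypothetical) reports how many arguments would be passed.
count_config_args() {
    local inference_enh_config="$1"
    # shellcheck disable=SC2086
    set -- ${inference_enh_config:+--inference_config "$inference_enh_config"}
    echo "$#"
}

resolve_inference_tag "" ""                        # prints: enhanced
resolve_inference_tag "" "conf/inference_a.yaml"   # prints: inference_a
resolve_inference_tag "mytag" "conf/inference_a.yaml"  # prints: mytag
count_config_args ""                               # prints: 0
count_config_args "conf/inference_a.yaml"          # prints: 2
```

With this default, runs that use different inference configs write to different `${enh_exp}/${inference_tag}_${dset}` directories instead of overwriting a shared `enhanced_${dset}` one.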

BTW, it would be better to add an additional argument: pretrained_model as in egs2/TEMPLATE/asr1/asr.sh#L93, which allows initializing from a pretrained model.

Similarly this also applies to egs2/TEMPLATE/enh_st1/enh_st.sh.

OK. I've added it.

simpleoier · 2022-04-13T03:12:46Z

> I finished my second-round review, except for two s3prl scripts as I am not familiar with them.

Thanks. I think those s3prl files are fine.

ftshijt

It looks good to me on the ST side. You have my approval for that part.

sw005320 · 2022-04-18T22:23:13Z

OK for me as well.
@simpleoier, if you think this PR is ready, I’ll merge this PR.

simpleoier · 2022-04-19T11:03:35Z

Hi @sw005320, it is ready. Thanks!

Emrys365 · 2022-04-20T06:58:52Z

@simpleoier, I think we may also need to include the newly added tasks in the integration test: ci/test_integration_espnet2.sh#L78

To do that, some toy recipes need to be created in egs2/mini_an4.

sw005320 · 2022-04-20T11:44:20Z

> @simpleoier, I think we may also need to include the newly added tasks in the integration test: ci/test_integration_espnet2.sh#L78
>
> To do that, some toy recipes need to be created in egs2/mini_an4.

Good idea!

simpleoier · 2022-04-20T12:00:30Z

I'll work on the integration test. Thanks for the suggestion.
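As a rough illustration of the follow-up discussed above, a pre-check like the one below could run before the new toy recipes are exercised in CI. This is a sketch under stated assumptions: `list_toy_recipes` is a hypothetical helper, and the directory names `enh_asr1` and `enh_st1` under egs2/mini_an4 are assumed by analogy with the TEMPLATE directories; the thread only says that such toy recipes still need to be created. It does not reproduce what ci/test_integration_espnet2.sh actually does.

```shell
# Hypothetical pre-check: print the toy recipes that have a run.sh and could
# therefore be exercised by an integration test; report the rest on stderr.
list_toy_recipes() {
    local root="$1"   # e.g. egs2/mini_an4
    shift
    local task
    for task in "$@"; do
        if [ -f "${root}/${task}/run.sh" ]; then
            echo "${task}"
        else
            echo "missing: ${root}/${task}/run.sh" >&2
        fi
    done
}

# Assumed recipe names; adjust to whatever is actually created.
list_toy_recipes egs2/mini_an4 enh_asr1 enh_st1
```

Keeping the check separate from the recipe invocation makes it easy to see which of the new tasks are still uncovered.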

mergify bot added the ESPnet2 label Mar 31, 2022

sw005320 requested a review from Emrys365 March 31, 2022 17:41

sw005320 added the ASR (automatic speech recognition) and SE (speech enhancement) labels Mar 31, 2022

sw005320 added this to the v.0.10.7 milestone Mar 31, 2022

sw005320 added the Refactoring label Mar 31, 2022

simpleoier force-pushed the enh_s2t branch 5 times, most recently from 5c87878 to dec5023, April 1, 2022 03:34

mergify bot added the Installation label Apr 1, 2022

mergify bot added the conflicts label Apr 1, 2022

simpleoier force-pushed the enh_s2t branch from dec5023 to ce93b30, April 1, 2022 20:14

mergify bot removed the conflicts label Apr 1, 2022

simpleoier removed the Installation label Apr 1, 2022

simpleoier force-pushed the enh_s2t branch from ce93b30 to c995678, April 1, 2022 20:43

simpleoier added the ST (speech translation) and SLU (spoken language understanding) labels Apr 1, 2022

simpleoier force-pushed the enh_s2t branch 6 times, most recently from 8fc5f5f to 2c79166, April 3, 2022 12:34

simpleoier changed the title ~~[WIP] enh_s2t joint model~~ enh_s2t joint model Apr 3, 2022

simpleoier force-pushed the enh_s2t branch 3 times, most recently from eef0e10 to 45a202e, April 3, 2022 22:48

simpleoier force-pushed the enh_s2t branch from 212869e to 5917481, April 12, 2022 22:30

mergify bot added and then removed the conflicts label Apr 12, 2022

simpleoier force-pushed the enh_s2t branch from ec503de to b60234d, April 13, 2022 01:08

Emrys365 reviewed Apr 13, 2022

simpleoier force-pushed the enh_s2t branch from b60234d to c8848ae, April 13, 2022 03:12

simpleoier force-pushed the enh_s2t branch from c8848ae to 0d382f1, April 13, 2022 03:16

mergify bot added the ESPnet1 and Installation labels Apr 13, 2022

simpleoier force-pushed the enh_s2t branch 2 times, most recently from c71eaa4 to 8b117d7, April 13, 2022 04:12

Emrys365 reviewed Apr 13, 2022

Review comment on egs2/TEMPLATE/enh_st1/enh_st.sh (outdated)

simpleoier force-pushed the enh_s2t branch from 8b117d7 to a9b8d9f, April 13, 2022 04:37

simpleoier added 3 commits April 13, 2022 14:32

initial commit for enh_s2t joint model and template egs: egs2/TEMPLATE/{enh_asr1,enh_st1} (dd408fc)

add test scripts for enh_s2t joint models (24f3357)

Merge branch 'master' into enh_s2t (e132867)

simpleoier force-pushed the enh_s2t branch from a9b8d9f to e132867, April 13, 2022 18:33

ftshijt approved these changes Apr 17, 2022

simpleoier mentioned this pull request Apr 18, 2022

diar_enh joint model #4234

Closed

sw005320 merged commit 42eb310 into espnet:master Apr 19, 2022

Emrys365 mentioned this pull request Apr 20, 2022

Some improvements to current enh functions #4251

Merged

1 task

YushiUeda mentioned this pull request May 6, 2022

enh_diar joint model #4339

Merged

4 tasks

neillu23 mentioned this pull request Apr 22, 2023

[PRE REVIEW]: Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing openjournals/joss-reviews#4746

Closed
