enh_s2t joint model #4226
I finished my second-round review, except for the two s3prl scripts, as I am not familiar with them.
use_k2=false # Whether to use k2 based decoder
batch_size=1
inference_tag= # Suffix to the result dir for decoding.
inference_config= # Config for decoding.
After #4251 is merged, would you consider applying the same updates here as well?
It might take additional effort, though, to merge the updated espnet2/bin/enh_inference.py script into espnet2/bin/enh_s2t_inference.py.
The diff in #4251:
diff --git a/egs2/TEMPLATE/enh1/enh.sh b/egs2/TEMPLATE/enh1/enh.sh
index fcb4f324f..0afc31881 100755
--- a/egs2/TEMPLATE/enh1/enh.sh
+++ b/egs2/TEMPLATE/enh1/enh.sh
@@ -78,6 +78,8 @@ download_model=
# Evaluation related
scoring_protocol="STOI SDR SAR SIR SI_SNR"
ref_channel=0
+inference_tag= # Prefix to the result dir for ENH inference.
+inference_enh_config= # Config for enhancement.
score_with_asr=false
asr_exp="" # asr model for scoring WER
lm_exp="" # lm model for scoring WER
@@ -151,8 +153,9 @@ Options:
--init_param # pretrained model path and module name (default="${init_param}")
# Enhancement related
- --inference_args # Arguments for enhancement in the inference stage (default="${inference_args}")
- --inference_model # Enhancement model path for inference (default="${inference_model}").
+ --inference_args # Arguments for enhancement in the inference stage (default="${inference_args}")
+ --inference_model # Enhancement model path for inference (default="${inference_model}").
+ --inference_enh_config # Configuration file for overwriting some model attributes during SE inference. (default="${inference_enh_config}")
# Evaluation related
--scoring_protocol # Metrics to be used for scoring (default="${scoring_protocol}")
@@ -247,6 +250,14 @@ if [ -n "${speed_perturb_factors}" ]; then
enh_exp="${enh_exp}_sp"
fi
+if [ -z "${inference_tag}" ]; then
+ if [ -n "${inference_enh_config}" ]; then
+ inference_tag="$(basename "${inference_enh_config}" .yaml)"
+ else
+ inference_tag=enhanced
+ fi
+fi
+
# ========================== Main stages start from here. ==========================
if ! "${skip_data_prep}"; then
@@ -614,7 +625,7 @@ if ! "${skip_eval}"; then
for dset in "${valid_set}" ${test_sets}; do
_data="${data_feats}/${dset}"
- _dir="${enh_exp}/enhanced_${dset}"
+ _dir="${enh_exp}/${inference_tag}_${dset}"
_logdir="${_dir}/logdir"
mkdir -p "${_logdir}"
@@ -646,6 +657,7 @@ if ! "${skip_eval}"; then
--data_path_and_name_and_type "${_data}/${_scp},speech_mix,${_type}" \
--key_file "${_logdir}"/keys.JOB.scp \
--train_config "${enh_exp}"/config.yaml \
+ ${inference_enh_config:+--inference_config "$inference_enh_config"} \
--model_file "${enh_exp}"/"${inference_model}" \
--output_dir "${_logdir}"/output.JOB \
${_opts} ${inference_args}
@@ -686,7 +698,7 @@ if ! "${skip_eval}"; then
if "${score_obs}"; then
_dir="${data_feats}/${dset}/scoring"
else
- _dir="${enh_exp}/enhanced_${dset}/scoring"
+ _dir="${enh_exp}/${inference_tag}_${dset}/scoring"
fi
_logdir="${_dir}/logdir"
@@ -713,7 +725,7 @@ if ! "${skip_eval}"; then
# To compute the score of observation, input original wav.scp
_inf_scp+="--inf_scp ${data_feats}/${dset}/wav.scp "
else
- _inf_scp+="--inf_scp ${enh_exp}/enhanced_${dset}/spk${spk}.scp "
+ _inf_scp+="--inf_scp ${enh_exp}/${inference_tag}_${dset}/spk${spk}.scp "
fi
done
@@ -749,7 +761,7 @@ if ! "${skip_eval}"; then
./scripts/utils/show_enh_score.sh "${_dir}/../.." > "${_dir}/../../RESULTS.md"
done
log "Evaluation result for observation: ${data_feats}/RESULTS.md"
- log "Evaluation result for enhancement: ${enh_exp}/enhanced/RESULTS.md"
+ log "Evaluation result for enhancement: ${enh_exp}/RESULTS.md"
fi
else
@@ -808,7 +820,7 @@ if "${score_with_asr}"; then
# Using same wav.scp for all speakers
cp "${_data}/wav.scp" "${_ddir}/wav.scp"
else
- cp "${enh_exp}/enhanced_${dset}/scoring/wav_spk${spk}" "${_ddir}/wav.scp"
+ cp "${enh_exp}/${inference_tag}_${dset}/scoring/wav_spk${spk}" "${_ddir}/wav.scp"
fi
cp data/${dset}/text_spk${spk} ${_ddir}/text
cp ${_data}/{spk2utt,utt2spk,utt2num_samples,feats_type} ${_ddir}
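The tag-selection logic in the diff above can be tried in isolation. The following is a minimal sketch, not part of either PR: the function names `resolve_inference_tag` and `build_inference_opts` and the sample config paths are hypothetical. Only the fallback order (explicit tag, then the config basename, then `enhanced`) and the `${var:+...}` expansion are taken from the diff.

```bash
#!/usr/bin/env bash
# Hypothetical helper mirroring the fallback order in the #4251 diff:
# explicit tag > basename of the config (without .yaml) > "enhanced".
resolve_inference_tag() {
    local inference_tag="$1"
    local inference_enh_config="$2"
    if [ -z "${inference_tag}" ]; then
        if [ -n "${inference_enh_config}" ]; then
            inference_tag="$(basename "${inference_enh_config}" .yaml)"
        else
            inference_tag=enhanced
        fi
    fi
    echo "${inference_tag}"
}

# Hypothetical helper showing the ${var:+word} expansion used in the diff:
# the option pair is produced only when the config variable is non-empty.
build_inference_opts() {
    local inference_enh_config="$1"
    local opts=(${inference_enh_config:+--inference_config "$inference_enh_config"})
    echo "${opts[*]}"
}

resolve_inference_tag "" ""                     # prints: enhanced
resolve_inference_tag "" "conf/inference.yaml"  # prints: inference
resolve_inference_tag "my_tag" "conf/x.yaml"    # prints: my_tag
build_inference_opts "conf/inference.yaml"      # prints: --inference_config conf/inference.yaml
```

With this fallback, result directories from different inference configs no longer collide, because each config yields its own `${inference_tag}_${dset}` directory.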
BTW, it would be better to add an additional argument, pretrained_model, as in egs2/TEMPLATE/asr1/asr.sh#L93, which allows initializing from a pretrained model.
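One way such an option could be wired in is sketched below. This is an illustration under stated assumptions, not the code that was added to the PR: the helper name `pretrained_model_opts` and the example checkpoint path are hypothetical, and forwarding the value to the training command as `--init_param` is an assumption (the enh.sh help text in the diff above describes `--init_param` as "pretrained model path and module name").

```bash
#!/usr/bin/env bash
# Hypothetical sketch: turn an optional pretrained_model setting into
# extra training options. The variable name follows the review comment;
# forwarding it as --init_param is an assumption, not taken from the PR.
pretrained_model_opts() {
    local pretrained_model="$1"
    local opts=()
    if [ -n "${pretrained_model}" ]; then
        opts+=(--init_param "${pretrained_model}")
    fi
    echo "${opts[*]}"
}

# Empty setting: no extra options, so training starts from scratch.
pretrained_model_opts ""
# Non-empty setting (hypothetical path): the option pair is emitted.
pretrained_model_opts "exp/enh_train/valid.loss.best.pth"
```

Keeping the default empty means existing recipes behave as before unless the option is set explicitly.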
Similarly, this also applies to egs2/TEMPLATE/enh_st1/enh_st.sh.
OK. I've added it.
Thanks. I think those s3prl files are fine.
It looks good to me on the ST side. You have my approval for that part.
OK for me as well.
Hi @sw005320, it is ready. Thanks!
@simpleoier, I think we may also need to include the newly added tasks in the integration test: ci/test_integration_espnet2.sh#L78
Good idea!
I'll work on the integration test. Thanks for the suggestion.
Support Enhancement and [ASR, SLU, ST] downstream tasks.
Future PR (TODO)