ESPnet recipe for the Kinect-WSJ dataset #5711

Merged 20 commits on Mar 25, 2024

atharva253
Contributor

What?

This is a new recipe for preparing the Kinect-WSJ dataset and training speech enhancement/separation models on it.

Details about the dataset:

Kinect-WSJ is a multichannel, reverberated and noisy extension to the WSJ0-2mix dataset. It is designed to simulate challenging acoustic environments through strong reverberation and noise conditions along with the Kinect-like microphone array geometry used in CHiME-5.

Related Links

Pretrained Model (Also in README.md): https://huggingface.co/atharva253/tfgridnetv2_wsj_kinect
Dataset Repository: https://github.com/sunits/Reverberated_WSJ_2MIX

@sw005320 sw005320 added this to the v.202405 milestone Mar 20, 2024
@sw005320 sw005320 added Recipe SE Speech enhancement labels Mar 20, 2024
@sw005320 sw005320 requested a review from Emrys365 March 20, 2024 18:48
codecov bot commented Mar 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 76.60%. Comparing base (dbd73dd) to head (51d28ca).
Report is 5 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5711   +/-   ##
=======================================
  Coverage   76.60%   76.60%           
=======================================
  Files         761      761           
  Lines       69880    69880           
=======================================
  Hits        53534    53534           
  Misses      16346    16346           
Flag                          Coverage Δ
test_configuration_espnet2    ∅ <ø> (∅)
test_integration_espnet1      62.92% <ø> (ø)
test_integration_espnet2      48.84% <ø> (ø)
test_integration_espnetez     27.98% <ø> (ø)
test_python_espnet1           18.20% <ø> (ø)
test_python_espnet2           52.41% <ø> (ø)
test_python_espnetez          13.95% <ø> (ø)
test_utils                    20.91% <ø> (ø)

@sw005320
Contributor

@atharva253
Contributor Author

@sw005320 Sure, I'll fix the CI errors and get back to you.

Collaborator

@Emrys365 Emrys365 left a comment

LGTM! Could you fix the minor comments I just left?

# Put an & at the end of next line if you want run the wav creation in parallel using a single machine. It will spawn 50, 13 and 7 jobs for train dev and test
# You can also use cluster managers such as SLURM/SGE/OAR with this command
# python create_corrupted_speech.py $wsj_mix_list $start $file_end $wsj2_mix_base $noise_list $src_count $rir_yaml_list $SNR $dest_base || exit 1;
python create_corrupted_speech.py $wsj_mix_list $start $file_end $wsj2_mix_base $noise_list $src_count $rir_yaml_list $SNR $dest_base || exit 1 &
Collaborator

If you want to use parallel background jobs here, could you follow the coding style in https://github.com/espnet/espnet/blob/master/egs/wsj/asr1/run.sh#L308-L312 to add some checks to make sure all background jobs are finished without error?

Contributor Author

Added checks to make sure all background jobs are finished without error.
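
For readers following this thread, the kind of check requested here can be sketched as below. This is a minimal illustration and not the code merged in this PR: `wait_all` and `fake_job` are hypothetical names introduced only for the example. The idea is to record the PID of every background job, `wait` on each one, and fail if any of them returned a non-zero status.

```shell
#!/usr/bin/env bash

# wait_all PID...: wait for every PID; return non-zero if any job failed.
# (Hypothetical helper name, not taken from the recipe.)
wait_all() {
    local failed=0 pid
    for pid in "$@"; do
        wait "${pid}" || failed=$((failed + 1))
    done
    if [ "${failed}" -gt 0 ]; then
        echo "${failed} background job(s) failed." >&2
        return 1
    fi
}

# fake_job is a placeholder for the real per-split command
# (e.g. the create_corrupted_speech.py call); it fails only for "bad".
fake_job() { sleep 0.1; [ "$1" != "bad" ]; }

pids=()
for split in train dev test; do
    fake_job "${split}" &   # launch in the background
    pids+=("$!")            # remember its PID
done

wait_all "${pids[@]}" && echo "all jobs finished"
```

Without the explicit `wait`, a trailing `|| exit 1 &` only exits the background subshell, so the parent script would continue even if a job failed.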

Comment on lines 91 to 92
wget --continue -O $wdir/mixture_scripts.zip ${url}
unzip $wdir/mixture_scripts.zip -d $wdir
Collaborator

How about just doing git clone ... to fetch the repository into $wdir?

Contributor Author

Replaced wget with git clone in data.sh
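
A rough sketch of what a `git clone` based fetch could look like, for illustration only. The helper name `fetch_repo`, the `--depth 1` shallow clone, and the destination directory name are assumptions and may differ from what data.sh actually does; the repository URL is the one listed under Related Links.

```shell
#!/usr/bin/env bash

# fetch_repo URL DEST: clone URL into DEST unless DEST already holds a checkout.
# (Hypothetical helper; the merged data.sh may simply call git clone directly.)
fetch_repo() {
    local url=$1 dest=$2
    if [ -d "${dest}/.git" ]; then
        echo "Already cloned: ${dest}"
        return 0
    fi
    # Shallow clone: only the latest commit is needed to run the scripts.
    git clone --depth 1 "${url}" "${dest}"
}

# Usage (wdir is assumed to be set by the caller, as in the reviewed snippet):
# fetch_repo https://github.com/sunits/Reverberated_WSJ_2MIX.git "${wdir}/Reverberated_WSJ_2MIX"
```

Note that a clone creates a directory without the `-master` suffix that the unzipped archive had, so any later paths pointing into `Reverberated_WSJ_2MIX-master` would need to be adjusted accordingly.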


wget --continue -O $wdir/mixture_scripts.zip ${url}
unzip $wdir/mixture_scripts.zip -d $wdir
#chmod 700 local/create_corrupted_speech_parallel.sh $wdir/Reverberated_WSJ_2MIX-master/create_corrupted_speech.sh
Collaborator

Remove this line?

Contributor Author

The chmod line is now removed.

}

help_message=$(cat << EOF
Usage: $0
Collaborator

Could you add the documentation of the arguments below in this help message?

Contributor Author

Sure, I'll add the documentation for the arguments. Since min_or_max and sample_rate are fixed for Kinect-WSJ, data.sh takes only parallel and use_dereverb as arguments.

Contributor Author

Added the documentation for the parallel and use_dereverb arguments.
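
As an illustration of the help message discussed in this thread, a documented version might look roughly like the sketch below. Only the argument names `parallel` and `use_dereverb` come from the conversation; the default values and the description wording are assumptions, not the text that was merged.

```shell
#!/usr/bin/env bash

# Placeholder defaults (assumed; the recipe's actual defaults may differ).
parallel=false
use_dereverb=false

# Description wording below is a guess based on this thread, not the merged text.
help_message=$(cat << EOF
Usage: $0 [--parallel <true|false>] [--use_dereverb <true|false>]
  optional arguments:
    --parallel      : whether to create the wav files in parallel (default="${parallel}")
    --use_dereverb  : whether to use the dereverberated variant of the data (default="${use_dereverb}")
EOF
)

echo "${help_message}"
```

Interpolating the default variables into the heredoc keeps the help text in sync with the values defined at the top of the script.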

egs2/wsj_kinect/enh1/local/wsj_kinect_data_prep.sh (outdated, resolved)
egs2/wsj_kinect/enh1/local/wsj_kinect_data_prep.sh (outdated, resolved)
Comment on lines 41 to 43
# elif [ ! -d f ]; then
# echo "Error: $f is not a directory."
# exit 1;
Collaborator

Remove?

Contributor Author

Removed the commented portion.


# Ensure that the wav dir exists
for f in "$wavdir/$tr" "$wavdir/$cv" "$wavdir/$tt"; do
# echo "$f"
Collaborator

Remove?

Contributor Author

Removed the commented line.
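
For context, a cleaned-up directory check of the kind this loop performs could be sketched as follows. `check_dirs` is a hypothetical helper name, and the exact test and error message used in wsj_kinect_data_prep.sh may differ; the variables in the usage comment are the ones visible in the reviewed snippet.

```shell
#!/usr/bin/env bash

# check_dirs DIR...: fail with a message on the first argument that is not a directory.
# (Hypothetical helper, shown only to illustrate the check.)
check_dirs() {
    local d
    for d in "$@"; do
        if [ ! -d "${d}" ]; then
            echo "Error: ${d} is not a directory." >&2
            return 1
        fi
    done
}

# Usage, with the variables from the reviewed snippet:
# check_dirs "$wavdir/$tr" "$wavdir/$cv" "$wavdir/$tt" || exit 1
```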

egs2/wsj_kinect/enh1/run.sh (outdated, resolved)
@atharva253
Contributor Author

Thanks for your comments, @Emrys365! The parallel background jobs portion might need some time for testing after the modifications. I'll complete all the fixes over the weekend and post an update.

@sw005320 sw005320 added the auto-merge Enable auto-merge label Mar 24, 2024
@sw005320
Contributor

Thanks, @atharva253!
Once CI passes, I’ll merge this PR.

@sw005320
Contributor

It seems that the error is reproducible:
https://github.com/espnet/espnet/actions/runs/8409685020/job/23033292927?pr=5711
We should not change egs2/wsj_kinect/enh1/conf/slurm.conf from the original version; that change could be the cause.
Can you check it?

@mergify mergify bot merged commit eed7751 into espnet:master Mar 25, 2024
35 checks passed
Labels
auto-merge Enable auto-merge ESPnet2 README Recipe SE Speech enhancement
3 participants