ESPnet recipe for the Kinect-WSJ dataset #5711
…_1_D_128_batch_8.yaml
Codecov Report
All modified and coverable lines are covered by tests ✅

Coverage Diff (master vs. #5711)
            master    #5711
Coverage    76.60%    76.60%
Files          761      761
Lines        69880    69880
Hits         53534    53534
Misses       16346    16346
@atharva253, please fix the CI errors: https://github.com/espnet/espnet/actions/runs/8364289940/job/22899257857?pr=5711#step:8:605
@sw005320 Sure, I'll fix the CI errors and get back to you.
LGTM! Could you fix the minor comments I just left?
# Put an & at the end of the next line if you want to run the wav creation in parallel on a single machine. It will spawn 50, 13 and 7 jobs for train, dev and test
# You can also use cluster managers such as SLURM/SGE/OAR with this command
# python create_corrupted_speech.py $wsj_mix_list $start $file_end $wsj2_mix_base $noise_list $src_count $rir_yaml_list $SNR $dest_base || exit 1;
python create_corrupted_speech.py $wsj_mix_list $start $file_end $wsj2_mix_base $noise_list $src_count $rir_yaml_list $SNR $dest_base || exit 1 &
If you want to use parallel background jobs here, could you follow the coding style in https://github.com/espnet/espnet/blob/master/egs/wsj/asr1/run.sh#L308-L312 to add some checks to make sure all background jobs are finished without error?
Added checks to make sure all background jobs are finished without error.
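The kind of check requested here can be sketched as follows. This is a minimal illustration, not the code merged in the PR: `run_job` is a hypothetical stand-in for one `create_corrupted_speech.py` invocation, and the PIDs are kept in a plain string so the snippet also runs under a POSIX shell (a bash recipe would more likely collect them in an array).

```bash
# Hypothetical stand-in for one background job; job 2 fails on purpose
# so that the failure check below has something to detect.
run_job() {
    [ "$1" -ne 2 ]
}

pids=""
for n in 1 2 3; do
    run_job "${n}" &
    pids="${pids} $!"   # remember the PID of each background job
done

# Wait on every PID individually so that each exit status is inspected.
failed=0
for pid in ${pids}; do
    wait "${pid}" || failed=$((failed + 1))
done

if [ "${failed}" -gt 0 ]; then
    echo "${failed} background job(s) failed."
fi
```

In a real recipe the final branch would typically end with `exit 1` so that later stages do not run on incomplete data. A bare `wait` with no arguments is not enough, because it does not report the exit status of the individual jobs.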
egs2/wsj_kinect/enh1/local/data.sh
Outdated
wget --continue -O $wdir/mixture_scripts.zip ${url}
unzip $wdir/mixture_scripts.zip -d $wdir
How about just doing `git clone ...` to fetch the repository into `$wdir`?
Replaced `wget` with `git clone` in data.sh.
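A clone-based fetch could look roughly like the sketch below. `fetch_repo` is a hypothetical helper introduced only for illustration, and the destination directory name in the example call is an assumption; the actual change in data.sh may differ.

```bash
# fetch_repo is a hypothetical helper, not part of the recipe.
# It clones repo_url into dest unless dest already holds a git checkout,
# so the data preparation stage can be rerun without cloning again.
fetch_repo() {
    repo_url=$1
    dest=$2
    if [ -d "${dest}/.git" ]; then
        echo "Already cloned: ${dest}"
    else
        git clone "${repo_url}" "${dest}"
    fi
}

# Example call (needs network access; the directory name is an assumption):
# fetch_repo https://github.com/sunits/Reverberated_WSJ_2MIX "${wdir}/Reverberated_WSJ_2MIX"
```

Compared with `wget` plus `unzip`, this avoids the intermediate zip file and the `-master` suffix on the extracted directory.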
egs2/wsj_kinect/enh1/local/data.sh
Outdated

wget --continue -O $wdir/mixture_scripts.zip ${url}
unzip $wdir/mixture_scripts.zip -d $wdir
#chmod 700 local/create_corrupted_speech_parallel.sh $wdir/Reverberated_WSJ_2MIX-master/create_corrupted_speech.sh
Remove this line?
The `chmod` line is now removed.
egs2/wsj_kinect/enh1/local/data.sh
Outdated
}

help_message=$(cat << EOF
Usage: $0
Could you add the documentation of the arguments below in this help message?
Sure, I'll add the documentation for the arguments. Since `min_or_max` and `sample_rate` are fixed for Kinect-WSJ, data.sh only has `parallel` and `use_dereverb` as arguments.
Added the documentation for the `parallel` and `use_dereverb` arguments.
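For reference, a documented help message for these two options might look like the sketch below. The option names come from this thread; the default values and the one-line descriptions are assumptions made for illustration and may not match the wording in the merged data.sh.

```bash
# Defaults are assumptions for this sketch.
parallel=false
use_dereverb=false

# Descriptions below are guesses at the intent of each option.
help_message=$(cat << EOF
Usage: $0 [--parallel <true/false>] [--use_dereverb <true/false>]
  optional arguments:
    --parallel: run the wav creation jobs in parallel (default=${parallel})
    --use_dereverb: use dereverberated reference signals (default=${use_dereverb})
EOF
)

echo "${help_message}"
```

Embedding the current defaults in the message keeps the help text in sync with the variable definitions at the top of the script.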
# elif [ ! -d f ]; then
# echo "Error: $f is not a directory."
# exit 1;
Remove?
Removed the commented portion.

# Ensure that the wav dir exists
for f in "$wavdir/$tr" "$wavdir/$cv" "$wavdir/$tt"; do
# echo "$f"
Remove?
Removed the commented line.
Thanks for your comments @Emrys365! The parallel background jobs portion might need some time for testing after the modifications. I'll complete all the fixes over the weekend and post an update here.
Co-authored-by: Wangyou Zhang <C0me_On@163.com>
Thanks, @atharva253!
It seems that the error is reproducible.
What?
This is a new recipe for preparing the Kinect-WSJ dataset and training speech enhancement/separation models on it.
Details about the dataset:
Kinect-WSJ is a multichannel, reverberated and noisy extension to the WSJ0-2mix dataset. It is designed to simulate challenging acoustic environments through strong reverberation and noise conditions along with the Kinect-like microphone array geometry used in CHiME-5.
Related Links
Pretrained Model (Also in README.md): https://huggingface.co/atharva253/tfgridnetv2_wsj_kinect
Dataset Repository: https://github.com/sunits/Reverberated_WSJ_2MIX