Bugs in generating wsj0-2mix dataset #164

Closed
hangtingchen opened this issue Jul 9, 2020 · 2 comments
Labels: bug, help wanted


🐛 Bug

The conversion script should use wv1 instead of wv2 to generate the wav files.

To Reproduce

https://github.com/mpariente/asteroid/blob/0bdec2644f2d770d037ce804b7f70cb98bd5c9fa/egs/wsj0-mix/DeepClustering/local/convert_sphere2wav.sh#L31

The line converts both wv1 and wv2 files to the same wav path, so the wav generated from wv1 is overwritten by the one from wv2 and the final wav files all come from wv2. The wv1 recordings are noise-free while the wv2 recordings are noisy, and the wsj0-mix dataset is expected to use wv1.

The correct code is
wav=`echo "$line" | sed "s:wv1:wav:g" | awk -v dir=$wav_dir -F'/' '{printf("%s/%s/%s/%s", dir, $(NF-2), $(NF-1), $NF)}'`
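
To make the overwrite concrete, here is a minimal sketch that runs the mapping on two made-up paths. The `map_both` function is only my assumption of what the buggy line does (rewriting both extensions to wav); the function names, `wav_dir` value, and input paths are hypothetical and not taken from the repository.

```shell
#!/bin/sh
# Hypothetical output root, for illustration only.
wav_dir=/tmp/wsj0_wav

# Corrected mapping from this issue: only wv1 is rewritten to wav.
map_wv1_only() {
  echo "$1" | sed "s:wv1:wav:g" \
    | awk -v dir="$wav_dir" -F'/' '{printf("%s/%s/%s/%s", dir, $(NF-2), $(NF-1), $NF)}'
}

# Assumed shape of the buggy mapping: wv1 and wv2 are both rewritten to wav.
map_both() {
  echo "$1" | sed "s:wv1:wav:g" | sed "s:wv2:wav:g" \
    | awk -v dir="$wav_dir" -F'/' '{printf("%s/%s/%s/%s", dir, $(NF-2), $(NF-1), $NF)}'
}

# Hypothetical WSJ0-style input paths for the same utterance.
p1=/data/wsj0/si_tr_s/01t/01to030v.wv1
p2=/data/wsj0/si_tr_s/01t/01to030v.wv2

both1=$(map_both "$p1")
both2=$(map_both "$p2")
only1=$(map_wv1_only "$p1")
only2=$(map_wv1_only "$p2")

# With both substitutions the two inputs share one output path,
# so whichever file is converted last wins.
echo "both:     $both1 | $both2"
# With the wv1-only substitution the wv2 path keeps its extension
# and no longer collides with the wv1 output.
echo "wv1 only: $only1 | $only2"
```

Whether the surrounding script still iterates over the wv2 files depends on how the file list is built, which is not shown here.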

Expected behavior

We tested the datasets generated from wv1 and from wv2. The wv1 dataset reproduces the reported results, while the wv2 dataset is about 1-2 dB worse in SI-SNR.

With wv1, the final validation loss was about 2950.

Unfortunately our results with wv2 were deleted, but the final validation loss was about 3500.

Environment

  • Asteroid-master
  • PyTorch 1.4.0
  • PyTorchLightning 7.6.1

mpariente commented Jul 9, 2020

I think this is right and has been reported elsewhere as well.
I think I only had wv1 when I generated wsj0 in the first place so I didn't notice the problem.
I checked now and you're right.

Would you like to submit a PR for that please?
Thanks again !

By the way, I'm very happy you can finally reproduce the results.

@mpariente

Closed by #166
