Some files in the dev_testset_noisy_testclips seem to be identical #70
Comments
@zhaoyj1122 thanks for letting us know! I'll check for the duplicates and get back to you with the update soon.
@zhaoyj1122 yes, we indeed have ten identical files in the test dataset. The files are at
I will clean up the dataset and update the archive on Azure and close this issue afterwards. Meanwhile, please feel free to remove any 9 out of 10 clips in the list. I'll also check our other datasets for the duplicates. Stay tuned!
looks like
Got it! Thanks a lot for that.
I've updated the archive on Azure so I think I can close this issue for now. Please open another one if you see duplicates in our other datasets, and I'll do the same. Thanks for your help and good luck!
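For anyone who wants to check a local copy of a dataset for byte-identical clips, here is a minimal sketch. It is not the maintainers' script. The function name `find_duplicate_files` and the `*.wav` pattern are assumptions made for illustration. It groups files by the SHA-256 digest of their contents, so it only catches exact copies, not clips that merely sound the same.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_files(root, pattern="*.wav"):
    """Return lists of paths under `root` whose file contents are byte-identical.

    Hypothetical helper: `pattern` defaults to "*.wav" on the assumption
    that the clips are WAV files.
    """
    by_digest = defaultdict(list)
    # Sort so the order inside each group is stable from run to run.
    for path in sorted(Path(root).rglob(pattern)):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in by_digest.values() if len(paths) > 1]


if __name__ == "__main__":
    # Point this at your local copy of the test clips.
    for group in find_duplicate_files("dev_testset_noisy_testclips"):
        print("identical files:")
        for p in group:
            print("  ", p)
```

Within each reported group you can keep one file and delete the rest.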
For example: