Handy ASR noise dataset

A handy dataset for noise augmentations for ASR / TTS:

~20k noise files;
~200 distinct categories;

Contact us! Open issues, collaborate, submit a PR, contribute, share your datasets!

Contribution ideas

Add much more data from the BBC Sound Effects dataset.

Download links

Meta data file / 2.0M / 73cb528656a484b20e02d6c5fd05f14c
Noise archive file / 4.7G / 5e069c867a0da891f57616905129b6c3
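The 32-character hex strings listed next to each download look like MD5 digests. That is an assumption based on their length, since the README does not name the hash algorithm. Under that assumption, a downloaded file could be checked with a minimal sketch like the one below; `md5sum` and `verify_download` are hypothetical helper names, and the file name in the usage comment is a placeholder.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hash):
    """Compare a file's MD5 digest with the expected hex string (case-insensitive)."""
    return md5sum(path) == expected_hash.lower()


# Usage (placeholder file name, assuming the listed hash is an MD5 digest):
# verify_download("noise_archive_file", "5e069c867a0da891f57616905129b6c3")
```

Reading in chunks keeps memory use flat, which matters for the 4.7G archive.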

Open feather file:

import pandas as pd

# file_path: path to the downloaded feather file
df = pd.read_feather(file_path)
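Since the dataset is meant for noise augmentation, a typical use is mixing a noise clip into a speech signal at a chosen signal-to-noise ratio (SNR). The sketch below is one common way to do that, not necessarily the authors' pipeline. It assumes both signals are already loaded as 1-D NumPy arrays at the same sampling rate; `fit_length` and `mix_at_snr` are hypothetical helper names.

```python
import numpy as np


def fit_length(noise, length):
    """Loop or crop a noise clip so that it has exactly `length` samples."""
    if len(noise) == 0:
        raise ValueError("noise clip is empty")
    repeats = int(np.ceil(length / len(noise)))
    return np.tile(noise, repeats)[:length]


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the speech-to-noise power ratio equals snr_db."""
    speech = np.asarray(speech, dtype=np.float64)
    noise = fit_length(np.asarray(noise, dtype=np.float64), len(speech))
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if noise_power == 0:
        return speech  # silent noise clip: nothing to add
    # Choose the scale so that 10 * log10(speech_power / scaled_noise_power) == snr_db
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise
```

The result is a float array; it may need clipping or rescaling before being written back as 16-bit integers.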

Data preparation

The dataset is compiled from open-domain sources. Items whose labels resemble loud human speech were removed (background noise, e.g. street chatter, was kept). All items are at most 60 seconds long.

All files are normalized as follows:

Converted to mono, if necessary;
Converted to 16 kHz sampling rate, if necessary;
Stored as 16-bit integers;
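The three normalization steps above can be sketched as follows, for anyone who wants to bring their own noise files into the same format. This is an illustration, not the authors' conversion script. It assumes the audio is already decoded to a float array in [-1, 1], and it uses plain linear interpolation for resampling to stay NumPy-only. Linear interpolation does no anti-alias filtering, so a proper polyphase resampler would be preferable in practice. `normalize_audio` is a hypothetical name.

```python
import numpy as np

TARGET_RATE = 16_000  # Hz, as stated in the list above


def normalize_audio(samples, rate):
    """Convert float audio in [-1, 1], shape (n,) or (n, channels), to mono 16 kHz int16."""
    samples = np.asarray(samples, dtype=np.float64)
    if samples.ndim == 2:
        samples = samples.mean(axis=1)  # downmix to mono by averaging channels
    if rate != TARGET_RATE:
        n_out = int(round(len(samples) * TARGET_RATE / rate))
        old_t = np.arange(len(samples)) / rate
        new_t = np.arange(n_out) / TARGET_RATE
        # Naive linear-interpolation resampling (no anti-alias filter)
        samples = np.interp(new_t, old_t, samples)
    samples = np.clip(samples, -1.0, 1.0)
    return (samples * 32767).astype(np.int16)  # store as 16-bit integers
```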

Contacts

Please contact us here or just create a GitHub issue!

License

cc-by

References / citations / licenses

Links / license

rnnoise / CC0;
acoustic events / if you end up using the dataset, we ask you to cite the following paper;
urban sounds / cc-by-nc;
esc-50 / license (cc-by-nc);
freiburg-106 / ?;
sound-events / ?;
BBC Sound Effects (a small part) / license;
nar dataset / the data are freely accessible for scientific research purposes and for non-commercial applications

Paper citations:

Naoya Takahashi, Michael Gygli, Beat Pfister and Luc Van Gool, "Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Recognition", Proc. Interspeech 2016, San Francisco;
J. Salamon, C. Jacoby and J. P. Bello, "A Dataset and Taxonomy for Urban Sound Research", 22nd ACM International Conference on Multimedia, Orlando USA, Nov. 2014;

Donations

Donate (each coffee pays for several full downloads) / use our DO referral link to help.
