Complete Video Tutorial: https://youtu.be/eA7G9IjN8Xk
This dataset contains 8732 labeled sound excerpts (at most 4 s each) of urban sounds from 10 classes.
Download link: https://datahack.analyticsvidhya.com/contest/practice-problem-urban-sound-classification/
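Because the excerpts are at most 4 s long, clip lengths vary, while many classifiers expect fixed-size input. One common way to handle this is to zero-pad short clips and trim long ones. The following is a minimal sketch, assuming each clip has already been decoded into a sequence of float samples; the helper name `pad_or_trim`, the 4-second target, and the toy sample rate are illustrative assumptions, not something the tutorial prescribes.

```python
def pad_or_trim(samples, sample_rate, duration_s=4.0):
    """Return a new list that is exactly `duration_s` seconds long.

    Hypothetical helper: clips longer than the target are cut off at the end,
    shorter clips are zero-padded at the end.
    """
    target_len = int(round(sample_rate * duration_s))
    clip = list(samples[:target_len])               # trim anything past the target
    clip.extend([0.0] * (target_len - len(clip)))   # zero-pad if the clip is short
    return clip


# Toy example: a 1-second "clip" at a made-up 8 Hz sample rate.
short_clip = [0.5] * 8
fixed = pad_or_trim(short_clip, sample_rate=8)
print(len(fixed))  # 32 samples, i.e. 4 s at 8 Hz
```

With real audio the same idea applies to the array returned by whatever audio loader is used, with `sample_rate` set to the rate of the decoded signal.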
Publicly available datasets for environmental sound classification:
UrbanSound8K: This is a dataset of urban sounds that contains 8,732 labeled sound clips from ten classes: air conditioner, car horn, children playing, dog bark, drilling, engine idling, gunshot, jackhammer, siren, and street music.
ESC-50: This is a dataset of environmental sounds that contains 2,000 labeled sound clips from 50 classes, grouped into five categories: animals, natural soundscapes and water sounds, human non-speech sounds, interior/domestic sounds, and exterior/urban noises.
AudioSet: This is a large-scale dataset of labeled audio events that contains over 2 million audio clips from over 600 classes, including environmental sounds.
DCASE 2019 Task 1: This is an acoustic scene classification dataset of real-life audio recordings that contains over 20,000 labeled sound clips from ten scene classes, such as airport, shopping mall, metro station, and urban park.
FSD: This is a dataset of environmental sounds that contains over 41,000 sound clips from 101 classes, including animal sounds, nature sounds, and urban sounds.
Here are a few more publicly available datasets for speech (human language) audio classification:
Speech Commands: https://www.tensorflow.org/datasets/catalog/speech_commands
M-AILABS Speech Dataset: https://www.caito.de/2019/01/the-m-ailabs-speech-dataset/
LibriSpeech: http://www.openslr.org/12/
VoxForge: http://www.voxforge.org/
Free Spoken Digit Dataset: https://github.com/Jakobovski/free-spoken-digit-dataset
Accuracy: 80.00%
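
The notes do not say which model or train/test split produced this figure. For reference, classification accuracy is the fraction of clips whose predicted class matches the true class. The sketch below uses a hypothetical `accuracy` helper and made-up labels purely to show the arithmetic; it does not reproduce the reported result.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one label")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels for illustration only: 3 of 4 predictions are correct.
y_true = ["siren", "dog_bark", "drilling", "car_horn"]
y_pred = ["siren", "dog_bark", "jackhammer", "car_horn"]
print(f"Accuracy: {accuracy(y_true, y_pred):.2%}")  # Accuracy: 75.00%
```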