
Google Audio Set classification with Keras and TensorFlow

Audio Set [1] is a large-scale, weakly labelled dataset containing over 2 million 10-second audio clips across 527 classes, published by Google in 2017.

This codebase is an implementation of [2, 3], where attention neural networks are proposed for Audio Set classification, achieving a mean average precision (mAP) of 0.360.
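For orientation, the sketch below illustrates the single-level attention pooling idea from [2] in Keras: each clip is a sequence of T=10 embedding frames, a shared dense layer produces per-frame class probabilities and per-frame attention scores, and the clip-level prediction is the attention-weighted average over time. The layer width and other hyperparameters are assumptions for illustration, not the repository's exact configuration.

import tensorflow as tf
from tensorflow.keras import layers, Model

T, D, C = 10, 128, 527  # frames per clip, embedding size, number of AudioSet classes

frames = layers.Input(shape=(T, D))                # bottleneck-feature frames for one clip
h = layers.Dense(512, activation='relu')(frames)   # shared hidden layer (width is an assumption)
cla = layers.Dense(C, activation='sigmoid')(h)     # per-frame class probabilities
att = layers.Dense(C, activation='softmax')(h)     # positive per-frame attention scores

def attention_pool(tensors):
    cla, att = tensors
    att = tf.clip_by_value(att, 1e-7, 1.0)
    norm_att = att / tf.reduce_sum(att, axis=1, keepdims=True)  # normalize over time
    return tf.reduce_sum(cla * norm_att, axis=1)                # clip-level probabilities, shape (batch, C)

clip_prob = layers.Lambda(attention_pool)([cla, att])
model = Model(frames, clip_prob)
model.compile(optimizer='adam', loss='binary_crossentropy')

The multi-level variant in [3] stacks several such attention modules and concatenates their outputs; the single-level version above is only meant to show the pooling mechanism.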

Download dataset

Audioset.
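Google distributes the dataset as TFRecord files of 128-dimensional, 8-bit quantized VGGish embeddings (bal_train, unbal_train and eval splits). The sketch below shows one way to decode them with tf.data; the dequantization scale and the directory layout are assumptions for illustration and are not taken from this repository.

import tensorflow as tf

def parse_sequence_example(serialized):
    context, sequence = tf.io.parse_single_sequence_example(
        serialized,
        context_features={
            'video_id': tf.io.FixedLenFeature([], tf.string),
            'labels': tf.io.VarLenFeature(tf.int64),
        },
        sequence_features={
            'audio_embedding': tf.io.FixedLenSequenceFeature([], tf.string),
        })
    # Each frame is a 128-byte string of uint8; map back to an approximate float range.
    embedding = tf.io.decode_raw(sequence['audio_embedding'], tf.uint8)
    embedding = tf.cast(embedding, tf.float32) / 255.0 * 4.0 - 2.0  # assumed dequantization
    labels = tf.sparse.to_dense(context['labels'])
    return embedding, labels

files = tf.io.gfile.glob('bal_train/*.tfrecord')      # assumed local path to the downloaded split
dataset = tf.data.TFRecordDataset(files).map(parse_sequence_example)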

Run

Users may optionally choose TensorFlow in runme.sh to run the code.

./runme.sh

Results

Mean average precision (mAP) of different models.

References

[1] Gemmeke, Jort F., et al. "Audio Set: An ontology and human-labeled dataset for audio events." Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017.

[2] Kong, Qiuqiang, et al. "Audio Set classification with attention model: A probabilistic perspective." arXiv preprint arXiv:1711.00927 (2017).

[3] Yu, Changsong, et al. "Multi-level Attention Model for Weakly Supervised Audio Classification." arXiv preprint arXiv:1803.02353 (2018).

External links

The original implementation of [3] was created by Changsong Yu: https://github.com/ChangsongYu/Eusipco2018_Google_AudioSet

Contact

Bin Wang (wang.bin # gmx.com)
