DEEP Open Catalogue: Audio classifier
Author/Maintainer: Ignacio Heredia (CSIC)
Project: This work is part of the DEEP Hybrid-DataCloud project that has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 777435.
This is a plug-and-play tool to perform audio classification with Deep Learning. It allows users to classify their own audio samples as well as train their own classifier for a custom problem. The classifier is currently pretrained on the 527 high-level classes from the AudioSet dataset.
You can find more information about it in the DEEP Marketplace.
Installing this module
- TensorFlow>=1.12.0 must be installed (either the GPU or the CPU build). It is not listed in requirements.txt because listing it breaks GPU support.
- This project has been tested in Ubuntu 18.04 with Python 3.6.5. Further package requirements are described in the
To start using this framework, install it and download the default weights:
git clone https://github.com/deephdc/audio-classification-tf
cd audio-classification-tf
pip install -e .
curl -o ./models/default.tar.gz https://cephrgw01.ifca.es:8080/swift/v1/audio-classification-tf/default.tar.gz
cd models && tar -zxvf default.tar.gz && rm default.tar.gz
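Because TensorFlow is not pinned in requirements.txt, pip will not complain if it is missing or too old. The following is a minimal sketch of a manual check against the stated minimum of 1.12.0. The helper names (version_tuple, meets_minimum, installed_tf_version) are illustrative and are not part of this package.

```python
import re

MIN_TF_VERSION = "1.12.0"  # minimum version stated in this README


def version_tuple(version):
    """Extract (major, minor, patch) as ints from a string such as '1.13.0rc1'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: {!r}".format(version))
    return tuple(int(part) for part in match.groups())


def meets_minimum(version, minimum=MIN_TF_VERSION):
    """Compare numerically, so that '1.9.0' is correctly older than '1.12.0'."""
    return version_tuple(version) >= version_tuple(minimum)


def installed_tf_version():
    """Return the installed TensorFlow version string, or None if it is absent."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.__version__
```

Calling meets_minimum(installed_tf_version()) after confirming the result is not None tells you whether the installed build satisfies the requirement. It says nothing about whether that build has GPU support.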
To use this module with an API you have to install the DEEPaaS package (temporarily, until the 1.0 release, you will have to use the
git clone -b test-args https://github.com/indigo-dc/deepaas
cd deepaas
pip install -e .
deepaas-run --listen-ip 0.0.0.0. Now open http://0.0.0.0:5000/ and look for the methods belonging to the
We have also prepared a ready-to-use Docker container to run this module. To run it:
docker search deephdc
docker run -ti -p 5000:5000 deephdc/deep-oc-audio-classification-tf
Now open http://0.0.0.0:5000/ and look for the methods belonging to the
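The API may take a few seconds to come up after deepaas-run or docker run is launched. The sketch below polls port 5000 until it accepts TCP connections, which can be handy in scripts. It only checks that something is listening, not that the DEEPaaS API itself is healthy. The function names and the 127.0.0.1 default host are assumptions made for illustration.

```python
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections and timeouts
        return False


def wait_for_port(host="127.0.0.1", port=5000, retries=30, delay=1.0):
    """Poll host:port up to `retries` times, sleeping `delay` seconds between attempts."""
    for _ in range(retries):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False
```

For example, wait_for_port() returns True as soon as the port published by the container accepts connections, and False if it is still closed after roughly 30 seconds.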
Train an audio classifier [in progress]
You can train your own audio classifier with your custom dataset. For that you have to:
Test an audio classifier
There are two possible ways to use the PREDICT method from the DEEPaaS API:
- supply to the data argument a path pointing to a (signed 16-bit PCM) wav file containing your audio.
- supply to the url argument an online URL pointing to a (signed 16-bit PCM) wav file containing your audio. Here is an example of such a URL that you can use for testing purposes.
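Since both options expect a signed 16-bit PCM wav file, it can save a failed request to check a local file before submitting it. The sketch below uses Python's standard wave module, which reads PCM-encoded wav files and reports the sample width in bytes. The function name check_wav_is_pcm16 is hypothetical and is not part of this package or of DEEPaaS.

```python
import wave


def check_wav_is_pcm16(path):
    """Return (ok, message) describing whether `path` is a 16-bit PCM wav file."""
    try:
        with wave.open(path, "rb") as wav:
            sample_width = wav.getsampwidth()  # bytes per sample
            frame_rate = wav.getframerate()
            channels = wav.getnchannels()
    except (wave.Error, EOFError) as exc:
        # Raised for non-wav data and for encodings the wave module cannot read.
        return False, "not a readable PCM wav file: {}".format(exc)
    if sample_width != 2:
        return False, "expected 16-bit samples, found {}-bit".format(8 * sample_width)
    return True, "16-bit PCM, {} Hz, {} channel(s)".format(frame_rate, channels)
```

In the wav format, 16-bit PCM samples are signed, so a sample width of 2 bytes is the property to check. Files that fail can be converted with a tool of your choice before being passed to the data argument.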
The code in this project is based on the original repo by IBM, and implements the paper 'Multi-level Attention Model for Weakly Supervised Audio Classification' by Yu et al.
The main changes with respect to the original repo are that:
- we have added a training method so that users can create their own custom classifier.
- the code has been packaged into an installable Python package.
- it has been made compatible with the DEEPaaS API.
If you find this project useful, please consider citing any of the references below:
Jort F. Gemmeke, Daniel P. W. Ellis, Dylan Freedman, Aren Jansen, Wade Lawrence, R. Channing Moore, Manoj Plakal, Marvin Ritter, "Audio set: An ontology and human-labeled dataset for audio events", IEEE ICASSP, 2017.
Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley, "Audio Set classification with attention model: A probabilistic perspective", arXiv preprint arXiv:1711.00927, 2017.
Changsong Yu, Karim Said Barsim, Qiuqiang Kong, Bin Yang, "Multi-level Attention Model for Weakly Supervised Audio Classification", arXiv preprint arXiv:1803.02353, 2018.
S. Hershey, S. Chaudhuri, D. P. W. Ellis, J. F. Gemmeke, A. Jansen, R. C. Moore, M. Plakal, D. Platt, R. A. Saurous, B. Seybold et al., "CNN architectures for large-scale audio classification", arXiv preprint arXiv:1609.09430, 2016.