An Environmental Sound Classifier
(Currently works on desktop Firefox and mobile Safari on iOS!)
I wrote a bit about Whisp, the ESC-50 dataset, training an environmental sound classifier, and some insights I had while testing it in the field, on my blog, A Quiet Life.
This project comprises three main parts:
Spectrogram generation notebook, which shows you how to make spectrograms from the ESC-50 dataset. It also shows you how to make gifs, for fun ;)
Learner notebook, which shows how to build our classification model with fastai
Web app, which lets you predict sound classifications with our model!
It's probably a good idea to fork this repo, as you may end up working on different machines.
brew install conda
Create a new conda environment:
conda create -n whisp python=3.6
Activate your environment:
conda activate whisp
Clone the repo, move into the whisp directory, and install the required libraries:
git clone https://github.com/aquietlife/whisp
cd whisp
pip install -r requirements.txt
Finally, make sure to get the ESC-50 dataset:
curl -LO https://github.com/karoldvl/ESC-50/archive/master.zip
Then you should be good to go!
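If you would rather unpack the archive from Python than by hand, here is a minimal sketch using only the standard library. It assumes the file downloaded by the curl command above is named master.zip; the `extract_wavs` helper and the destination folder are made up for illustration, so check the notebooks for the path they actually expect.

```python
import zipfile
from pathlib import Path


def extract_wavs(archive_path, dest_dir):
    """Extract only the .wav members of a zip archive and return their paths.

    Assumption: archive_path points at the master.zip fetched with curl.
    dest_dir is a hypothetical output folder, not one the notebooks require.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            # Skip directories, metadata, and anything that is not audio.
            if name.lower().endswith(".wav"):
                archive.extract(name, dest)
                extracted.append(dest / name)
    return sorted(extracted)
```

Calling `extract_wavs("master.zip", "data")` would keep the folder layout from the archive underneath `data/`.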
Start up Jupyter to play in the notebooks:
All of our pretraining data munging can be found here:
Walk through the notebook, which guides you through creating all the spectrogram data needed for the learner notebook, as well as some bonus code for generating gifs like the ones above.
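To give a feel for what a spectrogram is before you open the notebook, here is a dependency-free sketch: slice the signal into overlapping frames, apply a window, and take the magnitude of each frame's Fourier transform. This is not the notebook's code. The notebook may well use an audio library and a mel frequency scale, and the function names and frame sizes below are assumptions chosen to keep the example small. The naive DFT is slow, so treat it as an illustration only.

```python
import cmath
import math


def hann_window(size):
    """Symmetric Hann window, which tapers each frame to reduce spectral leakage."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (size - 1)) for n in range(size)]


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of magnitudes (bins 0..frame_size // 2) per frame."""
    window = hann_window(frame_size)
    # Precompute the DFT twiddle factors once; every frame reuses them.
    twiddle = [cmath.exp(-2j * math.pi * i / frame_size) for i in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            total = sum(frame[n] * twiddle[(k * n) % frame_size] for n in range(frame_size))
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


def sine(freq_hz, sample_rate, count):
    """Generate `count` samples of a sine wave, used here as a test signal."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(count)]
```

For a 1000 Hz sine sampled at 8000 Hz with 64-sample frames, the energy should peak in bin 1000 * 64 / 8000 = 8 of every frame. Plotting the frames as columns of an image, with brightness for magnitude, gives the familiar spectrogram picture.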
Whisp is trained on the ESC-50 dataset
The paper on this dataset is short and fun to read :)
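One handy detail of ESC-50 is that each clip's file name encodes its metadata as fold, clip ID, take, and target class, separated by hyphens. Here is a small sketch of parsing that convention; the `Esc50Clip` type and `parse_esc50_name` function are hypothetical helpers, not part of this repo or the dataset tooling.

```python
from typing import NamedTuple


class Esc50Clip(NamedTuple):
    fold: int      # cross-validation fold the clip belongs to
    clip_id: str   # ID of the original source recording
    take: str      # letter distinguishing fragments of the same source
    target: int    # numeric class label


def parse_esc50_name(filename):
    """Parse an ESC-50 style name such as '1-100032-A-0.wav'."""
    base = filename.rsplit("/", 1)[-1]
    if not base.endswith(".wav"):
        raise ValueError(f"not a .wav file: {filename}")
    fold, clip_id, take, target = base[: -len(".wav")].split("-")
    return Esc50Clip(int(fold), clip_id, take, int(target))
```

Keeping the fold around is useful if you want a validation split that follows the folds the dataset ships with.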
You can train our environmental sound classifier with this notebook:
Before running this notebook, please make sure you have generated the spectrograms from the Spectrogram Generator notebook.
SSH into your Paperspace machine, clone this repo, and then go through the Spectrogram Generator notebook above to generate your spectrograms for learning.
Follow the Installation instructions above, but when you get to starting Jupyter notebook, use this command instead:
jupyter notebook --no-browser --port=8889 --NotebookApp.allow_remote_access=True
Open up another tab and ssh into your machine again, like so:
ssh -N -L localhost:8888:localhost:8889 firstname.lastname@example.org
The command appears to hang after you enter your password, but that is expected: with -N, ssh opens the tunnel without running a remote command.
Back in the first tab, copy and paste the Jupyter notebook URL, but change 8889 to 8888 like so:
From there you should be able to run all the notebooks, so start with the Spectrogram Generator notebook to create your spectrograms for training :)
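The port swap works because the ssh tunnel forwards local port 8888 to port 8889 on the remote machine. If you do this often, the URL rewrite can be sketched in Python as below. The `tunnel_url` name is made up for illustration, and the token in the usage example is a placeholder, not a real Jupyter token.

```python
from urllib.parse import urlsplit, urlunsplit


def tunnel_url(remote_url, local_port=8888):
    """Swap the port in a Jupyter URL for the local end of the ssh tunnel."""
    parts = urlsplit(remote_url)
    host = parts.hostname or "localhost"
    # Rebuild the URL with the same host, path, and token, but the local port.
    return urlunsplit(
        (parts.scheme, f"{host}:{local_port}", parts.path, parts.query, parts.fragment)
    )
```

For example, `tunnel_url("http://localhost:8889/?token=abc123")` gives back the same URL on port 8888, with the path and token query untouched.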
After creating your spectrograms, you can run through the learner notebook.
At the end of our second set of training, we get the following results:
An error rate of 0.122500, or 87.75% accuracy!
At the end of the notebook, we export our model as export.pkl in the app/model directory to be used in our web app ^_^
If running the app fails with an error about 'installing Python as a framework', try running:
conda install matplotlib
To run the web server, run:
python app/server.py serve
The app is served at
You can test uploading files from the field_recordings directory, which has three 5-second recordings that I made.
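If you want to try your own recordings, it may help to check that they are about as long as the 5-second clips mentioned above. Here is a minimal sketch using Python's built-in wave module. It assumes your files are uncompressed PCM WAV (the wave module cannot read other formats), and `wav_duration_seconds` is a hypothetical helper, not something the app provides.

```python
import wave


def wav_duration_seconds(path):
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        # Duration is the number of sample frames divided by frames per second.
        return wav.getnframes() / wav.getframerate()
```

A clip that is much shorter or longer than 5 seconds may produce a spectrogram with a different shape from the training data, so trimming or padding it first is probably worthwhile.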
The repo is set up to deploy easily to any number of web hosting services that support Docker.
I ended up going with Render, but you can use whatever works for you :)
You can create movies from the spectrograms with the following command:
ffmpeg -i animated.gif -movflags faststart -pix_fmt yuv420p -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" video.mp4
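The scale filter in that command rounds the width and height down to the nearest even number, because the yuv420p pixel format needs even dimensions. If you have a whole folder of gifs, a Python wrapper along these lines may save some typing. The function names are made up for illustration, and running `convert` assumes ffmpeg is installed and on your PATH.

```python
import subprocess


def even(n):
    """Round down to the nearest even number, like trunc(n/2)*2 in the filter."""
    return (n // 2) * 2


def gif_to_mp4_command(gif_path, mp4_path):
    """Build the same ffmpeg invocation as above as an argument list."""
    return [
        "ffmpeg",
        "-i", gif_path,
        "-movflags", "faststart",  # put the index up front so playback starts sooner
        "-pix_fmt", "yuv420p",     # widely supported pixel format
        "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",  # force even width and height
        mp4_path,
    ]


def convert(gif_path, mp4_path):
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(gif_to_mp4_command(gif_path, mp4_path), check=True)
```

Passing the arguments as a list avoids shell quoting problems with the parentheses in the scale expression.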