Apollo's Ear is an ensemble of three music genre classification neural networks, an Nginx server, and a demonstrative React frontend. Its backend is glued together with Docker and Docker Compose.

batchema/apollos-ear

Apollo's Ear: Music Classification across 13 genres

Description

Apollo's Ear is an ensemble of three music genre classification neural networks, an Nginx server, and a simple React frontend. Its backend is glued together with Docker and Docker Compose, and you can replicate it by following the steps below. The same infrastructure can also be used for other kinds of online classification systems. The goal of the project was to learn more about TensorFlow and Docker, about integrating a machine learning model with a full-stack application, and about setting up an HTTPS server with Nginx.

Project Structure

  • data_acquisition/: data acquisition and audio cleanup utility files
  • classifiers.ipynb: Jupyter notebook to implement and train the models
  • server/: server files
  • client.py: simple main file for testing the server

Running the Project

A. If interested only in implementing the models

  1. Fork the repo
  2. Download the GTZAN dataset. If you want to acquire extra data to classify genres other than the ones in the dataset, check data_acquisition/audio_data_collector.py for the needed utility functions. Check this tutorial for how to use Selenium. Note that I personally used Brave Browser, which works with ChromeDriver.
  3. Open classifiers.ipynb in Google Colab (recommended) or your favorite Jupyter server.
  4. Follow the notebook to train and save the models. Note that if you use Google Colab, you will need to link your Google Drive.
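
Before training, it can help to check that every genre folder is picked up and labelled consistently. The snippet below is a minimal sketch, not code from this repo: it assumes the audio is arranged one folder per genre (the layout GTZAN downloads commonly use, and a natural one for any extra genres you collect). The function name index_dataset and the extension list are hypothetical.

```python
from pathlib import Path

# Assumed set of audio extensions; adjust to whatever your files use.
AUDIO_EXTENSIONS = {".wav", ".au", ".mp3"}


def index_dataset(root):
    """Walk a folder-per-genre directory and return (samples, genres).

    samples is a list of (file_path, label_index) pairs and genres is the
    sorted list of folder names, so label_index i refers to genres[i].
    """
    root = Path(root)
    genres = sorted(p.name for p in root.iterdir() if p.is_dir())
    label_of = {genre: i for i, genre in enumerate(genres)}

    samples = []
    for genre in genres:
        for audio_file in sorted((root / genre).iterdir()):
            # Skip stray files such as notes or checksums.
            if audio_file.is_file() and audio_file.suffix.lower() in AUDIO_EXTENSIONS:
                samples.append((audio_file, label_of[genre]))
    return samples, genres
```

Calling index_dataset on your dataset folder and printing len(genres) is a quick way to confirm that the extra genres were found alongside the GTZAN ones.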

B. If interested in A. and also in putting the dockerized project on a server

  1. Follow all of A.
  2. Once you have trained and saved your models, put them somewhere in the server folder.
  3. Edit MLP_PATH, CNN_PATH, and RNN_PATH in genre_prediction_service.py to point to your MLP, CNN, and RNN_LSTM models respectively. Go to Step 8 if you do not wish to test the system locally.
  4. Install Docker and Docker Compose on your machine. Note that Docker Compose is included with the official installations of Docker on Windows and macOS
  5. In your terminal, cd server and run docker-compose build. Then run docker-compose up. An Nginx server proxying a uWSGI server that serves server.py should now be running.
  6. In client.py, make sure URL=DOCKER_TEST_PATH and that TEST_PATH points to an audio file.
  7. Run client.py. It should hit your server and return a predicted genre if everything went as desired.
  8. Spin up a server (an EC2 instance is recommended, as it is what I used) and scp the server/ folder onto it. Then make sure the HTTP port is accessible from anywhere (on EC2, this is done by adding inbound rules).
  9. ssh onto the server, cd server, then run chmod +x init.sh.
  10. Run ./init.sh. Make sure to read init.sh to ensure I am not making you download malware ;)
  11. Copy your server IP to SERVER_IP in client.py, then make sure URL=PROD_URL. Then run the file. It should hit your server and return a prediction if all went well.
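
For reference, the kind of request the test client sends in steps 6, 7, and 11 can be sketched with the standard library alone. This is not the repo's client.py: the /predict route, the "file" form field, port 80, and every value assigned below are assumptions for illustration, so check server.py and client.py for the real ones.

```python
import urllib.request
import uuid
from pathlib import Path

# Placeholder values, NOT the repository's actual settings.
SERVER_IP = "203.0.113.10"                     # documentation-only example address
DOCKER_TEST_PATH = "http://127.0.0.1/predict"  # assumed local route behind Nginx
PROD_URL = f"http://{SERVER_IP}/predict"       # assumed production route
URL = DOCKER_TEST_PATH                         # switch to PROD_URL for the remote server
TEST_PATH = "test/blues.00000.wav"             # hypothetical audio file


def build_multipart(field, filename, payload, boundary=None):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def predict(url, audio_path, timeout=60):
    """POST the audio file to the server and return the raw response text."""
    payload = Path(audio_path).read_bytes()
    body, content_type = build_multipart("file", Path(audio_path).name, payload)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the containers running, print(predict(URL, TEST_PATH)) would show whatever the server returns. The timeout is there because a hung model would otherwise block the client indefinitely.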

By now, you should have three models for music genre classification and a production-ready server to power any frontend application.

PS: TensorFlow hangs on some servers when using the RNN_LSTM model. I have not yet been able to solve this bug. If it happens to you, the best course of action is to use a different model on the server. If you manage to solve it, please let me know.
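
If you drop a model on the server, the remaining ones can still be combined. This README does not describe how the three models' outputs are merged, so the snippet below is only one possible approach (soft voting, i.e. averaging per-genre probabilities), and the function name soft_vote is hypothetical. A model that is disabled is passed as None and left out of the average.

```python
def soft_vote(probabilities_by_model, genres):
    """Average per-genre probabilities over the available models.

    probabilities_by_model maps a model name to a list of probabilities
    (one per genre, same order as genres), or to None if that model is
    disabled. Returns (predicted_genre, averaged_probabilities).
    """
    available = [p for p in probabilities_by_model.values() if p is not None]
    if not available:
        raise ValueError("no model produced a prediction")
    for probs in available:
        if len(probs) != len(genres):
            raise ValueError("each model must output one probability per genre")

    averaged = [
        sum(probs[i] for probs in available) / len(available)
        for i in range(len(genres))
    ]
    best = max(range(len(genres)), key=averaged.__getitem__)
    return genres[best], averaged


# Example with the RNN disabled: only the MLP and CNN outputs are averaged.
genres = ["blues", "jazz", "rock"]
label, averaged = soft_vote(
    {"mlp": [0.2, 0.5, 0.3], "cnn": [0.1, 0.3, 0.6], "rnn": None},
    genres,
)
```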

Special Note

I would not have been able to complete this project without the content shared by multiple open-source engineers. I would particularly like to highlight Valerio Velardo from The Sound of AI. Anyone interested in machine learning for audio classification should definitely check out his pages.
