Music Genre Classification with LSTMs
- Classify music files based on genre from the GTZAN music corpus
- GTZAN corpus is included for ease of use
- Use multiple layers of LSTM Recurrent Neural Nets
- Implementations in PyTorch, Keras & Darknet.
Test trained LSTM model
In ./weights/ you can find the trained model weights and model architecture.
To test the model on your custom audio file, run
python3 predict_example.py path/to/custom/file.mp3
Or, to test the model on one of our sample files, run
python3 predict_example.py audios/classical_music.mp3
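A prediction script of this kind typically ends by turning the model's ten output scores into a genre name. The following is a minimal sketch of that last step only, not the code in predict_example.py: the helper names `softmax` and `top_genre` are hypothetical, and the label order (the ten GTZAN genres, alphabetically) is an assumption that may not match the order used by the trained weights.

```python
import math

# Assumption: the ten GTZAN genres in alphabetical order; the label order
# used by the trained model in this repository may differ.
GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]


def softmax(scores):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_genre(scores, genres=GENRES):
    """Return (genre, probability) for the highest-scoring class."""
    if len(scores) != len(genres):
        raise ValueError("expected one score per genre")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return genres[best], probs[best]
```

Given a list of ten raw scores from the network, `top_genre(scores)` returns the most likely genre together with its probability.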
Audio features extracted
- librosa - for audio feature extraction
pip install keras
pip install torch torchvision
brew install libomp
Ideas for improving accuracy:
- The GTZAN dataset has known problems; how do we account for them when using it?
- Normalize MFCCs & other input features (Recurrent BatchNorm?)
- Decay learning rate
- How are we initializing the weights?
- Better optimization hyperparameters (currently too little dropout)
- Do you have avoidable bias? How's your variance?
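Two of the ideas above, normalizing the input features and decaying the learning rate, can be sketched in a few lines. This is an illustration under stated assumptions rather than the repository's training code: the function names, the `(n_clips, n_frames, n_features)` input layout, and the default decay values are all hypothetical, and plain per-feature standardization stands in for the recurrent batch normalization the list mentions.

```python
import numpy as np


def standardize(features, eps=1e-8):
    """Zero-mean, unit-variance scaling per feature column.

    features: array of shape (n_clips, n_frames, n_features) (assumed layout).
    Returns the scaled array plus the mean and std, so the same statistics
    can be reused on validation and test data.
    """
    mean = features.mean(axis=(0, 1), keepdims=True)
    std = features.std(axis=(0, 1), keepdims=True)
    return (features - mean) / (std + eps), mean, std


def decayed_lr(initial_lr, epoch, decay_rate=0.95, step=10):
    """Step decay: multiply the rate by decay_rate every `step` epochs."""
    return initial_lr * decay_rate ** (epoch // step)
```

The statistics should be computed on the training split only and then applied unchanged to the other splits, otherwise information from the evaluation data leaks into training.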
At Epoch 400, training on a TITAN X GPU (October 2017):
At Epoch 400, training on a 2018 Macbook Pro CPU (May 2019):