- Classify music files based on genre from the GTZAN music corpus
- The GTZAN corpus is included for ease of use
- Use multiple layers of (bidirectional) Recurrent Neural Nets
- Implementations in PyTorch and Keras.
In ./weights/ you can find the trained model weights and the model architecture.
To test the model on your own audio file, run:
python3 predict_example.py path/to/custom/file.mp3
To test the model on one of the provided sample files, run:
python3 predict_example.py audios/classical_music.mp3
- Normalize MFCCs & other input features (Recurrent BatchNorm?)
- Decay learning rate
- How are we initializing the weights?
- Better optimization hyperparameters (too little dropout)
- Do you have avoidable bias? How's your variance?
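As a rough illustration of the first two items above, the sketch below standardizes each feature column (for example, each MFCC coefficient) to zero mean and unit variance, and computes an exponentially decayed learning rate. It is plain Python and is not taken from this repository: the function names `standardize` and `decayed_lr`, the list-of-lists input layout, and the example decay rate are all assumptions made for illustration.

```python
import math


def standardize(frames):
    """Standardize each feature column to zero mean and unit variance.

    `frames` is a list of frames, each a list of per-frame features
    (hypothetical layout; e.g. one MFCC vector per frame).
    Returns the normalized frames plus the per-column means and stds,
    so the same statistics can be reused on dev/test data.
    """
    n_frames = len(frames)
    n_feats = len(frames[0])
    means = [sum(f[j] for f in frames) / n_frames for j in range(n_feats)]
    stds = []
    for j in range(n_feats):
        var = sum((f[j] - means[j]) ** 2 for f in frames) / n_frames
        # A constant column has std 0; fall back to 1.0 to avoid dividing by zero.
        stds.append(math.sqrt(var) or 1.0)
    normalized = [
        [(f[j] - means[j]) / stds[j] for j in range(n_feats)] for f in frames
    ]
    return normalized, means, stds


def decayed_lr(initial_lr, decay_rate, epoch):
    """Exponential decay: the rate is multiplied by `decay_rate` every epoch."""
    return initial_lr * decay_rate ** epoch


if __name__ == "__main__":
    feats, means, stds = standardize([[1.0, 10.0], [3.0, 10.0]])
    print(feats)
    # Assumed values: initial rate 0.001, decay rate 0.95 per epoch.
    print(decayed_lr(0.001, 0.95, 10))
```

Computing the means and standard deviations on the training set only, and reusing them for the dev and test sets, keeps the evaluation data from leaking into the preprocessing.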
- Training (at epoch 400): training loss: 0.5801, training accuracy: 0.7810
- Validating: dev loss: 0.734523485104, dev accuracy: 0.766666688025
- Testing: test loss: 0.900845060746, test accuracy: 0.683333342274
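The bias and variance question from the list above can be checked against these numbers with simple arithmetic. The sketch below is one way to do it, using the common convention that avoidable bias is training error minus a target error, and variance is dev error minus training error. The helper name `error_gaps` is hypothetical, and the default target accuracy of 1.0 is an assumption (no human-level or best-achievable baseline is given here), so the bias figure is an upper bound.

```python
def error_gaps(train_acc, dev_acc, test_acc, target_acc=1.0):
    """Return error gaps computed from accuracies.

    `target_acc` is an assumed reference (e.g. human-level accuracy);
    1.0 is used as a placeholder because no baseline is reported.
    """
    train_err = 1.0 - train_acc
    dev_err = 1.0 - dev_acc
    test_err = 1.0 - test_acc
    target_err = 1.0 - target_acc
    return {
        "avoidable_bias": train_err - target_err,  # how far training is from the target
        "variance": dev_err - train_err,           # train-to-dev generalization gap
        "dev_test_gap": test_err - dev_err,        # possible overfitting to the dev set
    }


# Accuracies reported above (training at epoch 400).
gaps = error_gaps(0.7810, 0.766666688025, 0.683333342274)
for name, value in gaps.items():
    print(f"{name}: {value:.4f}")
```

With these inputs the training error is about 21.9 points from the assumed target, the train-to-dev gap is about 1.4 points, and the dev-to-test gap is about 8.3 points. Under that assumption, bias looks like the larger problem, and the dev-to-test gap suggests the dev set may be small or partly overfit.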