I implemented a deep neural network for music generation that combines a variational autoencoder (VAE) with a Transformer. Starting from an initial architecture and dataset, I tried to emulate current state-of-the-art models. I focused on comparing models trained on different datasets, with the aim of understanding whether a model trained on one genre or one instrument could generalize well to other genres and instruments. During the experiments I observed how regularization techniques and the amount of available data affect the results. In the end, good results were achieved, although they are not comparable with state-of-the-art results.
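The core idea of the VAE component is to sample a latent vector via the reparameterization trick, which keeps the sampling step differentiable. A minimal NumPy sketch of that step (the function name and shapes here are illustrative, not the project's actual code):

```python
import numpy as np

def reparameterize(mu, log_var, rng=None):
    # z = mu + sigma * eps, with eps ~ N(0, I).
    # mu and log_var are the encoder's outputs; log_var is the
    # log-variance of the latent Gaussian, so sigma = exp(log_var / 2).
    if rng is None:
        rng = np.random.default_rng()
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps
```

In a real training loop this would be implemented with the framework's tensor operations so gradients flow through `mu` and `log_var`.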
More information about the project can be found in the report and in the slides.
Take a look at the Music_Generation_Transformer.ipynb notebook. You need a sufficient number of MIDI files to achieve good training results.
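To prepare a training set, you first need to gather the MIDI files from a local directory. A minimal sketch of that step (the helper name and directory layout are assumptions, not part of the notebook):

```python
from pathlib import Path

def collect_midi_files(root):
    """Recursively collect .mid/.midi files under `root`, sorted by path."""
    root_path = Path(root)
    return sorted(
        p for p in root_path.rglob("*")
        if p.suffix.lower() in {".mid", ".midi"}
    )
```

Each collected file would then be parsed into a token sequence (e.g. with a MIDI library such as `pretty_midi`) before being fed to the model.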
Below are some plots obtained during the optimization phase.
This project was developed for the Intelligent Systems for Pattern Recognition course at the University of Pisa, under the guidance of Prof. Davide Bacciu.
- Giovanni Sorice
- Colin Raffel. "Learning-Based Methods for Comparing Sequences, with Applications to Audio-to-MIDI Alignment and Matching". PhD thesis, 2016.
- Phil Wang. "compressive-transformer-pytorch". GitHub repository.