Music Score Generation
Project realized at Swiss Federal Institute of Technology in Lausanne (EPFL) by Yoann Ponti, supervised by Nathanaël Perraudin and Michaël Defferrard.
The complementary blogpost can be found here.
Getting Started
These instructions will get you a copy of the project up and running on your machine.
Prerequisites
The codebase is written in Python. Assuming you have a working conda environment, the bare minimum to get everything running is:
# Clone the repo
git clone https://github.com/onanypoint/epfl-semester-project-biaxialnn.git biaxialnn
cd biaxialnn
# Install dependencies
conda create -n biaxialnn python=3.5
source activate biaxialnn
pip install -r requirements.txt
If you want to run the model on a GPU, you also need a theanorc
(more info) with the GPU enabled. You might also want to look at this page related to the CUDA backend.
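For reference, a minimal `.theanorc` enabling the GPU typically looks like the following (the exact device string depends on your Theano version; the newer gpuarray backend uses `device = cuda0` instead of `gpu`):

```ini
[global]
device = gpu
floatX = float32
```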
Before starting
Before running any machine learning code, you will have to obtain a large collection of score files. These should be placed in a data folder dedicated to "raw" scores.
A Jupyter notebook is included showing an example of the full pipeline: retrieving data from musescore.com, pre-processing the "raw" scores into the statematrix representation, and finally training the model before generating a score.
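To give an intuition of what a statematrix is, here is a minimal, self-contained sketch of a piano-roll-like encoding. The function name, pitch bounds, and note format below are illustrative assumptions, not the project's actual implementation:

```python
import numpy as np

# Illustrative pitch bounds (see pitch_lowerBound / pitch_upperBound below).
LOWER, UPPER = 24, 102
N_PITCH = UPPER - LOWER

def to_statematrix(notes, n_steps):
    """Hypothetical sketch: encode (midi_pitch, start_step, dur_steps)
    tuples as an array of shape (timesteps, pitches, 2), where plane 0
    marks a note sounding and plane 1 marks a note onset."""
    sm = np.zeros((n_steps, N_PITCH, 2), dtype=np.int8)
    for pitch, start, dur in notes:
        if not LOWER <= pitch < UPPER:
            continue  # out-of-range notes are discarded
        sm[start:start + dur, pitch - LOWER, 0] = 1  # note is sounding
        sm[start, pitch - LOWER, 1] = 1              # note onset
    return sm

# Middle C (MIDI 60) held for 4 timesteps, in a 16-timestep measure.
sm = to_statematrix([(60, 0, 4)], n_steps=16)
```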
Other sources of "raw" scores:
Interface
Configuration file
The first step is to create a configuration file. There is an example file included.
cp config.ini.example config.ini
Configuration options are:
pitch_lowerBound     # Minimum MIDI pitch number taken into account during
                       the processing phase. If a lower note is
                       encountered, it is discarded and a warning
                       is shown.
pitch_upperBound     # Maximum MIDI pitch number taken into account.
measure_quantization # Number of timesteps a measure is divided into. By
                       default it is 16, meaning that each quarter note is
                       subdivided into four, making the 16th note the
                       shortest note that can be represented.
batch_size           # Number of score segments to process in parallel
                       during the training phase. Be aware that the bigger
                       this number is, the greater the chance of an "out of
                       memory" error.
seq_len              # Length of the sequences used during training.
division_len         # Minimum number of timesteps between two sequences
                       taken from the same score.
musescore_api_key    # Musescore API key. Only used during data retrieval.
                       More info can be found on developers.musescore.com
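Put together, a filled-in config.ini might look like the following (section name and all values are illustrative; refer to config.ini.example for the actual defaults):

```ini
[DEFAULT]
pitch_lowerBound = 24
pitch_upperBound = 102
measure_quantization = 16
batch_size = 10
seq_len = 128
division_len = 16
musescore_api_key = YOUR_API_KEY
```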
A CLI implementation is included.
Usage
$ python main.py [-h] {preprocess,train,generate}
Without any further options, the preprocess, train, and generate commands will run with their default values on the minimal dataset present in the repository.
Warning
A training iteration takes around 10 seconds on a GTX 650. The training can be gracefully stopped with CTRL-C.
$ python main.py preprocess --help
usage: main.py preprocess [-h] [-d INPUT_DIRECTORY] [-o OUTPUT_FILE]
[-c COUNT]
optional arguments:
-h, --help show this help message and exit
-d INPUT_DIRECTORY, --input-directory INPUT_DIRECTORY
Directory containing scores to transform to
statematrix format.
-o OUTPUT_FILE, --output-file OUTPUT_FILE
Save statematrix pickle to this location.
-c COUNT, --count COUNT
Number of randomly selected scores to process.
$ python main.py train --help
usage: main.py train [-h] [-f STATEMATRIX_FILE] [-o OUTPUT_DIRECTORY]
[-e TRAINING_EPOCHS] [-t T_LAYER_SIZES]
[-p P_LAYER_SIZES] [-r DROPOUT]
[-v VALIDATION_SPLIT] [-m MODEL_CONFIG]
optional arguments:
-h, --help show this help message and exit
-f STATEMATRIX_FILE, --statematrix-file STATEMATRIX_FILE
File containing the statematrix pickle, i.e. the
output of the preprocessing.
-o OUTPUT_DIRECTORY, --output-directory OUTPUT_DIRECTORY
Where to save meta information during training.
-e TRAINING_EPOCHS, --training-epochs TRAINING_EPOCHS
Number of iterations to run for.
-t T_LAYER_SIZES, --t-layer-sizes T_LAYER_SIZES
List of size for each LSTM layer used for the time
model.
-p P_LAYER_SIZES, --p-layer-sizes P_LAYER_SIZES
List of size for each LSTM layer used for the pitch
model.
-r DROPOUT, --dropout DROPOUT
Dropout value
-v VALIDATION_SPLIT, --validation-split VALIDATION_SPLIT
Percentage of pieces to keep for validation purposes.
-m MODEL_CONFIG, --model-config MODEL_CONFIG
Model config (trained weights) to load before starting
training.
$ python main.py generate --help
usage: main.py generate [-h] [-t T_LAYER_SIZES] [-p P_LAYER_SIZES]
[-r DROPOUT] -s SEED [-l LENGTH] [-c CONSERVATIVITY]
[-m MODEL_CONFIG] [-o OUTPUT_DIRECTORY] [-n NAME]
optional arguments:
-h, --help show this help message and exit
-t T_LAYER_SIZES, --t-layer-sizes T_LAYER_SIZES
List of size for each LSTM layer used for the time
model.
-p P_LAYER_SIZES, --p-layer-sizes P_LAYER_SIZES
List of size for each LSTM layer used for the pitch
model.
-r DROPOUT, --dropout DROPOUT
Dropout value
-s SEED, --seed SEED Score to use as seed for generation.
-l LENGTH, --length LENGTH
Number of timesteps to generate.
-c CONSERVATIVITY, --conservativity CONSERVATIVITY
Conservativity value, i.e. how much freedom is given
to the generation process.
-m MODEL_CONFIG, --model-config MODEL_CONFIG
Model config (trained weights) to load before starting
generation.
-o OUTPUT_DIRECTORY, --output-directory OUTPUT_DIRECTORY
Where to save the generated samples.
-n NAME, --name NAME Name of the generated sample.
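The README does not specify how the conservativity value is applied. As a purely illustrative sketch (hypothetical function, not the project's code), such a knob is often implemented by sharpening or flattening the predicted note probabilities before sampling:

```python
import numpy as np

def apply_conservativity(probs, conservativity):
    """Hypothetical illustration: raise probabilities to a power and
    renormalize. conservativity > 1 concentrates mass on the most likely
    notes (more conservative); < 1 gives the sampler more freedom."""
    p = np.asarray(probs, dtype=float) ** conservativity
    return p / p.sum()

# With conservativity 2.0 the most likely note becomes even more dominant.
sharp = apply_conservativity([0.7, 0.2, 0.1], 2.0)
```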