Music sheet generation experimentation using recurrent neural networks

Music Score Generation

Project realized at Swiss Federal Institute of Technology in Lausanne (EPFL) by Yoann Ponti, supervised by Nathanaël Perraudin and Michaël Defferrard.

The complementary blog post can be found here.

Getting Started

These instructions will get you a copy of the project up and running on your machine.

Prerequisites

The codebase is written in Python. The bare minimum needed to get everything running, assuming you have a working conda environment, is:

# Clone the repo
git clone https://github.com/onanypoint/epfl-semester-project-biaxialnn.git biaxialnn
cd biaxialnn

# Install dependencies
conda create -n biaxialnn python=3.5
source activate biaxialnn
pip install -r requirements.txt
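
Once the dependencies are installed, you can optionally verify that the main libraries are importable. This is a quick sanity-check sketch (it assumes requirements.txt pins theano and music21, the two main libraries this project uses):

```python
# Check which required packages are importable without actually importing them.
import importlib.util

def check(packages):
    """Return {package: True/False} depending on whether it can be found."""
    return {p: importlib.util.find_spec(p) is not None for p in packages}

print(check(["theano", "music21", "numpy"]))
```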

If you want to run the model on a GPU, you also need a `.theanorc` file (see the Theano documentation) with the GPU enabled. You might also want to look at Theano's documentation on the CUDA backend.
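
Theano reads its settings from a `.theanorc` file in your home directory. A minimal sketch for GPU use is shown below; the exact flags depend on your Theano and CUDA versions (newer Theano releases using the gpuarray backend expect `device = cuda` instead of `device = gpu`):

```
[global]
device = gpu
floatX = float32

[nvcc]
fastmath = True
```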

Before starting

Before running any machine learning code, you will have to obtain a large collection of score files. These should be placed in a data folder dedicated to "raw" scores.

A Jupyter notebook is included that walks through an example pipeline: retrieving data from musescore.com, pre-processing the "raw" scores into the statematrix representation, and finally training the model and generating a score.
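
To give an intuition for the statematrix representation, here is a hedged sketch: each score becomes a matrix of timesteps × pitches × 2, where the last axis holds (played, articulated) flags. The parameter names mirror the configuration options below; the default values and the exact encoding used by the project are assumptions here:

```python
# Build an empty statematrix: one row per timestep, one column per MIDI pitch
# in [pitch_lower, pitch_upper), each cell holding [played, articulated].
def empty_statematrix(n_measures, measure_quantization=16,
                      pitch_lower=24, pitch_upper=102):
    n_steps = n_measures * measure_quantization
    n_pitches = pitch_upper - pitch_lower
    # statematrix[t][p] describes pitch (pitch_lower + p) at timestep t
    return [[[0, 0] for _ in range(n_pitches)] for _ in range(n_steps)]
```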

Other sources of "raw" scores:

Interface

Configuration file

The first step is to create a configuration file. There is an example file included.

cp config.ini.example config.ini

Configuration options are:

pitch_lowerBound        #   Minimum MIDI pitch number taken into account during
                            the processing phase. If a lower note is
                            encountered, it is discarded and a warning is
                            shown.

pitch_upperBound        #   Maximum MIDI pitch number taken into account.

measure_quantization    #   Number of timesteps a measure is divided into. The
                            default is 16, meaning each quarter note is
                            subdivided into four and the 16th note is the
                            shortest note that can be represented.

batch_size              #   Number of score segments to process in parallel
                            during training. Be aware that the bigger this
                            number, the greater the chance of an "out of
                            memory" error.

seq_len                 #   Length of the sequences used during training.

division_len            #   Minimum number of timesteps between two sequences
                            taken from the same score.

musescore_api_key       #   Musescore API key. Only used during data retrieval.
                            More info can be found on developers.musescore.com
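
These options can be read with Python's standard configparser. The sketch below uses illustrative values and the `DEFAULT` section; the section name and defaults in the actual config.ini are assumptions:

```python
# Read config values and derive the pitch range the model will see.
import configparser

config = configparser.ConfigParser()
config.read_string("""
[DEFAULT]
pitch_lowerBound = 24
pitch_upperBound = 102
measure_quantization = 16
batch_size = 10
seq_len = 128
division_len = 16
""")
cfg = config["DEFAULT"]
# Number of distinct pitches between the lower and upper bounds
pitch_range = cfg.getint("pitch_upperBound") - cfg.getint("pitch_lowerBound")
print(pitch_range)
```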

A CLI implementation is included.

Usage

$ python main.py [-h] {preprocess, train, generate}

Without any options, the preprocess, train, and generate commands run with default values on the minimal dataset included in the repository.

Warning

A training iteration takes around 10 seconds on a GTX 650. The training can be gracefully stopped with CTRL-C.
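
Graceful stopping usually means catching the `KeyboardInterrupt` that CTRL-C raises and saving the latest weights before exiting. A minimal sketch of that pattern (the function and callback names are illustrative, not the project's actual API):

```python
# Training loop that stops cleanly on CTRL-C instead of losing progress.
def train(epochs, save_checkpoint):
    completed = 0
    try:
        for epoch in range(epochs):
            # ... one training iteration (~10 s on a GTX 650) ...
            completed += 1
            if epoch == 2:
                raise KeyboardInterrupt  # simulate CTRL-C mid-training
    except KeyboardInterrupt:
        save_checkpoint(completed)  # persist the latest weights on interrupt
    return completed
```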


$ python main.py preprocess --help
usage: main.py preprocess [-h] [-d INPUT_DIRECTORY] [-o OUTPUT_FILE]
                          [-c COUNT]

optional arguments:
  -h, --help            show this help message and exit
  -d INPUT_DIRECTORY, --input-directory INPUT_DIRECTORY
                        Directory containing scores to transform to
                        statematrix format.
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        Save statematrix pickle to this location.
  -c COUNT, --count COUNT
                        Number of randomly selected scores to process.
$ python main.py train --help
usage: main.py train [-h] [-f STATEMATRIX_FILE] [-o OUTPUT_DIRECTORY]
                     [-e TRAINING_EPOCHS] [-t T_LAYER_SIZES]
                     [-p P_LAYER_SIZES] [-r DROPOUT] [-v VALIDATION_SPLIT]
                     [-m MODEL_CONFIG]

optional arguments:
  -h, --help            show this help message and exit
  -f STATEMATRIX_FILE, --statematrix-file STATEMATRIX_FILE
                        File containing statematrix pickle, i.e. output of the
                        preprocessing.
  -o OUTPUT_DIRECTORY, --output-directory OUTPUT_DIRECTORY
                        Where to save meta information during training.
  -e TRAINING_EPOCHS, --training-epochs TRAINING_EPOCHS
                        Number of iterations to run for.
  -t T_LAYER_SIZES, --t-layer-sizes T_LAYER_SIZES
                        List of size for each LSTM layer used for the time
                        model.
  -p P_LAYER_SIZES, --p-layer-sizes P_LAYER_SIZES
                        List of size for each LSTM layer used for the pitch
                        model.
  -r DROPOUT, --dropout DROPOUT
                        Dropout value
  -v VALIDATION_SPLIT, --validation-split VALIDATION_SPLIT
                        Percentage of pieces to keep for validation purposes.
  -m MODEL_CONFIG, --model-config MODEL_CONFIG
                        Model config (trained weights) to load before starting
                        training.
$ python main.py generate --help
usage: main.py generate [-h] [-t T_LAYER_SIZES] [-p P_LAYER_SIZES]
                        [-r DROPOUT] -s SEED [-l LENGTH] [-c CONSERVATIVITY]
                        [-m MODEL_CONFIG] [-o OUTPUT_DIRECTORY] [-n NAME]

optional arguments:
  -h, --help            show this help message and exit
  -t T_LAYER_SIZES, --t-layer-sizes T_LAYER_SIZES
                        List of size for each LSTM layer used for the time
                        model.
  -p P_LAYER_SIZES, --p-layer-sizes P_LAYER_SIZES
                        List of size for each LSTM layer used for the pitch
                        model.
  -r DROPOUT, --dropout DROPOUT
                        Dropout value
  -s SEED, --seed SEED  Score to use as seed for generation.
  -l LENGTH, --length LENGTH
                        Number of timesteps to generate.
  -c CONSERVATIVITY, --conservativity CONSERVATIVITY
                        Conservativity value, i.e. how much freedom is given
                        to the generation process.
  -m MODEL_CONFIG, --model-config MODEL_CONFIG
                        Model config (trained weights) to load before starting
                        generation.
  -o OUTPUT_DIRECTORY, --output-directory OUTPUT_DIRECTORY
                        Where to save the generated samples.
  -n NAME, --name NAME  Name of the generated sample.
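
One common way a "conservativity" knob can work is to sharpen or flatten the model's note probabilities before sampling. Whether this matches the project's exact definition is an assumption; the sketch below is purely illustrative:

```python
# Sharpen a note-on probability: conservativity > 1 suppresses unlikely
# notes (more conservative output), < 1 makes generation more adventurous.
def apply_conservativity(p, conservativity):
    return p ** conservativity

print(apply_conservativity(0.5, 2.0))  # 0.25: unlikely notes become rarer
```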

Built With

  • Theano - Machine learning framework
  • Music21 - Toolkit for computer-aided musicology

Acknowledgments