<a href="https://colab.research.google.com/github/Bogula/AI_Music/blob/main/Copy_of_GITv2_BrainMusic_Train_polyphony_rnn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This is a modified copy of the Brain Music Polyphony Training Notebook from Raquel Bujalance and Cecil Fernandez Briche.
I modified some of the paths for a workshop I held at the SAAI Symposium on Aug 8, 2021 to test this out with the participants. The original notebook can be found here: https://github.com/brainmusic/models


Licensed under the Apache [License](https://www.apache.org/licenses)

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


---



# Brain Music. Polyphony RNN Training
Raquel Bujalance and Cecil Fernandez Briche

This Colab notebook lets you train a Polyphony RNN for music generation, based on the [Magenta library](https://github.com/tensorflow/magenta/tree/master/magenta/models/polyphony_rnn). This model applies language modeling with an LSTM to polyphonic music generation.

The notebook was created to help anyone who wants to train their own polyphony model with the Magenta library, step by step, in Colab. We have trained it with different samples linked to emotions; you can see the results in [post].


Instructions for running:
* Make sure to use a GPU runtime: click Runtime >> Change Runtime Type >> GPU.
* Double-click any of the hidden cells to view the code.

Note: If you are going to train a heavy model, keep in mind that Google can restrict the use of [GPUs](https://research.google.com/colaboratory/faq.html#gpu-availability): "It is possible that a user who uses Colaboratory for long term calculations has a temporary restriction on the type of hardware available to him or the time during which he can use it. We encourage users with significant computing needs to use the Colaboratory UI with a local execution environment."



---



# Environment Setup
Install Magenta and FluidSynth, a software synthesizer used to play back the generated sequences.

In [None]:
#@title Install

#@markdown Install Magenta and FluidSynth, the synthesizer used to listen to the audio.
#@markdown Colab runs Python 3, which current Magenta releases require.
#@markdown This takes some time, especially the FluidSynth installation.

!apt-get update -qq && apt-get install -qq libfluidsynth1 fluid-soundfont-gm build-essential libasound2-dev libjack-dev
!pip install -qU pyfluidsynth pretty_midi

!pip install magenta

# Hack to allow python to pick up the newly-installed fluidsynth lib. 
# This is only needed for the hosted Colab environment.
import ctypes.util
orig_ctypes_util_find_library = ctypes.util.find_library
def proxy_find_library(lib):
  if lib == 'fluidsynth':
    return 'libfluidsynth.so.1'
  else:
    return orig_ctypes_util_find_library(lib)
ctypes.util.find_library = proxy_find_library

## **SUBFOLDERS**

In order to make this work you need to create 4 subdirectories in your Google Drive:

* midi - for the initial MIDI files
* miditf - for the converted TensorFlow input data
* midirun - for the model
* midiout - for the generated MIDI files


In [2]:
#@title Drive Setup
#@markdown If your training sample is in google drive you need to connect it. 
#@markdown You can also upload the data to a temporary folder but it will be 
#@markdown lost when the session is closed.
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive
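
The four subfolders listed earlier can also be created from the notebook once Drive is mounted. The following is a minimal sketch, assuming the folders live directly under `/content/drive/MyDrive` (the location the later cells use); the helper name `create_subfolders` is hypothetical and not part of Magenta.

```python
import os

SUBFOLDERS = ("midi", "miditf", "midirun", "midiout")

def create_subfolders(base_dir, names=SUBFOLDERS):
    """Create each subfolder under base_dir (no error if it already exists)."""
    paths = [os.path.join(base_dir, name) for name in names]
    for path in paths:
        os.makedirs(path, exist_ok=True)
    return paths

# Example (in Colab, after drive.mount):
# create_subfolders("/content/drive/MyDrive")
```

Calling the helper a second time is harmless, so the cell can be re-run in a new session.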


In [None]:
#@title Import Dependencies
#@markdown Import libraries from Magenta, Tensorflow and Numpy

from google.colab import files
import numpy as np
import os
import tensorflow as tf
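try:
    import note_seq as mm  # Magenta 2.x: the former magenta.music module lives in note_seq
except ImportError:
    import magenta.music as mm  # older Magenta releases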
import magenta
from magenta.scripts import convert_dir_to_note_sequences
from magenta.models.polyphony_rnn import *



# Sample Adaptation

Magenta does not work directly with MIDI files but with NoteSequences, so the first step in preparing the sample is to convert the MIDI files into NoteSequences and pack them into a TFRecord file, which can be read quickly and efficiently.


You need to define the folder with the MIDI files, as well as the file in which the NoteSequences will be saved. Change the paths below to your own folders. The output of this step (the NoteSequences) can be saved as a temporary file, e.g. '/tmp/notesequences.tfrecord', instead of on Drive, since it is no longer needed after the next step, in which the sample is split.

Tip for novices: You can find the path of a folder in the Files panel on the left; right-click the folder and select "Copy path". If the path contains "My Drive", remember to escape the space with a backslash, as in "My\ Drive".

In [None]:
!convert_dir_to_note_sequences \
--input_dir=/content/drive/MyDrive/midi/ \
--output_file=/temp/notesequences.tfrecord \
--log=INFO

Now you are ready to split your sample into train and test sets. The share held out to evaluate your model is set with the eval_ratio argument. For example, with an eval ratio of 10%, 90% of the sample is saved in the training collection, while the remaining 10% is stored as the evaluation sample.

The input for this step must match the one defined as output in the previous step.
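
To make the effect of the eval ratio concrete, here is a small sketch of a random split. It illustrates the idea only and is not Magenta's own code: the function name `split_by_eval_ratio`, the per-item random draw, and the file names are assumptions. Because each item is assigned at random, the eval share is only approximately equal to the ratio.

```python
import random

def split_by_eval_ratio(items, eval_ratio=0.10, seed=0):
    """Randomly assign each item to the eval set with probability eval_ratio."""
    rng = random.Random(seed)
    train, evaluation = [], []
    for item in items:
        if rng.random() < eval_ratio:
            evaluation.append(item)
        else:
            train.append(item)
    return train, evaluation

# Hypothetical file names standing in for converted NoteSequences.
files = [f"song_{i:03d}.mid" for i in range(200)]
train, evaluation = split_by_eval_ratio(files, eval_ratio=0.10)
print(len(train), len(evaluation))
```

With a very small collection the eval set can end up empty, which is one more reason to check the output files after the split.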


In [None]:
#test and train sample split with 10% ratio
!polyphony_rnn_create_dataset \
--input=/temp/notesequences.tfrecord \
--output_dir=/content/drive/MyDrive/miditf \
--eval_ratio=0.10 \
--config='polyphony'

If the cell has executed correctly, you should have two files saved in the output_dir, both in TFRecord format: one with the training sample (training_poly_tracks.tfrecord) and one with the eval sample (eval_poly_tracks.tfrecord).
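
A quick way to confirm this from Python is to look for the two files. This is a minimal sketch, assuming the file names `training_poly_tracks.tfrecord` (the name used in the training cell of this notebook) and `eval_poly_tracks.tfrecord`; the helper `missing_dataset_files` is hypothetical.

```python
import os

EXPECTED_FILES = ("training_poly_tracks.tfrecord", "eval_poly_tracks.tfrecord")

def missing_dataset_files(output_dir, expected=EXPECTED_FILES):
    """Return the expected dataset files that are absent or empty in output_dir."""
    missing = []
    for name in expected:
        path = os.path.join(output_dir, name)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing

# Example (in Colab): missing_dataset_files("/content/drive/MyDrive/miditf")
# An empty list means both files were written.
```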


# Model Training
Now you are ready to train your model!

This step can take a very long time. Depending on how large your dataset is and on the number and size of the layers, you may get a memory error or lose the connection to the GPU.

We recommend starting with a small sample and a light model, for example a batch size of 64 and two LSTM layers of 64 units each ("batch_size=64,rnn_layer_sizes=[64,64]"), and adding complexity little by little. If you save the checkpoints you can resume training from where the previous session stopped, which is especially useful if you lose the web connection or your session closes unexpectedly.


To train the model you can define the following parameters:
* **run_dir** is the directory where checkpoints and TensorBoard data will be stored.
* **sequence_example_file** is the TFRecord file with the training sample; its folder must be the one defined as output_dir in the previous step.
* **num_training_steps** is an optional parameter for how many update steps to take before exiting the training loop. By default, training will run continuously until manually terminated.
* **hparams** is another optional parameter that specifies the hyperparameters you want to use, such as the batch size and the RNN layers, given as a list with the number of units in each layer. The three hyperparameters below are also set through this string.
* **dropout_keep_prob** is an optional hyperparameter that reduces overfitting. Dropout is a regularization method that randomly excludes a share of the LSTM units from activation and weight updates during training; dropout_keep_prob is the probability that a unit is kept.
* **learning_rate** is another optional hyperparameter that controls how quickly the model learns. This value is usually between 0.0 and 1.0; a learning rate that is too small may result in a long training process that could get stuck, whereas a value that is too large may result in an unstable training process.
* **clip_norm** is another optional hyperparameter. Gradient clipping rescales the gradients during backpropagation so that their norm does not exceed a maximum, which prevents exploding gradients.
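
Clipping by norm can be sketched in a few lines. This is plain Python for illustration, not the TensorFlow routine Magenta calls, and `clip_by_norm` here is a hypothetical helper that treats the gradient as a flat list of numbers.

```python
import math

def clip_by_norm(gradient, clip_norm=5.0):
    """Rescale gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in gradient))
    if norm <= clip_norm:
        return list(gradient)
    scale = clip_norm / norm
    return [g * scale for g in gradient]

print(clip_by_norm([3.0, 4.0]))    # norm 5.0, left unchanged
print(clip_by_norm([30.0, 40.0]))  # norm 50.0, scaled down to norm 5.0
```

The direction of the gradient is preserved; only its length is capped, so a single very large update cannot destabilize training.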

By default, the polyphony_rnn model uses this configuration:
* batch_size=64
* rnn_layer_sizes=[256, 256, 256]
* dropout_keep_prob=0.5
* learning_rate=0.001
* clip_norm=5

Tip for novices: if you change the hyperparameters, for example by increasing the number of layers, remember to change the directory where the checkpoints are stored; otherwise the model will try to restore the last training run and fail with a layer dimension error.
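
Because the hyperparameters are passed as a single string, a small helper can keep the training and generation cells in sync and give each architecture its own run directory. This is a sketch with hypothetical helper names (`format_hparams`, `run_dir_for`); the directory naming scheme is an assumption, not a Magenta convention.

```python
import os

def format_hparams(batch_size=64, rnn_layer_sizes=(64, 64)):
    """Build the value for --hparams from a batch size and a list of layer sizes."""
    layers = ",".join(str(size) for size in rnn_layer_sizes)
    return f"batch_size={batch_size},rnn_layer_sizes=[{layers}]"

def run_dir_for(base_dir, rnn_layer_sizes):
    """Derive a run directory name from the layer sizes, one per architecture."""
    suffix = "x".join(str(size) for size in rnn_layer_sizes)
    return os.path.join(base_dir, f"run_{suffix}")

print(format_hparams(64, [64, 64]))
print(run_dir_for("/content/drive/MyDrive/midirun", [64, 64]))
```

The printed strings can be pasted into the `--hparams` and `--run_dir` arguments of the command-line cells.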


In [None]:
#Train the model!
!polyphony_rnn_train \
--run_dir=/content/drive/MyDrive/midirun/run1 \
--sequence_example_file=/content/drive/MyDrive/miditf/training_poly_tracks.tfrecord \
--num_training_steps=1000 \
--hparams="batch_size=4,rnn_layer_sizes=[128,128,128]" \
--config='polyphony' \
--num_checkpoints=10


When you consider the model sufficiently trained, you can save it in a bundle file. This allows you to import the trained model at any time and use it to create new sequences. To save it, call polyphony_rnn_generate (not the training command from the previous step) with the save_generator_bundle flag and the following parameters:

*   run_dir must be the same as in the previous step.
*   hparams must match those used in training, in particular rnn_layer_sizes, which fixes the shape of the network.
*   bundle_file is the path of the .mag file in which the model will be saved.









In [None]:
#Save your model 
!polyphony_rnn_generate \
--run_dir=/content/drive/MyDrive/midirun/run1 \
--hparams="batch_size=64,rnn_layer_sizes=[128,128,128]" \
--bundle_file=/content/drive/MyDrive/midirun/run1/my_poly_rnn.mag \
--config='polyphony' \
--save_generator_bundle

# Generating polyphonic tracks 
New tracks can be generated from the last saved checkpoint of the model (stored in run_dir) or from the bundle; an example of each follows.
In addition, the sequence can be primed with the first notes of a MIDI file or by giving the pitches directly.

## Generation from a checkpoint
You can create a new track from the latest checkpoint either at the end of training or while training is still running, to check how well the model fits. The training command can also evaluate the model on the test sample, but what better test than the human ear?
In fact, models of this type are usually evaluated through listening tests in which participants rate the generated samples on a Likert scale; see for example the evaluation of the [Music Transformer model](https://arxiv.org/pdf/1809.04281.pdf).
As in the previous cases, run_dir must be the path where the checkpoints have been saved; output_dir is the directory where the new creations will be saved.
num_outputs gives the number of samples to generate and num_steps the length of each track in steps.
In this case, generation starts from three notes given as [MIDI pitch numbers](https://newt.phys.unsw.edu.au/jw/notes.html) in primer_pitches.
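
To read MIDI pitch numbers more easily, the sketch below converts them to note names and frequencies. It assumes the common convention in which pitch 60 is C4 (middle C) and A4 (pitch 69) is tuned to 440 Hz; the helper names are hypothetical. Under that convention the three primer pitches used in this notebook, 67, 64 and 60, are G4, E4 and C4, a descending C major triad.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def midi_to_name(pitch):
    """Convert a MIDI pitch number to a note name, taking 60 as C4."""
    octave = pitch // 12 - 1
    return f"{NOTE_NAMES[pitch % 12]}{octave}"

def midi_to_frequency(pitch):
    """Equal-tempered frequency in Hz, with A4 (pitch 69) tuned to 440 Hz."""
    return 440.0 * 2 ** ((pitch - 69) / 12)

for pitch in [67, 64, 60]:
    print(pitch, midi_to_name(pitch), round(midi_to_frequency(pitch), 2))
```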





In [None]:
#generate new track with the trained model from a sequence of notes
!polyphony_rnn_generate \
--config='polyphony' \
--run_dir=/content/drive/MyDrive/midirun/run1 \
--output_dir=/content/drive/MyDrive/midiout/poly_train1 \
--hparams="batch_size=64,rnn_layer_sizes=[128,128,128]" \
--num_outputs=10 \
--num_steps=200 \
--primer_pitches="[67,64,60]" \
--condition_on_primer=true \
--inject_primer_during_generation=false
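
As a rough guide to how long a track a given num_steps produces, the sketch below converts steps to seconds. It assumes 4 steps per quarter note and a tempo of 120 quarter notes per minute, which are believed to be the defaults here but are not stated in this notebook; check them against your own configuration. The helper `steps_to_seconds` is hypothetical.

```python
def steps_to_seconds(num_steps, steps_per_quarter=4, qpm=120.0):
    """Convert a number of quantized steps into seconds of music."""
    quarters = num_steps / steps_per_quarter
    return quarters * 60.0 / qpm

print(steps_to_seconds(200))  # 200 steps -> 50 quarter notes -> 25.0 seconds
```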

To create a richer structure, however, it is better to start the sequence with a few seconds of a real track. This can be done by replacing primer_pitches with primer_midi, set to the path of the MIDI file you want to use.
It is recommended to use only a few seconds, so beforehand you can use the Magenta library to extract some notes and save them in a temporary folder.


In [None]:
# Choose your own MIDI file, load it as a NoteSequence and plot its notes
primer_midi = "/content/drive/MyDrive/midiout/poly_train1/2021-08-08_110152_01.mid"
primer_ns = mm.midi_file_to_note_sequence(primer_midi)
mm.plot_sequence(primer_ns)


## Generation from a bundle
Creating songs from a saved bundle is just as easy: replace run_dir with bundle_file, set to the path where the bundle is stored.


In [None]:
#generate new sequences with the trained model (bundle_file)
!polyphony_rnn_generate \
--config='polyphony' \
--bundle_file=/content/drive/MyDrive/midirun/run1/my_poly_rnn.mag \
--output_dir=/content/drive/MyDrive/midiout/poly_train1 \
--num_outputs=3 \
--num_steps=100 \
--condition_on_primer=False \
--inject_primer_during_generation=False