# Royal Institute of Technology - KTH
# DD2424 - Deep Learning in Data Science
Project edited by Victor Sanchez - 19980429-T517.


This notebook trains a model on a collection of piano MIDI files from the [MAESTRO dataset](https://magenta.tensorflow.org/datasets/maestro).

This file is an extended version of the tutorial [Music generation with an RNN](https://www.tensorflow.org/text/tutorials/music_generation).

In [None]:
from main import *
reset_keras()

## Download the MAESTRO dataset and create the training dataset


In [None]:
data_dir = pathlib.Path('data/maestro-v2.0.0')
download_dataset(data_dir)

filenames = generate_filename(data_dir)
print(f'The dataset contains {len(filenames)} MIDI files.')

key_order = ['pitch', 'step', 'duration']

nb_of_file_input = 1

batch_size = 64
seq_length = 8
train_ds, list_file_in_dataset, nb_notes = dataset_generator_maestro(
    key_order,
    filenames,
    seq_length,
    batch_size,
    nb_files=nb_of_file_input,
    random_file=True,
    display_dataset_info=False,
    display_sequence_info=False,
)


print('\nThe generated dataset has the following characteristics:')
print("\n", train_ds.element_spec)

midi_file_of_dataset = midi_to_notes(list_file_in_dataset[0])
for i in range(1, nb_of_file_input):
    midi_file_of_dataset_temporary = midi_to_notes(list_file_in_dataset[i])
    midi_file_of_dataset = pd.concat([midi_file_of_dataset,midi_file_of_dataset_temporary], ignore_index=True)
#print("\n piano roll original")
#plot_piano_roll(midi_file_of_dataset,'original')

#print("\n distribution of original")
#plot_distributions(midi_file_of_dataset, 'original')

<font color='red'>DISCLAIMER: </font> Before each training session, restart the kernel of your notebook.

## Creation of a dataset from a given file

In [None]:
# file_audio = 'data/other/pachelbel_canon.midi'
# key_order = ['pitch', 'step', 'duration']
# nb_of_file_input = 1
# batch_size = 64
# seq_length = 8
# train_ds, nb_notes = dataset_generator_with_file(key_order, file_audio, seq_length, batch_size, display_dataset_info = False, display_sequence_info = False)
# print("\n",train_ds.element_spec)

# # print("\n piano roll original")
# # plot_piano_roll(midi_to_notes(file_audio),'Original')
# # print("\n distribution of original")
# # plot_distributions(midi_to_notes(file_audio), 'Original')


Remark on the representation of a note:
Three variables represent a note when training the model: `pitch`, `step` and `duration`. The `pitch` is the perceptual quality of the sound, encoded as a MIDI note number.
The `step` is the time elapsed since the previous note or the start of the track.
The `duration` is how long the note plays, in seconds: the difference between the note end time and the note start time.
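
As an illustration of these three variables, here is a minimal sketch that derives `step` and `duration` from raw onset and release times. The `Note` type and the `to_training_features` helper are hypothetical names introduced for this example, and measuring the first note's `step` from time 0 of the track is an assumption; the project's own `midi_to_notes` may handle the first note differently.

```python
from typing import NamedTuple


class Note(NamedTuple):
    """Hypothetical container for one raw note (not part of the project code)."""
    pitch: int    # MIDI note number, 0 to 127
    start: float  # onset time in seconds
    end: float    # release time in seconds


def to_training_features(notes):
    """Turn raw notes into rows of pitch, step and duration."""
    rows = []
    prev_start = 0.0  # assumption: the track starts at t = 0 s
    for note in sorted(notes, key=lambda n: n.start):
        rows.append({
            'pitch': note.pitch,
            'step': note.start - prev_start,     # time since the previous onset
            'duration': note.end - note.start,   # how long the note sounds
        })
        prev_start = note.start
    return rows


example = [Note(60, 0.5, 1.0), Note(64, 1.0, 1.75)]
for row in to_training_features(example):
    print(row)
```

Note that `step` is measured between onsets, so overlapping notes (chords) simply get a small or zero `step`.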


Note names are easier to interpret than numeric pitches, so the numeric pitch values can be converted to note names.
A note name gives the type of note, its accidental and its octave number
(e.g. C#4).
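
The conversion can be sketched in a few lines of plain Python. `pitch_to_name` is a hypothetical helper written for this example; it assumes the convention in which MIDI note 60 is C4 and spells accidentals with sharps only. `pretty_midi` offers `note_number_to_name` for the same purpose.

```python
NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']


def pitch_to_name(pitch: int) -> str:
    """Convert a MIDI note number (0 to 127) to a name such as 'C#4'."""
    if not 0 <= pitch <= 127:
        raise ValueError(f"MIDI pitch must be in [0, 127], got {pitch}")
    octave = pitch // 12 - 1  # assumed convention: note 60 is C4
    return f"{NOTE_NAMES[pitch % 12]}{octave}"


for pitch in (60, 61, 69):
    print(pitch, pitch_to_name(pitch))
```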

The model is trained on batches of note sequences. Each example consists of a sequence of notes as the input features and the next note as the label.
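
The slicing into examples can be sketched as a sliding window. `make_examples` is a hypothetical helper that works on a plain Python list; the project's `dataset_generator_maestro` presumably builds the equivalent windows as a `tf.data` pipeline and groups them into batches.

```python
def make_examples(notes, seq_length):
    """Pair every window of seq_length notes with the note that follows it."""
    examples = []
    for i in range(len(notes) - seq_length):
        inputs = notes[i:i + seq_length]  # input features
        label = notes[i + seq_length]     # next note, used as the label
        examples.append((inputs, label))
    return examples


# Integers stand in for notes here; real notes carry pitch, step and duration.
toy_notes = list(range(10))
for inputs, label in make_examples(toy_notes, seq_length=8):
    print(inputs, '->', label)
```

A track of `n` notes yields `n - seq_length` examples, so a longer `seq_length` gives the model more context per example but fewer examples per file.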

Notes for users:
Set the sequence length for each example. Experiment with different lengths (e.g. 50, 100, 150) to see which one works best for the data, or use [hyperparameter tuning](https://www.tensorflow.org/tutorials/keras/keras_tuner). The size of the vocabulary (`vocab_size`) is set to 128, which covers all the pitches supported by `pretty_midi`.

## Create and train a model with a single hidden layer

In [None]:
learning_rate = 0.005
type_RNN = "LSTM" # or "RNNSimple"
type_optimizer = "Adam" # or "Adagrad" or "RMSProp"
nb_neurons = 128
nb_epochs = 5
model = Network_init(seq_length, learning_rate, type_RNN, type_optimizer, nb_neurons)

title = (
    f"seq_length={seq_length}_learning_rate={learning_rate}"
    f"_nb_epochs={nb_epochs}_batch_size={batch_size}"
    f"_type_RNN={type_RNN}_type_optimizer={type_optimizer}"
    f"_nb_of_file_input={nb_of_file_input}_nb_units={nb_neurons}"
)


Note for users: evaluating the model with `model.evaluate` shows that the `pitch` loss is significantly greater than the `step` and `duration` losses.
The reported `loss` is the total loss, computed as the sum of the individual losses, and it is currently dominated by the `pitch` loss.
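
The arithmetic behind this remark can be sketched as follows. The loss values and the weight are hypothetical numbers chosen only for illustration, and `total_loss` is a helper written for this example. Scaling the individual losses is one possible way to rebalance the total (Keras `compile` accepts a `loss_weights` argument for this); whether `Network_init` does so is not shown in this notebook.

```python
def total_loss(losses, weights=None):
    """Sum per-output losses, optionally scaling each one by a weight."""
    weights = weights or {}
    return sum(weights.get(name, 1.0) * value for name, value in losses.items())


# Hypothetical per-output losses, chosen only to illustrate the imbalance.
losses = {'pitch': 4.8, 'step': 0.03, 'duration': 0.12}

unweighted = total_loss(losses)
pitch_share = losses['pitch'] / unweighted
print(f"total loss = {unweighted:.2f}, pitch share = {pitch_share:.0%}")

# Hypothetical weight that scales the pitch loss down before summing.
weighted = total_loss(losses, {'pitch': 0.05})
print(f"weighted total loss = {weighted:.2f}")
```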

## Create a model with multiple hidden layers

In [None]:
# learning_rate = 0.005
# type_RNN = "LSTM" # or "RNNSimple" or "GRU"
# type_optimizer = "Adam" # or "Adagrad" or "RMSProp"
# nb_epochs = 5
# nodes = [128,128]
# """
# if len(nodes) == 1:
#     model = Network_init(seq_length, learning_rate, type_RNN, type_optimizer, nb_neurons = nodes[0])
# else:
#     model = Network_init_multi_layers(seq_length, learning_rate, nodes, type_RNN, type_optimizer)
# """
# model = Network_init_test(seq_length, learning_rate, nodes, type_RNN, type_optimizer)
# #model = Network_init(seq_length, learning_rate, type_RNN, type_optimizer, nb_neurons = nodes[0])
# #model = Network_init_multi_layers(seq_length, learning_rate, nodes, type_RNN, type_optimizer)

# nb_epochs = 5
# title = "seq_length="+str(seq_length)+"_learning_rate="+str(learning_rate)+"_nb_epochs="+str(nb_epochs)+"_batch_size="+str(batch_size)+"_type_RNN="+type_RNN+"_type_optimizer="+type_optimizer+"_nb_of_file_input="+str(nb_of_file_input)+"_nb_units="+str(nodes)


## Training of the generated model

In [None]:

history = train_network(model, train_ds, nb_epochs)
plotting_result(history, title)

## Prediction of notes

We first provide a starting sequence of notes. From a sequence of notes, the model then generates one next note.

For the note pitch, a sample is drawn from the softmax distribution over pitches produced by the model, rather than simply picking the pitch with the highest probability.
Always picking the most probable pitch would lead to repetitive sequences of notes.

The `temperature` parameter controls the randomness of the generated notes: higher values flatten the distribution, while lower values concentrate it on the most probable pitches.
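
A minimal sketch of temperature sampling in plain Python is given below. It assumes the model outputs one logit per pitch and that the logits are divided by the temperature before the softmax; `softmax_with_temperature` and `sample_pitch` are hypothetical helpers, not the project's `predict_notes`.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_pitch(logits, temperature=1.0, rng=random):
    """Draw one pitch index at random according to the softmax probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


toy_logits = [2.0, 1.0, 0.1]  # hypothetical logits for three pitches
for t in (0.3, 0.9, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])

print(sample_pitch(toy_logits, temperature=0.9, rng=random.Random(0)))
```

Passing a seeded `random.Random` instance makes a generation run reproducible, which helps when comparing temperatures.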

Now generate some notes. Experiment with `temperature` and the starting sequence to see how the output changes.

In [None]:
pm_sample = pretty_midi.PrettyMIDI(list_file_in_dataset[0])
raw_notes = midi_to_notes(list_file_in_dataset[0])
#print('Number of instruments:', len(pm_sample.instruments))
instrument = pm_sample.instruments[0]
instrument_name = pretty_midi.program_to_instrument_name(instrument.program)

vocab_size = 128
temperature = 0.9
num_predictions = 100 #int(nb_notes)

title_new = title+"_temperature="+str(temperature)

generated_notes, out_pm = predict_notes(raw_notes, seq_length, title_new,  vocab_size, instrument_name, temperature, model, export_prediciton = True, num_predictions = num_predictions)


# print("\n piano roll predicted")
plot_piano_roll(generated_notes, title_new)

# print("\n distribution of estimated")
# plot_distributions(generated_notes, title)
plot_multiple_distributions(midi_file_of_dataset, generated_notes, title_new)

## Prediction of a single audio file from a multi-file dataset

In [None]:
# pm_sample = pretty_midi.PrettyMIDI(list_file_in_dataset[0])
# raw_notes = midi_to_notes(list_file_in_dataset[0])
# #print('Number of instruments:', len(pm_sample.instruments))
# instrument = pm_sample.instruments[0]
# instrument_name = pretty_midi.program_to_instrument_name(instrument.program)

# vocab_size = 128
# temperature = 1
# num_predictions = len(raw_notes)

# #title_new = title+"_temperature="+str(temperature)
# title_new = title
# generated_notes, out_pm = predict_notes(raw_notes, seq_length, title_new,  vocab_size, instrument_name, temperature, model, export_prediciton = True, num_predictions = num_predictions)


# # print("\n piano roll predicted")
# plot_piano_roll(generated_notes, title_new)


# midi_file_of_dataset_first_file = midi_to_notes(list_file_in_dataset[0])
# # print("\n distribution of estimated")
# # plot_distributions(generated_notes, title)
# plot_multiple_distributions(midi_file_of_dataset_first_file, generated_notes, title_new)

## Prediction of personal audio

In [None]:

# pm_sample = pretty_midi.PrettyMIDI(file_audio)
# raw_notes_perso = midi_to_notes(file_audio)
# #print('Number of instruments:', len(pm_sample.instruments))
# instrument = pm_sample.instruments[0]
# instrument_name = pretty_midi.program_to_instrument_name(instrument.program)

# vocab_size = 128
# temperature = 0.3
# num_predictions = nb_notes
# title_perso = title
# generated_notes, out_pm = predict_notes(raw_notes_perso, seq_length, title_perso,  vocab_size, instrument_name, temperature, model, export_prediciton = True, num_predictions = nb_notes)

# #print("\n Play the generated music")
# #display_audio(out_pm)

# print("\n piano roll original")
# plot_piano_roll(midi_to_notes(file_audio),title_perso)
# print("\n piano roll predicted")
# plot_piano_roll(generated_notes, title_perso)

# print("\n distribution of original")
# plot_distributions(midi_to_notes(file_audio), title_perso)

# print("\n distribution of estimated")
# plot_distributions(generated_notes, title_perso)