# Magenta RNN

In this notebook, we will generate basic melodies using Magenta's three Melody RNN models. Magenta uses a Recurrent Neural Network, or RNN, to generate its music. RNNs are a class of neural networks (unlike feedforward networks, they contain cycles) in which the connections between nodes form a directed graph along a temporal sequence.

In [112]:
import math
import os
import time
import warnings

def action_with_warning():
    warnings.warn("should not appear")

with warnings.catch_warnings(record=True):
    action_with_warning()
    
import magenta.music as mm
from magenta.models.melody_rnn import melody_rnn_sequence_generator
from magenta.music import DEFAULT_QUARTERS_PER_MINUTE
from magenta.protobuf.generator_pb2 import GeneratorOptions
from magenta.protobuf.music_pb2 import NoteSequence
from visual_midi import Plotter

def generate(bundle_name: str,
             sequence_generator,
             generator_id: str,
             primer_filename: str = None,
             qpm: float = DEFAULT_QUARTERS_PER_MINUTE,
             total_length_steps: int = 64,
             temperature: float = 1.0,
             beam_size: int = 1,
             branch_factor: int = 1,
             steps_per_iteration: int = 1,
             generatedname: str = "generated",
             show_plot: bool = False) -> NoteSequence:
    mm.notebook_utils.download_bundle(bundle_name, "bundles")
    bundle = mm.sequence_generator_bundle.read_bundle_file(os.path.join("bundles", bundle_name))
    generator_map = sequence_generator.get_generator_map()
    generator = generator_map[generator_id](checkpoint=None, bundle=bundle)
    generator.initialize()
    if primer_filename:
        primer_sequence = mm.midi_io.midi_file_to_note_sequence(
          os.path.join("simplemidi", primer_filename))
    else:
        primer_sequence = NoteSequence()
    if primer_sequence.tempos:
        if len(primer_sequence.tempos) > 1:
          raise Exception("No support for multiple tempos")
        qpm = primer_sequence.tempos[0].qpm
    # Calculates the seconds per 1 step, which changes depending on the QPM 
    # value (steps per quarter in generators are mostly 4)
    seconds_per_step = 60.0 / qpm / getattr(generator, "steps_per_quarter", 4)
  
    # Calculates the primer sequence length in steps and time by taking the
    # total time (which is the end of the last note) and finding the next step
    # start time.
    primer_sequence_length_steps = math.ceil(primer_sequence.total_time
                                             / seconds_per_step)
    primer_sequence_length_time = (primer_sequence_length_steps 
                                   * seconds_per_step)
    primer_end_adjust = (0.00001 if primer_sequence_length_time > 0 else 0)
    primer_start_time = 0
    primer_end_time = (primer_start_time
                       + primer_sequence_length_time
                       - primer_end_adjust)
    generation_length_steps = total_length_steps - primer_sequence_length_steps
    if generation_length_steps <= 0:
        raise Exception("Total length in steps too small "
                        + "(" + str(total_length_steps) + ")"
                        + ", needs to be at least one bar bigger than primer "
                        + "(" + str(primer_sequence_length_steps) + ")")
    generation_length_time = generation_length_steps * seconds_per_step
    generation_start_time = primer_end_time
    generation_end_time = (generation_start_time
                           + generation_length_time
                           + primer_end_adjust)
  
    # Showtime
    print("Primer time: ["
          + str(primer_start_time) + ", "
          + str(primer_end_time) + "]")
    print("Generation time: ["
          + str(generation_start_time) + ", "
          + str(generation_end_time) + "]")
    generator_options = GeneratorOptions()
    generator_options.args['temperature'].float_value = temperature
    generator_options.args['beam_size'].int_value = beam_size
    generator_options.args['branch_factor'].int_value = branch_factor
    generator_options.args['steps_per_iteration'].int_value = (
        steps_per_iteration)
    generator_options.generate_sections.add(
        start_time=generation_start_time,
        end_time=generation_end_time)
    sequence = generator.generate(primer_sequence, generator_options)
    
    # Writes the resulting MIDI file to the output directory
    # (creating the directory first if it does not exist yet)
    os.makedirs("output", exist_ok=True)
    midi_filename = "%s.mid" % (generatedname)
    midi_path = os.path.join("output", midi_filename)
    mm.midi_io.note_sequence_to_midi_file(sequence, midi_path)
    print("Generated midi file: " + str(os.path.abspath(midi_path)))
  
    # Writes the resulting plot file to the output directory
    plot_filename = "%s.html" % (generatedname)
    plot_path = os.path.join("output", plot_filename)
    pretty_midi = mm.midi_io.note_sequence_to_pretty_midi(sequence)
    plotter = Plotter()
    if show_plot:
        plotter.show(pretty_midi, plot_path)
    else:
        plotter.save(pretty_midi, plot_path)
    print("Generated plot file: " + str(os.path.abspath(plot_path)))
      
    return sequence
  
warnings.filterwarnings("ignore")
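
The step and time arithmetic inside generate can be hard to follow in context, so here is a minimal standalone sketch of it. The function name generation_window is hypothetical and not part of Magenta, and the sketch leaves out the small 0.00001 adjustment that generate subtracts from the primer end time.

```python
import math


def generation_window(primer_total_time: float,
                      qpm: float,
                      total_length_steps: int,
                      steps_per_quarter: int = 4):
    """Mirror the step/second conversion used in generate() (hypothetical helper)."""
    # Duration of one step in seconds at the given tempo
    seconds_per_step = 60.0 / qpm / steps_per_quarter
    # Round the primer up to the next step boundary
    primer_steps = math.ceil(primer_total_time / seconds_per_step)
    primer_time = primer_steps * seconds_per_step
    # Whatever is left of the total length is newly generated
    generation_steps = total_length_steps - primer_steps
    if generation_steps <= 0:
        raise ValueError("total_length_steps must exceed the primer length")
    generation_end = primer_time + generation_steps * seconds_per_step
    return primer_steps, generation_steps, primer_time, generation_end


# A 1.9 second primer at 120 qpm, with a total length of 64 steps
print(*generation_window(1.9, 120.0, 64))  # prints: 16 48 2.0 8.0
```

With these made-up values, the primer occupies 16 steps (2 seconds) and the remaining 48 steps are generated, ending at 8 seconds.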


This function, generate, takes a Melody RNN configuration and an optional primer MIDI file, and returns a NoteSequence generated using the specified model (it also writes the result to a MIDI file and an HTML plot). When generate is called, it takes in several parameters that affect how the music is generated.
<br>
<br>
RNN model (generator_id): the RNN configuration that is being used (i.e. basic, lookback, or attention)
<br>
total_length_steps: the total length of the resulting sequence in steps, primer included. The number of newly generated steps is total_length_steps minus the length of the primer, so this value has to be larger than the primer length
<br>
qpm: the quarters per minute of the music. The default is Magenta's DEFAULT_QUARTERS_PER_MINUTE (120 qpm); if the primer file contains a tempo, that tempo is used instead
<br>
temperature: the randomness of the generated melody. Values below 1 make the output less random and more predictable, while values above 1 make it more random.
<br>
beam_size: the larger the beam size, the less random the sequence will be, at the cost of generation time
<br>
steps_per_iteration: the number of steps taken at each beam search iteration; the larger this value, the fewer iterations there are in total
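
To make the effect of temperature more concrete, the sketch below shows how a temperature value typically rescales a softmax over candidate events. This is a generic illustration, not Magenta's own sampling code; the function name and the three scores are made up for the example.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate events
cold = softmax_with_temperature(logits, temperature=0.5)
neutral = softmax_with_temperature(logits, temperature=1.0)
hot = softmax_with_temperature(logits, temperature=2.0)

for name, probs in (("0.5", cold), ("1.0", neutral), ("2.0", hot)):
    print("temperature", name, [round(p, 3) for p in probs])

# Sampling one event index from the flattened distribution
rng = random.Random(0)
next_event = rng.choices(range(len(logits)), weights=hot, k=1)[0]
print("sampled event index:", next_event)
```

At a low temperature the most likely event dominates, while at a high temperature the probabilities move closer together, so less likely events are picked more often.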




## Basic RNN

This model is Magenta's standard RNN model. As you will see below, with this model the primer melody is not reflected much in the music generated after it. This is because the basic RNN configuration uses a plain one-hot encoding to represent extracted melodies as input to the LSTM. It can generate a short melody that stays in the key of the primer melody, but it has trouble mimicking the feeling and tone of the music, which a human composer would be able to do. Basic RNN has no explicit mechanism for observing and cataloguing patterns in music, and thus generates music that does not share many of the musical elements of the primer melody. Basic RNN is not very holistic: it does not consider groups of notes and the small melodies that they create, and instead focuses on representing the note-by-note changes.
<br>
Basic RNN is trained on two things, inputs and labels. The input is the previous event, encoded as a one-hot vector. The label is the next event, which can be a note-off event (turn off any currently playing note), a no-event (if a note is currently playing, continue sustaining it, otherwise continue the silence), or a note-on event for a given pitch (which also turns off any other note that might be playing). Given the input, the model scores how likely each possible event is to come next, based on the melodies it was trained on, and the next event is chosen from those scores. By chaining these events together, the model forms a new melody.
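
The input and label scheme described above can be sketched in a few lines. The constants, the pitch range, and the index layout here are assumptions chosen for illustration and may not match Magenta's internal encoding exactly.

```python
NO_EVENT = -2   # hypothetical marker: keep sustaining the note (or the silence)
NOTE_OFF = -1   # hypothetical marker: stop the currently playing note
MIN_PITCH, MAX_PITCH = 48, 84  # assumed pitch range, MAX_PITCH exclusive
NUM_CLASSES = 2 + (MAX_PITCH - MIN_PITCH)


def event_to_index(event):
    """Map a melody event to its class index: two special events, then pitches."""
    if event == NO_EVENT:
        return 0
    if event == NOTE_OFF:
        return 1
    if not MIN_PITCH <= event < MAX_PITCH:
        raise ValueError("pitch %d is outside the assumed range" % event)
    return 2 + (event - MIN_PITCH)


def one_hot(event):
    """Encode one event as a vector with a single 1.0 at its class index."""
    vector = [0.0] * NUM_CLASSES
    vector[event_to_index(event)] = 1.0
    return vector


def training_pairs(melody):
    """Pair each event (one-hot input) with the index of the event after it (label)."""
    return [(one_hot(prev), event_to_index(nxt))
            for prev, nxt in zip(melody, melody[1:])]


# Middle C, held for one step, then D, then silence
melody = [60, NO_EVENT, 62, NOTE_OFF]
pairs = training_pairs(melody)
for vector, label in pairs:
    print("input index:", vector.index(1.0), "label index:", label)
```

Each pair only links one event to the next, which is one way to see why this configuration has little information about larger patterns in the melody.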

In [113]:
sequence = generate(
    "basic_rnn.mag",
    melody_rnn_sequence_generator,
    "basic_rnn",
    primer_filename="fur_elis.mid",
    total_length_steps=128,
    temperature=1,
    generatedname = "fur_elis_basic_rnn",
    show_plot=False)
from IPython.display import HTML
HTML(filename="output/fur_elis_basic_rnn.html")

'model_variables' collection should be of type 'byte_list', but instead is of type 'node_list'.
INFO:tensorflow:Restoring parameters from /var/folders/dm/3kslprps6b736vz2bgdqpwx00000gn/T/tmper6zu_aa/model.ckpt
Primer time: [0, 1.6463307499999995]
Generation time: [1.6463307499999995, 23.414623999999993]
INFO:tensorflow:Beam search yields sequence with log-likelihood: -189.158035 
Generated midi file: /Users/brian/Synthetic-Symphony-ML-422/422magenta_brian/output/fur_elis_basic_rnn.mid
Generated plot file: /Users/brian/Synthetic-Symphony-ML-422/422magenta_brian/output/fur_elis_basic_rnn.html


In [114]:
sequence = generate(
    "basic_rnn.mag",
    melody_rnn_sequence_generator,
    "basic_rnn",
    primer_filename="got_melody.mid",
    total_length_steps=128,
    temperature=1,
    generatedname = "got_melody_basic_rnn",
    show_plot=False)
from IPython.display import HTML
HTML(filename="output/got_melody_basic_rnn.html")

'model_variables' collection should be of type 'byte_list', but instead is of type 'node_list'.
INFO:tensorflow:Restoring parameters from /var/folders/dm/3kslprps6b736vz2bgdqpwx00000gn/T/tmp11oy41tf/model.ckpt
Primer time: [0, 4.235282000000001]
Generation time: [4.235282000000001, 22.588224]
INFO:tensorflow:Beam search yields sequence with log-likelihood: -166.166565 
Generated midi file: /Users/brian/Synthetic-Symphony-ML-422/422magenta_brian/output/got_melody_basic_rnn.mid
Generated plot file: /Users/brian/Synthetic-Symphony-ML-422/422magenta_brian/output/got_melody_basic_rnn.html


As you can see, in the music generated by basic_rnn for Für Elise and the Game of Thrones theme, the primer melody is not very evident. The only recognizable aspects of the music are the recycled notes, the tempo, and the spacing of the notes, which somewhat matches the primer melody. Next, we will use two modified RNN configurations that might give us a song that sounds more human and natural.

## Lookback RNN

Lookback RNN uses custom inputs and labels in order to train the model better. This configuration makes it easier for the model to recognize patterns, so that notes are generated in a less random way. It picks up on patterns that occur in the melody by "looking back" at the melody one and two bars ago. Along with the same input that Basic RNN uses, Lookback RNN uses the events from 1 and 2 bars ago as additional inputs. This gives the network more data points to find patterns, such as small melodies, and mirrored or contrasting melodies and cadences. Lookback RNN also takes as an input whether the last event was a repetition of the event from 1 or 2 bars before it. This lets the model recognize repetition and determine which segments of the input music are repeats and which are unique melodies.
<br>
In addition to new inputs, we also get more labels with Lookback. These labels, unlike those of Basic RNN, refer back to 1 and 2 bars ago. If the current event repeats the event from 2 or 1 bars ago, we use a label to mark that step as a repetition. For an event where there is no repetition, the label is the specific melody event, as in Basic RNN. For example, if we have a melody where the third bar repeats the first or second bar, each step of the third bar will be labeled as a repetition. But if the fourth bar is a unique melody, its steps are labeled with their specific events. Lookback uses these new labels and inputs to map out the general structure of the input song, and create a song that has more structure.
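
The repetition labels and the extra inputs can be illustrated with a simplified sketch. It assumes 16 steps per bar (4/4 time at 4 steps per quarter note) and uses one raw pitch per step instead of full melody events; the function names and label strings are made up for the example and are not Magenta's.

```python
STEPS_PER_BAR = 16  # assumes 4/4 time at 4 steps per quarter note
REPEAT_LABELS = {2: "repeat-2-bars-ago", 1: "repeat-1-bar-ago"}


def lookback_label(events, position, steps_per_bar=STEPS_PER_BAR):
    """Return a repeat label if the event copies the one 2 or 1 bars back, else the event."""
    for bars_back in (2, 1):
        earlier = position - bars_back * steps_per_bar
        if earlier >= 0 and events[position] == events[earlier]:
            return REPEAT_LABELS[bars_back]
    return events[position]


def lookback_inputs(events, position, steps_per_bar=STEPS_PER_BAR):
    """Collect the events 1 and 2 bars back (None if they fall before the start)."""
    extra = {}
    for bars_back in (1, 2):
        earlier = position - bars_back * steps_per_bar
        extra[bars_back] = events[earlier] if earlier >= 0 else None
    return extra


bar_a = [60, 62, 64, 65] * 4     # one bar of 16 steps
bar_b = [67, 69, 71, 72] * 4     # a different bar
melody = bar_a + bar_b + bar_a   # the third bar repeats the first

labels = [lookback_label(melody, i) for i in range(len(melody))]
print(labels[0], labels[16], labels[32])
print(lookback_inputs(melody, 32))
```

In this toy melody, every step of the third bar receives the two-bar repeat label, while the steps of the first two bars keep their specific events as labels.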

In [115]:
sequence = generate(
    "lookback_rnn.mag",
    melody_rnn_sequence_generator,
    "lookback_rnn",
    primer_filename="fur_elis.mid",
    total_length_steps=128,
    temperature=1,
    generatedname = "fur_elis_lookback_rnn",
    show_plot=False)
HTML(filename="output/fur_elis_lookback_rnn.html")

'model_variables' collection should be of type 'byte_list', but instead is of type 'node_list'.
INFO:tensorflow:Restoring parameters from /var/folders/dm/3kslprps6b736vz2bgdqpwx00000gn/T/tmpl4_l6cl_/model.ckpt
Primer time: [0, 1.6463307499999995]
Generation time: [1.6463307499999995, 23.414623999999993]
INFO:tensorflow:Beam search yields sequence with log-likelihood: -108.476913 
Generated midi file: /Users/brian/Synthetic-Symphony-ML-422/422magenta_brian/output/fur_elis_lookback_rnn.mid
Generated plot file: /Users/brian/Synthetic-Symphony-ML-422/422magenta_brian/output/fur_elis_lookback_rnn.html


In [116]:
sequence = generate(
    "lookback_rnn.mag",
    melody_rnn_sequence_generator,
    "lookback_rnn",
    primer_filename="got_melody.mid",
    total_length_steps=128,
    temperature=1,
    generatedname = "got_melody_lookback_rnn",
    show_plot=False)
from IPython.display import HTML
HTML(filename="output/got_melody_lookback_rnn.html")

'model_variables' collection should be of type 'byte_list', but instead is of type 'node_list'.
INFO:tensorflow:Restoring parameters from /var/folders/dm/3kslprps6b736vz2bgdqpwx00000gn/T/tmptg5j1w0o/model.ckpt
Primer time: [0, 4.235282000000001]
Generation time: [4.235282000000001, 22.588224]
INFO:tensorflow:Beam search yields sequence with log-likelihood: -271.492767 
Generated midi file: /Users/brian/Synthetic-Symphony-ML-422/422magenta_brian/output/got_melody_lookback_rnn.mid
Generated plot file: /Users/brian/Synthetic-Symphony-ML-422/422magenta_brian/output/got_melody_lookback_rnn.html


As we can see, the two melodies that we created with Lookback are much more structured than those from Basic RNN. The model picked up on which segments were unique and which should be repeated. Unlike with Basic RNN, you can clearly hear the influence of our primer music in the generated music. However, for some songs, like Für Elise, Lookback makes the result too repetitive.

## Attention RNN

Attention RNN is a configuration in which "attention" is given to specific parts of the sequence. In this version of attention, where we don’t have an encoder-decoder, we just always look at the outputs from the last $n$ steps when generating the output for the current step. The way we “look at” these steps is with an attention mechanism. Specifically:
<br>
![title](images/equation.png)
<br>

The vector $v$ and matrices $W'_1$, $W'_2$ are the learnable parameters of the model. The $h_i$ are the RNN outputs from the previous $n$ steps $(h_{t-n}, \ldots, h_{t-1})$, and the vector $c_t$ is the current step’s RNN cell state. These are used to calculate $u^t_i$ $(u^t_{t-n}, \ldots, u^t_{t-1})$, which is an $n$-length vector with a value for each of the previous $n$ steps. Each value represents how much attention that step should receive. A softmax is used to normalize these values and create a mask-like vector $a^t_i$, called the attention mask. The RNN outputs from the previous $n$ steps are then multiplied by these attention mask values and summed together to get $h'_t$. The $h'_t$ vector is a combination of the $n$ previous outputs, with each output contributing a different amount relative to how much attention that step received.
<br>
Using the $h'_t$ vector, which combines the outputs of the last $n$ steps, lets Attention RNN inject information from previous steps into the current step's calculation, which helps it learn longer-term patterns. The RNN's cell state still carries information from previous steps as well.
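
The attention calculation described above can be sketched in plain Python. This assumes the score takes the common additive form (a dot product of v with a tanh of the two projections); the tanh, the function names, and all sizes and random weights are assumptions for illustration, not values taken from Magenta.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(values):
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in values]
    total = sum(exps)
    return [e / total for e in exps]


def attend(previous_outputs, cell_state, w1, w2, v):
    """Score each previous output, normalize the scores, and mix the outputs."""
    projected_state = matvec(w2, cell_state)
    scores = []
    for h in previous_outputs:
        projected_h = matvec(w1, h)
        hidden = [math.tanh(a + b) for a, b in zip(projected_h, projected_state)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    mask = softmax(scores)  # the attention mask, one weight per previous step
    size = len(previous_outputs[0])
    # Weighted sum of the previous outputs, one component at a time
    context = [sum(a * h[k] for a, h in zip(mask, previous_outputs))
               for k in range(size)]
    return mask, context


rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


n_steps, size, attn_size = 4, 3, 5  # made-up sizes
w1 = rand_matrix(attn_size, size)
w2 = rand_matrix(attn_size, size)
v = [rng.uniform(-1, 1) for _ in range(attn_size)]
previous_outputs = rand_matrix(n_steps, size)
cell_state = [rng.uniform(-1, 1) for _ in range(size)]

mask, context = attend(previous_outputs, cell_state, w1, w2, v)
print("attention mask:", [round(a, 3) for a in mask])
print("combined vector:", [round(c, 3) for c in context])
```

The mask always sums to 1, so the combined vector is a weighted average of the previous outputs, with the weights decided by the learned parameters.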

In [117]:
sequence = generate(
    "attention_rnn.mag",
    melody_rnn_sequence_generator,
    "attention_rnn",
    primer_filename="fur_elis.mid",
    total_length_steps=128,
    temperature=1,
    generatedname = "fur_elis_attention_rnn",
    show_plot=False)
HTML(filename="output/fur_elis_attention_rnn.html")

'model_variables' collection should be of type 'byte_list', but instead is of type 'node_list'.
INFO:tensorflow:Restoring parameters from /var/folders/dm/3kslprps6b736vz2bgdqpwx00000gn/T/tmpcx3q0x96/model.ckpt
Primer time: [0, 1.6463307499999995]
Generation time: [1.6463307499999995, 23.414623999999993]
INFO:tensorflow:Beam search yields sequence with log-likelihood: -164.623672 
Generated midi file: /Users/brian/Synthetic-Symphony-ML-422/422magenta_brian/output/fur_elis_attention_rnn.mid
Generated plot file: /Users/brian/Synthetic-Symphony-ML-422/422magenta_brian/output/fur_elis_attention_rnn.html


In [119]:
sequence = generate(
    "attention_rnn.mag",
    melody_rnn_sequence_generator,
    "attention_rnn",
    primer_filename="got_melody.mid",
    total_length_steps=128,
    temperature=1,
    generatedname = "got_melody_attention_rnn",
    show_plot=False)
from IPython.display import HTML
HTML(filename="output/got_melody_attention_rnn.html")

'model_variables' collection should be of type 'byte_list', but instead is of type 'node_list'.
INFO:tensorflow:Restoring parameters from /var/folders/dm/3kslprps6b736vz2bgdqpwx00000gn/T/tmptn4cp4ds/model.ckpt
Primer time: [0, 4.235282000000001]
Generation time: [4.235282000000001, 22.588224]
INFO:tensorflow:Beam search yields sequence with log-likelihood: -147.267471 
Generated midi file: /Users/brian/Synthetic-Symphony-ML-422/422magenta_brian/output/got_melody_attention_rnn.mid
Generated plot file: /Users/brian/Synthetic-Symphony-ML-422/422magenta_brian/output/got_melody_attention_rnn.html


As we can see, the patterns that are repeated span a larger range with Attention RNN, and thus show more variance than with Lookback. This creates a more natural-sounding melody, as more variation is allowed, given the larger window that Attention gives to its pattern detection.

## Summary

Through testing three RNN models, we have found that a model that makes music has to understand how to recognize patterns. The patterns that the model produces have to be recognizable so that the result is not a random mess of notes, but the model also has to be holistic enough not to repeat the same patterns too frequently.