## RNN-Music-Generator for music in ABC Notation

<p>
    This is a small project that explores recurrent neural networks (RNNs). We use an RNN to learn patterns from raw sheet
    music written in ABC notation and then use the trained model to generate new pieces of music, one predicted character
    at a time.
    <br><br>
    To accomplish this, we input a sequence of characters into the model and train the model to predict the output, i.e., the
    next character at each time step. RNNs maintain an internal state that depends on previously seen elements, so information
    about all characters seen up to a given time is taken into account when generating the prediction.
    <br><br>
    This notebook walks through the task step by step. For details about the implementation, refer to the file "utils.py",
    which is imported below.
    <br><br>
    Note: This project was written on a Windows machine using the Windows Subsystem for Linux (WSL), because of the programs
    used to convert songs in ABC notation. If you are working on a Linux-based machine, you may need to remove the "wsl"
    commands in the .wav creation.
</p>

### Import necessary dependencies

In [1]:
import tensorflow as tf
import utils
import os

### About the available datasets

<p>
    This project provides several datasets of songs represented in ABC notation. You can also use your own music dataset,
    as long as it is written in this notation: place it in the data folder and load it with the "utils.load_data()"
    function below. You can find more information about ABC notation here:
    <br>
    <ul>
    <li><a href="https://en.wikipedia.org/wiki/ABC_notation">Wikipedia - ABC-notation</a></li>
    <li><a href="https://abcnotation.com/">abcnotation.com</a></li>
    </ul>
</p>

<p>
    The collected .abc tunes come from the following sources:
    <ul>
      <li><a href="https://www.norbeck.nu/abc/">Henrik Norbeck's ABC Tunes</a></li>
      <li><a href="http://tradfrance.com/aaa/index.php/musique-traditionnelle-de-france/">Musique traditionnelle de France</a></li>
      <li><a href="https://math.dartmouth.edu/~doyle/docs/waugh/">Fiddle Tunes from Bernie Waugh</a></li>
    </ul>
</p>

<p>
    The data folder already contains the following datasets:
    <br>
    <ul>
        <li>american-canadian</li>
        <li>celtic-british</li>
        <li>france</li>
        <li>irish</li>
        <li>jigs</li>
    </ul>
    
</p>

In [2]:
# Enter one of the names of the datasets above:

abc_song_data = utils.load_data('irish')

There are 817 songs in the "irish" dataset.


In [3]:
# For example this is what a song looks like

print(abc_song_data[0])

X:1
T:Alexander's
Z: id:dc-hornpipe-1
M:C|
L:1/8
K:D Major
(3ABc|dAFA DFAd|fdcd FAdf|gfge fefd|(3efe (3dcB A2 (3ABc|!
dAFA DFAd|fdcd FAdf|gfge fefd|(3efe dc d2:|!
AG|FAdA FAdA|GBdB GBdB|Acec Acec|dfaf gecA|!
FAdA FAdA|GBdB GBdB|Aceg fefd|(3efe dc d2:|!


### Process the data for the learning task
<p>
    To use an RNN for music generation, we need to convert the string representation of our .abc dataset into one long
    vector of integers, because the model operates on numbers rather than characters. Subsequences of this vector will
    later be fed into the model to train it for output prediction.
    <br>
    To do this, we first combine the list of songs into a single string. Then we create a vocabulary that lets us convert
    the combined string into a numeric vector. The following steps show the details.
</p>

In [4]:
# combine all songs into a single string

joined_songs = "\n\n".join(abc_song_data)

<p>
    In the next preprocessing step, we create a mapping for all unique characters so that the joined string can be
    vectorized into a numeric representation. The mapping is a dictionary called "vocabulary" that assigns an integer to
    each unique character.
</p>

In [5]:
vocabulary = utils.get_vocabulary(joined_songs)

In [6]:
# Some useful information about the created vocabulary

print("Length of vocabulary:", len(vocabulary))
print("\nShowing the first entries of the vocabulary:")

items = list(vocabulary.items())
for char, index in items[:5]:
    print("{} ---> {}".format(repr(char), index))
print("...")

Length of vocabulary: 83

Showing the first entries of the vocabulary:
'\n' ---> 0
' ' ---> 1
'!' ---> 2
'"' ---> 3
'#' ---> 4
...
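
<p>
    The output above is consistent with assigning indices to the sorted set of unique characters. The following is a minimal
    sketch of that idea. The name "build_vocabulary" is hypothetical, and the actual "utils.get_vocabulary" may be
    implemented differently.
</p>

```python
def build_vocabulary(text):
    """Map every unique character in `text` to an integer index.

    Hypothetical stand-in for utils.get_vocabulary (assumption: indices
    follow the sorted order of the characters).
    """
    unique_chars = sorted(set(text))
    return {char: index for index, char in enumerate(unique_chars)}


# a tiny made-up ABC header, used only for illustration
sample = "X:1\nT:Demo\nK:D\n"
vocab = build_vocabulary(sample)

for char, index in vocab.items():
    print("{} ---> {}".format(repr(char), index))
```

<p>
    Sorting the characters makes the mapping reproducible: the same dataset always yields the same indices, which matters
    when a model is restored from a checkpoint later.
</p>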


<p>
    Now we can create our numerical representation of the string:
</p>

In [7]:
vectorized_songs = utils.string2numerical(joined_songs)

print("The numerical song vector looks like this:", vectorized_songs)

The numerical song vector looks like this: [49 22 13 ... 22 82  2]
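
<p>
    Vectorizing a string amounts to one dictionary lookup per character, and decoding uses the inverse mapping. The sketch
    below illustrates this under some assumptions: "encode" and "decode" are hypothetical names, the vocabulary is passed
    in explicitly, and plain Python lists are used, whereas the helpers in "utils.py" appear to return an array.
</p>

```python
def encode(text, char_to_index):
    """Turn a string into a list of integer indices (one per character)."""
    return [char_to_index[char] for char in text]


def decode(indices, char_to_index):
    """Turn a sequence of indices back into the string it encodes."""
    index_to_char = {index: char for char, index in char_to_index.items()}
    return "".join(index_to_char[int(index)] for index in indices)


# a short snippet of ABC header text, used only for illustration
text = "M:C|\nL:1/8"
char_to_index = {char: i for i, char in enumerate(sorted(set(text)))}

vector = encode(text, char_to_index)
print(vector)
print(repr(decode(vector, char_to_index)))
```

<p>
    Decoding the encoded vector returns the original text, so no information is lost by the conversion.
</p>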


<p>
    Below, the first five lines of the first song in our dataset are presented numerically to provide a visual example of the
    procedure:
</p>

In [10]:
first_song = abc_song_data[0]

# create a numerical representation of each line of the first song
string_rows = first_song.split("\n")
convert = [str(utils.string2numerical(row)) for row in string_rows]

# print first 5 lines of each representation
print("\n".join(first_song.split("\n")[:5]), "\n")
print("\n".join(convert[:5]))

X:1
T:Alexander's
Z: id:dc-hornpipe-1
M:C|
L:1/8 

[49 22 13]
[45 22 26 67 60 79 56 69 59 60 73  5 74]
[51 22  1 64 59 22 59 58  9 63 70 73 69 71 64 71 60  9 13]
[38 22 28 82]
[37 22 13 11 20]


### Building the Recurrent Neural Network

<p>
    In this section, we will build an RNN and train it to learn the patterns of songs written in ABC notation, and then use
    this model to generate never-before-heard music.
    <br><br>
    Breaking this down, what we're really asking the model is: given a character, or sequence of characters, what is the most 
    likely next character? We will train the model to perform this task.
    <br><br>
    At this point in the notebook, our data is already in a usable form, so we can move on to creating the training
    sequences and the model.
    <br><br>
    As mentioned earlier, the RNN is trained on batches of short snippets taken from our vectorized songs. To create these
    batches we use the "create_batch" function. It returns an input sequence, called x_batch, and a corresponding target
    sequence, called y_batch, that our RNN must learn to predict.
    <br><br>
    Each snippet is cut from a chunk of "seq_length + 1" characters: the input is the first "seq_length" characters, and
    the target is the same chunk shifted by one character. For example, if the chunk is "hello" and seq_length is 4, our
    x_batch is "hell" and our y_batch is "ello". At every position the RNN should predict the next character, so after
    reading "hell" it should predict "o".
</p>

In [None]:
# a small example for a created batch:
x_batch, y_batch = utils.create_batch(vectorized_songs[0:20], seq_length=5, batch_size=1)

print("x_batch:", x_batch[0], "\n    --->", list(utils.numerical2string(x_batch[0])))
print("\ny_batch:", y_batch[0], "\n    --->", list(utils.numerical2string(y_batch[0])))
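
<p>
    The shifting of input and target by one position can be sketched in a few lines. This is an illustration only:
    "make_batch" is a hypothetical name, and drawing the start positions uniformly at random is an assumption about how
    "utils.create_batch" selects its snippets.
</p>

```python
import random


def make_batch(vectorized, seq_length, batch_size, rng=random):
    """Sample `batch_size` input/target pairs from a sequence of indices.

    Hypothetical stand-in for utils.create_batch (assumption: start
    positions are drawn uniformly at random).
    """
    # the last valid start index must leave room for seq_length + 1 items
    last_start = len(vectorized) - seq_length - 1
    if last_start < 0:
        raise ValueError("sequence is too short for the requested seq_length")

    x_batch, y_batch = [], []
    for _ in range(batch_size):
        start = rng.randint(0, last_start)  # inclusive on both ends
        chunk = vectorized[start:start + seq_length + 1]
        x_batch.append(chunk[:-1])  # first seq_length items
        y_batch.append(chunk[1:])   # same items shifted by one position
    return x_batch, y_batch


# characters are used instead of indices to keep the example readable
x_batch, y_batch = make_batch(list("hello"), seq_length=4, batch_size=1)
print("x_batch:", "".join(x_batch[0]))
print("y_batch:", "".join(y_batch[0]))
```

<p>
    With the text "hello" and a sequence length of 4 there is only one possible start position, so the sketch reproduces the
    "hell" / "ello" example from the text.
</p>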

<p>
    But how will our RNN process and learn from these batches? Each index in a sequence is processed in its own time step.
    At time step 0, the model receives the index of the first character in the input sequence and tries to predict the
    index of the next character. This process is repeated for each following time step, but in addition to the current
    input, the RNN also uses its internal state, which was updated at the previous step.
</p>

In [None]:
# below you can see how this works over the first characters in our text:

for i, (input_index, target_index) in enumerate(zip(x_batch[0], y_batch[0])):
    print("Step:", i)
    print("      input:", input_index, "('{}')".format(utils.numerical2string([int(input_index)])))
    print("      target:", target_index, "('{}')".format(utils.numerical2string([int(target_index)])))
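
<p>
    How the state carries context from one time step to the next can be illustrated with a vanilla RNN cell. This is a
    sketch and not the model used in this notebook: it uses one-hot inputs, random untrained weights and the recurrence
    h_t = tanh(W_x x_t + W_h h_(t-1) + b), whereas the notebook's model is based on an LSTM. All names are hypothetical.
</p>

```python
import math
import random


def rnn_step(x_t, h_prev, w_x, w_h, bias):
    """One step of a vanilla RNN cell: h_t = tanh(W_x x_t + W_h h_prev + b)."""
    h_t = []
    for j in range(len(h_prev)):
        total = bias[j]
        total += sum(w_x[j][k] * x_t[k] for k in range(len(x_t)))
        total += sum(w_h[j][k] * h_prev[k] for k in range(len(h_prev)))
        h_t.append(math.tanh(total))
    return h_t


def one_hot(index, size):
    """Encode an index as a vector with a single 1.0 entry."""
    return [1.0 if i == index else 0.0 for i in range(size)]


rng = random.Random(0)  # fixed seed so the run is reproducible
vocab_size, hidden_size = 4, 3
w_x = [[rng.uniform(-1, 1) for _ in range(vocab_size)] for _ in range(hidden_size)]
w_h = [[rng.uniform(-1, 1) for _ in range(hidden_size)] for _ in range(hidden_size)]
bias = [0.0] * hidden_size

state = [0.0] * hidden_size  # initial state: no context yet
states = []
for step, index in enumerate([2, 0, 2]):
    state = rnn_step(one_hot(index, vocab_size), state, w_x, w_h, bias)
    states.append(state)
    print("Step:", step, "input:", index, "state:", [round(v, 3) for v in state])
```

<p>
    The input index 2 appears at steps 0 and 2, but the resulting states differ because the second occurrence is combined
    with the state left behind by the earlier inputs.
</p>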

<p>
    <u>Now we can finally create a new model or load an old model from saved checkpoints:</u>
    <br><br>
    This model is based on the LSTM architecture, which maintains a state vector that carries information about the
    temporal relationships between successive characters (see the example above). The output of the LSTM is fed into a
    fully connected dense layer that produces one score per character in the vocabulary. A softmax over these scores gives
    a probability distribution, from which we sample to predict the next character.
</p>
<p>
    Three layers are used to define the model:
    <ul>
        <li>Layer 1: Embedding layer to transform indices into dense vectors of a fixed embedding size</li>
        <li>Layer 2: LSTM network with 'rnn_units' number of units</li>
        <li>Layer 3: Dense (fully-connected) layer that transforms the LSTM output into the vocabulary size</li>
    </ul>
</p>
<br>
<p>
    Let's start by defining some hyperparameters for training the model:
</p>

In [None]:
### Hyperparameter settings and optimization ###

# optimization:
num_iterations = 2000
batch_size = 1
seq_length = 200
learning_rate = 2e-3

# model parameters:
vocabulary_size = len(vocabulary)
embedding_dim = 256
rnn_units = 1536

# checkpoint location:
checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt")

In [None]:
# call the following function to instantiate the model

model = utils.build_model(vocabulary_size, 
                    embedding_dim, 
                    rnn_units, 
                    batch_size)

optimizer = tf.keras.optimizers.Adam(learning_rate)

model.summary()
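
<p>
    The three layers described above could be assembled with Keras roughly as follows. This is a sketch under assumptions,
    not the contents of "utils.build_model": the name "build_model_sketch", the stateful LSTM and the dense layer returning
    raw scores (logits) without a softmax activation are all choices made for this illustration.
</p>

```python
import tensorflow as tf


def build_model_sketch(vocabulary_size, embedding_dim, rnn_units, batch_size):
    """Embedding -> LSTM -> Dense, following the three-layer description.

    Hypothetical stand-in for utils.build_model; the stateful LSTM and the
    missing softmax activation (the model returns logits) are assumptions.
    """
    return tf.keras.Sequential([
        # a fixed batch size is needed by the stateful LSTM below
        tf.keras.Input(shape=(None,), batch_size=batch_size),
        # Layer 1: index -> dense vector of size embedding_dim
        tf.keras.layers.Embedding(vocabulary_size, embedding_dim),
        # Layer 2: one output per time step; the state is kept between calls
        tf.keras.layers.LSTM(rnn_units, return_sequences=True, stateful=True),
        # Layer 3: one score (logit) per vocabulary entry
        tf.keras.layers.Dense(vocabulary_size),
    ])


# deliberately small sizes so that the sketch runs quickly
sketch = build_model_sketch(vocabulary_size=83, embedding_dim=8, rnn_units=16, batch_size=1)
logits = sketch(tf.zeros((1, 5), dtype=tf.int32))
print(logits.shape)
```

<p>
    The output has one vector of vocabulary-sized scores for every position of the input sequence, which is what allows the
    model to be trained on a prediction at each time step.
</p>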

<p>
    Now we can train this model with our preprocessed data. Uncomment the call in the cell below if you want to train a new
    model.
    <br><br>
    Caution: This deletes the old model! Change the name of the checkpoint or save the files elsewhere to keep the model!
</p>

In [None]:
# only use this function if you want to train a new model, else load a pretrained model below

#utils.train_model(num_iterations, vectorized_songs, seq_length, batch_size, model, optimizer, checkpoint_prefix, plot_loss=True)

<p>
    Run this cell to load an existing model for faster music generation:
</p>

In [None]:
# Restore the model weights for the last checkpoint after training

model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model.build(tf.TensorShape([batch_size, None]))

model.summary()

<p>
    How does prediction work once the model is fully trained?
    <ol>
        <li>Initialize a start string (here 'X') and the RNN state, and specify the number of characters to be
            generated.</li>
        <li>Use the start string and the RNN state to obtain the probability distribution over the next character.</li>
        <li>Sample from this categorical (multinomial) distribution to obtain the index of the predicted character. The
            predicted character is then used as the next input to the model.</li>
        <li>At each time step, the updated RNN state is fed back into the model so that it has more context for the next
            prediction. After the next character is predicted, the updated state is fed into the model again. In this way,
            the model exploits the sequence dependencies it learned during training, carrying context forward from its
            previous predictions.</li>
    </ol>
</p>
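
<p>
    The steps above can be sketched as a short sampling loop. The example is an illustration under assumptions:
    "step_fn" is a hypothetical callable standing in for the trained model, and "toy_step" is an untrained toy over a
    three-symbol vocabulary, used only so that the loop can run on its own.
</p>

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def sample_index(probabilities, rng):
    """Draw one index from a categorical distribution."""
    return rng.choices(range(len(probabilities)), weights=probabilities, k=1)[0]


def generate(step_fn, start_index, initial_state, length, rng):
    """Autoregressive loop: each sampled index becomes the next input.

    `step_fn(index, state)` is a hypothetical callable standing in for the
    trained model; it must return (logits, new_state).
    """
    generated = [start_index]
    index, state = start_index, initial_state
    for _ in range(length):
        logits, state = step_fn(index, state)  # the state carries the context
        index = sample_index(softmax(logits), rng)
        generated.append(index)
    return generated


def toy_step(index, state):
    """Toy stand-in for a model over a 3-symbol vocabulary (not trained)."""
    new_state = state + 1  # this toy "state" just counts the steps
    logits = [0.0, 0.0, 0.0]
    logits[(index + 1) % 3] = 5.0  # strongly prefer the successor symbol
    return logits, new_state


rng = random.Random(42)
print(generate(toy_step, start_index=0, initial_state=0, length=10, rng=rng))
```

<p>
    Sampling from the distribution, rather than always taking the most likely index, lets the same model produce different
    songs from the same start string.
</p>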

### Generate music using the RNN model

<p>
    In this section we use our trained model for music generation. Generated songs are saved in the "songs" directory in
    this repository.
</p>
<p>
    <ul>
        <li>Song Count (songcount): Number of songs that will be created</li>
        <li>Generation Length (generation_length): Number of characters we want to generate</li>
        <li>Print ABC (print_abc): Show the ABC notation of the created song</li>
    </ul>
</p>

In [None]:
songs = utils.generate_songs(model, songcount=3, generation_length=500)

In [None]:
utils.play_generated_songs(songs, print_abc=True)

<p>
    The following helper functions play, list and delete songs.
    <br>
    Uncomment a call to use it.
</p>

In [None]:
# this function simply lists all created songs in the songs directory:

#utils.show_songs()

In [None]:
# use this function to play a specific song without creating new ones
# just type in a songnumber to hear the song:

# utils.play_song(songnumber=1)

In [None]:
# this function deletes all songs in the songs directory
# just uncomment and run this function to delete everything:

#utils.delete_songs()