# Lofi Music Generator 🎶

## Basic Terminologies

1. **Recurrent Neural Networks (RNN):** A recurrent neural network is a class of artificial neural networks that make use of sequential information. They are called recurrent because they perform the same function for every single element of a sequence, with the result being dependent on previous computations.

2. **Long Short-Term Memory (LSTM):** A type of recurrent neural network that can be trained efficiently with gradient descent. Using a gating mechanism, LSTMs are able to recognise and encode long-term patterns, which makes them well suited to problems where the network has to remember information for a long period of time.

3. **Music21:**  A Python toolkit used for computer-aided musicology. It allows us to teach the fundamentals of music theory, generate music examples and study music.

4. **Keras:** A high-level neural networks API that simplifies interactions with TensorFlow.
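
The recurrence described in item 1 can be illustrated with a minimal sketch. The scalar weights `w_x`, `w_h` and `b` are arbitrary illustrative values, not parameters from this project, and a real LSTM adds gates on top of this basic update.

```python
import math

def rnn_forward(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Apply the same update to every element of the sequence.

    Each hidden state depends on the current input and on the previous state.
    """
    h = 0.0  # initial hidden state
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states

# Only the first input is non-zero, yet later states are non-zero too,
# because each state carries information from the previous one.
states = rnn_forward([1.0, 0.0, 0.0])
print(states)
```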

## Data Dictionary

1. **Note:** A note is a small bit of sound, similar to a syllable in spoken language.

2. **Chord:**  Any harmonic set of pitches/frequencies consisting of multiple notes that are heard as if sounding simultaneously.

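3. **Pitch:** The frequency of the sound, or how high or low it is. Pitches are named with the letters [A, B, C, D, E, F, G]. The letters repeat in every octave, so a letter on its own does not say which of two pitches is higher.

4. **Octave:** Which set of pitches you use on a piano. In a pitch name such as `E4` or `E5`, the number after the letter is the octave.

5. **Offset:** Where the note is located in the piece, measured from the start.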

6. **Lofi Hip Hop:** Lo-fi hip hop is a subgenre of music that fuses traditional hip-hop and jazz elements to create an atmospheric, soothing, instrumental soundscape. It is characterized by an introspective mood, mellow tunes, and imagery borrowed from Japanese anime.
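
To make the pitch and octave entries concrete, here is a small sketch that converts pitch names in the style used later in this notebook (for example `F#4` or `E-5`, where `-` marks a flat) into MIDI note numbers. The helper `pitch_to_midi` is hypothetical and written for illustration only. It assumes the common convention that C4 is MIDI note 60, and it is not part of music21.

```python
# Semitone distance of each letter from C within one octave.
PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def pitch_to_midi(name):
    """Convert a name such as 'F#4' or 'E-5' to a MIDI note number (C4 = 60)."""
    letter, rest = name[0], name[1:]
    accidental = 0
    # '#' raises the pitch by a semitone, '-' lowers it by a semitone.
    while rest and rest[0] in "#-":
        accidental += 1 if rest[0] == "#" else -1
        rest = rest[1:]
    octave = int(rest)
    return 12 * (octave + 1) + PITCH_CLASS[letter] + accidental

for name in ["C4", "A4", "F#4", "E-5"]:
    print(name, pitch_to_midi(name))
```

The same letter in a higher octave is always 12 semitones higher, which is why the letter alone does not determine how high a pitch is.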

## Preparing the data

In [3]:
#!pip install music21

In [1]:
import numpy as np 
import os
import tensorflow as tf
import glob # Return all file paths that match a specific pattern.
import pickle # serializing and de-serializing a Python object structure
from music21 import converter, instrument, note, chord

### music21 

1. **music21.converter** contains tools for loading music from various file formats, whether from disk, from the web, or from text, into `music21.stream.Score` objects (or other similar stream objects).

2. **music21.instrument** represents instruments through objects that contain general information such as Metadata for instrument names, classifications, transpositions and default MIDI program numbers. It also contains information specific to each instrument or instrument family, such as string pitches, etc. 

3. **music21.note** contains classes and functions for creating Notes, Rests, and Lyrics.

4. **music21.chord** defines the Chord object, a sub-class of GeneralNote as well as other methods, functions, and objects related to chords.

In [2]:
def get_notes(directory):
    '''
    Get all the notes and chords from the midi files in the directory
    '''
    notes = []

    filepaths = os.listdir(directory)

    for file in filepaths:
        midi = converter.parse(os.path.join(directory, file)) #loading each file into a Music21 stream object
        parsed_note = None

        try:    #file has instrument parts
            meta = instrument.partitionByInstrument(midi)
            parsed_note = meta.parts[0].recurse()
        except Exception: # file has notes in a flat structure
            parsed_note = midi.flat.notes

        for element in parsed_note:
            if isinstance(element, note.Note): #a single note is stored by its pitch name, e.g. 'E4'
                notes.append(str(element.pitch))
            elif isinstance(element, chord.Chord):
                #Chord.normalOrder returns the normal form of the chord as a list of integers.
                #Each chord is stored as those integers joined by dots, e.g. '4.7.11',
                #so the network output can easily be decoded back into notes and chords.
                notes.append('.'.join(str(n) for n in element.normalOrder))

    #save the notes once, after every file has been parsed
    os.makedirs('data', exist_ok=True)
    with open('data/notes', 'wb') as filepath:
        pickle.dump(notes, filepath)

    return notes
        

In [3]:
notes = get_notes("Lofi")
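
The string encoding used by `get_notes` can be tried without any MIDI files. The sketch below is a simplified stand-in: the hypothetical helper `encode_element` takes either a pitch name (a note) or a list of integers (a chord in normal order) instead of real music21 objects.

```python
def encode_element(element):
    """Encode a note (pitch-name string) or a chord (list of ints) as one token."""
    if isinstance(element, str):
        return element  # notes keep their pitch name, e.g. 'E4'
    # chords become their integers joined by dots, e.g. [4, 7, 11] -> '4.7.11'
    return ".".join(str(n) for n in element)

tokens = [encode_element(e) for e in ["E4", [4, 7, 11], "G4", [11]]]
print(tokens)
```

Every element of the piece, whether a note or a chord, ends up as a single string, so the whole piece becomes one flat list of tokens.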

In [4]:
def sequence():
    '''
    Create input sequences for the network and their respective outputs.
    The output for each input sequence is the first note or chord that
    comes after that sequence in our list of notes.
    '''
    sequence_len = 100

    pitch = sorted(set(notes)) #get all distinct notes and chords

    # create a dictionary to map notes and chords to integers
    int_note = dict((name, number) for number, name in enumerate(pitch))

    net_in =  [] #Network Input
    net_out = [] #Network Output

    # create input and output sequences
    for i in range(0, len(notes)-sequence_len):
        seq_in   = notes[i:i+sequence_len]
        seq_out  = notes[i+sequence_len]

        net_in.append([int_note[name] for name in seq_in])
        net_out.append(int_note[seq_out])

    n_patterns = len(net_in) #number of training sequences

    # reshape the input into a format compatible with LSTM layers
    net_in =  np.reshape(net_in, (n_patterns, sequence_len, 1))

    # normalize input
    #net_in = net_in / float(n_vocab)

    net_out = tf.keras.utils.to_categorical(net_out) #Converts a class vector (integers) to binary class matrix.

    return (net_in,net_out)

In [5]:
net_in, net_out = sequence()
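
The sliding-window construction is easier to follow on a toy example. The sketch below uses a hypothetical helper `make_windows`, a six-token list, and a window length of 3 instead of 100. It works on plain Python lists and leaves out the reshaping and one-hot steps.

```python
def make_windows(tokens, sequence_len):
    """Return integer-encoded input windows, their next-token targets, and the mapping."""
    vocab = sorted(set(tokens))
    token_to_int = {tok: i for i, tok in enumerate(vocab)}
    inputs, targets = [], []
    for i in range(len(tokens) - sequence_len):
        window = tokens[i:i + sequence_len]
        inputs.append([token_to_int[t] for t in window])
        # the target is the token that immediately follows the window
        targets.append(token_to_int[tokens[i + sequence_len]])
    return inputs, targets, token_to_int

toy = ["E4", "G4", "4.7.11", "E4", "G4", "A5"]
inputs, targets, mapping = make_windows(toy, 3)
print(mapping)
print(inputs)
print(targets)
```

A list of `N` tokens yields `N - sequence_len` training pairs, which is why the number of patterns should be derived from the data and not fixed in advance.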

## Model

In our model we use four different types of layers:

**LSTM layers:** A Recurrent Neural Net layer that takes a sequence as an input and returns either the full sequence of outputs (return_sequences=True) or only the last output.


**Dropout layers:** A regularisation technique that consists of setting a fraction of input units to 0 at each update during the training to prevent overfitting. The fraction is determined by the parameter used with the layer.


**Dense layers:** A fully connected neural network layer where each input node is connected to each output node.


**Activation layer:** Determines what activation function our neural network will use to calculate the output of a node.
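
Dropout and the softmax activation can be sketched in plain Python. This is an illustration of the ideas, not the Keras implementation. It follows the documented behaviour of the Keras `Dropout` layer, which scales the units that are kept by `1 / (1 - rate)` during training. The helper names are hypothetical.

```python
import math
import random

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale the survivors by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]

rng = random.Random(0)
print(dropout([1.0, 2.0, 3.0, 4.0], rate=0.3, rng=rng))
print(softmax([1.0, 2.0, 3.0]))
```

Dropout is only active during training. At prediction time the layer passes its input through unchanged.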

In [6]:
def create_model():
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.LSTM(
        units = 512, #Positive integer, dimensionality of the output space.
        input_shape=(net_in.shape[1], net_in.shape[2]),
        recurrent_dropout=0.3, # Fraction of the units to drop for the linear transformation of the recurrent state.
        return_sequences=True # Whether to return the last output in the output sequence, or the full sequence.
    ))
    model.add(tf.keras.layers.LSTM(512, return_sequences=True, recurrent_dropout=0.3,))
    model.add(tf.keras.layers.LSTM(512))
    model.add(tf.keras.layers.BatchNormalization()) #Layer that normalizes its inputs.
    model.add(tf.keras.layers.Dropout(0.3)) #Applies Dropout to the input.
    model.add(tf.keras.layers.Dense(254)) #254 output units; must equal net_out.shape[1], the number of distinct notes and chords
    model.add(tf.keras.layers.Activation('relu'))
    model.add(tf.keras.layers.BatchNormalization())
    model.add(tf.keras.layers.Dropout(0.3))
    model.add(tf.keras.layers.Activation('softmax'))
    
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
    
    return model

In [7]:
model = create_model()
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm (LSTM)                 (None, 100, 512)          1052672   
                                                                 
 lstm_1 (LSTM)               (None, 100, 512)          2099200   
                                                                 
 lstm_2 (LSTM)               (None, 512)               2099200   
                                                                 
 batch_normalization (BatchN  (None, 512)              2048      
 ormalization)                                                   
                                                                 
 dropout (Dropout)           (None, 512)               0         
                                                                 
 dense (Dense)               (None, 254)               130302    
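
The parameter counts in the summary above can be checked by hand. The sketch below uses the standard formulas for each layer type. The helper names are illustrative, and the input sizes (1 feature per time step, 512 units, 254 outputs) are taken from the summary.

```python
def lstm_params(input_dim, units):
    # Four weight sets (input, forget and output gates plus the cell candidate),
    # each with input weights, recurrent weights and a bias vector.
    return 4 * ((input_dim + units) * units + units)

def dense_params(input_dim, units):
    # One weight per input/output pair plus one bias per output unit.
    return input_dim * units + units

def batch_norm_params(features):
    # gamma, beta, moving mean and moving variance for every feature.
    return 4 * features

print(lstm_params(1, 512))      # lstm
print(lstm_params(512, 512))    # lstm_1 and lstm_2
print(batch_norm_params(512))   # batch_normalization
print(dense_params(512, 254))   # dense
```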
                                                        

In [78]:
def train_model(model, net_in, net_out):
    '''
    Training your neural network
    '''
    filepath="weights-{epoch:02d}-{loss:.4f}-bigger.hdf5"
    checkpoint = tf.keras.callbacks.ModelCheckpoint( #Callback to save the Keras model or model weights at some frequency.
        filepath,
        monitor='loss',
        verbose=0,
        save_best_only=True,
        mode='min'
    )
    callbacks_list = [checkpoint]
    
    model.fit(net_in, net_out, epochs=100, batch_size=128, callbacks=callbacks_list) #Train the model
    
    return model
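
The checkpoint file name is a format template that Keras fills in with the epoch number and the logged loss. The sketch below reproduces that formatting with plain `str.format` and adds a hypothetical helper, `best_checkpoint`, that picks the file with the lowest loss from a list of names. The file names other than `weights-93-1.9348-bigger.hdf5` are made up for the example.

```python
import re

TEMPLATE = "weights-{epoch:02d}-{loss:.4f}-bigger.hdf5"

def checkpoint_name(epoch, loss):
    """Format a checkpoint file name the same way the template above does."""
    return TEMPLATE.format(epoch=epoch, loss=loss)

def best_checkpoint(filenames):
    """Return the file name with the lowest loss encoded in it, or None."""
    pattern = re.compile(r"weights-(\d+)-(\d+\.\d+)-bigger\.hdf5$")
    scored = []
    for name in filenames:
        match = pattern.search(name)
        if match:
            scored.append((float(match.group(2)), name))
    return min(scored)[1] if scored else None

print(checkpoint_name(93, 1.9348))
print(best_checkpoint([
    "weights-05-3.1416-bigger.hdf5",   # made-up example
    "weights-93-1.9348-bigger.hdf5",
    "notes.txt",
]))
```

Because `save_best_only=True` only writes a file when the monitored loss improves, the most recent checkpoint is normally also the one with the lowest loss.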

In [79]:
model = train_model(model, net_in, net_out)
model

Epoch 1/100
...
Epoch 77/100
Epoch 78

<keras.engine.sequential.Sequential at 0x136f206d070>

In [81]:
#model.save('model.hdf5')

In [84]:
# model = tf.keras.models.load_model('models')
# model.summary()

model = create_model_weights(net_in, 100) # defined in cell In [83] below
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm (LSTM)                 (None, 100, 512)          1052672   
                                                                 
 lstm_1 (LSTM)               (None, 100, 512)          2099200   
                                                                 
 lstm_2 (LSTM)               (None, 512)               2099200   
                                                                 
 batch_normalization (BatchN  (None, 512)              2048      
 ormalization)                                                   
                                                                 
 dropout (Dropout)           (None, 512)               0         
                                                                 
 dense (Dense)               (None, 254)               130302    
                                                        

In [83]:
def create_model_weights(network_input, n_vocab):
    """ 
    Create the structure of the neural network and load the trained weights.
    n_vocab is accepted for compatibility with existing calls but is not used.
    """
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.LSTM(
        units = 512, #Positive integer, dimensionality of the output space.
        input_shape=(network_input.shape[1], network_input.shape[2]),
        recurrent_dropout=0.3, # Fraction of the units to drop for the linear transformation of the recurrent state.
        return_sequences=True # Whether to return the last output in the output sequence, or the full sequence.
    ))
    model.add(tf.keras.layers.LSTM(512, return_sequences=True, recurrent_dropout=0.3,))
    model.add(tf.keras.layers.LSTM(512))
    model.add(tf.keras.layers.BatchNormalization()) #Layer that normalizes its inputs.
    model.add(tf.keras.layers.Dropout(0.3)) #Applies Dropout to the input.
    model.add(tf.keras.layers.Dense(254))
    model.add(tf.keras.layers.Activation('relu'))
    model.add(tf.keras.layers.BatchNormalization())
    model.add(tf.keras.layers.Dropout(0.3))
    model.add(tf.keras.layers.Activation('softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

    # Load the trained weights from a checkpoint file written during training
    model.load_weights('weights-93-1.9348-bigger.hdf5')

    return model

## Generate Song

In [155]:
def generate_notes(model, network_input, notes, n_vocab):
    """ 
    Predict the next note or chord for a block of input sequences.
    n_vocab is kept for compatibility with existing calls and is not used.
    """
    pitch = sorted(set(notes))

    # map the integers predicted by the network back to notes and chords
    int_to_note = dict((number, note) for number, note in enumerate(pitch))

    prediction_output = []

    # predict the next note for 300 consecutive input sequences (indices 100 to 399)
    predictions = model.predict(network_input[100:400], verbose=0)
    for i in range(len(predictions)):
        index = np.argmax(predictions[i])
        result = int_to_note[index]
        prediction_output.append(result)

    return prediction_output
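
A generator that feeds each prediction back into its own input window, so that the output is not tied to existing training sequences, could be sketched as follows. This is an alternative under stated assumptions, not the method used to produce the output shown in this notebook. `generate_tokens` and `toy_predict` are hypothetical names, and `toy_predict` is a stand-in for the trained network so that the example runs without TensorFlow. With the real model, the prediction function would wrap something like `model.predict(np.reshape(pattern, (1, len(pattern), 1)), verbose=0)[0]`.

```python
def generate_tokens(predict_fn, seed, int_to_note, n_steps):
    """Generate n_steps tokens by repeatedly predicting and sliding the window."""
    pattern = list(seed)
    output = []
    for _ in range(n_steps):
        probs = predict_fn(pattern)
        # pick the most likely class (equivalent to argmax)
        index = max(range(len(probs)), key=probs.__getitem__)
        output.append(int_to_note[index])
        # drop the oldest element and append the prediction
        pattern = pattern[1:] + [index]
    return output

def toy_predict(pattern, n_vocab=3):
    """Stand-in model: always favours the class after the last one, cyclically."""
    nxt = (pattern[-1] + 1) % n_vocab
    return [1.0 if i == nxt else 0.0 for i in range(n_vocab)]

int_to_note = {0: "E4", 1: "G4", 2: "4.7.11"}
generated = generate_tokens(toy_predict, [0, 1, 2], int_to_note, 5)
print(generated)
```

Since the network in this notebook was trained on unnormalised integer inputs, a loop like this would feed the raw integers back in without dividing by the vocabulary size.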

In [156]:
generated_notes = generate_notes(model, net_in, notes, 100)
print(generated_notes)

['6.9.11.2', 'F#4', 'F#5', 'E5', 'G5', 'B5', '4.7.11', 'E4', 'G4', 'E5', '6.9.11.2', 'F#4', 'F#5', 'E5', 'G5', 'B5', '4.7.11', 'E4', 'G4', 'A5', '6.9.11.2', 'A4', '4.5.9.0', 'A4', '2.4.7.9', '4.7.9.0', 'A4', '4.7.9.11.0', 'E4', 'G4', 'A5', 'E5', 'A4', '4.5.9.0', 'A4', '2.4.7.9', '4.7.9.0', 'A4', '5.9.0', 'E4', 'G6', 'A5', 'E6', 'A5', 'G5', 'D6', 'E5', '11.0', '4.7', 'E5', 'E4', 'G5', 'G3', 'C5', 'D5', '5.9.0', 'E4', 'G6', 'F6', 'E6', 'A5', 'G5', 'D6', 'E5', '11.0', '4.7', 'E5', 'E4', 'G5', 'G3', 'C5', 'D5', '0.1.3.5.8', 'G#5', 'G5', 'E-5', 'C5', '0.1.3.5.8', 'C5', 'B-4', 'C5', '3.7.10', 'E-5', 'C5', '0.3.5.8', 'G#5', '0.3.5.7.8', 'G5', 'E-5', 'C5', '0.3.5.7.8', 'C5', 'B-4', '7.8.10.0.3', '7.8.10.0.3', 'B-5', 'C6', '4.6.9.11', '1', '11', '9.11.1.4', 'C#4', '9.11.1.4', '8.11.1.4', '1', '7.10.0.3', 'C#4', 'B1', '2.6.9', '4', 'C#4', '11', 'B1', '2.6.9', 'C#4', 'D2', '6.9.1', 'C#4', 'C#4', '9', '4.6.9.11', '1', '11', '9.11.1.4', 'C#4', '9.11.1.4', '8.11.1.4', '1', '7.10.0.3', 'C#4', 'B1', '

## Stream Music

In [157]:
from music21 import stream

def create_midi(prediction_output):
    """ 
    convert the output from the prediction to notes and create a midi file
    from the notes 
    """
    offset = 0
    output_notes = []

    # create note and chord objects based on the values generated by the model
    for pattern in prediction_output:
        # pattern is a chord
        if ('.' in pattern) or pattern.isdigit():
            notes_in_chord = pattern.split('.')
            notes = []
            for current_note in notes_in_chord:
                new_note = note.Note(int(current_note))
                new_note.storedInstrument = instrument.Piano()
                notes.append(new_note)
            new_chord = chord.Chord(notes)
            new_chord.offset = offset
            output_notes.append(new_chord)
        # pattern is a note
        else:
            new_note = note.Note(pattern)
            new_note.offset = offset
            new_note.storedInstrument = instrument.Piano()
            output_notes.append(new_note)

        # increase offset each iteration so that notes do not stack
        offset += 0.5

    midi_stream = stream.Stream(output_notes)

    midi_stream.write('midi', fp='test_output3.mid')

In [158]:
create_midi(generated_notes)
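
The decoding rule inside `create_midi` can be inspected without music21. The sketch below applies the same test (a token containing a dot, or made only of digits, is a chord) and the same 0.5 offset step, but returns plain tuples in place of `Note` and `Chord` objects. The helper `plan_events` is hypothetical.

```python
def plan_events(tokens, step=0.5):
    """Return (offset, kind, value) tuples for a list of generated tokens."""
    events = []
    offset = 0.0
    for token in tokens:
        if "." in token or token.isdigit():
            # chord: dot-separated integers
            events.append((offset, "chord", [int(n) for n in token.split(".")]))
        else:
            # note: a pitch name such as 'F#4'
            events.append((offset, "note", token))
        offset += step  # advance so that events do not stack
    return events

for event in plan_events(["6.9.11.2", "F#4", "11"]):
    print(event)
```

With a fixed step every note and chord lasts the same amount of time, so the rhythm of the original pieces is not reproduced.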

## AI Generated Song

https://soundcloud.com/user-467169078/ai-lofi?utm_source=clipboard&utm_medium=text&utm_campaign=social_sharing