# GPT-2 for music - By Dr. Tristan Behrens

This notebook shows you how to generate music with GPT-2.

---

## Find me online

- https://www.linkedin.com/in/dr-tristan-behrens-734967a2/
- https://twitter.com/DrTBehrens
- https://github.com/AI-Guru
- https://huggingface.co/TristanBehrens
- https://huggingface.co/ai-guru


---

## Install dependencies.

The following cell sets up FluidSynth and pyfluidsynth on Google Colaboratory.

In [41]:
if "google.colab" in str(get_ipython()):
    print("Installing dependencies...")
    #!pip uninstall -y bokeh
    !apt-get update -qq && apt-get install -qq  build-essential libasound2-dev libjack-dev && apt-get install -y libfluidsynth3
    !pip install -qU pyfluidsynth

    !pip install --upgrade bokeh==2.4.3

Installing dependencies...
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
libfluidsynth3 is already the newest version (2.2.5-1).
0 upgraded, 0 newly installed, 0 to remove and 24 not upgraded.


In [42]:
!pip install transformers
!pip install note_seq



## Load the tokenizer and the model from 🤗 Hub.

In [43]:
import os
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"

In [44]:
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ai-guru/lakhclean_mmmtrack_4bars_d-2048")
model = AutoModelForCausalLM.from_pretrained("ai-guru/lakhclean_mmmtrack_4bars_d-2048")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


## Convert the generated tokens to music that you can listen to.

This uses note_seq, Google Magenta's library for representing music as note sequences, a MIDI-like format. It can also load and save MIDI files. Check the Magenta note-seq repo if you want to learn more.


In [45]:
import note_seq

NOTE_LENGTH_16TH_120BPM = 0.25 * 60 / 120
BAR_LENGTH_120BPM = 4.0 * 60 / 120

def token_sequence_to_note_sequence(token_sequence, use_program=True, use_drums=True, instrument_mapper=None, only_piano=False):

    if isinstance(token_sequence, str):
        token_sequence = token_sequence.split()

    note_sequence = empty_note_sequence()

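    # Render all notes.
    current_program = 1
    current_is_drum = False
    current_instrument = 0
    track_count = 0
    # Initialize the timing state up front so that a malformed sequence
    # (e.g. a NOTE_ON before any BAR_START) does not raise a NameError.
    current_bar_index = 0
    current_time = 0.0
    current_notes = {}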
    for token_index, token in enumerate(token_sequence):

        if token == "PIECE_START":
            pass
        elif token == "PIECE_END":
            print("The end.")
            break
        elif token == "TRACK_START":
            current_bar_index = 0
            track_count += 1
            pass
        elif token == "TRACK_END":
            pass
        elif token == "KEYS_START":
            pass
        elif token == "KEYS_END":
            pass
        elif token.startswith("KEY="):
            pass
        elif token.startswith("INST"):
            instrument = token.split("=")[-1]
            if instrument != "DRUMS" and use_program:
                if instrument_mapper is not None:
                    if instrument in instrument_mapper:
                        instrument = instrument_mapper[instrument]
                current_program = int(instrument)
                current_instrument = track_count
                current_is_drum = False
            if instrument == "DRUMS" and use_drums:
                current_instrument = 0
                current_program = 0
                current_is_drum = True
        elif token == "BAR_START":
            current_time = current_bar_index * BAR_LENGTH_120BPM
            current_notes = {}
        elif token == "BAR_END":
            current_bar_index += 1
            pass
        elif token.startswith("NOTE_ON"):
            pitch = int(token.split("=")[-1])
            note = note_sequence.notes.add()
            note.start_time = current_time
            note.end_time = current_time + 4 * NOTE_LENGTH_16TH_120BPM
            note.pitch = pitch
            note.instrument = current_instrument
            note.program = current_program
            note.velocity = 80
            note.is_drum = current_is_drum
            current_notes[pitch] = note
        elif token.startswith("NOTE_OFF"):
            pitch = int(token.split("=")[-1])
            if pitch in current_notes:
                note = current_notes[pitch]
                note.end_time = current_time
        elif token.startswith("TIME_DELTA"):
            delta = float(token.split("=")[-1]) * NOTE_LENGTH_16TH_120BPM
            current_time += delta
        elif token.startswith("DENSITY="):
            pass
        elif token == "[PAD]":
            pass
        else:
            #print(f"Ignored token {token}.")
            pass

    # Make the instruments right.
    instruments_drums = []
    for note in note_sequence.notes:
        pair = [note.program, note.is_drum]
        if pair not in instruments_drums:
            instruments_drums += [pair]
        note.instrument = instruments_drums.index(pair)

    if only_piano:
        for note in note_sequence.notes:
            if not note.is_drum:
                note.instrument = 0
                note.program = 0

    return note_sequence

def empty_note_sequence(qpm=120.0, total_time=0.0):
    note_sequence = note_seq.protobuf.music_pb2.NoteSequence()
    note_sequence.tempos.add().qpm = qpm
    note_sequence.ticks_per_quarter = note_seq.constants.STANDARD_PPQ
    note_sequence.total_time = total_time
    return note_sequence
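
The timing arithmetic in the converter can be checked in isolation: at 120 BPM a 16th note lasts 0.125 s and a 4/4 bar lasts 2 s, so `TIME_DELTA=2` advances the clock by 0.25 s. The following is a minimal sketch and not part of the original notebook. `token_timings` is a hypothetical helper that mirrors only the timing logic of `token_sequence_to_note_sequence`, ignores instruments, and returns plain tuples instead of a `NoteSequence`, so it runs without note_seq.

```python
# Standalone sketch; the constants match the ones defined in the notebook.
NOTE_LENGTH_16TH_120BPM = 0.25 * 60 / 120  # 0.125 s per 16th note
BAR_LENGTH_120BPM = 4.0 * 60 / 120         # 2.0 s per 4/4 bar


def token_timings(token_sequence):
    """Return (pitch, start_time, end_time) tuples (hypothetical helper)."""
    if isinstance(token_sequence, str):
        token_sequence = token_sequence.split()

    events = []        # one [pitch, start, end] list per NOTE_ON
    open_notes = {}    # pitch -> most recent event in the current bar
    bar_index = 0
    current_time = 0.0

    for token in token_sequence:
        if token == "TRACK_START":
            bar_index = 0  # every track starts again at bar 0
        elif token == "BAR_START":
            current_time = bar_index * BAR_LENGTH_120BPM
            open_notes = {}
        elif token == "BAR_END":
            bar_index += 1
        elif token.startswith("TIME_DELTA="):
            steps = float(token.split("=")[-1])
            current_time += steps * NOTE_LENGTH_16TH_120BPM
        elif token.startswith("NOTE_ON="):
            pitch = int(token.split("=")[-1])
            # Default to a quarter note until a matching NOTE_OFF arrives.
            default_end = current_time + 4 * NOTE_LENGTH_16TH_120BPM
            event = [pitch, current_time, default_end]
            events.append(event)
            open_notes[pitch] = event
        elif token.startswith("NOTE_OFF="):
            pitch = int(token.split("=")[-1])
            if pitch in open_notes:
                open_notes[pitch][2] = current_time

    return [tuple(event) for event in events]


sample = (
    "PIECE_START TRACK_START INST=0 DENSITY=9 BAR_START "
    "NOTE_ON=53 NOTE_ON=41 TIME_DELTA=2 NOTE_OFF=53 NOTE_OFF=41 "
    "TIME_DELTA=2 NOTE_ON=56 TIME_DELTA=1 NOTE_OFF=56 BAR_END"
)
print(token_timings(sample))
# → [(53, 0.0, 0.25), (41, 0.0, 0.25), (56, 0.5, 0.625)]
```

These values agree with the `start_time` and `end_time` fields printed for the first notes of the generated sequence at the end of this notebook.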

## Generate music

This will generate one track of music and render it.

In [46]:
import time

generated_sequence = "PIECE_START"

run = 0
timestr = time.strftime("%Y%m%d-%H%M%S")

Note: Run the following cell multiple times to generate more tracks.
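
Each run of the generation cell feeds the whole `generated_sequence` back in as the prompt and stops at the next `TRACK_END`, so the sequence grows by one track per run. To check how many tracks have accumulated and which instruments they use, something like the sketch below could help. It is not part of the original notebook: `split_tracks` and `track_instruments` are hypothetical helpers that assume only the token names visible in the converter above (`TRACK_START`, `TRACK_END`, `INST=`).

```python
def split_tracks(token_sequence):
    """Group tokens into one list per TRACK_START ... TRACK_END span."""
    if isinstance(token_sequence, str):
        token_sequence = token_sequence.split()

    tracks = []
    current = None  # tokens of the track being read, or None between tracks
    for token in token_sequence:
        if token == "TRACK_START":
            current = [token]
        elif current is not None:
            current.append(token)
            if token == "TRACK_END":
                tracks.append(current)
                current = None

    # Keep a track that was cut off before TRACK_END
    # (for example, when generation hit max_length).
    if current is not None:
        tracks.append(current)
    return tracks


def track_instruments(token_sequence):
    """Return the INST= value of every track, or None if a track has none."""
    instruments = []
    for track in split_tracks(token_sequence):
        value = None
        for token in track:
            if token.startswith("INST="):
                value = token.split("=")[-1]
                break
        instruments.append(value)
    return instruments


example = (
    "PIECE_START TRACK_START INST=0 DENSITY=9 BAR_START NOTE_ON=53 "
    "TIME_DELTA=2 NOTE_OFF=53 BAR_END TRACK_END "
    "TRACK_START INST=DRUMS DENSITY=3 BAR_START BAR_END"
)
print(len(split_tracks(example)))   # → 2
print(track_instruments(example))   # → ['0', 'DRUMS']
```

Calling `track_instruments(generated_sequence)` after each run would list the instruments generated so far.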

In [47]:
from google.colab import files
import time
run += 1
fname = f"mlmidi-{timestr}_{run}.mid"

# Encode the conditioning tokens.
input_ids = tokenizer.encode(generated_sequence, return_tensors="pt")
#print(input_ids)

# Generate more tokens.
eos_token_id = tokenizer.encode("TRACK_END")[0]
temperature = 1.0
generated_ids = model.generate(
    input_ids,
    max_length=2048,
    do_sample=True,
    temperature=temperature,
    eos_token_id=eos_token_id,
)
generated_sequence = tokenizer.decode(generated_ids[0])
print(generated_sequence)

note_sequence = token_sequence_to_note_sequence(generated_sequence)

synth = note_seq.fluidsynth
note_seq.plot_sequence(note_sequence)
note_seq.play_sequence(note_sequence, synth)
note_seq.sequence_proto_to_midi_file(note_sequence, fname)
files.download(fname)

PIECE_START TRACK_START INST=0 DENSITY=9 BAR_START NOTE_ON=53 NOTE_ON=41 TIME_DELTA=2 NOTE_OFF=53 NOTE_OFF=41 TIME_DELTA=2 NOTE_ON=56 NOTE_ON=60 NOTE_ON=51 NOTE_ON=53 TIME_DELTA=1 NOTE_OFF=56 NOTE_OFF=60 NOTE_OFF=51 NOTE_OFF=53 TIME_DELTA=1 NOTE_ON=56 NOTE_ON=53 NOTE_ON=60 NOTE_ON=51 TIME_DELTA=2 NOTE_OFF=56 NOTE_OFF=53 NOTE_OFF=60 NOTE_OFF=51 NOTE_ON=46 NOTE_ON=34 TIME_DELTA=1 NOTE_OFF=46 NOTE_OFF=34 TIME_DELTA=1 NOTE_ON=58 NOTE_ON=48 NOTE_ON=55 NOTE_ON=52 NOTE_ON=62 NOTE_ON=36 TIME_DELTA=2 NOTE_OFF=55 NOTE_OFF=52 TIME_DELTA=1 NOTE_OFF=58 NOTE_OFF=48 NOTE_OFF=62 TIME_DELTA=1 NOTE_OFF=36 NOTE_ON=64 NOTE_ON=67 NOTE_ON=58 NOTE_ON=60 TIME_DELTA=2 NOTE_OFF=64 NOTE_OFF=67 NOTE_OFF=58 NOTE_OFF=60 BAR_END BAR_START NOTE_ON=36 NOTE_ON=58 NOTE_ON=60 TIME_DELTA=1 NOTE_OFF=36 NOTE_OFF=58 NOTE_OFF=60 TIME_DELTA=1 NOTE_ON=72 NOTE_ON=67 NOTE_ON=64 NOTE_ON=60 NOTE_ON=58 TIME_DELTA=1 NOTE_OFF=72 NOTE_OFF=67 NOTE_OFF=64 NOTE_OFF=60 NOTE_OFF=58 TIME_DELTA=1 NOTE_ON=67 NOTE_ON=58 NOTE_ON=64 NOTE_ON=60 TI

In [48]:
print(note_sequence)

ticks_per_quarter: 220
tempos {
  qpm: 120.0
}
notes {
  pitch: 53
  velocity: 80
  end_time: 0.25
}
notes {
  pitch: 41
  velocity: 80
  end_time: 0.25
}
notes {
  pitch: 56
  velocity: 80
  start_time: 0.5
  end_time: 0.625
}
notes {
  pitch: 60
  velocity: 80
  start_time: 0.5
  end_time: 0.625
}
notes {
  pitch: 51
  velocity: 80
  start_time: 0.5
  end_time: 0.625
}
notes {
  pitch: 53
  velocity: 80
  start_time: 0.5
  end_time: 0.625
}
notes {
  pitch: 56
  velocity: 80
  start_time: 0.75
  end_time: 1.0
}
notes {
  pitch: 53
  velocity: 80
  start_time: 0.75
  end_time: 1.0
}
notes {
  pitch: 60
  velocity: 80
  start_time: 0.75
  end_time: 1.0
}
notes {
  pitch: 51
  velocity: 80
  start_time: 0.75
  end_time: 1.0
}
notes {
  pitch: 46
  velocity: 80
  start_time: 1.0
  end_time: 1.125
}
notes {
  pitch: 34
  velocity: 80
  start_time: 1.0
  end_time: 1.125
}
notes {
  pitch: 58
  velocity: 80
  start_time: 1.25
  end_time: 1.625
}
notes {
  pitch: 48
  velocity: 80
  start_ti