**This notebook is based on the Hugging Face course - [Chapter 9: Building and sharing demos.](https://huggingface.co/course/chapter9/1?fw=tf)**

You trained the **GPT-2 model for generating guitar music** in the [last notebook](https://colab.research.google.com/drive/13fL8J0f8X7dhyF_o8CDA7jle1sDLnkcQ?usp=sharing). By now, you should have a model on the Hugging Face Hub with a nice widget for generating music.

Still, this is not a very useful way to present the model. It shows the input and output as text and not as music. In this notebook, you'll create a gradio demo with a better interface than the model card, so the users of your model can generate and listen to guitar music with the model.

Let's start by installing gradio and all the necessary libraries.

In [1]:
!apt-get install -qq libfluidsynth1
!pip install gradio
!pip install transformers[sentencepiece]
!pip install note-seq
!pip install -U protobuf==3.20.1
# Latest pyfluidsynth doesn't work correctly with pretty_midi synthesis.
!pip install pyfluidsynth==1.3.0

Selecting previously unselected package libfluidsynth1:amd64.
(Reading database ... 155569 files and directories currently installed.)
Preparing to unpack .../libfluidsynth1_1.1.9-1_amd64.deb ...
Unpacking libfluidsynth1:amd64 (1.1.9-1) ...
Setting up libfluidsynth1:amd64 (1.1.9-1) ...
Processing triggers for libc-bin (2.27-3ubuntu1.5) ...
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting gradio
  Downloading gradio-3.3.1-py3-none-any.whl (5.3 MB)
Collecting orjson
  Downloading orjson-3.8.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (270 kB)
Collecting analytics-python
  Downloading analytics_python-1.4.0-py2.py3-none-any.whl (15 kB)
Collecting pydub
  Downloading pydub-0.25.1-py2.py3-none-any.whl (32 kB)
Collecting paramiko
  Downloading paramiko-2.11.0-py2.py3-none-any.whl (212 kB)

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting pyfluidsynth==1.3.0
  Downloading pyFluidSynth-1.3.0-py3-none-any.whl (18 kB)
Installing collected packages: pyfluidsynth
Successfully installed pyfluidsynth-1.3.0


## 5.1 Exploring the trained Mutopia Model

Let's now check the Mutopia model. Here, you'll wrap the model in a text generation pipeline.

In [2]:
from transformers import AutoTokenizer, TFGPT2LMHeadModel
from transformers import pipeline

mutopia_model = TFGPT2LMHeadModel.from_pretrained("juancopi81/mutopia_guitar_mmm")
mutopia_tokenizer = AutoTokenizer.from_pretrained("juancopi81/mutopia_guitar_mmm")
pipe = pipeline(
    "text-generation", model=mutopia_model, tokenizer=mutopia_tokenizer, device=0
)

The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.


Moving 0 files to the new cache system


0it [00:00, ?it/s]

Downloading:   0%|          | 0.00/894 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/345M [00:00<?, ?B/s]

All model checkpoint layers were used when initializing TFGPT2LMHeadModel.

All the layers of TFGPT2LMHeadModel were initialized from the model checkpoint at juancopi81/mutopia_guitar_mmm.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.


Downloading:   0%|          | 0.00/380 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/21.6k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/99.0 [00:00<?, ?B/s]

First, let's see the pipeline in action. For this example, you'll set the [time signature](https://en.wikipedia.org/wiki/Time_signature) to 4/4, [the BPM](https://en.wikipedia.org/wiki/Tempo) to 90, the density of the notes to 2 (more density equals more music events), and the first note to G2 (43 in MIDI number). These are the controllers that the users of the model will be able to change in the GUI.

In [3]:
seed = "PIECE_START TIME_SIGNATURE=4_4 BPM=90 TRACK_START INST=0 DENSITY=2 BAR_START NOTE_ON=43"
piece = pipe(seed, max_length=100)[0]["generated_text"]
print(piece)

Setting `pad_token_id` to 0 (first `eos_token_id`) to generate sequence


PIECE_START TIME_SIGNATURE=4_4 BPM=90 TRACK_START INST=0 DENSITY=2 BAR_START NOTE_ON=43 NOTE_ON=55 NOTE_ON=52 NOTE_ON=48 TIME_DELTA=4.0 NOTE_OFF=43 NOTE_OFF=55 NOTE_OFF=52 NOTE_OFF=48 NOTE_ON=50 TIME_DELTA=2.0 NOTE_OFF=50 NOTE_ON=48 TIME_DELTA=2.0 NOTE_OFF=48 NOTE_ON=48 TIME_DELTA=2.0 NOTE_OFF=48 NOTE_ON=43 TIME_DELTA=2.0 NOTE_OFF=43 NOTE_ON=36 TIME_DELTA=4.0 NOTE_OFF=36 BAR_END BAR_START NOTE_ON=36 TIME_DELTA=4.0 NOTE_OFF=36 NOTE_ON=52 NOTE_ON=55 NOTE_ON=48 TIME_DELTA=4.0 NOTE_OFF=52 NOTE_OFF=55 NOTE_OFF=48 NOTE_ON=50 TIME_DELTA=2.0 NOTE_OFF=50 NOTE_ON=48 TIME_DELTA=2.0 NOTE_OFF=48 NOTE_ON=43 TIME_DELTA=2.0 NOTE_OFF=43 BAR_END BAR_START NOTE_ON=48 NOTE_ON=52 NOTE_ON=55 NOTE_ON=38 TIME_DELTA=4.0 NOTE_OFF=48 NOTE_OFF=52 NOTE_OFF=55 NOTE_OFF=38 NOTE_ON=36 TIME_DELTA=4.0 NOTE_OFF=36 NOTE_ON=50 TIME_DELTA=2.0 NOTE_OFF=50 NOTE_ON=48 TIME_DELTA=2.0 NOTE_OFF=48 NOTE_ON=43 TIME_DELTA=2.0 NOTE_OFF=43 BAR_END BAR_START NOTE_ON=45 TIME_DELTA=2.0 NOTE_OFF=45 NOTE_ON=48 TIME_DELTA=2.0 NOTE_OFF=48 N
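
The output above is one long line, which makes it hard to see how the piece is organized. As a minimal sketch, the tokens can be grouped by their `BAR_START` / `BAR_END` markers. The helper name `split_into_bars` is hypothetical (it is not part of the notebook or of any library), and it assumes the token format shown in the output above.

```python
def split_into_bars(piece: str):
    """Group the tokens of a generated piece by bar.

    Returns (outside, bars): `outside` holds every token that is not inside
    a bar (PIECE_START, BPM=..., ...), and `bars` is a list of token lists.
    A bar that never reaches BAR_END (a truncated generation) is still
    returned as the last, incomplete bar.
    """
    outside, bars, current = [], [], None
    for token in piece.split():
        if token == "BAR_START":
            current = []
        elif token == "BAR_END":
            if current is not None:
                bars.append(current)
            current = None
        elif current is None:
            outside.append(token)
        else:
            current.append(token)
    if current is not None:  # generation stopped in the middle of a bar
        bars.append(current)
    return outside, bars


sample = (
    "PIECE_START TIME_SIGNATURE=4_4 BPM=90 TRACK_START INST=0 DENSITY=2 "
    "BAR_START NOTE_ON=43 TIME_DELTA=4.0 NOTE_OFF=43 BAR_END "
    "BAR_START NOTE_ON=48 TIME_DELTA=2.0"
)
outside, bars = split_into_bars(sample)
print(outside)
for index, bar in enumerate(bars):
    print(index, bar)
```

Since `max_length` cuts the generation at a fixed number of tokens, the last bar is often incomplete, which is why the sketch keeps it instead of discarding it.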

## 5.2 Creating a first gradio demo 

You'll now define a function, using the Mutopia model, that takes text as input and returns a sequence of tokens. These tokens are the music piece generated by the model. For this notebook, you'll allow a prediction of a maximum length of 250 tokens.

In [4]:
def generate_guitar_piece(seed: str) -> str:
  # Generate a maximum of 250 tokens
  piece = pipe(seed, max_length=250)[0]["generated_text"]
  return piece

You can create and launch a `Gradio Interface` using the function. To do this, you need to specify:


*   The `function` that your interface will use (`generate_guitar_piece` in this case).
*   The `inputs` of your interface. In this case, the input is `text` that includes the time signature, the BPM, the density, and the first note of the piece that the model will generate. This text needs to follow the same format as the `seed` above. As you can see, entering the controllers this way is not very convenient for the user. Don't worry! We'll change this in a moment.
*    The `outputs` of your interface. For now, your outputs will be just `text` (the tokens generated by the model). Later, you will change this to audio.

You can then call the `launch()` method of the interface. To show errors in Colab, set `debug=True`. For now, I'll leave it as `False`, but feel free to change it to debug the code:

In [6]:
import gradio as gr

iface = gr.Interface(
    fn=generate_guitar_piece,
    inputs="text",
    outputs="text"
)

iface.launch(debug=False)

Colab notebook detected. To show errors in colab notebook, set `debug=True` in `launch()`
Running on public URL: https://11383.gradio.app

This share link expires in 72 hours. For free permanent hosting, check out Spaces: https://huggingface.co/spaces


(<gradio.routes.App at 0x7faf8b23a5d0>,
 'http://127.0.0.1:7861/',
 'https://11383.gradio.app')

## 5.3 Improving the inputs user-experience

You now have a nice gradio demo using the Mutopia Guitar Model. Still, it is not very convenient for the user to enter the input as tokens and receive a non-musical output. You'll change this! First, let's start by creating a more convenient way to input the controllers that will determine the generation of the guitar piece.

To do this, you will use some excellent components provided by gradio. Specifically, your user interface will use the `Dropdown` and `Slider` components. With the user's inputs, you'll create the `seed` text behind the scenes using the function `create_seed`:

In [7]:
import gradio as gr

# Feel free to change this, I am using only three notes here because the model 
# works better this way.
notes = ["G2", "C3", "E3"]
notes_to_midi = {"G2": 43, "C3": 48, "E3": 52}
time_signatures = ["4/4", "3/4", "2/4", "6/8"]
time_signature_to_tokens = {"4/4": "4_4", "3/4": "3_4", "2/4": "2_4", "6/8": "6_8"}

# Helper function to create the string seed
def create_seed(time_signature: str,
                note: str,
                bpm: int,
                density: int) -> str:
  
  seed = (f"PIECE_START TIME_SIGNATURE={time_signature_to_tokens[time_signature]} "
          f"BPM={bpm} TRACK_START INST=0 DENSITY={density} "
          f"BAR_START NOTE_ON={notes_to_midi[note]} ")
  return seed
          

def generate_guitar_piece(time_signature: str,
                          note: str,
                          bpm: int,
                          density: int) -> str:
  seed = create_seed(time_signature, note, bpm, density)
  piece = pipe(seed, max_length=250)[0]["generated_text"]
  return piece

iface = gr.Interface(
  fn=generate_guitar_piece,
  inputs = [
    gr.Dropdown(time_signatures, value="4/4"), 
    gr.Dropdown(notes, value="G2"), 
    gr.Slider(minimum=60, maximum=140, step=5, value=80), 
    gr.Slider(minimum=0, maximum=5, step=1, value=2)
  ],
  outputs="text"
)

iface.launch(debug=False)

Colab notebook detected. To show errors in colab notebook, set `debug=True` in `launch()`
Running on public URL: https://18955.gradio.app

This share link expires in 72 hours. For free permanent hosting, check out Spaces: https://huggingface.co/spaces


(<gradio.routes.App at 0x7faf8b1b9d10>,
 'http://127.0.0.1:7862/',
 'https://18955.gradio.app')
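
If you want to offer more first notes than the three in `notes_to_midi`, you don't have to look up each MIDI number by hand. The following is a minimal sketch of the conversion, assuming the convention where C4 is MIDI note 60 (the same convention that gives G2 = 43 above). The names `PITCH_CLASSES` and `note_name_to_midi` are hypothetical helpers, not part of the notebook.

```python
# Semitone offset of each natural note name within an octave.
PITCH_CLASSES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_name_to_midi(name: str) -> int:
    """Convert a name such as "G2" or "F#3" to a MIDI number (C4 = 60)."""
    letter, rest = name[0].upper(), name[1:]
    offset = PITCH_CLASSES[letter]
    if rest.startswith("#"):    # sharp: one semitone up
        offset, rest = offset + 1, rest[1:]
    elif rest.startswith("b"):  # flat: one semitone down
        offset, rest = offset - 1, rest[1:]
    octave = int(rest)
    return 12 * (octave + 1) + offset


notes = ["G2", "C3", "E3"]
notes_to_midi = {name: note_name_to_midi(name) for name in notes}
print(notes_to_midi)  # {'G2': 43, 'C3': 48, 'E3': 52}
```

Keep in mind the comment in the cell above: the model was found to work better with these three starting notes, so other notes may give weaker results.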

## 5.4 Creating the outputs of the model

Getting tokens as output is not what you want for your user. So in this part, you'll change the output to be actual music!

### Helper functions

First, you'll add some helper functions to convert the tokens into audio. These are mainly taken from [Dr. Tristan Behrens' notebook](https://huggingface.co/TristanBehrens/js-fakes-4bars/blob/main/colab_jsfakes_generation.ipynb) and the adaptation made by [Omar Sanseviero](https://www.linkedin.com/in/omarsanseviero/) in [this space](https://huggingface.co/spaces/osanseviero/gpt2_for_music).

I further adapted these helper functions to meet the needs of the outputs presented to the users.

In [8]:
import note_seq
import copy

# Value of BPM for 1 second
BPM_1_SECOND = 60

# Variables to change based on the time signature
numerator = ""
denominator = ""

def token_sequence_to_note_sequence(token_sequence, 
                                    use_program=True, 
                                    use_drums=False, 
                                    instrument_mapper=None, 
                                    only_guitar=True):

    if isinstance(token_sequence, str):
        token_sequence = token_sequence.split()

    note_sequence = empty_note_sequence()

    # Render all notes.
    current_program = 1
    current_is_drum = False
    current_instrument = 0
    track_count = 0
    for token_index, token in enumerate(token_sequence):

        if token == "PIECE_START":
            pass
        elif token == "PIECE_END":
            print("The end.")
            break
        elif token.startswith("TIME_SIGNATURE="):
            time_signature_str = token.split("=")[-1]
            numerator = int(time_signature_str.split("_")[0])
            denominator = int(time_signature_str.split("_")[-1])
            time_signature = note_sequence.time_signatures.add()
            time_signature.numerator = numerator
            time_signature.denominator = denominator
        elif token.startswith("BPM="):
            bpm_str = token.split("=")[-1]
            bpm = int(bpm_str)
            note_sequence.tempos[0].qpm = bpm
            pulse_duration, bar_duration = duration_in_sec(
                bpm, numerator, denominator
            )
        elif token == "TRACK_START":
            current_bar_index = 0
            track_count += 1
            pass
        elif token == "TRACK_END":
            pass
        elif token == "KEYS_START":
            pass
        elif token == "KEYS_END":
            pass
        elif token.startswith("KEY="):
            pass
        elif token.startswith("INST"):
            instrument = token.split("=")[-1]
            if instrument != "DRUMS" and use_program:
                if instrument_mapper is not None:
                    if instrument in instrument_mapper:
                        instrument = instrument_mapper[instrument]
                current_program = int(instrument)
                current_instrument = track_count
                current_is_drum = False
            if instrument == "DRUMS" and use_drums:
                current_instrument = 0
                current_program = 0
                current_is_drum = True
        elif token == "BAR_START":
            current_time = (current_bar_index * bar_duration)
            current_notes = {}
        elif token == "BAR_END":
            current_bar_index += 1
            pass
        elif token.startswith("NOTE_ON"):
            pitch = int(token.split("=")[-1])
            note = note_sequence.notes.add()
            note.start_time = current_time
            note.end_time = current_time + denominator * pulse_duration
            note.pitch = pitch
            note.instrument = current_instrument
            note.program = current_program
            note.velocity = 80
            note.is_drum = current_is_drum
            current_notes[pitch] = note
        elif token.startswith("NOTE_OFF"):
            pitch = int(token.split("=")[-1])
            if pitch in current_notes:
                note = current_notes[pitch]
                note.end_time = current_time
        elif token.startswith("TIME_DELTA"):
            delta = float(token.split("=")[-1]) * (0.25) * pulse_duration
            current_time += delta
        elif token.startswith("DENSITY="):
            pass
        elif token == "[PAD]":
            pass
        else:
            #print(f"Ignored token {token}.")
            pass

    # Make the instruments right.
    instruments_drums = []
    for note in note_sequence.notes:
        pair = [note.program, note.is_drum]
        if pair not in instruments_drums:
            instruments_drums += [pair]
        note.instrument = instruments_drums.index(pair)

    if only_guitar:
        for note in note_sequence.notes:
            if not note.is_drum:
                # General MIDI program 24 (0-indexed) is Acoustic Guitar (nylon)
                note.instrument = 24
                note.program = 24

    return note_sequence

# Calculate the duration in seconds of pulse and bar
def duration_in_sec(bpm, numerator, denominator):
    pulse_duration = BPM_1_SECOND / bpm
    number_of_quarters_per_bar = (4 / denominator) * numerator
    bar_duration = pulse_duration * number_of_quarters_per_bar
    return pulse_duration, bar_duration

def empty_note_sequence(qpm=120, total_time=0.0):
    note_sequence = note_seq.protobuf.music_pb2.NoteSequence()
    note_sequence.tempos.add().qpm = qpm
    #note_sequence.ticks_per_quarter = note_seq.constants.STANDARD_PPQ
    note_sequence.total_time = total_time
    return note_sequence
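
To see the timing logic of these helpers without installing `note_seq`, here is a simplified, standard-library-only sketch of the same idea. It is not the notebook's implementation: `decode_notes` is a hypothetical name, it returns plain tuples instead of a `NoteSequence`, it ignores instruments and drums, and it drops notes that never receive a `NOTE_OFF` (the helper above keeps them with a default length). Like the helper above, it treats one `TIME_DELTA` unit as a quarter of a beat.

```python
def decode_notes(tokens, default_bpm=120):
    """Return a list of (pitch, start_sec, end_sec) tuples for a token string."""
    if isinstance(tokens, str):
        tokens = tokens.split()
    numerator, denominator = 4, 4   # assume 4/4 until a token says otherwise
    pulse = 60 / default_bpm        # seconds per quarter note
    bar_duration = pulse * numerator * 4 / denominator
    bar_index, now = 0, 0.0
    open_notes, finished = {}, []
    for token in tokens:
        name, _, value = token.partition("=")
        if name == "TIME_SIGNATURE":
            numerator, denominator = (int(part) for part in value.split("_"))
            bar_duration = pulse * numerator * 4 / denominator
        elif name == "BPM":
            pulse = 60 / int(value)
            bar_duration = pulse * numerator * 4 / denominator
        elif name == "BAR_START":
            # Every bar starts on the grid, whatever the deltas summed to.
            now = bar_index * bar_duration
            open_notes = {}
        elif name == "BAR_END":
            bar_index += 1
        elif name == "NOTE_ON":
            open_notes[int(value)] = now
        elif name == "NOTE_OFF" and int(value) in open_notes:
            pitch = int(value)
            finished.append((pitch, open_notes.pop(pitch), now))
        elif name == "TIME_DELTA":
            now += float(value) * 0.25 * pulse
    return finished


tokens = (
    "PIECE_START TIME_SIGNATURE=4_4 BPM=120 TRACK_START INST=0 DENSITY=2 "
    "BAR_START NOTE_ON=43 TIME_DELTA=4.0 NOTE_OFF=43 BAR_END "
    "BAR_START NOTE_ON=48 TIME_DELTA=2.0 NOTE_OFF=48 BAR_END"
)
print(decode_notes(tokens))  # [(43, 0.0, 0.5), (48, 2.0, 2.25)]
```

At 120 BPM a quarter note lasts 0.5 seconds, so a 4/4 bar lasts 2 seconds. The second note therefore starts at 2.0 seconds even though the first bar only contained 0.5 seconds of events.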

Let's now use these helper functions in our interface. After generating the tokens, you'll convert them to a [magenta note-sequence](https://github.com/magenta/note-seq). With this `note-sequence object`, you'll create audio that users can listen to.

Notice that Gradio's `Audio` output component can automatically render a tuple with a `sample rate` and `NumPy array` of data as a playable audio file:

In [9]:
import gradio as gr
import numpy as np

SAMPLE_RATE=44100

# Feel free to change this, I am using only three notes here because the model 
# works better this way.
notes = ["G2", "C3", "E3"]
notes_to_midi = {"G2": 43, "C3": 48, "E3": 52}
time_signatures = ["4/4", "3/4", "2/4", "6/8"]
time_signature_to_tokens = {"4/4": "4_4", "3/4": "3_4", "2/4": "2_4", "6/8": "6_8"}

# Helper function to create the string seed
def create_seed(time_signature: str,
                note: str,
                bpm: int,
                density: int) -> str:
  
  seed = (f"PIECE_START TIME_SIGNATURE={time_signature_to_tokens[time_signature]} "
          f"BPM={bpm} TRACK_START INST=0 DENSITY={density} "
          f"BAR_START NOTE_ON={notes_to_midi[note]} ")
  return seed
          

def generate_guitar_piece(time_signature: str,
                          note: str,
                          bpm: int,
                          density: int) -> tuple:
  seed = create_seed(time_signature, note, bpm, density)
  piece = pipe(seed, max_length=250)[0]["generated_text"]
  
  # Convert text of notes to audio
  note_sequence = token_sequence_to_note_sequence(piece)
  synth = note_seq.midi_synth.fluidsynth
  array_of_floats = synth(note_sequence, sample_rate=SAMPLE_RATE)
  int16_data = note_seq.audio_io.float_samples_to_int16(array_of_floats)
  return SAMPLE_RATE, int16_data


iface = gr.Interface(
  fn=generate_guitar_piece,
  inputs = [
    gr.Dropdown(time_signatures, value="4/4"), 
    gr.Dropdown(notes, value="G2"), 
    gr.Slider(minimum=60, maximum=140, step=10, value=80), 
    gr.Slider(minimum=0, maximum=5, step=1, value=2)
  ],
  outputs="audio"
)

iface.launch(debug=False)

Colab notebook detected. To show errors in colab notebook, set `debug=True` in `launch()`
Running on public URL: https://12528.gradio.app

This share link expires in 72 hours. For free permanent hosting, check out Spaces: https://huggingface.co/spaces


(<gradio.routes.App at 0x7faf7b8b7810>,
 'http://127.0.0.1:7863/',
 'https://12528.gradio.app')
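
To make the shape of the returned value more concrete, here is a small, dependency-free sketch that builds a `(sample rate, 16-bit samples)` pair for a plain sine tone. It is only an illustration under a few assumptions: `sine_as_int16` is a hypothetical helper, it does not reproduce what `note_seq.audio_io.float_samples_to_int16` does internally, and it returns a standard-library `array` that you would still need to convert to a NumPy array (for example with `np.asarray`) before handing it to Gradio.

```python
import math
from array import array

SAMPLE_RATE = 44100


def sine_as_int16(frequency_hz: float, seconds: float, amplitude: float = 0.5):
    """Return (sample_rate, samples) where samples are signed 16-bit integers."""
    n_samples = int(SAMPLE_RATE * seconds)
    samples = array("h")  # "h" = signed short (16 bits on common platforms)
    for i in range(n_samples):
        value = amplitude * math.sin(2 * math.pi * frequency_hz * i / SAMPLE_RATE)
        # Clip to [-1, 1], then scale the float to the int16 range.
        value = max(-1.0, min(1.0, value))
        samples.append(int(value * 32767))
    return SAMPLE_RATE, samples


# 98 Hz is roughly G2, the default first note of the demo.
rate, data = sine_as_int16(98.0, seconds=0.5)
print(rate, len(data), min(data), max(data))
```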

## 5.5 Polishing the Gradio demo

To finish your Gradio demo, you can beautify it with some elements that the `Interface` class supports. Here, you'll add a title, a description, and an article to your demo. The description and article can use `text`, `markdown`, or `HTML`. Check the next cell to see how to do this:

In [10]:
import gradio as gr
import numpy as np

SAMPLE_RATE=44100

# Feel free to change this, I am using only three notes here because the model 
# works better this way.
notes = ["G2", "C3", "E3"]
notes_to_midi = {"G2": 43, "C3": 48, "E3": 52}
time_signatures = ["4/4", "3/4", "2/4", "6/8"]
time_signature_to_tokens = {"4/4": "4_4", "3/4": "3_4", "2/4": "2_4", "6/8": "6_8"}

# Content for your demo:
title = "Mutopia Guitar Composer"
# I am adding here an image that I generated using DALL-E
description = """
The bot was trained to compose guitar music using the 
[Mutopia Guitar Dataset](https://huggingface.co/datasets/juancopi81/mutopia_guitar_dataset). 
Change the controllers and receive a new guitar piece!
<center><img src="https://drive.google.com/uc?export=view&id=1F22ofTCeJAHqVag4lJvBZugAE1OyabVA" 
width=200px alt="Robot playing the guitar"></center>
"""
article = """
For a complete tutorial on how to create this demo from scratch, check out this
[GitHub Repo](https://github.com/juancopi81/MMM_Mutopia_Guitar).
"""

# Helper function to create the string seed
def create_seed(time_signature: str,
                note: str,
                bpm: int,
                density: int) -> str:
  
  seed = (f"PIECE_START TIME_SIGNATURE={time_signature_to_tokens[time_signature]} "
          f"BPM={bpm} TRACK_START INST=0 DENSITY={density} "
          f"BAR_START NOTE_ON={notes_to_midi[note]} ")
  return seed
          

def generate_guitar_piece(time_signature: str,
                          note: str,
                          bpm: int,
                          density: int) -> tuple:
  seed = create_seed(time_signature, note, bpm, density)
  piece = pipe(seed, max_length=250)[0]["generated_text"]
  
  # Convert text of notes to audio
  note_sequence = token_sequence_to_note_sequence(piece)
  synth = note_seq.midi_synth.fluidsynth
  array_of_floats = synth(note_sequence, sample_rate=SAMPLE_RATE)
  int16_data = note_seq.audio_io.float_samples_to_int16(array_of_floats)
  return SAMPLE_RATE, int16_data


iface = gr.Interface(
  fn=generate_guitar_piece,
  inputs = [
    gr.Dropdown(time_signatures, value="4/4"), 
    gr.Dropdown(notes, value="G2"), 
    gr.Slider(minimum=60, maximum=140, step=10, value=80), 
    gr.Slider(minimum=0, maximum=5, step=1, value=2)
  ],
  outputs="audio",
  # Here you can add the new content defined above
  title=title,
  description=description,
  article=article
)

iface.launch(debug=False)

Colab notebook detected. To show errors in colab notebook, set `debug=True` in `launch()`
Running on public URL: https://28379.gradio.app

This share link expires in 72 hours. For free permanent hosting, check out Spaces: https://huggingface.co/spaces


(<gradio.routes.App at 0x7faf942d0150>,
 'http://127.0.0.1:7864/',
 'https://28379.gradio.app')

You now have a complete Gradio demo that you can share with others using [Spaces](https://huggingface.co/spaces). You'll see how to do this later!

But before that, let's see how you can have even more control over your demo using Gradio Blocks.

## 5.6 Using Gradio Blocks for your demo

In comparison to `Interface`, `Gradio Blocks` is a low-level API "*that allows you to have full control over the data flows and layout of your application. You can build very complex, multi-step applications using Blocks (as in “building blocks”)*."

To use Blocks, you need to create a `Blocks` object and use it as a context (`with` statement). Inside the `with` statement, you can add components and define your layout.

To better understand blocks, let's create a similar demo as the one you made with `Interface` but this time using `Gradio Blocks`.

You'll create the demo's title, description, and article in the next cell. Then, you'll call the `launch()` method to start your demo.

In [11]:
import gradio as gr

# Content for your demo:
title = "Mutopia Guitar Composer"
# I am adding here an image that I generated using DALL-E
description = """
The bot was trained to compose guitar music using the 
[Mutopia Guitar Dataset](https://huggingface.co/datasets/juancopi81/mutopia_guitar_dataset). 
Change the controllers and receive a new guitar piece!
<center><img src="https://drive.google.com/uc?export=view&id=1F22ofTCeJAHqVag4lJvBZugAE1OyabVA" 
width=200px alt="Robot playing the guitar"></center>
"""
article = """
For a complete tutorial on how to create this demo from scratch, check out this
[GitHub Repo](https://github.com/juancopi81/MMM_Mutopia_Guitar).
"""

# Create a block object
demo = gr.Blocks()

# Use your Block object as a context
with demo:
  gr.Markdown("<h1 style='text-align: center; margin-bottom: 1rem'>" 
              + title 
              + "</h1>")
  gr.Markdown(description)
  gr.Markdown("<center>" 
              + article
              + "</center>")

# Launch your demo
demo.launch()


Colab notebook detected. To show errors in colab notebook, set `debug=True` in `launch()`
Running on public URL: https://24002.gradio.app

This share link expires in 72 hours. For free permanent hosting, check out Spaces: https://huggingface.co/spaces


(<gradio.routes.App at 0x7faf975153d0>,
 'http://127.0.0.1:7865/',
 'https://24002.gradio.app')

This is great! You should start noticing the power and flexibility of Blocks. 

To help you create better layouts, Blocks provides you with different tools. Let's use `gradio.Column()` and `gradio.Row()` to improve the UI of the controllers of the model.

In [12]:
import gradio as gr

# Content for your demo:
title = "Mutopia Guitar Composer"
# I am adding here an image that I generated using DALL-E
description = """
The bot was trained to compose guitar music using the 
[Mutopia Guitar Dataset](https://huggingface.co/datasets/juancopi81/mutopia_guitar_dataset). 
Change the controllers and receive a new guitar piece!
<center><img src="https://drive.google.com/uc?export=view&id=1F22ofTCeJAHqVag4lJvBZugAE1OyabVA" 
width=200px alt="Robot playing the guitar"></center>
"""
article = """
For a complete tutorial on how to create this demo from scratch, check out this
[GitHub Repo](https://github.com/juancopi81/MMM_Mutopia_Guitar).
"""

# Create a block object
demo = gr.Blocks()

# Use your Block object as a context
with demo:
  gr.Markdown("<h1 style='text-align: center; margin-bottom: 1rem'>" 
              + title 
              + "</h1>")
  gr.Markdown(description)

  # UI for the inputs of the model
  gr.Markdown("Select the params for generating your guitar piece.")
  with gr.Row():
    time_signature = gr.Dropdown(time_signatures, value="4/4", label="Time signature") 
    note = gr.Dropdown(notes, value="G2", label="First note")
    bpm = gr.Slider(minimum=60, maximum=140, step=10, value=80, label="Tempo") 
    density = gr.Slider(minimum=0, maximum=5, step=1, value=2, label="Density")
  with gr.Row():
    btn = gr.Button("Compose")

  gr.Markdown("<center>" 
              + article
              + "</center>")

# Launch your demo
demo.launch(debug=False)

Colab notebook detected. To show errors in colab notebook, set `debug=True` in `launch()`
Running on public URL: https://12077.gradio.app

This share link expires in 72 hours. For free permanent hosting, check out Spaces: https://huggingface.co/spaces


(<gradio.routes.App at 0x7faf9459db90>,
 'http://127.0.0.1:7866/',
 'https://12077.gradio.app')

You'll add the audio output to your demo in the next cell. To do this, you need to do two more things: 

*   First, add an audio element to your demo (`audio_output = gr.Audio()`).
*   And then, add the function you've been using when someone clicks your button (`btn.click`), specifying the inputs and outputs of this function.

Let's see this in action:

In [13]:
import gradio as gr

SAMPLE_RATE=44100

# Feel free to change this, I am using only three notes here because the model 
# works better this way.
notes = ["G2", "C3", "E3"]
notes_to_midi = {"G2": 43, "C3": 48, "E3": 52}
time_signatures = ["4/4", "3/4", "2/4", "6/8"]
time_signature_to_tokens = {"4/4": "4_4", "3/4": "3_4", "2/4": "2_4", "6/8": "6_8"}


# Content for your demo:
title = "Mutopia Guitar Composer"
# I am adding here an image that I generated using DALL-E
description = """
The bot was trained to compose guitar music using the 
[Mutopia Guitar Dataset](https://huggingface.co/datasets/juancopi81/mutopia_guitar_dataset). 
Change the controllers and receive a new guitar piece!
<figure>
<center>
<img src="https://drive.google.com/uc?export=view&id=1F22ofTCeJAHqVag4lJvBZugAE1OyabVA" 
width=200px alt="Robot playing the guitar">
<figcaption>Image generated using DALL-E</figcaption>
</center>
</figure>
"""
article = """
For a complete tutorial on how to create this demo from scratch, check out this
[GitHub Repo](https://github.com/juancopi81/MMM_Mutopia_Guitar).
"""

# Helper function to create the string seed
def create_seed(time_signature: str,
                note: str,
                bpm: int,
                density: int) -> str:
  
  seed = (f"PIECE_START TIME_SIGNATURE={time_signature_to_tokens[time_signature]} "
          f"BPM={bpm} TRACK_START INST=0 DENSITY={density} "
          f"BAR_START NOTE_ON={notes_to_midi[note]} ")
  return seed
          

def generate_guitar_piece(time_signature: str,
                          note: str,
                          bpm: int,
                          density: int):
  seed = create_seed(time_signature, note, bpm, density)
  piece = pipe(seed, max_length=250)[0]["generated_text"]
  
  # Convert text of notes to audio
  note_sequence = token_sequence_to_note_sequence(piece)
  synth = note_seq.midi_synth.fluidsynth
  array_of_floats = synth(note_sequence, sample_rate=SAMPLE_RATE)
  int16_data = note_seq.audio_io.float_samples_to_int16(array_of_floats)
  return SAMPLE_RATE, int16_data

# Create a block object
demo = gr.Blocks()

# Use your Block object as a context
with demo:
  gr.Markdown("<h1 style='text-align: center; margin-bottom: 1rem'>" 
              + title 
              + "</h1>")
  gr.Markdown(description)

  # UI for the inputs of the model
  gr.Markdown("Select the generation parameters.")
  with gr.Row():
    time_signature = gr.Dropdown(time_signatures, value="4/4", label="Time signature") 
    note = gr.Dropdown(notes, value="G2", label="First note")
    bpm = gr.Slider(minimum=60, maximum=140, step=10, value=90, label="Tempo")
    density = gr.Slider(minimum=0, maximum=5, step=1, value=2, label="Density")
  with gr.Row():
    btn = gr.Button("Compose")
  audio_output = gr.Audio()
  btn.click(generate_guitar_piece, 
            inputs = [
                time_signature, 
                note, 
                bpm, 
                density
            ],
            outputs=audio_output)
  
  gr.Markdown(article)

# Launch your demo
demo.launch(debug=False)

Colab notebook detected. To show errors in colab notebook, set `debug=True` in `launch()`
Running on public URL: https://27125.gradio.app

This share link expires in 72 hours. For free permanent hosting, check out Spaces: https://huggingface.co/spaces


(<gradio.routes.App at 0x7faf975828d0>,
 'http://127.0.0.1:7867/',
 'https://27125.gradio.app')

### Adding an image output to our demo

It's also nice that users can visualize what they are hearing. A common way to do this is using a Piano Roll. Let's add another output to our demo, an image, so users can listen to and see the model's results.

First, we'll define some helper functions for this.

In [14]:
import collections
import io

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from PIL import Image

def sequence_to_pandas_dataframe(sequence):
  pd_dict = collections.defaultdict(list)
  for note in sequence.notes:
    pd_dict["start_time"].append(note.start_time)
    pd_dict["end_time"].append(note.end_time)
    pd_dict["duration"].append(note.end_time - note.start_time)
    pd_dict["pitch"].append(note.pitch)

  return pd.DataFrame(pd_dict)

def dataframe_to_pianoroll_img(df):
  fig = plt.figure(figsize=(8, 5))
  ax = fig.add_subplot(111)
  ax.scatter(df.start_time, df.pitch, c="white")
  for _, row in df.iterrows():
    ax.add_patch(Rectangle(
        (row["start_time"], row["pitch"]-0.4), 
        row["duration"], 0.4, color="black"
    ))
  plt.xlabel('Seconds', fontsize=18)
  plt.ylabel('MIDI pitch', fontsize=16)
  return fig

def fig2img(fig):
    """Convert a Matplotlib figure to a PIL Image and return it"""
    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    buf.seek(0)
    img = Image.open(buf)
    return img

def create_image_from_note_sequence(sequence):
  df_sequence = sequence_to_pandas_dataframe(sequence)
  fig = dataframe_to_pianoroll_img(df_sequence)
  img = fig2img(fig)
  return img
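
Before plotting with Matplotlib, it can help to check the piano-roll idea on a tiny example. The sketch below draws the same kind of grid as text, with one row per pitch and one column per time step. It uses only the standard library; `ascii_piano_roll` is a hypothetical helper that takes plain `(pitch, start, end)` tuples in seconds rather than a `NoteSequence`.

```python
import math


def ascii_piano_roll(notes, step=0.25):
    """Render (pitch, start_sec, end_sec) tuples as text, highest pitch first."""
    if not notes:
        return ""
    total = max(end for _, _, end in notes)
    columns = max(1, math.ceil(total / step))
    rows = []
    for pitch in sorted({p for p, _, _ in notes}, reverse=True):
        cells = ["."] * columns
        for p, start, end in notes:
            if p != pitch:
                continue
            first = int(start / step)
            # Every note fills at least one cell so short notes stay visible.
            last = max(first + 1, math.ceil(end / step))
            for col in range(first, min(last, columns)):
                cells[col] = "#"
        rows.append(f"{pitch:>3} " + "".join(cells))
    return "\n".join(rows)


notes = [(43, 0.0, 0.5), (48, 0.5, 1.0), (52, 0.5, 1.0)]
print(ascii_piano_roll(notes))
```

With a step of 0.25 seconds, the G2 (43) fills the first two columns and the two upper notes fill the last two, which is the same picture the Matplotlib rectangles draw with continuous time on the x-axis.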

You are now ready to add a piano roll to your demo. To do this, your `generate_guitar_piece` function needs to return both outputs:


*   The `Audio` output: As a tuple of (sample rate, array).
*   The `Image` output: As a PIL image in this case.

Notice also how I added the audio_output and image_output as `gr.Audio()` and `gr.Image()` respectively in a row (`with gr.Row():`):


In [15]:
import gradio as gr

SAMPLE_RATE=44100

# Feel free to change this, I am using only three notes here because the model 
# works better this way.
notes = ["G2", "C3", "E3"]
notes_to_midi = {"G2": 43, "C3": 48, "E3": 52}
time_signatures = ["4/4", "3/4", "2/4", "6/8"]
time_signature_to_tokens = {"4/4": "4_4", "3/4": "3_4", "2/4": "2_4", "6/8": "6_8"}


# Content for your demo:
title = "Mutopia Guitar Composer"
# I am adding here an image that I generated using DALL-E
description = """
The bot was trained to compose guitar music using the 
[Mutopia Guitar Dataset](https://huggingface.co/datasets/juancopi81/mutopia_guitar_dataset). 
Change the controllers and receive a new guitar piece!
<figure>
<center>
<img src="https://drive.google.com/uc?export=view&id=1F22ofTCeJAHqVag4lJvBZugAE1OyabVA" 
width=200px alt="Robot playing the guitar">
<figcaption>Image generated using DALL-E</figcaption>
</center>
</figure>
"""
article = """
For a complete tutorial on how to create this demo from scratch, check out this
[GitHub Repo](https://github.com/juancopi81/MMM_Mutopia_Guitar).
"""

# Helper function to create the string seed
def create_seed(time_signature: str,
                note: str,
                bpm: int,
                density: int) -> str:
  
  seed = (f"PIECE_START TIME_SIGNATURE={time_signature_to_tokens[time_signature]} "
          f"BPM={bpm} TRACK_START INST=0 DENSITY={density} "
          f"BAR_START NOTE_ON={notes_to_midi[note]} ")
  return seed
          

def generate_guitar_piece(time_signature: str,
                          note: str,
                          bpm: int,
                          density: int):
  seed = create_seed(time_signature, note, bpm, density)
  piece = pipe(seed, max_length=250)[0]["generated_text"]
  
  # Convert text of notes to audio
  note_sequence = token_sequence_to_note_sequence(piece)
  synth = note_seq.midi_synth.fluidsynth
  array_of_floats = synth(note_sequence, sample_rate=SAMPLE_RATE)
  int16_data = note_seq.audio_io.float_samples_to_int16(array_of_floats)
  piano_roll = create_image_from_note_sequence(note_sequence)
  return (SAMPLE_RATE, int16_data), piano_roll

# Create a Blocks object
demo = gr.Blocks()

# Use your Blocks object as a context manager
with demo:
  gr.Markdown("<h1 style='text-align: center; margin-bottom: 1rem'>" 
              + title 
              + "</h1>")
  gr.Markdown(description)

  # UI for the inputs of the model
  gr.Markdown("Select the generation parameters.")
  with gr.Row():
    time_signature = gr.Dropdown(time_signatures, value="4/4", label="Time signature") 
    note = gr.Dropdown(notes, value="G2", label="First note")
    bpm = gr.Slider(minimum=60, maximum=140, step=10, value=90, label="Tempo")
    density = gr.Slider(minimum=0, maximum=4, step=1, value=2, label="Density")
  with gr.Row():
    btn = gr.Button("Compose")
  with gr.Row():
    audio_output = gr.Audio()
    image_output = gr.Image()
  btn.click(generate_guitar_piece,
            inputs=[time_signature, note, bpm, density],
            outputs=[audio_output, image_output])
  
  gr.Markdown(article)

# Launch your demo
demo.launch(debug=False)

Colab notebook detected. To show errors in colab notebook, set `debug=True` in `launch()`
Running on public URL: https://29317.gradio.app

This share link expires in 72 hours. For free permanent hosting, check out Spaces: https://huggingface.co/spaces


(<gradio.routes.App at 0x7faf737c1f50>,
 'http://127.0.0.1:7868/',
 'https://29317.gradio.app')
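
If you are curious about what the model actually receives when you press "Compose", the following sketch reproduces `create_seed` on its own and prints the seed for the default values of the controls. The mappings and the function body are copied from the cell above; only the example call is new.

```python
# Mappings copied from the demo cell
notes_to_midi = {"G2": 43, "C3": 48, "E3": 52}
time_signature_to_tokens = {"4/4": "4_4", "3/4": "3_4", "2/4": "2_4", "6/8": "6_8"}


def create_seed(time_signature: str, note: str, bpm: int, density: int) -> str:
    # Build the prompt string that the text-generation pipeline continues
    return (f"PIECE_START TIME_SIGNATURE={time_signature_to_tokens[time_signature]} "
            f"BPM={bpm} TRACK_START INST=0 DENSITY={density} "
            f"BAR_START NOTE_ON={notes_to_midi[note]} ")


# Default values of the dropdowns and sliders in the demo
seed = create_seed("4/4", "G2", 90, 2)
print(seed)
# PIECE_START TIME_SIGNATURE=4_4 BPM=90 TRACK_START INST=0 DENSITY=2 BAR_START NOTE_ON=43
```

The seed ends with a trailing space, so the model continues with the next token after the first `NOTE_ON`.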

Amazing!! Great work! You can now share your demo with the world using Spaces' free hosting. To do that, create a Space on Hugging Face, clone it to your local computer, and adapt the code of this notebook. The adaptation should include a `requirements.txt` file and an `app.py` file (where you add the code of your Gradio demo). You can check in this link how a final adaptation in Spaces could be made.

This has been an incredible journey. Congratulations on getting this far! We started from scratch and followed a series of steps to create an actual bot that composes guitar music, and we shared it with the world. This is not trivial; you should be proud of yourself. To achieve this, you:


1.   Collected a dataset of guitar music from the Mutopia Project.
2.   Converted the MIDI files to text using the embedding representations proposed in the paper [MMM: Exploring Conditional Multi-Track Music Generation with the Transformer](https://arxiv.org/abs/2008.06048) and [implemented](https://github.com/AI-Guru/MMM-JSB) by Dr. Tristan Behrens.
3.   Created a WordLevel tokenizer from this music as text.
4.   Trained a GPT-2 model with Hugging Face using the dataset and the tokenizer.
5.   Created a Gradio demo and shared it with the world.

Fantastic! I hope you have had as much fun going through this tutorial as I had creating it.