<a href="https://colab.research.google.com/github/AlkimiaSoft/AIColabSamples/blob/main/Generate_AI_Music_with_musicgen.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Init

### Install MusicGen from the GitHub repository

After the installation, Colab will ask you to restart the runtime. Do so before continuing, so that the newly installed packages are loaded.

In [None]:
#musicGen dependencies
!pip install -U git+https://github.com/facebookresearch/audiocraft

### Import the needed libraries

In [1]:
from audiocraft.models import musicgen
from IPython.display import Audio
from audiocraft.data.audio import audio_write

## Basic usage

### Choose a model

For more information, see https://github.com/facebookresearch/audiocraft/blob/main/docs/MUSICGEN.md

In [None]:
#model = musicgen.MusicGen.get_pretrained('facebook/musicgen-melody', device='cuda')
#model = musicgen.MusicGen.get_pretrained('facebook/musicgen-small', device='cuda')
model = musicgen.MusicGen.get_pretrained('facebook/musicgen-medium', device='cuda')
#model = musicgen.MusicGen.get_pretrained('facebook/musicgen-stereo-medium', device='cuda')
#model = musicgen.MusicGen.get_pretrained('facebook/musicgen-large', device='cuda')

### Define some variables

Feel free to change the values.

*Note: On the free Colab tier, longer durations can exhaust GPU memory when using the medium or large models.*

In [None]:
duration = 45 # duration in seconds
musicPrompts = ["happy ukulele tune"]
model.set_generation_params(duration=duration)
sampling_rate = model.sample_rate
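
To get a feel for why longer durations are expensive, here is a rough sizing sketch. It assumes the figures given in the MusicGen documentation (a 32 kHz EnCodec tokenizer sampled at 50 Hz); the constants are hard-coded assumptions rather than values read from `model`, and `generation_size` is a hypothetical helper, not part of audiocraft.

```python
# Assumed constants taken from the MusicGen docs, not read from a loaded model.
SAMPLE_RATE_HZ = 32_000  # audio samples per second
FRAME_RATE_HZ = 50       # token frames (autoregressive steps) per second


def generation_size(duration_s, num_prompts=1):
    """Return rough size figures for one generate() call (hypothetical helper)."""
    steps = int(duration_s * FRAME_RATE_HZ)
    samples = int(duration_s * SAMPLE_RATE_HZ)
    return {
        "decoding_steps": steps,                 # steps the model must decode per clip
        "samples_per_clip": samples,             # length of each output waveform
        "total_samples": samples * num_prompts,  # across the whole batch
    }


size = generation_size(45)
print(size)
```

The number of decoding steps grows linearly with `duration`, so halving the duration is the simplest thing to try when the GPU runs out of memory.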

### Generate the music

In [None]:
audioGenerations = model.generate(musicPrompts,progress=True)

## Save and display the music

In [None]:
audio_write("musicgen_basic_audio_test", audioGenerations[0, 0].cpu(), model.sample_rate, strategy="loudness", loudness_compressor=True)

In [None]:
Audio('/content/musicgen_basic_audio_test.wav', rate=sampling_rate)
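
`audio_write` receives a stem name and the file that gets displayed is the stem plus `.wav`, so the two names have to stay in sync. A minimal sketch of building the display path from the same stem follows; `saved_wav_path` is a hypothetical helper, and it assumes the default WAV format and Colab's `/content` working directory.

```python
from pathlib import Path


def saved_wav_path(stem, directory="/content"):
    """Build the path of the WAV file written for a given stem (hypothetical helper).

    Assumes the default WAV output and that the notebook runs in `directory`.
    """
    return Path(directory) / f"{stem}.wav"


path = saved_wav_path("musicgen_basic_audio_test")
# Checking for the file first gives a clearer message than a failed Audio() call.
print(path, "exists" if path.exists() else "not found yet")
```

Passing `str(path)` to `Audio` then guarantees that the saved and displayed names match.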

### Alternative way of saving music using scipy


In [None]:
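import locale
# Workaround for Colab shell commands failing with "A UTF-8 locale is required"
locale.getpreferredencoding = lambda: "UTF-8"
!pip install scipy
# Import the submodule explicitly so that scipy.io.wavfile is available
import scipy.io.wavfile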


In [None]:
audio_name = "musicgen_basic_audio_test_scipy.wav"
scipy.io.wavfile.write(audio_name, rate=sampling_rate, data=audioGenerations[0, 0].cpu().numpy())

Audio(audio_name, rate=sampling_rate)
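
If you would rather avoid the extra dependency, the same idea can be sketched with Python's standard `wave` module. This is only an illustration: `write_wav_int16` is a hypothetical helper, it assumes mono audio with float samples in [-1, 1], and it uses a synthetic tone as stand-in data instead of a real generation.

```python
import math
import struct
import wave


def write_wav_int16(path, samples, sample_rate):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    # Clip to the valid range, then scale to signed 16-bit integers.
    ints = [int(round(max(-1.0, min(1.0, float(s))) * 32767)) for s in samples]
    frames = struct.pack(f"<{len(ints)}h", *ints)  # little-endian, as WAV expects
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(frames)


# Stand-in data: one second of a 440 Hz tone instead of a generated clip.
demo_rate = 32_000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / demo_rate) for n in range(demo_rate)]
write_wav_int16("tone_test.wav", tone, demo_rate)
```

With a real generation you could pass `audioGenerations[0, 0].cpu().tolist()` as `samples`. Note that this converts to 16-bit integers, while the scipy call above writes the float data as is.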

## Generate multiple tracks by passing a list of prompts

We can pass several prompts and get one result per prompt from a single `generate` call.

In [None]:
musicPrompts = ["epic battle music", "melancholic piano ballad", "animated lofi chill beats"]

In [None]:
audioGenerations = model.generate(musicPrompts,progress=True)

### Saving and displaying the generated music

In [None]:
for idx, audioGeneration in enumerate(audioGenerations):
    # save file (audio_write adds the .wav extension to the stem name)
    audio_write(f'audio_{idx}', audioGeneration.cpu(), model.sample_rate, strategy="loudness", loudness_compressor=True)
    # show audio file; the name must match the stem used above
    display(Audio(f'/content/audio_{idx}.wav', rate=sampling_rate))


## Model comparison

Let's compare the three model sizes to see how much the output quality differs.

- `facebook/musicgen-small`: 300M model, text to music only - [🤗 Hub](https://huggingface.co/facebook/musicgen-small)
- `facebook/musicgen-medium`: 1.5B model, text to music only - [🤗 Hub](https://huggingface.co/facebook/musicgen-medium)
- `facebook/musicgen-large`: 3.3B model, text to music only - [🤗 Hub](https://huggingface.co/facebook/musicgen-large)
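
As a back-of-the-envelope check on why the larger models are tight on a free GPU, the parameter counts above can be turned into a weight-memory estimate. This is a sketch under simple assumptions: it counts the listed parameters only, ignoring activations, the text encoder, and the audio codec, and `weight_memory_gb` is a hypothetical helper.

```python
# Parameter counts as listed above.
PARAM_COUNTS = {
    "facebook/musicgen-small": 300e6,
    "facebook/musicgen-medium": 1.5e9,
    "facebook/musicgen-large": 3.3e9,
}


def weight_memory_gb(num_params, bytes_per_param):
    """Memory needed to hold the weights alone, in GB (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


for name, num_params in PARAM_COUNTS.items():
    fp32 = weight_memory_gb(num_params, 4)  # 4 bytes per float32 weight
    fp16 = weight_memory_gb(num_params, 2)  # 2 bytes per float16 weight
    print(f"{name}: ~{fp32:.1f} GB in float32, ~{fp16:.1f} GB in float16")
```

Actual usage during generation will be higher than these figures, since they cover the weights only.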


### Function to flush cache

We need a function to free cached GPU memory between models, because GPU memory is limited on the free Colab tier.

In [2]:
import gc
import torch

def flush_cache():
  gc.collect()
  torch.cuda.empty_cache()

flush_cache()
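
To see whether flushing actually helps, you can print the GPU memory before and after. The sketch below uses `torch.cuda.memory_allocated` and `torch.cuda.memory_reserved`; `gpu_memory_mib` is a hypothetical helper name, and it returns `None` when PyTorch or a GPU is not available so that it can run anywhere.

```python
try:
    import torch
except ImportError:  # lets the sketch run where PyTorch is not installed
    torch = None


def gpu_memory_mib():
    """Return allocated/reserved CUDA memory in MiB, or None without a GPU."""
    if torch is None or not torch.cuda.is_available():
        return None
    mib = 1024 ** 2
    return {
        # memory occupied by live tensors
        "allocated_mib": torch.cuda.memory_allocated() / mib,
        # memory held by PyTorch's caching allocator (live tensors plus cache)
        "reserved_mib": torch.cuda.memory_reserved() / mib,
    }


print("GPU memory:", gpu_memory_mib())
```

Keep in mind that `torch.cuda.empty_cache()` only releases cached blocks that are no longer in use. Tensors and models that are still referenced by a variable stay on the GPU until the reference is dropped.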

### Helper function to display audio clips in a comparison grid

In [3]:
from google.colab import widgets
from IPython.display import Audio

def compare_audio(promptsList, modelsList):
  # One extra row and one extra column hold the model and prompt labels
  grid = widgets.Grid(len(promptsList)+1, len(modelsList)+1)
  for row, col in grid:
    if row == 0 and col != 0:
      # header row: show the model name
      print(modelsList[col-1])
    elif row != 0 and col == 0:
      # first column: show the prompt text
      print(promptsList[row-1])
    elif row != 0 and col != 0:
      # remaining cells: show the audio saved for this model and prompt
      audio_file = f'/content/audio_{modelsList[col-1].split("/")[-1]}_{row-1}.wav'
      display(Audio(audio_file))
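
The grid layout is easier to follow as a plain function that does not need Colab. The sketch below mirrors the same row and column logic; `cell_content` is a hypothetical helper written for illustration and is not used by `compare_audio`.

```python
def cell_content(row, col, prompts, models):
    """Describe what a grid cell shows: a label, an audio path, or nothing."""
    if row == 0 and col == 0:
        return ("empty", None)               # top-left corner stays blank
    if row == 0:
        return ("model", models[col - 1])    # header row: model names
    if col == 0:
        return ("prompt", prompts[row - 1])  # first column: prompt texts
    short_name = models[col - 1].split("/")[-1]
    return ("audio", f"/content/audio_{short_name}_{row - 1}.wav")


demo_prompts = ["epic battle music", "animated lofi chill beats"]
demo_models = ["facebook/musicgen-small", "facebook/musicgen-medium"]

for r in range(len(demo_prompts) + 1):
    print([cell_content(r, c, demo_prompts, demo_models)
           for c in range(len(demo_models) + 1)])
```

Each audio cell points at the file name pattern `audio_<model>_<prompt index>.wav`, so the files must be saved under exactly that pattern for the grid to find them.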

### Generations for each model and each prompt

In [None]:
duration = 45 # duration in seconds
musicPrompts = [
    "epic battle music",
    "melancholic piano ballad",
    "animated lofi chill beats"
]
models = [
    'facebook/musicgen-small',
    'facebook/musicgen-medium',
    'facebook/musicgen-large'
]

audio_table = {}

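for model_name in models:
    model = musicgen.MusicGen.get_pretrained(model_name, device='cuda')
    model.set_generation_params(duration=duration)
    sampling_rate = model.sample_rate
    audioGenerations = model.generate(musicPrompts, progress=True)
    # Keep only a CPU copy so the stored results do not hold GPU memory
    audio_table[model_name] = audioGenerations.cpu()
    # Drop the model and the GPU tensors before loading the next model;
    # empty_cache() cannot free memory that is still referenced
    del model, audioGenerations
    flush_cache()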

# Save and display the generated audios
for model_name, audioGenerations in audio_table.items():
    for idx, audioGeneration in enumerate(audioGenerations):
        audio_write(f'audio_{model_name.split("/")[-1]}_{idx}', audioGeneration.cpu(), sampling_rate, strategy="loudness", loudness_compressor=True)


### Showing the results

In [None]:
compare_audio(musicPrompts,models)