# MusicGen

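First, we start by initializing MusicGen. You can choose a model from the following selection:
1. `small` - 300M transformer decoder.
2. `medium` - 1.5B transformer decoder.
3. `melody` - 1.5B transformer decoder also supporting melody conditioning.
4. `large` - 3.3B transformer decoder.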


We will use the `small` variant for this demo.
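
To get a feel for what these sizes mean in practice, the sketch below turns the parameter counts from the model list into a rough weight-memory estimate. It assumes 4 bytes per parameter (float32) and counts only the transformer decoder, so the audio codec, the text encoder, and activations are not included. The names `PARAM_COUNTS` and `approx_weight_gb` are hypothetical helpers, not part of audiocraft.

```python
# Parameter counts taken from the model list (decoder only).
PARAM_COUNTS = {
    "small": 300_000_000,
    "medium": 1_500_000_000,
    "melody": 1_500_000_000,
    "large": 3_300_000_000,
}


def approx_weight_gb(name, bytes_per_param=4):
    """Rough size of the decoder weights in GB (10**9 bytes).

    bytes_per_param=4 assumes float32 storage; use 2 for float16.
    """
    return PARAM_COUNTS[name] * bytes_per_param / 1e9


for name in PARAM_COUNTS:
    print(f"{name}: ~{approx_weight_gb(name):.1f} GB")
```

The estimate is a lower bound on actual memory use, which is one reason the smaller variant is convenient for a quick demo.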

In [1]:
from audiocraft.models import MusicGen

# Using small model, better results would be obtained with `medium` or `large`.
model = MusicGen.get_pretrained('small')



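Next, let us configure the generation parameters. Specifically, you can control the following:
* `use_sampling` (bool, optional): use sampling if True, else do argmax decoding. Defaults to True.
* `top_k` (int, optional): top_k used for sampling. Defaults to 250.
* `top_p` (float, optional): top_p used for sampling; when set to 0, top_k is used instead. Defaults to 0.0.
* `temperature` (float, optional): softmax temperature parameter. Defaults to 1.0.
* `duration` (float, optional): duration of the generated waveform, in seconds. Defaults to 30.0.
* `cfg_coef` (float, optional): coefficient used for classifier-free guidance. Defaults to 3.0.

When left unchanged, MusicGen will revert to its default parameters.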
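
To make the interplay of the sampling parameters concrete, here is a minimal pure-Python sketch of picking one token from a list of logits. It is an illustration under assumptions, not audiocraft's implementation: the function names `softmax` and `pick_token` are hypothetical, and the precedence (a non-zero `top_p` takes priority, otherwise `top_k` is used) follows the parameter descriptions in the list.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_token(logits, use_sampling=True, top_k=250, top_p=0.0,
               temperature=1.0, rng=random):
    """Pick a token index from `logits` (hypothetical helper)."""
    if not use_sampling:
        # Argmax decoding: always take the most likely token.
        return max(range(len(logits)), key=lambda i: logits[i])

    probs = softmax(logits, temperature)
    # Token indices sorted from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    if top_p > 0.0:
        # Nucleus sampling: keep the smallest prefix whose mass reaches top_p.
        kept, mass = [], 0.0
        for i in order:
            kept.append(i)
            mass += probs[i]
            if mass >= top_p:
                break
    elif top_k > 0:
        # Top-k sampling: keep only the k most likely tokens.
        kept = order[:top_k]
    else:
        kept = order

    # Sample among the kept tokens in proportion to their probabilities.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]
```

A lower `temperature` sharpens the distribution before filtering, while a smaller `top_k` or `top_p` narrows the set of candidates that can be drawn at all.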

In [2]:
model.set_generation_params(
    use_sampling=True,
    top_k=250,
    duration=5
)
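
The call above leaves `cfg_coef` at its default of 3.0. As a rough illustration of what a classifier-free guidance coefficient does, the sketch below uses the common formulation in which the conditional prediction is pushed away from the unconditional one. The helper `apply_cfg` is hypothetical and operates on plain lists; whether MusicGen combines its logits in exactly this way is an assumption here.

```python
def apply_cfg(cond_logits, uncond_logits, cfg_coef=3.0):
    """Blend conditional and unconditional logits (common CFG formulation).

    cfg_coef = 0 returns the unconditional logits, cfg_coef = 1 returns the
    conditional ones, and larger values exaggerate the difference.
    """
    return [u + cfg_coef * (c - u) for c, u in zip(cond_logits, uncond_logits)]


cond = [2.0, -1.0]
uncond = [0.5, 0.5]
for coef in (0.0, 1.0, 3.0):
    print(coef, apply_cfg(cond, uncond, coef))
```

Under this formulation, a higher coefficient makes the output follow the conditioning more strongly, at the cost of less variety.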

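Next, we can go ahead and start generating music using one of the following modes:
* Unconditional samples using `model.generate_unconditional`
* Music continuation using `model.generate_continuation`
* Text-conditional samples using `model.generate`
* Melody-conditional samples using `model.generate_with_chroma`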
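
The four entry points differ mainly in which inputs they need. The hypothetical helper below (`generate_any` is not part of audiocraft) sketches one way to choose among them based on the arguments supplied. It only uses the method names and call styles that appear in this notebook, and it assumes `model` is an already loaded MusicGen instance.

```python
def generate_any(model, descriptions=None, prompt=None, prompt_sr=None,
                 melody=None, melody_sr=None, num_samples=1, progress=True):
    """Dispatch to one of the four generation modes (hypothetical helper)."""
    if melody is not None:
        # Melody-conditional: text descriptions plus a melody waveform.
        return model.generate_with_chroma(
            descriptions=descriptions, melody_wavs=melody,
            melody_sample_rate=melody_sr, progress=progress)
    if prompt is not None:
        # Continuation: extend an audio prompt, optionally guided by text.
        return model.generate_continuation(
            prompt, prompt_sr, descriptions, progress=progress)
    if descriptions is not None:
        # Text-conditional: one sample per description.
        return model.generate(descriptions=descriptions, progress=progress)
    # No inputs at all: unconditional samples.
    return model.generate_unconditional(
        num_samples=num_samples, progress=progress)
```

The order of the checks is a design choice for this sketch: the most specific input (a melody) wins, and the unconditional mode is the fallback.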

### Unconditional samples

In [4]:
from audiocraft.utils.notebook import display_audio

output = model.generate_unconditional(num_samples=2, progress=True)
display_audio(output, sample_rate=32000)

   253 /    250

### Music continuation

In [5]:
import math
import torchaudio
import torch
from audiocraft.utils.notebook import display_audio

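def get_bip_bip(bip_duration=0.125, frequency=440,
                duration=0.5, sample_rate=32000, device="cuda"):
    """Generates a series of bip bip at the given frequency."""
    # Time axis in seconds, created on the requested device.
    t = torch.arange(
        int(duration * sample_rate), device=device, dtype=torch.float) / sample_rate
    # Pure tone at `frequency` Hz, with a leading channel dimension.
    wav = torch.cos(2 * math.pi * frequency * t)[None]
    # Position within each on/off cycle of length 2 * bip_duration, in [0, 1).
    tp = (t % (2 * bip_duration)) / (2 * bip_duration)
    # Gate the tone: silent for the first half of each cycle, on for the second.
    envelope = (tp >= 0.5).float()
    return wav * envelope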


In [6]:
# Here we use a synthetic signal to prompt both the tonality and the BPM
# of the generated audio.
res = model.generate_continuation(
    get_bip_bip(0.125).expand(2, -1, -1), 
    32000, ['Jazz jazz and only jazz', 
            'Heartful EDM with beautiful synths and chords'], 
    progress=True)
display_audio(res, 32000)

   228 /    250

In [7]:
# You can also use any audio from a file. Make sure to trim the file if it is too long!
prompt_waveform, prompt_sr = torchaudio.load("./assets/bach.mp3")
prompt_duration = 2
prompt_waveform = prompt_waveform[..., :int(prompt_duration * prompt_sr)]
output = model.generate_continuation(prompt_waveform, prompt_sample_rate=prompt_sr, progress=True)
display_audio(output, sample_rate=32000)

   153 /    250
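
The progress counters printed above (`253 / 250`, `228 / 250`, `153 / 250`) can be related to the `duration` setting with a little arithmetic. The sketch below is one reading that is consistent with these numbers, not a description of audiocraft's internals. It assumes 50 token frames per second of audio, 4 codebooks whose delayed interleaving adds 3 extra steps, and that frames already covered by an audio prompt are not generated again. The function name `expected_progress` is hypothetical.

```python
def expected_progress(duration, prompt_duration=0.0,
                      frame_rate=50, n_codebooks=4):
    """Return (steps_run, steps_total) under the assumptions stated above."""
    total = int(duration * frame_rate)                # frames for the full clip
    prompt_frames = int(prompt_duration * frame_rate) # frames given by the prompt
    extra = n_codebooks - 1                           # assumed delay-pattern overhead
    return total - prompt_frames + extra, total


print(expected_progress(5))       # unconditional, duration=5
print(expected_progress(5, 0.5))  # 0.5 s synthetic prompt
print(expected_progress(5, 2))    # 2 s prompt trimmed from a file
```

Under these assumptions the three calls give the same pairs as the counters in the cell outputs, which also shows why a longer prompt shortens the generation run for a fixed `duration`.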

### Text-conditional samples

In [8]:
from audiocraft.utils.notebook import display_audio

output = model.generate(
    descriptions=[
        '80s pop track with bassy drums and synth',
        '90s rock song with loud guitars and heavy drums',
    ],
    progress=True
)
display_audio(output, sample_rate=32000)

   253 /    250

### Melody-conditional samples

In [10]:
import torchaudio
from audiocraft.utils.notebook import display_audio

model = MusicGen.get_pretrained('melody')
model.set_generation_params(duration=8)

melody_waveform, sr = torchaudio.load("assets/bolero_ravel.mp3")
melody_waveform = melody_waveform.unsqueeze(0).repeat(2, 1, 1)
output = model.generate_with_chroma(
    descriptions=[
        '80s pop track with bassy drums and synth',
        '90s lofi song with light guitars and smooth drums and low bass',
    ],
    melody_wavs=melody_waveform,
    melody_sample_rate=sr,
    progress=True
)
display_audio(output, sample_rate=32000)

   403 /    400