# MusicGen
Inference with the large model requires at least 16 GB of GPU RAM, and this notebook needs more than that to run reliably: the generation loop took about 20 GB.

I used an **NVIDIA A100**. Even on this very strong card, the large model generates 30 seconds of audio within 90 seconds.
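
To put the timing above in perspective, here is a minimal back-of-the-envelope sketch. The helper `estimate_generation_time` is hypothetical (not part of audiocraft), and it assumes the 90-second figure is an upper bound per clip and that time scales linearly with the number of clips. Model loading is not counted, and it is not stated whether the 90 seconds includes the diffusion decoding step.

```python
def estimate_generation_time(num_clips, clip_seconds=30.0, seconds_per_clip=90.0):
    """Return (real-time factor, total seconds) for a batch of clips.

    Assumption: each clip takes `seconds_per_clip` of wall-clock time,
    which is the upper bound quoted for the A100 and the large model.
    """
    real_time_factor = seconds_per_clip / clip_seconds
    total_seconds = num_clips * seconds_per_clip
    return real_time_factor, total_seconds


# Six prompts are generated in this notebook.
rtf, total = estimate_generation_time(num_clips=6)
print(f"real-time factor: {rtf:.1f}x, estimated total: {total / 60:.0f} min")
```

Under these assumptions, generation runs about three times slower than real time, so the six prompts take roughly nine minutes.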

In [1]:
import torchaudio
from audiocraft.models import MusicGen, MultiBandDiffusion
from audiocraft.data.audio import audio_write

In [None]:
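# Load the large MusicGen checkpoint (downloaded on first use).
model = MusicGen.get_pretrained('facebook/musicgen-large')
# Load the MultiBandDiffusion decoder used to turn generated tokens into audio.
mbd = MultiBandDiffusion.get_mbd_musicgen()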

model.set_generation_params(
    use_sampling=True,
    top_k=250,
    duration=30
)

In [3]:
texts = [
    "epic movie soundtrack with orchestra and modern drums",
    "punk rock with loud drum and power guitar",
    "a light and cheerly EDM track, with syncopated drums, aery pads, and strong emotions bpm: 130",
    "folk music with guitar and fiddle",
    "scottish pipes",
    "rock balad with power guitar, symphonic orchestra and flute as lead"
    ]
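
Each prompt later doubles as the output file name, so it helps to see how a prompt maps to a file stem. The sketch below uses a hypothetical helper, `prompt_to_filename` (not part of audiocraft and not the code this notebook runs). It collapses whitespace to underscores and drops every character other than letters, digits, underscores, and hyphens. The `max_len` cap is an added assumption to keep names short on file systems with length limits.

```python
import re


def prompt_to_filename(prompt, max_len=120):
    """Turn a free-text prompt into a conservative file stem (illustrative helper)."""
    # Collapse any run of whitespace into a single underscore.
    stem = re.sub(r"\s+", "_", prompt.strip())
    # Drop punctuation such as commas and colons.
    stem = re.sub(r"[^A-Za-z0-9_-]", "", stem)
    # Truncate overly long names (assumed limit, adjust as needed).
    return stem[:max_len]


print(prompt_to_filename("scottish pipes"))
print(prompt_to_filename("strong emotions bpm: 130"))
```

For the six prompts in this notebook, this gives the same names as replacing spaces with underscores and removing commas and colons. It is stricter for prompts that contain slashes, quotes, or other characters that are unsafe in paths.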

In [4]:
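for text in texts:
    print(f"Generating: {text}")
    # return_tokens=True makes generate() return a (waveform, tokens) pair.
    output = model.generate(
        descriptions=[text],
        progress=True, return_tokens=True
    )
    # Derive the file name from the prompt; keep `text` itself unchanged.
    file_stem = text.replace(" ", "_").replace(",", "").replace(":", "")
    # Decode the tokens with MultiBandDiffusion, then move the audio to the CPU for writing.
    output_diffusion = mbd.tokens_to_wav(output[1])
    audio_write(
        f"../musicgen-large/{file_stem}",
        wav=output_diffusion[0].cpu(),
        sample_rate=model.sample_rate,
        format="mp3", strategy="loudness",
        loudness_headroom_db=16, loudness_compressor=True,
    )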

Generating: epic movie soundtrack with orchestra and modern drums
Generating: punk rock with loud drum and power guitar
Generating: a light and cheerly EDM track, with syncopated drums, aery pads, and strong emotions bpm: 130
Generating: folk music with guitar and fiddle
Generating: scottish pipes
Generating: rock balad with power guitar, symphonic orchestra and flute as lead
  1503 /   1500
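
The final progress line, `1503 / 1500`, can look like an overrun. A plausible reading is sketched below. It assumes MusicGen's token frame rate of 50 Hz and its 4 codebooks, and that the codebooks are interleaved with a delay of one step each, so the decoder runs a few steps past the nominal target. The constants and the helper `expected_steps` are illustrative assumptions, not values read from the loaded model.

```python
FRAME_RATE_HZ = 50   # assumed token frame rate of the MusicGen audio tokenizer
NUM_CODEBOOKS = 4    # assumed number of codebooks per frame


def expected_steps(duration_s, frame_rate_hz=FRAME_RATE_HZ, num_codebooks=NUM_CODEBOOKS):
    """Return (target steps, steps including the codebook delay)."""
    target = int(duration_s * frame_rate_hz)
    # With a one-step delay per codebook, the last codebook finishes
    # (num_codebooks - 1) steps after the first one.
    with_delay = target + (num_codebooks - 1)
    return target, with_delay


print(expected_steps(30))
```

With `duration=30`, this gives a target of 1500 steps and 1503 steps including the delay, which matches the counter printed above.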