# MusicGen

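MusicGen is a music generation model developed by the FAIR team at Meta AI. It is designed to generate music from a text description or conditioned on a melody. The model combines an EnCodec model for audio tokenization with a Transformer-based autoregressive language model for music modeling.
- GitHub: https://github.com/facebookresearch/audiocraft/blob/main/model_cards/MUSICGEN_MODEL_CARD.md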

<a href="https://colab.research.google.com/github/fuyu-quant/data-science-wiki/blob/main/multimodal/text_to_music/rinna_japanese_cloob_vit_b_16.ipynb" target="_blank" rel="noopener noreferrer"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
%%capture
!pip install git+https://github.com/huggingface/transformers.git

In [None]:
from transformers import AutoProcessor, MusicgenForConditionalGeneration

from IPython.display import Audio
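import scipy.io.wavfile  # import the submodule explicitly; a bare `import scipy` may not load it on older SciPy versions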

# Download MusicGen

In [None]:
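# Checkpoint to load; the larger checkpoints need more memory and download time.
model_name = "facebook/musicgen-small"
#model_name = "facebook/musicgen-medium"
#model_name = "facebook/musicgen-large"
# Note: the melody checkpoint likely cannot be loaded with MusicgenForConditionalGeneration;
# transformers provides a separate MusicgenMelodyForConditionalGeneration class for it.
#model_name = "facebook/musicgen-melody"
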
processor = AutoProcessor.from_pretrained(model_name)
model = MusicgenForConditionalGeneration.from_pretrained(model_name)

In [None]:
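# Tokenize the text prompts; padding=True pads the shorter prompt so both fit in one batch.
inputs = processor(
    text=["80s pop track with bassy drums and synth", "90s rock song with loud guitars and heavy drums"],
    padding=True,
    return_tensors="pt",
)

# Generate one clip per prompt; max_new_tokens bounds the length of each clip.
audio_values = model.generate(**inputs, max_new_tokens=256)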
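
The `max_new_tokens` argument controls the clip length. As a rough sketch, assuming the EnCodec audio encoder used by MusicGen emits about 50 token frames per second (an assumption here; the loaded model should report the actual value as `model.config.audio_encoder.frame_rate`), the hypothetical helpers below convert between a token budget and a duration:

```python
import math

# Assumed frame rate of the audio tokenizer, in token frames per second.
# This is an illustrative default; prefer the value reported by the loaded model.
ASSUMED_FRAME_RATE = 50


def duration_for_tokens(num_tokens, frame_rate=ASSUMED_FRAME_RATE):
    """Approximate clip length in seconds for a given token budget."""
    return num_tokens / frame_rate


def tokens_for_duration(seconds, frame_rate=ASSUMED_FRAME_RATE):
    """Smallest token budget that covers at least `seconds` of audio."""
    return math.ceil(seconds * frame_rate)
```

Under this assumption, `max_new_tokens=256` corresponds to roughly 5 seconds of audio, and a 10 second clip could be requested with `max_new_tokens=tokens_for_duration(10)`.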

In [3]:
# audio_values has shape (batch, channels, samples); play the clip for the first prompt.
sampling_rate = model.config.audio_encoder.sampling_rate
Audio(audio_values[0].numpy(), rate=sampling_rate)

In [4]:
# Save the first prompt's clip (batch index 0, channel 0) as a WAV file.
sampling_rate = model.config.audio_encoder.sampling_rate
scipy.io.wavfile.write("musicgen_out.wav", rate=sampling_rate, data=audio_values[0, 0].numpy())
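
`scipy.io.wavfile.write` stores a float tensor as a floating-point WAV file, which some players may not open. A minimal sketch of an alternative, using only the Python standard library and a hypothetical helper named `write_pcm16_wav`, converts the samples to 16-bit PCM first. It assumes mono audio with values nominally in [-1, 1]:

```python
import struct
import wave


def write_pcm16_wav(path, samples, sampling_rate):
    """Write mono float samples (nominally in [-1, 1]) as a 16-bit PCM WAV file."""
    # Clip out-of-range values, then scale to the signed 16-bit integer range.
    ints = [round(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)  # mono
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(sampling_rate)
        # WAV stores samples as little-endian signed shorts.
        wav_file.writeframes(struct.pack("<%dh" % len(ints), *ints))
```

In this notebook it could be called as `write_pcm16_wav("musicgen_out_pcm16.wav", audio_values[0, 0].tolist(), sampling_rate)`.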