Audiocraft

Audiocraft is a PyTorch library for deep learning research on audio generation. At the moment, it contains the code for MusicGen, a state-of-the-art controllable text-to-music model.

MusicGen

Audiocraft provides the code and models for MusicGen, a simple and controllable model for music generation. MusicGen is a single stage auto-regressive Transformer model trained over a 32kHz EnCodec tokenizer with 4 codebooks sampled at 50 Hz. Unlike existing methods like MusicLM, MusicGen doesn't not require a self-supervised semantic representation, and it generates all 4 codebooks in one pass. By introducing a small delay between the codebooks, we show we can predict them in parallel, thus having only 50 auto-regressive steps per second of audio. Check out our sample page or test the available demo!

Installation

Audiocraft requires Python 3.9, PyTorch 2.0.0, and a GPU with at least 16 GB of memory (for the medium-sized model). To install Audiocraft, you can run the following:

# Best to make sure you have torch installed first, in particular before installing xformers.
# Don't run this if you already have PyTorch installed.
pip install 'torch>=2.0'
# Then proceed to one of the following
pip install -U audiocraft  # stable release
pip install -U git+https://git@github.com/facebookresearch/audiocraft#egg=audiocraft  # bleeding edge
pip install -e .  # or if you cloned the repo locally

Usage

You can play with MusicGen by running the jupyter notebook at demo.ipynb locally, or use the provided colab notebook. Finally, a demo is also available on the facebook/MusiGen HugginFace Space (huge thanks to all the HF team for their support).

API

We provide a simple API and 4 pre-trained models. The pre trained models are:

small: 300M model, text to music only,
medium: 1.5B model, text to music only,
melody: 1.5B model, text to music and text+melody to music,
large: 3.3B model, text to music only.

We observe the best trade-off between quality and compute with the medium or melody model. In order to use MusicGen locally you must have a GPU. We recommend 16GB of memory, but smaller GPUs will be able to generate short sequences, or longer sequences with the small model.

See after a quick example for using the API.

import torchaudio
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained('melody')
model.set_generation_params(duration=8)  # generate 8 seconds.
wav = model.generate_unconditional(4)    # generates 4 unconditional audio samples
descriptions = ['happy rock', 'energetic EDM', 'sad jazz']
wav = model.generate(descriptions)  # generates 3 samples.

melody, sr = torchaudio.load('./assets/bach.mp3')
# generates using the melody from the given audio and the provided descriptions.
wav = model.generate_with_chroma(descriptions, melody[None].expand(3, -1, -1), sr)

for idx, one_wav in enumerate(wav):
    # Will save under {idx}.wav, with loudness normalization at -14 db LUFS.
    audio_write(f'{idx}', one_wav, model.sample_rate, strategy="loudness")

Model Card

See the model card page.

FAQ

Will the training code be released?

Yes. We will soon release the training code for MusicGen and EnCodec.

Citation

@article{copet2023simple,
      title={Simple and Controllable Music Generation},
      author={Jade Copet and Felix Kreuk and Itai Gat and Tal Remez and David Kant and Gabriel Synnaeve and Yossi Adi and Alexandre Défossez},
      year={2023},
      journal={arXiv preprint arXiv:2306.05284},
}

License

The code in this repository is released under the MIT license as found in the LICENSE file.
The weights in this repository are released under the CC-BY-NC 4.0 license as found in the LICENSE_weights file.

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
.github		.github
assets		assets
audiocraft		audiocraft
tests		tests
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
LICENSE_weights		LICENSE_weights
MANIFEST.in		MANIFEST.in
MODEL_CARD.md		MODEL_CARD.md
Makefile		Makefile
README.md		README.md
app.py		app.py
app_batched.py		app_batched.py
demo.ipynb		demo.ipynb
hf_loading.py		hf_loading.py
mypy.ini		mypy.ini
requirements.txt		requirements.txt
setup.cfg		setup.cfg
setup.py		setup.py

License

StephenRoddy/audiocraft

Folders and files

Latest commit

History

Repository files navigation

Audiocraft

MusicGen

Installation

Usage

API

Model Card

FAQ

Will the training code be released?

Citation

License

About

Resources

License

Stars

Watchers

Forks

Languages