# Scene Music Generation

https://github.com/facebookresearch/audiocraft/blob/main/docs/MUSICGEN.md

In [1]:
%pip install --upgrade pip
%pip install --upgrade transformers scipy torchvision

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Note: you may need to restart the kernel to use updated packages.
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Note: you may need to restart the kernel to use updated packages.


In [2]:
from transformers import AutoProcessor, MusicgenForConditionalGeneration
import scipy.io.wavfile  # import the submodule explicitly; it is used to write WAV files below
import settings

processor = AutoProcessor.from_pretrained(settings.MUSIC_MODEL)
model = MusicgenForConditionalGeneration.from_pretrained(settings.MUSIC_MODEL).to(settings.DEVICE)
sampling_rate = model.config.audio_encoder.sampling_rate

def generate_music(prompt: str):
    inputs = processor(
        text=[prompt],
        padding=True,
        return_tensors="pt",
    ).to(settings.DEVICE)

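    # 256 tokens is roughly 5 s of audio, so 256 * 8 tokens is roughly 40 s.
    max_new_tokens = 256 * 8
    audio_values = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # audio_values has shape (batch, channels, samples); keep the first clip's first channel.
    audio = audio_values[0, 0].cpu().numpy()
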
    return audio

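The `256 = 5s` rule of thumb in `generate_music` can be turned into a small helper. This is a minimal sketch, assuming the audio codec emits about 50 token frames per second; the constant and both function names are illustrative and not part of the notebook. With a loaded model, the rate could probably be read from `model.config.audio_encoder.frame_rate` instead of being hard-coded.

```python
# Assumption: roughly 50 token frames per second of generated audio.
ASSUMED_FRAME_RATE_HZ = 50


def seconds_to_tokens(seconds: float, frame_rate: int = ASSUMED_FRAME_RATE_HZ) -> int:
    """Approximate max_new_tokens needed for a clip of the given length."""
    return int(round(seconds * frame_rate))


def tokens_to_seconds(tokens: int, frame_rate: int = ASSUMED_FRAME_RATE_HZ) -> float:
    """Approximate clip length produced by a given token budget."""
    return tokens / frame_rate


print(tokens_to_seconds(256))      # the "256 = 5s" rule of thumb
print(tokens_to_seconds(256 * 8))  # the budget used in generate_music
print(seconds_to_tokens(30))       # token budget for a 30 s clip
```

Under this assumption, the `256 * 8` budget corresponds to about 41 seconds per scene.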

In [3]:
import settings
from model import Story

story_dir = settings.STORY_DIR
story = Story.load_from_directory(story_dir + "/step_4")

In [6]:
from IPython.display import Audio, display, Markdown
import os
import subprocess

for act in story.acts:
    for scene in act.scenes:
        prompt = scene.mood
        output_file = story_dir + f"/step_13/{act.act_id}/{scene.scene_id}.wav"

        # Create the directory if it doesn't exist
        os.makedirs(os.path.dirname(output_file), exist_ok=True)

        display(Markdown(f"---\n\n# Generating music for {act.title} {scene.title}"))

        audio = generate_music(prompt)
        scipy.io.wavfile.write(output_file, rate=sampling_rate, data=audio)

        # Convert WAV to MP3. Passing an argument list avoids shell quoting
        # problems, -y overwrites an MP3 left over from an earlier run, and
        # check=True raises if ffmpeg fails so the WAV is not deleted by mistake.
        mp3_output_file = os.path.splitext(output_file)[0] + ".mp3"
        subprocess.run(["ffmpeg", "-y", "-i", output_file, mp3_output_file], check=True)

        # Remove the WAV file after conversion
        os.remove(output_file)

        display(Markdown(f"Model: {settings.MUSIC_MODEL}"))
        display(Markdown(f"> {prompt}"))
        display(Audio(filename=mp3_output_file))

---

# Generating music for Act 1: Beyond the Screen The Underdog's Rise

ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
  configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enab

Model: facebook/musicgen-large

> Electric, competitive atmosphere

---

# Generating music for Act 1: Beyond the Screen Awakening to Nexus

Model: facebook/musicgen-large

> Surreal, wonder-filled

---

# Generating music for Act 2: Unlikely Allies Confronting Rivals

Model: facebook/musicgen-large

> Tense, competitive atmosphere

---

# Generating music for Act 2: Unlikely Allies First Trial in Nexus

Model: facebook/musicgen-large

> Perplexing, thrilling

---

# Generating music for Act 3: The Weight of Reality Aftermath on Earth

Model: facebook/musicgen-large

> Somber, reflective atmosphere

---

# Generating music for Act 3: The Weight of Reality Confronting the Architect

Model: facebook/musicgen-large

> Unsettling, revelatory
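
ffmpeg prints its version banner and build configuration on every conversion, which clutters the cell output. One way to quiet it is sketched below, assuming ffmpeg is on the `PATH`. The helper names `build_ffmpeg_command` and `convert_to_mp3` are hypothetical and not part of the notebook. `-hide_banner` and `-loglevel error` are standard ffmpeg options.

```python
from pathlib import Path
import subprocess


def build_ffmpeg_command(wav_path: str) -> list[str]:
    """Build a quiet WAV-to-MP3 ffmpeg command for the given file."""
    # with_suffix only touches the final extension, unlike str.replace,
    # which would also rewrite ".wav" appearing elsewhere in the path.
    mp3_path = Path(wav_path).with_suffix(".mp3")
    return [
        "ffmpeg",
        "-y",                  # overwrite an existing MP3 without prompting
        "-hide_banner",        # drop the version/configuration banner
        "-loglevel", "error",  # print only real errors
        "-i", str(wav_path),
        str(mp3_path),
    ]


def convert_to_mp3(wav_path: str) -> str:
    """Run the conversion and return the MP3 path; raises if ffmpeg fails."""
    cmd = build_ffmpeg_command(wav_path)
    subprocess.run(cmd, check=True)
    return cmd[-1]


# Only the command is printed here, so the sketch runs without ffmpeg installed.
print(build_ffmpeg_command("stories/my_story/step_13/discovery/gaming_tournament.wav"))
```

Separating command construction from execution keeps the path handling easy to check without running ffmpeg.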

In [18]:
for act in story.acts:
    display(Markdown(f"## {act.title}"))
    for scene in act.scenes:
        # Look in step_13, where the generation cell wrote the MP3 files.
        mp3_file = f"{story_dir}/step_13/{act.act_id}/{scene.scene_id}.mp3"
        if os.path.exists(mp3_file):
            display(Markdown(f"### {scene.title}\n[{mp3_file}]({mp3_file})\n\nMood: {scene.mood}"))


## Act 1: Beyond the Screen

### The Underdog's Rise
[stories/my_story/step_13/discovery/gaming_tournament.mp3](stories/my_story/step_13/discovery/gaming_tournament.mp3)

Mood: Electric, competitive atmosphere

### Awakening to Nexus
[stories/my_story/step_13/discovery/nexus_awakening.mp3](stories/my_story/step_13/discovery/nexus_awakening.mp3)

Mood: Surreal, wonder-filled

## Act 2: Unlikely Allies

### Confronting Rivals
[stories/my_story/step_13/alliance/rival_encounter.mp3](stories/my_story/step_13/alliance/rival_encounter.mp3)

Mood: Tense, competitive atmosphere

### First Trial in Nexus
[stories/my_story/step_13/alliance/nexus_challenge.mp3](stories/my_story/step_13/alliance/nexus_challenge.mp3)

Mood: Perplexing, thrilling

## Act 3: The Weight of Reality

### Aftermath on Earth
[stories/my_story/step_13/consequences/earth_aftermath.mp3](stories/my_story/step_13/consequences/earth_aftermath.mp3)

Mood: Somber, reflective atmosphere

### Confronting the Architect
[stories/my_story/step_13/consequences/architect_encounter.mp3](stories/my_story/step_13/consequences/architect_encounter.mp3)

Mood: Unsettling, revelatory
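
The listing cell silently skips any scene whose MP3 does not exist, so a failed generation is easy to miss. A minimal sketch for reporting missing tracks follows. `find_missing_tracks` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for the notebook's `Story`, which is assumed only to expose `acts`, `act_id`, `scenes`, and `scene_id` as used above.

```python
import os
import tempfile
from types import SimpleNamespace


def find_missing_tracks(story, music_dir: str) -> list[str]:
    """Return the expected MP3 paths that are not present on disk."""
    missing = []
    for act in story.acts:
        for scene in act.scenes:
            mp3_file = os.path.join(music_dir, act.act_id, f"{scene.scene_id}.mp3")
            if not os.path.exists(mp3_file):
                missing.append(mp3_file)
    return missing


# Stand-in story with one act and two scenes (illustrative data only).
demo_story = SimpleNamespace(
    acts=[
        SimpleNamespace(
            act_id="discovery",
            scenes=[
                SimpleNamespace(scene_id="gaming_tournament"),
                SimpleNamespace(scene_id="nexus_awakening"),
            ],
        )
    ]
)

with tempfile.TemporaryDirectory() as music_dir:
    # Create an empty file for one scene only, so the other is reported.
    os.makedirs(os.path.join(music_dir, "discovery"))
    open(os.path.join(music_dir, "discovery", "gaming_tournament.mp3"), "wb").close()
    missing = find_missing_tracks(demo_story, music_dir)
    missing_names = [os.path.basename(path) for path in missing]

print(missing_names)
```

In the notebook, the same function could be called with `story` and the `step_13` directory before moving on to the next step.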

# Next Step
On to [Step 14: Manuscript](./14_manuscript.ipynb)