# Fine-tune MusicGen on Eulogy (Stranger Things Soundtrack)

This notebook fine-tunes Meta's MusicGen model on the "Eulogy" soundtrack by Kyle Dixon & Michael Stein from Stranger Things.

The result will be a custom model that generates music in the haunting, ethereal, dark ambient style of that track.

## Setup Instructions

1. **Save a copy to Google Drive**: File → Save a copy in Drive
2. **Get your Replicate API key**: https://replicate.com/account/api-tokens
3. **Add it as a secret**: Click the 🔑 icon in the left sidebar, add `REPLICATE_API_KEY`
4. **Download Eulogy audio** (see Step 1.A below)
5. **Run cells sequentially** using Shift+Enter

## 0.A Install Dependencies

In [33]:
!pip install -q replicate librosa pydub

## 0.B Setup Replicate API

In [34]:
from google.colab import userdata
import os
import replicate

# Get API key from Colab secrets
REPLICATE_API_KEY = userdata.get("REPLICATE_API_KEY")
os.environ["REPLICATE_API_TOKEN"] = REPLICATE_API_KEY

print("✓ Replicate API configured")

✓ Replicate API configured


## 0.C Test MusicGen (Optional)

Generate a quick test track to verify the base model works

In [35]:
# Optional: Test the base MusicGen model
try:
    # Named model_input to avoid shadowing Python's built-in input()
    model_input = {
            # Shorter alternative prompt:
            # "prompt": "Dark ambient synthesizer music, haunting and ethereal",
            "prompt": """Dark ambient synthesizer music. Haunting ethereal pads with slow atmospheric build.
            Wordless vocal-like Otamatone or synthesizer tones. Ghostly reverb-heavy drones. Cinematic horror soundtrack.
            Melancholic, mysterious, ominous. Similar to Stranger Things Eulogy by Kyle Dixon and Michael Stein.
            Pure synthesizers, no drums, no percussion. Minimal rhythm. Slow tempo. Foggy, distant, ethereal.""",
            "duration": 30,
            "model_version": "stereo-large",
            "output_format": "mp3",
            "normalization_strategy": "peak"
    }
    output = replicate.run(
        "meta/musicgen:671ac645ce5e552cc63a54a2bbff63fcf798043055d2dac5fc9e36a837eedcfb",
        input=model_input
    )

    print("✓ Base model test successful")
    # To access the file URL:
    print(f"Output: {output.url}")

    # To write the file to disk (creates output.mp3):
    with open("output.mp3", "wb") as file:
        file.write(output.read())
except Exception as e:
    print(f"✗ Error: {e}")

✓ Base model test successful
Output: https://replicate.delivery/yhqm/eq6WOyB79Zy1Da8spGSH9EP4VADDZW34xIfZoXtlm69U6LBWA/out.mp3


# 1. DATASET PREPARATION

## 1.A Download Eulogy Audio

### Option 1: From SoundCloud
1. Go to: https://soundcloud.com/user-920408597/eulogy
2. Right-click "Download" and save as MP3
3. Upload to Colab file browser on the left (drag and drop)
4. Rename to `eulogy.mp3`

### Option 2: From YouTube
1. Go to: https://www.youtube.com/watch?v=hXtPdB3_RVs
2. Use a YouTube to MP3 converter (e.g., https://www.y2mate.com/)
3. Download and upload to Colab

### Option 3: Use yt-dlp (in Colab)

In [36]:
# # Install yt-dlp to download from YouTube
# !pip install -q yt-dlp

# # Download Eulogy from YouTube
# !yt-dlp -f bestaudio -x --audio-format mp3 -o "eulogy.mp3" "https://www.youtube.com/watch?v=hXtPdB3_RVs"

# print("✓ Eulogy downloaded")

## 1.B Check Audio Duration & Quality

In [37]:
import librosa
import os
from pathlib import Path

# Load and analyze the audio
audio_path = "eulogy.mp3"
y, sr = librosa.load(audio_path, sr=None)
duration_seconds = librosa.get_duration(y=y, sr=sr)
file_size_mb = os.path.getsize(audio_path) / (1024 * 1024)

print(f"Audio File: {audio_path}")
print(f"Duration: {duration_seconds:.2f} seconds ({duration_seconds/60:.2f} minutes)")
print(f"Sample Rate: {sr} Hz")
print(f"File Size: {file_size_mb:.2f} MB")
print("\nℹ️  MusicGen requires each track to be > 30 seconds")
print(f"Your file: {'✓ GOOD' if duration_seconds > 30 else '✗ TOO SHORT'}")

Audio File: eulogy.mp3
Duration: 218.36 seconds (3.64 minutes)
Sample Rate: 44100 Hz
File Size: 3.33 MB

ℹ️  MusicGen requires each track to be > 30 seconds
Your file: ✓ GOOD


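## 1.C Chunk Audio into 33-Second Segments

MusicGen training automatically chunks audio, but we'll create explicit 33-second segments for better control. Each segment stays just above the 30-second minimum noted in Step 1.B, and any shorter remainder at the end of the track is dropped.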
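
To see how many segments a track will yield before exporting anything, the boundary arithmetic can be sketched on its own. This is a minimal illustration, not part of the notebook's pipeline; `full_chunk_bounds` is a hypothetical helper name, and the example uses the 218.36-second duration reported in Step 1.B.

```python
def full_chunk_bounds(total_ms: int, chunk_ms: int) -> list[tuple[int, int]]:
    """Return (start, end) millisecond offsets for every full-length chunk."""
    bounds = []
    for start in range(0, total_ms, chunk_ms):
        end = start + chunk_ms
        if end <= total_ms:  # drop the short remainder at the end of the track
            bounds.append((start, end))
    return bounds


# A 218.36 s track cut into 33 s chunks
bounds = full_chunk_bounds(218_360, 33_000)
print(len(bounds))   # 6 full chunks
print(bounds[-1])    # (165000, 198000); the final 20.36 s is discarded
```

Dropping the remainder keeps every segment the same length, at the cost of the last 20 seconds of the track.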

In [38]:
from pydub import AudioSegment
import os

# Create output directory
os.makedirs("dataset/audio", exist_ok=True)

# Load audio
audio = AudioSegment.from_mp3("eulogy.mp3")
chunk_length_ms = 33 * 1000  # 33 seconds

# Split into chunks
chunks = []
for i in range(0, len(audio), chunk_length_ms):
    chunk = audio[i:i + chunk_length_ms]
    # Skip chunks shorter than 33 seconds (we need exact 33s for training)
    if len(chunk) >= chunk_length_ms:
        chunk = chunk[:chunk_length_ms]  # Ensure exactly 33s
        chunks.append(chunk)
        chunk_num = len(chunks)
        filename = f"dataset/audio/eulogy_segment_{chunk_num:02d}.mp3"
        chunk.export(filename, format="mp3", bitrate="192k")
        print(f"✓ Created: {filename} ({len(chunk)/1000:.1f}s)")

print(f"\n✓ Total segments created: {len(chunks)}")

✓ Created: dataset/audio/eulogy_segment_01.mp3 (33.0s)
✓ Created: dataset/audio/eulogy_segment_02.mp3 (33.0s)
✓ Created: dataset/audio/eulogy_segment_03.mp3 (33.0s)
✓ Created: dataset/audio/eulogy_segment_04.mp3 (33.0s)
✓ Created: dataset/audio/eulogy_segment_05.mp3 (33.0s)
✓ Created: dataset/audio/eulogy_segment_06.mp3 (33.0s)

✓ Total segments created: 6


## 1.D Create Descriptions for Each Segment

Each audio file needs a corresponding `.txt` file with a description.
These descriptions help the model understand what style to learn.
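
Because each audio file is expected to have a same-named `.txt` file, a quick pairing check can catch a missing description before the dataset is zipped. The sketch below is an optional sanity check under that assumption; `unpaired_audio` is a hypothetical helper, not something the trainer provides.

```python
from pathlib import Path


def unpaired_audio(dataset_dir: str, audio_ext: str = ".mp3") -> list[str]:
    """List audio files in dataset_dir that lack a same-named .txt description."""
    root = Path(dataset_dir)
    return sorted(
        p.name
        for p in root.glob(f"*{audio_ext}")
        if not p.with_suffix(".txt").exists()
    )
```

After the descriptions have been written, calling `unpaired_audio("dataset/audio")` should return an empty list.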

In [39]:
# Create descriptions for each segment
# All segments describe the Eulogy style

description = """Dark ambient synthesizer music. Haunting ethereal pads with slow atmospheric build.
Wordless vocal-like Otamatone or synthesizer tones. Ghostly reverb-heavy drones. Cinematic horror soundtrack.
Melancholic, mysterious, ominous. Similar to Stranger Things Eulogy by Kyle Dixon and Michael Stein.
Pure synthesizers, no drums, no percussion. Minimal rhythm. Slow tempo. Foggy, distant, ethereal."""

# Write description for each segment
for i in range(1, len(chunks) + 1):
    txt_path = f"dataset/audio/eulogy_segment_{i:02d}.txt"
    with open(txt_path, 'w') as f:
        f.write(description)
    print(f"✓ Created: {txt_path}")

print(f"\n✓ Descriptions created for all {len(chunks)} segments")

✓ Created: dataset/audio/eulogy_segment_01.txt
✓ Created: dataset/audio/eulogy_segment_02.txt
✓ Created: dataset/audio/eulogy_segment_03.txt
✓ Created: dataset/audio/eulogy_segment_04.txt
✓ Created: dataset/audio/eulogy_segment_05.txt
✓ Created: dataset/audio/eulogy_segment_06.txt

✓ Descriptions created for all 6 segments


## 1.E Create Dataset ZIP

Replicate requires all training data in a single ZIP file.
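
Once the archive exists, it can be worth confirming what actually went into it. The following is a small sketch using only the standard library; `summarize_zip` is a hypothetical helper that counts archive members by extension, so a dataset of six segments with descriptions would be expected to show six `.mp3` and six `.txt` entries.

```python
import zipfile
from pathlib import PurePosixPath


def summarize_zip(zip_path: str) -> dict[str, int]:
    """Count archive members by lowercase file extension."""
    counts: dict[str, int] = {}
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            ext = PurePosixPath(name).suffix.lower()
            counts[ext] = counts.get(ext, 0) + 1
    return counts
```

Calling `summarize_zip("eulogy_dataset.zip")` after the archive is created gives a quick view of whether the audio and description counts match.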

In [40]:
import shutil

# Create ZIP archive
shutil.make_archive('eulogy_dataset', 'zip', 'dataset/audio')

# Check file size
zip_size = os.path.getsize('eulogy_dataset.zip') / (1024 * 1024)
print("✓ Dataset ZIP created: eulogy_dataset.zip")
print(f"File size: {zip_size:.2f} MB")

✓ Dataset ZIP created: eulogy_dataset.zip
File size: 4.47 MB


## 1.F Download ZIP for Upload to Replicate

The ZIP file is now ready. Download it from Colab's file browser (left sidebar):
1. Click the folder icon on the left
2. Find `eulogy_dataset.zip`
3. Right-click and select "Download"

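The training step in Section 3 reads the dataset from a publicly accessible URL, so after downloading the ZIP you'll host it somewhere it can be fetched directly (the example in Step 3.A uses a Dropbox link).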
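
If the ZIP is hosted on Dropbox, the share link needs to trigger a direct download rather than open a preview page. A minimal sketch, assuming the usual Dropbox convention that a `dl=1` query parameter forces a download; `force_direct_download` is a hypothetical helper and the URL in the usage note is a placeholder.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def force_direct_download(share_url: str) -> str:
    """Return share_url with its dl query parameter set to 1."""
    parts = urlsplit(share_url)
    # Keep every existing parameter except dl, then append dl=1
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "dl"
    ]
    query.append(("dl", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, `force_direct_download("https://www.dropbox.com/scl/fi/abc/file.zip?rlkey=xyz&dl=0")` returns the same link ending in `dl=1`. Other hosts (S3, GitHub Pages) serve files directly and need no such change.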

# 2. SETUP REPLICATE MODEL

## 2.A Create Model Repository on Replicate

First, get your Replicate username from https://replicate.com/account

In [41]:
from replicate.exceptions import ReplicateError

# TODO: Update these with your actual values
replicate_username = "sundai-club"  # e.g., "qsimeon" or "sundai-club"
model_name = "musicgen-eulogy"  # Must be unique, lowercase, no spaces

try:
    model = replicate.models.create(
        owner=replicate_username,
        name=model_name,
        visibility="public",  # or "private"
        hardware="gpu-t4",  # Replicate will override for fine-tuned models
        description="MusicGen fine-tuned on Eulogy from Stranger Things",
    )
    print(f"✓ Model created: {model.name}")
    print(f"URL: https://replicate.com/{model.owner}/{model.name}")
except ReplicateError as e:
    if "already exists" in str(e):
        print("Model already exists, loading it...")
        model = replicate.models.get(f"{replicate_username}/{model_name}")
        print(f"✓ Model loaded: {model.name}")
    else:
        print(f"✗ Error: {e}")
        print("Make sure you've set replicate_username correctly")

Model already exists, loading it...
✓ Model loaded: musicgen-eulogy


# 3. TRAIN THE MODEL

## 3.A Upload Dataset and Start Training

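**IMPORTANT**: `eulogy_dataset.zip` must be present in the Colab working directory. If you are in a new session, upload the copy you downloaded in Step 1.F. The trainer itself reads the dataset from a URL, so the same ZIP must also be hosted at a publicly accessible link.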

In [42]:
# Make sure eulogy_dataset.zip is in the current directory
# If not uploaded, upload it via the Files menu on the left

dataset_path = "eulogy_dataset.zip"

if not os.path.exists(dataset_path):
    print("✗ ERROR: eulogy_dataset.zip not found!")
    print("Please upload it via the Files menu on the left sidebar.")
else:
    print(f"✓ Found: {dataset_path}")
    print(f"Size: {os.path.getsize(dataset_path) / (1024*1024):.2f} MB")

✓ Found: eulogy_dataset.zip
Size: 4.47 MB


In [43]:
# Start training
# This uses the sakemin/musicgen-fine-tuner trainer on Replicate.
# The trainer reads the dataset from a URL, so the ZIP must be hosted somewhere publicly
# accessible, like an S3 bucket, a GitHub Pages site, or a direct-download Dropbox link.
# Note: To convert a Google Drive link into a downloadable link: https://www.ayrshare.com/how-to-get-direct-download-urls-from-google-drive/
# The melody model lets you generate music based on the melody of an input.
# This feature is only available in your fine-tune if you've chosen to train the melody base model.

try:
    training = replicate.trainings.create(
        # Using MusicGen fine-tuning trainer
        version="sakemin/musicgen-fine-tuner:8d02c56b9a3d69abd2f1d6cc1a65027de5bfef7f0d34bd23e0624ecabb65acac",
        input={
            "dataset_path": "https://www.dropbox.com/scl/fi/p544n6ef8k7agrhu5ggdn/eulogy_dataset.zip?rlkey=41wvydbu5fim8xys7jt5lr9h9&dl=1",
            "one_same_description": description,
            "auto_labeling": False, # True would auto-generate metadata labels
            "drop_vocals": True, # Remove vocals before training
            "model_version": "melody", # Options: small, medium or melody
            "lr": 1.0,
            "epochs": 5,
            "updates_per_epoch": 100,
            "batch_size": 8, # Must be a multiple of 8
        },
        destination=f"{model.owner}/{model.name}"
    )

    training.reload()
    print(f"✓ Training started: {training.status}")
    print(f"Training URL: https://replicate.com/p/{training.id}")
    # Logs can be empty (or None) right after the job is created
    print("\n".join((training.logs or "").split("\n")[-10:]))
    print("\nℹ️  Training takes ~15 minutes on L40s hardware")
    print("Check the URL above to monitor progress.")
except Exception as e:
    print(f"✗ Error starting training: {e}")
    print("Make sure:")
    print("1. eulogy_dataset.zip is uploaded to a publicly accessible URL")
    print("2. replicate_username is correct")
    print("3. model_name is unique")
    print("4. The trainer version ID is correct")

✓ Training started: starting
Training URL: https://replicate.com/p/q8t41nbj0srnc0cvz48tkneaar


ℹ️  Training takes ~15 minutes on L40s hardware
Check the URL above to monitor progress.


## 3.B Monitor Training

**Visit the training URL** printed above to monitor progress.

Training typically takes ~15-20 minutes on L40s hardware.

You'll see:
- Real-time training metrics (loss, etc.)
- Generated samples every few steps
- ETA until completion
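
For monitoring from the notebook rather than the browser, a bounded polling loop is one option. The sketch below is an illustration under stated assumptions: `wait_for_training` is a hypothetical helper, the terminal status names (`succeeded`, `failed`, `canceled`) are assumed to match what Replicate reports, and the status lookup is passed in as a callable so the loop itself does not depend on the client.

```python
import time
from typing import Callable

# Assumed terminal statuses for a Replicate training
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def wait_for_training(
    fetch_status: Callable[[], str],
    interval_s: float = 30.0,
    max_checks: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status until a terminal status appears or max_checks is reached."""
    status = fetch_status()
    for _ in range(max_checks - 1):
        if status in TERMINAL_STATUSES:
            break
        sleep(interval_s)
        status = fetch_status()
    return status
```

In the notebook this could be called as `wait_for_training(lambda: replicate.trainings.get(training.id).status)`. The `max_checks` cap keeps the cell from blocking indefinitely if a job stalls.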

In [47]:
# Optional: Check training status
training_id = training.id  # Also shown at the end of the training URL above

try:
    training = replicate.trainings.get(training_id)
    print(f"Training Status: {training.status}")
    print(f"Output: {training.output}")
except Exception:
    print("Couldn't fetch training status. Check the URL above instead.")

Training Status: processing
Output: None


# 4. USE YOUR FINE-TUNED MODEL

## 4.A Get Model Version ID

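Once training completes, the cell below reads the latest version ID from the model object. To look it up manually instead:
1. Go to https://replicate.com/your-username/musicgen-eulogy
2. Find the latest version ID (long hash)
3. Paste it into `model_version_id` below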

In [55]:
# After training completes, get your model's version ID
model_version_id = model.versions.list()[0].id  # e.g., "abc123def456..."
model_owner = replicate_username
model_name = "musicgen-eulogy"

print(f"Your fine-tuned model:")
print(f"  Owner: {model_owner}")
print(f"  Name: {model_name}")
print(f"  Version ID: {model_version_id}")
print(f"\nFull identifier: {model_owner}/{model_name}:{model_version_id}")


Your fine-tuned model:
  Owner: sundai-club
  Name: musicgen-eulogy
  Version ID: 93ed0f8d7560876afd2a087263a4a788716c36e59de91f83130e700fedc7b2e3

Full identifier: sundai-club/musicgen-eulogy:93ed0f8d7560876afd2a087263a4a788716c36e59de91f83130e700fedc7b2e3


## 4.B Test Your Fine-Tuned Model

In [None]:
# Test generating music with your fine-tuned model
model_identifier = f"{model_owner}/{model_name}:{model_version_id}"

test_prompt = "Haunting dark ambient synthesizer, ethereal and mysterious rendition of the chorus from Hozier's 'Too Sweet'."

try:
    output = replicate.run(
        model_identifier,
        input={
            "prompt": test_prompt,
            "duration": 33,
        }
    )

    print("✓ Generation successful!")
    print(f"Prompt: {test_prompt}")
    print(f"Output: {output}")
    print("\n🎵 Your fine-tuned model works! Update the webapp with:")
    print(f'  model_identifier: "{model_identifier}"')
except Exception as e:
    print(f"✗ Generation failed: {e}")
    print("Make sure the model_version_id is correct.")

# 5. UPDATE WHISPER SYNTH WEBAPP

## 5.A Update lib/replicate-client.ts

Replace the `generateInstrumentalMusic` function with your fine-tuned model:

```typescript
export async function generateInstrumentalMusic(
  prompt: string,
  melodySpeechUrl?: string
): Promise<any> {
  try {
    const input: any = {
      prompt: prompt,
      duration: 33,
      output_format: 'mp3',
    }

    const prediction = await replicate.predictions.create({
      // YOUR FINE-TUNED MODEL HERE:
      version: "<YOUR_MODEL_VERSION_ID>",  // e.g., "abc123..."
      input: input,
    })

    return prediction
  } catch (error) {
    console.error('Error generating instrumental music:', error)
    throw new Error('Failed to generate instrumental music')
  }
}
```

Replace `<YOUR_MODEL_VERSION_ID>` with your actual version ID from Step 4.A.
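
The Python cells in Step 4 work with the full `owner/name:version` identifier, while the TypeScript call above takes only the version hash. A minimal sketch of pulling the pieces apart, assuming the identifier follows that `owner/name:version` shape; `split_identifier` is a hypothetical helper, not part of the Replicate client.

```python
def split_identifier(identifier: str) -> tuple[str, str, str]:
    """Split 'owner/name:version' into (owner, name, version)."""
    model_ref, colon, version = identifier.partition(":")
    owner, slash, name = model_ref.partition("/")
    if not (colon and slash and owner and name and version):
        raise ValueError(f"Expected 'owner/name:version', got {identifier!r}")
    return owner, name, version
```

The third element of the returned tuple is the value to paste in place of `<YOUR_MODEL_VERSION_ID>`.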

## 5.B Update lib/prompt-engineer.ts

Since your model is fine-tuned on Eulogy, simplify the prompts:

```typescript
export function createVocalToInstrumentPrompt(mood: Mood = 'eerie'): string {
  const config = MOODS[mood]

  return `
Dark ambient synthesizer music inspired by Eulogy.
${config.descriptors.join(', ')} atmosphere.
Haunting, ethereal, wordless vocal-like theremin tones.
Pure synthesizers with heavy reverb and delay.
${config.tempo} tempo.
${config.intensity}.
  `.trim()
}
```

## 5.C Redeploy Webapp

After updating the code:
```bash
cd /Users/quileesimeon/sundai_01252026/whisper-synth
npm run build
npm run dev  # Test locally
# Then push to Vercel
```

# 6. SUMMARY

You've successfully:

1. ✓ Downloaded "Eulogy" audio
2. ✓ Chunked it into 33-second segments
3. ✓ Created proper metadata descriptions
4. ✓ Packaged into a ZIP dataset
5. ✓ Created a model repository on Replicate
6. ✓ Fine-tuned MusicGen on Eulogy's sound
7. ✓ Tested the trained model
8. ✓ Updated Whisper Synth to use the custom model

Your Whisper Synth webapp now generates music in the style of "Eulogy" instead of generic "dark ambient"!

## Next Steps

- Iterate on prompts to get the best results
- Try different mood descriptors
- Share with friends!
- Consider fine-tuning with more Stranger Things tracks for even better results