# 🎵 Stem Separation with SoundLab

This notebook demonstrates how to use SoundLab to separate a song into individual stems (vocals, drums, bass, other) using the Demucs model.

**What you'll learn:**
- Loading audio files
- Configuring the stem separator
- Running separation with different models
- Comparing two models with an A/B test
- Saving and exporting stems

## Setup

First, let's install SoundLab and import the necessary modules.

In [None]:
# Install SoundLab (uncomment if running in Colab)
# !pip install "soundlab[separation]"

from pathlib import Path

from soundlab.io import load_audio, save_audio
from soundlab.separation import DemucsModel, SeparationConfig, StemSeparator

print("✅ SoundLab imported successfully!")

## 1. Load Your Audio

Load a song you want to separate. SoundLab supports WAV, MP3, FLAC, and other common formats.

In [None]:
# @title Upload or specify audio path
# @markdown Either upload a file or specify a path to an existing audio file.

AUDIO_PATH = "your_song.wav"  # @param {type: "string"}

# For Colab: uncomment to upload
# from google.colab import files
# uploaded = files.upload()
# AUDIO_PATH = list(uploaded.keys())[0]

# Load the audio
audio = load_audio(AUDIO_PATH)

print(f"📁 Loaded: {AUDIO_PATH}")
print(f"   Duration: {audio.duration:.2f}s")
print(f"   Sample rate: {audio.sample_rate} Hz")
print(f"   Channels: {audio.channels}")
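
A wrong path or an unexpected file type is easier to diagnose before `load_audio` is called. The sketch below is a minimal pre-flight check. `check_audio_path` and `KNOWN_EXTENSIONS` are hypothetical names, not part of SoundLab, and the extension set only covers the three formats named above.

```python
from pathlib import Path

# Assumption: only the formats named in this notebook; SoundLab may accept more.
KNOWN_EXTENSIONS = {".wav", ".mp3", ".flac"}


def check_audio_path(path):
    """Return the path as a Path object, or raise if the file is missing.

    Hypothetical helper (not part of SoundLab). An unlisted extension only
    triggers a warning, because other formats may still load.
    """
    p = Path(path).expanduser()
    if not p.is_file():
        raise FileNotFoundError(f"Audio file not found: {p}")
    if p.suffix.lower() not in KNOWN_EXTENSIONS:
        print(f"Warning: {p.suffix or 'no extension'} is not in {sorted(KNOWN_EXTENSIONS)}")
    return p
```

Calling `check_audio_path(AUDIO_PATH)` just before `load_audio(AUDIO_PATH)` turns a missing file into an immediate, readable error.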

## 2. Configure the Separator

SoundLab uses Demucs for stem separation. You can choose from several model variants:

| Model | Description | Quality | Speed |
|-------|-------------|---------|-------|
| `htdemucs` | Hybrid Transformer Demucs | Good | Fast |
| `htdemucs_ft` | Fine-tuned `htdemucs` (default) | Better | Slowest (about 4x `htdemucs`) |
| `htdemucs_6s` | 6-stem model (adds piano, guitar; experimental) | Good (piano stem is weaker) | Similar to `htdemucs` |
| `mdx_extra` | MDX architecture | Good | Fast |
| `mdx_extra_q` | Quantized `mdx_extra` (smaller download) | Good (can be slightly lower) | Similar to `mdx_extra` |
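
The models differ in which stems they return, which matters later when picking a stem to compare or save. The sketch below lists the stems to expect. The names follow the upstream Demucs models; whether SoundLab uses the same keys in `result.stems` is an assumption, and `expected_stems` is a hypothetical helper.

```python
# Stem names follow the upstream Demucs models; the keys SoundLab uses in
# result.stems are an assumption here.
FOUR_STEMS = ["drums", "bass", "other", "vocals"]
SIX_STEMS = FOUR_STEMS + ["guitar", "piano"]


def expected_stems(model_name, two_stems=None):
    """Hypothetical helper: list the stems a given model should produce."""
    stems = SIX_STEMS if model_name == "htdemucs_6s" else FOUR_STEMS
    if two_stems is None:
        return list(stems)
    if two_stems not in stems:
        raise ValueError(f"{two_stems!r} is not a stem of {model_name}")
    # Demucs names the remainder "no_<stem>"; SoundLab may label it differently.
    return [two_stems, f"no_{two_stems}"]


print(expected_stems("htdemucs_6s"))
```

Comparing this list with `list(result.stems.keys())` after a run is a quick way to confirm which naming SoundLab actually uses.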

In [None]:
# @title Separation Configuration
# @markdown Configure the stem separation parameters.

MODEL = (
    "htdemucs_ft"  # @param ["htdemucs", "htdemucs_ft", "htdemucs_6s", "mdx_extra", "mdx_extra_q"]
)
TWO_STEMS = False  # @param {type: "boolean"}
SEGMENT = 40  # @param {type: "slider", min: 10, max: 60, step: 5}
OVERLAP = 0.25  # @param {type: "slider", min: 0.1, max: 0.5, step: 0.05}
SHIFTS = 1  # @param {type: "slider", min: 0, max: 5, step: 1}

# Create configuration
config = SeparationConfig(
    model=DemucsModel(MODEL),
    two_stems="vocals" if TWO_STEMS else None,
    segment=SEGMENT,
    overlap=OVERLAP,
    shifts=SHIFTS,
)

print("🎛️ Configuration:")
print(f"   Model: {config.model.value}")
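# htdemucs_6s returns six stems, so do not label it as four-stem
stem_mode = "Six-stem" if MODEL == "htdemucs_6s" else "Four-stem"
print(f"   Mode: {'Two-stem (vocals + accompaniment)' if TWO_STEMS else stem_mode}")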
print(f"   Segment: {config.segment}s")
print(f"   Overlap: {config.overlap * 100:.0f}%")
print(f"   Shifts: {config.shifts}")
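
`segment`, `overlap`, and `shifts` together determine how much work a run involves. The sketch below gives a rough estimate, assuming SoundLab chunks audio the way upstream Demucs does: a new chunk starts every `segment * (1 - overlap)` seconds, and each shift repeats every chunk. `estimate_passes` is a hypothetical helper, and SoundLab's actual scheduling may differ.

```python
import math


def estimate_passes(duration_s, segment_s, overlap, shifts):
    """Rough count of model forward passes for one track.

    Assumes Demucs-style chunking: a new chunk starts every
    segment_s * (1 - overlap) seconds, and each shift repeats every chunk.
    Hypothetical helper; SoundLab may schedule work differently.
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    stride_s = segment_s * (1 - overlap)
    chunks = max(1, math.ceil(duration_s / stride_s))
    return chunks * max(1, shifts)  # shifts=0 still needs one pass


# A 3-minute track with the values set in the cell above
print(estimate_passes(180, 40, 0.25, 1))  # 6
```

Under these assumptions, raising `overlap` or `shifts` increases the pass count, which is why both trade speed for quality.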

## 3. Run Separation

Now let's separate the audio into stems. This may take a few minutes depending on the song length and your hardware.

In [None]:
# Create the separator
separator = StemSeparator(config)

# Run separation
print("🎶 Separating stems...")
result = separator.separate(audio)

print("\n✅ Separation complete!")
print(f"   Stems: {list(result.stems.keys())}")
print(f"   Processing time: {result.processing_time:.2f}s")
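
Raw processing time is hard to compare across songs of different lengths. One common normalization is the real-time factor: seconds of audio processed per second of wall-clock time. The helper below is a hypothetical sketch and the numbers in the example are made up.

```python
def realtime_factor(audio_duration_s, processing_time_s):
    """Seconds of audio processed per second of wall-clock time."""
    if processing_time_s <= 0:
        raise ValueError("processing_time_s must be positive")
    return audio_duration_s / processing_time_s


# Example with made-up numbers: a 240 s song separated in 60 s
print(f"{realtime_factor(240.0, 60.0):.1f}x real time")  # 4.0x real time
```

In this notebook the call would be `realtime_factor(audio.duration, result.processing_time)`, using the two attributes printed in the cells above.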

## 4. Preview the Stems

Let's listen to each separated stem.

In [None]:
from IPython.display import Audio, display

for stem_name, stem_audio in result.stems.items():
    print(f"\n🎧 {stem_name.upper()}")
    display(Audio(stem_audio.samples.T, rate=stem_audio.sample_rate))

## 5. A/B Comparison: htdemucs vs htdemucs_ft

Let's compare the quality between two different models to see which works better for your audio.

In [None]:
# @title A/B Model Comparison
# @markdown Compare separation quality between two models.

MODEL_A = "htdemucs"  # @param ["htdemucs", "htdemucs_ft", "mdx_extra"]
MODEL_B = "htdemucs_ft"  # @param ["htdemucs", "htdemucs_ft", "mdx_extra"]
COMPARE_STEM = "vocals"  # @param ["vocals", "drums", "bass", "other"]

print(f"🔬 Comparing {MODEL_A} vs {MODEL_B} on {COMPARE_STEM}...\n")

# Separate with Model A
config_a = SeparationConfig(model=DemucsModel(MODEL_A))
sep_a = StemSeparator(config_a)
result_a = sep_a.separate(audio)

# Separate with Model B
config_b = SeparationConfig(model=DemucsModel(MODEL_B))
sep_b = StemSeparator(config_b)
result_b = sep_b.separate(audio)

print("\n📊 Results:")
print(f"   {MODEL_A}: {result_a.processing_time:.2f}s")
print(f"   {MODEL_B}: {result_b.processing_time:.2f}s")

In [None]:
# Listen to the comparison
from IPython.display import HTML, Audio, display

stem_a = result_a.stems[COMPARE_STEM]
stem_b = result_b.stems[COMPARE_STEM]

display(HTML(f"<h4>🅰️ {MODEL_A} - {COMPARE_STEM}</h4>"))
display(Audio(stem_a.samples.T, rate=stem_a.sample_rate))

display(HTML(f"<h4>🅱️ {MODEL_B} - {COMPARE_STEM}</h4>"))
display(Audio(stem_b.samples.T, rate=stem_b.sample_rate))
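
Listening is the real test, but a number can show how far apart the two outputs are. The sketch below computes the level of the difference signal relative to stem A, in dB (a simple null test). It assumes both stems are NumPy arrays of identical shape, which the `.T` above suggests but does not guarantee. `rms` and `residual_db` are hypothetical helpers.

```python
import numpy as np


def rms(x):
    """Root-mean-square level of an array of samples."""
    x = np.asarray(x, dtype=np.float64)
    return float(np.sqrt(np.mean(np.square(x))))


def residual_db(a, b, eps=1e-12):
    """Level of (a - b) relative to a, in dB. Lower means more similar."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    return 20.0 * float(np.log10((rms(a - b) + eps) / (rms(a) + eps)))


# Synthetic check: a tone versus the same tone at half amplitude
t = np.linspace(0.0, 1.0, 44100, endpoint=False)
tone = np.sin(2 * np.pi * 440.0 * t)
print(f"{residual_db(tone, 0.5 * tone):.1f} dB")  # -6.0 dB
```

With real results the call would be `residual_db(stem_a.samples, stem_b.samples)`. A very negative value means the models largely agree on that stem; it does not say which one sounds better.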

## 6. Save the Stems

Finally, let's save the separated stems to disk.

In [None]:
# @title Save Stems
# @markdown Configure output settings.

OUTPUT_DIR = "stems_output"  # @param {type: "string"}
OUTPUT_FORMAT = "wav"  # @param ["wav", "mp3", "flac"]

# Create output directory
output_path = Path(OUTPUT_DIR)
output_path.mkdir(parents=True, exist_ok=True)

# Save each stem
saved_files = []
for stem_name, stem_audio in result.stems.items():
    filename = output_path / f"{stem_name}.{OUTPUT_FORMAT}"
    save_audio(stem_audio, filename)
    saved_files.append(filename)
    print(f"💾 Saved: {filename}")

print(f"\n✅ All stems saved to {OUTPUT_DIR}/")
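
The cell above saves only `result`, and a second run with another model would overwrite files of the same name. One way to keep the A/B outputs side by side is a subfolder per model. `stem_path` below is a hypothetical helper that only builds the path; it does not write anything.

```python
from pathlib import Path


def stem_path(output_dir, model_name, stem_name, fmt="wav"):
    """Hypothetical helper: build <output_dir>/<model>/<stem>.<fmt>."""
    return Path(output_dir) / model_name / f"{stem_name}.{fmt.lstrip('.')}"


print(stem_path("stems_output", "htdemucs_ft", "vocals").as_posix())
# stems_output/htdemucs_ft/vocals.wav
```

Create the parent folder with `path.parent.mkdir(parents=True, exist_ok=True)` before passing the path to `save_audio`.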

In [None]:
# For Colab: Download as ZIP
# import shutil
# from google.colab import files
#
# zip_path = shutil.make_archive("stems", "zip", OUTPUT_DIR)
# files.download(zip_path)

## 🎉 Done!

You've successfully separated your audio into stems using SoundLab.

**Next steps:**
- Try the [MIDI Transcription](./midi_transcription.ipynb) notebook to convert stems to MIDI
- Explore the [Voice Conversion](./voice_conversion.ipynb) notebook for TTS and voice cloning
- Check out the [SoundLab Studio](../soundlab_studio.ipynb) for the full pipeline

**Tips:**
- Use `htdemucs_ft` for best vocal quality
- Use `htdemucs_6s` if you need piano/guitar stems
- Increase `shifts` for better quality; processing time grows roughly in proportion to the number of shifts