# MusicGen Remixer - RunPod Jupyter Demo

This notebook demonstrates how to use MusicGen Remixer on RunPod GPU instances without Cog dependencies.

## Features
- Optimized for RunPod GPU instances
- Uses Meta's facebook/musicgen-melody model by default
- No Cog dependencies required
- Vocal separation and beat synchronization
- Text-based music generation with audio conditioning
- File upload/download support

## Environment Detection and Setup

In [None]:
# Check environment
import os
import sys

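# Detect RunPod environment
# RunPod pods typically expose variables such as RUNPOD_POD_ID, so match on the prefix
IN_RUNPOD = os.path.exists('/runpod-volume') or any(k.startswith('RUNPOD') for k in os.environ)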
IN_COLAB = 'google.colab' in sys.modules

print(f"Running in RunPod: {IN_RUNPOD}")
print(f"Running in Google Colab: {IN_COLAB}")
print(f"Python version: {sys.version}")
print(f"Current working directory: {os.getcwd()}")

# Set working directory for RunPod
if IN_RUNPOD:
    # Use persistent volume if available
    if os.path.exists('/runpod-volume'):
        os.chdir('/runpod-volume')
        print("Changed to persistent volume directory")
    else:
        # Use workspace or tmp
        workspace_dir = '/workspace' if os.path.exists('/workspace') else '/tmp'
        os.chdir(workspace_dir)
        print(f"Changed to {workspace_dir}")

print(f"Working directory: {os.getcwd()}")

## Python Dependencies Installation

The cell below verifies the PyTorch installation and reports the CUDA version and available GPUs; it assumes the required Python packages are already installed.

In [None]:
# Check GPU availability
import torch
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA version: {torch.version.cuda}")
    print(f"GPU count: {torch.cuda.device_count()}")
    for i in range(torch.cuda.device_count()):
        print(f"GPU {i}: {torch.cuda.get_device_name(i)}")
        print(f"GPU {i} Memory: {torch.cuda.get_device_properties(i).total_memory / 1024**3:.1f} GB")

## Import Required Modules

In [None]:
import sys
import os
import torch
import torchaudio
import numpy as np
from pathlib import Path
import warnings
warnings.filterwarnings('ignore')

# Add the current working directory to the Python path so standalone_inference can be imported
current_dir = os.getcwd()
sys.path.append(current_dir)

# Import our standalone remixer
try:
    from standalone_inference import MusicGenRemixer
    print("MusicGenRemixer imported successfully!")
except ImportError as e:
    print(f"Error importing MusicGenRemixer: {e}")
    print("Make sure you're in the correct directory and all dependencies are installed.")
    raise  # Stop here so the success message below is not printed after a failed import

# Jupyter display imports
from IPython.display import Audio, display, HTML, clear_output
import ipywidgets as widgets

print("All imports completed successfully!")

## Initialize the Model

Setting up MusicGen Remixer with optimal settings for RunPod

In [None]:
# Initialize the remixer
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

if device == "cuda":
    # Clear CUDA cache
    torch.cuda.empty_cache()
    # total_memory is the card's capacity, not the amount currently in use
    print(f"Total GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")

print("Initializing MusicGenRemixer...")
print("This will download models on first run (may take several minutes)")

remixer = MusicGenRemixer(device=device)

print("Setting up model...")
remixer.setup()

if device == "cuda":
    allocated = torch.cuda.memory_allocated() / 1024**3
    cached = torch.cuda.memory_reserved() / 1024**3
    print(f"GPU Memory allocated: {allocated:.2f} GB")
    print(f"GPU Memory cached: {cached:.2f} GB")

print("\n✅ Model initialization completed successfully!")

## File Upload Interface

Upload your audio file for remixing

In [None]:
# Create upload widget
uploader = widgets.FileUpload(
    accept='.wav,.mp3,.flac,.m4a,.aac',  # Accept common audio formats
    multiple=False,
    description='Upload Audio'
)

upload_info = widgets.HTML(value="<b>Select an audio file to upload (wav, mp3, flac, m4a, aac)</b>")

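# Function to handle file upload
def on_upload_change(change):
    if not uploader.value:
        return

    # ipywidgets 7 stores uploads as {filename: {'content': ...}};
    # ipywidgets 8 stores a tuple of dicts with 'name' and 'content' keys
    if isinstance(uploader.value, dict):
        filename, uploaded_file = next(iter(uploader.value.items()))
    else:
        uploaded_file = uploader.value[0]
        filename = uploaded_file['name']

    # Keep only the base name so the file is written to the current directory
    filename = os.path.basename(filename)

    # Save to local file
    with open(filename, 'wb') as f:
        f.write(uploaded_file['content'])

    upload_info.value = f"<b>✅ Uploaded: {filename}</b>"

    # Display the uploaded audio
    display(HTML(f"<h4>Uploaded Audio: {filename}</h4>"))
    display(Audio(filename))

    # Store filename globally so the generation cell can find it
    globals()['input_file'] = filename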

uploader.observe(on_upload_change, names='value')

display(upload_info)
display(uploader)

print("\n📁 Alternative: You can also place audio files directly in the current directory:")
print(f"Current directory: {os.getcwd()}")

# Show existing audio files
audio_extensions = ['.wav', '.mp3', '.flac', '.m4a', '.aac']
existing_files = [f for f in os.listdir('.') if any(f.lower().endswith(ext) for ext in audio_extensions)]

if existing_files:
    print("\n🎵 Audio files found in current directory:")
    for f in existing_files:
        print(f"  - {f}")
else:
    print("\n⚠️  No audio files found in current directory.")

## Generation Parameters

Configure the music generation settings

In [None]:
# Create parameter widgets
prompt_widget = widgets.Textarea(
    value="electronic dance music, upbeat, energetic",
    placeholder="Describe the style of music you want to generate...",
    description="Prompt:",
    layout=widgets.Layout(width='100%', height='60px')
)

# Advanced parameters
temperature_widget = widgets.FloatSlider(
    value=1.0, min=0.1, max=2.0, step=0.1,
    description="Temperature:",
    tooltip="Controls randomness (lower = more conservative)"
)

guidance_widget = widgets.FloatSlider(
    value=3.0, min=1.0, max=10.0, step=0.5,
    description="Guidance:",
    tooltip="How strongly to follow the prompt"
)

top_k_widget = widgets.IntSlider(
    value=250, min=50, max=500, step=25,
    description="Top-k:",
    tooltip="Number of top tokens to consider"
)

instrumental_widget = widgets.Checkbox(
    value=True,
    description="Generate instrumental version",
    tooltip="Also output version without vocals"
)

seed_widget = widgets.IntText(
    value=42,
    description="Seed:",
    tooltip="Random seed for reproducibility (0 for random)"
)

# Display widgets
display(HTML("<h3>Generation Parameters</h3>"))
display(widgets.VBox([
    prompt_widget,
    widgets.HBox([temperature_widget, guidance_widget]),
    widgets.HBox([top_k_widget, seed_widget]),
    instrumental_widget
]))

print("\n💡 Tips:")
print("- Be specific in your prompt (genre, instruments, mood)")
print("- Lower temperature for more conservative generation")
print("- Higher guidance for stronger prompt adherence")
print("- Use seed=0 for random results each time")

## Generate Music Remix

Run the remixing process

In [None]:
# Check if input file exists
if 'input_file' not in globals():
    # Try to find an audio file in the directory
    audio_files = [f for f in os.listdir('.') if any(f.lower().endswith(ext) for ext in ['.wav', '.mp3', '.flac', '.m4a', '.aac'])]
    
    if audio_files:
        input_file = audio_files[0]
        print(f"Using first audio file found: {input_file}")
    else:
        print("❌ No input file specified. Please upload a file first or place an audio file in the current directory.")
        input_file = None

if input_file and os.path.exists(input_file):
    # Get parameters from widgets
    prompt = prompt_widget.value
    temperature = temperature_widget.value
    guidance = guidance_widget.value
    top_k = top_k_widget.value
    return_instrumental = instrumental_widget.value
    seed = seed_widget.value if seed_widget.value != 0 else None
    
    output_path = "remix_output.wav"
    
    print(f"🎵 Input: {input_file}")
    print(f"📝 Prompt: '{prompt}'")
    print(f"🎛️  Parameters: temp={temperature}, guidance={guidance}, top_k={top_k}")
    print(f"🎚️  Instrumental: {return_instrumental}, Seed: {seed}")
    print("\n🚀 Starting generation process...")
    print("⏱️  This may take 2-5 minutes depending on file length and GPU...")
    
    # Clear GPU cache before generation
    if device == "cuda":
        torch.cuda.empty_cache()
    
    try:
        # Generate the remix
        output_files = remixer.remix_music(
            prompt=prompt,
            music_input_path=input_file,
            output_path=output_path,
            multi_band_diffusion=False,  # Keep False for stability
            normalization_strategy="loudness",
            beat_sync_threshold=None,  # Auto-detect
            top_k=int(top_k),
            top_p=0.0,
            temperature=float(temperature),
            classifier_free_guidance=float(guidance),
            return_instrumental=return_instrumental,
            seed=seed,
        )
        
        print("\n✅ Generation completed successfully!")
        
        # Display generated audio files
        for i, file_path in enumerate(output_files):
            if os.path.exists(file_path):
                file_size = os.path.getsize(file_path) / (1024 * 1024)  # MB
                label = "🎵 Full Mix" if i == 0 else "🎼 Instrumental"
                print(f"\n{label}: {file_path} ({file_size:.1f} MB)")
                display(Audio(file_path))
        
        # Show additional generated files
        additional_files = ["background_synced.wav", "background.wav", "input_vocal.wav"]
        print("\n📁 Additional generated files:")
        for file_path in additional_files:
            if os.path.exists(file_path):
                file_size = os.path.getsize(file_path) / (1024 * 1024)  # MB
                print(f"  ✓ {file_path} ({file_size:.1f} MB)")
            else:
                print(f"  - {file_path} (not generated)")
        
        # GPU memory info
        if device == "cuda":
            allocated = torch.cuda.memory_allocated() / 1024**3
            print(f"\n💾 GPU Memory used: {allocated:.2f} GB")
        
    except Exception as e:
        print(f"\n❌ Error during generation: {e}")
        import traceback
        print("\n📋 Detailed error:")
        traceback.print_exc()
        
        # Clear GPU cache on error
        if device == "cuda":
            torch.cuda.empty_cache()

else:
    print(f"❌ Input file not found: {input_file or 'Not specified'}")
    print("Please upload a file using the widget above or place an audio file in the current directory.")

## File Management

Download generated files and manage workspace

In [None]:
# List all generated files
print("📁 Current directory contents:")
!ls -lh *.wav *.mp3 2>/dev/null || echo "No audio files found"

# Create download links for RunPod
if IN_RUNPOD:
    print("\n💾 Files available for download in RunPod:")
    audio_files = [f for f in os.listdir('.') if f.endswith(('.wav', '.mp3'))]
    for file in audio_files:
        if os.path.exists(file):
            size_mb = os.path.getsize(file) / (1024 * 1024)
            print(f"  📄 {file} ({size_mb:.1f} MB)")
    
    print("\n📋 To download files:")
    print("1. Right-click on files in the file browser")
    print("2. Select 'Download'")
    print("3. Or use the file manager in the RunPod interface")

# Archive function for batch download
def create_archive():
    import zipfile
    from datetime import datetime
    
    # Create timestamp
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    archive_name = f"musicgen_outputs_{timestamp}.zip"
    
    # Find all audio files
    audio_files = [f for f in os.listdir('.') if f.endswith(('.wav', '.mp3')) and os.path.exists(f)]
    
    if audio_files:
        with zipfile.ZipFile(archive_name, 'w', zipfile.ZIP_DEFLATED) as zipf:
            for file in audio_files:
                zipf.write(file)
                print(f"Added {file} to archive")
        
        archive_size = os.path.getsize(archive_name) / (1024 * 1024)
        print(f"\n✅ Created archive: {archive_name} ({archive_size:.1f} MB)")
        return archive_name
    else:
        print("❌ No audio files found to archive")
        return None

# Create archive button
archive_button = widgets.Button(
    description="📦 Create ZIP Archive",
    button_style='info',
    tooltip="Create a ZIP file with all generated audio files"
)

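# Output widget that captures the messages printed by create_archive()
# (Button widgets have no `container` attribute)
archive_output = widgets.Output()

def on_archive_click(b):
    with archive_output:
        archive_name = create_archive()
        if archive_name:
            archive_button.description = "✅ Archive Created"

archive_button.on_click(on_archive_click)
display(archive_button, archive_output)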

## Advanced Usage Examples

Try different styles and parameters

In [None]:
# Advanced generation function
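def generate_with_style(input_file, prompt, style_name, **kwargs):
    """Generate music with specific style parameters."""
    # The presets store the text under the key 'prompt', so accept it by that name;
    # otherwise **style_presets[...] would raise a TypeError for a missing argument
    style_prompt = prompt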
    if not os.path.exists(input_file):
        print(f"❌ Input file not found: {input_file}")
        return None
        
    output_path = f"{style_name}_remix.wav"
    
    default_params = {
        'multi_band_diffusion': False,
        'normalization_strategy': 'loudness',
        'top_k': 250,
        'top_p': 0.0,
        'temperature': 1.0,
        'classifier_free_guidance': 3,
        'return_instrumental': False,
        'seed': 42,  # Fixed seed for reproducibility
    }
    
    # Update with custom parameters
    default_params.update(kwargs)
    
    print(f"\n🎵 Generating {style_name} style")
    print(f"📝 Prompt: '{style_prompt}'")
    print(f"🎛️  Custom params: {kwargs}")
    
    # Clear GPU cache
    if device == "cuda":
        torch.cuda.empty_cache()
    
    try:
        output_files = remixer.remix_music(
            prompt=style_prompt,
            music_input_path=input_file,
            output_path=output_path,
            **default_params
        )
        
        if os.path.exists(output_path):
            file_size = os.path.getsize(output_path) / (1024 * 1024)
            print(f"✅ {style_name} style generated: {output_path} ({file_size:.1f} MB)")
            display(Audio(output_path))
            return output_path
    except Exception as e:
        print(f"❌ Error generating {style_name}: {e}")
        # Clear GPU cache on error
        if device == "cuda":
            torch.cuda.empty_cache()
    
    return None

# Style presets
style_presets = {
    "jazz": {
        "prompt": "smooth jazz, saxophone, piano, mellow, sophisticated",
        "temperature": 0.8,
        "classifier_free_guidance": 4.0
    },
    "rock": {
        "prompt": "hard rock, electric guitar, drums, energetic, powerful",
        "temperature": 1.2,
        "classifier_free_guidance": 3.5
    },
    "classical": {
        "prompt": "classical orchestra, symphonic, elegant, orchestral",
        "temperature": 0.6,
        "classifier_free_guidance": 5.0
    },
    "electronic": {
        "prompt": "electronic dance music, synthesizer, bass, upbeat",
        "temperature": 1.0,
        "classifier_free_guidance": 3.0
    },
    "ambient": {
        "prompt": "ambient, atmospheric, peaceful, ethereal, calming",
        "temperature": 0.7,
        "classifier_free_guidance": 4.5
    }
}

print("🎨 Available style presets:")
for style, params in style_presets.items():
    print(f"  • {style}: {params['prompt']}")

# Example usage (uncomment to try)
print("\n💡 To generate different styles, uncomment and run:")
print("# jazz_output = generate_with_style(input_file, **style_presets['jazz'], style_name='jazz')")
print("# rock_output = generate_with_style(input_file, **style_presets['rock'], style_name='rock')")

In [None]:
# Uncomment to try different styles (make sure input_file is defined)

# Generate Jazz version
# if 'input_file' in globals() and os.path.exists(input_file):
#     jazz_output = generate_with_style(input_file, **style_presets['jazz'], style_name='jazz')

# Generate Electronic version
# if 'input_file' in globals() and os.path.exists(input_file):
#     electronic_output = generate_with_style(input_file, **style_presets['electronic'], style_name='electronic')

## System Information and Troubleshooting

In [None]:
# System information
print("🖥️  System Information:")
print(f"Python: {sys.version}")
print(f"PyTorch: {torch.__version__}")
print(f"CUDA Available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA Version: {torch.version.cuda}")
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    total_memory = torch.cuda.get_device_properties(0).total_memory / 1024**3
    allocated_memory = torch.cuda.memory_allocated() / 1024**3
    cached_memory = torch.cuda.memory_reserved() / 1024**3
    print(f"GPU Memory: {allocated_memory:.2f} GB / {total_memory:.2f} GB (allocated)")
    print(f"GPU Memory: {cached_memory:.2f} GB (cached)")

print(f"\n📂 Working Directory: {os.getcwd()}")
print(f"💾 Disk Space:")
!df -h .

# Memory cleanup function
def cleanup_memory():
    """Clean up GPU and system memory"""
    import gc
    
    # Python garbage collection
    gc.collect()
    
    # CUDA memory cleanup
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        torch.cuda.synchronize()
    
    print("🧹 Memory cleanup completed")

# Cleanup button
cleanup_button = widgets.Button(
    description="🧹 Cleanup Memory",
    button_style='warning',
    tooltip="Clear GPU and system memory"
)

cleanup_button.on_click(lambda b: cleanup_memory())
display(cleanup_button)

print("\n🔧 Troubleshooting Tips:")
print("• If generation fails, try reducing temperature or top_k values")
print("• For out-of-memory errors, click 'Cleanup Memory' and try again")
print("• Shorter input files generally process faster")
print("• Multi-band diffusion requires more GPU memory")
print("• Check that your input file is a valid audio format")

## Notes and Documentation

### RunPod-Specific Features
- **Persistent Storage**: Files saved to `/runpod-volume` persist between sessions
- **GPU Optimization**: Automatic GPU detection and memory management
- **File Management**: Built-in file browser for easy file handling
- **Network Access**: Full internet access for model downloads

### Model Information
- Uses Meta's `facebook/musicgen-melody` model by default
- Supports both text and melody conditioning
- Generated audio is at 32kHz sample rate
- Model size: ~3.3GB (downloaded on first run)
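
As a rough guide to the file sizes printed by the notebook, the size of uncompressed PCM audio follows directly from the sample rate. The sketch below assumes 16-bit samples and ignores the small WAV header; the helper name `pcm_size_mb`, the channel count, and the bit depth are assumptions for illustration, and the actual output files may differ.

```python
SAMPLE_RATE_HZ = 32_000  # MusicGen output sample rate stated above


def pcm_size_mb(duration_seconds, channels=1, bytes_per_sample=2):
    """Approximate size of uncompressed PCM audio in MB (bytes / 1024**2, header excluded)."""
    total_bytes = duration_seconds * SAMPLE_RATE_HZ * channels * bytes_per_sample
    return total_bytes / (1024 * 1024)


# Assumed mono, 16-bit output; adjust channels/bytes_per_sample for other formats
for seconds in (30, 60, 120):
    print(f"{seconds:>3} s mono 16-bit: {pcm_size_mb(seconds):.2f} MB")
```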

### Performance Tips
1. **GPU Memory**: 8GB+ VRAM recommended for optimal performance
2. **Input Length**: Shorter files (30-60 seconds) process faster
3. **Batch Processing**: Generate multiple styles one after another in the same session so the model is loaded only once
4. **Memory Management**: Use cleanup function between generations
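
Tips 3 and 4 can be combined into a small loop that generates styles one at a time and runs a cleanup step after each attempt, so a failure in one style does not block the rest. This is a sketch only: `run_batch`, `fake_generate`, and `fake_cleanup` are hypothetical stand-ins. In the notebook you would pass a function that wraps `generate_with_style`, together with the `cleanup_memory` function defined in the troubleshooting cell.

```python
def run_batch(style_names, generate, cleanup):
    """Generate each style in turn, cleaning up after every attempt.

    Returns a dict mapping style name to the output path, or None on failure.
    """
    results = {}
    for name in style_names:
        try:
            results[name] = generate(name)
        except Exception as error:  # keep going if one style fails
            print(f"{name} failed: {error}")
            results[name] = None
        finally:
            cleanup()  # release memory before the next style
    return results


# Stand-ins for demonstration (hypothetical, no GPU required)
cleanup_calls = []


def fake_generate(name):
    if name == "rock":
        raise RuntimeError("simulated out-of-memory error")
    return f"{name}_remix.wav"


def fake_cleanup():
    cleanup_calls.append("cleanup")


outputs = run_batch(["jazz", "rock", "ambient"], fake_generate, fake_cleanup)
print(outputs)
```

Running cleanup in a `finally` clause means it is called whether or not the generation step raised.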

### Advanced Parameters
- **Temperature**: Controls randomness (0.1-2.0, default: 1.0)
- **Guidance**: Prompt adherence strength (1.0-10.0, default: 3.0)
- **Top-k**: Token selection diversity (50-500, default: 250)
- **Seed**: Reproducibility control (0 for random)
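
To make the sampling parameters above concrete, here is a minimal pure-Python sketch of how temperature, top-k, and classifier-free guidance are commonly applied to a vector of logits before a token is drawn. The function names (`apply_guidance`, `sample_top_k`) and the toy logits are illustrative assumptions, not code from MusicGen Remixer or audiocraft, whose implementation works on tensors.

```python
import math
import random


def apply_guidance(cond_logits, uncond_logits, guidance):
    """Classifier-free guidance: move logits away from the unconditional prediction."""
    return [u + guidance * (c - u) for c, u in zip(cond_logits, uncond_logits)]


def sample_top_k(logits, top_k, temperature, rng):
    """Draw one token index from the top_k highest logits after temperature scaling."""
    # Keep only the indices of the top_k largest logits
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    kept = ranked[:top_k]

    # Temperature rescales the logits: values below 1.0 sharpen the distribution
    scaled = [logits[i] / temperature for i in kept]

    # Numerically stable softmax weights over the kept logits
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    # Sample one of the kept indices in proportion to its weight
    return rng.choices(kept, weights=weights, k=1)[0]


# Toy example (hypothetical values): four candidate tokens
cond = [2.0, 1.0, 0.5, -1.0]
uncond = [1.0, 1.0, 1.0, 1.0]
guided = apply_guidance(cond, uncond, guidance=3.0)

rng = random.Random(42)  # fixed seed makes the draw reproducible
token = sample_top_k(guided, top_k=2, temperature=1.0, rng=rng)
print(f"guided logits: {guided}")
print(f"sampled token index: {token}")
```

With `top_k=1` the function always returns the highest-scoring token, which is the most conservative setting; larger `top_k` and higher temperature both widen the set of tokens that can realistically be drawn.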

### Known Limitations
- Generation time scales with input length
- Multi-band diffusion may require 16GB+ VRAM
- Very long files (>2 minutes) may cause memory issues
- Some audio formats may need conversion to WAV
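
Because very long inputs may exhaust memory, it can help to check a file's duration before starting a run. The sketch below reads the duration of a PCM WAV file with Python's standard `wave` module. The helper names `wav_duration_seconds` and `is_within_limit` are hypothetical, and the 120-second threshold is simply the 2-minute figure from the list above; other formats such as MP3 would need a different reader or a conversion to WAV first.

```python
import os
import tempfile
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_within_limit(path, max_seconds=120.0):
    """True if the file is no longer than max_seconds (assumed threshold)."""
    return wav_duration_seconds(path) <= max_seconds


# Demo: write 2 seconds of 16-bit mono silence at 32 kHz, then measure it
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as wav_file:
    wav_file.setnchannels(1)
    wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit
    wav_file.setframerate(32000)
    wav_file.writeframes(b"\x00\x00" * 32000 * 2)

duration = wav_duration_seconds(demo_path)
print(f"duration: {duration:.1f} s, within limit: {is_within_limit(demo_path)}")
```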

### Support
- Repository: https://github.com/sakemin/cog-musicgen-remixer
- MusicGen: https://huggingface.co/facebook/musicgen-melody
- RunPod Documentation: https://docs.runpod.io/