## Download Models Manually

To download all models at once without running inference, run the download cell that follows the model lists below.

## Model Downloads

The following models will be auto-downloaded on first run:
- **Plachta/Seed-VC**: Main voice conversion models (~1GB)
- **funasr/campplus**: Speaker embedding model
- **nvidia/bigvgan_v2_22khz_80band_256x**: Vocoder for speech
- **nvidia/bigvgan_v2_44khz_128band_512x**: Vocoder for singing
- **lj1995/VoiceConversionWebUI**: F0 (pitch) extraction
- **openai/whisper-small** or **openai/whisper-base**: Speech tokenizer

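Files fetched with `cache_dir="./checkpoints"` are cached in `./checkpoints/`; models loaded through `from_pretrained` (BigVGAN, Whisper) go to the default Hugging Face cache under `~/.cache/huggingface/`.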
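The Hugging Face cache path can move if certain environment variables are set. The sketch below resolves the expected Hub cache directory, assuming the precedence described in the `huggingface_hub` documentation (`HF_HUB_CACHE`, then `HF_HOME`, then `XDG_CACHE_HOME`, then `~/.cache`). The helper name `resolve_hf_hub_cache` is hypothetical, and the function only mirrors that precedence; it does not call the library.

```python
import os
from pathlib import Path


def resolve_hf_hub_cache(env=None):
    """Return the directory the Hub cache is expected to use (hypothetical helper).

    Assumed precedence: HF_HUB_CACHE > HF_HOME/hub > XDG_CACHE_HOME/huggingface/hub
    > ~/.cache/huggingface/hub.
    """
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    if env.get("HF_HOME"):
        hf_home = Path(env["HF_HOME"]).expanduser()
    else:
        cache_root = env.get("XDG_CACHE_HOME") or "~/.cache"
        hf_home = Path(cache_root).expanduser() / "huggingface"
    return hf_home / "hub"


print("Expected Hub cache:", resolve_hf_hub_cache())
```

Files downloaded with an explicit `cache_dir` argument are not affected by these variables.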

## Model Requirements by Use Case

### For Speech Conversion Only:
✅ **Required:**
- Plachta/Seed-VC base model
- funasr/campplus (speaker embedding)
- nvidia/bigvgan_v2_22khz_80band_256x (vocoder)
- openai/whisper-small (speech features)
- lj1995/VoiceConversionWebUI `rmvpe.pt` (F0 extraction)

❌ **Can Skip:**
- Plachta/Seed-VC F0 model
- nvidia/bigvgan_v2_44khz_128band_512x

### For Singing Voice Conversion:
✅ **All models required** - Cannot skip any

**Total Size:**
- Speech only: ~1.5-2 GB
- Speech + Singing: ~2-3 GB
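The requirements above can be written down as a small lookup, which is handy when deciding what to pre-download. This is only a sketch: the constant and function names are hypothetical, and placing `openai/whisper-base` with the singing extras is an assumption drawn from "all models required", since the lists above do not say whether speech-only setups can skip it.

```python
# Model sets transcribed from the lists above (labels, not exact file names).
SPEECH_MODELS = [
    "Plachta/Seed-VC (base model)",
    "funasr/campplus",
    "nvidia/bigvgan_v2_22khz_80band_256x",
    "openai/whisper-small",
    "lj1995/VoiceConversionWebUI (rmvpe)",
]

# Assumption: singing needs everything, so the remaining models go here.
SINGING_EXTRAS = [
    "Plachta/Seed-VC (F0 model)",
    "nvidia/bigvgan_v2_44khz_128band_512x",
    "openai/whisper-base",
]


def required_models(use_case):
    """Return the model labels needed for 'speech' or 'singing'."""
    if use_case == "speech":
        return list(SPEECH_MODELS)
    if use_case == "singing":
        return SPEECH_MODELS + SINGING_EXTRAS
    raise ValueError(f"unknown use case: {use_case!r}")


for case in ("speech", "singing"):
    print(f"{case}: {len(required_models(case))} models")
```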

In [None]:
# Download all models manually (run this once)
from huggingface_hub import hf_hub_download
import os

os.makedirs("./checkpoints", exist_ok=True)

print("Downloading models...")

# 1. Main Seed-VC models
print("\n1. Downloading Seed-VC base model...")
hf_hub_download(
    repo_id="Plachta/Seed-VC",
    filename="DiT_seed_v2_uvit_whisper_small_wavenet_bigvgan_pruned.pth",
    cache_dir="./checkpoints"
)
hf_hub_download(
    repo_id="Plachta/Seed-VC",
    filename="config_dit_mel_seed_uvit_whisper_small_wavenet.yml",
    cache_dir="./checkpoints"
)

# 2. Seed-VC F0 model (only needed for singing voice conversion)
print("2. Downloading Seed-VC F0 model...")
hf_hub_download(
    repo_id="Plachta/Seed-VC",
    filename="DiT_seed_v2_uvit_whisper_base_f0_44k_bigvgan_pruned_ft_ema.pth",
    cache_dir="./checkpoints"
)
hf_hub_download(
    repo_id="Plachta/Seed-VC",
    filename="config_dit_mel_seed_uvit_whisper_base_f0_44k.yml",
    cache_dir="./checkpoints"
)

# 3. CAMPPlus speaker embedding
print("3. Downloading CAMPPlus model...")
hf_hub_download(
    repo_id="funasr/campplus",
    filename="campplus_cn_common.bin",
    cache_dir="./checkpoints"
)

# 4. RMVPE for F0 extraction
print("4. Downloading RMVPE model...")
hf_hub_download(
    repo_id="lj1995/VoiceConversionWebUI",
    filename="rmvpe.pt",
    cache_dir="./checkpoints"
)

# 5-6. BigVGAN vocoders (these are large!)
# `modules.bigvgan` is a local package, so run this cell from the Seed-VC repository root.
print("5. Downloading BigVGAN 22kHz model...")
from modules.bigvgan import bigvgan
bigvgan_22k = bigvgan.BigVGAN.from_pretrained('nvidia/bigvgan_v2_22khz_80band_256x')

print("6. Downloading BigVGAN 44kHz model...")
bigvgan_44k = bigvgan.BigVGAN.from_pretrained('nvidia/bigvgan_v2_44khz_128band_512x')

# 7. Whisper models (loading them once populates the Hugging Face cache)
print("7. Downloading Whisper models...")
from transformers import WhisperModel
whisper_small = WhisperModel.from_pretrained("openai/whisper-small")
whisper_base = WhisperModel.from_pretrained("openai/whisper-base")

print("\n✓ All models downloaded!")
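The `hf_hub_download` calls in the cell above stop at the first error, which is awkward on a flaky connection. A possible alternative, sketched below, drives the same six files from a manifest and collects failures instead of aborting. The names `MANIFEST` and `download_all` are hypothetical; the download function is passed in as `fetch`, so the loop itself does not depend on network access.

```python
# (repo_id, filename) pairs taken from the download cell above.
MANIFEST = [
    ("Plachta/Seed-VC", "DiT_seed_v2_uvit_whisper_small_wavenet_bigvgan_pruned.pth"),
    ("Plachta/Seed-VC", "config_dit_mel_seed_uvit_whisper_small_wavenet.yml"),
    ("Plachta/Seed-VC", "DiT_seed_v2_uvit_whisper_base_f0_44k_bigvgan_pruned_ft_ema.pth"),
    ("Plachta/Seed-VC", "config_dit_mel_seed_uvit_whisper_base_f0_44k.yml"),
    ("funasr/campplus", "campplus_cn_common.bin"),
    ("lj1995/VoiceConversionWebUI", "rmvpe.pt"),
]


def download_all(manifest, fetch, cache_dir="./checkpoints"):
    """Call fetch() for every manifest entry and return a dict of failures."""
    failures = {}
    for repo_id, filename in manifest:
        try:
            path = fetch(repo_id=repo_id, filename=filename, cache_dir=cache_dir)
            print(f"ok      {repo_id}/{filename} -> {path}")
        except Exception as exc:  # keep going so one bad file does not block the rest
            failures[(repo_id, filename)] = exc
            print(f"FAILED  {repo_id}/{filename}: {exc}")
    return failures


# Usage (needs network access):
# from huggingface_hub import hf_hub_download
# failures = download_all(MANIFEST, hf_hub_download)
```

Rerunning with only the failed entries is then a matter of passing `list(failures)` as the manifest.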

## Check what models are already cached locally


In [None]:
from pathlib import Path

checkpoints_dir = Path("./checkpoints")
hf_cache = Path.home() / ".cache" / "huggingface"

print("Local checkpoints folder:")
if checkpoints_dir.exists():
    # Model weights are stored as .pth, .bin, or .pt files
    for pattern in ("*.pth", "*.bin", "*.pt"):
        for item in checkpoints_dir.rglob(pattern):
            print(f"  {item}")
else:
    print("  (not created yet)")

print(f"\nHugging Face cache: {hf_cache}")
print(f"Cache exists: {hf_cache.exists()}")
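To compare what is on disk with the size estimates given earlier, the cached files can be summed up. The sketch below uses a hypothetical helper, `dir_size_bytes`, and skips symbolic links on the assumption that the Hugging Face cache links snapshot entries to blob files that would otherwise be counted twice.

```python
from pathlib import Path


def dir_size_bytes(root):
    """Total size of regular files under root, ignoring symlinks; 0 if root is missing."""
    root = Path(root)
    if not root.exists():
        return 0
    return sum(
        p.stat().st_size
        for p in root.rglob("*")
        if not p.is_symlink() and p.is_file()
    )


for folder in (Path("./checkpoints"), Path.home() / ".cache" / "huggingface"):
    size_gb = dir_size_bytes(folder) / 1024**3
    print(f"{folder}: {size_gb:.2f} GB")
```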