# 🎵 Audio Classification on Kaggle (FREE GPU!)

Train Mamba, Liquid S4, and V-JEPA2 models on the ESC-50 dataset using Kaggle's free GPUs (Tesla P100 or T4 x2).

## 🚀 Quick Start
1. **Enable GPU**: Settings → Accelerator → GPU T4 x2 (or P100)
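2. **Run the setup cells** below (environment check through model test)
3. **Choose your model** and run only its training cell, since training all three takes longer than one session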

## 📊 Kaggle Resources
- **GPU**: Tesla P100 (16GB memory) or T4 x2 (16GB each)
- **Storage**: 20GB
- **Session**: 9 hours max
- **Weekly GPU**: 30 hours (resets Saturday)

## 🎯 Your Project Stats
- **Dataset**: ESC-50 (~1.3GB)
- **Training time**: 2-5 hours per model
- **Memory usage**: 8-16GB GPU memory, depending on the model
- **Fit**: ✅ Mamba and Liquid S4 train comfortably; V-JEPA2 runs close to the 16GB limit and uses a smaller batch size

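As a rough sanity check on the numbers above, the sketch below packs the three training runs into sessions under the 9-hour limit and compares the total against the 30-hour weekly quota. The per-model hours are the upper ends of the estimates in the Training Options table later in this notebook, and `plan_sessions` is a hypothetical helper, not part of the project.

```python
# Limits copied from the "Kaggle Resources" list above.
SESSION_LIMIT_HOURS = 9
WEEKLY_QUOTA_HOURS = 30

# Upper bound of each estimate in the Training Options table (assumption:
# the real runs do not take longer than estimated).
TRAINING_HOURS = {"mamba": 3, "liquid_s4": 4, "vjepa2": 5}


def plan_sessions(hours_by_model, session_limit):
    """Greedily pack training runs into sessions without exceeding the limit."""
    sessions, current, used = [], [], 0
    for model, hours in hours_by_model.items():
        if hours > session_limit:
            raise ValueError(f"{model} needs {hours}h, over the {session_limit}h limit")
        if used + hours > session_limit:
            # The current session is full, so start a new one.
            sessions.append(current)
            current, used = [], 0
        current.append(model)
        used += hours
    if current:
        sessions.append(current)
    return sessions


sessions = plan_sessions(TRAINING_HOURS, SESSION_LIMIT_HOURS)
total_hours = sum(TRAINING_HOURS.values())
print(f"Sessions needed: {len(sessions)} -> {sessions}")
print(f"Weekly quota used: {total_hours}/{WEEKLY_QUOTA_HOURS} hours")
```

Under these assumptions the three runs do not fit in a single session, which is why the training cells are meant to be run one at a time.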

In [None]:
# Check GPU availability and specs
import torch
import os

print("🔍 Checking Kaggle environment...")
print(f"CUDA available: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    gpu_name = torch.cuda.get_device_name(0)
    gpu_memory = torch.cuda.get_device_properties(0).total_memory / 1e9
    print(f"🎮 GPU: {gpu_name}")
    print(f"💾 GPU Memory: {gpu_memory:.1f} GB")
    
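    # Compare against the per-model requirements in the Training Options table
    if gpu_memory >= 16:
        print("✅ Perfect! This GPU can handle all our models")
    elif gpu_memory >= 12:
        print("✅ Good! This GPU can handle Mamba and Liquid S4 (V-JEPA2 may need a smaller batch size)")
    elif gpu_memory >= 8:
        print("✅ OK! This GPU can handle Mamba")
    else:
        print("⚠️  Limited GPU memory - use smaller batch sizes")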
else:
    print("❌ No GPU detected! Please enable GPU in Settings → Accelerator")

# Check available storage
import shutil
total, used, free = shutil.disk_usage("/kaggle/working")
print(f"💿 Available storage: {free // (1024**3)} GB")

print("\n🎯 Ready to start training!")

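The cell above divides GPU memory by `1e9` (decimal gigabytes) but disk space by `1024**3` (binary gibibytes), so the two figures are not in the same unit. The sketch below shows the difference; the helper names are illustrative and do not come from the project.

```python
import shutil


def bytes_to_gb(n_bytes):
    """Decimal gigabytes (1 GB = 10**9 bytes), as used for the GPU figure."""
    return n_bytes / 1e9


def bytes_to_gib(n_bytes):
    """Binary gibibytes (1 GiB = 1024**3 bytes), as used for the disk figure."""
    return n_bytes / 1024**3


# Exactly 16 GiB of memory comes out larger than 16 when divided by 1e9.
sixteen_gib = 16 * 1024**3
print(f"{bytes_to_gb(sixteen_gib):.2f} GB == {bytes_to_gib(sixteen_gib):.2f} GiB")

# The same two conversions applied to the free space of the current directory.
free_bytes = shutil.disk_usage(".").free
print(f"Free disk: {bytes_to_gb(free_bytes):.1f} GB / {bytes_to_gib(free_bytes):.1f} GiB")
```

Using one unit for both checks makes the printed numbers easier to compare with the limits quoted in this notebook.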

In [None]:
# Install required packages (Kaggle has most pre-installed)
print("📦 Installing additional packages...")

# Install packages not in Kaggle's default environment
# (%pip installs into the environment of the running kernel)
%pip install -q einops wandb

print("✅ Dependencies installed!")

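Since Kaggle ships many packages by default, it can help to check what is actually missing before installing. The following is a minimal sketch using only the standard library; `missing_packages` is a hypothetical helper name.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["einops", "wandb"]
missing = missing_packages(required)

if missing:
    print(f"Packages to install: {', '.join(missing)}")
else:
    print("All required packages are already available")
```

The check only covers top-level import names, so it does not verify versions.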

In [None]:
# Clone your repository
print("📥 Cloning repository...")

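import os

# Replace with your actual GitHub repo URL.
# The existence check and absolute path make this cell safe to re-run.
REPO_DIR = '/kaggle/working/audio-classifier'
if not os.path.exists(REPO_DIR):
    !git clone https://github.com/YOUR_USERNAME/audio-classifier.git {REPO_DIR}

# Change to project directory
os.chdir(REPO_DIR)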

# Initialize git submodules (for external models)
!git submodule update --init --recursive

print("✅ Repository cloned and submodules initialized!")
print(f"📁 Current directory: {os.getcwd()}")

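Before moving on, it may be worth confirming that the clone contains the files the later cells rely on. The sketch below infers the expected paths from the imports and the training command used further down; the `.py` suffixes and the helper `missing_paths` are assumptions about the repository layout, not something this notebook confirms.

```python
import tempfile
from pathlib import Path

# Paths inferred from the imports and the training command used later in this
# notebook. The ".py" suffixes are an assumption about the repository layout.
EXPECTED_PATHS = [
    "src/models/mamba_audio.py",
    "src/models/liquidS4_audio.py",
    "src/models/vjepa_audio.py",
    "configs/model_configs.py",
    "scripts/cloud_train.py",
]


def missing_paths(root, expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under root."""
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]


# Demonstration on a throwaway directory that contains only one of the files.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "scripts" / "cloud_train.py"
    script.parent.mkdir(parents=True)
    script.touch()
    demo_missing = missing_paths(tmp)

print(f"Missing in demo directory: {demo_missing}")
```

In the notebook, calling `missing_paths(os.getcwd())` right after the clone would flag a wrong repository URL or an incomplete checkout before the import cell fails.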

In [None]:
# Download ESC-50 dataset
import os
import pandas as pd

if not os.path.exists('data/ESC-50'):
    print("📥 Downloading ESC-50 dataset...")
    !mkdir -p data
    !cd data && wget -q https://github.com/karolpiczak/ESC-50/archive/master.zip
    !cd data && unzip -q master.zip && mv ESC-50-master ESC-50
    !cd data && rm master.zip
    print("✅ ESC-50 dataset downloaded!")
else:
    print("✅ ESC-50 dataset already exists!")

# Check dataset info
meta_df = pd.read_csv('data/ESC-50/meta/esc50.csv')
print("\n📊 Dataset Info:")
print(f"   Samples: {len(meta_df):,}")
print(f"   Classes: {meta_df['category'].nunique()}")
print("   Size: ~1.3GB")
print(f"   Categories: {', '.join(meta_df['category'].unique()[:8])}...")

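ESC-50's metadata file also has a `fold` column that assigns every clip to one of five predefined folds. Whether `scripts/cloud_train.py` uses these folds is not shown in this notebook, so the following is only a sketch of how a fold-based train/validation split could look. It uses a few made-up rows in place of the real CSV, and `split_by_fold` is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Illustrative rows in the layout of esc50.csv (a subset of its columns).
# These are made-up examples, not real entries from the dataset.
SAMPLE_CSV = """filename,fold,target,category
a.wav,1,0,dog
b.wav,2,0,dog
c.wav,3,1,rooster
d.wav,4,1,rooster
e.wav,5,2,pig
f.wav,1,2,pig
"""


def split_by_fold(rows, val_fold):
    """Hold out one fold for validation and train on the remaining folds."""
    train = [r for r in rows if int(r["fold"]) != val_fold]
    val = [r for r in rows if int(r["fold"]) == val_fold]
    return train, val


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
train_rows, val_rows = split_by_fold(rows, val_fold=1)

print(f"train={len(train_rows)} val={len(val_rows)}")
print("validation classes:", Counter(r["category"] for r in val_rows))
```

With the real file, the same function could be applied to `meta_df.to_dict("records")`.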

In [None]:
# Test model imports and creation
import sys

import torch

sys.path.append('.')

from src.models.mamba_audio import MambaAudioClassifier
from src.models.liquidS4_audio import LiquidS4AudioClassifier
from src.models.vjepa_audio import VJEPA2AudioClassifier
from configs.model_configs import get_mamba_config, get_liquid_s4_config, get_vjepa2_config

print("✅ All imports successful!")

# Test model creation with Kaggle-optimized settings
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"🖥️  Using device: {device}")

# Test each model
models_info = []

print("\n🧪 Testing Mamba model...")
mamba_config = get_mamba_config()
mamba_model = MambaAudioClassifier(**mamba_config, device=device)
mamba_params = sum(p.numel() for p in mamba_model.parameters())
models_info.append(("Mamba", mamba_params, "~2-3 hours"))
print(f"   ✅ Mamba: {mamba_params:,} parameters")

print("\n🧪 Testing Liquid S4 model...")
s4_config = get_liquid_s4_config()
s4_model = LiquidS4AudioClassifier(**s4_config, device=device)
s4_params = sum(p.numel() for p in s4_model.parameters())
models_info.append(("Liquid S4", s4_params, "~3-4 hours"))
print(f"   ✅ Liquid S4: {s4_params:,} parameters")

print("\n🧪 Testing V-JEPA2 model...")
vjepa_config = get_vjepa2_config()
vjepa_model = VJEPA2AudioClassifier(**vjepa_config)
vjepa_params = sum(p.numel() for p in vjepa_model.parameters())
models_info.append(("V-JEPA2", vjepa_params, "~4-5 hours"))
print(f"   ✅ V-JEPA2: {vjepa_params:,} parameters")

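print("\n📋 Model Summary:")
for name, params, est_time in models_info:
    print(f"   {name:12} | {params:>12,} params | {est_time}")

# Free the test models so they do not hold GPU memory during training,
# which runs in a separate process
del mamba_model, s4_model, vjepa_model
if torch.cuda.is_available():
    torch.cuda.empty_cache()

print("\n✅ All models ready for training!")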

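The parameter counts printed above give a lower bound on training memory. A common rule of thumb, sketched below, counts one copy each for weights and gradients plus two optimizer states per parameter (as with Adam), all in 32-bit floats. Activations are excluded and usually dominate, so this is a rough estimate only. The optimizer is an assumption (the training script is not shown here), and the parameter counts in the example are placeholders, not the real model sizes.

```python
def estimate_training_memory_gb(num_params, bytes_per_param=4, optimizer_states=2):
    """Weights + gradients + optimizer states in GB, excluding activations."""
    copies = 1 + 1 + optimizer_states  # weights, gradients, optimizer states
    return num_params * bytes_per_param * copies / 1e9


# Placeholder parameter counts for illustration only.
example_counts = {"example_small": 5_000_000, "example_large": 100_000_000}

for name, count in example_counts.items():
    print(f"{name}: {count:,} params -> {estimate_training_memory_gb(count):.2f} GB")
```

Passing `mamba_params`, `s4_params`, or `vjepa_params` from the cell above gives the corresponding figure for the actual models.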

## 🚀 Training Options

Choose which model to train by running one of the cells below:

| Model | Training Time | GPU Memory | Best For |
|-------|---------------|------------|----------|
| **Mamba** | ~2-3 hours | 8GB | Quick results, efficient |
| **Liquid S4** | ~3-4 hours | 12GB | Good balance |
| **V-JEPA2** | ~4-5 hours | 16GB | Best accuracy |

💡 **Recommendation**: Start with Mamba for fastest results!

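The table can also be read as a simple lookup: given the GPU memory reported by the environment check, which models are expected to fit? The sketch below encodes the table's figures directly. The model keys match the `--model` values used in the training cells, and `models_that_fit` is a hypothetical helper.

```python
# (model key, approximate GPU memory in GB, estimated training time),
# copied from the Training Options table.
MODEL_REQUIREMENTS = [
    ("mamba", 8, "~2-3 hours"),
    ("liquid_s4", 12, "~3-4 hours"),
    ("vjepa2", 16, "~4-5 hours"),
]


def models_that_fit(gpu_memory_gb):
    """Return the model keys whose listed memory need fits in the given GPU."""
    return [name for name, mem_gb, _ in MODEL_REQUIREMENTS if mem_gb <= gpu_memory_gb]


for available in (6, 12, 16):
    print(f"{available} GB -> {models_that_fit(available) or 'none at default settings'}")
```

The memory figures are the notebook's own estimates at the listed batch sizes, so a model just above the threshold may still fit with a smaller batch.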

In [None]:
# 🎯 Train Mamba Model (Recommended - fastest training)
print("🚀 Starting Mamba training...")
print("⏱️  Expected time: 2-3 hours")
print("💾 GPU memory usage: ~8GB")
print("\n" + "="*50)

# Use cloud-optimized training script
!python scripts/cloud_train.py --model mamba --batch_size 16 --epochs 50 --lr 0.001

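print("\n" + "="*50)
# Report success only if the checkpoint was actually written
if os.path.exists("checkpoints/mamba_best.pth"):
    print("✅ Mamba training completed!")
    print("📁 Checkpoint saved in: checkpoints/mamba_best.pth")
else:
    print("❌ checkpoints/mamba_best.pth not found - check the training log above for errors")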

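The `!python` line above does not stop the cell if the script fails. One alternative, sketched here, is to build the same command in Python and check its exit status with `subprocess`. The flags are the ones shown in the cell above. `build_train_command` and `run_training` are hypothetical helpers, and the sketch assumes the script exits with a non-zero status on failure.

```python
import subprocess
import sys


def build_train_command(model, batch_size=16, epochs=50, lr=0.001):
    """Build the argument list for scripts/cloud_train.py with the flags shown above."""
    return [
        sys.executable, "scripts/cloud_train.py",
        "--model", model,
        "--batch_size", str(batch_size),
        "--epochs", str(epochs),
        "--lr", str(lr),
    ]


def run_training(model, **kwargs):
    """Run training and return True only if the script exits with status 0."""
    result = subprocess.run(build_train_command(model, **kwargs))
    return result.returncode == 0


# Only print the command here; call run_training("mamba") to execute it.
print(" ".join(build_train_command("mamba")[1:]))
```

A cell could then print the success message only when `run_training(...)` returns `True`.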

In [None]:
# 🎯 Train Liquid S4 Model
print("🚀 Starting Liquid S4 training...")
print("⏱️  Expected time: 3-4 hours")
print("💾 GPU memory usage: ~12GB")
print("\n" + "="*50)

!python scripts/cloud_train.py --model liquid_s4 --batch_size 16 --epochs 50 --lr 0.001

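print("\n" + "="*50)
# Report success only if the checkpoint was actually written
if os.path.exists("checkpoints/liquid_s4_best.pth"):
    print("✅ Liquid S4 training completed!")
    print("📁 Checkpoint saved in: checkpoints/liquid_s4_best.pth")
else:
    print("❌ checkpoints/liquid_s4_best.pth not found - check the training log above for errors")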


In [None]:
# 🎯 Train V-JEPA2 Model (May need smaller batch size)
print("🚀 Starting V-JEPA2 training...")
print("⏱️  Expected time: 4-5 hours")
print("💾 GPU memory usage: ~16GB")
print("\n" + "="*50)

# Use smaller batch size for V-JEPA2 to fit in GPU memory
!python scripts/cloud_train.py --model vjepa2 --batch_size 8 --epochs 50 --lr 0.001

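print("\n" + "="*50)
# Report success only if the checkpoint was actually written
if os.path.exists("checkpoints/vjepa2_best.pth"):
    print("✅ V-JEPA2 training completed!")
    print("📁 Checkpoint saved in: checkpoints/vjepa2_best.pth")
else:
    print("❌ checkpoints/vjepa2_best.pth not found - check the training log above for errors")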

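Halving the batch size for V-JEPA2 doubles the number of optimizer steps per epoch. The arithmetic below illustrates this, assuming 1,600 training clips (four of ESC-50's five folds; the split the script actually uses is not shown here). It also computes how many micro-batches would need to be accumulated to match a batch of 16. Gradient accumulation is an optional technique named here for illustration, and whether `cloud_train.py` supports it is unknown.

```python
import math

# Assumption: 1,600 training clips, i.e. four of ESC-50's five folds
# (2,000 clips in total). The actual split used by the script is not shown.
TRAIN_CLIPS = 1600
EPOCHS = 50


def steps_per_epoch(num_samples, batch_size):
    """Number of batches per epoch, counting a final partial batch."""
    return math.ceil(num_samples / batch_size)


def accumulation_steps(target_batch_size, micro_batch_size):
    """Micro-batches to accumulate so the effective batch equals the target."""
    if target_batch_size % micro_batch_size != 0:
        raise ValueError("target batch size must be a multiple of the micro-batch size")
    return target_batch_size // micro_batch_size


for batch_size in (16, 8):
    steps = steps_per_epoch(TRAIN_CLIPS, batch_size)
    print(f"batch_size={batch_size}: {steps} steps/epoch, {steps * EPOCHS} steps total")

print(f"Accumulate {accumulation_steps(16, 8)} micro-batches of 8 to match batch size 16")
```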

In [None]:
# 📥 Download trained models and results
import os
from IPython.display import FileLink

print("📦 Preparing downloads...")

# Create downloads directory
!mkdir -p /kaggle/working/downloads

# Download checkpoints
if os.path.exists('checkpoints'):
    print("📁 Zipping checkpoints...")
    !zip -r /kaggle/working/downloads/checkpoints.zip checkpoints/
    print("✅ Checkpoints ready for download!")
    display(FileLink('/kaggle/working/downloads/checkpoints.zip'))

# Download results/logs if they exist
if os.path.exists('results'):
    print("📁 Zipping results...")
    !zip -r /kaggle/working/downloads/results.zip results/
    print("✅ Results ready for download!")
    display(FileLink('/kaggle/working/downloads/results.zip'))

# Show training summary
print("\n🎉 Training Summary:")
print("📊 Models trained on Kaggle's free GPU")
print("⏱️  Total training time for all three models: ~9-12 hours (spread across multiple sessions)")
print("💾 GPU memory used: 8-16GB (V-JEPA2 runs close to the P100's 16GB limit)")
print("📁 Checkpoints zipped and ready for download!")

print("\n🔄 Next steps:")
print("1. Download the checkpoint files above")
print("2. Use the trained models for inference")
print("3. Compare results between Mamba, Liquid S4, and V-JEPA2")
print("4. Train additional models next week (the 30-hour GPU quota resets on Saturday)")

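
The zipping step above relies on the `zip` command-line tool. A pure-Python alternative is `shutil.make_archive`, sketched below on a temporary directory so it can run anywhere. `archive_dir` is a hypothetical helper, and the checkpoint file it creates is an empty placeholder.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def archive_dir(src_dir, dest_dir, name):
    """Zip src_dir into dest_dir/name.zip; return the path, or None if src_dir is missing."""
    src = Path(src_dir)
    if not src.is_dir():
        return None
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = shutil.make_archive(
        str(dest / name), "zip", root_dir=str(src.parent), base_dir=src.name
    )
    return Path(archive)


# Demonstration with a placeholder checkpoint in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    ckpt_dir = Path(tmp) / "checkpoints"
    ckpt_dir.mkdir()
    (ckpt_dir / "mamba_best.pth").touch()

    zip_path = archive_dir(ckpt_dir, Path(tmp) / "downloads", "checkpoints")
    with zipfile.ZipFile(zip_path) as zf:
        archived_names = zf.namelist()
    skipped = archive_dir(Path(tmp) / "results", Path(tmp) / "downloads", "results")

print(f"Archived entries: {archived_names}")
print(f"Missing directory handled: {skipped is None}")
```

In the notebook this would be called as `archive_dir('checkpoints', '/kaggle/working/downloads', 'checkpoints')`, followed by `FileLink` on the returned path.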