# Music Classification: Data Augmentation Techniques Comparison

This notebook is an interactive front end for comparing data augmentation techniques for music genre classification.

## Features
- Configurable model architecture and training parameters
- 5 different augmentation strategies
- 5-fold cross-validation with statistical analysis
- Comprehensive visualization


## 1. Setup and Configuration

In [None]:
import sys
import os
import warnings
# Silence all warnings to keep cell output readable (this also hides deprecation notices)
warnings.filterwarnings('ignore')

# Import our modular components
from models import MusicGenreCNN, create_default_cnn, create_lightweight_cnn
from training import TrainingConfig, get_quick_config, get_production_config
from data import prepare_cv_data, MusicGenreDataset, print_data_summary
from utils import set_seed

import torch
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Set random seeds for reproducibility
set_seed(42)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
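
The `set_seed` helper comes from the project's `utils` module, and its body is not shown in this notebook. A minimal sketch of what such a helper typically does is given below. The name `set_seed_sketch` is hypothetical, and the choice of which generators to seed is an assumption rather than the project's actual implementation.

```python
import random


def set_seed_sketch(seed: int) -> None:
    """Seed every random number generator that is available (illustrative only)."""
    random.seed(seed)

    # NumPy and PyTorch are seeded only if they are installed.
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass

    try:
        import torch
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass


# Re-seeding with the same value reproduces the same random draws.
set_seed_sketch(42)
first_draw = random.random()
set_seed_sketch(42)
second_draw = random.random()
print(first_draw == second_draw)
```

Seeding alone does not guarantee bit-identical GPU results; some CUDA kernels are nondeterministic unless deterministic algorithms are also enabled.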

## 2. Configure Experiment Parameters

Adjust training parameters, model architecture, and experiment settings here:

In [None]:
# Choose your configuration
# Options: get_quick_config(), get_production_config(), TrainingConfig()

# For quick testing (recommended for first run)
config = get_quick_config()

# For production training (uncomment to use)
# config = get_production_config()

# Custom configuration (modify as needed)
# config = TrainingConfig(
#     num_epochs=100,
#     patience=20,
#     batch_size=32,
#     learning_rate=0.001,
#     dropout_rate=0.5,
#     conv_channels=[32, 64, 128, 256],
#     fc_units=[1024, 256]
# )

print("Training Configuration:")
print(f"  Epochs: {config.num_epochs}")
print(f"  Patience: {config.patience}")
print(f"  Batch size: {config.batch_size}")
print(f"  Learning rate: {config.learning_rate}")
print(f"  Dropout rate: {config.dropout_rate}")
print(f"  Conv channels: {config.conv_channels}")
print(f"  FC units: {config.fc_units}")
print(f"  CV folds: {config.n_folds}")
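
The cell above reads a fixed set of attributes from `config`. One way such a configuration object could be structured is sketched below as a dataclass. The class name `TrainingConfigSketch`, the helper `quick_config_sketch`, and all default values are assumptions for illustration; the real `TrainingConfig` lives in the `training` module and may differ.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TrainingConfigSketch:
    # Field names mirror the attributes the notebook reads; defaults are assumed.
    num_epochs: int = 100
    patience: int = 20
    batch_size: int = 32
    learning_rate: float = 1e-3
    dropout_rate: float = 0.5
    conv_channels: List[int] = field(default_factory=lambda: [32, 64, 128, 256])
    fc_units: List[int] = field(default_factory=lambda: [1024, 256])
    n_folds: int = 5
    num_classes: int = 10               # assumed; the notebook overwrites it from the data
    input_size: Tuple[int, int] = (256, 384)  # assumed (height, width)
    data_dir: str = "data"              # hypothetical path

    def __post_init__(self) -> None:
        # Fail early on values that cannot produce a valid training run.
        if not 0.0 <= self.dropout_rate < 1.0:
            raise ValueError("dropout_rate must be in [0, 1)")
        if self.n_folds < 2:
            raise ValueError("n_folds must be at least 2")


def quick_config_sketch() -> TrainingConfigSketch:
    """A short run for smoke testing; the reduced values are assumptions."""
    return TrainingConfigSketch(num_epochs=5, patience=2)


cfg = quick_config_sketch()
print(cfg.num_epochs, cfg.patience, cfg.conv_channels)
```

Using `default_factory` for the list fields keeps each config instance from sharing one mutable list.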

## 3. Data Preparation

In [None]:
# Prepare cross-validation data
print("Preparing data for cross-validation...")

all_images, all_labels, cv_splits, genre_to_idx, idx_to_genre = prepare_cv_data(
    config.data_dir, 
    n_folds=config.n_folds
)

# Print comprehensive data summary
print_data_summary(all_images, all_labels, cv_splits, idx_to_genre)

# Update config with actual number of classes
config.num_classes = len(genre_to_idx)
print(f"\nUpdated num_classes to: {config.num_classes}")
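
`prepare_cv_data` returns `cv_splits` as a list of `(train_val_indices, test_indices)` pairs, but how it builds them is not shown here. The sketch below illustrates one common approach, a stratified split that keeps class proportions roughly equal across folds. The function name `stratified_cv_splits` and the round-robin assignment are assumptions, not the project's actual code.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_cv_splits(
    labels: Sequence[int], n_folds: int = 5, seed: int = 42
) -> List[Tuple[List[int], List[int]]]:
    """Return (train_val_indices, test_indices) pairs, one per fold."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class: Dict[int, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    # Shuffle within each class, then deal indices round-robin to the folds
    # so every fold receives a near-equal share of each class.
    folds: List[List[int]] = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)

    splits = []
    for k in range(n_folds):
        test_indices = sorted(folds[k])
        train_val_indices = sorted(
            index for j, fold in enumerate(folds) if j != k for index in fold
        )
        splits.append((train_val_indices, test_indices))
    return splits


# Dummy labels: 3 classes with 10 samples each.
example_labels = [0] * 10 + [1] * 10 + [2] * 10
example_splits = stratified_cv_splits(example_labels, n_folds=5)
print(len(example_splits), len(example_splits[0][1]))
```

Each sample appears in exactly one test fold, so the five test sets together cover the whole dataset.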

## 4. Model Architecture Preview

In [None]:
# Create a sample model to inspect architecture
sample_model = MusicGenreCNN(
    num_classes=config.num_classes,
    dropout_rate=config.dropout_rate,
    conv_channels=config.conv_channels,
    fc_units=config.fc_units,
    input_size=config.input_size
)

model_info = sample_model.get_model_info()

print("Model Architecture:")
print(f"  Input size: {model_info['input_size']}")
print(f"  Conv channels: {model_info['conv_channels']}")
print(f"  FC units: {model_info['fc_units']}")
print(f"  Dropout rate: {model_info['dropout_rate']}")
print(f"  Total parameters: {model_info['total_parameters']:,}")
print(f"  Trainable parameters: {model_info['trainable_parameters']:,}")

# Clean up
del sample_model

## 5. Run Experiment

The cell below does not yet run the complete cross-validation experiment; it only sets up the first fold as an example.
**Note: The full experiment may take several hours depending on your configuration.**

In [None]:
# Placeholder for the main experiment runner, which will come from the training module.
# Until then, this cell only sets up a single fold as an example.
print("Running a single fold example...")
print("(Full cross-validation will be implemented in the training module)")

# Get first fold data
train_val_indices, test_indices = cv_splits[0]

print("Fold 1:")
print(f"  Train+Val samples: {len(train_val_indices)}")
print(f"  Test samples: {len(test_indices)}")

# This is where the full experiment would run
print("\n✓ Example fold setup completed.")
print("\nTo run the full experiment:")
print("1. Import and use the ModelTrainer class")
print("2. Or run the updated main.py file")

## 6. Quick Model Test

Run a forward pass on a small dummy batch to confirm that the model builds and produces the expected output shape:

In [None]:
# Create a test model with the same settings as the architecture preview
test_model = MusicGenreCNN(
    num_classes=config.num_classes,
    dropout_rate=config.dropout_rate,
    conv_channels=config.conv_channels,
    fc_units=config.fc_units,
    input_size=config.input_size
).to(device)

# Create a dummy batch of 4 RGB images, 256x384 (assumed to match config.input_size)
dummy_batch = torch.randn(4, 3, 256, 384).to(device)

# Test forward pass
test_model.eval()
with torch.no_grad():
    output = test_model(dummy_batch)

print(f"Input shape: {dummy_batch.shape}")
print(f"Output shape: {output.shape}")
print(f"Output classes: {output.shape[1]}")
print("✓ Model forward pass successful!")

# Clean up
del test_model, dummy_batch, output
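
The forward pass only succeeds when the spatial size of the input matches what the first fully connected layer expects. The arithmetic can be sketched as below, assuming each convolutional block preserves height and width and is followed by a 2x2 max pool. That layout is an assumption; the internals of `MusicGenreCNN` are not shown here. Both function names are hypothetical.

```python
from typing import Sequence, Tuple


def feature_map_size(
    input_size: Tuple[int, int], num_conv_blocks: int, pool: int = 2
) -> Tuple[int, int]:
    """Spatial size after repeated pooling, assuming size-preserving convolutions."""
    height, width = input_size
    for _ in range(num_conv_blocks):
        height //= pool
        width //= pool
    return height, width


def flattened_features(
    input_size: Tuple[int, int], conv_channels: Sequence[int], pool: int = 2
) -> int:
    """Number of inputs to the first fully connected layer under the same assumption."""
    height, width = feature_map_size(input_size, len(conv_channels), pool)
    return conv_channels[-1] * height * width


# 256x384 input through four pooled blocks: 256 / 16 = 16 and 384 / 16 = 24.
print(feature_map_size((256, 384), 4))
print(flattened_features((256, 384), [32, 64, 128, 256]))
```

If the real model uses global pooling or a different stride, the flattened size will differ from this estimate.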

## 7. Parameter Exploration

Compare parameter counts and layer depth across several model architectures:

In [None]:
# Compare different model architectures
architectures = {
    'Lightweight': {'conv_channels': [16, 32, 64, 128], 'fc_units': [512, 128]},
    'Default': {'conv_channels': [32, 64, 128, 256], 'fc_units': [1024, 256]},
    'Deep': {'conv_channels': [32, 64, 128, 256, 512], 'fc_units': [2048, 512, 256]}
}

print("Model Architecture Comparison:")
print("=" * 50)

for name, arch in architectures.items():
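    # Pass input_size so parameter counts are comparable with the preview in section 4
    model = MusicGenreCNN(
        num_classes=config.num_classes,
        conv_channels=arch['conv_channels'],
        fc_units=arch['fc_units'],
        input_size=config.input_size
    )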
    
    info = model.get_model_info()
    print(f"{name}:")
    print(f"  Parameters: {info['total_parameters']:,}")
    print(f"  Conv layers: {len(arch['conv_channels'])}")
    print(f"  FC layers: {len(arch['fc_units'])}")
    print()
    
    del model
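
The parameter counts printed above can be cross-checked by hand for the convolutional part of each architecture. The sketch below assumes 3x3 kernels, 3 input channels, and a bias term per output channel; these are assumptions, and the totals from `get_model_info()` will be larger because they also include the fully connected layers and any normalization layers. The function name `conv_stack_parameters` is hypothetical.

```python
from typing import Sequence


def conv_stack_parameters(
    conv_channels: Sequence[int],
    in_channels: int = 3,
    kernel_size: int = 3,
    bias: bool = True,
) -> int:
    """Count weights (and biases) in a plain stack of 2D convolutions."""
    total = 0
    previous = in_channels
    for out_channels in conv_channels:
        # Each output channel has one kernel_size x kernel_size filter per input channel.
        total += previous * out_channels * kernel_size * kernel_size
        if bias:
            total += out_channels
        previous = out_channels
    return total


for name, channels in {
    "Lightweight": [16, 32, 64, 128],
    "Default": [32, 64, 128, 256],
    "Deep": [32, 64, 128, 256, 512],
}.items():
    print(f"{name}: {conv_stack_parameters(channels):,} conv parameters")
```

Doubling every channel width roughly quadruples the convolutional parameter count, since each layer's weights scale with the product of its input and output channels.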

## 8. Next Steps

To run the complete experiment:

1. **Use the training module**: Import `ModelTrainer` and run the full cross-validation
2. **Run `main.py`**: Use the updated `main.py` file for automated execution
3. **Analyze results**: Use the analysis module for statistical analysis and visualization

The modular structure allows you to:
- Easily adjust parameters through the `TrainingConfig` class
- Swap model architectures
- Add new augmentation techniques
- Customize training procedures
- Extend analysis and visualization
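
As an illustration of what adding a new augmentation technique could involve, the sketch below masks a random span of time frames in a spectrogram. This is a generic example and not necessarily one of the five strategies this notebook compares; the function name `time_mask` and its parameters are hypothetical. The spectrogram is represented as a plain nested list to keep the example dependency-free.

```python
import random
from typing import List, Optional

Spectrogram = List[List[float]]  # rows are frequency bins, columns are time frames


def time_mask(
    spectrogram: Spectrogram,
    max_width: int,
    rng: Optional[random.Random] = None,
    fill_value: float = 0.0,
) -> Spectrogram:
    """Return a copy with one random block of consecutive time frames masked."""
    if not spectrogram:
        return []
    rng = rng or random.Random()
    num_frames = len(spectrogram[0])

    # Pick the mask width first, then a start position that keeps it in bounds.
    width = rng.randint(0, min(max_width, num_frames))
    start = rng.randint(0, num_frames - width)

    # The same columns are masked in every frequency row.
    return [
        row[:start] + [fill_value] * width + row[start + width:]
        for row in spectrogram
    ]


# Dummy 4x10 spectrogram filled with ones.
dummy_spectrogram = [[1.0] * 10 for _ in range(4)]
masked = time_mask(dummy_spectrogram, max_width=3, rng=random.Random(0))
print(len(masked), len(masked[0]))
```

Because the function returns a new list and accepts its own random generator, it can be applied per sample at training time without altering the cached dataset or the global seed.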

In [None]:
print("✓ Notebook setup complete!")
print("\nModule structure:")
print("  📁 models/ - Model architectures")
print("  📁 data/ - Data processing and augmentation")
print("  📁 training/ - Training configuration and procedures")
print("  📁 analysis/ - Statistical analysis and visualization")
print("  📁 utils/ - Common utilities")
print("\nTo run full experiment: python main.py")