# CIFAR-10 Competition - Google Colab Version

This notebook combines everything you need to compete in the CIFAR-10 image classification competition!

**Goal:** Build a CNN that achieves the highest accuracy on the augmented test set.

**What's in this notebook:**
1. Data exploration of CIFAR-10
2. Model architecture (SimpleCNN)
3. Data transforms and augmentation
4. Training pipeline with configurable hyperparameters
5. Submission generation for Kaggle

**Workflow:**
1. Run Part 1 to explore the data
2. (Optional) Modify the model architecture in Part 2
3. (Optional) Add data augmentations in Part 3
4. Run Part 4 to train the model
5. Upload test files from Kaggle and run Part 5 to generate submission.csv

Good luck! 🚀

## Setup: Install Libraries and Import

In [None]:
# Install required libraries (if needed)
!pip install torch torchvision tqdm pandas matplotlib pillow -q

# Import libraries
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from tqdm import tqdm
import pandas as pd
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np
from collections import Counter
import os

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Using device: {device}')
if device.type == 'cuda':
    print(f'GPU: {torch.cuda.get_device_name(0)}')
    print(f'Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB')

---

# PART 1: Data Exploration

Let's explore the CIFAR-10 dataset to understand what we're working with.

## 1.1: Load CIFAR-10 Dataset

CIFAR-10 is a classic image classification dataset:
- 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck
- Tiny 32×32 pixel RGB images
- 5,000 training images per class (50,000 total)
- 1,000 test images per class (10,000 total)

In [None]:
# Load CIFAR-10 dataset with basic transforms
basic_transform = transforms.Compose([transforms.ToTensor()])

train_dataset_explore = datasets.CIFAR10(root='./data', train=True, download=True, transform=basic_transform)
test_dataset_explore = datasets.CIFAR10(root='./data', train=False, download=True, transform=basic_transform)

# CIFAR-10 class names
classes = ['airplane', 'automobile', 'bird', 'cat', 'deer', 
           'dog', 'frog', 'horse', 'ship', 'truck']

print(f'Training images: {len(train_dataset_explore)}')
print(f'Test images: {len(test_dataset_explore)}')
print(f'Number of classes: {len(classes)}')
print(f'Classes: {classes}')
print(f'Image shape: {train_dataset_explore[0][0].shape}')  # (3, 32, 32)

## 1.2: Visualize Random Samples

In [None]:
# Visualize random samples from the training set
fig, axes = plt.subplots(4, 8, figsize=(16, 8))
for i, ax in enumerate(axes.flat):
    img, label = train_dataset_explore[np.random.randint(len(train_dataset_explore))]
    ax.imshow(img.permute(1, 2, 0))  # Convert from (C, H, W) to (H, W, C)
    ax.set_title(f'{classes[label]}', fontsize=8)
    ax.axis('off')
plt.suptitle('Random CIFAR-10 Training Samples', fontsize=16)
plt.tight_layout()
plt.show()

print('\nCIFAR-10 has 10 different classes!')
print('The competition will test your model on AUGMENTED versions of these images.')

## 1.3: Class Distribution

In [None]:
# Count images per class in training set
# (read labels from .targets to avoid decoding all 50,000 images)
train_labels = train_dataset_explore.targets
label_counts = Counter(train_labels)

# Plot distribution
plt.figure(figsize=(12, 4))
plt.bar(range(len(classes)), [label_counts[i] for i in range(len(classes))], 
        color='steelblue', alpha=0.7, tick_label=classes)
plt.xlabel('Class')
plt.ylabel('Number of Images')
plt.title('CIFAR-10 Training Set - Class Distribution')
plt.axhline(y=5000, color='red', linestyle='--', label='Expected: 5,000 per class')
plt.xticks(rotation=45, ha='right')
plt.legend()
plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()

print(f'Total classes: {len(label_counts)}')
print(f'Images per class: {label_counts[0]}')
print('Dataset is balanced!' if len(set(label_counts.values())) == 1 else 'Dataset is imbalanced!')

## 1.4: Image Statistics

In [None]:
# Sample 1000 random images and compute statistics
sample_images = [train_dataset_explore[i][0] for i in np.random.choice(len(train_dataset_explore), 1000, replace=False)]
sample_tensor = torch.stack(sample_images)

# Compute mean and std per channel
mean = sample_tensor.mean(dim=[0, 2, 3])
std = sample_tensor.std(dim=[0, 2, 3])

print('Pixel Statistics (from 1000 random images):')
print(f'Mean (R, G, B): {mean.numpy()}')
print(f'Std  (R, G, B): {std.numpy()}')
print('\nNote: Values are in [0, 1] range after ToTensor()')
print('These statistics can be used for normalization in your transforms!')
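
The per-channel statistics above can be plugged into normalization. The sketch below shows the arithmetic on random stand-in data rather than real CIFAR-10 images; `normalize_batch` is a hypothetical helper written for illustration. In the notebook itself you would more likely pass `mean.tolist()` and `std.tolist()` to `transforms.Normalize` inside `get_transforms()`, which applies the same `(x - mean) / std` formula per channel.

```python
import torch

def normalize_batch(batch, mean, std):
    """Standardize a (N, 3, H, W) batch per channel: (x - mean) / std."""
    # Reshape the (3,) statistics to (1, 3, 1, 1) so they broadcast over N, H, W
    return (batch - mean.view(1, -1, 1, 1)) / std.view(1, -1, 1, 1)

# Stand-in for sample_tensor: random values in [0, 1], NOT real CIFAR-10 images
torch.manual_seed(0)
fake_batch = torch.rand(1000, 3, 32, 32)

fake_mean = fake_batch.mean(dim=[0, 2, 3])
fake_std = fake_batch.std(dim=[0, 2, 3])
normalized = normalize_batch(fake_batch, fake_mean, fake_std)

# After normalization each channel should have roughly zero mean and unit std
print('Mean after normalization:', normalized.mean(dim=[0, 2, 3]))
print('Std after normalization: ', normalized.std(dim=[0, 2, 3]))
```

If you switch to dataset statistics, use the same values for training, validation, and submission, otherwise the model sees differently scaled inputs at test time.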

---

# PART 2: Model Architecture

Define the CNN model. You can modify this architecture to improve performance!

In [None]:
class SimpleCNN(nn.Module):
    """
    Simple CNN baseline for CIFAR-10

    This is a basic architecture to get you started.
    Current architecture achieves ~50-60% accuracy.

    TODO: Improve this architecture! Some ideas:
    - Add more convolutional layers
    - Add BatchNorm layers after Conv layers
    - Try different filter sizes
    - Experiment with different pooling strategies
    - Add residual connections (ResNet-style)
    - Try different activation functions
    """
    def __init__(self, num_classes=10):
        super(SimpleCNN, self).__init__()

        # Feature extraction layers
        self.features = nn.Sequential(
            # Block 1
            nn.Conv2d(3, 32, kernel_size=3, padding=1),  # 32x32x3 -> 32x32x32
            nn.ReLU(inplace=True),
            # TODO: Add BatchNorm here? nn.BatchNorm2d(32)
            nn.MaxPool2d(2),  # 32x32x32 -> 16x16x32

            # Block 2
            nn.Conv2d(32, 64, kernel_size=3, padding=1),  # 16x16x32 -> 16x16x64
            nn.ReLU(inplace=True),
            # TODO: Add BatchNorm here?
            nn.MaxPool2d(2),  # 16x16x64 -> 8x8x64

            # Block 3
            nn.Conv2d(64, 128, kernel_size=3, padding=1),  # 8x8x64 -> 8x8x128
            nn.ReLU(inplace=True),
            # TODO: Add BatchNorm here?
            nn.AdaptiveAvgPool2d((1, 1))  # 8x8x128 -> 1x1x128

            # TODO: Add more blocks? Deeper networks often help, up to a point.
        )

        # Classification layers
        self.classifier = nn.Sequential(
            nn.Dropout(0.5),  # TODO: Experiment with different dropout rates?
            nn.Linear(128, num_classes)
            # TODO: Add more fully connected layers?
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)  # Flatten
        x = self.classifier(x)
        return x

print('SimpleCNN model defined!')
print('You can modify the architecture above to improve performance.')
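
To check the shape comments in `SimpleCNN` when you change the architecture, the standard output-size and parameter-count formulas can be worked through by hand. The two helpers below are hypothetical and written only for this illustration; they assume square kernels and the default `bias=True`.

```python
def conv_output_size(size, kernel_size, padding=0, stride=1):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel_size) // stride + 1

def conv_params(in_channels, out_channels, kernel_size):
    """Weights plus one bias per output channel for a square-kernel Conv2d."""
    return (kernel_size * kernel_size * in_channels + 1) * out_channels

size = 32
size = conv_output_size(size, kernel_size=3, padding=1)  # Block 1 conv
size = conv_output_size(size, kernel_size=2, stride=2)   # MaxPool2d(2)
size = conv_output_size(size, kernel_size=3, padding=1)  # Block 2 conv
size = conv_output_size(size, kernel_size=2, stride=2)   # MaxPool2d(2)
size = conv_output_size(size, kernel_size=3, padding=1)  # Block 3 conv
print('Spatial size before global pooling:', size)

# Three conv layers plus the final Linear(128, 10) layer (weights + biases)
total_params = (conv_params(3, 32, 3) + conv_params(32, 64, 3)
                + conv_params(64, 128, 3) + (128 * 10 + 10))
print(f'Total parameters: {total_params:,}')
```

The total should match the parameter count printed in section 4.3 for the unmodified model.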

---

# PART 3: Data Transforms and Augmentation

**THIS IS KEY TO SUCCESS!** The competition test set has augmentations (noise, blur, color shifts, etc.).

You MUST train with similar augmentations to generalize well!

In [None]:
def get_transforms(augment=False):
    """
    Get data transforms for training and testing

    Args:
        augment: If True, apply data augmentation (for training)
                 If False, only normalize (for testing)

    Returns:
        Composed transforms
    """
    if augment:
        # TODO: Add MORE augmentations here! This is KEY to better performance!
        # The test set has augmentations (noise, blur, color shifts, etc.)
        # Train with similar augmentations to generalize better!
        #
        # Suggestions:
        # - transforms.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.3)
        # - transforms.RandomRotation(15)
        # - transforms.RandomAffine(degrees=0, translate=(0.1, 0.1))
        # - transforms.RandomGrayscale(p=0.1)
        # - Add noise? Blur? (use custom transforms)
        #
        return transforms.Compose([
            transforms.RandomHorizontalFlip(),
            transforms.RandomCrop(32, padding=4),
            # TODO: Add more augmentations here!

            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
        ])
    else:
        # No augmentation for testing
        return transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
        ])

print('Data transforms defined!')
print('Modify get_transforms() above to add more augmentations.')
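
For the "add noise" suggestion, a custom transform can be as small as a class with a `__call__` method. This is a minimal sketch: `AddGaussianNoise` is a hypothetical name, and `std=0.05` is an arbitrary starting value, since the noise level of the competition test set is not stated here.

```python
import torch

class AddGaussianNoise:
    """Hypothetical custom transform: add zero-mean Gaussian noise to a tensor image."""

    def __init__(self, std=0.05):
        self.std = std

    def __call__(self, tensor):
        noisy = tensor + torch.randn_like(tensor) * self.std
        # Keep pixel values in the [0, 1] range produced by ToTensor()
        return noisy.clamp(0.0, 1.0)

# Quick check on a random stand-in image (not a real CIFAR-10 sample)
torch.manual_seed(0)
image = torch.rand(3, 32, 32)
noisy_image = AddGaussianNoise(std=0.05)(image)
print(noisy_image.shape, noisy_image.min().item(), noisy_image.max().item())
```

Because it operates on tensors in [0, 1], this transform would go after `transforms.ToTensor()` and before `transforms.Normalize(...)` in the training branch of `get_transforms()`.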

---

# PART 4: Training

Configure hyperparameters and train the model!

## 4.1: Configure Training Hyperparameters

Modify these to experiment with different settings!

In [None]:
# TRAINING CONFIGURATION - Modify these!
EPOCHS = 10  # TODO: Try 20-30 epochs for better performance
LEARNING_RATE = 0.001  # TODO: Experiment with 0.0001, 0.001, 0.01
BATCH_SIZE = 128  # TODO: Try 64, 128, 256 (smaller batch size if out of memory)
OPTIMIZER_TYPE = 'adam'  # Options: 'adam', 'sgd', 'adamw'
USE_SCHEDULER = False  # TODO: Try True to use learning rate scheduler

print('='*60)
print('TRAINING CONFIGURATION')
print('='*60)
print(f'Device: {device}')
print(f'Epochs: {EPOCHS}')
print(f'Learning Rate: {LEARNING_RATE}')
print(f'Batch Size: {BATCH_SIZE}')
print(f'Optimizer: {OPTIMIZER_TYPE}')
print(f'LR Scheduler: {USE_SCHEDULER}')
print('='*60)

## 4.2: Load Training Data

In [None]:
# Load CIFAR-10 dataset with augmentation
print('Loading CIFAR-10 dataset...')
train_dataset = datasets.CIFAR10(root='./data', train=True, download=True,
                                 transform=get_transforms(augment=True))
test_dataset = datasets.CIFAR10(root='./data', train=False, download=True,
                                transform=get_transforms(augment=False))

train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True, num_workers=2)
test_loader = DataLoader(test_dataset, batch_size=256, shuffle=False, num_workers=2)

print(f'Training images: {len(train_dataset)}')
print(f'Test images: {len(test_dataset)}')

## 4.3: Setup Model, Optimizer, and Loss

In [None]:
# Create model
model = SimpleCNN(num_classes=10).to(device)
print(f'Model: {model.__class__.__name__}')
print(f'Parameters: {sum(p.numel() for p in model.parameters()):,}')

# Loss function
criterion = nn.CrossEntropyLoss()

# Optimizer selection
if OPTIMIZER_TYPE == 'adam':
    optimizer = optim.Adam(model.parameters(), lr=LEARNING_RATE)
elif OPTIMIZER_TYPE == 'sgd':
    optimizer = optim.SGD(model.parameters(), lr=LEARNING_RATE, momentum=0.9, nesterov=True)
elif OPTIMIZER_TYPE == 'adamw':
    optimizer = optim.AdamW(model.parameters(), lr=LEARNING_RATE, weight_decay=0.01)
else:
    raise ValueError(f'Unknown optimizer: {OPTIMIZER_TYPE}')

print(f'Optimizer: {optimizer.__class__.__name__}')

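# Learning rate scheduler (optional)
# Note: with step_size=10 the first decay happens after epoch 10,
# so it only affects training when EPOCHS > 10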
scheduler = None
if USE_SCHEDULER:
    scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
    print(f'LR Scheduler: StepLR (step_size=10, gamma=0.1)')
else:
    print('LR Scheduler: None')
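
Before turning the scheduler on, it helps to see which learning rate each epoch would actually use. The sketch below reproduces the StepLR decay rule in plain Python, assuming `scheduler.step()` is called once at the end of every epoch as in section 4.5; `step_lr` is a hypothetical helper, not part of PyTorch.

```python
def step_lr(base_lr, epoch, step_size=10, gamma=0.1):
    """Learning rate used during a 1-indexed epoch when the scheduler
    is stepped once at the end of every epoch."""
    completed_steps = epoch - 1
    return base_lr * gamma ** (completed_steps // step_size)

for epoch in (1, 10, 11, 21):
    print(f'Epoch {epoch:2d}: lr = {step_lr(0.001, epoch):.6f}')
```

With `EPOCHS = 10` and `step_size=10`, every training epoch runs at the initial learning rate. To see any effect, either train for more than 10 epochs or pick a smaller `step_size`.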

## 4.4: Define Training and Validation Functions

In [None]:
def train_one_epoch(model, loader, criterion, optimizer, device):
    """Train for one epoch"""
    model.train()
    total_loss = 0.0
    correct = 0
    total = 0

    for images, labels in tqdm(loader, desc='Training'):
        images, labels = images.to(device), labels.to(device)

        # Forward pass
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward pass
        loss.backward()
        optimizer.step()

        # Track metrics
        total_loss += loss.item() * images.size(0)
        _, predicted = outputs.max(1)
        total += labels.size(0)
        correct += predicted.eq(labels).sum().item()

    avg_loss = total_loss / total
    accuracy = 100. * correct / total
    return avg_loss, accuracy


def validate(model, loader, criterion, device):
    """Validate the model"""
    model.eval()
    total_loss = 0.0
    correct = 0
    total = 0

    with torch.no_grad():
        for images, labels in tqdm(loader, desc='Validating'):
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            loss = criterion(outputs, labels)

            total_loss += loss.item() * images.size(0)
            _, predicted = outputs.max(1)
            total += labels.size(0)
            correct += predicted.eq(labels).sum().item()

    avg_loss = total_loss / total
    accuracy = 100. * correct / total
    return avg_loss, accuracy

print('Training functions defined!')
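
Both functions multiply `loss.item()` by the batch size before summing. The reason is that `CrossEntropyLoss` returns the mean over a batch, and the last batch is usually smaller (50,000 training images with a batch size of 128 leave a final batch of 80). The numbers below are made up purely to show the difference.

```python
# Hypothetical per-batch mean losses and batch sizes (the last batch is smaller)
batch_losses = [2.0, 1.0, 4.0]
batch_sizes = [128, 128, 64]

# What train_one_epoch() does: weight each batch mean by its number of samples
weighted = sum(loss * n for loss, n in zip(batch_losses, batch_sizes)) / sum(batch_sizes)

# Naive alternative: average the batch means directly
naive = sum(batch_losses) / len(batch_losses)

print(f'Per-sample average loss: {weighted:.4f}')
print(f'Average of batch means:  {naive:.4f}')
```

The per-sample average is the one that stays comparable when you change `BATCH_SIZE`.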

## 4.5: Train the Model!

This will take a few minutes. Watch the accuracy improve!

In [None]:
print('='*60)
print('TRAINING START')
print('='*60)
print()

best_acc = 0.0

for epoch in range(1, EPOCHS + 1):
    print(f'Epoch {epoch}/{EPOCHS}')

    # Train
    train_loss, train_acc = train_one_epoch(model, train_loader, criterion, optimizer, device)

    # Validate (on the clean CIFAR-10 test set, not the augmented competition test set)
    test_loss, test_acc = validate(model, test_loader, criterion, device)

    print(f'Train - Loss: {train_loss:.4f}, Acc: {train_acc:.2f}%')
    print(f'Test  - Loss: {test_loss:.4f}, Acc: {test_acc:.2f}%')

    # Save best model
    if test_acc > best_acc:
        best_acc = test_acc
        torch.save(model.state_dict(), 'best_model.pth')
        print(f'✓ Saved best model (acc: {best_acc:.2f}%)')

    # Update learning rate scheduler
    if scheduler:
        scheduler.step()
        print(f'Learning rate: {scheduler.get_last_lr()[0]:.6f}')

    print()

print('='*60)
print('TRAINING COMPLETE')
print('='*60)
print(f'Best test accuracy: {best_acc:.2f}%')
print('Model saved as best_model.pth')

---

# PART 5: Generate Kaggle Submission

Upload your test files from Kaggle and generate submission.csv!

## 5.1: Upload Test Files from Kaggle

**Instructions:**
1. Download `test.csv` and `test_images.zip` from the Kaggle competition page
2. Upload them to this Colab notebook using the file browser on the left
3. Unzip the test images by running the cell below

In [None]:
# Unzip test images (if test_images.zip exists)
import zipfile

if os.path.exists('test_images.zip'):
    print('Unzipping test_images.zip...')
    with zipfile.ZipFile('test_images.zip', 'r') as zip_ref:
        zip_ref.extractall('.')
    print('✓ Test images extracted to test_images/')
else:
    print('❌ test_images.zip not found!')
    print('   Please upload test_images.zip from Kaggle.')

# Check if test.csv exists
if os.path.exists('test.csv'):
    print('✓ test.csv found!')
else:
    print('❌ test.csv not found!')
    print('   Please upload test.csv from Kaggle.')

## 5.2: Generate Submission File

This will create `submission.csv` that you can upload to Kaggle!

In [None]:
def generate_submission(model, test_csv='test.csv', test_images_dir='test_images',
                        output_csv='submission.csv', device='cpu'):
    """
    Generate Kaggle submission file

    Args:
        model: Trained model
        test_csv: Path to test.csv (contains image IDs)
        test_images_dir: Directory containing test images
        output_csv: Path to save submission
        device: Device to run inference on
    """
    print('\n' + '='*60)
    print('GENERATING KAGGLE SUBMISSION')
    print('='*60)

    # Check if test files exist
    if not os.path.exists(test_csv):
        print(f'❌ {test_csv} not found!')
        print('   Download test.csv from Kaggle to generate submission.')
        return

    if not os.path.exists(test_images_dir):
        print(f'❌ {test_images_dir}/ not found!')
        print('   Download and unzip test_images.zip from Kaggle.')
        return

    # Load test image IDs
    test_df = pd.read_csv(test_csv)
    print(f'Found {len(test_df)} test images')

    # Get transforms (no augmentation for testing!)
    test_transform = get_transforms(augment=False)

    # Generate predictions
    model.eval()
    predictions = []

    with torch.no_grad():
        for img_id in tqdm(test_df['id'], desc='Predicting'):
            # Load image - format ID with leading zeros to match filename
            img_filename = f'{str(img_id).zfill(5)}.png'
            img_path = os.path.join(test_images_dir, img_filename)
            if not os.path.exists(img_path):
                print(f'Warning: {img_path} not found, skipping...')
                continue

            img = Image.open(img_path).convert('RGB')
            img_tensor = test_transform(img).unsqueeze(0).to(device)

            # Predict
            output = model(img_tensor)
            pred_class = output.argmax(1).item()

            predictions.append({
                'id': img_id,
                'label': pred_class
            })

    # Save submission
    submission_df = pd.DataFrame(predictions)
    submission_df.to_csv(output_csv, index=False)

    print(f'\n✅ Submission saved to {output_csv}')
    print(f'   Total predictions: {len(submission_df)}')
    print('\nPreview:')
    print(submission_df.head(10))
    print('\n' + '='*60)
    print(f'📤 Download {output_csv} and upload to Kaggle!')
    print('='*60 + '\n')


# Load best model and generate submission
print('Loading best model...')
model.load_state_dict(torch.load('best_model.pth', map_location=device))
print('✓ Best model loaded!')

# Generate submission
generate_submission(model, device=device)
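
Note that `generate_submission()` skips any image it cannot find, so `submission.csv` can end up with fewer rows than `test.csv`. A quick sanity check along these lines may help before uploading. Both helpers are hypothetical, the IDs and labels are made up, and whether Kaggle rejects an incomplete file depends on the competition settings.

```python
def image_filename(img_id, width=5):
    """Build the zero-padded filename used when loading test images."""
    return f'{str(img_id).zfill(width)}.png'

def find_missing_ids(expected_ids, predictions):
    """Return the IDs from test.csv that have no row in the predictions."""
    predicted = {row['id'] for row in predictions}
    return [img_id for img_id in expected_ids if img_id not in predicted]

# Made-up example: image 2 was skipped during prediction
expected_ids = [0, 1, 2, 3]
predictions = [
    {'id': 0, 'label': 3},
    {'id': 1, 'label': 8},
    {'id': 3, 'label': 5},
]

missing = find_missing_ids(expected_ids, predictions)
print('Example filename:', image_filename(42))
print('Missing IDs:', missing)
```

In the notebook, `expected_ids` would come from `test_df['id']` and `predictions` from the list built inside `generate_submission()`.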

## 5.3: Download Submission File

If `submission.csv` was generated successfully, download it using the file browser on the left and upload it to Kaggle!

In [None]:
# Check if submission.csv exists
if os.path.exists('submission.csv'):
    print('✅ submission.csv ready for download!')
    print('\nNext steps:')
    print('1. Download submission.csv using the file browser on the left')
    print('2. Go to the Kaggle competition page')
    print('3. Click "Submit Predictions"')
    print('4. Upload submission.csv')
    print('5. Check your score on the leaderboard!')
    print('\nGood luck! 🚀')
else:
    print('❌ submission.csv not found!')
    print('   Make sure you uploaded test.csv and test_images.zip from Kaggle.')

---

# Tips for Improving Your Score 💡

## 1. Data Augmentation is KEY! ⭐⭐⭐
The competition test set has heavy augmentations (noise, blur, color shifts). You MUST train with similar augmentations!

**Try adding to `get_transforms()`:**
```python
transforms.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.3, hue=0.1)
transforms.RandomRotation(15)
transforms.RandomAffine(degrees=0, translate=(0.1, 0.1))
transforms.RandomGrayscale(p=0.1)
transforms.GaussianBlur(kernel_size=3, sigma=(0.1, 2.0))
```

## 2. Improve Model Architecture
- Add more convolutional blocks
- Add BatchNorm after each Conv layer: `nn.BatchNorm2d(channels)`
- Try deeper networks (4-6 blocks)
- Increase number of filters (256, 512)
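
As a rough sketch of the first three ideas, the blocks can be factored into a helper. `conv_block` is a hypothetical name, and placing BatchNorm before ReLU is one common ordering rather than the only option. `bias=False` is used on the convolution because BatchNorm already has its own learnable shift.

```python
import torch
import torch.nn as nn

def conv_block(in_channels, out_channels):
    """Hypothetical helper: Conv -> BatchNorm -> ReLU -> MaxPool."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

# Four blocks: spatial size goes 32 -> 16 -> 8 -> 4 -> 2, then global pooling
features = nn.Sequential(
    conv_block(3, 32),
    conv_block(32, 64),
    conv_block(64, 128),
    conv_block(128, 256),
    nn.AdaptiveAvgPool2d((1, 1)),
)

# Shape check on a random batch of four 32x32 RGB images
features.eval()
with torch.no_grad():
    out = features(torch.randn(4, 3, 32, 32))
print(out.shape)
```

If you adopt something like this in `SimpleCNN`, the classifier input size has to change to match the last block, for example `nn.Linear(256, num_classes)`.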

## 3. Train Longer
- 10 epochs is just a baseline
- Try 20-30 epochs for better results
- Monitor for overfitting (train acc >> test acc)

## 4. Experiment with Hyperparameters
- Learning rate: Try 0.0001, 0.001, 0.01
- Batch size: Try 64, 128, 256
- Optimizer: Try 'adam', 'sgd', 'adamw'
- Use learning rate scheduler: `USE_SCHEDULER = True`

## 5. Monitor Overfitting
- If training accuracy >> test accuracy → overfitting!
- Solutions:
  - Add more data augmentation
  - Increase dropout
  - Train for fewer epochs
  - Use weight decay (AdamW)
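
To make the gap easy to watch, you could record the accuracies from each epoch and print the difference. The helper and the numbers below are made up for illustration; in the notebook the lists would be filled with `train_acc` and `test_acc` inside the training loop.

```python
def overfitting_gap(train_accs, test_accs):
    """Per-epoch difference between training and test accuracy, in percentage points."""
    return [round(train - test, 2) for train, test in zip(train_accs, test_accs)]

# Made-up accuracy history for four epochs
train_accs = [45.0, 60.0, 72.0, 85.0]
test_accs = [44.0, 57.0, 62.0, 63.0]

gaps = overfitting_gap(train_accs, test_accs)
for epoch, gap in enumerate(gaps, start=1):
    print(f'Epoch {epoch}: train - test = {gap:+.1f} points')
```

A gap that keeps growing while test accuracy stalls is the pattern to look for. With augmentation and dropout active, training accuracy can also sit below test accuracy, which is not a problem in itself.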

---

Good luck with your competition! 🎉