# Computer Vision Project - Getting Started

This notebook provides a comprehensive introduction to our computer vision project using PyTorch. We'll cover:

- Setting up the environment
- Loading and exploring data
- Building and training models
- Evaluating results
- Visualizing predictions


## 1. Setup and Imports


In [1]:
import sys
import os

# Add src directory to path
sys.path.append('../src')

# Standard libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from pathlib import Path
import yaml
from tqdm import tqdm

# PyTorch
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
import torchvision
from torchvision import transforms

# Custom modules
from models.cnn_model import SimpleCNN, SimpleResNet
from data.dataset import get_dataloaders, get_transforms
from utils.training import Trainer, get_optimizer, get_scheduler
from utils.visualization import (
    show_batch, plot_confusion_matrix, visualize_predictions,
    plot_class_distribution, visualize_feature_maps
)

# Set random seeds for reproducibility
torch.manual_seed(42)
np.random.seed(42)

# Configure matplotlib
plt.style.use('default')
%matplotlib inline

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA device: {torch.cuda.get_device_name(0)}")

PyTorch version: 2.7.1
CUDA available: False


## 2. Load Configuration


In [None]:
# Load configuration
config_path = "../configs/config.yaml"
with open(config_path, "r") as f:
    config = yaml.safe_load(f)

print("Configuration loaded:")
print(f"Dataset: {config['dataset']['name']}")
print(f"Model: {config['model']['name']}")
print(f"Batch size: {config['dataloader']['batch_size']}")
print(f"Learning rate: {config['training']['learning_rate']}")
print(f"Epochs: {config['training']['epochs']}")


Configuration loaded:
Dataset: CIFAR10
Model: SimpleCNN
Batch size: 32
Learning rate: 0.001
Epochs: 50
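
The later cells read several config keys beyond the five printed above, so a missing entry only surfaces as a `KeyError` partway through the notebook. As a minimal sketch of a fail-fast check, assuming the parsed config is a nested dict: `REQUIRED_KEYS` and `missing_keys` are hypothetical names, the key list is taken from the lookups in this notebook, and the demo values that are not printed above (such as `./data`) are placeholders.

```python
# Hypothetical helper: list config keys this notebook reads but the file lacks.
REQUIRED_KEYS = {
    "dataset": ["name", "data_dir", "input_size", "class_names"],
    "model": ["name", "num_classes", "input_channels"],
    "dataloader": ["batch_size", "num_workers"],
    "training": ["learning_rate", "epochs", "optimizer", "optimizer_params", "scheduler"],
    "logging": ["model_save_dir", "figure_save_dir"],
    "visualization": ["num_prediction_samples"],
}


def missing_keys(config, required=REQUIRED_KEYS):
    """Return dotted paths of required keys that are absent from config."""
    missing = []
    for section, keys in required.items():
        block = config.get(section)
        if not isinstance(block, dict):
            # The whole section is absent (or not a mapping).
            missing.append(section)
            continue
        missing.extend(f"{section}.{key}" for key in keys if key not in block)
    return missing


# Demo with a deliberately incomplete config (placeholder values).
demo_config = {
    "dataset": {"name": "CIFAR10", "data_dir": "./data"},
    "model": {"name": "SimpleCNN", "num_classes": 10, "input_channels": 3},
}
print(missing_keys(demo_config))
```

Calling `missing_keys(config)` right after loading and raising if the list is non-empty would be one way to use it here.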


## 3. Load and Explore Data


In [None]:
# Create data loaders
train_loader, val_loader = get_dataloaders(
    dataset_name=config["dataset"]["name"],
    data_dir=config["dataset"]["data_dir"],
    batch_size=config["dataloader"]["batch_size"],
    num_workers=config["dataloader"]["num_workers"],
    input_size=tuple(config["dataset"]["input_size"]),
)

print(f"Training batches: {len(train_loader)}")
print(f"Validation batches: {len(val_loader)}")
print(f"Training samples: {len(train_loader.dataset)}")
print(f"Validation samples: {len(val_loader.dataset)}")
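
The batch counts printed by this cell can be sanity-checked by hand. A small sketch, assuming the standard CIFAR-10 split sizes (50,000 training and 10,000 test images) and the `DataLoader` default of `drop_last=False`; whether `get_dataloaders` uses the test split for validation is an assumption, and `expected_batches` is a hypothetical helper.

```python
import math


def expected_batches(num_samples, batch_size, drop_last=False):
    """Number of batches a loader yields per epoch."""
    if drop_last:
        # The final partial batch is discarded.
        return num_samples // batch_size
    # The final partial batch is kept.
    return math.ceil(num_samples / batch_size)


print(expected_batches(50_000, 32))                  # 1563
print(expected_batches(10_000, 32))                  # 313
print(expected_batches(50_000, 32, drop_last=True))  # 1562
```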



### Visualize Sample Images


In [None]:
# Show a batch of training images
show_batch(
    train_loader,
    class_names=config["dataset"]["class_names"],
    num_images=8,
    figsize=(15, 8),
)


### Analyze Class Distribution


In [None]:
# Collect all labels to analyze distribution
all_labels = []
for _, labels in train_loader:
    all_labels.extend(labels.numpy())

plot_class_distribution(
    all_labels, class_names=config["dataset"]["class_names"], figsize=(12, 6)
)
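
The plot can be complemented with exact per-class counts. A minimal sketch using only the standard library; `class_counts` is a hypothetical helper, and the demo labels and the three class names are illustrative rather than taken from the real loader.

```python
from collections import Counter


def class_counts(labels, class_names):
    """Map each class name to the number of labels with that class index."""
    counts = Counter(int(label) for label in labels)
    # Classes that never occur are reported with a count of 0.
    return {name: counts.get(index, 0) for index, name in enumerate(class_names)}


demo_labels = [0, 2, 2, 1, 0, 2]
demo_names = ["airplane", "automobile", "bird"]
print(class_counts(demo_labels, demo_names))
# {'airplane': 2, 'automobile': 1, 'bird': 3}
```

In this notebook the same call would be `class_counts(all_labels, config["dataset"]["class_names"])`.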


## 4. Create and Initialize Model


In [None]:
# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Create model
if config["model"]["name"] == "SimpleCNN":
    model = SimpleCNN(
        num_classes=config["model"]["num_classes"],
        input_channels=config["model"]["input_channels"],
    )
elif config["model"]["name"] == "SimpleResNet":
    model = SimpleResNet(
        num_classes=config["model"]["num_classes"],
        input_channels=config["model"]["input_channels"],
    )
else:
    raise ValueError(f"Unknown model: {config['model']['name']}")

print(f"Model created: {config['model']['name']}")
print(f"Model parameters: {model.count_parameters():,}")

# Move model to device
model = model.to(device)


### Model Architecture Summary


In [None]:
# Print model architecture
print("Model Architecture:")
print(model)

# Test model with a dummy input
# Use the configured channel count instead of hard-coding 3
dummy_input = torch.randn(
    1, config["model"]["input_channels"], *config["dataset"]["input_size"], device=device
)
with torch.no_grad():
    output = model(dummy_input)
    print(f"\nOutput shape: {output.shape}")
    print(f"Output range: [{output.min():.3f}, {output.max():.3f}]")
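
The output range printed above is not bounded to [0, 1]. Assuming the model returns unnormalized logits (which is what `nn.CrossEntropyLoss` expects), a softmax turns them into class probabilities. The sketch below shows the computation in plain Python on made-up logits; on the real tensor the equivalent call would be `torch.softmax(output, dim=1)`.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


probs = softmax([2.0, 1.0, 0.1])  # illustrative logits for three classes
print([round(p, 3) for p in probs])
print(sum(probs))
```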


## 5. Set Up Training Components


In [None]:
# Create optimizer
optimizer = get_optimizer(
    model=model,
    optimizer_name=config["training"]["optimizer"],
    lr=config["training"]["learning_rate"],
    **config["training"]["optimizer_params"],
)

# Create scheduler
scheduler = get_scheduler(
    optimizer=optimizer,
    scheduler_name=config["training"]["scheduler"]["name"],
    **{k: v for k, v in config["training"]["scheduler"].items() if k != "name"},
)

# Create loss function
criterion = nn.CrossEntropyLoss()

print(f"Optimizer: {optimizer.__class__.__name__}")
print(f"Scheduler: {scheduler.__class__.__name__}")
print(f"Loss function: {criterion.__class__.__name__}")
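
The scheduler type comes from the config and is not shown in this notebook. As an illustration only, assuming a step decay like `torch.optim.lr_scheduler.StepLR`, the learning rate at a given epoch is the base rate multiplied by `gamma` once per completed `step_size` epochs. The `step_size` and `gamma` values below are placeholders, and `step_decay_lr` is a hypothetical helper.

```python
def step_decay_lr(base_lr, epoch, step_size, gamma):
    """Learning rate under step decay: multiply by gamma every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


# Base rate 0.001 as printed in the configuration; other values are placeholders.
for epoch in (0, 9, 10, 25, 49):
    print(epoch, step_decay_lr(0.001, epoch, step_size=10, gamma=0.1))
```

Printing the schedule like this before a long run makes it easy to see whether the rate drops too early for the configured number of epochs.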


## 6. Create Trainer and Start Training


In [None]:
# Create trainer
trainer = Trainer(
    model=model,
    device=device,
    criterion=criterion,
    optimizer=optimizer,
    scheduler=scheduler,
)

print("Trainer initialized successfully!")


### Start Training


In [None]:
# Train the model
# Note: epochs are reduced here for a quick demonstration;
# use config["training"]["epochs"] for full training
num_epochs = 5

history = trainer.train(
    train_loader=train_loader,
    val_loader=val_loader,
    epochs=num_epochs,
    save_best=True,
    save_path=f"{config['logging']['model_save_dir']}/best_model.pth",
)

print("Training completed!")
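
The internals of `Trainer.train` are not shown here. A plausible reading of `save_best=True` is that the checkpoint is written whenever the validation metric improves. The sketch below illustrates that selection logic on validation loss; `best_epoch` is a hypothetical helper and the loss values are made up, not results from this notebook.

```python
def best_epoch(val_losses):
    """Return (epoch_index, loss) for the lowest validation loss; earliest wins ties."""
    best_index, best_loss = None, float("inf")
    for index, loss in enumerate(val_losses):
        if loss < best_loss:
            # A trainer would save the model checkpoint at this point.
            best_index, best_loss = index, loss
    return best_index, best_loss


print(best_epoch([1.92, 1.51, 1.38, 1.41, 1.45]))  # (2, 1.38)
```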


## 7. Visualize Training Results


In [None]:
# Plot training history
trainer.plot_training_history(
    save_path=f"{config['logging']['figure_save_dir']}/training_history.png"
)


## 8. Evaluate Model Performance


In [None]:
# Evaluate on validation set
results = trainer.evaluate(
    test_loader=val_loader, class_names=config["dataset"]["class_names"]
)

print("\nFinal Results:")
print(f"Validation Loss: {results['test_loss']:.4f}")
print(f"Validation Accuracy: {results['test_accuracy']:.4f}")


### Plot Confusion Matrix


In [None]:
# Plot confusion matrix
plot_confusion_matrix(
    y_true=results["targets"],
    y_pred=results["predictions"],
    class_names=config["dataset"]["class_names"],
    figsize=(10, 8),
    save_path=f"{config['logging']['figure_save_dir']}/confusion_matrix.png",
)
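
To read the plot, it helps to know how a confusion matrix is built. A minimal sketch, assuming integer class labels and the common convention of rows for true classes and columns for predicted classes; how `plot_confusion_matrix` computes it internally is not shown in this notebook, and the labels below are made up.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count (true, predicted) pairs; rows are true classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(cm)                    # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
print(per_class_recall(cm))  # [0.5, 1.0, 0.5]
```

Off-diagonal entries show which classes are confused with each other, which overall accuracy alone hides.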


## 9. Visualize Predictions


In [None]:
# Visualize model predictions
visualize_predictions(
    model=model,
    dataloader=val_loader,
    device=device,
    class_names=config["dataset"]["class_names"],
    num_images=config["visualization"]["num_prediction_samples"],
    figsize=(20, 12),
    save_path=f"{config['logging']['figure_save_dir']}/predictions.png",
)


## 10. Feature Visualization (Optional)


In [None]:
# Get a sample image for feature visualization
data_iter = iter(val_loader)
sample_images, sample_labels = next(data_iter)
sample_image = sample_images[0]  # Take first image

# Visualize feature maps from the first convolutional layer
try:
    visualize_feature_maps(
        model=model,
        image=sample_image,
        layer_name="conv1",  # This might need adjustment based on your model
        device=device,
        max_maps=16,
        figsize=(15, 10),
        save_path=f"{config['logging']['figure_save_dir']}/feature_maps.png",
    )
except Exception as e:
    print(f"Feature visualization failed: {e}")
    print("You might need to adjust the layer_name parameter.")


## 11. Save and Load Model


In [None]:
# Save the final model
final_model_path = f"{config['logging']['model_save_dir']}/final_model.pth"
model.save_model(final_model_path)

# Demonstrate loading the model
# Create a new model instance
if config["model"]["name"] == "SimpleCNN":
    loaded_model = SimpleCNN(
        num_classes=config["model"]["num_classes"],
        input_channels=config["model"]["input_channels"],
    )
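elif config["model"]["name"] == "SimpleResNet":
    loaded_model = SimpleResNet(
        num_classes=config["model"]["num_classes"],
        input_channels=config["model"]["input_channels"],
    )
else:
    # Without this branch, loaded_model would be undefined for other names
    raise ValueError(f"Unknown model: {config['model']['name']}")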

# Load the saved weights
loaded_model.load_model(final_model_path, device=str(device))
loaded_model = loaded_model.to(device)

print("Model saved and loaded successfully!")


## 12. Next Steps and Improvements

Congratulations! You've successfully:

- Set up a computer vision project with PyTorch
- Loaded and explored your dataset
- Built and trained a neural network
- Evaluated model performance
- Visualized results

### Potential Improvements:

1. **Data Augmentation**: Experiment with different augmentation techniques
2. **Model Architecture**: Try different architectures (ResNet, EfficientNet, Vision Transformers)
3. **Hyperparameter Tuning**: Use techniques like grid search or Bayesian optimization
4. **Transfer Learning**: Use pre-trained models and fine-tune them
5. **Advanced Techniques**: Add regularization such as mixup, CutMix, or label smoothing
6. **Model Interpretability**: Use techniques like Grad-CAM for better understanding
7. **Deployment**: Convert models to ONNX or TensorRT for production deployment
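
As one concrete example from the list above, label smoothing replaces the one-hot target with a mix of the one-hot vector and a uniform distribution. The sketch below shows the target construction and the resulting loss in plain Python; `smoothed_targets` and `cross_entropy` are hypothetical helpers, and the probabilities are made up. In PyTorch 1.10 and later the same effect is available through `nn.CrossEntropyLoss(label_smoothing=0.1)`.

```python
import math


def smoothed_targets(label, num_classes, smoothing=0.1):
    """Target distribution: (1 - smoothing) * one_hot + smoothing / num_classes."""
    off_value = smoothing / num_classes
    targets = [off_value] * num_classes
    targets[label] += 1.0 - smoothing
    return targets


def cross_entropy(probs, targets):
    """Cross-entropy between predicted probabilities and a target distribution."""
    return -sum(t * math.log(p) for p, t in zip(probs, targets) if t > 0)


targets = smoothed_targets(label=2, num_classes=4, smoothing=0.1)
print(targets)  # approximately [0.025, 0.025, 0.925, 0.025]

# Made-up model probabilities for a 4-class problem.
probs = [0.05, 0.05, 0.85, 0.05]
print(cross_entropy(probs, targets))
```

With smoothing, even a confident correct prediction keeps a non-zero loss, which discourages the model from pushing logits to extremes.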

### Creating More Notebooks:

- `02_advanced_models.ipynb`: Implement advanced architectures
- `03_transfer_learning.ipynb`: Use pre-trained models
- `04_hyperparameter_tuning.ipynb`: Optimize hyperparameters
- `05_model_interpretability.ipynb`: Understand model decisions
- `06_deployment.ipynb`: Deploy models for inference


## Summary

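This notebook walked through an end-to-end image classification workflow: loading the configuration, preparing data, building and training a model, evaluating and visualizing the results, and saving the weights. The modular structure makes it easy to swap out individual components while keeping the code clean and organized.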
