# Multi-Dataset ImageNet Training with ResNet50 - Modular Version

This notebook trains ResNet50 on several ImageNet variants. The code is split into separate modules, so the same pipeline can be rerun on datasets of different size and difficulty by changing a single dataset name.

## Supported Datasets:
- **ImageNette**: 10 classes, 224x224 images (fastest for experiments)
- **Tiny ImageNet**: 200 classes, 64x64 images (medium complexity)
- **ImageNet Mini**: 1000 classes, 224x224 images (subset of full ImageNet)
- **Full ImageNet**: 1000 classes, 224x224 images (full dataset)

## Features:
- **Modular Design**: Separate modules for configuration, data loading, models, and training utilities
- **Multi-Dataset Support**: Easy switching between different ImageNet variants
- **Dataset-Specific Training**: Per-dataset hyperparameters (batch size, epochs, learning rate, optimizer, scheduler)
- **Comprehensive Metrics**: Tracks training and validation metrics
- **Model Saving**: Automatic model checkpointing with dataset-specific naming


In [74]:
# Install required packages
%pip install torchsummary albumentations




In [75]:
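# Step 1: Clone the repo
# Start from /content so that re-running this cell does not clone into a nested ERAv4S9/ERAv4S9/... folder
%cd /content
!git clone https://github.com/nitin-vig/ERAv4S9.git

# Step 2: Move into the repo folder
%cd ERAv4S9

# Step 3: (Optional) List files to verify
!ls -l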

Cloning into 'ERAv4S9'...
remote: Enumerating objects: 35, done.
remote: Counting objects: 100% (35/35), done.
remote: Compressing objects: 100% (27/27), done.
remote: Total 35 (delta 12), reused 31 (delta 8), pack-reused 0 (from 0)
Receiving objects: 100% (35/35), 166.21 KiB | 20.78 MiB/s, done.
Resolving deltas: 100% (12/12), done.
/content/ERAv4S9/ERAv4S9/ERAv4S9/ERAv4S9/ERAv4S9/ERAv4S9/ERAv4S9/ERAv4S9/ERAv4S9/ERAv4S9/ERAv4S9
total 416
-rw-r--r-- 1 root root  20980 Oct 25 22:33 advanced_optimizer_scheduler.py
-rw-r--r-- 1 root root  11707 Oct 25 22:33 config.py
-rw-r--r-- 1 root root  25532 Oct 25 22:33 dataset_loader.py
-rw-r--r-- 1 root root  22334 Oct 25 22:33 enhanced_progressive_training.py
-rw-r--r-- 1 root root   7974 Oct 25 22:33 example_usage.py
-rw-r--r-- 1 root root 231834 Oct 25 22:33 ImageNet_Experiment_Resnet_50.ipynb
-rw-r--r-- 1 root root   6905 Oct 25 22:33 models.py
-rw-r--r-- 1 root root  25541 Oct 25 22:33 progressive_training_strategy.py
-rw-r--r-- 1

In [76]:
# Import all necessary libraries
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import matplotlib.pyplot as plt
import numpy as np
import os
from google.colab import drive

# Import our modular components
from config import Config
from dataset_loader import get_data_loaders, visualize_samples
from models import get_model, count_parameters, get_model_summary, save_model
from training_utils import train_model, evaluate_model, MetricsTracker

print("All imports successful!")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")


All imports successful!
PyTorch version: 2.8.0+cu126
CUDA available: True


## Configuration

Let's configure our training parameters. You can easily switch between datasets and modify training parameters here.


In [77]:
# Configuration setup for multi-dataset experiments
# You can easily switch between datasets by changing DATASET_NAME

# Dataset configuration
DATASET_NAME = "imagenette"  # Options: "imagenette", "tiny_imagenet", "imagenet_mini", "imagenet"
USE_PRETRAINED = False  # Custom implementation without pretrained weights

# Update configuration for the selected dataset
Config.update_for_dataset(DATASET_NAME)

print("Configuration updated!")
print(f"Dataset: {DATASET_NAME}")
print(f"Image size: {Config.IMAGE_SIZE}")
print(f"Number of classes: {Config.NUM_CLASSES}")
print(f"Batch size: {Config.BATCH_SIZE}")
print(f"Epochs: {Config.NUM_EPOCHS}")
print(f"Learning rate: {Config.LEARNING_RATE}")
print(f"Use pretrained: {USE_PRETRAINED}")

# Display dataset-specific training parameters
dataset_config = Config.get_dataset_config()
print(f"\nDataset-specific parameters:")
print(f"Optimizer: {dataset_config['optimizer']}")
print(f"Scheduler: {dataset_config['scheduler']}")


Configuration updated for imagenette
Image size: 224
Number of classes: 10
Batch size: 64
Epochs: 30
Learning rate: 0.001
Configuration updated!
Dataset: imagenette
Image size: 224
Number of classes: 10
Batch size: 64
Epochs: 30
Learning rate: 0.001
Use pretrained: False

Dataset-specific parameters:
Optimizer: adamw
Scheduler: reduce_lr
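
The dataset switch above relies on one preset per dataset. The repo's `config.py` is not shown here, so the following is only a minimal sketch of how such a preset table could look. `DatasetPreset`, `PRESETS`, and `get_preset` are hypothetical names, not the notebook's API. The ImageNette values are taken from the output above; for Tiny ImageNet only the image size, class count, and approximate epoch count come from this notebook, and the remaining fields are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetPreset:
    """Hyperparameters that change together when the dataset changes."""
    image_size: int
    num_classes: int
    batch_size: int
    epochs: int
    lr: float
    optimizer: str
    scheduler: str


PRESETS = {
    # Values as printed by the configuration cell for ImageNette.
    "imagenette": DatasetPreset(
        image_size=224, num_classes=10, batch_size=64, epochs=30,
        lr=1e-3, optimizer="adamw", scheduler="reduce_lr",
    ),
    # image_size, num_classes and epochs follow the notebook's dataset notes;
    # batch_size, lr, optimizer and scheduler are placeholders for illustration.
    "tiny_imagenet": DatasetPreset(
        image_size=64, num_classes=200, batch_size=128, epochs=50,
        lr=1e-3, optimizer="adamw", scheduler="reduce_lr",
    ),
}


def get_preset(name: str) -> DatasetPreset:
    """Look up a preset and fail loudly on an unknown dataset name."""
    try:
        return PRESETS[name]
    except KeyError:
        raise ValueError(
            f"Unknown dataset {name!r}; choose from {sorted(PRESETS)}"
        ) from None


preset = get_preset("imagenette")
print(preset.num_classes, preset.batch_size, preset.lr)
```

Keeping the presets immutable (`frozen=True`) means a later cell cannot silently change a value that an earlier cell already printed.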


## Environment Setup

Set up the environment and check GPU availability. For full ImageNet training, you'll need significant computational resources.


In [78]:
# Set up environment
def setup_environment():
    """Create the data and model output directories used for training."""
    print("Setting up environment for ImageNet training...")

    # Create necessary directories
    os.makedirs(Config.DATA_ROOT, exist_ok=True)
    os.makedirs(Config.SAVE_MODEL_PATH, exist_ok=True)

    print("Environment setup complete!")

def check_gpu_availability():
    """Report GPU availability and return the device to train on."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"Using device: {device}")

    if torch.cuda.is_available():
        print(f"GPU: {torch.cuda.get_device_name(0)}")
        print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
        print("Warning: Full ImageNet training requires significant GPU memory!")
    else:
        print("Warning: CPU training will be very slow for ImageNet!")

    return device

# Run setup
setup_environment()
device = check_gpu_availability()

# Set random seed for reproducibility
torch.manual_seed(42)
if torch.cuda.is_available():
    torch.cuda.manual_seed(42)


Setting up environment for ImageNet training...
Environment setup complete!
Using device: cuda
GPU: NVIDIA L4
GPU Memory: 23.8 GB
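
The setup cell seeds only PyTorch. Data augmentation and shuffling often draw from Python's `random` module and NumPy as well, so a fuller seeding helper may be useful. The sketch below is an assumption about what such a helper could look like, not part of the repo. `seed_everything` is a hypothetical name, and the NumPy and PyTorch parts are skipped when those packages are not installed.

```python
import random


def seed_everything(seed: int = 42, deterministic: bool = False) -> None:
    """Seed every random number generator the training pipeline may touch."""
    random.seed(seed)

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed

    try:
        import torch
    except ImportError:
        return  # PyTorch not installed; nothing more to seed

    torch.manual_seed(seed)
    # Safe to call without a GPU; it is ignored when CUDA is unavailable.
    torch.cuda.manual_seed_all(seed)

    if deterministic:
        # Trades some speed for repeatable cuDNN convolution results.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False


seed_everything(42)
```

Even with all seeds fixed, some GPU kernels are nondeterministic, so small run-to-run differences can remain unless `deterministic=True` is set.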


In [79]:
# 🧩 Colab Dataset Setup Cell
# Copy-paste this cell into your Colab notebook

import os
import subprocess

# Install required packages (if not already installed)
!pip install -q torch torchvision albumentations tqdm requests

DATA_DIR = "/content/data"
IMAGENETTE_DIR = os.path.join(DATA_DIR, "imagenette2")
TINY_IMAGENET_DIR = os.path.join(DATA_DIR, "tiny-imagenet-200")

os.makedirs(DATA_DIR, exist_ok=True)

def run_cmd(cmd):
    """Helper to run shell commands cleanly."""
    result = subprocess.run(cmd, shell=True, check=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    if result.returncode != 0:
        print(f"⚠️ Warning: {cmd} failed with error:\n{result.stderr}")
    return result

# ------------------------------
# 🔹 Download ImageNette
# ------------------------------
if not os.path.exists(IMAGENETTE_DIR) or len(os.listdir(IMAGENETTE_DIR)) == 0:
    print("🔄 Downloading ImageNette...")
    run_cmd("wget -q https://s3.amazonaws.com/fast-ai-imageclas/imagenette2.tgz")
    run_cmd("tar -xzf imagenette2.tgz")
    run_cmd(f"mv imagenette2 {DATA_DIR}/")
    run_cmd("rm imagenette2.tgz")
    print("✅ ImageNette downloaded!")
else:
    print(f"✅ Skipping ImageNette — already exists at {IMAGENETTE_DIR}")

# ------------------------------
# 🔹 Download Tiny ImageNet
# ------------------------------
if not os.path.exists(TINY_IMAGENET_DIR) or len(os.listdir(TINY_IMAGENET_DIR)) == 0:
    print("🔄 Downloading Tiny ImageNet...")
    run_cmd("wget -q http://cs231n.stanford.edu/tiny-imagenet-200.zip")
    run_cmd("unzip -q tiny-imagenet-200.zip")
    run_cmd(f"mv tiny-imagenet-200 {DATA_DIR}/")
    run_cmd("rm tiny-imagenet-200.zip")
    print("✅ Tiny ImageNet downloaded!")
else:
    print(f"✅ Skipping Tiny ImageNet — already exists at {TINY_IMAGENET_DIR}")

# ------------------------------
# 🔹 Verify
# ------------------------------
print("\n📁 Dataset verification:")
!ls -la {DATA_DIR}
!du -sh {DATA_DIR}/* | sort -h

print("\n🎉 Datasets ready! You can now use:")
print("from dataset_loader import get_data_loaders")
print("train_loader, test_loader = get_data_loaders('imagenette')")


✅ Skipping ImageNette — already exists at /content/data/imagenette2
✅ Skipping Tiny ImageNet — already exists at /content/data/tiny-imagenet-200

📁 Dataset verification:
total 16
drwxr-xr-x 4 root root  4096 Oct 25 22:25 .
drwxr-xr-x 1 root root  4096 Oct 25 22:21 ..
drwxr-xr-x 4  501 staff 4096 Feb  6  2021 imagenette2
drwxrwxr-x 5 root root  4096 Feb  9  2015 tiny-imagenet-200
481M	/content/data/tiny-imagenet-200
1.5G	/content/data/imagenette2

🎉 Datasets ready! You can now use:
from dataloader import get_data_loaders
train_loader, test_loader = get_data_loaders('imagenette')
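
The download cell shells out to `wget`, `tar`, `unzip`, and `mv`. As an alternative, the same "skip if present, otherwise download and unpack" logic can be written with the standard library only. This is a sketch under that assumption. `fetch_and_extract` is a hypothetical helper, not something the repo provides.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path


def fetch_and_extract(url: str, dest_root: str, folder_name: str) -> Path:
    """Download an archive and unpack it under dest_root.

    Does nothing if dest_root/folder_name already exists and is non-empty.
    """
    target = Path(dest_root) / folder_name
    if target.is_dir() and any(target.iterdir()):
        print(f"Skipping download, found existing data at {target}")
        return target

    Path(dest_root).mkdir(parents=True, exist_ok=True)
    archive_name = url.rsplit("/", 1)[-1]

    with tempfile.TemporaryDirectory() as tmp:
        archive_path = Path(tmp) / archive_name
        urllib.request.urlretrieve(url, str(archive_path))
        # unpack_archive infers the format (.tgz, .tar.gz, .zip) from the extension.
        shutil.unpack_archive(str(archive_path), dest_root)

    if not target.is_dir():
        raise FileNotFoundError(
            f"Expected {target} after extracting {archive_name}"
        )
    return target
```

With the URLs from the cell above, a call would look like `fetch_and_extract("https://s3.amazonaws.com/fast-ai-imageclas/imagenette2.tgz", "/content/data", "imagenette2")`. Unlike `run_cmd`, a failed download raises an exception instead of printing a warning and continuing.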


## Data Loading

Load the dataset and visualize some sample images.


In [80]:
# Load dataset
print("Loading dataset...")
train_loader, test_loader = get_data_loaders(DATASET_NAME)

# Visualize some samples
print("\nVisualizing sample images...")
visualize_samples(train_loader, num_samples=12)


Loading dataset...
ImageNette dataset download instructions:
1. Download from: https://s3.amazonaws.com/fast-ai-imageclas/imagenette2.tgz
2. Extract to ./data/imagenette2/
3. Ensure folder structure:
   imagenette2/
   ├── train/
   │   ├── n01440764/
   │   └── ...
   └── val/
       ├── n01440764/
       └── ...


ValueError: num_samples should be a positive integer value, but got num_samples=0
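
PyTorch's random sampler raises this `ValueError` when it is handed an empty dataset, which suggests the loader found no images where it looked. One plausible cause, given the output above, is a path mismatch: the loader's instructions refer to `./data/imagenette2/`, while the setup cell stored the data under `/content/data`. Before building loaders, a quick directory check can confirm what is actually on disk. The sketch below assumes the class-per-folder layout printed above. `summarize_image_folder` is a hypothetical helper.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def summarize_image_folder(root: str, splits=("train", "val")) -> dict:
    """Return {split: (num_class_dirs, num_images)} for a class-per-folder tree."""
    summary = {}
    for split in splits:
        split_dir = Path(root) / split
        if not split_dir.is_dir():
            summary[split] = (0, 0)  # missing split counts as empty
            continue
        class_dirs = [d for d in split_dir.iterdir() if d.is_dir()]
        num_images = sum(
            1
            for d in class_dirs
            for f in d.rglob("*")
            if f.suffix.lower() in IMAGE_EXTENSIONS
        )
        summary[split] = (len(class_dirs), num_images)
    return summary


# Compare the two candidate locations mentioned in the outputs above.
for candidate in ("./data/imagenette2", "/content/data/imagenette2"):
    print(candidate, summarize_image_folder(candidate))
```

A result of `(0, 0)` for a split means the loader would see no samples there. Note that Tiny ImageNet ships its `val` split as a single `images` folder plus an annotations file, so this check would report one "class" directory for it.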

## Model Creation

Create the ResNet50 model and display its architecture.


In [None]:
# Create model
print(f"Creating {Config.MODEL_NAME} model...")
model = get_model(
    model_name=Config.MODEL_NAME,
    dataset_name=DATASET_NAME,
    pretrained=USE_PRETRAINED
)

# Move model to device
model = model.to(device)

# Print model info
print(f"Model parameters: {count_parameters(model):,}")

# Get model summary
dataset_config = Config.get_dataset_config()
input_size = (3, dataset_config["image_size"], dataset_config["image_size"])
print(f"\nModel summary (input size: {input_size}):")
get_model_summary(model, input_size=input_size)
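
As a sanity check for the number printed by `count_parameters`, the expected size can be worked out by hand. The sketch below assumes the standard torchvision ResNet50 layout, where only the final fully connected layer (2048 inputs) depends on the number of classes. The notebook's custom model may differ, for example if its stem is adapted for 64x64 inputs, so treat the result as a reference point rather than a required value. `expected_resnet50_params` is a hypothetical helper.

```python
# Parameters of the standard torchvision ResNet50 excluding its final fc layer.
RESNET50_BACKBONE_PARAMS = 23_508_032
# Width of the feature vector fed into the final fc layer.
RESNET50_FEATURE_DIM = 2048


def expected_resnet50_params(num_classes: int) -> int:
    """Backbone parameters plus the weights and biases of the classifier head."""
    fc_params = RESNET50_FEATURE_DIM * num_classes + num_classes
    return RESNET50_BACKBONE_PARAMS + fc_params


for name, num_classes in [("imagenette", 10), ("tiny_imagenet", 200), ("imagenet", 1000)]:
    print(f"{name}: {expected_resnet50_params(num_classes):,} parameters")
```

Under these assumptions the 10-class model has 23,528,522 parameters and the 1000-class model has 25,557,032, so almost all of the capacity sits in the backbone regardless of dataset.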


## Training

Train the model using our modular training utilities.


In [None]:
# Train model
stage_config = Config.STAGES[DATASET_NAME]
NUM_EPOCHS = stage_config["epochs"]
BATCH_SIZE = stage_config["batch_size"]
LEARNING_RATE = stage_config["lr"]
WEIGHT_DECAY = stage_config["weight_decay"]

print(f"Starting training for {NUM_EPOCHS} epochs...")
print(f"Batch size: {BATCH_SIZE}")
print(f"Learning rate: {LEARNING_RATE}")
print(f"Weight decay: {WEIGHT_DECAY}")

metrics_tracker = train_model(model, train_loader, test_loader, device, Config)
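
`train_model` returns a metrics tracker whose implementation lives in `training_utils.py` and is not shown in this notebook. To illustrate the idea, here is a minimal sketch of a per-epoch tracker. `SimpleMetricsTracker` and its methods are hypothetical and are not the repo's `MetricsTracker` API.

```python
from collections import defaultdict


class SimpleMetricsTracker:
    """Stores one value per metric per epoch and reports the best epoch."""

    def __init__(self):
        self.history = defaultdict(list)

    def update(self, **metrics: float) -> None:
        """Record the metrics of one finished epoch."""
        for name, value in metrics.items():
            self.history[name].append(float(value))

    def best_epoch(self, metric: str = "val_acc", mode: str = "max"):
        """Return (1-based epoch number, value) of the best recorded epoch."""
        values = self.history.get(metric, [])
        if not values:
            raise ValueError(f"No values recorded for {metric!r}")
        pick = max if mode == "max" else min
        index = pick(range(len(values)), key=values.__getitem__)
        return index + 1, values[index]


# Illustrative numbers only, not results from this notebook.
tracker = SimpleMetricsTracker()
tracker.update(train_loss=2.1, val_loss=2.0, val_acc=30.0)
tracker.update(train_loss=1.6, val_loss=1.7, val_acc=45.0)
tracker.update(train_loss=1.2, val_loss=1.8, val_acc=43.0)
print(tracker.best_epoch("val_acc"))
print(tracker.best_epoch("val_loss", mode="min"))
```

Tracking the best epoch separately from the last epoch matters when validation accuracy peaks before training ends, as in the made-up numbers above.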


## Results Visualization

Plot the training metrics and evaluate the final model.


In [None]:
# Plot training metrics
print("Plotting training metrics...")
metrics_tracker.plot_metrics(save_path=f"{Config.SAVE_MODEL_PATH}/training_metrics.png")

# Final evaluation
print("\nFinal evaluation...")
test_loss, test_acc, test_top5_acc = evaluate_model(model, test_loader, device)

print(f"\nTraining completed!")
print(f"Final Test Accuracy: {test_acc:.2f}%")
print(f"Final Top-5 Accuracy: {test_top5_acc:.2f}%")
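
The evaluation reports both top-1 and top-5 accuracy. How `evaluate_model` computes them is not shown here; the definition itself is simple, though, and the plain-Python sketch below spells it out so it can be checked by hand. `top_k_accuracy` is a hypothetical helper that works on nested lists of scores rather than tensors.

```python
def top_k_accuracy(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score, truncated to k.
        top_k = sorted(range(len(row)), key=row.__getitem__, reverse=True)[:k]
        hits += label in top_k
    return 100.0 * hits / len(labels)


# Three samples, three classes; made-up scores for illustration.
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
labels = [1, 1, 0]
print(f"top-1: {top_k_accuracy(scores, labels, k=1):.2f}%")
print(f"top-2: {top_k_accuracy(scores, labels, k=2):.2f}%")
```

On ImageNette, with only 10 classes, top-5 accuracy is a lenient metric: a model guessing at random would already reach about 50%, so top-1 is the more informative number there.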


## Model Saving

Save the trained model to local storage and, if `Config.MOUNT_DRIVE` is set, to Google Drive as well.


In [None]:
# Save final model
final_model_path = f"{Config.SAVE_MODEL_PATH}/final_model.pth"
save_model(model, final_model_path, epoch=NUM_EPOCHS, loss=test_loss)

# Save to Google Drive if mounted
if Config.MOUNT_DRIVE:
    drive_model_path = f"{Config.DRIVE_MODEL_PATH}/final_model.pth"
    save_model(model, drive_model_path, epoch=NUM_EPOCHS, loss=test_loss)
    print(f"Model also saved to Google Drive: {drive_model_path}")

print("Model saving completed!")


## Multi-Dataset Experiments

You can switch datasets by changing `DATASET_NAME` and calling `Config.update_for_dataset` again. Each dataset has its own hyperparameter preset.


In [83]:
# Multi-dataset experiment examples
# Uncomment the dataset you want to experiment with

# Example 1: ImageNette (fastest, 10 classes)
DATASET_NAME = "imagenette"
Config.update_for_dataset(DATASET_NAME)

# Example 2: Tiny ImageNet (medium complexity, 200 classes)
# DATASET_NAME = "tiny_imagenet"
# Config.update_for_dataset(DATASET_NAME)

# Example 3: ImageNet Mini (1000 classes subset)
# DATASET_NAME = "imagenet_mini"
# Config.update_for_dataset(DATASET_NAME)

# Example 4: Full ImageNet (1000 classes, full dataset)
# DATASET_NAME = "imagenet"
# Config.update_for_dataset(DATASET_NAME)

print("To experiment with different datasets:")
print("1. Uncomment the dataset you want to use")
print("2. Run this cell to update configuration")
print("3. Continue with the rest of the notebook")
print("\nDataset comparison:")
print("- ImageNette: ~13k images, 10 classes, ~30 epochs")
print("- Tiny ImageNet: ~100k images, 200 classes, ~50 epochs")
print("- ImageNet Mini: ~100k images, 1000 classes, ~50 epochs")
print("- Full ImageNet: ~1.2M images, 1000 classes, ~90 epochs")


NameError: name 'config' is not defined



# Tiny ImageNet with ResNet50

In [None]:
# import torch
# from torch import nn, optim
# from torch_lr_finder import LRFinder
# import matplotlib.pyplot as plt

# # 3. Instantiate model, optimizer, and criterion
# device = "cuda" if torch.cuda.is_available() else "cpu"
# # Assuming 'model' is already defined and on the correct device
# # e.g., model = resnet34_cifar().to(device)
# optimizer = optim.Adam(model.parameters(), lr=1e-7)  # Start with a very small LR
# criterion = nn.CrossEntropyLoss()

# # 4. Run the LR finder
# lr_finder = LRFinder(model, optimizer, criterion, device=device)
# lr_finder.range_test(train_loader, end_lr=0.1, num_iter=300)
# # 5. Plot the results
# lr_finder.plot()

# # 6. Reset the model and optimizer to their initial states
# lr_finder.reset()
# # LRs noted from earlier range tests:
# #   batch size 128: 4.32e-03
# #   batch size 256: 9.05e-03
# #   batch size 512: 7.88e-03
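
The range test in the commented cell sweeps the learning rate from a very small value up to `end_lr` and records the loss at each step. A common heuristic then picks the learning rate where the loss falls fastest. The sketch below is a simplified, dependency-free version of that idea and is not the `torch_lr_finder` implementation. `exponential_lr_schedule` and `suggest_lr` are hypothetical names, and the loss values are made up.

```python
def exponential_lr_schedule(start_lr, end_lr, num_iter):
    """Learning rates spaced evenly on a log scale from start_lr to end_lr."""
    if num_iter < 2:
        raise ValueError("num_iter must be at least 2")
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_iter - 1)) for i in range(num_iter)]


def suggest_lr(lrs, losses):
    """Return the learning rate at which the loss decreases most steeply."""
    if len(lrs) != len(losses) or len(lrs) < 3:
        raise ValueError("need at least three matching (lr, loss) points")
    # Central difference of the loss with respect to the step index.
    slopes = [(losses[i + 1] - losses[i - 1]) / 2 for i in range(1, len(losses) - 1)]
    steepest = min(range(len(slopes)), key=slopes.__getitem__) + 1
    return lrs[steepest]


# Same sweep range as the commented cell (1e-7 to 0.1), but only 7 steps.
lrs = exponential_lr_schedule(1e-7, 0.1, 7)
losses = [2.3, 2.3, 2.2, 1.8, 1.0, 0.9, 3.0]  # made-up values for illustration
print(f"Suggested LR: {suggest_lr(lrs, losses):.2e}")
```

Real range-test curves are noisy, so in practice the losses are usually smoothed before taking the slope, and the suggestion is treated as a starting point rather than a final value.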
