# Tiny ImageNet Training with ResNet18

This notebook trains a ResNet18 model on the Tiny ImageNet dataset (200 classes, 64x64 images).

**Requirements:**
- Enable GPU runtime: Runtime → Change runtime type → GPU (T4 recommended)

**Note:** The dataset and trained models are stored on the Colab VM's local disk and are deleted when the runtime disconnects. Download your trained models before ending the session!

## 1. Environment Setup & GPU Check

In [None]:
import torch
import os

# Check GPU availability
if torch.cuda.is_available():
    gpu_name = torch.cuda.get_device_name(0)
    gpu_memory = torch.cuda.get_device_properties(0).total_memory / 1e9
    print(f"✓ GPU Runtime Enabled")
    print(f"  Device: {gpu_name}")
    print(f"  Memory: {gpu_memory:.1f} GB")
    print(f"  CUDA Version: {torch.version.cuda}")
else:
    print("⚠️  WARNING: No GPU detected!")
    print("   Training will be VERY slow on CPU.")
    print("   Please enable GPU: Runtime → Change runtime type → GPU")
    print("\n   Later cells will still run, but training may take several hours.")

## 2. Clone Repository

Clone the training code from GitHub.

In [None]:
# Clone repository
REPO_URL = 'https://github.com/abhi1021/resnet50-imagenet-1k'
REPO_DIR = '/content/resnet50-imagenet-1k'

if os.path.exists(REPO_DIR):
    print("Repository already cloned, pulling latest changes...")
    !cd {REPO_DIR} && git pull
else:
    print(f"Cloning repository from {REPO_URL}...")
    !git clone {REPO_URL} {REPO_DIR}

# Change to repository directory
%cd {REPO_DIR}

print(f"\n✓ Repository ready at {REPO_DIR}")

## 3. Download Tiny ImageNet Dataset

Downloads the dataset to local storage (~237 MB compressed, ~500 MB extracted).

**Note:** The dataset will be deleted when the Colab runtime disconnects. It will need to be re-downloaded in new sessions.
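
The download cell skips the download whenever the zip file is already on disk, so an interrupted `wget` could leave a truncated archive that only fails at extraction time. A minimal sketch of an integrity check using only the standard library follows; `is_valid_zip` is a hypothetical helper, not part of the repository.

```python
import zipfile
from pathlib import Path


def is_valid_zip(zip_path):
    """Return True if zip_path is a readable zip whose members pass their CRC checks."""
    path = Path(zip_path)
    if not path.is_file() or not zipfile.is_zipfile(str(path)):
        return False
    try:
        with zipfile.ZipFile(str(path), "r") as archive:
            # testzip() returns the name of the first corrupt member, or None if all are fine
            return archive.testzip() is None
    except zipfile.BadZipFile:
        return False
```

One way to use it would be to delete the zip and download again whenever `is_valid_zip(DATASET_ZIP)` returns `False`.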

In [None]:
import os
import zipfile
from pathlib import Path

DATASET_URL = 'http://cs231n.stanford.edu/tiny-imagenet-200.zip'
DATA_DIR = '/content/data'
DATASET_ZIP = os.path.join(DATA_DIR, 'tiny-imagenet-200.zip')
DATASET_DIR = os.path.join(DATA_DIR, 'tiny-imagenet-200')
TRAIN_DIR = os.path.join(DATASET_DIR, 'train')
VAL_DIR = os.path.join(DATASET_DIR, 'val')

# Create data directory
os.makedirs(DATA_DIR, exist_ok=True)

import shutil  # used to remove an incomplete dataset directory

# Check if dataset already exists.
# Reset the flag on every run so a stale value from an earlier execution is not reused.
needs_download = False
if os.path.exists(TRAIN_DIR) and os.path.exists(VAL_DIR):
    num_train_classes = len([d for d in os.listdir(TRAIN_DIR) if os.path.isdir(os.path.join(TRAIN_DIR, d))])
    num_val_images = len(list(Path(VAL_DIR).rglob('*.JPEG')))
    
    if num_train_classes == 200 and num_val_images > 0:
        print("✓ Dataset already exists in local storage")
        print(f"  Train classes: {num_train_classes}")
        print(f"  Val images: {num_val_images}")
        print("  Skipping download...\n")
    else:
        print("⚠️  Dataset incomplete, re-downloading...")
        shutil.rmtree(DATASET_DIR, ignore_errors=True)
        needs_download = True
else:
    print("Dataset not found. Downloading...")
    needs_download = True

if needs_download:
    # Check if zip file already exists
    if os.path.exists(DATASET_ZIP):
        zip_size_mb = os.path.getsize(DATASET_ZIP) / (1024 * 1024)
        print(f"✓ Zip file already exists: {DATASET_ZIP}")
        print(f"  Size: {zip_size_mb:.1f} MB")
        print("  Skipping download, proceeding to extraction...\n")
    else:
        print(f"Downloading Tiny ImageNet from {DATASET_URL}")
        print("This may take 2-5 minutes...\n")
        
        # Download
        !wget -q --show-progress {DATASET_URL} -O {DATASET_ZIP}
        print("\n✓ Download complete")
    
    print("\nExtracting dataset to local storage...")
    with zipfile.ZipFile(DATASET_ZIP, 'r') as zip_ref:
        zip_ref.extractall(DATA_DIR)
    
    # Clean up zip file to save space
    os.remove(DATASET_ZIP)
    
    print("✓ Dataset extracted and ready")
    print(f"  Location: {DATASET_DIR}")

# Summarize the expected dataset layout (published Tiny ImageNet figures, not measured here)
print("\n📊 Dataset Information:")
print(f"  Train directory: {TRAIN_DIR}")
print(f"  Val directory: {VAL_DIR}")
print("  Number of classes: 200")
print("  Image size: 64x64")
print("  Train images per class: 500")
print("  Validation images: 10,000")
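
The figures printed above are hard-coded. To confirm that the extracted files match them, the directories can be counted directly. The sketch below assumes the layout used by the cell above (a `train` and a `val` folder containing `.JPEG` files); `summarize_dataset` is a hypothetical helper.

```python
from pathlib import Path


def summarize_dataset(dataset_dir):
    """Count class folders and JPEG files under train/ and val/."""
    root = Path(dataset_dir)
    train_dir = root / "train"
    val_dir = root / "val"
    # Each immediate subdirectory of train/ is treated as one class
    class_dirs = [d for d in train_dir.iterdir() if d.is_dir()] if train_dir.is_dir() else []
    return {
        "train_classes": len(class_dirs),
        "train_images": sum(1 for _ in train_dir.rglob("*.JPEG")) if train_dir.is_dir() else 0,
        "val_images": sum(1 for _ in val_dir.rglob("*.JPEG")) if val_dir.is_dir() else 0,
    }
```

If the figures above hold, `summarize_dataset(DATASET_DIR)` should report 200 classes, 100,000 training images (200 × 500), and 10,000 validation images.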

## 4. Install Dependencies

Install required Python packages for training.

In [None]:
# Install dependencies from requirements.txt using pip
print("Installing dependencies from requirements.txt...\n")

!pip install -r requirements.txt

print("\n✓ All dependencies installed")

# Verify installation
import torch
import torchvision
import albumentations as A
print(f"  PyTorch version: {torch.__version__}")
print(f"  Torchvision version: {torchvision.__version__}")
print(f"  Albumentations version: {A.__version__}")
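
`torch.__version__` reports the module already loaded in the session, which may differ from what `pip` just installed if the package was imported in an earlier cell. As a hedged alternative, installed versions can be read from package metadata with the standard library. The distribution names below are assumed to match the entries in `requirements.txt`, and `installed_versions` is a hypothetical helper.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(["torch", "torchvision", "albumentations"]).items():
    print(f"  {name}: {version or 'NOT INSTALLED'}")
```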

## 5. Train the Model

Train ResNet18 on Tiny ImageNet with Colab-optimized parameters.

**Training Parameters:**
- Model: ResNet18 (200 classes)
- Epochs: 20
- Batch size: 256
- Image size: 64x64
- Optimizer: SGD (lr=0.1, momentum=0.9, weight_decay=5e-4)

**Expected training time:**
- With GPU (T4): ~30-40 minutes
- With CPU: ~8-12 hours (not recommended)
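
The arithmetic behind these numbers can be sketched as follows. It assumes 500 training images per class, that the final partial batch of each epoch is kept, and that validation time is negligible, so the throughput values are rough estimates only.

```python
import math

NUM_CLASSES = 200
TRAIN_IMAGES_PER_CLASS = 500
BATCH_SIZE = 256
EPOCHS = 20

train_images = NUM_CLASSES * TRAIN_IMAGES_PER_CLASS      # 100,000 images
steps_per_epoch = math.ceil(train_images / BATCH_SIZE)   # 391 optimizer steps per epoch
total_steps = steps_per_epoch * EPOCHS                   # 7,820 steps overall


def images_per_second(total_minutes):
    """Average training throughput implied by a total wall-clock time."""
    return train_images * EPOCHS / (total_minutes * 60)


print(f"Steps per epoch: {steps_per_epoch}, total steps: {total_steps}")
print(f"Implied throughput at 30 min: {images_per_second(30):.0f} images/s")
print(f"Implied throughput at 40 min: {images_per_second(40):.0f} images/s")
```

Under these assumptions, the 30-40 minute estimate corresponds to roughly 830 to 1,100 images per second on the GPU.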

In [None]:
# Set model directory
MODEL_DIR = '/content/saved_model'
os.makedirs(MODEL_DIR, exist_ok=True)

# Run training script with optimized parameters
!python neural_network_analysis/train.py \
    --train-dir {TRAIN_DIR} \
    --val-dir {VAL_DIR} \
    --model-dir {MODEL_DIR} \
    --batch-size 256 \
    --img-size 64 \
    --num-workers 2 \
    --epochs 20 \
    --num-classes 200
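
Outside a notebook, the `!python` magic is not available. A minimal sketch of building the same command as an argument list is shown below. It uses only the flags from the cell above; `build_train_command` is a hypothetical helper and nothing here is taken from the repository's own code.

```python
import sys


def build_train_command(train_dir, val_dir, model_dir, batch_size=256,
                        img_size=64, num_workers=2, epochs=20, num_classes=200):
    """Return the training invocation as a list of arguments for subprocess.run."""
    return [
        sys.executable, "neural_network_analysis/train.py",
        "--train-dir", str(train_dir),
        "--val-dir", str(val_dir),
        "--model-dir", str(model_dir),
        "--batch-size", str(batch_size),
        "--img-size", str(img_size),
        "--num-workers", str(num_workers),
        "--epochs", str(epochs),
        "--num-classes", str(num_classes),
    ]


# To launch training from the repository root (not executed here):
#   import subprocess
#   subprocess.run(build_train_command(TRAIN_DIR, VAL_DIR, MODEL_DIR), check=True)
```

Passing `check=True` makes a failed training run raise an exception, so later cells do not continue silently.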

## 6. Results & Trained Model

The trained model has been saved to local storage.

**⚠️ IMPORTANT:** Models are stored locally and will be deleted when the runtime disconnects. Download them now!

In [None]:
import os

MODEL_DIR = '/content/saved_model'

print("📁 Trained Model Location:")
print(f"  {MODEL_DIR}")
print("\nSaved files:")

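# MODEL_DIR is created before training, so check for actual files rather than the folder
saved_files = []
if os.path.isdir(MODEL_DIR):
    saved_files = sorted(f for f in os.listdir(MODEL_DIR) if os.path.isfile(os.path.join(MODEL_DIR, f)))

if saved_files:
    for file in saved_files:
        file_path = os.path.join(MODEL_DIR, file)
        size_mb = os.path.getsize(file_path) / (1024 * 1024)
        print(f"  - {file} ({size_mb:.2f} MB)")
else:
    print("  No models found. Training may have failed.")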

print("\n💡 How to Download Models:")
print("  1. Click the folder icon on the left sidebar")
print("  2. Navigate to /content/saved_model/")
print("  3. Right-click on the .pth file and select 'Download'")
print("\n⚠️  WARNING: Models will be deleted when runtime disconnects!")
print("   Download them before ending your session.")
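
Because the saved files disappear with the runtime, one option is to copy them to a location that outlives the session. The sketch below copies checkpoints into a backup folder. `backup_checkpoints` is a hypothetical helper, and the destination is an assumption: in Colab it could be a folder under a mounted Google Drive (for example after `google.colab.drive.mount('/content/drive')`).

```python
import shutil
from pathlib import Path


def backup_checkpoints(model_dir, backup_dir, pattern="*.pth"):
    """Copy checkpoint files matching pattern to backup_dir and return the copied paths."""
    src = Path(model_dir)
    dst = Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for checkpoint in sorted(src.glob(pattern)):
        target = dst / checkpoint.name
        shutil.copy2(checkpoint, target)  # copy2 also preserves modification times
        copied.append(target)
    return copied
```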

## Optional: Load and Test the Model

In [None]:
import torch
import os

MODEL_DIR = '/content/saved_model'

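# Pick a checkpoint to inspect (first .pth file in sorted order; not necessarily the best one)
model_files = sorted(f for f in os.listdir(MODEL_DIR) if f.endswith('.pth'))

if model_files:
    # Load the checkpoint onto the CPU so this also works without a GPU
    checkpoint_path = os.path.join(MODEL_DIR, model_files[0])
    checkpoint = torch.load(checkpoint_path, map_location='cpu')
    
    # Format the accuracy only when it is present; ':.2f' fails on the 'N/A' fallback
    best_acc = checkpoint.get('best_acc')
    acc_text = f"{float(best_acc):.2f}%" if best_acc is not None else "N/A"
    
    print(f"✓ Loaded model: {model_files[0]}")
    print(f"  Epoch: {checkpoint.get('epoch', 'N/A')}")
    print(f"  Best Accuracy: {acc_text}")
else:
    print("No model checkpoints found.")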
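
The cell above loads the first `.pth` file it finds, which is not necessarily the newest checkpoint. If several checkpoints are present, one possible selection rule is modification time. This is a sketch only: `latest_checkpoint` is a hypothetical helper, and the newest file is not guaranteed to be the most accurate one.

```python
from pathlib import Path


def latest_checkpoint(model_dir, pattern="*.pth"):
    """Return the most recently modified checkpoint path, or None if there is none."""
    candidates = list(Path(model_dir).glob(pattern))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```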

## Optional: Download Model Using Code

Alternative method to download the trained model programmatically.

In [None]:
from google.colab import files
import os

MODEL_DIR = '/content/saved_model'

# Download all .pth files
model_files = [f for f in os.listdir(MODEL_DIR) if f.endswith('.pth')]

if model_files:
    print("Downloading trained models...\n")
    for model_file in model_files:
        model_path = os.path.join(MODEL_DIR, model_file)
        print(f"Downloading: {model_file}")
        files.download(model_path)
    print("\n✓ All models downloaded!")
else:
    print("No model files found to download.")
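
Downloading each file separately triggers one browser prompt per file. A hedged alternative is to bundle the model directory into a single archive first. `bundle_models` is a hypothetical helper, and the archive should be written outside the model directory so it does not include itself.

```python
import shutil
from pathlib import Path


def bundle_models(model_dir, archive_base):
    """Zip the contents of model_dir into archive_base + '.zip' and return the archive path."""
    model_dir = Path(model_dir)
    if not model_dir.is_dir() or not any(model_dir.iterdir()):
        return None  # nothing to bundle
    return shutil.make_archive(str(archive_base), "zip", root_dir=str(model_dir))
```

In Colab, the returned path could then be passed to `files.download`. Checkpoint files usually compress poorly, so expect the archive to be about the same size as the originals.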