# Image Classification Training - Google Colab

Training notebook for DenseNet and ViT models on the Tiny ImageNet dataset, using a Google Colab GPU.

## 1. Mount Google Drive

Google Drive is used to persist all training artifacts (datasets, checkpoints, results) across Colab sessions.

In [None]:
from google.colab import drive
import os

drive.mount('/content/drive')

DRIVE_PATH = '/content/drive/MyDrive/image-classification'
os.makedirs(DRIVE_PATH, exist_ok=True)
print(f'Working directory: {DRIVE_PATH}')
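Every later step writes under `DRIVE_PATH`, so it can help to confirm that the folder is actually writable before starting a long run. The following is a minimal sketch; `is_writable_dir` is a hypothetical helper name, not part of the repository.

```python
import os
import tempfile


def is_writable_dir(path: str) -> bool:
    """Return True if `path` is an existing directory in which files can be created."""
    if not os.path.isdir(path):
        return False
    try:
        # Create a small probe file; it is deleted automatically on close.
        with tempfile.NamedTemporaryFile(dir=path, prefix=".write_probe_"):
            pass
    except OSError:
        return False
    return True
```

In the notebook this could be called as `is_writable_dir(DRIVE_PATH)` right after mounting.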

## 2. GPU Setup

**Required:** `Runtime > Change runtime type > Hardware accelerator > GPU (T4)`

In [None]:
import torch

print(f'GPU available: {torch.cuda.is_available()}')
if torch.cuda.is_available():
    print(f'GPU: {torch.cuda.get_device_name(0)}')
    print(f'Total memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB')

## 3. Clone Repository

The repository is cloned into Google Drive. Subsequent runs skip this step if the directory already exists.

In [None]:
import os

if not os.path.exists(f'{DRIVE_PATH}/image-classification'):
    !git clone https://github.com/alejobarrera28/image-classification.git {DRIVE_PATH}/image-classification
    print('Repository cloned')
else:
    print('Repository already exists, skipping clone')

%cd {DRIVE_PATH}/image-classification
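The existence check above only looks at the directory, so an interrupted clone could leave a partial checkout that is then skipped on the next run. One possible guard, sketched under the assumption that the files referenced in later cells are the ones that matter, is to list which of them are missing. `EXPECTED_FILES` and `missing_repo_files` are hypothetical names introduced for illustration.

```python
import os

# Paths that later cells in this notebook rely on, relative to the repository root.
EXPECTED_FILES = (
    "requirements.txt",
    "src/data/curate_data.py",
    "src/training/train.py",
)


def missing_repo_files(repo_dir, expected=EXPECTED_FILES):
    """Return the expected paths that are absent from `repo_dir`."""
    return [p for p in expected if not os.path.isfile(os.path.join(repo_dir, p))]
```

An empty list from `missing_repo_files('.')` after the `%cd` suggests the checkout is usable; a non-empty list suggests deleting the folder and cloning again.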

## 4. Install Dependencies

In [None]:
!pip install -r requirements.txt -q

## 5. Dataset Preparation

The Tiny ImageNet dataset is downloaded and curated. Files are cached in Drive to avoid re-downloading.

In [None]:
import os

# Download dataset if not present
if not os.path.exists('data/raw/tiny-imagenet-200'):
    !mkdir -p data/raw
    !wget http://cs231n.stanford.edu/tiny-imagenet-200.zip -P data/raw/
    !unzip -q data/raw/tiny-imagenet-200.zip -d data/raw/
    !rm data/raw/tiny-imagenet-200.zip
    print('Dataset downloaded')
else:
    print('Dataset already exists')

# Curate dataset if not already done
if not os.path.exists('data/curated'):
    !python src/data/curate_data.py
    print('Dataset curated')
else:
    print('Curated dataset already exists')
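Because the cache check only tests whether `data/raw/tiny-imagenet-200` exists, a partially extracted archive would be treated as complete. A rough sanity check is sketched below. It assumes the standard archive layout, with one folder per class under `train/` (200 classes) and a `val/` folder; both function names are hypothetical.

```python
import os


def count_train_classes(dataset_root: str) -> int:
    """Count class folders under `<dataset_root>/train`; 0 if that folder is missing."""
    train_dir = os.path.join(dataset_root, "train")
    if not os.path.isdir(train_dir):
        return 0
    return sum(
        os.path.isdir(os.path.join(train_dir, name)) for name in os.listdir(train_dir)
    )


def extraction_looks_complete(dataset_root: str, expected_classes: int = 200) -> bool:
    """Heuristic: treat extraction as complete if val/ and all class folders exist."""
    has_val = os.path.isdir(os.path.join(dataset_root, "val"))
    return has_val and count_train_classes(dataset_root) == expected_classes
```

For example, `extraction_looks_complete('data/raw/tiny-imagenet-200')` could be checked before running the curation script. It does not verify the number of images per class.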

## 6. Train DenseNet121

Training outputs (checkpoints, history, metrics) are saved to `results/densenet/`.

In [None]:
!python src/training/train.py --model densenet --epochs 100 --batch_size 128

## 7. Train Vision Transformer (ViT)

Training outputs are saved to `results/vit/`.

In [None]:
!python src/training/train.py --model vit --epochs 100 --batch_size 64 --optimizer adamw

## 8. Results

All results are stored in Google Drive at `MyDrive/image-classification/results/` and persist across sessions.

To resume training from a checkpoint, pass its path with `--resume` (run this in a notebook cell):
```bash
!python src/training/train.py --model densenet --epochs 100 --batch_size 128 --resume results/densenet/best_model.pth
```
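If the same cell should work both for a fresh run and for a resumed one, the command can be assembled in Python and `--resume` added only when the checkpoint file exists. This is a sketch that uses only the flags shown in this notebook; `build_train_command` is a hypothetical helper, and how `train.py` handles `--resume` is not verified here.

```python
import os


def build_train_command(model, epochs, batch_size, checkpoint=None, optimizer=None):
    """Assemble the train.py command, adding --resume only if the checkpoint exists."""
    cmd = [
        "python", "src/training/train.py",
        "--model", model,
        "--epochs", str(epochs),
        "--batch_size", str(batch_size),
    ]
    if optimizer is not None:
        cmd += ["--optimizer", optimizer]
    # Skip --resume on the first run, when no checkpoint has been written yet.
    if checkpoint is not None and os.path.isfile(checkpoint):
        cmd += ["--resume", checkpoint]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, for example with `build_train_command('densenet', 100, 128, 'results/densenet/best_model.pth')`.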

In [None]:
# List results
!ls -lh results/

# Optional: download as zip
# !zip -r results.zip results/
# from google.colab import files
# files.download('results.zip')
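As an alternative to the `zip` command, the archive can be built with the standard library, which avoids depending on the shell. A minimal sketch follows; `archive_results` is a hypothetical helper, and the archive keeps the top-level `results/` folder, as `zip -r results.zip results/` would.

```python
import os
import shutil


def archive_results(results_dir: str = "results", archive_base: str = "results") -> str:
    """Zip `results_dir` into `<archive_base>.zip` and return the archive path."""
    results_dir = os.path.abspath(results_dir)
    parent, name = os.path.split(results_dir)
    # base_dir keeps the folder name itself as the top-level entry in the archive.
    return shutil.make_archive(
        os.path.abspath(archive_base), "zip", root_dir=parent, base_dir=name
    )
```

The returned path can then be handed to `files.download(...)` from `google.colab`.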

## 9. GPU Monitoring

In [None]:
!nvidia-smi
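To read memory usage programmatically rather than from the full `nvidia-smi` table, its CSV query mode can be parsed. The sketch below assumes the driver reports numeric values in MiB for both fields; the function names are hypothetical.

```python
import subprocess


def parse_gpu_memory(csv_text: str):
    """Parse `memory.used, memory.total` rows (MiB) into a list of (used, total) ints."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Run nvidia-smi in CSV query mode and return (used_mib, total_mib) per GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_memory(out)
```

Calling `query_gpu_memory()` requires a GPU runtime; `parse_gpu_memory` on its own works on any captured output.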