This project implements deep learning models for image classification on the CIFAR-10 dataset. It provides a complete pipeline for training, evaluating, and visualizing the performance of different CNN architectures.
The CIFAR-10 dataset consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class. This project offers:
- Training of custom CNN architectures from scratch
- Transfer learning using pre-trained ResNet50
- Comprehensive evaluation metrics and visualizations
- Modular codebase for easy experimentation
The repository is organized as follows:

```
CIFAR/
│
├── main.py                     # Entry point for training and evaluation
├── README.md                   # This file
│
└── src/
    ├── data_processing/        # Data loading and augmentation
    │   ├── augment.py          # Data augmentation functions
    │   └── utils.py            # Data utilities
    │
    ├── networks/               # Model architectures
    │   ├── classic_network.py  # Custom CNN architecture
    │   ├── transfer_network.py # Transfer learning with ResNet50
    │   └── utils.py            # Network utilities
    │
    ├── training/               # Training functionality
    │   ├── trainer.py          # Training loop implementation
    │   └── utils.py            # Training utilities
    │
    └── evaluation/             # Evaluation functionality
        ├── evaluate.py         # Model evaluation
        ├── utils.py            # Evaluation utilities
        └── visualize.py        # Visualization functions
```
To train a model from scratch:
```bash
python main.py --model classic --epochs 30 --batch_size 128 --lr 0.001 --gpu
```

To train using transfer learning with ResNet50:

```bash
python main.py --model transfer --epochs 20 --batch_size 64 --lr 0.0001 --gpu
```

Command-line arguments:

- `--model`: Model architecture to use (`classic` or `transfer`)
- `--epochs`: Number of training epochs (default: 30)
- `--batch_size`: Batch size for training (default: 128)
- `--lr`: Learning rate (default: 0.001)
- `--weight_decay`: Weight decay for optimizer (default: 1e-4)
- `--seed`: Random seed (default: 42)
- `--gpu`: Use GPU if available (flag)
- `--evaluate_only`: Only run evaluation on a trained model (flag)
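As a sketch of how these flags fit together (main.py's actual parser may differ in details), an equivalent `argparse` setup looks like this:

```python
import argparse

def build_parser():
    """Minimal CLI matching the flags documented above (illustrative only)."""
    parser = argparse.ArgumentParser(description="CIFAR-10 training and evaluation")
    parser.add_argument("--model", choices=["classic", "transfer"], required=True,
                        help="Model architecture to use")
    parser.add_argument("--epochs", type=int, default=30,
                        help="Number of training epochs")
    parser.add_argument("--batch_size", type=int, default=128,
                        help="Batch size for training")
    parser.add_argument("--lr", type=float, default=0.001,
                        help="Learning rate")
    parser.add_argument("--weight_decay", type=float, default=1e-4,
                        help="Weight decay for optimizer")
    parser.add_argument("--seed", type=int, default=42,
                        help="Random seed")
    parser.add_argument("--gpu", action="store_true",
                        help="Use GPU if available")
    parser.add_argument("--evaluate_only", action="store_true",
                        help="Only run evaluation on a trained model")
    return parser

# Example: parse the first command shown above
args = build_parser().parse_args(
    ["--model", "classic", "--epochs", "30", "--batch_size", "128",
     "--lr", "0.001", "--gpu"])
```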
To evaluate a trained model without retraining:
```bash
python main.py --model classic --evaluate_only --gpu
```

The project implements several data augmentation techniques:
- Random cropping
- Random horizontal flips
- Random rotation
- Color jitter
Two model architectures are implemented:

- ClassicCNN: A custom CNN architecture with:
  - 4 convolutional blocks with increasing filter sizes
  - Batch normalization
  - Max pooling
  - Dropout for regularization
  - Fully connected layers
- TransferResNet50: A transfer learning approach using:
  - Pre-trained ResNet50 as a feature extractor
  - A custom classification head for CIFAR-10
The evaluation pipeline reports:

- Accuracy (overall and per-class)
- Precision, recall, and F1 score
- Confusion matrix
- Feature embeddings visualization (t-SNE and PCA)
- Visualization of misclassified samples
- Training and validation curves
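The scalar metrics above can be computed with scikit-learn; the snippet below uses toy binary labels standing in for real model predictions (evaluate.py may compute them differently, over all 10 CIFAR classes):

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)

# Toy ground truth and predictions (two classes for brevity).
y_true = [0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1]

acc = accuracy_score(y_true, y_pred)                 # overall accuracy
prec, rec, f1, _ = precision_recall_fscore_support(  # macro-averaged P/R/F1
    y_true, y_pred, average="macro", zero_division=0)
cm = confusion_matrix(y_true, y_pred)                # rows: true, cols: predicted
```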
Results are saved in an `output/session_X` directory, where X is the session number. Each session directory contains:

- `checkpoints/`: Model weights for each epoch and the best model
- `plots/`: Visualization plots (confusion matrices, embeddings, etc.)
- `test_results_*.txt`: Detailed evaluation metrics
- `training_summary.txt`: Summary of the training process
- `stats_*.json`: Training statistics for plotting
- Python 3.6+
- PyTorch
- torchvision
- numpy
- matplotlib
- scikit-learn
- tqdm
- pytorch-lightning