This repository contains implementations and experiments for a Deep Learning for Visual Computing course, covering image classification and semantic segmentation tasks using PyTorch.
- Exercise 1: Image Classification on CIFAR-10 using ResNet18, a custom CNN, and a Vision Transformer
- Exercise 2: Semantic Segmentation on Cityscapes and Oxford-IIIT Pet datasets using SegFormer and FCN
Exercise 1 models: ResNet18, custom CNN, Vision Transformer
Dataset: CIFAR-10 (60k images, 10 classes)
Features: Data augmentation, regularization, advanced optimizers, accuracy metrics
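The accuracy metrics listed for Exercise 1 can be illustrated with a small, dependency-free sketch. The function name `accuracy_metrics` is hypothetical and is not taken from this repository; the sketch only assumes that predictions and targets are integer class labels.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def accuracy_metrics(
    preds: Sequence[int], targets: Sequence[int]
) -> Tuple[float, Dict[int, float]]:
    """Return (overall accuracy, per-class accuracy) for integer class labels.

    Hypothetical helper for illustration; not part of the repository.
    """
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if len(targets) == 0:
        raise ValueError("cannot compute accuracy on an empty set")

    # Number of samples per ground-truth class.
    total = Counter(targets)
    # Number of correctly classified samples per ground-truth class.
    correct = Counter(t for p, t in zip(preds, targets) if p == t)

    overall = sum(correct.values()) / len(targets)
    per_class = {c: correct[c] / n for c, n in total.items()}
    return overall, per_class


if __name__ == "__main__":
    overall, per_class = accuracy_metrics([0, 1, 1, 2], [0, 1, 2, 2])
    print(f"overall accuracy: {overall:.2f}")
    print("per-class accuracy:", per_class)
```

Reporting per-class accuracy next to the overall figure helps show whether a model is weak on particular CIFAR-10 classes even when the aggregate number looks good.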
Exercise 2 models: SegFormer, FCN-ResNet50
Datasets: Oxford-IIIT Pet (3 classes), Cityscapes (19 classes)
Features: mIoU metrics, pre-training, fine-tuning
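As a rough illustration of the mIoU metric mentioned above, the following sketch accumulates a confusion matrix over flattened label sequences and averages the per-class intersection-over-union. The function names are hypothetical, and the `ignore_index=255` default is an assumption (a common convention for unlabeled Cityscapes pixels), not something confirmed by this repository.

```python
from typing import List, Sequence


def confusion_matrix(
    preds: Sequence[int],
    targets: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,  # assumed "void" label; adjust to your data
) -> List[List[int]]:
    """Build a num_classes x num_classes matrix with rows = ground truth."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(preds, targets):
        if t == ignore_index:
            continue  # pixels without a valid label do not count
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average IoU = TP / (TP + FP + FN) over classes that occur at all."""
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                 # true class c, predicted other
        fp = sum(row[c] for row in cm) - tp  # predicted c, true class other
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from prediction and ground truth
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    cm = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 255], num_classes=2)
    print("mIoU:", mean_iou(cm))
```

Accumulating one confusion matrix over the whole validation set and computing IoU at the end generally differs from averaging per-image IoU values; the sketch uses the dataset-level variant.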
- Python 3.8+
- CUDA-compatible GPU (recommended)
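Because a GPU is recommended rather than required, training code typically selects a device at start-up. The sketch below shows one way to do that; `pick_device` is a hypothetical helper, and the fallback for a missing PyTorch installation exists only so the snippet runs anywhere.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu".

    Hypothetical helper for illustration; not part of the repository.
    """
    try:
        import torch  # imported lazily so the sketch runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("training would run on:", pick_device())
```

The returned string can be passed to `torch.device(...)` or to a module's `.to(...)` call.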
- Clone the repository:
git clone <repository-url>
cd DLVC
- Install dependencies for Exercise 1:
cd exercise1
pip install -r requirements.txt
- Download datasets:
- CIFAR-10: Download from the official CIFAR-10 website
- Cityscapes: Contact the course instructors for the preprocessed subset
- Oxford-IIIT Pet: Downloaded automatically via torchvision
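For the CIFAR-10 download, the sketch below assumes the "python version" archive from the official website, whose batch files are pickled dictionaries with byte-string keys. The helper names `load_cifar_batch` and `split_channels` are hypothetical, and the repository's own loader may work differently.

```python
import pickle
from typing import Any, Sequence, Tuple


def load_cifar_batch(path: str) -> Tuple[Any, Any]:
    """Load one pickled CIFAR-10 batch file and return (images, labels).

    In the official files each image is a row of 3072 uint8 values.
    Only unpickle files from a source you trust.
    """
    with open(path, "rb") as f:
        batch = pickle.load(f, encoding="bytes")  # keys are bytes, e.g. b"data"
    return batch[b"data"], batch[b"labels"]


def split_channels(row: Sequence[int], side: int = 32) -> Tuple[Any, Any, Any]:
    """Split a flat image row into its red, green and blue planes.

    The row stores 1024 red values, then 1024 green, then 1024 blue,
    each plane in row-major order.
    """
    plane = side * side
    if len(row) != 3 * plane:
        raise ValueError(f"expected {3 * plane} values, got {len(row)}")
    red, green, blue = (row[i * plane:(i + 1) * plane] for i in range(3))
    return red, green, blue
```

With the real archive, unpickling requires NumPy to be installed because the image data is stored as a NumPy array; a typical call would be `load_cifar_batch("cifar-10-batches-py/data_batch_1")`, where the path depends on where the archive was extracted.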
Train ResNet18:
cd exercise1
python train_resnet18.py
Train custom CNN:
python train_yourCNN.py
Train Vision Transformer:
python train_yourViT.py
Test models:
python test_resnet18.py
python test_yourCNN.py
python test_yourViT.py
Generate result visualizations:
python generate_graphs.py
Train SegFormer:
cd exercise2
python train_segformer.py
Train FCN:
python train.py
Visualize results:
python viz_pets.py
Experimental results, including the hyperparameter exploration and performance comparisons, are stored in exercise1/tested_configs/ and exercise2/training/.
- Weights & Biases / TensorBoard logging
- Accuracy and mIoU metrics with result visualizations
- Configurable training pipelines
- Pre-training and fine-tuning support
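One common way to make a training pipeline configurable is a dataclass whose fields double as command-line flags. The sketch below is only an illustration of that pattern: `TrainConfig`, `parse_config`, and every field name and default are hypothetical and do not describe the options the training scripts in this repository actually accept.

```python
import argparse
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class TrainConfig:
    # All field names and default values are hypothetical examples.
    model: str = "resnet18"
    epochs: int = 30
    lr: float = 1e-3
    weight_decay: float = 1e-4
    batch_size: int = 128


def parse_config(argv: Optional[List[str]] = None) -> TrainConfig:
    """Build a TrainConfig from CLI flags such as --lr 0.01 --epochs 5."""
    parser = argparse.ArgumentParser(description="Hypothetical training config")
    for f in fields(TrainConfig):
        # weight_decay becomes --weight-decay; argparse maps it back.
        flag = "--" + f.name.replace("_", "-")
        parser.add_argument(flag, type=f.type, default=f.default)
    return TrainConfig(**vars(parser.parse_args(argv)))


if __name__ == "__main__":
    print(parse_config())
```

Keeping the defaults in one dataclass means each experiment can be reproduced from the handful of flags that differ from the defaults.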
Educational project for the Deep Learning for Visual Computing course.