
SegFormer Semantic Segmentation with Knowledge Distillation

Overview

This project is being developed for my Bachelor's Thesis - "Tuning Small Semantic Segmentation Models via Knowledge Distillation". The goal is to build a small and robust semantic segmentation model for segmenting road scenes in a fully autonomous vehicle. Hence, the model has to run in real time while preserving good performance.

To achieve both, a technique known as Knowledge Distillation is employed to transfer knowledge from a large and complex SegFormer B5 model to a small SegFormer B0 model that is lightweight enough to run on most devices in real time.
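For illustration, here is a minimal sketch of how such a teacher-student pair could be instantiated with the Hugging Face `transformers` implementation of SegFormer. The checkpoint names below are publicly available Cityscapes-finetuned weights and are not necessarily the ones used in this repository:

```python
import torch
from transformers import SegformerForSemanticSegmentation

# Large, accurate teacher (SegFormer B5) and lightweight student (SegFormer B0).
# These checkpoint names are illustrative examples of public Cityscapes weights.
teacher = SegformerForSemanticSegmentation.from_pretrained(
    "nvidia/segformer-b5-finetuned-cityscapes-1024-1024"
)
student = SegformerForSemanticSegmentation.from_pretrained(
    "nvidia/segformer-b0-finetuned-cityscapes-1024-1024"
)

# The teacher only provides targets during distillation, so it is frozen.
teacher.eval()
for p in teacher.parameters():
    p.requires_grad_(False)
```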

Technologies

Python 3.8

numpy

PyTorch

Method

Response-Based Knowledge Distillation is employed while training the model. The training pipeline is presented in the diagram below. See knowledge_distillation for the implementation details.
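For reference, below is a minimal sketch of a response-based distillation loss for segmentation logits. The class name and the `temperature`/`alpha` hyperparameters are illustrative and may differ from the actual implementation in knowledge_distillation:

```python
import torch.nn as nn
import torch.nn.functional as F


class ResponseDistillationLoss(nn.Module):
    """Cross-entropy with the ground-truth mask plus KL divergence between
    temperature-softened teacher and student logits (soft targets)."""

    def __init__(self, temperature: float = 4.0, alpha: float = 0.5, ignore_index: int = 255):
        super().__init__()
        self.temperature = temperature
        self.alpha = alpha
        self.ce = nn.CrossEntropyLoss(ignore_index=ignore_index)

    def forward(self, student_logits, teacher_logits, target):
        # student_logits, teacher_logits: (B, C, H, W); target: (B, H, W)
        ce_loss = self.ce(student_logits, target)

        t = self.temperature
        student_log_probs = F.log_softmax(student_logits / t, dim=1)
        teacher_probs = F.softmax(teacher_logits / t, dim=1)
        # Per-pixel KL over the class dimension, averaged over batch and space;
        # the t^2 factor keeps gradient magnitudes comparable across temperatures.
        kd_loss = (
            F.kl_div(student_log_probs, teacher_probs, reduction="none")
            .sum(dim=1)
            .mean()
        ) * (t * t)

        return self.alpha * kd_loss + (1.0 - self.alpha) * ce_loss
```

If the teacher and student emit logits at different spatial resolutions, one of them should be resized (e.g. with `F.interpolate`) before computing the KL term.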

Creating the development environment

$ conda create --name <env> --file requirements_conda.txt python=3.8

or

$ conda create --name <env> python=3.8
$ pip install -r requirements.txt

Train

$ python src/train.py
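Conceptually, each training step combines a frozen-teacher forward pass with a student update. A simplified sketch (reusing the illustrative `teacher`, `student` and `ResponseDistillationLoss` from above, with a hypothetical `train_loader`) looks roughly like this; the actual train.py may differ:

```python
import torch
import torch.nn.functional as F

criterion = ResponseDistillationLoss(temperature=4.0, alpha=0.5)
optimizer = torch.optim.AdamW(student.parameters(), lr=6e-5)

for images, masks in train_loader:  # hypothetical dataloader: (B, 3, H, W) images, (B, H, W) masks
    # The teacher only provides soft targets; its weights stay frozen.
    with torch.no_grad():
        teacher_logits = teacher(pixel_values=images).logits

    student_logits = student(pixel_values=images).logits

    # SegFormer predicts at 1/4 of the input resolution, so upsample both
    # sets of logits to the label size before computing the losses.
    size = masks.shape[-2:]
    teacher_logits = F.interpolate(teacher_logits, size=size, mode="bilinear", align_corners=False)
    student_logits = F.interpolate(student_logits, size=size, mode="bilinear", align_corners=False)

    loss = criterion(student_logits, teacher_logits, masks)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```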

Visualize training logs

$ mlflow ui --backend-store-uri src/models/mlflow_logs
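The MLflow UI displays whatever runs the training script has logged to that local store. A minimal logging sketch (run name, parameters and metric names are illustrative, not necessarily what train.py records) would be:

```python
import mlflow

# Point MLflow at the same local file store that the `mlflow ui` command above reads.
mlflow.set_tracking_uri("src/models/mlflow_logs")

with mlflow.start_run(run_name="segformer-b0-distillation"):
    # Hyperparameters are logged once per run.
    mlflow.log_params({"temperature": 4.0, "alpha": 0.5, "lr": 6e-5})

    # Inside the training loop, per-epoch metrics such as the loss or mIoU
    # can be recorded with mlflow.log_metric("val_mIoU", value, step=epoch).
```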