Code implementation of Stagewise Knowledge Distillation paper.
Data Efficient Stagewise Knowledge Distillation

Note: A new version of this paper, with results on semantic segmentation in addition to image classification, will be released soon. The previous version described only the image classification results. This repo will soon be updated with details of the new experiments (the code is already uploaded).

Stagewise Training Procedure

Code Implementation for Stagewise Knowledge Distillation

This repository presents the code implementation for Stagewise Knowledge Distillation, a technique for improving knowledge transfer between a teacher model and a student model.
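The core per-stage objective can be sketched as a mean-squared error between matching teacher and student feature maps. The function and variable names below are illustrative, not the repo's API:

```python
import numpy as np

def stage_mse_loss(student_feat, teacher_feat):
    """MSE between a student stage's feature map and the matching teacher stage's."""
    return np.mean((student_feat - teacher_feat) ** 2)

# Toy example: feature maps of shape (batch, channels, H, W).
rng = np.random.default_rng(0)
teacher_feat = rng.standard_normal((2, 8, 4, 4))
student_feat = teacher_feat + 0.1  # student uniformly offset from teacher

loss = stage_mse_loss(student_feat, teacher_feat)
print(round(loss, 4))  # a constant offset of 0.1 gives an MSE of 0.01
```

During stagewise training, only the stage whose loss is being minimized is updated; earlier stages stay frozen.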


Architectures Used

The following ResNet architectures are used:

  • ResNet10
  • ResNet14
  • ResNet18
  • ResNet20
  • ResNet26
  • ResNet34

Note: ResNet34 is used as the teacher (being a standard architecture), while the others are used as student models.
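The stagewise procedure itself can be illustrated end to end with toy linear "stages" standing in for ResNet blocks. This is a hypothetical sketch, not the repo's code: each student stage is fit in turn to mimic the matching teacher stage while the already-trained prefix is kept frozen (least squares stands in for gradient descent on the per-stage MSE):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy "teacher": 3 fixed linear stages acting on 4-dim features.
teacher_stages = [rng.standard_normal((4, 4)) for _ in range(3)]

def teacher_forward(x, upto):
    """Run the input through teacher stages 0..upto (inclusive)."""
    for W in teacher_stages[: upto + 1]:
        x = x @ W
    return x

x = rng.standard_normal((32, 4))  # toy input batch
student_stages = []

for s in range(3):
    # Frozen prefix: pass inputs through the student stages trained so far.
    h = x
    for W in student_stages:
        h = h @ W
    target = teacher_forward(x, s)  # matching teacher stage's output
    # "Train" stage s: least-squares fit of one linear stage to the teacher
    # target (stands in for minimizing the per-stage MSE by gradient descent).
    W_s, *_ = np.linalg.lstsq(h, target, rcond=None)
    student_stages.append(W_s)

# After stagewise training, the student reproduces the teacher on this batch.
out_student = x
for W in student_stages:
    out_student = out_student @ W
print(np.allclose(out_student, teacher_forward(x, 2)))
```

In the actual method the stages are convolutional blocks of the ResNet students listed above, and the final stage is trained with the usual classification loss.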

Datasets Used

Note: Imagenette and Imagewoof are subsets of ImageNet.

Code Organization

  • root/code/models/ - Contains code for the models.
  • root/code/ - Code for modifying the dataset to obtain a smaller-sized dataset.
  • root/code/ - Code for simultaneous training of all stages on the complete dataset.
  • root/code/ - Code for stagewise training on the complete dataset.
  • root/code/ - Code for stagewise training on the smaller-sized dataset.
  • root/code/ - Some utility code.
  • root/notebooks/ - Contains notebooks with code similar to the above.


If you use this code or method in your work, please cite using:

    title={Stagewise Knowledge Distillation},
    author={Akshay Kulkarni and Navid Panchi and Shital Chiddarwar},