This repository implements several state-of-the-art knowledge distillation and knowledge transfer methods.
Knowledge distillation (KD) was proposed to distill knowledge from a large teacher network into a smaller student network. KD can help the student model achieve better generalization performance. Its applications include model compression.
- Basic knowledge distillation
- Born-again Neural Networks
- Knowledge Transfer with Jacobian Matching
- Deep Mutual Learning
- Co-teaching
- On-the-fly Native Ensemble
- MentorNet
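The basic distillation objective mentioned above can be sketched as the KL divergence between temperature-softened teacher and student distributions. The following is a minimal NumPy illustration (function names and the temperature value are chosen for this sketch, not taken from the repository):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T gives softer distributions."""
    z = (logits - logits.max()) / T  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, T=2.0):
    """Basic distillation loss: KL(teacher || student) on softened
    distributions, scaled by T^2 to keep gradient magnitudes comparable."""
    p = softmax(teacher_logits, T)  # soft targets from the teacher
    q = softmax(student_logits, T)  # soft predictions from the student
    return (T ** 2) * np.sum(p * (np.log(p) - np.log(q)))
```

In training, this term is typically combined with the ordinary cross-entropy on the hard labels; the loss is zero when the student reproduces the teacher's logits exactly.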