Model Compression and Acceleration Progress

Repository to track the progress in model compression and acceleration

Low-rank approximation

  • T-Net: Parametrizing Fully Convolutional Nets with a Single High-Order Tensor (CVPR 2019) paper
  • MUSCO: Multi-Stage COmpression of neural networks (ICCVW 2019) paper | code (PyTorch)
  • Efficient Neural Network Compression (CVPR 2019) paper | code (Caffe)
  • Adaptive Mixture of Low-Rank Factorizations for Compact Neural Modeling (ICLR 2019) paper | code (PyTorch)
  • Extreme Network Compression via Filter Group Approximation (ECCV 2018) paper
  • Ultimate tensorization: compressing convolutional and FC layers alike (NIPS 2016 workshop) paper | code (TensorFlow) | code (MATLAB, Theano + Lasagne)
  • Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications (ICLR 2016) paper
  • Accelerating Very Deep Convolutional Networks for Classification and Detection (IEEE TPAMI 2016) paper
  • Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition (ICLR 2015) paper | code (Caffe)
  • Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation (NIPS 2014) paper
  • Speeding up Convolutional Neural Networks with Low Rank Expansions (2014) paper

Pruning & Sparsification



Knowledge distillation


  • Learning Efficient Detector with Semi-supervised Adaptive Distillation (arXiv 2019) paper | code (Caffe)
  • Model compression via distillation and quantization (ICLR 2018) paper | code (PyTorch)
  • Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks (ICLR 2018 workshop) paper
  • Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks (BMVC 2018) paper
  • Net2Net: Accelerating Learning via Knowledge Transfer (ICLR 2016) paper
  • Distilling the Knowledge in a Neural Network (NIPS 2014) paper
  • FitNets: Hints for Thin Deep Nets (2014) paper | code (Theano + Pylearn2)


Quantization

TensorFlow implementation of three papers, results for CIFAR-10


  • XNOR-Net++ (2019) paper
  • Matrix and tensor decompositions for training binary neural networks (2019) paper
  • XNOR-Net (ECCV 2016) paper | code (PyTorch)

Architecture search

  • MobileNets
  • EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (ICML 2019) paper | code and pretrained models (TensorFlow)
  • MnasNet: Platform-Aware Neural Architecture Search for Mobile (CVPR 2019) paper | code (TensorFlow)
  • MorphNet: Fast & Simple Resource-Constrained Learning of Deep Network Structure (CVPR 2018) paper | code (TensorFlow)
  • ShuffleNets
    • ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design (ECCV 2018) paper
    • ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices (CVPR 2018) paper
  • Multi-Fiber Networks for Video Recognition (ECCV 2018) paper | code (PyTorch)
  • IGCVs
    • IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks (BMVC 2018) paper | code and pretrained models (MXNet)
    • IGCV2: Interleaved Structured Sparse Convolutional Neural Networks (CVPR 2018) paper
    • Interleaved Group Convolutions for Deep Neural Networks (ICCV 2017) paper

PhD theses and overviews

  • Algorithms for speeding up convolutional neural networks (2018) thesis
  • Model Compression and Acceleration for Deep Neural Networks: The Principles, Progress, and Challenges (2018) paper
  • Efficient methods and hardware for deep learning (2017) thesis


Frameworks

  • MUSCO - framework for model compression using tensor decompositions (PyTorch, TensorFlow)
  • Distiller - package for compression using pruning and low-precision arithmetic (PyTorch)
  • MorphNet - framework for neural networks architecture learning (TensorFlow)
  • Mayo - deep learning framework with fine- and coarse-grained pruning, network slimming, and quantization methods
  • PocketFlow - framework for model pruning, sparsification, quantization (TensorFlow implementation)
  • Keras compressor - compression using low-rank approximations, SVD for matrices, Tucker for tensors.
  • Caffe compressor - K-means based quantization
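
For readers unfamiliar with the K-means based quantization mentioned in the last entry above, the idea can be sketched in a few lines: cluster the scalar weights of a layer into k shared values, then store only a small codebook plus one index per weight. The snippet below is a minimal illustration in plain Python and is not taken from Caffe compressor or any other listed project. The function names `kmeans_quantize` and `dequantize`, the linear centroid initialisation, and the fixed iteration count are all assumptions made for this example.

```python
def kmeans_quantize(weights, k, iters=20):
    """Cluster scalar weights into k shared values with 1-D Lloyd's algorithm.

    Returns (centroids, assignments): a codebook of k floats and, for each
    weight, the index of the codebook entry that replaces it.
    """
    if not weights or k < 1:
        raise ValueError("need at least one weight and k >= 1")

    # Assumption of this sketch: centroids start evenly spaced over the
    # range of the weights (deterministic, no random seed needed).
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / (k - 1) if k > 1 else 0.0
    centroids = [lo + j * step for j in range(k)]

    def assign():
        # Index of the nearest centroid for every weight.
        return [min(range(k), key=lambda j: abs(w - centroids[j]))
                for w in weights]

    for _ in range(iters):
        assignments = assign()
        # Move each centroid to the mean of the weights assigned to it.
        for j in range(k):
            members = [w for w, a in zip(weights, assignments) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = sum(members) / len(members)

    # Final assignment against the updated centroids.
    return centroids, assign()


def dequantize(centroids, assignments):
    """Rebuild an approximate weight list from the codebook and indices."""
    return [centroids[a] for a in assignments]


if __name__ == "__main__":
    layer = [0.1, 0.11, 0.09, -0.5, -0.52, 0.9, 0.88]
    codebook, indices = kmeans_quantize(layer, k=3)
    print("codebook:", codebook)
    print("indices: ", indices)
    print("restored:", dequantize(codebook, indices))
```

With k shared values, each weight can be stored as an index of about log2(k) bits instead of a full-precision float, which is where the compression comes from. Real implementations typically cluster per layer and fine-tune the codebook afterwards; those steps are omitted here.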

Comparison of different approaches

Please see comparative_results.pdf.

Similar repos
