[20220524] Compression Roadmap 2022

Design Note

Compression is simulated by replacing selected nodes in the graph with wrapped versions. Note, however, that wrapping only the current node is sometimes not equivalent to the effect of actually compressing the model.
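
To make the caveat concrete, here is a minimal pure-Python sketch rather than NNI's implementation. `Linear`, `MaskedWrapper`, and `add_shift` are hypothetical names introduced only for illustration, and the downstream shift is an assumed stand-in for a bias or normalization node. The wrapper zeroes the masked weights of one node, but the following node still adds its shift to the pruned channel, whereas real compression would remove that channel from both nodes.

```python
from typing import List

Vector = List[float]


class Linear:
    """Toy dense layer: y[i] = sum_j weight[i][j] * x[j]."""

    def __init__(self, weight: List[Vector]):
        self.weight = weight

    def __call__(self, x: Vector) -> Vector:
        return [sum(w * v for w, v in zip(row, x)) for row in self.weight]


class MaskedWrapper:
    """Hypothetical wrapper: simulates pruning by zeroing masked weights."""

    def __init__(self, module: Linear, mask: List[Vector]):
        self.module = module
        self.mask = mask

    def __call__(self, x: Vector) -> Vector:
        # Apply the mask on the fly; the wrapped module's weights stay intact.
        masked = [[w * m for w, m in zip(w_row, m_row)]
                  for w_row, m_row in zip(self.module.weight, self.mask)]
        return Linear(masked)(x)


def add_shift(x: Vector, shift: Vector) -> Vector:
    """Toy downstream node (assumed stand-in for a bias or normalization shift)."""
    return [v + s for v, s in zip(x, shift)]


layer = Linear([[1.0, 2.0], [3.0, 4.0]])
# Simulate pruning the second output channel by wrapping only this node.
wrapped = MaskedWrapper(layer, [[1.0, 1.0], [0.0, 0.0]])

x = [1.0, 1.0]
simulated = add_shift(wrapped(x), [0.5, 0.5])

# Actual compression drops the channel from this node and the next one.
compressed = add_shift(Linear([[1.0, 2.0]])(x), [0.5])

print(simulated)   # the pruned channel still carries the downstream shift
print(compressed)  # the pruned channel no longer exists
```

In this toy case the simulated graph still emits a non-zero value on the pruned channel, which is one way the wrapped node can diverge from the compressed model.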

Work Items

base
- evaluator (handle train, validate, hook, patch...)
  - api design
- config list refactor
  1. specify compression target (input, output, weight, ...)
  2. specify compression algo (including related parameters, such as sparse pattern and quant bits)
- Support for variable compression targets
  1. compressor & wrapper refactor: provide a unified interface for parsing the config list.
  2. basic pruner refactor & quantizer design
- Super compressor? Could most existing basic pruners/quantizers be implemented by configuring a super compressor?
  1. universal wrapper
pruning
- refactor sparse pattern
  - metric calculator & sparsity allocator
- migrate to evaluator
quantization
- refactor design (key considerations: experiment, evaluator, wrapper, conv-bn-fusion)
experiment
- wrap tuner as strategy
  - search space generator
- support more pruners & quantizers
- a good strategy (how to search in search space)
speedup
- mask propagation stands alone as a module
- quantization speedup supports more backends
benchmark
visualization
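
The config list items under base (a compression target plus an algorithm and its parameters, parsed through one unified interface) could look roughly like the sketch below. The key names (`op_types`, `targets`, `algo`, `sparse_ratio`, `sparse_pattern`, `quant_bits`) and the `TargetSetting` and `parse_config_list` helpers are assumptions made for illustration, not NNI's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class TargetSetting:
    """Parsed setting for one compression target of one group of ops."""
    op_types: List[str]
    target: str                      # e.g. "weight", "input", "output"
    algo: str                        # e.g. "prune" or "quantize"
    params: Dict[str, Any] = field(default_factory=dict)


def parse_config_list(config_list: List[Dict[str, Any]]) -> List[TargetSetting]:
    """Flatten a config list into one setting per (op group, target) pair."""
    settings = []
    for entry in config_list:
        for target, spec in entry["targets"].items():
            spec = dict(spec)        # copy so the caller's config is not mutated
            algo = spec.pop("algo")  # the remaining keys are algo parameters
            settings.append(TargetSetting(entry["op_types"], target, algo, spec))
    return settings


# Hypothetical config: prune Linear weights and quantize Linear outputs.
config_list = [
    {
        "op_types": ["Linear"],
        "targets": {
            "weight": {"algo": "prune", "sparse_ratio": 0.5, "sparse_pattern": "channel"},
            "output": {"algo": "quantize", "quant_bits": 8},
        },
    },
]

settings = parse_config_list(config_list)
for s in settings:
    print(s.target, s.algo, s.params)
```

Keeping the target and the algorithm as separate fields means a pruner and a quantizer could share the same parsing path, which is the point of the unified interface item above.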

This wiki is a journal that tracks the development of NNI. It's not guaranteed to be up-to-date. Read NNI documentation for latest information: https://nni.readthedocs.io/en/latest/
