[20220524] Compression Roadmap 2022

J-shang edited this page May 26, 2022 · 3 revisions

Design Note

  • Compression is simulated by replacing some nodes in the graph with wrapped ones. Note, however, that wrapping only the current node is sometimes not equivalent to the effect of actually compressing the model.
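
One way the mismatch described above can arise is sketched below in plain Python. This is an illustrative assumption, not a case taken from the roadmap: `MaskedLinear` and `physically_pruned` are hypothetical names, and the example assumes that the wrapper masks only the weight of the current node while the bias is left untouched. The masked channel then still emits its bias, whereas a model with the channel physically removed does not.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def linear(weight: Matrix, bias: Vector, x: Vector) -> Vector:
    """Plain fully connected layer: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


class MaskedLinear:
    """Hypothetical wrapper: simulates pruning by masking weight rows only."""

    def __init__(self, weight: Matrix, bias: Vector, row_mask: Vector):
        self.weight, self.bias, self.row_mask = weight, bias, row_mask

    def __call__(self, x: Vector) -> Vector:
        # Zero the pruned output rows of the weight; the bias is NOT masked.
        masked = [[w * m for w in row] for row, m in zip(self.weight, self.row_mask)]
        return linear(masked, self.bias, x)


def physically_pruned(w1, b1, w2, b2, keep, x):
    """Remove pruned output channels of layer 1 and the matching input columns of layer 2."""
    w1_small = [w1[i] for i in keep]
    b1_small = [b1[i] for i in keep]
    w2_small = [[row[i] for i in keep] for row in w2]
    return linear(w2_small, b2, linear(w1_small, b1_small, x))


# Two-layer toy model; output channel 1 of the first layer is pruned.
w1 = [[1.0, 2.0], [3.0, 4.0]]
b1 = [0.5, 0.25]
w2 = [[1.0, 1.0]]
b2 = [0.0]
x = [1.0, 1.0]

simulated = linear(w2, b2, MaskedLinear(w1, b1, row_mask=[1.0, 0.0])(x))
compressed = physically_pruned(w1, b1, w2, b2, keep=[0], x=x)
print(simulated, compressed)
```

In this toy setup the two results differ by exactly the bias of the pruned channel, which is the kind of gap a wrapper confined to a single node cannot see.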

Work Items

  • base
    • evaluator (handle train, validate, hook, patch...)
      • api design
    • config list refactor
      1. specify compression target (input, output, weight, ...)
      2. specify compression algorithm (including related parameters, such as sparse pattern and quantization bits)
    • Support for variable compression targets
      1. compressor & wrapper refactor: provide a unified interface for parsing the config list.
      2. basic pruner refactor & quantizer design
    • Super compressor? Could most existing basic pruners/quantizers be implemented by configuring a super compressor?
      1. universal wrapper
  • pruning
    • refactor sparse pattern
      • metric calculator & sparsity allocator
    • migrate to evaluator
  • quantization
    • refactor design (key consideration: experiment, evaluator, wrapper, conv-bn-fusion)
  • experiment
    • wrap tuner as strategy
      • search space generator
    • support more pruners & quantizers
    • a good strategy (how to search in search space)
  • speedup
    • mask propagation stands alone as a module
    • quantization speedup supports more backends
  • benchmark
  • visualization
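
The config list refactor listed under "base" (specify a compression target, specify an algorithm with its parameters, and parse both through a unified interface) could look roughly like the sketch below. Everything here is a hypothetical illustration and not the final design: the keys `op_names`, `targets`, `algo`, `sparse_ratio`, `sparse_pattern` and `quant_bits`, the `TargetSetting` class, and the `parse_config_list` function are all assumed names, and only the three targets spelled out in the roadmap are accepted.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Only the targets named explicitly in the roadmap; the real set may be larger.
VALID_TARGETS = {"input", "output", "weight"}


@dataclass
class TargetSetting:
    """One (operation, target) pair with the algorithm applied to it."""
    op_name: str
    target: str
    algo: str
    params: Dict[str, Any] = field(default_factory=dict)


def parse_config_list(config_list: List[Dict[str, Any]]) -> List[TargetSetting]:
    """Flatten a config list into per-operation, per-target settings."""
    settings = []
    for config in config_list:
        for op_name in config["op_names"]:
            for target, spec in config["targets"].items():
                if target not in VALID_TARGETS:
                    raise ValueError(f"unknown compression target: {target}")
                # Everything except the algorithm name is an algorithm parameter.
                params = {k: v for k, v in spec.items() if k != "algo"}
                settings.append(TargetSetting(op_name, target, spec["algo"], params))
    return settings


# Hypothetical config: prune the weights and quantize the outputs of two layers.
config_list = [
    {
        "op_names": ["conv1", "conv2"],
        "targets": {
            "weight": {"algo": "prune", "sparse_ratio": 0.5, "sparse_pattern": "channel"},
            "output": {"algo": "quantize", "quant_bits": 8},
        },
    },
]

settings = parse_config_list(config_list)
for setting in settings:
    print(setting)
```

Keeping the target and the algorithm as separate fields means a pruner and a quantizer can read the same parsed settings, which is one possible reading of the "unified interface" item.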