v1.4.0: ORTQuantizer and ORTOptimizer refactorization

@echarlaix released this 08 Sep 17:56

ONNX Runtime

  • Refactorization of ORTQuantizer (#270) and ORTOptimizer (#294)
  • Add ONNX Runtime fused Adam Optimizer (#295)
  • Add ORTModelForCustomTasks, enabling ONNX Runtime inference for custom tasks (#303)
  • Add ORTModelForMultipleChoice, enabling ONNX Runtime inference for models with a multiple-choice classification head (#358)

Torch FX

  • Add FuseBiasInLinear, a transformation that fuses the weight and the bias of linear modules (#253)
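
To illustrate the idea behind fusing a linear layer's weight and bias, here is a minimal sketch in plain Python. It is not the Optimum implementation and does not use Torch FX; the function names (linear, fuse_bias, fused_linear) are hypothetical. It only shows the identity W x + b = [W | b] [x; 1], which is one common way such a fusion can be expressed.

```python
def linear(weight, bias, x):
    """Compute y = W x + b for a single input vector x."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weight, bias)
    ]


def fuse_bias(weight, bias):
    """Append the bias as an extra column of the weight: W' = [W | b]."""
    return [row + [b] for row, b in zip(weight, bias)]


def fused_linear(fused_weight, x):
    """Compute y = W' [x; 1], with no separate bias addition."""
    x_aug = x + [1.0]  # augment the input with a constant 1
    return [sum(w * xi for w, xi in zip(row, x_aug)) for row in fused_weight]


# Illustrative values only (assumptions, not taken from the release)
weight = [[1.0, 2.0], [3.0, 4.0]]
bias = [0.5, -1.0]
x = [2.0, 1.0]

original = linear(weight, bias, x)
fused = fused_linear(fuse_bias(weight, bias), x)
```

Both paths give the same output; the fused form trades the separate bias addition for one extra input column.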

Improvements and bugfixes

  • Allow disregarding the precomputed past_key_values during ONNX Runtime inference of Seq2Seq models (#241)
  • Enable node exclusion from quantization in the benchmark suite (#284)
  • Allow using an authentication token when loading a calibration dataset (#289)
  • Fix optimum pipeline when no model is given (#301)