ONNX Runtime v1.4.0

@yuslepukhin released this 17 Jul 20:24

Key Updates

  • Performance optimizations for Transformer models
    • GPT2 - Enabled optimizations for Attention with Past State and Attention Mask
    • BERT - Improved EmbedLayerNormalization fusion coverage
  • Quantization updates
    • Added new quantization operators: QLinearAdd, QAttention
    • Improved quantization performance for transformer based models on CPU
      • More graph fusion
      • Further optimization in MLAS kernel
      • Introduced pre-packing for constant Matrix B of DynamicQuantizeMatMul and QAttention
  • New Python IOBinding APIs (bind_cpu_input, bind_output, copy_outputs_to_cpu) allow easier benchmarking
    • Users no longer need to allocate inputs and outputs on non-CPU devices using third-party allocators.
    • Users no longer need to copy inputs to non-CPU devices; ORT handles the copy.
    • Users can now use copy_outputs_to_cpu to copy outputs from non-CPU devices to CPU for verification.
  • CUDA support for Einsum (opset12)
  • ONNX Runtime Training updates
    • Opset 12 support
    • New sample for a training experiment using Hugging Face GPT-2
      • Upgraded Docker image built from the latest PyTorch release
  • Telemetry is now enabled by default for Python packages and GitHub release zip files (C API); see more details on what telemetry is collected in ORT and how
  • [Coming soon] Availability of Python package for ONNX Runtime 1.4 for Jetpack 4.4
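
The IOBinding flow listed under Key Updates can be sketched roughly as follows. This is a minimal illustration, not an official sample: `run_with_io_binding` is a hypothetical helper name, the session is assumed to be an `onnxruntime.InferenceSession` created by the caller, and the inputs are assumed to be NumPy arrays that live on the CPU.

```python
def run_with_io_binding(session, feeds):
    """Run `session` once through an IOBinding and return the outputs on CPU.

    session: an onnxruntime.InferenceSession created by the caller (assumed)
    feeds:   dict mapping input names to CPU-resident NumPy arrays (assumed)
    """
    binding = session.io_binding()

    # Bind inputs that live on the CPU; ORT handles any copy to the device.
    for name, array in feeds.items():
        binding.bind_cpu_input(name, array)

    # Bind every output by name and let ORT allocate the buffer,
    # so no third-party allocator is involved.
    for output in session.get_outputs():
        binding.bind_output(output.name)

    # Run the model using the bound inputs and outputs.
    session.run_with_iobinding(binding)

    # Copy the results back to the CPU, e.g. to verify them against a reference.
    return binding.copy_outputs_to_cpu()
```

A caller would create the session first, for example with `onnxruntime.InferenceSession("model.onnx")` (the path is a placeholder), and then call `run_with_io_binding(session, {"input_name": array})`. For benchmarking, the binding would typically be set up once and only the `run_with_iobinding` call timed.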

Execution Providers

New Execution Providers available for preview:

Contributions

Contributors to ONNX Runtime include members across teams at Microsoft, along with our community members:

snnn, tianleiwu, edgchen1, hariharans29, skottmckay, tracysh, yufenglee, fs-eire, codemzs, tiagoshibata, yuslepukhin, gwang-msft, wschin, smk2007, prabhat00155, liuziyue, liqunfu, ytaous, iK1D, BowenBao, askhade, pranavsharma, faxu, jywu-msft, ryanlai2, xzhu1900, KeDengMS, tlh20, smkarlap, weixingzhang, jeffbloo, RyanUnderhill, mrry, jgbradley1, stevenlix, zhanghuanrong, suffiank, Andrews548, pengwa, SherlockNoMad, orilevari, duli2012, yangchen-MS, yan12125, jornt-xilinx, ashbhandare, neginraoof, Tixxx, thiagocrepaldi, Craigacp, mayeut, chilo-ms, prasanthpul, martinb35, manashgoswami, zhangxiang1993, suryasidd, wangyems, kit1980, RandySheriffH, fdwr