ONNX Runtime v1.15.0

@snnn snnn released this 25 May 01:44
· 1 commit to rel-1.15.0 since this release
638146b

Announcements

Starting with the next release (ONNX Runtime 1.16.0), at the operating system level we will drop support for

  • iOS 11 and below. iOS 12 will be the minimum supported version.
  • CentOS 7, Ubuntu 18.04, and any Linux distro without glibc version >=2.28.

At the compiler level, we will drop support for

  • GCC version <= 9
  • Visual Studio 2019

We will also remove the onnxruntime_DISABLE_ABSEIL build option, because we will upgrade protobuf and the new protobuf version requires abseil.
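
To see whether a Linux machine would still be covered after the glibc cutoff announced above, a quick check along these lines may help. This is a minimal sketch using only the Python standard library; the helper names `parse_version` and `glibc_meets_minimum` are illustrative and not part of ONNX Runtime.

```python
import platform


def parse_version(text):
    """Turn a dotted version string such as '2.28' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_meets_minimum(version_text, minimum=(2, 28)):
    """Return True if the given glibc version string is at least `minimum`."""
    return parse_version(version_text) >= minimum


if __name__ == "__main__":
    # platform.libc_ver() returns ('', '') when glibc cannot be detected,
    # for example on macOS, Windows, or musl-based distros.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        ok = glibc_meets_minimum(version)
        print(f"glibc {version}: meets the 2.28 minimum = {ok}")
    else:
        print("glibc not detected on this system")
```

The comparison is done on integer tuples rather than strings so that, for instance, 2.3 is correctly treated as older than 2.28.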

General

  • Added support for ONNX Optional type in C# API
  • Added collectives to support multi-GPU inferencing
  • Updated macOS build machines to macOS-12, which comes with Xcode 14.2. Xcode 12.4 should no longer be used.
  • Added Python 3.11 support (3.7 is deprecated; 3.8-3.11 are supported) in the onnxruntime (CPU), onnxruntime-gpu, onnxruntime-directml, and onnxruntime-training packages.
  • Updated to CUDA 11.8. ONNX Runtime source code is still compatible with CUDA 11.4 and 12.x.
  • Dropped the support for Windows 8.1 and below
  • Removed the eager mode code and the onnxruntime_ENABLE_EAGER_MODE CMake option.
  • Upgraded Mimalloc version from 2.0.3 to 2.1.1
  • Upgraded protobuf version from 3.18.3 to 21.12
  • New dependency: cutlass, which is only used in CUDA/TensorRT packages.
  • Upgraded DNNL from 2.7.1 to 3.0

Build System

  • On POSIX systems, building the code as the "root" user is disallowed by default. If needed, you can append "--allow_running_as_root" to your build command to bypass the check.
  • Added support for building the source natively on Windows ARM64 with Visual Studio 2022.
  • Added a Gradle wrapper and updated Gradle version from 6.8.3 to 8.0.1. (Gradle is the tool for building ORT Java package)
  • When cross-compiling, the build scripts will try to download a prebuilt protoc from GitHub instead of building the binary from source, because protobuf now has many dependencies and it is not easy to set up a build environment for it.

Performance

Execution Providers

Two new execution providers: JS EP and QNN EP.

TensorRT EP

  • Official support for TensorRT 8.6
  • Explicit shape profile overrides
  • Support for TensorRT plugins via ORT custom op
  • Improved support for TensorRT options (heuristics, sparsity, optimization level, auxiliary stream, tactic source selection, etc.)
  • Support for TensorRT timing cache
  • Improvements to our test coverage, specifically for opset16-17 models and package pipeline unit test coverage.
  • Other misc bugfixes and improvements.
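
As an illustration of the explicit shape profiles and timing cache listed above, the sketch below builds a provider list that could be passed to an inference session. The option names (`trt_profile_min_shapes`, `trt_profile_opt_shapes`, `trt_profile_max_shapes`, `trt_timing_cache_enable`) and the `name:dim1xdim2x...` shape format are taken from the TensorRT EP documentation as best understood here and should be verified against the docs for your version. The helper `tensorrt_providers`, the input name `input`, and the shapes are hypothetical.

```python
def tensorrt_providers(input_name, min_shape, opt_shape, max_shape):
    """Build a providers list with an explicit TensorRT shape profile.

    Assumption: shapes are written as 'input_name:d1xd2x...' strings.
    """

    def fmt(shape):
        return f"{input_name}:" + "x".join(str(dim) for dim in shape)

    trt_options = {
        "trt_profile_min_shapes": fmt(min_shape),
        "trt_profile_opt_shapes": fmt(opt_shape),
        "trt_profile_max_shapes": fmt(max_shape),
        # Reuse kernel timing results across engine builds.
        "trt_timing_cache_enable": True,
    }
    # Fall back to the CUDA EP for nodes TensorRT does not take.
    return [("TensorrtExecutionProvider", trt_options), "CUDAExecutionProvider"]


providers = tensorrt_providers(
    "input", (1, 3, 224, 224), (8, 3, 224, 224), (16, 3, 224, 224)
)

# With onnxruntime-gpu installed and a hypothetical model file:
# import onnxruntime as ort
# session = ort.InferenceSession("model.onnx", providers=providers)
```

The session creation is left commented out so the snippet runs without a GPU build or a model file.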

OpenVINO EP

  • Support for OpenVINO 2023.0
  • Dynamic shapes support for iGPU
  • Changes to OpenVINO backend to improve first inference latency
  • Deprecation of HDDL-VADM and Myriad VPU support
  • Misc bug fixes.

QNN EP

DirectML EP

AzureEP

  • Added support for OpenAI whisper model
  • Available as a NuGet package in addition to Python

Mobile

New packages

  • Swift Package Manager for onnxruntime
  • Nuget package for onnxruntime-extensions (supports Android/iOS for MAUI/Xamarin)
  • React Native package for onnxruntime can optionally include onnxruntime-extensions

Pre/Post processing

  • Added support for built-in pre and post processing for NLP scenarios: classification, question-answering, text-prediction

  • Added support for built-in pre and post processing for Speech Recognition (Whisper)

  • Added support for built-in post processing for Object Detection (YOLO): non-max suppression and drawing bounding boxes

  • Additional CoreML and NNAPI kernels to support customer scenarios

    • NNAPI: BatchNormalization, LRN
    • CoreML: Div, Flatten, LeakyRelu, LRN, Mul, Pad, Pow, Sub

Web

  • [preview] WebGPU support
  • Support building the source code with "MinGW make" on Windows.

ORT Training

On-device training:

  • Official package for On-Device Training now available. On-device training extends ORT Inference solutions to enable training on edge devices.
  • APIs and Language bindings supported for C, C++, Python, C#, Java.
  • Packages available for Desktop and Android.
  • For custom builds, refer to the build instructions.

Others

  • Added graph optimizations which leverage the sparsity in the label data to improve performance. With these optimizations we see performance gains ranging from 4% to 15% for popular HF models over baseline ORT.
  • Vision transformer models like ViT, BEIT and SwinV2 see up to 44% speedup with ORT Training + DeepSpeed over PyTorch eager mode on AzureML.
  • Added optimizations for SOTA models like Dolly and Whisper. ORT Training + DeepSpeed now gives ~17% speedup for Whisper and ~4% speedup for Dolly over PyTorch eager mode. Dolly optimizations on the main branch show a ~40% speedup over eager mode.

Known Issues

  • The onnxruntime-training 1.15.0 packages published to pypi.org were built in Debug mode instead of Release mode. You can get the correct packages from https://download.onnxruntime.ai/. We will fix the issue in the next patch release.
  • XNNPack EP does not work on x86 CPUs without AVX-512 instructions, because we used the wrong alignment when allocating buffers for XNNPack to use.
  • The CUDA EP source code has a build error when CUDA version <11.6. See #16000.
  • The onnxruntime-training builds are missing the training header files.
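
For the XNNPack issue above, one way to check whether a Linux x86 machine is affected is to look for the `avx512f` flag in `/proc/cpuinfo`. This is a rough sketch, not an official detection method: the helper `has_avx512` is hypothetical, it only applies to Linux on x86, and it treats `avx512f` (the AVX-512 foundation flag) as a stand-in for AVX-512 support.

```python
def has_avx512(cpuinfo_text):
    """Return True if the first 'flags' line in cpuinfo text lists avx512f."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Lines look like 'flags\t\t: fpu vme ... avx512f ...'
            _, _, value = line.partition(":")
            return "avx512f" in value.split()
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo", encoding="utf-8") as handle:
            print("AVX-512 available:", has_avx512(handle.read()))
    except OSError:
        print("Could not read /proc/cpuinfo (not Linux?)")
```

If this reports no AVX-512 on an x86 machine, it may be safer to avoid the XNNPack EP with 1.15.0.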

Contributions

Contributors to ONNX Runtime include members across teams at Microsoft, along with our community members:
snnn, fs-eire, edgchen1, wejoncy, mszhanyi, PeixuanZuo, pengwa, jchen351, cloudhan, tianleiwu, PatriceVignola, wangyems, adrianlizarraga, chenfucn, HectorSVC, baijumeswani, justinchuby, skottmckay, yuslepukhin, RandyShuai, RandySheriffH, natke, YUNQIUGUO, smk2007, jslhcl, chilo-ms, yufenglee, RyanUnderhill, hariharans29, zhanghuanrong, askhade, wschin, jywu-msft, mindest, zhijxu-MS, dependabot[bot], xadupre, liqunfu, nums11, gramalingam, Craigacp, fdwr, shalvamist, jstoecker, yihonglyu, sumitsays, stevenlix, iK1D, pranavsharma, georgen117, sfatimar, MaajidKhan, satyajandhyala, faxu, jcwchen, hanbitmyths, jeffbloo, souptc, ytaous, kunal-vaishnavi