ONNX Runtime QNN Execution Provider v2.1.1

qti-mbadnara released this 20 May 04:05
· 1 commit to rel-2.1.1 since this release
6c58650

This is a patch release of the QNN Execution Provider, containing bug fixes and packaging updates.

ONNX Runtime Compatibility: >= 1.24.1 (compiled with v1.24.4)

QAIRT SDK Compatibility: 2.45.41

pip install onnxruntime==1.24.4
pip install onnxruntime-qnn==2.1.1
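
After installing, it can be useful to confirm that the ONNX Runtime version in the environment meets the ">= 1.24.1" requirement stated above. The following is a minimal sketch using only the standard library; the helper names `parse_version` and `is_supported` are hypothetical and not part of either package, and the parser is deliberately simple (a pre-release suffix such as "rc1" is ignored rather than ordered).

```python
from importlib import metadata

# Minimum ONNX Runtime version stated in these release notes.
MIN_ORT = (1, 24, 1)


def parse_version(text):
    """Turn '1.24.4' into (1, 24, 4). Hypothetical helper, not a library API.

    Only the leading digits of the first three dot-separated fields are used,
    so '1.24.1rc1' is read as (1, 24, 1).
    """
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_supported(ort_version):
    """Return True if the given ONNX Runtime version is at least MIN_ORT."""
    return parse_version(ort_version) >= MIN_ORT


if __name__ == "__main__":
    try:
        installed = metadata.version("onnxruntime")
    except metadata.PackageNotFoundError:
        print("onnxruntime is not installed")
    else:
        status = "ok" if is_supported(installed) else "older than 1.24.1"
        print(f"onnxruntime {installed}: {status}")
```

Running the script prints the installed `onnxruntime` version and whether it satisfies the minimum; it does not check the QAIRT SDK version.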

Bug Fixes

  • QNN EP: Fixed a per-tensor, per-inference memory leak in OrtTensorTypeAndShapeInfo during ExecuteGraph on the ABI path. (#326)
  • QNN EP: Fixed TryGetMaxSpillFillSize reading all EP contexts instead of only main contexts, which caused QNN_CONTEXT_ERROR_INVALID_CONFIG on multi-split weight-shared models. (#328)

Improvements

  • QNN EP: Switched onnxruntime_providers_qnn.dll to static MSVC runtime linkage, eliminating the runtime dependency on MSVCP140.dll and VCRUNTIME140.dll. (#241)
  • QNN EP: Reduced peak memory in model compatibility validation from ~200 MB to ~50 MB by removing the context blob version from compatibility checks, which avoids creating a fake context binary and loading the preparation library. (#366)

Packaging

  • Linux ARM64 Python wheels — promoted from preview (v2.1.0) to officially supported. As with Windows, wheels are published for Python 3.11 through Python 3.14.
  • Linux ARM64 .tgz archive — new distribution shipping the QNN EP shared library and headers for use outside of Python.

Known Issues

  • WoS AMD64, Python 3.11: installer issue causes inference failure. On Windows on Snapdragon (WoS), a known installer issue makes ep.get_library_path() return the amd64 folder path instead of the arm64ec one, so inference fails in the AMD64 Python 3.11 environment. As a workaround, construct the path to the arm64ec library manually. Only Python 3.11 is affected.
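
One way to apply the workaround is to take the path that ep.get_library_path() returns and swap its amd64 folder for arm64ec. The sketch below assumes the two folders are siblings named exactly "amd64" and "arm64ec", which these notes imply but do not spell out; `to_arm64ec_path` is a hypothetical helper, and the example paths are made up for illustration. Check the installed package layout before relying on it.

```python
from pathlib import PureWindowsPath


def to_arm64ec_path(library_path):
    """Swap an 'amd64' folder component for 'arm64ec' (hypothetical helper).

    Assumes the arm64ec library sits in a sibling folder of the amd64 one.
    Works whether the input points at the folder itself or at a file in it.
    """
    path = PureWindowsPath(library_path)
    parts = list(path.parts)
    # Search from the end so only the component closest to the library changes.
    for i in range(len(parts) - 1, -1, -1):
        if parts[i].lower() == "amd64":
            parts[i] = "arm64ec"
            return str(PureWindowsPath(*parts))
    # No 'amd64' component found: return the path unchanged.
    return str(path)


# Intended use on an affected Python 3.11 install, where `ep` is the object
# these notes refer to (its import is not shown here):
#     library_path = to_arm64ec_path(ep.get_library_path())
```

The function only rewrites the string; it does not verify that the resulting file exists, so a check with `os.path.exists` before registering the library may be worthwhile.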

Platform Support

Package      | Windows ARM64 | Windows x64                 | Linux ARM64
Python Wheel | Inference     | AOT compilation + Inference | Inference
NuGet        | Inference     | -                           | -
ZIP          | Inference     | -                           | -
tgz          | -             | -                           | Inference

Full Changelog: rel-2.1.0...rel-2.1.1

Contributors

This release includes contributions from:

Arnav Deshpande, Ashwath Shankarnarayan, Badri Narayanan, Calvin Nguyen, Cheng-Hsin Weng, Chun-Chih Teng, Hua-Yu Chou, Hung-Jui Wang, Jeff Kilpatrick, Kuan-Yu Lin, Kyle Romero, Matthew Sinclair, Mike Hsu, Min Fong Hong, Samrat Dutta, Shubham Patel, Tirupathi Reddy T, Yathindra Kota, Yuduo Wu, Yu-Hung Chuang