What's new?

This release adds an optimized CPU/MLAS implementation of DequantizeLinear (8-bit) and introduces the --client_package_build build option, which enables defaults better suited to client/on-device workloads (e.g., thread spinning is disabled by default).

Build System & Packages

Add --client_package_build option (#25351) - @jywu-msft
Remove the python installation steps from win-qnn-arm64-ci-pipeline.yml (#25552) - @snnn

CPU EP

Add multithreaded/vectorized implementation of DequantizeLinear for int8 and uint8 inputs (SSE2, NEON) (#24818) - @adrianlizarraga
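
For readers unfamiliar with the operator, the following is a minimal scalar sketch of what DequantizeLinear computes in the per-tensor case, following the ONNX definition y = (x - zero_point) * scale. It is not the MLAS kernel from #24818, which vectorizes and parallelizes this arithmetic; the function name `dequantize_linear` and the sample values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def dequantize_linear(x: Sequence[int], scale: float, zero_point: int = 0) -> List[float]:
    """Reference per-tensor dequantization: y = (x - zero_point) * scale.

    Hypothetical helper for illustration only; it is not part of the
    ONNX Runtime API and does not reflect how the optimized kernel is written.
    """
    # Subtract in Python integers so the difference cannot wrap around
    # the way 8-bit arithmetic would, then scale to floating point.
    return [(int(v) - zero_point) * scale for v in x]


# uint8 input with a zero point in the middle of the range.
print(dequantize_linear([0, 64, 128, 255], scale=0.5, zero_point=128))

# int8 input with a zero point of 0.
print(dequantize_linear([-128, 0, 127], scale=0.25))
```

A vectorized implementation applies the same subtract-then-multiply step to many elements per instruction and splits the tensor across threads, so the results should match this reference.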

QNN EP

Add support for the Upsample, Einsum, LSTM, and CumSum operators (#24265, #24616, #24646, #24820) - @quic-zhaoxul, @1duo, @chenweng-quic, @Akupadhye
Fuse scale into Softmax (#24809) - @qti-yuduo
Enable DSP queue polling when performance is set to “burst” mode (#25361) - @quic-calvnguy
Update QNN SDK to version 2.36.1 (#25388) - @qti-jkilpatrick
Include the license file from QNN SDK in the Microsoft.ML.OnnxRuntime.QNN NuGet package (#25158) - @HectorSVC

ONNX Runtime v1.22.2

What's new?

Build System & Packages

CPU EP

QNN EP

Contributors

Uh oh!