Pulse · microsoft/onnxruntime · GitHub

February 7, 2025 – March 7, 2025

Overview

216 Active pull requests

121 Active issues

1 Release published by 1 person

v1.20.2 ONNX Runtime v1.20.2 [QNN-only]
published Feb 12, 2025

164 Pull requests merged by 58 people

replace usage of gsl::narrow and gsl::narrow_cast in WebGPU EP
#23926 merged Mar 7, 2025
Fix license in example test code.
#23936 merged Mar 7, 2025
Create a packaging pipeline for a custom nuget package
#23918 merged Mar 7, 2025
[AIX] External data handling
#23859 merged Mar 7, 2025
Updated ov version in pipeline (#595)
#23882 merged Mar 7, 2025
Fix ConvInteger handling of optional inputs.
#23935 merged Mar 7, 2025
Updated run_CIs_for_external_pr.py to support the Windows OpenVINO CI pipeline
#23931 merged Mar 7, 2025
fix binplace file in web pipeline
#23930 merged Mar 7, 2025
Enabling L2+ Optimizations for EPs
#23517 merged Mar 7, 2025
Example custom op with output type inferencing
#23916 merged Mar 7, 2025
Support all block sizes that are multiples of 32 for DP4A
#23907 merged Mar 7, 2025
Exclude MAUI projects from GPU C# packaging builds
#23923 merged Mar 7, 2025
[WebGPU EP] SoftMax Implementation
#23538 merged Mar 7, 2025
Adding OpenVINO Windows CI Pipeline
#23919 merged Mar 7, 2025
enable WebGPU EP in WebAssembly build
#23913 merged Mar 6, 2025
[JSEP/WebGPU] Fixed error in softmax dispatch.
#23906 merged Mar 6, 2025
WebGPU: Remove deprecated subgroups-f16 from WebGPU native and JS EP
#23898 merged Mar 6, 2025
Ensure that the 'cmake_minimum_required' is version 3.5 or greater
#23888 merged Mar 6, 2025
[WebNN] Accept Float16Array for float16 data type if it is available
#23894 merged Mar 6, 2025
[webgpu] support Pad operator
#23141 merged Mar 6, 2025
[webgpu] Restore MatMulNBits workgroup size for Phi-3.5
#23349 merged Mar 6, 2025
Round 2 of cherry-picks into rel-1.21.0
#23899 merged Mar 6, 2025
[js/web] improve workaround for bundlers
#23902 merged Mar 6, 2025
Dynamo export and improve benchmark script for SAM2 encoder
#23887 merged Mar 5, 2025
[WebGPU EP] introduce BiasAdd contrib op
#23861 merged Mar 5, 2025
[WebGPU-EP Native] Add ReduceMean
#23860 merged Mar 5, 2025
Fix enable_pix_capture build for WebGPU
#23857 merged Mar 5, 2025
Fix formatting in snapdragon.md
#23900 merged Mar 5, 2025
Add snapdragon tutorial
#23890 merged Mar 5, 2025
[QNN-EP]: Fix inference failures while running with htp_shared_memory
#23892 merged Mar 5, 2025
[TensorRT EP] Add doc for trt_op_types_to_exclude
#23893 merged Mar 5, 2025
[QNN EP Docs] Update docs for building QNN EP as shared or static library
#23873 merged Mar 5, 2025
Enable QNN EP weight sharing generation using public API
#23702 merged Mar 5, 2025
Doc update related to EPContext model default name
#23865 merged Mar 4, 2025
Add dawn to ThirdPartyNotices
#23876 merged Mar 4, 2025
Allow using extended minimal build for several EPs
#23834 merged Mar 4, 2025
Change gsl::byte to std::byte
#23872 merged Mar 4, 2025
[OpenVINO] Fix a build warning
#23877 merged Mar 4, 2025
[js/webgpu] Reland the optimization of ConvTranspose
#23858 merged Mar 4, 2025
[Doc] Update CUDA option prefer_nhwc
#23812 merged Mar 4, 2025
[js/common] allows using Uint16Array as data for float16 tensor
#23827 merged Mar 3, 2025
Make Nuget QNN package pipeline 1ES compliant
#23805 merged Mar 3, 2025
Change the logic to generate the default ep context file name
#23788 merged Mar 3, 2025
Quant tool: Consistent get_qdq_config and get_qnn_qdq_config behavior
#23856 merged Mar 2, 2025
Fix typos in csharp/src/Microsoft.ML.OnnxRuntime/
#23848 merged Mar 1, 2025
Fix typo: change Upample to Upsample.
#23838 merged Mar 1, 2025
Model Builder API
#23223 merged Feb 28, 2025
Cherry-picks into rel-1.21.0
#23846 merged Feb 28, 2025
Fix flash attention for GQA (Phi4)
#23850 merged Feb 28, 2025
Revert changes on mac-react-native-ci-pipeline.yml
#23845 merged Feb 27, 2025
[Mlas] Unblock hardcoded matmul blocking size
#23815 merged Feb 27, 2025
Increase npm package pipeline ReactNative_CI_iOS timeout to 120 mins
#23825 merged Feb 27, 2025
[ORT/CI_Pipeline] Use --enable_generic_interface in ORT builds for EP testing
#23801 merged Feb 27, 2025
Quant tool: Add nodes_to_exclude in get_qnn_qdq_config
#23779 merged Feb 27, 2025
Update onnxruntime_external_deps.cmake: add missing EXCLUDE_FROM_ALL
#23829 merged Feb 27, 2025
[OVEP] Update support for Contrib Ops
#23789 merged Feb 27, 2025
upgrade emsdk to 4.0.4
#23819 merged Feb 27, 2025
[webgpu] Fix alignment issues in shader code
#23776 merged Feb 27, 2025
[TensorRT EP] update oss parser to latest
#23710 merged Feb 27, 2025
[ARM CPU] Fix flaky hgemmb ut
#23814 merged Feb 27, 2025
Make Nuget CUDA package pipeline 1ES compliant
#23804 merged Feb 26, 2025
Upgrade React Native to 0.73
#23575 merged Feb 26, 2025
[webgpu] support resize operator
#23780 merged Feb 26, 2025
Converting npm packaging pipeline to 1ES
#23767 merged Feb 26, 2025
Make Nuget package pipeline 1ES compliant
#23803 merged Feb 26, 2025
[QNN EP] Re-enable several disabled QNN-EP UTs
#23799 merged Feb 26, 2025
[VitisAI] add new interface
#23777 merged Feb 25, 2025
[QNN EP] Use absolute path of libcdsprpc.dll on Windows so it doesn't need to be copied anywhere.
#23791 merged Feb 24, 2025
Create dependencies.md for improving build document
#23786 merged Feb 23, 2025
Bump version from 1.21 to 1.22
#23787 merged Feb 23, 2025
[webgpu] Enable FlashAttention for GQA
#23761 merged Feb 22, 2025
[WebNN] Fix missing parameter
#23778 merged Feb 22, 2025
Uploaded deepseek blog, ready for post.
#23740 merged Feb 22, 2025
Set build user's uid when creating Migraphx/ROCM docker images
#23657 merged Feb 21, 2025
[TensorRT EP] Add new provider option to exclude ops from running on TRT
#23705 merged Feb 21, 2025
Update cmake_cuda_architecture to control package size
#23671 merged Feb 21, 2025
[webgpu] Implement SubGroupMatrix based MatMulNBits for Metal
#23729 merged Feb 21, 2025
[Optimizer] Fix exception for Q -> DQ sequence with different scale types
#23771 merged Feb 21, 2025
OVEP: Bug Fixes, Refactoring, and Contrib Ops Update
#23742 merged Feb 21, 2025
Shape inference: GatherBlockQuantized dispatcher
#23748 merged Feb 21, 2025
[QNN EP] Passthrough EP Parameters in Node
#23468 merged Feb 20, 2025
[JSEP] fix scatter-nd jsep kernel
#23755 merged Feb 20, 2025
[onnxruntime/build] Add CI testing for ORT build with generic interface
#23530 merged Feb 20, 2025
Rope embedding kernel to use avx2
#23694 merged Feb 20, 2025
Add a new build flag to build.py for using with vcpkg
#23723 merged Feb 20, 2025
Capacity aware partitioning
#22766 merged Feb 20, 2025
[ARM CPU] Enable FP16 kernels for GQA op
#23746 merged Feb 20, 2025
[webgpu] Use components for VxAttentionScore
#23726 merged Feb 20, 2025
[AIX]eigen update fix and test failures fix
#23751 merged Feb 20, 2025
Add condition to gpu wheel build flag
#23760 merged Feb 20, 2025
Fix security vulnerability with Whisper export
#23743 merged Feb 20, 2025
[Doc] Update CUDA and cuDNN installation and preload
#23708 merged Feb 20, 2025
[QNN EP] Include QNN error handle value in fallback error message.
#23756 merged Feb 20, 2025
[AIX] cmake cleanup
#23752 merged Feb 20, 2025
Replace Linux A10 pools with A100
#23547 merged Feb 19, 2025
Quantization tool: Use nanmin, nanmax, nanmean in calibrator
#23749 merged Feb 19, 2025
[VitisAI] fix deinit vitisai ep
#23725 merged Feb 19, 2025
[QNN EP] Build Python 3.13 packages for QNN
#23706 merged Feb 19, 2025
[CUDA] Update preload_dlls to coexist with PyTorch
#23744 merged Feb 19, 2025
Bump esbuild from 0.19.3 to 0.25.0 in /js
#23639 merged Feb 19, 2025
Add migration guide
#23482 merged Feb 18, 2025
Update Eigen to the latest
#23717 merged Feb 18, 2025
[QNN] MatMulAddFusion and Reshape Related Fusion
#22494 merged Feb 18, 2025
Increase the python version requirement in CMakeLists.txt from 3.8 to 3.10
#23718 merged Feb 18, 2025
[js/web] remove "types" from "exports" field in package.json
#23733 merged Feb 18, 2025
Fix ACL option parsing
#23586 merged Feb 17, 2025
Fix attention fusion in conformer encoder
#23711 merged Feb 16, 2025
[CUDA] Preload dependent DLLs
#23674 merged Feb 15, 2025
Create DeepSeek-R1-Distill-Qwen-python.md
#23709 merged Feb 15, 2025
[webgpu] Use workgroup_idx instead of workgroup_id.x
#23696 merged Feb 14, 2025
VCPKG improvements
#23688 merged Feb 14, 2025
[webgpu] Fix MatMulNBits prefill shader synchronization
#23663 merged Feb 14, 2025
Update Wheel location for GenAI
#23690 merged Feb 14, 2025
Add execution provider arg
#23692 merged Feb 14, 2025
Add extra requires for cuda/cudnn DLLs to onnxruntime-gpu python package
#23659 merged Feb 14, 2025
Remove duplicated mimalloc def
#23695 merged Feb 14, 2025
Add new kernels
#23220 merged Feb 14, 2025
Change Execution Provider arg
#23691 merged Feb 14, 2025
[VitisAI] fix throw on dfs
#23678 merged Feb 14, 2025
Bump ruff from 0.9.4 to 0.9.5
#23624 merged Feb 14, 2025
[QNN EP] Dump QNN json graph
#22843 merged Feb 14, 2025
Exclude node quantization in RTN
#23683 merged Feb 13, 2025
[ARM CPU] add notrans hgemm mlas kernel
#23668 merged Feb 13, 2025
WIP: DP4AMatMul fix matmul for subgroup size 64 GPUs
#23637 merged Feb 13, 2025
Avoid compiling wil on non-Windows platforms
#23675 merged Feb 13, 2025
Update Dockerfile.manylinux2_28_rocm
#23650 merged Feb 13, 2025
Set ANDROID_AVD_HOME env variable
#23672 merged Feb 13, 2025
Mapping ORT verbose log level to QNN verbose log level
#23673 merged Feb 13, 2025
Remove training CIs from external PR CI list.
#23425 merged Feb 13, 2025
Enable Relocatable Device Code (RDC) to build ORT with cuda 12.8
#23562 merged Feb 13, 2025
Fix broken links in website
#23654 merged Feb 12, 2025
[CUDA] Not link CUDNN sub libs
#23656 merged Feb 12, 2025
Update deploy pages action version
#23670 merged Feb 12, 2025
add op_types_to_quantize to get_qnn_qdq_config
#23458 merged Feb 12, 2025
Revert "remove --use_vcpkg flag for Python-CUDA-Packaging-Pipeline"
#23651 merged Feb 12, 2025
[webgpu] fixes buffer handle leak in cache manager
#23655 merged Feb 12, 2025
disable codeql for cuda gpu pipelines
#23652 merged Feb 12, 2025
Upgrade emsdk version to v4.0.3
#23633 merged Feb 12, 2025
Update upload pages artifact action version
#23653 merged Feb 11, 2025
[WebNN] Add op support validation for decomposed WebNN ops
#23370 merged Feb 11, 2025
Add VCPKG's prerequisites to AMD GPU EPs docker files
#23636 merged Feb 11, 2025
Enable averagepool tests
#23595 merged Feb 11, 2025
Fix an installation issue related to absl
#23641 merged Feb 11, 2025
Remove unused local variables
#23634 merged Feb 11, 2025
Update win-ort-main to tip main 250211
#23646 merged Feb 11, 2025
[WebNN EP] Automatically move input CPU tensors to ml-tensor
#23073 merged Feb 11, 2025
use correct total length to fix static kv_cache performance
#23615 merged Feb 11, 2025
remove --use_vcpkg flag for Python-CUDA-Packaging-Pipeline
#23631 merged Feb 11, 2025
Add python_requires to package metadata
#23604 merged Feb 11, 2025
[QNN EP] Add QNN EP to ARM64X build targets
#23635 merged Feb 11, 2025
[webgpu] no longer need pass-in gpu adapter for custom context
#23593 merged Feb 10, 2025
Fix logic for selecting alternate name for blob
#23617 merged Feb 10, 2025
[VitisAI] Add vaip Integration Using FetchContent (Cherry-pick of PR#22038 to win-ort-main branch)
#23608 merged Feb 10, 2025
[ARM CPU] Add fp16 mlas kernels for exp, tanh, softmax, logsoftmax, softcap
#23597 merged Feb 10, 2025
Update pybind and json to the latest
#23589 merged Feb 10, 2025
Migrate iOS release pipeline to 1 ES
#23606 merged Feb 10, 2025
Increase timeout for Windows TensorRT CI
#23625 merged Feb 10, 2025
[ORT 1.20.2 Release] Cherry pick 1st round
#23574 merged Feb 10, 2025
fix on trtCudaVersion
#23616 merged Feb 8, 2025
update run CI script
#23621 merged Feb 8, 2025
[WebGPU] Support PIX Capture for WebGPU EP
#23192 merged Feb 8, 2025
Fix for C4267 warning
#23610 merged Feb 8, 2025
Validate the context_file_path before EP compile graphs
#23611 merged Feb 8, 2025
[webgpu] Use pushErrorScope()/popErrorScope() once for an inference run
#23438 merged Feb 7, 2025

52 Pull requests opened by 33 people

Enable multithreading on FP16 to FP32 cast operator
#23619 opened Feb 7, 2025
Integrate KleidiAI for MatMulNBits via MlasQNBitGemm
#23627 opened Feb 10, 2025
WIP: Enable FA for GQA
#23630 opened Feb 10, 2025
Bump transformers from 4.41.2 to 4.48.0 in /onnxruntime/python/tools/transformers/models/stable_diffusion/requirements
#23645 opened Feb 11, 2025
Reenable some averagepool tests that fail
#23649 opened Feb 11, 2025
Test CUDNN_FRONTEND_SKIP_JSON_LIB=ON
#23660 opened Feb 12, 2025
Make quantize shader work for all gpus
#23676 opened Feb 13, 2025
Cleanup onnx test runner's code
#23693 opened Feb 13, 2025
[WIP] migrate WebGPU EP to WebAssembly to replace JSEP
#23697 opened Feb 14, 2025
[webgpu]Add MaxPool and AveragePool
#23714 opened Feb 15, 2025
[VitisAI] export Graph::SetName to VitisAI EP
#23731 opened Feb 18, 2025
Updated MetaHuman testimonial from Epic Games.
#23737 opened Feb 18, 2025
Investigate crash due to empty size
#23753 opened Feb 19, 2025
Subgroupmatmul memory optimizations
#23758 opened Feb 19, 2025
Enable test_gathernd_example_int32_batch_dim1
#23763 opened Feb 20, 2025
Check for continuous decoding in MHA
#23766 opened Feb 20, 2025
RoPE fp16 avx
#23772 opened Feb 21, 2025
[webgpu] Optimize MatMulNBits f16 prefill shader for subgroup size 32
#23773 opened Feb 21, 2025
NHWC DepthToSpace U8 and its transformation
#23784 opened Feb 21, 2025
Make python package pipeline 1ES compliant
#23800 opened Feb 24, 2025
Make python CUDA package pipeline 1ES compliant
#23802 opened Feb 24, 2025
Make Cuda packaging pipeline 1ES compliant
#23806 opened Feb 24, 2025
[WIP] Flash attention for generation
#23808 opened Feb 25, 2025
Add Snapdragon NPU tutorial
#23813 opened Feb 25, 2025
Add OpenCL EP
#23830 opened Feb 27, 2025
[WebNN] Better int64 integration
#23831 opened Feb 27, 2025
[mobile/reactnative] Remove namespace from AndroidManifest.XML to resolve warning
#23847 opened Feb 27, 2025
[VitisAI] Just for internal test
#23849 opened Feb 28, 2025
[OpenVINO]Session Options Appended After AppendExecutionProvider
#23852 opened Feb 28, 2025
Synchronize patch files, fix resource compiler invocations in some situations
#23855 opened Feb 28, 2025
Bump ruff from 0.9.5 to 0.9.9
#23863 opened Mar 3, 2025
Move Linux DNNL/OpenVino pipelines to onnxruntime-Ubuntu2204-AMD-CPU machine pool
#23870 opened Mar 3, 2025
[QNN EP] Add example that uses a custom CPU allocator for a QNN session
#23880 opened Mar 4, 2025
[VitisAI EP] export InferShapes to VitisAIEP
#23881 opened Mar 4, 2025
Pick Jian's pipeline changes to the 1.21 release branch
#23903 opened Mar 5, 2025
[TensorRT EP] support TensorRT 10.9-GA
#23905 opened Mar 5, 2025
[webgpu] Optimize MatMulNBits for f16 Block32 prefill performance
#23908 opened Mar 6, 2025
[WIP][Native WebGPU] Remove explicit split operator in GQA
#23909 opened Mar 6, 2025
[WebGPU] Direct CPU->GPU buffer upload for UMA
#23910 opened Mar 6, 2025
Fix CUDA EP Abs and Sign bfloat16 support
#23914 opened Mar 6, 2025
[WebGPU EP] Implements Gelu, BiasSplitGelu, and QuickGelu
#23920 opened Mar 6, 2025
Bump SixLabors.ImageSharp from 2.1.9 to 2.1.10 in /csharp/sample/Microsoft.ML.OnnxRuntime.FasterRcnnSample
#23924 opened Mar 6, 2025
Extend CMAKE_CUDA_FLAGS with all Blackwell compute capacity
#23928 opened Mar 7, 2025
[WIP] DepthToSpace for WebGPU EP
#23929 opened Mar 7, 2025
update transformer version to 4.48.0
#23932 opened Mar 7, 2025
VCPKG improvement: set VCPKG_OSX_DEPLOYMENT_TARGET
#23933 opened Mar 7, 2025
[Native WebGPU] Added ReduceMax and ReduceSum
#23934 opened Mar 7, 2025
[js] Add API for accessing metadata of a model's input/output
#23937 opened Mar 7, 2025
[Fix] Dependencies find_package Eigen error
#23939 opened Mar 7, 2025
Add support for custom position ids and attention mask to GQA CPU operator
#23944 opened Mar 7, 2025
Qnn weight sharing improvement
#23945 opened Mar 7, 2025
Allow using a different version of flatbuffers when building with vcpkg
#23946 opened Mar 7, 2025

43 Issues closed by 29 people

[Build] Released asset for v1.20.1 doesn't work on macOS Sequoia
#23922 closed Mar 7, 2025
What's the right way to construct custom ops with the same name but different output types?
#23891 closed Mar 7, 2025
[Performance] Keep Onnx awake while in idle mode
#23461 closed Mar 6, 2025
[Documentation] Unclear how to run `run_benchmark.py`
#23889 closed Mar 5, 2025
[Build] ONNX Run Time on Conda Forge - Add CUDA Support
#23904 closed Mar 5, 2025
[Build] NuGet Package missing header files
#23884 closed Mar 5, 2025
[Build] how to compile ios static library
#23835 closed Mar 4, 2025
ort.InferenceSession fails silently
#23869 closed Mar 4, 2025
[Build] Android compatibility with WebGPU
#23565 closed Mar 4, 2025
Memory leakage from ONNXRuntime environment on Linux machine using C.
#23798 closed Mar 4, 2025
[Web] Shall we accept Uint16Array for 'float16' if Float16Array is available
#23817 closed Mar 3, 2025
When will v1.20.0 be released for onnxruntime-openvino
#22783 closed Mar 3, 2025
[Build] Windows MSVC DNNL build requires <chrono> include
#23854 closed Feb 28, 2025
Selecting XNNPACK as execution provider for Android following the documentation example results in program termination
#23826 closed Feb 28, 2025
[Build] Android build Failure on ONNX Runtime 1.20.2 compiler doesn't support BFLOAT16
#23851 closed Feb 28, 2025
[Build] mp11 not found
#23821 closed Feb 27, 2025
Cuda execution provider is not available
#23833 closed Feb 27, 2025
[Build] Linux i686 32 bit support
#23823 closed Feb 27, 2025
Can't load CUDA on .NET project
#23810 closed Feb 26, 2025
[Bug] MIGraphX EP seeing HipMemcpy via onnxruntime::GPUDataTransfer::CopyTensor that break multi stream execution
#16774 closed Feb 25, 2025
[Performance] Why Does Increasing the Number of CPU Cores Not Improve Performance?
#23747 closed Feb 25, 2025
ONNX runtime inference silently defaults to CPUExecutionProvider, even though GPU is visible [Kaggle Workbook]
#23612 closed Feb 23, 2025
[Web] [Node] onnxruntime-node failing to load in workers
#23790 closed Feb 23, 2025
[Performance] Memory Usage During Session Creating Doubled
#23775 closed Feb 22, 2025
latest version of onnxruntime-gpu fail to use the pip installed cuda libraries
#23643 closed Feb 21, 2025
[Build] Unable to cross-compile ONNX Runtime 1.17.1 for ARM Cortex A53
#23152 closed Feb 20, 2025
C# Cannot initialize InferenceSession on arm64
#23716 closed Feb 19, 2025
[Web] Package path ./webgpu is not exported from package
#23720 closed Feb 19, 2025
Ep Context Model generated with external data is still dependent on the same data file
#23358 closed Feb 19, 2025
Support Python 3.12
#23738 closed Feb 18, 2025
OnnxRuntime does not see CUDAExecutionProvider with CUDA 12.4, CUDNN 9.
#23736 closed Feb 18, 2025
[Build] ONNXRuntime v1.18 C++ build from source with TensorRT failing
#23661 closed Feb 16, 2025
[Build]
#23719 closed Feb 16, 2025
AccessViolationException occurs during multiple calls
#23713 closed Feb 15, 2025
[WebGPU/js] `failed to inference ONNX model: Error: [WebGPU] Kernel "[Add] /decoder/F0.1/Add" failed. Error: Can't perform binary op on the given tensors.` - Kokoro TTS
#23403 closed Feb 13, 2025
[WebGPU EP] Incorrect Alignment for Single u32 in Uniform Buffer
#23677 closed Feb 13, 2025
[Feature Request] System.Numerics.Tensors support
#23605 closed Feb 11, 2025
[Build] [MIGraphX EP][ ROCm EP] LD_LIBRARY_PATH/RPATH not set correctly for newer wheels generated within a pyenv environment
#23584 closed Feb 11, 2025
[Build]
#23638 closed Feb 11, 2025
[Build] json dependency update request
#23512 closed Feb 10, 2025
[Build] thrust::unary_function deprecated in cuda 12.8
#23499 closed Feb 10, 2025
Model Unsupported model IR version: 11, max supported IR version: 10
#23602 closed Feb 8, 2025
Error when creating OrtLoraAdapter (GetDataTransfer Expecting on device allocator for LoraAdapter)
#23620 closed Feb 8, 2025

78 Issues opened by 69 people

[Web] WASM sigmoid producing numbers below 0 or above 1
#23943 opened Mar 7, 2025
Error when I use cuda_runtime.h and OpenVINO EP at the same time
#23941 opened Mar 7, 2025
[Feature Request] Add more options to load models at InferenceSession constructor
#23940 opened Mar 7, 2025
Bad Allocation Error in ONNX Runtime on Windows x86 CPU When Processing Multiple Images Sequentially
#23938 opened Mar 7, 2025
ConvInteger segfaults when x_zero_point is the empty string
#23927 opened Mar 7, 2025
[Feature Request] Multi-Head Latent Attention(DeepSeek) support on CPU/NPU
#23925 opened Mar 6, 2025
[Web] Facing this error in WebGPU: Model warmup failed: Error: input 'detection' is missing in 'feeds'.
#23921 opened Mar 6, 2025
Public and open source contains header references to "confidential and proprietary" Microsoft code.
#23917 opened Mar 6, 2025
[Build] memory leaked
#23915 opened Mar 6, 2025
[Build] onnxruntime with tag 1.20.* build failed on Windows after VS upgrade to 17.13.*
#23911 opened Mar 6, 2025
[Documentation] Memory Leak in TensorRTProvider example
#23901 opened Mar 5, 2025
[C++, Linux] Segmentation fault when run OrtApi::Run
#23897 opened Mar 5, 2025
[OpenVINO GPU] OpenVINO EP shouldn't override the "ACCURACY" precision to "FP32"
#23895 opened Mar 5, 2025
Xnnpack execution provider Resize::IsOnnxNodeSupported causes crash for models where Resize layer scales tensor is an empty tensor
#23886 opened Mar 4, 2025
[DO NOT UNPIN] ORT 1.21.0 Release Candidates available for testing
#23885 opened Mar 4, 2025
Half of the length that correct output shape
#23883 opened Mar 4, 2025
When using the int8 quantization model to convert to onnx, an error occurs during runtime
#23879 opened Mar 4, 2025
The Pad operator has a calculation error in the "reflect" mode.
#23878 opened Mar 4, 2025
Abs node runs into error with bf16 tensor
#23875 opened Mar 3, 2025
Multi GPU support
#23874 opened Mar 3, 2025
[OpenVINO] SessionOptionsAppendExecutionProvider_OpenVINO API loads NULL config file
#23871 opened Mar 3, 2025
preprocess issues around MeanReduce/Reshape nodes and negative axes
#23868 opened Mar 3, 2025
[Performance] Why does inference occupy so much memory?
#23867 opened Mar 3, 2025
[Build] Openvino fails to build with AUTO:GPU,CPU
#23866 opened Mar 3, 2025
Attention fusion broken for BART 🤖
#23864 opened Mar 3, 2025
[Build] Build failure on Windows 11 with CUDA/cuDNN: nvcc subprocess error during CUDA compilation (v1.20.2)
#23844 opened Feb 27, 2025
[Build] CUDA version linkage
#23841 opened Feb 27, 2025
[Mobile] Dynamic Shape Challenge: Enabling LLM on QNN-HTP
#23832 opened Feb 27, 2025
[CPU EP] GatherND crashes with division by zero when batch dimensions mismatch between input and indices
#23828 opened Feb 27, 2025
[Build] ORT, DML, OpenVINO Python wheel build - "OpenVINOExecutionProvider doesn't support memcpy"
#23824 opened Feb 26, 2025
[Build] ONNX Runtime Support for Cortex-M33 and Cortex-M7
#23822 opened Feb 26, 2025
[Tests] 1 test fails: OptimizerInitializerTest.LoadExternalData: it throws a different type.
#23816 opened Feb 26, 2025
[Build] Docker build failure with ROCm 6.0 using official Dockerfile for v1.19.2: Segmentation fault in clang++ during composable_kernel compilation
#23807 opened Feb 25, 2025
Blank output issue with CUDAExecutionProvider - Onnx Model Converted to fp16
#23797 opened Feb 24, 2025
[Build] Cross-compile for Android on Windows error
#23796 opened Feb 24, 2025
[Performance]Do onednn executors depend on Intel platform
#23795 opened Feb 24, 2025
[nodejs-binding] Crash during InferenceSession initialization: "Check failed: node->IsInUse()"
#23794 opened Feb 24, 2025
Why the output of the ONNX MatMul node never be the same as what PyTorch gives?
#23792 opened Feb 23, 2025
Is DML being deprecated?
#23783 opened Feb 21, 2025
Onnxruntime react-native in mobile issue: [java.lang.ClassCastException: java.lang.String[][] cannot be cast to java.lang.String[]]
#23782 opened Feb 21, 2025
Microsoft.ML.OnnxRuntime.QNN 1.20.1 includes unnecessary files in win-arm64.
#23781 opened Feb 21, 2025
the memory usage not release
#23774 opened Feb 21, 2025
Can load Fluxonnx Modal Components using InferenceSession
#23770 opened Feb 20, 2025
[Build] WASM static lib build fails: no member named 'Negate' in 'onnxruntime::MLFloat16'
#23769 opened Feb 20, 2025
Assistance with adjusting default Arena Allocator C/C++ API
#23768 opened Feb 20, 2025
[Web] Getting Started link on onnxruntime.ai website broken
#23764 opened Feb 20, 2025
the memory leak using valgrind
#23762 opened Feb 20, 2025
[Mobile] [urgent] iOS application crash at CreateEnv (pointer being freed was not allocated)
#23759 opened Feb 20, 2025
OpenVino Runtime Exception. Unexpected: CPU plug-in doesn't support If operation with dynamic rank. Operation name: input.15
#23757 opened Feb 19, 2025
[Feature Request] com.microsoft.Xxhash3
#23754 opened Feb 19, 2025
[Build] no match for ‘operator=’ (operand types are ‘OrtMemoryInfo’ and ‘const OrtDevice') in memory_info.cc line 44 when onnxruntime_ENAABLE_MEMORY_PROFILE is enabled
#23750 opened Feb 19, 2025
[Build] Error building with ACL EP on aarch64 linux (Raspberry Pi 5)
#23741 opened Feb 18, 2025
Tensor Backing Buffer Mismatch Detected in Buffer Reuse
#23739 opened Feb 18, 2025
Onnxruntime using OpenVINO for older version Intel UHD630
#23735 opened Feb 18, 2025
Adding Execution Provider into ONNX RT
#23732 opened Feb 18, 2025
terminate called after throwing an instance of 'Ort::Exception' what(): Invalid input name: serving_default_input_1:0
#23730 opened Feb 18, 2025
Kernel error for T5-style beam search with FP-16 subgraphs
#23728 opened Feb 17, 2025
My system also has different versions of onnxruntime.dll. I have put the correct one in the same directory as the exe file, but I still get an error
#23722 opened Feb 16, 2025
Question about the ONNX Runtime 1.20.2 binary release
#23721 opened Feb 16, 2025
OnnxRuntimeGenAIException: CUDA execution provider is not enabled in this build.
#23715 opened Feb 15, 2025
[Web] [Feature Request] Ability to abort
#23703 opened Feb 14, 2025
[Documentation] Link to C API for configuring telemetry in Privacy.md is dead
#23701 opened Feb 14, 2025
Adding an Execution Provider to ONNX Runtime Upstream
#23700 opened Feb 14, 2025
[Training] GRU and Squeeze artefact generation error
#23698 opened Feb 14, 2025
[Documentation] Clarify Lifetime Requirements of inputs to Ort::IoBinding
#23689 opened Feb 13, 2025
[Build] CMake Error at onnxruntime_unittests.cmake:1026 (find_path): Could not find onnx_SOURCE_DIR using the following files: onnx/onnx-ml.proto3, onnx/onnx-ml.proto Call Stack (most recent call first): CMakeLists.txt:1789 (include)
#23684 opened Feb 13, 2025
[Documentation] I/O Binding Needs Detail
#23682 opened Feb 13, 2025
1.20.2 is released but still not in pypi - pip throws error
#23681 opened Feb 13, 2025
[Feature Request] What does ONNX Runtime do when the model does not fit in the memory?
#23664 opened Feb 12, 2025
[Build] Android x86_64 Cross Compiling on Mac OS
#23648 opened Feb 11, 2025
session.disable_fallback() has no effect, it always fallback to cpu
#23647 opened Feb 11, 2025
[Build] Inconsistent naming of lib directories
#23642 opened Feb 11, 2025
[Build] 1.20.2 Microsoft.ML.OnnxRuntime.Managed nuget package needs Microsoft.ML.OnnxRuntime 1.20.2 which is not available
#23640 opened Feb 11, 2025
[Performance] Propagate NaNs in the CPU min and max operators introduces performance regression
#23628 opened Feb 10, 2025
With TensorRT EP, the output matrix is all zeros, but with CUDAEP, the output is correct.
#23626 opened Feb 10, 2025
onnxruntime-qnn silently failing when onnx model is not present
#23623 opened Feb 8, 2025
[Build] 1.19.2 fails with eigen error
#23622 opened Feb 8, 2025
TensorRT Provider "Attribute reduction is not supported"
#23618 opened Feb 7, 2025

115 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

[TensorRT] Support Multiple EP Context
#23294 commented on Feb 19, 2025 • 18 new comments
[mobile] Add Android NuGet BrowserStack test to NuGet packaging pipeline
#23580 commented on Mar 6, 2025 • 16 new comments
Whisper Redesigned Solution
#23549 commented on Mar 6, 2025 • 8 new comments
(WIP) bitnet and t-mac
#23540 commented on Mar 3, 2025 • 6 new comments
Cleanup CoreML EP's code to remove COREML_ENABLE_MLPROGRAM
#23490 commented on Feb 27, 2025 • 4 new comments
[Native WebGPU EP] Add packedQKV and do_rotary attribute support to GroupQueryAttention operator
#23386 commented on Mar 6, 2025 • 3 new comments
Migrate yarn to npm
#22116 commented on Mar 6, 2025 • 2 new comments
Add trace event control for ORT Web performance profiling
#23393 commented on Feb 24, 2025 • 2 new comments
Compile lib instead of executable when checking compiler features
#23329 commented on Feb 23, 2025 • 2 new comments
Upgrade current MacOS-13 to 14
#23293 commented on Feb 26, 2025 • 1 new comment
[Performance] 40% slowdown in ONNX Resize Operator on CPU
#23391 commented on Feb 26, 2025 • 0 new comments
Memory creeping up
#23348 commented on Feb 26, 2025 • 0 new comments
Creating TRT Cache much slower on Linux than on Windows
#23380 commented on Feb 26, 2025 • 0 new comments
How to build for multiple execution provider?
#9756 commented on Feb 26, 2025 • 0 new comments
[Build] Non-zero status code
#23497 commented on Feb 27, 2025 • 0 new comments
symbolic_shape_infer.py cannot infer torch.nn.normalize
#23516 commented on Feb 28, 2025 • 0 new comments
[Performance] Multithreading for DequantizeLinear
#23395 commented on Feb 28, 2025 • 0 new comments
[Performance] Preload model before inference
#23513 commented on Mar 1, 2025 • 0 new comments
[Web] WebGPU and WASM Backends Unavailable within Service Worker
#20876 commented on Mar 1, 2025 • 0 new comments
[Build] protocol buffer compiler error MSB8066
#23529 commented on Mar 2, 2025 • 0 new comments
[Performance] Speed-up TensorRT engine compilation
#23546 commented on Mar 3, 2025 • 0 new comments
System.EntryPointNotFoundException: Unable to find an entry point named 'OrtSessionOptionsAppendExecutionProvider_CUDA' in DLL 'onnxruntime'.
#22559 commented on Mar 3, 2025 • 0 new comments
Custom operators is not a registered function/op (python)
#23566 commented on Feb 26, 2025 • 0 new comments
debug result is ok, release get NaN output
#23440 commented on Feb 26, 2025 • 0 new comments
[Web] Declaration is not emitted in onnxruntime-node package
#17979 commented on Feb 26, 2025 • 0 new comments
[Performance] fp16 support and performance
#22242 commented on Feb 25, 2025 • 0 new comments
[Performance] GPU Fallback to CPU Without Error When CUDA DLLs Are Missing
#23372 commented on Feb 25, 2025 • 0 new comments
[Web] cannot load onnx model in a vite/react project, because of error expected magic word 00 61 73 6d, found 3c 21 44 4f @+0
#19556 commented on Feb 25, 2025 • 0 new comments
[Web] `Error: using ceil() in shape computation is not yet supported for AveragePool`
#21206 commented on Feb 25, 2025 • 0 new comments
[Performance] Round node shows huge performance drop on Windows
#23430 commented on Feb 25, 2025 • 0 new comments
SafeIntOnOverflow() Integer overflow error when running inference in an ASGI server
#12288 commented on Feb 25, 2025 • 0 new comments
VCPKG port for onnxruntime and modernize cmake builds
#7150 commented on Feb 25, 2025 • 0 new comments
Quantized ONNX Model Still Has Float32 Input/Output Tensors
#21138 commented on Feb 25, 2025 • 0 new comments
Using separate cuda streams for one session
#23319 commented on Feb 14, 2025 • 0 new comments
Migrate Zip-Nuget Package Pipeline to 1ES
#23609 commented on Mar 7, 2025 • 0 new comments
[WIP][webgpu] Apply dp4a for generation shader
#23585 commented on Feb 10, 2025 • 0 new comments
Add of Sum Gradient
#23568 commented on Feb 12, 2025 • 0 new comments
[WebGPU/JSEP] Support group query attention do_rotary attribute
#23524 commented on Mar 7, 2025 • 0 new comments
Fixing typo in finetune.md application example
#23442 commented on Feb 14, 2025 • 0 new comments
[WebNN EP] Support GroupQueryAttention(GQA)
#23416 commented on Mar 7, 2025 • 0 new comments
qgemm: optimize avxvnni QGEMM inner kernel for M=1
#22952 commented on Feb 20, 2025 • 0 new comments
[js/web] Add Wasm Relaxed SIMD support to wasm backend
#22794 commented on Mar 5, 2025 • 0 new comments
Fix rounding issue in int Resize vs. Resize with QDQ quantization.
#22476 commented on Feb 14, 2025 • 0 new comments
[VitisAI] Add vaip Integration Using FetchContent
#22038 commented on Mar 7, 2025 • 0 new comments
Broken multithreading inference session Onnxruntime-directml >= 1.18
#20713 commented on Mar 7, 2025 • 0 new comments
[Performance] using onnxruntime with ray and also fix for memory footprint too high
#16793 commented on Mar 7, 2025 • 0 new comments
Mixed Precision ValueError: validation failed for model with all nodes in node_block_list
#14235 commented on Mar 7, 2025 • 0 new comments
[Web] BiRefNet_T not working on webgpu
#21968 commented on Mar 7, 2025 • 0 new comments
Failed to load library libonnxruntime_providers_cuda.so I am getting the following erro
#19616 commented on Mar 6, 2025 • 0 new comments
[Build] What version of ArmNN does onnxruntime v1.15.1 work with?
#17763 commented on Mar 6, 2025 • 0 new comments
[TensorRT EP] How can I disable generating cache when using trt execution provider
#22822 commented on Mar 6, 2025 • 0 new comments
[Build] build error for windows
#23166 commented on Mar 6, 2025 • 0 new comments
[Feature Request] Request grid_sample 5D support 🌟
#21382 commented on Mar 5, 2025 • 0 new comments
[Build] aarch64 ACL (20.02) build fails with onnxruntime `v1.13.1`, `1.14.1` and `1.15.0`
#16176 commented on Mar 5, 2025 • 0 new comments
[WebGPU] `Kernel "[GroupQueryAttention] /model/layers.0/attn/GroupQueryAttention" failed. Error: Input "key" is expected to have 3, 4, or 5 dimensions".`
#22987 commented on Mar 4, 2025 • 0 new comments
raise Exception("Incomplete symbolic shape inference") when running "symbolic_shape_infer.py"
#10484 commented on Mar 4, 2025 • 0 new comments
RoiAlign CPU is not aligned to pixel centers (per the Mask RCNN paper and Facebook's Detectron2 implementation)
#6921 commented on Mar 3, 2025 • 0 new comments
[Build] How to build CoreML for running C++ code on MacOS
#23556 commented on Mar 3, 2025 • 0 new comments
[Build] WASM build of `v1.20.1` with `--use_xnnpack` fails
#23460 commented on Feb 13, 2025 • 0 new comments
No speedup from float16 with directml compared to cuda
#23359 commented on Feb 13, 2025 • 0 new comments
Build on Linux with CUDA support
#20330 commented on Feb 13, 2025 • 0 new comments
Linux Failed Build - std::piecewise_construct’ causes a section type conflict
#23345 commented on Feb 13, 2025 • 0 new comments
[Build] libonnxruntime_providers_shared.so statically linked?
#23355 commented on Feb 13, 2025 • 0 new comments
CUDAExecutionProvider doesn't seem to be used during inference of transformers exported model to ONNX runtime GPU
#22325 commented on Feb 13, 2025 • 0 new comments
Memory allocation failures due to incorrect requested buffer size
#18743 commented on Feb 13, 2025 • 0 new comments
Encryption does not work with trt_dump_ep_context_model
#23289 commented on Feb 12, 2025 • 0 new comments
Always getting "Failed to create CUDAExecutionProvider"
#11092 commented on Feb 12, 2025 • 0 new comments
[Build] Compilation error when building Onnxrt 1.20.1 with flag onnxruntime_CUDA_MINIMAL=ON with TRT 10.7.23 and Cudnn 9.6.0.74
#23504 commented on Feb 11, 2025 • 0 new comments
Cannot resolve operator 'LSTM' with webgl backend
#23083 commented on Feb 11, 2025 • 0 new comments
Add logging to file option
#10586 commented on Feb 10, 2025 • 0 new comments
[Build]
#18570 commented on Feb 10, 2025 • 0 new comments
[Build] Not able to build ONNX Runtime Nuget package on Windows
#23321 commented on Feb 10, 2025 • 0 new comments
[DO NOT UNPIN] ORT 1.20 release candidates available for testing
#22604 commented on Feb 10, 2025 • 0 new comments
RUNTIME_EXCEPTION, 80070057 The parameter is incorrect in v1.17.3
#20464 commented on Feb 9, 2025 • 0 new comments
CoreML failed: Unable to get shape for output
#23262 commented on Feb 9, 2025 • 0 new comments
onnxruntime-python on AWS
#23291 commented on Feb 9, 2025 • 0 new comments
'Microsoft.ML.OnnxRuntime.NativeMethods' threw an exception
#23300 commented on Feb 9, 2025 • 0 new comments
How to implement a custom operator that support multiple compute device (CPU, CUDA)?
#23317 commented on Feb 9, 2025 • 0 new comments
TensorRTExecutionProvider error during session initialization
#22199 commented on Feb 9, 2025 • 0 new comments
[Delivery] Win ARM64 wheels + QNN
#19162 commented on Feb 8, 2025 • 0 new comments
[C#] ML.NET: ArgumentOutOfRangeException thrown in PredictionEngine.Predict
#23230 commented on Feb 8, 2025 • 0 new comments
The trt_engine_decryption_lib_path environment variable renders encryption worthless
#23290 commented on Feb 8, 2025 • 0 new comments
OnnxRuntime and Numerics.Tensors version numbers out-of-date
#23295 commented on Feb 8, 2025 • 0 new comments
Exception during initialization using Intel NPU (Intel AI boost)
#23305 commented on Feb 8, 2025 • 0 new comments
[WebGPU] `Error: [WebGPU] Kernel "[Mul] /head/istft/Mul_1" failed. Error: Failed to generate kernel's output[0] with dims [1,3520,3520]. If you are running with pre-allocated output, please make sure the output type/dims are correct. Error: 81415528.`
#22994 commented on Feb 8, 2025 • 0 new comments
[Performance] FP16 Clip and Handle Bias introduces insufficient optimization.
#23613 commented on Feb 7, 2025 • 0 new comments
[Feature Request] Add official support for onnxruntime-gpu on ARM64/aarch64 platforms
#22903 commented on Feb 24, 2025 • 0 new comments
[Documentation] CudaContext::AllocDeferredCpuMem
#23485 commented on Feb 24, 2025 • 0 new comments
[Build] how to build on openharmony?
#20895 commented on Feb 24, 2025 • 0 new comments
[QUESTION]: onnxruntime with onednn backend
#23543 commented on Feb 24, 2025 • 0 new comments
RunAsync C# API crashes without any error
#19140 commented on Feb 23, 2025 • 0 new comments
onnxruntime-web is 11-17x slower than native inference
#11181 commented on Feb 23, 2025 • 0 new comments
[Performance] kokoro onnx performance issues
#23384 commented on Feb 23, 2025 • 0 new comments
Nuget package Microsoft.ML.OnnxRuntime.Gpu version >= 1.17.0 not working
#23462 commented on Feb 22, 2025 • 0 new comments
memory.enable_memory_arena_shrinkage is not working in python
#23339 commented on Feb 21, 2025 • 0 new comments
[Feature Request] Adapters DML support
#23503 commented on Feb 21, 2025 • 0 new comments
Implemented Conv2D, Depth2Space, and Resize. Is anyone interested in merging these changes back?
#23471 commented on Feb 21, 2025 • 0 new comments
slow fp16 performance
#10919 commented on Feb 20, 2025 • 0 new comments
Not able to load QNN Context Binary Model
#23431 commented on Feb 20, 2025 • 0 new comments
[Web] How to use JSEP and WebGPU in static library (missing jsepAlloc or jsepInit)
#23072 commented on Feb 19, 2025 • 0 new comments
[Web] How should I get wasm file?
#19829 commented on Feb 19, 2025 • 0 new comments
[ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running FusedConv node.
#9194 commented on Feb 19, 2025 • 0 new comments
Memory Leak in Onnx Session Release [Web]
#21673 commented on Feb 18, 2025 • 0 new comments
[Build] Issues with Multithreading in the New Versions of onnxruntime-directml
#22867 commented on Feb 18, 2025 • 0 new comments
[Build] build onnxruntime for vsinpu error
#23316 commented on Feb 18, 2025 • 0 new comments
Model having scatterND layer giving different result every time with same input
#23396 commented on Feb 17, 2025 • 0 new comments
[Feature Request] MPS provider
#21271 commented on Feb 17, 2025 • 0 new comments
TensorrtExecutionProvider slower than CUDAExecutionProvider: Faster-rcnn [Performance]
#17434 commented on Feb 16, 2025 • 0 new comments
C# Run Program on NPU (OnnxRuntime + DirectML + NPU)?
#23375 commented on Feb 15, 2025 • 0 new comments
[ROCm] CK Datatype Adaptor - BFloat16
#23390 commented on Feb 15, 2025 • 0 new comments
[Accuracy] MSclap model accuracy issue (CPU vs QNN EP (NPU) )
#23394 commented on Feb 15, 2025 • 0 new comments
[Performance] Why is loading an ONNX model taking so long?
#23338 commented on Feb 14, 2025 • 0 new comments
How to create custom op with fp16 input
#23373 commented on Feb 14, 2025 • 0 new comments
[Build] Better support for vcpkg
#23158 commented on Feb 14, 2025 • 0 new comments
[js/webgpu] ConvTranspose1D slower on Webgpu than Wasm
#23273 commented on Feb 14, 2025 • 0 new comments