Insights: microsoft/onnxruntime
Overview
1 Release published by 1 person
-
v1.20.2 ONNX Runtime v1.20.2 [QNN-only]
published
Feb 12, 2025
164 Pull requests merged by 58 people
-
replace usage of gsl::narrow and gsl::narrow_cast in WebGPU EP
#23926 merged
Mar 7, 2025 -
Fix license in example test code.
#23936 merged
Mar 7, 2025 -
Create a packaging pipeline for a custom nuget package
#23918 merged
Mar 7, 2025 -
[AIX] External data handling
#23859 merged
Mar 7, 2025 -
Updated ov version in pipeline (#595)
#23882 merged
Mar 7, 2025 -
Fix ConvInteger handling of optional inputs.
#23935 merged
Mar 7, 2025 -
Updated run_CIs_for_external_pr.py to support the Windows OpenVINO CI pipeline
#23931 merged
Mar 7, 2025 -
fix binplace file in web pipeline
#23930 merged
Mar 7, 2025 -
Enabling L2+ Optimizations for EPs
#23517 merged
Mar 7, 2025 -
Example custom op with output type inferencing
#23916 merged
Mar 7, 2025 -
Support all block sizes that are multiples of 32 for DP4A
#23907 merged
Mar 7, 2025 -
Exclude MAUI projects from GPU C# packaging builds
#23923 merged
Mar 7, 2025 -
[WebGPU EP] SoftMax Implementation
#23538 merged
Mar 7, 2025 -
Adding OpenVINO Windows CI Pipeline
#23919 merged
Mar 7, 2025 -
enable WebGPU EP in WebAssembly build
#23913 merged
Mar 6, 2025 -
[JSEP/WebGPU] Fixed error in softmax dispatch.
#23906 merged
Mar 6, 2025 -
WebGPU: Remove deprecated subgroups-f16 from WebGPU native and JS EP
#23898 merged
Mar 6, 2025 -
Ensure that the 'cmake_minimum_required' is version 3.5 or greater
#23888 merged
Mar 6, 2025 -
[WebNN] Accept Float16Array for float16 data type if it is available
#23894 merged
Mar 6, 2025 -
[webgpu] support Pad operator
#23141 merged
Mar 6, 2025 -
[webgpu] Restore MatMulNBits workgroup size for Phi-3.5
#23349 merged
Mar 6, 2025 -
Round 2 of cherry-picks into rel-1.21.0
#23899 merged
Mar 6, 2025 -
[js/web] improve workaround for bundlers
#23902 merged
Mar 6, 2025 -
Dynamo export and improve benchmark script for SAM2 encoder
#23887 merged
Mar 5, 2025 -
[WebGPU EP] introduce BiasAdd contrib op
#23861 merged
Mar 5, 2025 -
[WebGPU-EP Native] Add ReduceMean
#23860 merged
Mar 5, 2025 -
Fix enable_pix_capture build for WebGPU
#23857 merged
Mar 5, 2025 -
Fix formatting in snapdragon.md
#23900 merged
Mar 5, 2025 -
Add snapdragon tutorial
#23890 merged
Mar 5, 2025 -
[QNN-EP]: Fix inference failures while running with htp_shared_memory
#23892 merged
Mar 5, 2025 -
[TensorRT EP] Add doc for trt_op_types_to_exclude
#23893 merged
Mar 5, 2025 -
[QNN EP Docs] Update docs for building QNN EP as shared or static library
#23873 merged
Mar 5, 2025 -
Enable QNN EP weight sharing generation using public API
#23702 merged
Mar 5, 2025 -
Doc update relate to EPContext model default name
#23865 merged
Mar 4, 2025 -
Add dawn to ThirdPartyNotices
#23876 merged
Mar 4, 2025 -
Allow using extended minimal build for several EPs
#23834 merged
Mar 4, 2025 -
Change gsl::byte to std::byte
#23872 merged
Mar 4, 2025 -
[OpenVINO] Fix a build warning
#23877 merged
Mar 4, 2025 -
[js/webgpu] Reland the optimization of ConvTranspose
#23858 merged
Mar 4, 2025 -
[Doc] Update CUDA option prefer_nhwc
#23812 merged
Mar 4, 2025 -
[js/common] allows using Uint16Array as data for float16 tensor
#23827 merged
Mar 3, 2025 -
Make Nuget QNN package pipeline 1ES compliant
#23805 merged
Mar 3, 2025 -
Change the logic to generate the default ep context file name
#23788 merged
Mar 3, 2025 -
Quant tool: Consistent `get_qdq_config` and `get_qnn_qdq_config` behavior
#23856 merged
Mar 2, 2025 -
Fix typos in csharp/src/Microsoft.ML.OnnxRuntime/
#23848 merged
Mar 1, 2025 -
Fix typo: change `Upample` to `Upsample`.
#23838 merged
Mar 1, 2025 -
Model Builder API
#23223 merged
Feb 28, 2025 -
Cherry-picks into rel-1.21.0
#23846 merged
Feb 28, 2025 -
Fix flash attention for GQA (Phi4)
#23850 merged
Feb 28, 2025 -
Revert changes onn mac-react-native-ci-pipeline.yml
#23845 merged
Feb 27, 2025 -
[Mlas] Unblock hardcoded matmul blocking size
#23815 merged
Feb 27, 2025 -
Increase npm package pipeline ReactNative_CI_iOS timeout to 120 mins
#23825 merged
Feb 27, 2025 -
[ORT/CI_Pipeline] Use --enable_generic_interface in ORT builds for EP testing
#23801 merged
Feb 27, 2025 -
Quant tool: Add `nodes_to_exclude` in `get_qnn_qdq_config`
#23779 merged
Feb 27, 2025 -
Update onnxruntime_external_deps.cmake: add missing EXCLUDE_FROM_ALL
#23829 merged
Feb 27, 2025 -
[OVEP] Update support for Contrib Ops
#23789 merged
Feb 27, 2025 -
upgrade emsdk to 4.0.4
#23819 merged
Feb 27, 2025 -
[webgpu] Fix alignment issues in shader code
#23776 merged
Feb 27, 2025 -
[TensorRT EP] update oss parser to latest
#23710 merged
Feb 27, 2025 -
[ARM CPU] Fix flaky hgemmb ut
#23814 merged
Feb 27, 2025 -
Make Nuget CUDA package pipeline 1ES compliant
#23804 merged
Feb 26, 2025 -
Upgrade React Native to 0.73
#23575 merged
Feb 26, 2025 -
[webgpu] support resize operator
#23780 merged
Feb 26, 2025 -
Conveting npm packaging pipeline to 1ES
#23767 merged
Feb 26, 2025 -
Make Nuget package pipeline 1ES compliant
#23803 merged
Feb 26, 2025 -
[QNN EP] Re-enable several disabled QNN-EP UTs
#23799 merged
Feb 26, 2025 -
[VitisAI] add new interfece
#23777 merged
Feb 25, 2025 -
[QNN EP] Use absolute path of libcdsprpc.dll on Windows so it doesn't need to be copied anywhere.
#23791 merged
Feb 24, 2025 -
Create dependencies.md for improving build document
#23786 merged
Feb 23, 2025 -
Bump version from 1.21 to 1.22
#23787 merged
Feb 23, 2025 -
[webgpu] Enable FlashAttention for GQA
#23761 merged
Feb 22, 2025 -
[WebNN] Fix missing parameter
#23778 merged
Feb 22, 2025 -
Uploaded deepseek blog, ready for post.
#23740 merged
Feb 22, 2025 -
Set build user's uid when creating Migraphx/ROCM docker images
#23657 merged
Feb 21, 2025 -
[TensorRT EP] Add new provider option to exclude ops from running on TRT
#23705 merged
Feb 21, 2025 -
Update cmake_cuda_architecture to control package size
#23671 merged
Feb 21, 2025 -
[webgpu] Implement SubGroupMatrix based MatMulNBits for Metal
#23729 merged
Feb 21, 2025 -
[Optimizer] Fix exception for Q -> DQ sequence with different scale types
#23771 merged
Feb 21, 2025 -
OVEP: Bug Fixes, Refactoring, and Contrib Ops Update
#23742 merged
Feb 21, 2025 -
Shape inference: GatherBlockQuantized dispatcher
#23748 merged
Feb 21, 2025 -
[QNN EP] Passthrough EP Parameters in Node
#23468 merged
Feb 20, 2025 -
[JSEP] fix scatter-nd jsep kernel
#23755 merged
Feb 20, 2025 -
[onnxruntime/build] Add CI testing for ORT build with generic interface
#23530 merged
Feb 20, 2025 -
Rope imbedding kernel to use avx2
#23694 merged
Feb 20, 2025 -
Add a new build flag to build.py for using with vcpkg
#23723 merged
Feb 20, 2025 -
Capacity aware partitioning
#22766 merged
Feb 20, 2025 -
[ARM CPU] Enable FP16 kernels for GQA op
#23746 merged
Feb 20, 2025 -
[webgpu] Use components for VxAttentionScore
#23726 merged
Feb 20, 2025 -
[AIX]eigen update fix and test failures fix
#23751 merged
Feb 20, 2025 -
Add condition to gpu wheel build flag
#23760 merged
Feb 20, 2025 -
Fix security vulnerability with Whisper export
#23743 merged
Feb 20, 2025 -
[Doc] Update CUDA and cuDNN installation and preload
#23708 merged
Feb 20, 2025 -
[QNN EP] Include QNN error handle value in fallback error message.
#23756 merged
Feb 20, 2025 -
[AIX] cmake cleanup
#23752 merged
Feb 20, 2025 -
Replace Linux A10 pools with A100
#23547 merged
Feb 19, 2025 -
Quantization tool: Use `nanmin`, `nanmax`, `nanmean` in calibrator
#23749 merged
Feb 19, 2025 -
[VitisAI] fix deinit vitisai ep
#23725 merged
Feb 19, 2025 -
[QNN EP] Build Python 3.13 packages for QNN
#23706 merged
Feb 19, 2025 -
[CUDA] Update preload_dlls to coexist with PyTorch
#23744 merged
Feb 19, 2025 -
Bump esbuild from 0.19.3 to 0.25.0 in /js
#23639 merged
Feb 19, 2025 -
Add migration guide
#23482 merged
Feb 18, 2025 -
Update Eigen to the latest
#23717 merged
Feb 18, 2025 -
[QNN] MatMulAddFusion and Reshape Related Fusion
#22494 merged
Feb 18, 2025 -
Increase the python version requirement in CMakeLists.txt from 3.8 to 3.10
#23718 merged
Feb 18, 2025 -
[js/web] remove "types" from "exports" field in package.json
#23733 merged
Feb 18, 2025 -
Fix ACL option parsing
#23586 merged
Feb 17, 2025 -
Fix attention fusion in conformer encoder
#23711 merged
Feb 16, 2025 -
[CUDA] Preload dependent DLLs
#23674 merged
Feb 15, 2025 -
Create DeepSeek-R1-Distill-Qwen-python.md
#23709 merged
Feb 15, 2025 -
[webgpu] Use workgroup_idx instead of workgroup_id.x
#23696 merged
Feb 14, 2025 -
VCPKG improvements
#23688 merged
Feb 14, 2025 -
[webgpu] Fix MatMulNBits prefill shader synchronization
#23663 merged
Feb 14, 2025 -
Update Wheel location for GenAI
#23690 merged
Feb 14, 2025 -
Add execution provider arg
#23692 merged
Feb 14, 2025 -
Add extra requires for cuda/cudnn DLLs to onnxruntime-gpu python package
#23659 merged
Feb 14, 2025 -
Remove duplicated mimalloc def
#23695 merged
Feb 14, 2025 -
Add new kernels
#23220 merged
Feb 14, 2025 -
Change Execution Provider arg
#23691 merged
Feb 14, 2025 -
[VitisAI] fix throw on dfs
#23678 merged
Feb 14, 2025 -
Bump ruff from 0.9.4 to 0.9.5
#23624 merged
Feb 14, 2025 -
[QNN EP] Dump QNN json graph
#22843 merged
Feb 14, 2025 -
Exclude node quantization in RTN
#23683 merged
Feb 13, 2025 -
[ARM CPU] add notrans hgemm mlas kernel
#23668 merged
Feb 13, 2025 -
WIP: DP4AMatMul fix matmul for subgoup size 64 GPUs
#23637 merged
Feb 13, 2025 -
Avoid compiling wil on non-Windows platforms
#23675 merged
Feb 13, 2025 -
Update Dockerfile.manylinux2_28_rocm
#23650 merged
Feb 13, 2025 -
Set ANDROID_AVD_HOME env variable
#23672 merged
Feb 13, 2025 -
Mapping ORT verbose log level to QNN verbose log level
#23673 merged
Feb 13, 2025 -
Remove training CIs from external PR CI list.
#23425 merged
Feb 13, 2025 -
Enable Relocatable Device Code (RDC) to build ORT with cuda 12.8
#23562 merged
Feb 13, 2025 -
Fix broken links in website
#23654 merged
Feb 12, 2025 -
[CUDA] Not link CUDNN sub libs
#23656 merged
Feb 12, 2025 -
Update deploy pages action version
#23670 merged
Feb 12, 2025 -
add op_types_to_quantize to get_qnn_qdq_config
#23458 merged
Feb 12, 2025 -
Revert "remove --use_vcpkg flag for Python-CUDA-Packaging-Pipeline"
#23651 merged
Feb 12, 2025 -
[webgpu] fixes buffer handle leak in cache manager
#23655 merged
Feb 12, 2025 -
disable codeql for cuda gpu pipelines
#23652 merged
Feb 12, 2025 -
Upgrade emsdk version to v4.0.3
#23633 merged
Feb 12, 2025 -
Update upload pages artifact action version
#23653 merged
Feb 11, 2025 -
[WebNN] Add op support validation for decomposed WebNN ops
#23370 merged
Feb 11, 2025 -
Add VCPKG's prerequisites to AMD GPU EPs docker files
#23636 merged
Feb 11, 2025 -
Enable averagepool tests
#23595 merged
Feb 11, 2025 -
Fix an installation issue related to absl
#23641 merged
Feb 11, 2025 -
Remove unused local variables
#23634 merged
Feb 11, 2025 -
Update win-ort-main to tip main 250211
#23646 merged
Feb 11, 2025 -
[WebNN EP] Automatically move input CPU tensors to ml-tensor
#23073 merged
Feb 11, 2025 -
use correct total length to fix static kv_cache performance
#23615 merged
Feb 11, 2025 -
remove --use_vcpkg flag for Python-CUDA-Packaging-Pipeline
#23631 merged
Feb 11, 2025 -
Add python_requires to package metadata
#23604 merged
Feb 11, 2025 -
[QNN EP] Add QNN EP to ARM64X build targets
#23635 merged
Feb 11, 2025 -
[webgpu] no longer need pass-in gpu adapter for custom context
#23593 merged
Feb 10, 2025 -
Fix logic for selecting alternate name for blob
#23617 merged
Feb 10, 2025 -
[VitisAI] Add vaip Integration Using FetchContent (Cherry-pick of PR#22038 to win-ort-main branch)
#23608 merged
Feb 10, 2025 -
[ARM CPU] Add fp16 mlas kernels for exp, tanh, softmax, logsoftmax, softcap
#23597 merged
Feb 10, 2025 -
Update pybind and json to the latest
#23589 merged
Feb 10, 2025 -
Migrate iOS release pipeline to 1 ES
#23606 merged
Feb 10, 2025 -
Increase timeout for Windows TensorRT CI
#23625 merged
Feb 10, 2025 -
[ORT 1.20.2 Release] Cherry pick 1st round
#23574 merged
Feb 10, 2025 -
fix on trtCudaVersion
#23616 merged
Feb 8, 2025 -
update run CI script
#23621 merged
Feb 8, 2025 -
[WebGPU] Support PIX Capture for WebGPU EP
#23192 merged
Feb 8, 2025 -
Fix for C4267 warning
#23610 merged
Feb 8, 2025 -
Validate the context_file_path before EP compile graphs
#23611 merged
Feb 8, 2025 -
[webgpu] Use pushErrorScope()/popErrorScope() once for an inference run
#23438 merged
Feb 7, 2025
52 Pull requests opened by 33 people
-
Enable multithreading on FP16 to FP32 cast operator
#23619 opened
Feb 7, 2025 -
Integrate KleidiAI for MatMulNBits via MlasQNBitGemm
#23627 opened
Feb 10, 2025 -
WIP: Enable FA for GQA
#23630 opened
Feb 10, 2025 -
Reenable some averagepool tests that fail
#23649 opened
Feb 11, 2025 -
Test CUDNN_FRONTEND_SKIP_JSON_LIB=ON
#23660 opened
Feb 12, 2025 -
Make quantize shader work for all gpus
#23676 opened
Feb 13, 2025 -
Cleanup onnx test runner's code
#23693 opened
Feb 13, 2025 -
[WIP] migrate WebGPU EP to WebAssembly to replace JSEP
#23697 opened
Feb 14, 2025 -
[webgpu]Add MaxPool and AveragePool
#23714 opened
Feb 15, 2025 -
[VitisAI] export Graph::SetName to VitisA IEP
#23731 opened
Feb 18, 2025 -
Updated MetaHuman testimonial from Epic Games.
#23737 opened
Feb 18, 2025 -
Investigate crash due to empty size
#23753 opened
Feb 19, 2025 -
Subgroupmatmul memory optimzations
#23758 opened
Feb 19, 2025 -
Enable test_gathernd_example_int32_batch_dim1
#23763 opened
Feb 20, 2025 -
Check for continuous decoding in MHA
#23766 opened
Feb 20, 2025 -
RoPE fp16 avx
#23772 opened
Feb 21, 2025 -
[webgpu] Optimize MatMulNBits f16 prefill shader for subgroup size 32
#23773 opened
Feb 21, 2025 -
NHWC DepthToSpace U8 and its transformation
#23784 opened
Feb 21, 2025 -
Make python package pipeline 1ES compliant
#23800 opened
Feb 24, 2025 -
Make python CUDA package pipeline 1ES compliant
#23802 opened
Feb 24, 2025 -
Make Cuda packaging pipeline 1ES compliant
#23806 opened
Feb 24, 2025 -
[WIP] Flash attention for generation
#23808 opened
Feb 25, 2025 -
Add Snapdragon NPU tutorial
#23813 opened
Feb 25, 2025 -
Add OpenCL EP
#23830 opened
Feb 27, 2025 -
[WebNN] Better int64 integration
#23831 opened
Feb 27, 2025 -
[mobile/reactnative] Remove namespace from AndroidManifest.XML to resolve warning
#23847 opened
Feb 27, 2025 -
[VitisAI] Just for internal test
#23849 opened
Feb 28, 2025 -
[OpenVINO]Session Options Appended After AppendExecutionProvider
#23852 opened
Feb 28, 2025 -
Synchronize patch files, fix resource compiler invocations in some situations
#23855 opened
Feb 28, 2025 -
Bump ruff from 0.9.5 to 0.9.9
#23863 opened
Mar 3, 2025 -
Move Linux DNNL/OpenVino pipelines to onnxruntime-Ubuntu2204-AMD-CPU machine pool
#23870 opened
Mar 3, 2025 -
[QNN EP] Add example that uses a custom CPU allocator for a QNN session
#23880 opened
Mar 4, 2025 -
[VitisAI EP] export InferShapes to VitisAIEP
#23881 opened
Mar 4, 2025 -
Pick Jian's pipeline changes to the 1.21 release branch
#23903 opened
Mar 5, 2025 -
[TensorRT EP] support TensorRT 10.9-GA
#23905 opened
Mar 5, 2025 -
[webgpu] Optimize MatMulNBits for f16 Block32 prefill performance
#23908 opened
Mar 6, 2025 -
[WIP][Native WebGPU] Remove explicit split operator in GQA
#23909 opened
Mar 6, 2025 -
[WebGPU] Direct CPU->GPU buffer upload for UMA
#23910 opened
Mar 6, 2025 -
Fix CUDA EP Abs and Sign bfloat16 support
#23914 opened
Mar 6, 2025 -
[WebGPU EP] Implements Gelu, BiasSplitGelu, and QuickGelu
#23920 opened
Mar 6, 2025 -
Extend CMAKE_CUDA_FLAGS with all Blackwell compute capacity
#23928 opened
Mar 7, 2025 -
[WIP] DepthToSpace for WebGPU EP
#23929 opened
Mar 7, 2025 -
update transformer version to 4.48.0
#23932 opened
Mar 7, 2025 -
VCPKG improvement: set VCPKG_OSX_DEPLOYMENT_TARGET
#23933 opened
Mar 7, 2025 -
[Native WebGPU] Added ReduceMax and ReduceSum
#23934 opened
Mar 7, 2025 -
[js] Add API for accessing metadata of a model's input/output
#23937 opened
Mar 7, 2025 -
[Fix] Dependencies find_package Eigen error
#23939 opened
Mar 7, 2025 -
Add support for custom position ids and attention mask to GQA CPU operator
#23944 opened
Mar 7, 2025 -
Qnn weight sharing improvement
#23945 opened
Mar 7, 2025 -
Allow using a different version of flatbuffers when building with vcpkg
#23946 opened
Mar 7, 2025
43 Issues closed by 29 people
-
[Build] Released asset for v1.20.1 doesn't work on macOS Sequoia
#23922 closed
Mar 7, 2025 -
What's the right way to construct custom ops with the same name but different output types?
#23891 closed
Mar 7, 2025 -
[Performance] Keep Onnx awake while in idle mode
#23461 closed
Mar 6, 2025 -
[Documentation] Unclear how to run `run_benchmark.py`
#23889 closed
Mar 5, 2025 -
[Build] ONNX Run Time on Conda Forge - Add CUDA Support
#23904 closed
Mar 5, 2025 -
[Build] NuGet Package missing header files
#23884 closed
Mar 5, 2025 -
[Build] how to compile ios static library
#23835 closed
Mar 4, 2025 -
ort.InferenceSession fails silently
#23869 closed
Mar 4, 2025 -
[Build] Android compatibility with WebGPU
#23565 closed
Mar 4, 2025 -
Memory leakage from ONNXRuntime environment on Linux machine using C.
#23798 closed
Mar 4, 2025 -
[Web] Shall we accept Uint16Array for 'float16' if Float16Array is available
#23817 closed
Mar 3, 2025 -
When will v1.20.0 be released for onnxruntime-openvino
#22783 closed
Mar 3, 2025 -
[Build] Windows MSVC DNNL build requires <chrono> include
#23854 closed
Feb 28, 2025 -
[Build] Android build Failure on ONNX Runtime 1.20.2 compiler doesn't support BFLOAT16
#23851 closed
Feb 28, 2025 -
[Build] mp11 not found
#23821 closed
Feb 27, 2025 -
Cuda execution provider is not available
#23833 closed
Feb 27, 2025 -
[Build] Linux i686 32 bit support
#23823 closed
Feb 27, 2025 -
Can't load CUDA on .NET project
#23810 closed
Feb 26, 2025 -
[Performance] Why Does Increasing the Number of CPU Cores Not Improve Performance?
#23747 closed
Feb 25, 2025 -
[Web] [Node] onnxruntime-node failing to load in workers
#23790 closed
Feb 23, 2025 -
[Performance] Memory Usage During Session Creating Doubled
#23775 closed
Feb 22, 2025 -
latest version of onnxruntime-gpu fail to use the pip installed cuda libraries
#23643 closed
Feb 21, 2025 -
[Build] Unable to cross-compile ONNX Runtime 1.17.1 for ARM Cortex A53
#23152 closed
Feb 20, 2025 -
C# Cannot initialize InferenceSession on arm64
#23716 closed
Feb 19, 2025 -
[Web] Package path ./webgpu is not exported from package
#23720 closed
Feb 19, 2025 -
Ep Context Model generated with external data is still dependent on the same data file
#23358 closed
Feb 19, 2025 -
Support Python 3.12
#23738 closed
Feb 18, 2025 -
OnnxRuntime does not see CUDAExecutionProvider with CUDA 12.4, CUDNN 9.
#23736 closed
Feb 18, 2025 -
[Build] ONNXRuntime v1.18 C++ build from source with TensorRT failing
#23661 closed
Feb 16, 2025 -
[Build]
#23719 closed
Feb 16, 2025 -
AccessViolationException occurs during multiple calls
#23713 closed
Feb 15, 2025 -
[WebGPU EP] Incorrect Alignment for Single u32 in Uniform Buffer
#23677 closed
Feb 13, 2025 -
[Feature Request] System.Numerics.Tensors support
#23605 closed
Feb 11, 2025 -
[Build]
#23638 closed
Feb 11, 2025 -
[Build] json dependency update request
#23512 closed
Feb 10, 2025 -
[Build] thrust::unary_function eprecated in cuda 12.8
#23499 closed
Feb 10, 2025 -
Model Unsupported model IR version: 11, max supported IR version: 10
#23602 closed
Feb 8, 2025 -
Error when creating OrtLoraAdapter (GetDataTransfer Expecting on device allocator for LoraAdapter)
#23620 closed
Feb 8, 2025
78 Issues opened by 69 people
-
[Web] WASM sigmoid producing numbers below 0 or above 1
#23943 opened
Mar 7, 2025 -
Error when I use cuda_runtime.h and OpenVINO EP at the same time
#23941 opened
Mar 7, 2025 -
[Feature Request] Add more options to load models at InferenceSession constructor
#23940 opened
Mar 7, 2025 -
Bad Allocation Error in ONNX Runtime on Windows x86 CPU When Processing Multiple Images Sequentially
#23938 opened
Mar 7, 2025 -
ConvInteger segfaults when x_zero_point is the empty string
#23927 opened
Mar 7, 2025 -
[Feature Request] Multi-Head Latent Attention(DeepSeek) support on CPU/NPU
#23925 opened
Mar 6, 2025 -
[Web] Facing this error in WebGPU: Model warmup failed: Error: input 'detection' is missing in 'feeds'.
#23921 opened
Mar 6, 2025 -
Public and open source contains header references to "confidential and proprietary" Microsoft code.
#23917 opened
Mar 6, 2025 -
[Build] memory leaked
#23915 opened
Mar 6, 2025 -
[Build] onnxruntime with tag 1.20.* build failed on Windows after VS upgrade to 17.13.*
#23911 opened
Mar 6, 2025 -
[Documentation] Memory Leak in TensorRTProvider example
#23901 opened
Mar 5, 2025 -
[C++, Linux] Segmentation fault when run OrtApi::Run
#23897 opened
Mar 5, 2025 -
[OpenVINO GPU] OpenVINO EP shouldn't override the "ACCURACY" precision to "FP32"
#23895 opened
Mar 5, 2025 -
[DO NOT UNPIN] ORT 1.21.0 Release Candidates available for testing
#23885 opened
Mar 4, 2025 -
Half of the length that correct output shape
#23883 opened
Mar 4, 2025 -
When using the int8 quantization model to convert to onnx, an error occurs during runtime
#23879 opened
Mar 4, 2025 -
The Pad operator has a calculation error in the "reflect" mode.
#23878 opened
Mar 4, 2025 -
Abs node runs into error with bf16 tensor
#23875 opened
Mar 3, 2025 -
Multi GPU support
#23874 opened
Mar 3, 2025 -
[OpenVINO] SessionOptionsAppendExecutionProvider_OpenVINO API loads NULL config file
#23871 opened
Mar 3, 2025 -
preprocess issues around MeanReduce/Reshape nodes and negative axes
#23868 opened
Mar 3, 2025 -
[Performance] Why does inference occupy so much memory?
#23867 opened
Mar 3, 2025 -
[Build] Openvino fails to build with AUTO:GPU,CPU
#23866 opened
Mar 3, 2025 -
Attention fusion broken for BART 🤖
#23864 opened
Mar 3, 2025 -
[Build] Build failure on Windows 11 with CUDA/cuDNN: nvcc subprocess error during CUDA compilation (v1.20.2)
#23844 opened
Feb 27, 2025 -
[Build] CUDA version linkage
#23841 opened
Feb 27, 2025 -
[Mobile] Dynamic Shape Challenge: Enabling LLM on QNN-HTP
#23832 opened
Feb 27, 2025 -
[CPU EP] GatherND crashes with division by zero when batch dimensions mismatch between input and indices
#23828 opened
Feb 27, 2025 -
[Build] ORT, DML, OpenVINO Python wheel build - "OpenVINOExecutionProvider doesn't support memcpy"
#23824 opened
Feb 26, 2025 -
[Build] ONNX Runtime Support for Cortex-M33 and Cortex-M7
#23822 opened
Feb 26, 2025 -
[Tests] 1 test fails: OptimizerInitializerTest.LoadExternalData: it throws a different type.
#23816 opened
Feb 26, 2025 -
Blank output issue with CUDAExecutionProvider - Onnx Model Converted to fp16
#23797 opened
Feb 24, 2025 -
[Build] Cross-compile for Android on Windows error
#23796 opened
Feb 24, 2025 -
[Performance]Do onednn executors depend on Intel platform
#23795 opened
Feb 24, 2025 -
[nodejs-binding] Crash during InferenceSession initialization: "Check failed: node->IsInUse()"
#23794 opened
Feb 24, 2025 -
Why the output of the ONNX MatMul node never be the same as what PyTorch gives?
#23792 opened
Feb 23, 2025 -
Is DML being deprecated?
#23783 opened
Feb 21, 2025 -
Microsoft.ML.OnnxRuntime.QNN 1.20.1 includes unnecessary filew in win-arm64.
#23781 opened
Feb 21, 2025 -
the memory usage not release
#23774 opened
Feb 21, 2025 -
Can load Fluxonnx Modal Components using InferenceSession
#23770 opened
Feb 20, 2025 -
[Build] WASM static lib build fails: no member named 'Negate' in 'onnxruntime::MLFloat16'
#23769 opened
Feb 20, 2025 -
Assistance with adjusting default Arena Allocator C/C++ API
#23768 opened
Feb 20, 2025 -
[Web] Getting Started link on onnxruntime.ai website broken
#23764 opened
Feb 20, 2025 -
the memory leak using valgrind
#23762 opened
Feb 20, 2025 -
[Mobile] [urgent] iOS application crash at CreateEnv (pointer being freed was not allocated)
#23759 opened
Feb 20, 2025 -
[Feature Request] com.microsoft.Xxhash3
#23754 opened
Feb 19, 2025 -
[Build] Error building with ACL EP on aarch64 linux (Raspberry Pi 5)
#23741 opened
Feb 18, 2025 -
Tensor Backing Buffer Mismatch Detected in Buffer Reuse
#23739 opened
Feb 18, 2025 -
Onnxruntime using OpenVINO for older version Intel UHD630
#23735 opened
Feb 18, 2025 -
Adding Execution Provider into ONNX RT
#23732 opened
Feb 18, 2025 -
Kernel error for T5-style beam search with FP-16 subgraphs
#23728 opened
Feb 17, 2025 -
Question about the ONNX Runtime 1.20.2 binary release
#23721 opened
Feb 16, 2025 -
OnnxRuntimeGenAIException: CUDA execution provider is not enabled in this build.
#23715 opened
Feb 15, 2025 -
[Web] [Feature Request] Ability to abort
#23703 opened
Feb 14, 2025 -
[Documentation] Link to C API for configuring telemetry in Privacy.md is dead
#23701 opened
Feb 14, 2025 -
Adding an Execution Provider to ONNX Runtime Upstream
#23700 opened
Feb 14, 2025 -
[Training] GRU and Squeeze artefact generation error
#23698 opened
Feb 14, 2025 -
[Documentation] Clarify Lifetime Requirements of inputs to Ort::IoBinding
#23689 opened
Feb 13, 2025 -
[Documentation] I/O Binding Needs Detail
#23682 opened
Feb 13, 2025 -
1.20.2 is released but still not in pypi - pip throws errror
#23681 opened
Feb 13, 2025 -
[Feature Request] What does ONNX Runtime do when the model does not fit in the memory?
#23664 opened
Feb 12, 2025 -
[Build] Android x86_64 Cross Compiling on Mac OS
#23648 opened
Feb 11, 2025 -
session.disable_fallback() has no effect, it always fallback to cpu
#23647 opened
Feb 11, 2025 -
[Build] Inconsistent naming of lib directories
#23642 opened
Feb 11, 2025 -
[Performance] Propagate NaNs in the CPU min and max operators introduces performance regression
#23628 opened
Feb 10, 2025 -
With TensorRT EP, the output matrix is all zeros, but with CUDAEP, the output is correct.
#23626 opened
Feb 10, 2025 -
onnxruntime-qnn silently failing when onnx model is not present
#23623 opened
Feb 8, 2025 -
[Build] 1.19.2 fails with eigen error
#23622 opened
Feb 8, 2025 -
TensorRT Provider "Attribute reduction is not supported"
#23618 opened
Feb 7, 2025
115 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
[TensorRT] Support Multiple EP Context
#23294 commented on
Feb 19, 2025 • 18 new comments -
[mobile] Add Android NuGet BrowserStack test to NuGet packaging pipeline
#23580 commented on
Mar 6, 2025 • 16 new comments -
Whisper Redesigned Solution
#23549 commented on
Mar 6, 2025 • 8 new comments -
(WIP) bitnet and t-mac
#23540 commented on
Mar 3, 2025 • 6 new comments -
Cleanup CoreML EP's code to remove COREML_ENABLE_MLPROGRAM
#23490 commented on
Feb 27, 2025 • 4 new comments -
[Native WebGPU EP] Add packedQKV and do_rotary attribute support to GroupQueryAttention operator
#23386 commented on
Mar 6, 2025 • 3 new comments -
Migrate yarn to npm
#22116 commented on
Mar 6, 2025 • 2 new comments -
Add trace event control for ORT Web performance profiling
#23393 commented on
Feb 24, 2025 • 2 new comments -
Compile lib instead of executable when checking compiler features
#23329 commented on
Feb 23, 2025 • 2 new comments -
Upgrade current MacOS-13 to 14
#23293 commented on
Feb 26, 2025 • 1 new comment -
[Performance] 40% slowdown in ONNX Resize Operator on CPU
#23391 commented on
Feb 26, 2025 • 0 new comments -
Memory creeping up
#23348 commented on
Feb 26, 2025 • 0 new comments -
Creating TRT Cache much slower on Linux than on Windows
#23380 commented on
Feb 26, 2025 • 0 new comments -
How to build for multiple execution provider?
#9756 commented on
Feb 26, 2025 • 0 new comments -
[Build] Non-zero status code
#23497 commented on
Feb 27, 2025 • 0 new comments -
symbolic_shape_infer.py cannot infer torch.nn.normalize
#23516 commented on
Feb 28, 2025 • 0 new comments -
[Performance] Multithreading for DequantizeLinear
#23395 commented on
Feb 28, 2025 • 0 new comments -
[Performance] Preload model before inference
#23513 commented on
Mar 1, 2025 • 0 new comments -
[Web] WebGPU and WASM Backends Unavailable within Service Worker
#20876 commented on
Mar 1, 2025 • 0 new comments -
[Build] protocol buffer compiler error MSB8066
#23529 commented on
Mar 2, 2025 • 0 new comments -
[Performance] Speed-up TensorRT engine compilation
#23546 commented on
Mar 3, 2025 • 0 new comments -
System.EntryPointNotFoundException: Unable to find an entry point named 'OrtSessionOptionsAppendExecutionProvider_CUDA' in DLL 'onnxruntime'.
#22559 commented on
Mar 3, 2025 • 0 new comments -
Custom operators is not a registered function/op (python)
#23566 commented on
Feb 26, 2025 • 0 new comments -
debug result is ok, release get NaN output
#23440 commented on
Feb 26, 2025 • 0 new comments -
[Web] Declaration is not emitted in onnxruntime-node package
#17979 commented on
Feb 26, 2025 • 0 new comments -
[Performance] fp16 support and performance
#22242 commented on
Feb 25, 2025 • 0 new comments -
[Performance] GPU Fallback to CPU Without Error When CUDA DLLs Are Missing
#23372 commented on
Feb 25, 2025 • 0 new comments -
[Web] cannot load onnx model in a vite/react project, because of error expected magic word 00 61 73 6d, found 3c 21 44 4f @+0
#19556 commented on
Feb 25, 2025 • 0 new comments -
[Web] `Error: using ceil() in shape computation is not yet supported for AveragePool`
#21206 commented on
Feb 25, 2025 • 0 new comments -
[Performance] Round node shows huge performance drop on Windows
#23430 commented on
Feb 25, 2025 • 0 new comments -
SafeIntOnOverflow() Integer overflow error when running inference in an ASGI server
#12288 commented on
Feb 25, 2025 • 0 new comments -
VCPKG port for onnxruntime and modernize cmake builds
#7150 commented on
Feb 25, 2025 • 0 new comments -
Quantized ONNX Model Still Has Float32 Input/Output Tensors
#21138 commented on
Feb 25, 2025 • 0 new comments -
Using separate cuda streams for one session
#23319 commented on
Feb 14, 2025 • 0 new comments -
Migrate Zip-Nuget Package Pipeline to 1ES
#23609 commented on
Mar 7, 2025 • 0 new comments -
[WIP][webgpu] Apply dp4a for generation shader
#23585 commented on
Feb 10, 2025 • 0 new comments -
Add of Sum Gradient
#23568 commented on
Feb 12, 2025 • 0 new comments -
[WebGPU/JSEP] Support group query attention do_rotary attribute
#23524 commented on
Mar 7, 2025 • 0 new comments -
Fixing typo in finetune.md application example
#23442 commented on
Feb 14, 2025 • 0 new comments -
[WebNN EP] Support GroupQueryAttention(GQA)
#23416 commented on
Mar 7, 2025 • 0 new comments -
qgemm: optimize avxvnni QGEMM inner kernel for M=1
#22952 commented on
Feb 20, 2025 • 0 new comments -
[js/web] Add Wasm Relaxed SIMD support to wasm backend
#22794 commented on
Mar 5, 2025 • 0 new comments -
Fix rounding issue in int Resize vs. Resize with QDQ quantization.
#22476 commented on
Feb 14, 2025 • 0 new comments -
[VitisAI] Add vaip Integration Using FetchContent
#22038 commented on
Mar 7, 2025 • 0 new comments -
Broken multithreading inference session Onnxruntime-directml >= 1.18
#20713 commented on
Mar 7, 2025 • 0 new comments -
[Performance] using onnxruntime with ray and also fix for memory footprint too high
#16793 commented on
Mar 7, 2025 • 0 new comments -
Mixed Precision ValueError: validation failed for model with all nodes in node_block_list
#14235 commented on
Mar 7, 2025 • 0 new comments -
[Web] BiRefNet_T not working on webgpu
#21968 commented on
Mar 7, 2025 • 0 new comments -
Failed to load library libonnxruntime_providers_cuda.so I am getting the following erro
#19616 commented on
Mar 6, 2025 • 0 new comments -
[Build] What version of ArmNN does onnxruntime v1.15.1 work with?
#17763 commented on
Mar 6, 2025 • 0 new comments -
[TensorRT EP] How can I disable generating cache when using trt execution provider
#22822 commented on
Mar 6, 2025 • 0 new comments -
[Build] build error for windows
#23166 commented on
Mar 6, 2025 • 0 new comments -
[Feature Request] Request grid_sample 5D support 🌟
#21382 commented on
Mar 5, 2025 • 0 new comments -
[Build] aarch64 ACL (20.02) build fails with onnxruntime `v1.13.1`, `1.14.1` and `1.15.0`
#16176 commented on
Mar 5, 2025 • 0 new comments -
[WebGPU] `Kernel "[GroupQueryAttention] /model/layers.0/attn/GroupQueryAttention" failed. Error: Input "key" is expected to have 3, 4, or 5 dimensions".`
#22987 commented on
Mar 4, 2025 • 0 new comments -
raise Exception("Incomplete symbolic shape inference") when running "symbolic_shape_infer.py"
#10484 commented on
Mar 4, 2025 • 0 new comments -
RoiAlign CPU is not aligned to pixel centers (per the Mask RCNN paper and Facebook's Detectron2 implementation)
#6921 commented on
Mar 3, 2025 • 0 new comments -
[Build] How to build CoreML for running C++ code on MacOS
#23556 commented on
Mar 3, 2025 • 0 new comments -
[Build] WASM build of `v1.20.1` with `--use_xnnpack` fails
#23460 commented on
Feb 13, 2025 • 0 new comments -
No speedup from float16 with directml compared to cuda
#23359 commented on
Feb 13, 2025 • 0 new comments -
Build on Linux with CUDA support
#20330 commented on
Feb 13, 2025 • 0 new comments -
Linux Failed Build - std::piecewise_construct’ causes a section type conflict
#23345 commented on
Feb 13, 2025 • 0 new comments -
[Build] libonnxruntime_providers_shared.so statically linked?
#23355 commented on
Feb 13, 2025 • 0 new comments -
CUDAExecutionProvider doesn't seem to be used during inference of transformers exported model to ONNX runtime GPU
#22325 commented on
Feb 13, 2025 • 0 new comments -
Memory allocation failures due to incorrect requested buffer size
#18743 commented on
Feb 13, 2025 • 0 new comments -
Encryption does not work with trt_dump_ep_context_model
#23289 commented on
Feb 12, 2025 • 0 new comments -
Always getting "Failed to create CUDAExecutionProvider"
#11092 commented on
Feb 12, 2025 • 0 new comments -
[Build] Compilation error when building Onnxrt 1.20.1 with flag onnxruntime_CUDA_MINIMAL=ON with TRT 10.7.23 and Cudnn 9.6.0.74,
#23504 commented on
Feb 11, 2025 • 0 new comments -
Cannot resolve operator 'LSTM' with webgl backend
#23083 commented on
Feb 11, 2025 • 0 new comments -
Add logging to file option
#10586 commented on
Feb 10, 2025 • 0 new comments -
[Build]
#18570 commented on
Feb 10, 2025 • 0 new comments -
[Build] Not able to build ONNX Runtime Nuget package on Windows
#23321 commented on
Feb 10, 2025 • 0 new comments -
[DO NOT UNPIN] ORT 1.20 release candidates available for testing
#22604 commented on
Feb 10, 2025 • 0 new comments -
RUNTIME_EXCEPTION, 80070057 The parameter is incorrect in v1.17.3
#20464 commented on
Feb 9, 2025 • 0 new comments -
CoreML failed: Unable to get shape for output
#23262 commented on
Feb 9, 2025 • 0 new comments -
onnxruntime-python on AWS
#23291 commented on
Feb 9, 2025 • 0 new comments -
'Microsoft.ML.OnnxRuntime.NativeMethods' threw an exception
#23300 commented on
Feb 9, 2025 • 0 new comments -
How to implement a custom operator that support multiple compute device (CPU, CUDA)?
#23317 commented on
Feb 9, 2025 • 0 new comments -
TensorRTExecutionProvider error during session initialization
#22199 commented on
Feb 9, 2025 • 0 new comments -
[Delivery] Win ARM64 wheels + QNN
#19162 commented on
Feb 8, 2025 • 0 new comments -
[C#] ML.NET: ArgumentOutOfRangeException thrown in PredictionEngine.Predict
#23230 commented on
Feb 8, 2025 • 0 new comments -
The trt_engine_decryption_lib_path environment variable renders encryption worthless
#23290 commented on
Feb 8, 2025 • 0 new comments -
OnnxRuntime and Numerics.Tensors version numbers out-of-date
#23295 commented on
Feb 8, 2025 • 0 new comments -
Exception during initialization using Intel NPU (Intel AI boost)
#23305 commented on
Feb 8, 2025 • 0 new comments -
[WebGPU] `Error: [WebGPU] Kernel "[Mul] /head/istft/Mul_1" failed. Error: Failed to generate kernel's output[0] with dims [1,3520,3520]. If you are running with pre-allocated output, please make sure the output type/dims are correct. Error: 81415528.`
#22994 commented on
Feb 8, 2025 • 0 new comments -
[Performance] FP16 Clip and Handle Bias introduces insufficient optimization.
#23613 commented on
Feb 7, 2025 • 0 new comments -
[Feature Request] Add official support for onnxruntime-gpu on ARM64/aarch64 platforms
#22903 commented on
Feb 24, 2025 • 0 new comments -
[Documentation] CudaContext::AllocDeferredCpuMem
#23485 commented on
Feb 24, 2025 • 0 new comments -
[Build] how to build on openharmony?
#20895 commented on
Feb 24, 2025 • 0 new comments -
[QUESTION]: onnxruntime with onednn backend
#23543 commented on
Feb 24, 2025 • 0 new comments -
RunAsync C# API crashes without any error
#19140 commented on
Feb 23, 2025 • 0 new comments -
onnxruntime-web is 11-17x times slower than native inference
#11181 commented on
Feb 23, 2025 • 0 new comments -
[Performance] kokoro onnx performance issues
#23384 commented on
Feb 23, 2025 • 0 new comments -
Nuget package Microsoft.ML.OnnxRuntime.Gpu version >= 1.17.0 not working
#23462 commented on
Feb 22, 2025 • 0 new comments -
memory.enable_memory_arena_shrinkage is not working in python
#23339 commented on
Feb 21, 2025 • 0 new comments -
[Feature Request] Adapters DML support
#23503 commented on
Feb 21, 2025 • 0 new comments -
Implemented Conv2D, Depth2Space, and Resize. Is anyone interested in merging these changes back?
#23471 commented on
Feb 21, 2025 • 0 new comments -
slow fp16 performance
#10919 commented on
Feb 20, 2025 • 0 new comments -
Not able to load QNN Context Binary Model
#23431 commented on
Feb 20, 2025 • 0 new comments -
[Web] How to use JSEP and WebGPU in static library (missing jsepAlloc or jsepInit)
#23072 commented on
Feb 19, 2025 • 0 new comments -
[Web] How should I get wasm file?
#19829 commented on
Feb 19, 2025 • 0 new comments -
[ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running FusedConv node.
#9194 commented on
Feb 19, 2025 • 0 new comments -
Memory Leak in Onnx Session Release [Web]
#21673 commented on
Feb 18, 2025 • 0 new comments -
[Build] Issues with Multithreading in the New Versions of onnxruntime-directml
#22867 commented on
Feb 18, 2025 • 0 new comments -
[Build] build onnxruntime for vsinpu error
#23316 commented on
Feb 18, 2025 • 0 new comments -
Model having scatterND layer giving different result every time with same input
#23396 commented on
Feb 17, 2025 • 0 new comments -
[Feature Request] MPS provider
#21271 commented on
Feb 17, 2025 • 0 new comments -
TensorrtExecutionProvider slower than CUDAExecutionProvider: Faster-rcnn [Performance]
#17434 commented on
Feb 16, 2025 • 0 new comments -
C# Run Program on NPU (OnnxRuntime + DirectML + NPU)?
#23375 commented on
Feb 15, 2025 • 0 new comments -
[ROCm] CK Datatype Adaptor - BFloat16
#23390 commented on
Feb 15, 2025 • 0 new comments -
[Accuracy] MSclap model accuracy issue (CPU vs QNN EP (NPU) )
#23394 commented on
Feb 15, 2025 • 0 new comments -
[Performance]Why is loading an ONNX model taking so long?
#23338 commented on
Feb 14, 2025 • 0 new comments -
How to create custom op with fp16 input
#23373 commented on
Feb 14, 2025 • 0 new comments -
[Build] Better support for vcpkg
#23158 commented on
Feb 14, 2025 • 0 new comments -
[js/webgpu] ConvTranspose1D slower on Webgpu than Wasm
#23273 commented on
Feb 14, 2025 • 0 new comments