Pulse · microsoft/onnxruntime · GitHub

July 4, 2025 – July 11, 2025

Overview

70 Active pull requests

47 Active issues

Could not load contribution data

Please try again later

1 Release published by 1 person

v1.22.1 ONNX Runtime v1.22.1
published Jul 8, 2025

44 Pull requests merged by 28 people

QNN-EP: DSPQueue Polling
#25361 merged Jul 11, 2025
[EP ABI] Add Node_GetEpType API
#25350 merged Jul 11, 2025
[WebNN] Fix bug in Float16Array availability check
#25354 merged Jul 11, 2025
add --client_package_build option
#25351 merged Jul 11, 2025
[webgpu] a few optimization to WGSL template
#25333 merged Jul 10, 2025
[EP ABI] Add Graph_GetGraphView API to get a OrtGraph from a subset of nodes
#25191 merged Jul 10, 2025
Make TRT plugins optional
#25261 merged Jul 10, 2025
[WebNN] Refactor webnn op input rank check and add validation for ops
#25185 merged Jul 10, 2025
Added creation of QDQ for TopK node
#25309 merged Jul 10, 2025
[QNN EP] Fix pool with reshape name conflicts
#25332 merged Jul 10, 2025
Add PackageVersion parameter to NuGet packaging stage
#25315 merged Jul 9, 2025
[CPU] GQA supports head_sink input for smooth softmax
#25269 merged Jul 9, 2025
[MLAS] DequantizeLinear int8/uint8
#24818 merged Jul 9, 2025
[web] Fix "npm run pull:wasm" script
#25330 merged Jul 9, 2025
Move buffer release or cache from OnRefresh to ReleaseBuffer in BucketCacheManager
#25276 merged Jul 9, 2025
Update vcpkg.json: remove optional-lite
#25339 merged Jul 9, 2025
[EP ABI] Utility to serialize OrtGraph to GraphProto
#25292 merged Jul 9, 2025
[webgpu] Move the early return after copying for ScatterND
#25345 merged Jul 9, 2025
[webgpu] Update wgsl_templates README.md
#25336 merged Jul 9, 2025
GatherBlockQuantized supports zero points and 8 bits for uint8 dtype
#25214 merged Jul 9, 2025
fix build break when multi EP is enabled (inference_session_test.cc)
#25329 merged Jul 8, 2025
Bump ruff from 0.11.13 to 0.12.2, clang-format from 19.1.7 to 20.1.7
#25301 merged Jul 8, 2025
[WebGPU] allow WGSL template generation
#25130 merged Jul 8, 2025
FIX printDebugInfo in ov_interface.cc
#25298 merged Jul 8, 2025
fix 0 tensor issue in matmul and scatter_nd
#25326 merged Jul 8, 2025
FIX c++17 compatibility in backend_utils.h
#25299 merged Jul 8, 2025
Update deprecated CCCL API
#25246 merged Jul 8, 2025
Delete .github/workflows/stale.yml
#25316 merged Jul 8, 2025
[EP ABI] Infer OrtDevice for plugin EP from registered OrtMemoryInfo
#25308 merged Jul 8, 2025
Fix cuda 12.9 windows build
#25317 merged Jul 8, 2025
1. Fix Nv EP Build Break:wq
#25311 merged Jul 8, 2025
[webgpu] support smooth softmax for non-FA GQA implementation
#25285 merged Jul 7, 2025
Exclude EPContext Op from Common Subexpression Elimination graph optimization
#25296 merged Jul 7, 2025
Add RotaryEmbeddings(23) - CUDA
#25178 merged Jul 7, 2025
Migrate stale bot workflow to updateStaleIssues.yml policy
#21660 merged Jul 7, 2025
[webgpu] Optimize DP4AMatMulNBitsSmallMProgram for intel
#25192 merged Jul 7, 2025
fix webgpu dequantize_linear ut
#25271 merged Jul 7, 2025
Add a new ORT API GetSessionOptionConfigEntries
#25277 merged Jul 7, 2025
[webgpu] a few optimizations to graph capture implementation
#25305 merged Jul 7, 2025
[WebNN] Always create a new constant for zero_points
#25286 merged Jul 7, 2025
[webgpu] Enable graph capture
#24900 merged Jul 7, 2025
Fix INT32 bias overflow in QOperator INT8 symmetric quantization by adjusting weight scale and requantizing
#25278 merged Jul 6, 2025
Add OrtEpFactory::GetVersion and store EP version in EP metadata.
#25272 merged Jul 5, 2025
[OVEP] OpenVINO EP Features Release 1.23
#25262 merged Jul 4, 2025

26 Pull requests opened by 23 people

FIX: dxcore include when compiling with older Windows SDK
#25297 opened Jul 6, 2025
Add a new operator attribute type `ORT_OP_ATTR_BYTES` to the ORT C API
#25300 opened Jul 7, 2025
Update updateStaleIssues.yml: remove the reopen issue logic
#25318 opened Jul 7, 2025
[CPU] GQA supports attention scores output
#25319 opened Jul 7, 2025
Convert Initializers to OrtValues Phase 2
#25320 opened Jul 8, 2025
Iraut/update nv trt rtx ep doc
#25321 opened Jul 8, 2025
Bump transformers from 4.50.0 to 4.51.0 in /onnxruntime/python/tools/transformers/models/stable_diffusion/requirements
#25322 opened Jul 8, 2025
[NvTensorRTRTX EP]Disable Fast GELU operator in base model used for NV EP Unit Tests
#25323 opened Jul 8, 2025
Bump transformers from 4.48.0 to 4.52.1 in /onnxruntime/python/tools/transformers/models/llama
#25328 opened Jul 8, 2025
[EP ABI] Get EP compiled model compatibility
#25331 opened Jul 8, 2025
[MIGraphx EP] Sync AMD changes upstream
#25338 opened Jul 9, 2025
Remove arm 32 references
#25341 opened Jul 9, 2025
[QNN EP] Add EP-aware Reshape handler for Transpose optimization.
#25344 opened Jul 9, 2025
Update python bindings to be able to use a shared allocator and/or IDataTransfer registered by a plugin EP in the Environment
#25346 opened Jul 9, 2025
Support read-only allocator for use with initializers
#25348 opened Jul 9, 2025
Add patch for WebGPU on Android to handle fp16 in uniforms
#25349 opened Jul 9, 2025
add build matrix for wgsl template
#25352 opened Jul 10, 2025
[webgpu] Apply template to `MatMulNBitsWideTile`
#25353 opened Jul 10, 2025
Add Compile API to set the location for the context binary file
#25356 opened Jul 10, 2025
[CUDA] Update Flash Attention to support head_sink for smooth softmax in GQA
#25358 opened Jul 10, 2025
Fix SigLIP casual mask bug
#25360 opened Jul 10, 2025
[EP ABI] Update to use Node_GetEpName
#25363 opened Jul 11, 2025
[JSEP] Fix inputShape index OOB in slice.ts
#25364 opened Jul 11, 2025
Add vendor id to OrtEpFactory
#25365 opened Jul 11, 2025
[webgpu] Enable per-run control for graph capture
#25367 opened Jul 11, 2025
Enable CUDA Graph in nv_tensorrt_rtx EP
#25368 opened Jul 11, 2025

28 Issues closed by 13 people

The computer and android reasoning results are inconsistent
#17016 closed Jul 11, 2025
[Mobile] kokora.int8 Efficiency Below Expectations on iPhone 15
#25366 closed Jul 11, 2025
[Build] how to correctly disable FORTIFY_SOURCE
#25337 closed Jul 9, 2025
The code was deployed to the China Azure Function App (Windows), but an error occurred: "System.DllNotFoundException: Unable to load DLL 'onnxruntime' or one of its dependencies: A dynamic link library (DLL) initialization routine failed. (0x8007045A)"
#24702 closed Jul 9, 2025
[Performance] gpu inference is much slower than cpu
#17489 closed Jul 9, 2025
[Build] 'cudafe++' died with status 0xC0000005 (ACCESS_VIOLATION) when build onnxruntime_providers_cuda
#25239 closed Jul 9, 2025
`convert_onnx_models_to_ort` doesn't work on `large_model` onnx files (zip with tensor values stored externally)
#14697 closed Jul 8, 2025
[Build] How to build static lib?
#24704 closed Jul 8, 2025
Initializer duplication method in QDQQuantizer ignores existing `value_info` tensor with same name
#24705 closed Jul 8, 2025
Inference session crashes using ONNX runtime.
#20043 closed Jul 8, 2025
[WebGPU] `Error: [WebGPU] Kernel "[MaxPool] /sincnet/pool1d.0/MaxPool" failed. Error: length of specified kernel shapes should be 2 less than length of input dimensions`
#21386 closed Jul 8, 2025
ReduceMean(opset 18): axes input is empty and ‘noop_with_empty_axes’ is true, but ReduceMean does not act as an identity op.
#16586 closed Jul 8, 2025
[Web] WASM sigmoid producing numbers below 0 or above 1
#23943 closed Jul 7, 2025
[Web/WebGPU] Can't append execution provider: JS (v1.15.0)
#16137 closed Jul 7, 2025
Component Governance Alert on `cmake/external/protobuf`
#10758 closed Jul 7, 2025
Regression in TreeEnsembleRegressor if the provided graph is a DAG
#24636 closed Jul 7, 2025
Segmentation fault in `AppendExecutionProvider_CUDA_V2` when no GPU is available
#24652 closed Jul 7, 2025
failed to build 1.21.0/onnxruntime/core/mlas: Assembler messages: Error: no such instruction: `{vex} vpdpbusds
#24653 closed Jul 7, 2025
Bug related to setting provider options for OpenVINO using Java API
#24658 closed Jul 7, 2025
Is class Sigmoid op supported by CUDA 12.6?
#24670 closed Jul 7, 2025
AveragePool v19+ ignores `end` padding in computation when count_include_pad=1
#24681 closed Jul 7, 2025
When will v1.14 of the onnxruntime-openvino package be available?
#14773 closed Jul 7, 2025
Crosscompiling using VS2017 from Windows for Raspberrypi4
#7962 closed Jul 7, 2025
How to run YOLO with onnxruntime
#6236 closed Jul 7, 2025
Build error in build.py
#5980 closed Jul 7, 2025
DirectML error: The parameter is incorrect with KBNet S
#21583 closed Jul 7, 2025
[Web] Ability to create/use multiple wasm web workers
#15735 closed Jul 7, 2025
Cannot run Microsoft.SemanticKernel.Connectors.Onnx on RaspberryPi OS Lite
#25290 closed Jul 7, 2025

19 Issues opened by 16 people

[Feature Request] The MatMulNBits matmul_nbits_quantizer does not support 3D weight tensors.
#25362 opened Jul 11, 2025
[Web] Unable to build using WebGPU - `error: handleI64Signatures: signature too long for emwgpuWaitAny`
#25359 opened Jul 10, 2025
[Feature Request] add webgpu support for a PowerPreference session option
#25357 opened Jul 10, 2025
CudaProvider jumped completely when Tensorrt and Openvino provided
#25347 opened Jul 9, 2025
[Mobile] Microsoft.ML.OnnxRuntime.Managed MacCatalyst regression in 1.22.1. System.TypeInitializationException when accessing library APIs.
#25342 opened Jul 9, 2025
QNN EP fails on model that can run as a QNN context binary
#25335 opened Jul 9, 2025
QNN EP appears to accept a model that QNN cannot execute
#25334 opened Jul 9, 2025
[Bug] [Node.js binding] Memory leak after releasing inference session
#25325 opened Jul 8, 2025
[Web] WebGPU Device Promise not defined
#25324 opened Jul 8, 2025
[Feature Request] Add a more detailed OrtStatus for diagnosing model compilation incompatibilities
#25314 opened Jul 7, 2025
[Feature Request] API for callers to determine if a compiled model is compatible with a given device
#25313 opened Jul 7, 2025
[Feature Request] Add mechanism for helping identify what driver version was used in compiling a model
#25312 opened Jul 7, 2025
[Feature Request] CPU EP `Where` data type registration, add int8 and uint32
#25306 opened Jul 7, 2025
OpenVINO EP fails to run models with in-memory external data
#25304 opened Jul 7, 2025
[Feature Request] `GetEpDevices()` returns a sorted EP devices list
#25302 opened Jul 7, 2025
GetShape crashes on Linux
#25295 opened Jul 5, 2025
[Mobile] TypeError: A bool tensor's data must be type of function Uint8Array() { [native code] }
#25294 opened Jul 5, 2025
[Build] /install-utils.js Error: Failed to download build list. HTTP status code = 302
#25293 opened Jul 5, 2025
Missing win-arm libraries
#25291 opened Jul 5, 2025

1,279 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

Attention Operator (CPU)
#25156 commented on Jul 11, 2025 • 58 new comments
Plugin EP data transfer and Stream support.
#25254 commented on Jul 11, 2025 • 14 new comments
[WebGPU EP] extend concat to handle large number of inputs
#25177 commented on Jul 10, 2025 • 6 new comments
Add Int4 and UInt4 support for Cast
#24973 commented on Jul 10, 2025 • 5 new comments
Fix Sign and Clip operation on int64 tensors
#25280 commented on Jul 7, 2025 • 3 new comments
KleidiAI SGEMM/IGEMM/Quantized MatMul - Modular MLAS API Changes for KleidiAI
#25187 commented on Jul 11, 2025 • 1 new comment
[MLAS] Add 8-bit weights ARM64 Gemm implementation
#25110 commented on Jul 7, 2025 • 1 new comment
Use NPU in NXP iMX8MP?
#11854 commented on Jul 7, 2025 • 0 new comments
What's the meaning of the hole of tracing file
#11850 commented on Jul 7, 2025 • 0 new comments
Incompatible dimensions for matrix multiplication Error in StarNet model when doing InferenceSession
#11846 commented on Jul 7, 2025 • 0 new comments
[ONNXRuntimeError] FuseReluClip failure
#11836 commented on Jul 7, 2025 • 0 new comments
in cmake/CMakeList.txt all avx related option all set off, do we need do anything to use avx features?
#11833 commented on Jul 7, 2025 • 0 new comments
[Bug] Mixing negative and positive paddings causes segfault/uninitialized memory values produced in reflected pad
#11828 commented on Jul 7, 2025 • 0 new comments
Issue importing onnxruntime
#11815 commented on Jul 7, 2025 • 0 new comments
When I use onnxruntime to run onnx model on GPU, it sucks up too much video memory. Is that normal?
#11809 commented on Jul 7, 2025 • 0 new comments
I do not get any performance improvement after using TensorRT provider for object detection model
#11806 commented on Jul 7, 2025 • 0 new comments
Failed to build onnxruntime on Apple Sillion
#11805 commented on Jul 7, 2025 • 0 new comments
How to use batch run？
#11852 commented on Jul 7, 2025 • 0 new comments
What is the meaning of src_arg_index and dst_arg_index in EdgeEndToMatch structure?
#11856 commented on Jul 7, 2025 • 0 new comments
Wrong output shape due to MergeShape failure
#11870 commented on Jul 7, 2025 • 0 new comments
Not clear quantization pipeline for tensorrt ep
#11873 commented on Jul 7, 2025 • 0 new comments
Pytorch -> Onnx custom Yolov5 model works in python but not in JS
#11874 commented on Jul 7, 2025 • 0 new comments
[ONNXRuntimeError] Load model from *** failed: Unsuported type proto value case
#11889 commented on Jul 7, 2025 • 0 new comments
Quantize specific ops per-tensor while per_channel=True
#11890 commented on Jul 7, 2025 • 0 new comments
Bug: MatMul fails for input shapes of [0, k] and [k, ]
#11895 commented on Jul 7, 2025 • 0 new comments
onnx and onnxruntime disagree on input with no known rank
#11891 commented on Jul 7, 2025 • 0 new comments
Immense GPU memory consumption
#11903 commented on Jul 7, 2025 • 0 new comments
Inference time for qunatized onnx models, TensorRT> CUDA> CPU. Is this expected?
#11201 commented on Jul 7, 2025 • 0 new comments
build for c#
#11648 commented on Jul 7, 2025 • 0 new comments
output shape can not be specified in com.microsoft::GridSample op
#11652 commented on Jul 7, 2025 • 0 new comments
Installing ORTModule torch extension reports TypeError
#11663 commented on Jul 7, 2025 • 0 new comments
when set inter_op_num=0 with ORT_PARALLEL model the performance is very bad than inter_op_num=1?
#11668 commented on Jul 7, 2025 • 0 new comments
How to implement a new operator inference function?
#11678 commented on Jul 7, 2025 • 0 new comments
[web] `ort.InferenceSession.create` silently hangs/fails on iOS/iPad browsers if COEP/COOP headers are set
#11679 commented on Jul 7, 2025 • 0 new comments
which onnxruntime-gpu version is compatible for CUDA 11.1 ?
#11685 commented on Jul 7, 2025 • 0 new comments
Real-ESRGAN slow onnxruntime inference compared to Pytorch one
#11688 commented on Jul 7, 2025 • 0 new comments
Linux CI pipelines can't test unreleased versions of ONNX
#11693 commented on Jul 7, 2025 • 0 new comments
Dynamic quantization of Albert model
#11701 commented on Jul 7, 2025 • 0 new comments
Low level profiling for onnxrt Conv kernel(default backend)
#11702 commented on Jul 7, 2025 • 0 new comments
CUDA EP spending lots of time idling
#11706 commented on Jul 7, 2025 • 0 new comments
Race condition when setting do_copy_in_default_stream to false
#11713 commented on Jul 7, 2025 • 0 new comments
Reading back multidimensional output in C++
#11718 commented on Jul 7, 2025 • 0 new comments
how to get the remaining GPU memory to get the batch size?
#11735 commented on Jul 7, 2025 • 0 new comments
ssd_mobilenet_v1 infer error for TensorRT Execution Provider
#11736 commented on Jul 7, 2025 • 0 new comments
build rknpu backend error
#11738 commented on Jul 7, 2025 • 0 new comments
Pip installed Transformer Benchmark cannot run on TF
#11751 commented on Jul 7, 2025 • 0 new comments
Converted ONNX model works in Python but not in C++
#11761 commented on Jul 7, 2025 • 0 new comments
create op
#12017 commented on Jul 7, 2025 • 0 new comments
Resize with mode linear is missing output elements
#12019 commented on Jul 7, 2025 • 0 new comments
Microsoft.ML.OnnxRuntime.Tests.InferenceTest.TestPreTrainedModels should get opset version from the model file
#12040 commented on Jul 7, 2025 • 0 new comments
Builds C# bindings and creates nuget package
#12042 commented on Jul 7, 2025 • 0 new comments
GlobalAveragePool on large size of ones miscalculates
#12043 commented on Jul 7, 2025 • 0 new comments
Using onnxruntime server for model deployment
#12044 commented on Jul 7, 2025 • 0 new comments
Support pasts as inputs in gpt2 beam search operator
#12047 commented on Jul 7, 2025 • 0 new comments
Build wasm static library bug because of missing `testdata` folder.
#12048 commented on Jul 7, 2025 • 0 new comments
Performance in parallel session Run()
#12049 commented on Jul 7, 2025 • 0 new comments
Builds C# bindings and creates nuget package for vs2019 install
#12061 commented on Jul 7, 2025 • 0 new comments
ONNXRuntimeError for "Where" node when the input is too long
#12065 commented on Jul 7, 2025 • 0 new comments
Performance issue with beam search in onnxruntime
#12078 commented on Jul 7, 2025 • 0 new comments
Support for cmake's FetchContent()
#12081 commented on Jul 7, 2025 • 0 new comments
TensorRT Provider Vs TensorRT Native
#12083 commented on Jul 7, 2025 • 0 new comments
Resize with mode linear always produces 0.5 on GPU regardless of the input
#12091 commented on Jul 7, 2025 • 0 new comments
Resize with `nearest` mode have inconsistent results compared to PyTorch and TVM
#12098 commented on Jul 7, 2025 • 0 new comments
onnxruntime tensorrt sometime cost verg log time
#12120 commented on Jul 7, 2025 • 0 new comments
How do I call the same model in CUDA with many various inputs?
#12126 commented on Jul 7, 2025 • 0 new comments
Error in symbloc_shape_infer.py: assert name in self.sympy_data_ or ...
#12127 commented on Jul 7, 2025 • 0 new comments
GPU inference result not stable
#13178 commented on Jul 7, 2025 • 0 new comments
ConvTranspose with auto_pad attribute
#11927 commented on Jul 7, 2025 • 0 new comments
how to get inference time with c# onnxruntime-gpu-1.6.0
#11946 commented on Jul 7, 2025 • 0 new comments
excute dnnl provider error
#11947 commented on Jul 7, 2025 • 0 new comments
windows11+onnxruntime1.8.0+vs2019 inferencing crash
#11950 commented on Jul 7, 2025 • 0 new comments
Multi thread of single session Python vs C++ (end with core dumped)
#11951 commented on Jul 7, 2025 • 0 new comments
Inference_GPT2-OneStepSearch_OnnxRuntime_CPU.ipynb Error
#11959 commented on Jul 7, 2025 • 0 new comments
Question about quantize Gemm OP
#11961 commented on Jul 7, 2025 • 0 new comments
Got segmentation fault error when using 'InferenceSession' API
#11964 commented on Jul 7, 2025 • 0 new comments
how to configure lobal/shared threadpool with multithread, in c#API?
#11966 commented on Jul 7, 2025 • 0 new comments
set gpu option failed
#11967 commented on Jul 7, 2025 • 0 new comments
quant onnx model slower than pytorch with mish6 activation, howerver faster with relu6
#11975 commented on Jul 7, 2025 • 0 new comments
inference time is not stable
#11983 commented on Jul 7, 2025 • 0 new comments
Any interest in hosting the Rust bindings
#11992 commented on Jul 7, 2025 • 0 new comments
inference is different on linux and windows
#11993 commented on Jul 7, 2025 • 0 new comments
Inconsistent result to NumPy and PyTorch when consecutively casting a float tensor to int32 and then to bool
#11994 commented on Jul 7, 2025 • 0 new comments
failed to initialize a session in the GPU environment
#11996 commented on Jul 7, 2025 • 0 new comments
The test time of sess.run does not match the time of profile
#11997 commented on Jul 7, 2025 • 0 new comments
build C#api with cuda 11.0 /cudnn 8.0
#11999 commented on Jul 7, 2025 • 0 new comments
Issue with NeMo MTEncDecModel model in ONNX IOBinding
#12003 commented on Jul 7, 2025 • 0 new comments
how to build onnxruntime from source with dnnl?
#12011 commented on Jul 7, 2025 • 0 new comments
auto_set_affinity can't be set to true for parallel executor
#11205 commented on Jul 7, 2025 • 0 new comments
[web] ~100 seconds to load model/InferenceSession
#11217 commented on Jul 7, 2025 • 0 new comments
NonZero shape inference behavior with scalar input mismatches ONNX and PyTorch
#11232 commented on Jul 7, 2025 • 0 new comments
Unhandled exception at 0x00007FFABE6A9538 (cudnn_cnn_infer64_8.dll) in Onnx.exe
#11235 commented on Jul 7, 2025 • 0 new comments
[React Native .ort Model Loading Error] "Error: Can't load a model: No content provider: ..."
#11239 commented on Jul 7, 2025 • 0 new comments
I want use gpu on my jetson nx2 platform with c++, how should i do?
#11240 commented on Jul 7, 2025 • 0 new comments
Non-zero status code returned while running Slice node. Name:'Slice_24' Status Message: slice.cc:153 FillVectorsFromInput Starts must be a 1-D array
#11257 commented on Jul 7, 2025 • 0 new comments
Unsupported If operator in gradient builder for Hugging Face Transformers RoBERTa model
#11268 commented on Jul 7, 2025 • 0 new comments
optimize_model : new model types
#11270 commented on Jul 7, 2025 • 0 new comments
The onnx model of IMDN is slower than the original pytorch model and output many warnings
#11274 commented on Jul 7, 2025 • 0 new comments
pulled master 1.12 quantization get unexpected result
#11277 commented on Jul 7, 2025 • 0 new comments
LSTM export ONNX:Non-zero status code returned while running ScatterElements node. Name:'ScatterElements_880'
#11278 commented on Jul 7, 2025 • 0 new comments
Why gpt2-xl (based transformer-xl) onnx slower than the originer pytorch
#11293 commented on Jul 7, 2025 • 0 new comments
is the effect of onnx on Bert affected by python version?
#11295 commented on Jul 7, 2025 • 0 new comments
TVM EP and TensorRT EP do not support dynamic inputs
#11333 commented on Jul 7, 2025 • 0 new comments
MacOS M1 binary compilation and possibility to fine tune a model in C++
#11343 commented on Jul 7, 2025 • 0 new comments
CUDAExecutionProvider optimized model adds incompatible node resulting in Failed to find kernel for MemcpyToHost
#11348 commented on Jul 7, 2025 • 0 new comments
Lower performance on Inceptionv3/4 model with TensorRT EP than TensorRT directly
#11356 commented on Jul 7, 2025 • 0 new comments
CUDAExecutionProvider not releasing memory after terminate session
#11362 commented on Jul 7, 2025 • 0 new comments
Incorrect TypeInferenceError on UNDEFINED tensor type
#6370 commented on Jul 7, 2025 • 0 new comments
Cuda EP parallelization issues for batches
#11047 commented on Jul 7, 2025 • 0 new comments
C++ API, "tried creating tensor with negative value in shape" error when 'permute' and 'reshape' functions are used
#11069 commented on Jul 7, 2025 • 0 new comments
Inference session creation freezes
#11087 commented on Jul 7, 2025 • 0 new comments
compile with cuda error:Couldn't find CUDA library root.
#11090 commented on Jul 7, 2025 • 0 new comments
Always getting "Failed to create CUDAExecutionProvider"
#11092 commented on Jul 7, 2025 • 0 new comments
Performance reduction due to copying of output OrtValues to numpy arrays
#11099 commented on Jul 7, 2025 • 0 new comments
Using DnnlExecutionProvider for inference is much slower than using CPUExecutionProvider.
#11122 commented on Jul 7, 2025 • 0 new comments
Different detection output values for C++ and Python with onnxruntime
#11123 commented on Jul 7, 2025 • 0 new comments
docker container linux run onnxruntime infer core dumped
#11135 commented on Jul 7, 2025 • 0 new comments
[question] yolov5-onnx-float16 not improve on GPU
#11151 commented on Jul 7, 2025 • 0 new comments
How to use Flask with onnxruntime
#11156 commented on Jul 7, 2025 • 0 new comments
Instruction level profiling in onnxruntime
#11159 commented on Jul 7, 2025 • 0 new comments
No c++ header files for building custom op
#11169 commented on Jul 7, 2025 • 0 new comments
A normal output of convolution layer multiplies infinity will result in NaN
#11173 commented on Jul 7, 2025 • 0 new comments
Build from source issue on Windows
#11178 commented on Jul 7, 2025 • 0 new comments
onnxruntime-web is 11-17x times slower than native inference
#11181 commented on Jul 7, 2025 • 0 new comments
Custom Op does not support dynamic input/output number
#11186 commented on Jul 7, 2025 • 0 new comments
Saving GPT2LMHeadModel_ConfigurableOneStepSearch error.
#11198 commented on Jul 7, 2025 • 0 new comments
How to compress the sparse matrix in onnx model
#11200 commented on Jul 7, 2025 • 0 new comments
how to use c sharp to call libonnxruntime.dll? i build the onnxruntime dynamic dll, did it can be encapsulated c++ dll in order to c sharp called
#11550 commented on Jul 7, 2025 • 0 new comments
can c sharp call onnxruntime c++ dll don't use c# third lib？ i create onnxruntime c++ project,but i want to call the dll with c sharp
#11551 commented on Jul 7, 2025 • 0 new comments
T5-Large Export Results in ProtoBuf Error due to 2GB External Data when using padded inputs
#11558 commented on Jul 7, 2025 • 0 new comments
CUDA failure 100: no CUDA-capable device is detected ; error when inferencing on a GPUVM
#11561 commented on Jul 7, 2025 • 0 new comments
Specify CPUs to use for parallel inference when external CPU pinning is used
#11563 commented on Jul 7, 2025 • 0 new comments
[js/web] Inference is Broken in Safari when Cross Origin Isolation is active
#11567 commented on Jul 7, 2025 • 0 new comments
Header missmatch C/C++ - mac
#11570 commented on Jul 7, 2025 • 0 new comments
The effect of turning optimization on and off on quantized model performance
#11576 commented on Jul 7, 2025 • 0 new comments
ONNXRUNTIME + OpenVINO on ARM64
#11582 commented on Jul 7, 2025 • 0 new comments
cpu and gpu results is not the same
#11590 commented on Jul 7, 2025 • 0 new comments
did it can build onnxruntime with any cuda version by source code ? is not relate to onnxtuntime version?
#11584 commented on Jul 7, 2025 • 0 new comments
CUDNN failure 4: CUDNN_STATUS_INTERNAL_ERROR ; error when inferencing on a GPUVM
#11592 commented on Jul 7, 2025 • 0 new comments
issues with pybind11 repository while installing
#11595 commented on Jul 7, 2025 • 0 new comments
Bad performance for QDQ model with openvino EP
#11604 commented on Jul 7, 2025 • 0 new comments
Unable to build onnxruntime_v1.10.0 C++ api with --enable_memory_profile --enable_cuda_profiling flags
#11607 commented on Jul 7, 2025 • 0 new comments
Shape inference fails
#11614 commented on Jul 7, 2025 • 0 new comments
building ——libonnxruntime_providers_cuda.so Error running link command: No such file or directory
#11621 commented on Jul 7, 2025 • 0 new comments
how to set providers with onnx runtime-gpu1.70 ?
#11624 commented on Jul 7, 2025 • 0 new comments
using multithread to call onnxruntime inference,
#11628 commented on Jul 7, 2025 • 0 new comments
which tags should i download of onnxruntime-gpu 1.6 for c#
#11646 commented on Jul 7, 2025 • 0 new comments
ONNX Runtime compatibility for Jetson AGX Xavier
#11378 commented on Jul 7, 2025 • 0 new comments
About running onnxruntime in singularity container
#11397 commented on Jul 7, 2025 • 0 new comments
Benchmark code using torch.onnx.export
#11399 commented on Jul 7, 2025 • 0 new comments
About building onnxruntime singularity container with DockerFile
#11409 commented on Jul 7, 2025 • 0 new comments
Static quantization+per_channel is wrong for MobileNetV3
#11415 commented on Jul 7, 2025 • 0 new comments
Can I quantize TreeEnsembleClassifier op?
#11436 commented on Jul 7, 2025 • 0 new comments
Onnx T5 fp16 conversion without past_key_values
#11438 commented on Jul 7, 2025 • 0 new comments
How to run a double input onnx model
#11453 commented on Jul 7, 2025 • 0 new comments
InferenceSession giving different results than the original sklearn SVC model
#11490 commented on Jul 7, 2025 • 0 new comments
C#, How to access the different output layer of inference (semantic segmentation)
#11502 commented on Jul 7, 2025 • 0 new comments
[Documentation Request]
#11505 commented on Jul 7, 2025 • 0 new comments
onnxruntime error
#11509 commented on Jul 7, 2025 • 0 new comments
[Documentation Request] tensorAt for Csharp?
#11510 commented on Jul 7, 2025 • 0 new comments
About Convolution Implementation
#11517 commented on Jul 7, 2025 • 0 new comments
Tile fails for scalars on CPU
#11523 commented on Jul 7, 2025 • 0 new comments
How to release a session properly?
#11529 commented on Jul 7, 2025 • 0 new comments
Fail to convert model with reusable blocks
#11530 commented on Jul 7, 2025 • 0 new comments
CPUExecutionProvider outputs wrong value for a quantized model
#11532 commented on Jul 7, 2025 • 0 new comments
TensorRT EP session creation fails with invalid weights type of Int8 when ORT_TENSORRT_INT8_ENABLE set to 1
#11535 commented on Jul 7, 2025 • 0 new comments
Using a model with float input types causes space issue
#11541 commented on Jul 7, 2025 • 0 new comments
[Performance] inference time much slower (1529ms vs. 20 ms) on GPU vs CPU.
#13199 commented on Jul 7, 2025 • 0 new comments
[Performance] Performance issue on Linux vs Windows for BERT model.
#13224 commented on Jul 7, 2025 • 0 new comments
Contrib IRFFT operator output dimensions calculation
#13236 commented on Jul 7, 2025 • 0 new comments
Onnx create session takes a long time.
#13240 commented on Jul 7, 2025 • 0 new comments
Inference time spikes in UNET onnx
#13258 commented on Jul 7, 2025 • 0 new comments
[Performance] Too Slow when i do inference
#13265 commented on Jul 7, 2025 • 0 new comments
[Mobile] .Net target Arm64
#13295 commented on Jul 7, 2025 • 0 new comments
[ONNXRuntimeError] : 1 : FAIL : This is an invalid model. Error: the graph is not acyclic.
#13322 commented on Jul 7, 2025 • 0 new comments
onnx Pad operator with negative pads value outputs 'nan'
#13332 commented on Jul 7, 2025 • 0 new comments
[Build] Upgrade to latest protobuf
#13335 commented on Jul 7, 2025 • 0 new comments
[Performance] Comparing ONNX CPU execution profiles of two FasterRCNN checkpoints
#13341 commented on Jul 7, 2025 • 0 new comments
[Build] ONNX Runtime Build Error ZCU102 (DPUCZDX8G)
#13351 commented on Jul 7, 2025 • 0 new comments
quantize_dynamic results in initializer error
#13358 commented on Jul 7, 2025 • 0 new comments
[Performance] CNN Inference has latency spikes with TensorRT EP
#13366 commented on Jul 7, 2025 • 0 new comments
Onnxruntime crashes if setting cpu affinity fails in Ort::Session constructor
#13367 commented on Jul 7, 2025 • 0 new comments
Using GPU in c++
#13380 commented on Jul 7, 2025 • 0 new comments
Can't run qdq model with TRT EP
#13381 commented on Jul 7, 2025 • 0 new comments
Whether the .trt model can be loaded
#13394 commented on Jul 7, 2025 • 0 new comments
Does ORT support quantize
#13413 commented on Jul 7, 2025 • 0 new comments
Why the performance of onednn is worse than the common version
#12315 commented on Jul 7, 2025 • 0 new comments
How to set cpu_num to a specific value?
#12819 commented on Jul 7, 2025 • 0 new comments
AttentionPastState_dynamic test fails during building with CUDA EP from source
#12820 commented on Jul 7, 2025 • 0 new comments
Memory management
#12824 commented on Jul 7, 2025 • 0 new comments
error: package directory 'onnxruntime/backend' does not exist [Build]
#12922 commented on Jul 7, 2025 • 0 new comments
[Web] Failed to compile shader on WebGL
#12927 commented on Jul 7, 2025 • 0 new comments
Disabling optimization produces incorrect results on CUDAExecutionProvider in 1.12
#12946 commented on Jul 7, 2025 • 0 new comments
[Performance] Dynamic model input prediction is slow
#12955 commented on Jul 7, 2025 • 0 new comments
Why is there not ParallelExecutionPlan like SequentialExecutionPlan in the ParallelExecutor of onnxruntime?
#13036 commented on Jul 7, 2025 • 0 new comments
onnxruntime calculate gradients but no need for training
#13057 commented on Jul 7, 2025 • 0 new comments
onnxruntime-gpu, cudaoptions, result is different
#13061 commented on Jul 7, 2025 • 0 new comments
onnxruntime-node crash the electron app[Web]
#13086 commented on Jul 7, 2025 • 0 new comments
what's the differences between onnxruntime with openvino backend VS openvino directly?
#13087 commented on Jul 7, 2025 • 0 new comments
[Performance] a problem for Ort::IoBinding
#13090 commented on Jul 7, 2025 • 0 new comments
[Performance] ONNX Runtime GPT2 Model Running Significantly Slower than PyTorch
#13105 commented on Jul 7, 2025 • 0 new comments
[Test issue] Updated Ignore
#13109 commented on Jul 7, 2025 • 0 new comments
[Performance] Multithreading performance tails off after 3 threads, possible memory issue
#13138 commented on Jul 7, 2025 • 0 new comments
Failed to create CUDAExecutionProvider
#13139 commented on Jul 7, 2025 • 0 new comments
Onnxruntime fails on GPU loading inference with int8 models
#13168 commented on Jul 7, 2025 • 0 new comments
Multilingual-MiniLM-L12-H384 ONNX inference in NodeJS
#13171 commented on Jul 7, 2025 • 0 new comments
[Build]
#13554 commented on Jul 7, 2025 • 0 new comments
ORT fails on CPU looking for LayerNormalization node, for mixed-precision ONNX
#13556 commented on Jul 7, 2025 • 0 new comments
[TVM] Exception during initialization
#13572 commented on Jul 7, 2025 • 0 new comments
unable to build onnxruntime for openvino execution provider to get nuget packages
#13577 commented on Jul 7, 2025 • 0 new comments
Does Microsoft.ML.OnnxRuntime have a dependency on System.CodeDom.dll ?
#13604 commented on Jul 7, 2025 • 0 new comments
[Build]
#13606 commented on Jul 7, 2025 • 0 new comments
[C++] Model output image different in C++ ORT vs. Python ORT & PyTorch
#13614 commented on Jul 7, 2025 • 0 new comments
[Performance] Operators assigned to CPU instead of CUDA
#13615 commented on Jul 7, 2025 • 0 new comments
Dimension Padding problem in reduction_ops.cc
#13654 commented on Jul 7, 2025 • 0 new comments
[Performance] onnxruntime session uses 5x more system memory if torch is imported
#13662 commented on Jul 7, 2025 • 0 new comments
GPT2 Static Quantization Failed. Non-zero status code returned while running Reshape node. Name:'past_0_ReduceMax_Reshape'
#13667 commented on Jul 7, 2025 • 0 new comments
Help in running onnxruntime with SNPE as execution provider
#13693 commented on Jul 7, 2025 • 0 new comments
GPU with device_id=0 is always occupied no matter what device_id is specified when run the inference
#13697 commented on Jul 7, 2025 • 0 new comments
onnxruntime-gpu get warning "Serializing optimized model with Graph Optimization level greater than ORT_ENABLE_EXTENDED and the NchwcTransformer enabled".
#13709 commented on Jul 7, 2025 • 0 new comments
[DML] reproducible bug on DML provider
#13714 commented on Jul 7, 2025 • 0 new comments
[Build] Avoid NEON when building on Raspberry Pi 4
#13718 commented on Jul 7, 2025 • 0 new comments
[Web] Uncaught (in promise) TypeError: cannot resolve operator 'Erf' with opsets: ai.onnx v15
#13729 commented on Jul 7, 2025 • 0 new comments
[Web] NPM package include ts files in the output
#13736 commented on Jul 7, 2025 • 0 new comments
[Web]
#13749 commented on Jul 7, 2025 • 0 new comments
High Output Difference between ONNX model with different optimizer settings
#18959 commented on Jul 7, 2025 • 0 new comments
ONNX Runtime Inference on GPU: Failed to create CUDAExecutionProvider
#13414 commented on Jul 7, 2025 • 0 new comments
Consecutive casting leads to wrong result
#13418 commented on Jul 7, 2025 • 0 new comments
Parameters are optimized out even if it is a needed return value
#13425 commented on Jul 7, 2025 • 0 new comments
[Web] Is it possible to use both webgl backend and wasm backend in onnxruntime-web
#13435 commented on Jul 7, 2025 • 0 new comments
run_with_iobinding is not outputting the expected result for batched input data for T5 model running on ort CUDA EP
#13463 commented on Jul 7, 2025 • 0 new comments
GPU Arena blocked session->Run()
#13464 commented on Jul 7, 2025 • 0 new comments
Consecutive call to Ort::Session::Run() crashes
#13476 commented on Jul 7, 2025 • 0 new comments
did onnxruntime-gpu surport call CUDA code or call custom kernel funtion to preprocess Image?
#13491 commented on Jul 7, 2025 • 0 new comments
[Performance]
#13492 commented on Jul 7, 2025 • 0 new comments
ORT fails on Slice() when indices are of different integer types
#13497 commented on Jul 7, 2025 • 0 new comments
Init provider bridge failed when put onnxruntime folder under path which contains other Unicode character
#13499 commented on Jul 7, 2025 • 0 new comments
[Performance]
#13500 commented on Jul 7, 2025 • 0 new comments
[Performance] C# Gpu memory allocation
#13504 commented on Jul 7, 2025 • 0 new comments
Removing the semantic segmentation's bounding box
#13513 commented on Jul 7, 2025 • 0 new comments
How to transfer the Ort::Value obtained to cuda code for post-processing, such as a .cu file?
#13528 commented on Jul 7, 2025 • 0 new comments
Unable to use LSTM with mask of dynamic shape with TensorrtExecutionProvider
#16885 commented on Jul 7, 2025 • 0 new comments
[Training] Whether onnxruntime training can be used in Megatron.
#13532 commented on Jul 7, 2025 • 0 new comments
How can I load a model larger than 2G in memory
#13543 commented on Jul 7, 2025 • 0 new comments
Zero Result with DirectML Execution Provider
#13545 commented on Jul 7, 2025 • 0 new comments
Inference speed: Swintransformer torch vs onnxruntime-gpu
#13550 commented on Jul 7, 2025 • 0 new comments
ONNXRT default CPU EP vs Openvino EP Performance
#12316 commented on Jul 7, 2025 • 0 new comments
onnx graph partition optimize
#12318 commented on Jul 7, 2025 • 0 new comments
Wrong native library directory name for M1 Mac in the Java package
#12324 commented on Jul 7, 2025 • 0 new comments
MetaCommand exception from DirectML EP
#12328 commented on Jul 7, 2025 • 0 new comments
window10 ort with openvino backend error
#12334 commented on Jul 7, 2025 • 0 new comments
unsafe exception code in C++ API, wrongly declaring exceptions, incomplete constructors
#12338 commented on Jul 7, 2025 • 0 new comments
Unable to build Onnxruntime 1.12.0 with OpenVINO 2020.3 on Windows 10
#12342 commented on Jul 7, 2025 • 0 new comments
Quantized ONNX model output
#12346 commented on Jul 7, 2025 • 0 new comments
Performance gains by ONNX inconsistent
#12348 commented on Jul 7, 2025 • 0 new comments
Integer quantization fails on Transformer-based vision model
#12362 commented on Jul 7, 2025 • 0 new comments
Setting Openvino EP to run on one core with one thread
#12365 commented on Jul 7, 2025 • 0 new comments
Unable to build tensorrt docker image
#12373 commented on Jul 7, 2025 • 0 new comments
Accept dictionary of tensor as input (python api)
#12380 commented on Jul 7, 2025 • 0 new comments
Fail to build onnxRT with oneDNN using official build command
#12382 commented on Jul 7, 2025 • 0 new comments
Segmentation fault
#12386 commented on Jul 7, 2025 • 0 new comments
While loading the onnx file with InferenceSession getting session ID 11 error
#12402 commented on Jul 7, 2025 • 0 new comments
Failed to build with ACL(and ARMnn)
#12407 commented on Jul 7, 2025 • 0 new comments
Can't build with OpenVINO 2022.1 ("onnxruntime_providers_shared" does not exist)
#12411 commented on Jul 7, 2025 • 0 new comments
`Env(OrtLoggingLevel, const char* logid, OrtLoggingFunction, ...` fails to pass `logid` param to log function
#12414 commented on Jul 7, 2025 • 0 new comments
Inference time vs torch w/regard to batch_size and BatchNorm
#12130 commented on Jul 7, 2025 • 0 new comments
When will Attention OP extra_add_qk input support automatic broadcast
#12149 commented on Jul 7, 2025 • 0 new comments
Query regarding timings under ONNXRT profiler
#12150 commented on Jul 7, 2025 • 0 new comments
Hi Does ONNX Runtime support FP16 and INT8 inference on Intel OneDNN ExecutionProvider?
#12160 commented on Jul 7, 2025 • 0 new comments
Eager mode generator support non-tensor return types
#12163 commented on Jul 7, 2025 • 0 new comments
symbolic_shape_infer.py not working with models quantized with 🤗 Optimum for TensorRT
#12173 commented on Jul 7, 2025 • 0 new comments
upgrading pip and wheels kills CUDAExecutionProvider
#12185 commented on Jul 7, 2025 • 0 new comments
why first session.run is too slower than after
#12197 commented on Jul 7, 2025 • 0 new comments
Performance issue of ConvInteger
#12206 commented on Jul 7, 2025 • 0 new comments
How to release memory after Inference session run in Python
#12207 commented on Jul 7, 2025 • 0 new comments
Regarding the dynamism for custom op in ONNXRT
#12211 commented on Jul 7, 2025 • 0 new comments
Quantized Model Running Slow Using Cuda as EP
#12229 commented on Jul 7, 2025 • 0 new comments
Exported beam search model consumes a lot of more memory
#12246 commented on Jul 7, 2025 • 0 new comments
Mismatch in the order of the column names in the benchmarking script for transformer models
#12265 commented on Jul 7, 2025 • 0 new comments
LoadLibrary failed with error 126 (DirectML)
#12269 commented on Jul 7, 2025 • 0 new comments
TRT EP failed to create model session with CUDA custom op
#12282 commented on Jul 7, 2025 • 0 new comments
Since ORT 1.12 ort.InferenceSession throws error when the last provider is not capable
#12287 commented on Jul 7, 2025 • 0 new comments
SafeIntOnOverflow() Integer overflow error when running inference in an ASGI server
#12288 commented on Jul 7, 2025 • 0 new comments
Resize op can't work well under Cubic mode with ORT 1.12.
#12302 commented on Jul 7, 2025 • 0 new comments
Details regarding ONNXRuntime inference with OpenVino Backend
#12305 commented on Jul 7, 2025 • 0 new comments
[TEST FAILED] Several tests fails while running onnxruntime_test_all on armv7 based device
#16387 commented on Jul 7, 2025 • 0 new comments
Does ortvalue_from_numpy support directml?
#15421 commented on Jul 7, 2025 • 0 new comments
Confusing exception about supported types
#12648 commented on Jul 7, 2025 • 0 new comments
get kill signal when quantize the ONNX model using quantize_static
#12652 commented on Jul 7, 2025 • 0 new comments
Enable Global Shared Threadpool and Memory Allocator For C#
#12654 commented on Jul 7, 2025 • 0 new comments
Non-zero status code returned while running TopK node. (ssdlite320_mobilenet_v3_large)
#12669 commented on Jul 7, 2025 • 0 new comments
Mixed Precision ValueError: validation failed for model with all nodes in node_block_list
#14235 commented on Jul 7, 2025 • 0 new comments
[CPU EP] GatherND crashes with division by zero when batch dimensions mismatch between input and indices
#23828 commented on Jul 7, 2025 • 0 new comments
perf_view shows nothing after json load
#15927 commented on Jul 7, 2025 • 0 new comments
GPU Memory allocation with multiple cuda stream
#12920 commented on Jul 7, 2025 • 0 new comments
Wrong Results for FP16 Models in CUDAExecutionProvider and TensorRTExecutionProvider
#12726 commented on Jul 7, 2025 • 0 new comments
`static inline Ort::Env onnx_env{nullptr}` easily leads to nullptr deref on app exit
#12736 commented on Jul 7, 2025 • 0 new comments
SystemError : 13 for transformers optimizer
#12745 commented on Jul 7, 2025 • 0 new comments
BatchNormalization produces all zeros for 1D input
#12754 commented on Jul 7, 2025 • 0 new comments
How to set the priority of ONNX in GPU?
#12760 commented on Jul 7, 2025 • 0 new comments
onnxruntime-linux-x64-gpu-1.12.1
#12766 commented on Jul 7, 2025 • 0 new comments
Asynchrononus Inference
#12768 commented on Jul 7, 2025 • 0 new comments
I want to use tensorrt as the back-end of onnx
#12781 commented on Jul 7, 2025 • 0 new comments
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : CUDA error executing cudaSetDevice(GetDeviceId())
#12785 commented on Jul 7, 2025 • 0 new comments
cast op not support multithread
#12786 commented on Jul 7, 2025 • 0 new comments
CUDA support for longer-input models like BigBird
#12463 commented on Jul 7, 2025 • 0 new comments
I found that the OnnxRuntime used almost all of the instruction sets for the convolutional computations and I wanted to optimize for that
#12479 commented on Jul 7, 2025 • 0 new comments
How to exit abnormally in the Python Operator (PyOp)
#12481 commented on Jul 7, 2025 • 0 new comments
QDQ + Add nodes are not fused into QLinearAdd when the graph is optimized
#12487 commented on Jul 7, 2025 • 0 new comments
performance is poor when onnxruntime C++ run in intel cpu
#12489 commented on Jul 7, 2025 • 0 new comments
LSTM Y output is inconsistent with TF inference result when seq_len is effective
#12492 commented on Jul 7, 2025 • 0 new comments
Clarify NMS sorting strategy
#12493 commented on Jul 7, 2025 • 0 new comments
Attributes in nested function calls are zeroed out
#12506 commented on Jul 7, 2025 • 0 new comments
Computing loss within onnxrunitme inference (GPT2 model)
#12526 commented on Jul 7, 2025 • 0 new comments
java deploy in k8s Failed to load library libonnxruntime_providers_cuda.so with error
#12540 commented on Jul 7, 2025 • 0 new comments
engine decryption does not work in TensorRT EP
#12551 commented on Jul 7, 2025 • 0 new comments
Add execution provider selection for quantize_static
#12573 commented on Jul 7, 2025 • 0 new comments
Document beamsearch
#12584 commented on Jul 7, 2025 • 0 new comments
Name:'MatMul_32007' Status Message: matmul_helper.h:61 Compute MatMul dimension mismatch
#12594 commented on Jul 7, 2025 • 0 new comments
Run the onnx model converted from seq2seq and report an error
#12608 commented on Jul 7, 2025 • 0 new comments
Where is the definition of session.Run() in onnxruntime C++ api
#12623 commented on Jul 7, 2025 • 0 new comments
cuda_provider_options.h include non existing file?
#12636 commented on Jul 7, 2025 • 0 new comments
when the model support dynamic batch ，the input shape [ -1,-1,80], how can warm up? because of the dynamic batch , I do know the warmup batchsize number ,can it use min_batchsize and max _batchsize to warmup?
#12637 commented on Jul 7, 2025 • 0 new comments
The quantization model reduces the accuracy compared to the TRT
#12638 commented on Jul 7, 2025 • 0 new comments
Failed to create TensorrtExecutionProvider using onnxruntime-gpu
#12639 commented on Jul 7, 2025 • 0 new comments
[C-Api] Dynamic Shape Error: Non-zero status code returned while running Sigmoid node.
#6372 commented on Jul 7, 2025 • 0 new comments
onnxruntime v1.6.0 on Jetson Nano - Illegal Instruction (core dumped)
#6375 commented on Jul 7, 2025 • 0 new comments
How to extract dimension of inputs in onnxruntime/core/providers/cpu/math/matmul.cc
#6396 commented on Jul 7, 2025 • 0 new comments
Which executor to build when using: Intel® Deep Learning Boost (Intel® DL Boost)
#6400 commented on Jul 7, 2025 • 0 new comments
[question] Configure GPU arena with Python bindings
#6411 commented on Jul 7, 2025 • 0 new comments
Onnxruntime error when Relu-layer follows Dense-layer without activation and biases
#6423 commented on Jul 7, 2025 • 0 new comments
Reshape `requested_shape` forced to have leading dimension 1 when it should be -1
#6424 commented on Jul 7, 2025 • 0 new comments
Build failure in `orttraining_pybind_state.cc` when building with `--enable_training` and `--build_wheel`
#6536 commented on Jul 7, 2025 • 0 new comments
NaN in AveragePooling
#6543 commented on Jul 7, 2025 • 0 new comments
Loss of accuracy when GPT-2 based model is exported to ONNX
#6549 commented on Jul 7, 2025 • 0 new comments
Custom Op Registration and Implementation
#6564 commented on Jul 7, 2025 • 0 new comments
Inference error using migraohx-onnxruntime
#6605 commented on Jul 7, 2025 • 0 new comments
/onnxruntime/core/mlas/lib/quantize.cpp:50:62: error: ‘vminnmq_f32’ was not declared in this scope
#6638 commented on Jul 7, 2025 • 0 new comments
Failed to add Microsoft.AI.MachineLearning NuGet package to .NET Framework 4.6.1 projects
#6662 commented on Jul 7, 2025 • 0 new comments
INT8 quantized model is very slow
#6732 commented on Jul 7, 2025 • 0 new comments
Shape inference error for Range node
#6737 commented on Jul 7, 2025 • 0 new comments
onnxruntime-gpu (cudaexecutionprovider) usage of cudnn autotuner
#6744 commented on Jul 7, 2025 • 0 new comments
Unable to compile on Linux with CUDA
#6749 commented on Jul 7, 2025 • 0 new comments
Onnxruntime inference with Integrated GPU Failed
#6755 commented on Jul 7, 2025 • 0 new comments
Check if GPU is available
#15942 commented on Jul 9, 2025 • 0 new comments
Onnx Batch Processing
#6044 commented on Jul 7, 2025 • 0 new comments
How to extract the size of a map type in c++?
#6077 commented on Jul 7, 2025 • 0 new comments
could the checkpoint of bert convert to onnx model? I have a bug that 'BertForPreTraining' object has no attribute 'layers, output'
#6089 commented on Jul 7, 2025 • 0 new comments
how to implement execution provider (EP) that allow onnx run on my hardware?
#6110 commented on Jul 7, 2025 • 0 new comments
32bit vs 64bit when compiling or something else?
#6144 commented on Jul 7, 2025 • 0 new comments
GPU memory consumption keeps increasing with multithreading in Java
#6181 commented on Jul 7, 2025 • 0 new comments
Not support rtx 3000 series
#6213 commented on Jul 7, 2025 • 0 new comments
sample c++ program just print "hello" does not start
#6243 commented on Jul 7, 2025 • 0 new comments
Cannot create OnnxTensor with UINT8 type.
#6261 commented on Jul 7, 2025 • 0 new comments
Referencing Microsoft.ML.OnnxRuntime and Microsoft.ML.OnnxRuntime.GPU in a c# project.
#6264 commented on Jul 7, 2025 • 0 new comments
Could onnxruntime be compiled into wasm using emsdk?
#6275 commented on Jul 7, 2025 • 0 new comments
Performance shaking
#6301 commented on Jul 7, 2025 • 0 new comments
[Bug] Wrong implementation in LpPool
#6302 commented on Jul 7, 2025 • 0 new comments
Memory corruption when using OnnxRuntime with OpenVINO on the Intel MyriadX and Raspberry Pi 4B
#6304 commented on Jul 7, 2025 • 0 new comments
[ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Error while inferencing DLRM onnx model
#6319 commented on Jul 7, 2025 • 0 new comments
Error: Running double precision model exported from pyTorch
#6320 commented on Jul 7, 2025 • 0 new comments
The output for GPT is NAN when fp16=True
#6328 commented on Jul 7, 2025 • 0 new comments
ROCm build seems broken: `error: ‘ncclComm_t’ does not name a type`
#6358 commented on Jul 7, 2025 • 0 new comments
Implementation of ONNX Functions
#6360 commented on Jul 7, 2025 • 0 new comments
Run a model containing CustomOp with TensorRT provider fails
#7314 commented on Jul 7, 2025 • 0 new comments
C# console app crash upon appending OpenVino execution provider
#7330 commented on Jul 7, 2025 • 0 new comments
Cannot save Tensorrt .engine model in v1.7.1
#7339 commented on Jul 7, 2025 • 0 new comments
openvino continued package by pyinstaller external dll issue
#7346 commented on Jul 7, 2025 • 0 new comments
Resize Operator rounds-down instead of round-to-even for int32/uint8
#7368 commented on Jul 7, 2025 • 0 new comments
How to compile the framework that can run in Windows XP？
#7444 commented on Jul 7, 2025 • 0 new comments
Please, update the docs. Provider parameter "cuda_mem_limit" was renamed to "gpu_mem_limit" in nightly build.
#7457 commented on Jul 7, 2025 • 0 new comments
How to release gpu memory without exiting the process？
#7463 commented on Jul 7, 2025 • 0 new comments
Running inference using GPU or TensorRT on Jetson
#7484 commented on Jul 7, 2025 • 0 new comments
Problem compiling ONNX RT with CUDA and TensorRT on Windows
#7562 commented on Jul 7, 2025 • 0 new comments
Use of torch InstanceNorm2d and dynamic tensor size causes crash
#7572 commented on Jul 7, 2025 • 0 new comments
onnxruntime build is not compatible with onnx build. Protobuf loaded twice.
#7597 commented on Jul 7, 2025 • 0 new comments
Large GPU memory usage with EXHAUSTIVE cuDNN search
#7612 commented on Jul 7, 2025 • 0 new comments
Enable CUDA provider option configuration in Java
#7613 commented on Jul 7, 2025 • 0 new comments
Publish the providers with the release build
#7628 commented on Jul 7, 2025 • 0 new comments
Build fails with --use_rknpu
#7614 commented on Jul 7, 2025 • 0 new comments
int8 quantization on GPU support? (transformers)
#7634 commented on Jul 7, 2025 • 0 new comments
Does onnxruntime support bert with relative position embedding
#7713 commented on Jul 7, 2025 • 0 new comments
quantize model can‘t run on gpu ?
#7745 commented on Jul 7, 2025 • 0 new comments
Loading a Keras model with custom layers into Microsoft.ML
#10419 commented on Jul 7, 2025 • 0 new comments
Onnxruntime.gpu is as slower than cpu mode
#6799 commented on Jul 7, 2025 • 0 new comments
Multiple input and multiple output models that create tensors in loops can cause serious crashes
#6821 commented on Jul 7, 2025 • 0 new comments
ONNXRuntime Inference with Finetuned BERT Model outputting odd results
#6830 commented on Jul 7, 2025 • 0 new comments
Unable to build onnxruntime with "--build_wheel" and "--enable_pybind" options
#6841 commented on Jul 7, 2025 • 0 new comments
[JAVA Bindings + Android arm64-v8a] ONNXRuntime build documentation
#6923 commented on Jul 7, 2025 • 0 new comments
dynamic shape input is much slower than fixed shape input in gpu
#6978 commented on Jul 7, 2025 • 0 new comments
CUDA header requested but missing in DNNL part of ORT 1.7.1
#7005 commented on Jul 7, 2025 • 0 new comments
Build fail for docker on MacOS. -NO GPU.
#7052 commented on Jul 7, 2025 • 0 new comments
Large Memory Allocations When Loading RandomForestRegressor Model
#7067 commented on Jul 7, 2025 • 0 new comments
Non-zero status code returned while running BatchNormalization node
#7095 commented on Jul 7, 2025 • 0 new comments
Memory and timing issue with onnxruntime python API with TensorFlow model
#7106 commented on Jul 7, 2025 • 0 new comments
Compile error in header onnxruntime_cxx_api.h when update ONNX runtime from 1.5.2 to 1.7.1
#7142 commented on Jul 7, 2025 • 0 new comments
Batch inference
#7178 commented on Jul 7, 2025 • 0 new comments
Segmentation fault when running onnxruntime inside docker with cpuset restrictions
#7207 commented on Jul 7, 2025 • 0 new comments
Significant difference in the performance of pytorch and exported onnx models
#7212 commented on Jul 7, 2025 • 0 new comments
TensorrtExecutionProvider slower than CUDAExecutionProvider: Transformers
#7230 commented on Jul 7, 2025 • 0 new comments
The speed of running the onnx model is 6x slower than the pytorch model on Jetson TX2
#7233 commented on Jul 7, 2025 • 0 new comments
[Python API + ARM64] Running ResNet50 on ARM board using ACL Error and Performance Issue
#7234 commented on Jul 7, 2025 • 0 new comments
ACL (32bit) Execution Provider fails on gemm node
#7255 commented on Jul 7, 2025 • 0 new comments
onnxruntime gpu version can't installed, how to fix it?
#7272 commented on Jul 7, 2025 • 0 new comments
System.ExecutionEngineException creating Microsoft.ML.OnnxRuntime.SessionOptions
#23263 commented on Jul 9, 2025 • 0 new comments
onnxruntime produces invalid results due to the wrong shape inference for the clip operator
#24971 commented on Jul 9, 2025 • 0 new comments
Color contrast of focus on learn more about ONNX Runtime&Generative AI, quick start, tutorials, Install ONNX Runtime, Hardware Accelartion,ONNX Runtime youtube channel is 1.152:1 less than 3:1: A11y_ONNX Runtime & Ecosystem_Runtime_Non text contrast
#24995 commented on Jul 9, 2025 • 0 new comments
Focus is not visible on Scrolling cards under Trusted By section: A11y_ONNX Runtime & Ecosystem_Runtime_Focus Visible
#24996 commented on Jul 9, 2025 • 0 new comments
The behavior of Gather/GatherElements/GatherND when the indices values are out-of-bounds
#25251 commented on Jul 9, 2025 • 0 new comments
MaxPool produces results with wrong shape
#25234 commented on Jul 9, 2025 • 0 new comments
Multi-threaded GPU inferencing failing with whisper-small: Non-zero status code returned while running DecoderMaskedMultiHeadAttention node
#21413 commented on Jul 9, 2025 • 0 new comments
Significant loading time with TensorRT compared to not
#4018 commented on Jul 9, 2025 • 0 new comments
preprocess issues around MeanReduce/Reshape nodes and negative axes
#23868 commented on Jul 8, 2025 • 0 new comments
Reshape with a `0` dimension produces incorrect shape
#15203 commented on Jul 8, 2025 • 0 new comments
Creating ORT inference session from onnx model gives segmentation fault
#24087 commented on Jul 8, 2025 • 0 new comments
[Web] `Error: using ceil() in shape computation is not yet supported for AveragePool`
#21206 commented on Jul 8, 2025 • 0 new comments
[Build] .pc file asks for -lonnxruntime but onnxruntime.a isn't installed
#23959 commented on Jul 8, 2025 • 0 new comments
Onnx Runtime for Java is packaged with 200MB onnxruntime.pdb in the win-x64 native package
#12084 commented on Jul 8, 2025 • 0 new comments
[OpenVINO] SessionOptionsAppendExecutionProvider_OpenVINO API loads NULL config file
#23871 commented on Jul 8, 2025 • 0 new comments
[Performance] Increased memory usage when loading from bytes
#21165 commented on Jul 8, 2025 • 0 new comments
[Web] `Error: [WebGPU] Kernel "[Conv] /text_encoder/encoder/layers.0/feed_forward/conv_2/Conv" failed. Error: FILTER_IN_CHANNEL should be equal to DATA_CHANNEL`
#21108 commented on Jul 8, 2025 • 0 new comments
[Performance] Dynamic Shape performance
#13198 commented on Jul 8, 2025 • 0 new comments
[Build] Mismatched library directory in linux-x64 package: lib and lib64
#22267 commented on Jul 7, 2025 • 0 new comments
Use env. allocators for initializers (#25108)
#25281 commented on Jul 11, 2025 • 0 new comments
Upgrade xnnpack to latest
#25275 commented on Jul 11, 2025 • 0 new comments
[ARM CPU] SVE support for Elementwise kernels
#25238 commented on Jul 10, 2025 • 0 new comments
[webgpu] extend cast version to 23
#25235 commented on Jul 9, 2025 • 0 new comments
[Don't review][webgpu] Support sg_size=32 for dp4 shader
#25184 commented on Jul 7, 2025 • 0 new comments
[QNN_EP] Implement Efficient Mode API
#25146 commented on Jul 11, 2025 • 0 new comments
Compile API: support for OrtModel input and write output to stream
#24740 commented on Jul 7, 2025 • 0 new comments
Prototype getting EP graph partitioning info from OrtSession
#24688 commented on Jul 9, 2025 • 0 new comments
Migrate issue labeler workflow to issueLabeler.yml policy
#21659 commented on Jul 7, 2025 • 0 new comments
Output mismatch in QDQ model with optimizations enabled vs disabled (CPU Execution Provider)
#25259 commented on Jul 11, 2025 • 0 new comments
Memory safety for Nvidia GPU time-slicing
#24943 commented on Jul 10, 2025 • 0 new comments
[Build] Build fails: 'error : no operator "+=" matches these operands' with nv_bfloat16
#25162 commented on Jul 10, 2025 • 0 new comments
[CUDA] Acquiring a CUDA allocator without loading a session.
#19420 commented on Jul 10, 2025 • 0 new comments
[Performance] ONNX Runtime: Concat and Slice ops fallback to CPU even with float32 and static shapes
#24999 commented on Jul 10, 2025 • 0 new comments
[Bug] Invalid type for QuantizeLinear dtype post-ORT optimizations
#25001 commented on Jul 10, 2025 • 0 new comments
AMD GPU-NPU
#25142 commented on Jul 10, 2025 • 0 new comments
how to release gpu memory when keep onnxruntime session around.
#9509 commented on Jul 10, 2025 • 0 new comments
[OpenVINO EP] GetCapability shouldn't override the NPU device type as CPU
#25164 commented on Jul 10, 2025 • 0 new comments
[CPU EP] Fail to run some WPT WebNN argMin/argMax conformance tests of uint32/uint64 types by default CPU EP
#25183 commented on Jul 10, 2025 • 0 new comments
[Build] Unable to build ONNX Runtime 1.22 due to dependency update
#25098 commented on Jul 9, 2025 • 0 new comments
Performance comparison
#5834 commented on Jul 7, 2025 • 0 new comments
IOBindings in C++ API are missing a way to SynchronizeInputs.
#5857 commented on Jul 7, 2025 • 0 new comments
How to compile with vs2019, with the platform tool set "Visual Studio 2015 - Windows XP (v140_xp)"， i want use it in xp system
#5859 commented on Jul 7, 2025 • 0 new comments
Quantized model much slower than full precision model
#5865 commented on Jul 7, 2025 • 0 new comments
Performance issue with operator Where on CPU
#5896 commented on Jul 7, 2025 • 0 new comments
Performance issue with operators SVMRegressor and SVMClassifier for RBF kernel on CPU
#5898 commented on Jul 7, 2025 • 0 new comments
failed:/onnxruntime_src/onnxruntime/core/graph/model_load_utils.h:47 void onnxruntime::model_load_utils::ValidateOpsetForDomain
#5905 commented on Jul 7, 2025 • 0 new comments
Support GCN
#5910 commented on Jul 7, 2025 • 0 new comments
EyeLike with dynamic shape results in error
#5917 commented on Jul 7, 2025 • 0 new comments
Can't train mnist in parallel
#5918 commented on Jul 7, 2025 • 0 new comments
could not open "tensorrt_provider_factory.h", "mkldnn_provider_factory.h"
#5925 commented on Jul 7, 2025 • 0 new comments
Dynamic shape got wrong output
#5928 commented on Jul 7, 2025 • 0 new comments
Issue with Multi-GPU and GPU memory limit
#5939 commented on Jul 7, 2025 • 0 new comments
Drop support for Python 3.5
#5961 commented on Jul 7, 2025 • 0 new comments
can not get expected speed in onnxruntime
#5953 commented on Jul 7, 2025 • 0 new comments
Error using onnx model containing Bidirectional layer with MatMulAddFusion
#5955 commented on Jul 7, 2025 • 0 new comments
No opset import for domain 'com.microsoft'
#5971 commented on Jul 7, 2025 • 0 new comments
"undefined symbol" error occured, when I use ort.SessionOptions.register_custom_ops_library
#5984 commented on Jul 7, 2025 • 0 new comments
Under TRT EP, custom op cannot fall back to CUDA EP
#6002 commented on Jul 7, 2025 • 0 new comments
Inconsistent inference time between C Python API [Megatron-LM]
#6025 commented on Jul 7, 2025 • 0 new comments
Microsoft.AI.MachineLearning cannot be used in UWP app on on Windows 10 ARM64
#4686 commented on Jul 7, 2025 • 0 new comments
Debugging capability of onnxruntime in Visual Studio 2019 incapacitated
#4812 commented on Jul 7, 2025 • 0 new comments
[WinML] [C++/WinRT] Clarify how to share Ort::Env environments with WinRT/WinML instances
#4971 commented on Jul 7, 2025 • 0 new comments
C Sharp API for openvino doesn't run on GPU
#5011 commented on Jul 7, 2025 • 0 new comments
onxruntime-gpu installation issues
#5020 commented on Jul 7, 2025 • 0 new comments
program stucks when multi processes
#5093 commented on Jul 7, 2025 • 0 new comments
Exception thrown from Dispose method (When missing dependency)
#5250 commented on Jul 7, 2025 • 0 new comments
DLRM model failure to execute on GPU
#5295 commented on Jul 7, 2025 • 0 new comments
Running quantized models on GPU
#5359 commented on Jul 7, 2025 • 0 new comments
Can Session::Run be const?
#5558 commented on Jul 7, 2025 • 0 new comments
ML.NET issue while Using yolov4 onnx model
#5593 commented on Jul 7, 2025 • 0 new comments
Passing Non-Const pointer to Session::Run() using CPP Api
#5597 commented on Jul 7, 2025 • 0 new comments
How to reduce memory used?
#5711 commented on Jul 7, 2025 • 0 new comments
openvino build failed nuget
#5749 commented on Jul 7, 2025 • 0 new comments
How to loading a pytorch model with input shape of (None, 32) using the C# inference ?
#5781 commented on Jul 7, 2025 • 0 new comments
Any support for double type tensor when loading pytorch onnx model ?
#5782 commented on Jul 7, 2025 • 0 new comments
memory keep increasing with dynamic input shape of network
#5796 commented on Jul 7, 2025 • 0 new comments
Memory usage with Cuda ExecutionProvider
#5801 commented on Jul 7, 2025 • 0 new comments
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : The node is not placed on any Execution Provider. OneHot(11) (node while/cond_5/one_hot).
#5825 commented on Jul 7, 2025 • 0 new comments
Non-zero status code returned while running Div node
#5830 commented on Jul 7, 2025 • 0 new comments
ort-web Error: invalid input shape. when using webgl backend and there is a torch.nn.BatchNorm1d layer in the network
#10437 commented on Jul 7, 2025 • 0 new comments
cast BatchNorm2d to int32
#10440 commented on Jul 7, 2025 • 0 new comments
TensorRT input: 717 has no shape specified.
#10443 commented on Jul 7, 2025 • 0 new comments
raise Exception("Incomplete symbolic shape inference") when running "symbolic_shape_infer.py"
#10484 commented on Jul 7, 2025 • 0 new comments
C++ OnnxRuntime-GPU Slower than Python OnnxRuntime-GPU/C++ OnnxRuntime-CPU
#10492 commented on Jul 7, 2025 • 0 new comments
slower after graph optimization!
#10538 commented on Jul 7, 2025 • 0 new comments
onnxruntime and onnxruntime-gpu produce different output for ReduceL1 operator
#10542 commented on Jul 7, 2025 • 0 new comments
Run maskrcnn onnx from pytorch and inference on c++ with gpu sometimes will error
#10543 commented on Jul 7, 2025 • 0 new comments
Exception in DirectML on second inference run
#10546 commented on Jul 7, 2025 • 0 new comments
Unit Tests failure while building on Windows with CUDA EP
#10561 commented on Jul 7, 2025 • 0 new comments
Building Error
#10600 commented on Jul 7, 2025 • 0 new comments
OpenVINO Execution provider's CPU Utility is low
#10601 commented on Jul 7, 2025 • 0 new comments
How to use OpenVINO GetAvailableDevices?
#10602 commented on Jul 7, 2025 • 0 new comments
why it take 200 seconds to run onnxruntime.InferenceSession
#10608 commented on Jul 7, 2025 • 0 new comments
Building OnnxRuntime v1.10.0 with CUDAExecutionProvider for sm_75 GPU fails in CUDA10.2 environment
#10610 commented on Jul 7, 2025 • 0 new comments
C + + onnxruntime GPU is ten times slower than CPU
#10611 commented on Jul 7, 2025 • 0 new comments
Optimization for T5 transformer models.
#10613 commented on Jul 7, 2025 • 0 new comments
[E:onnxruntime:, sequential_executor.cc:346 Execute] Non-zero status code returned while running Add node. Name:'Add_1363' Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/math/element_wise_ops.h:505 void onnxruntime::BroadcastIterator::Append(ptrdiff_t, ptrdiff_t) axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 9 by 505
#10618 commented on Jul 7, 2025 • 0 new comments
about providers and providers_options in InferenceSession
#10620 commented on Jul 7, 2025 • 0 new comments
Awful performance with LASER model when using TensorRT provider
#8315 commented on Jul 7, 2025 • 0 new comments
Sigmoid fails and output all zeros
#10154 commented on Jul 7, 2025 • 0 new comments
Why does onnxruntime run slower on C++?
#10155 commented on Jul 7, 2025 • 0 new comments
`InferenceSession` initialization hangs
#10166 commented on Jul 7, 2025 • 0 new comments
TensorRT EP failed to set INT8 dynamic range.
#10206 commented on Jul 7, 2025 • 0 new comments
how to use docker and onnxruntime deploy onnx model on GPU?
#10257 commented on Jul 7, 2025 • 0 new comments
Inconsistent inference timing on CPU
#10270 commented on Jul 7, 2025 • 0 new comments
Inference: Time in GPU is similar in CPU. GPU not speed up
#10271 commented on Jul 7, 2025 • 0 new comments
multiple InferenceSession slowdown inference speed
#10273 commented on Jul 7, 2025 • 0 new comments
DnnlExecutionProvider is not visible in python API
#10275 commented on Jul 7, 2025 • 0 new comments
add QLinearMatMul do not quantize per channel flag to quantize_static extra options
#10283 commented on Jul 7, 2025 • 0 new comments
onnxruntime inference is around 5 times slower than pytorch when using GPU
#10303 commented on Jul 7, 2025 • 0 new comments
Bug: pthread sent an error! undefined:undefined: ortWasmThreaded is not defined
#10311 commented on Jul 7, 2025 • 0 new comments
Quantized int8 onnx GPT2 model returns different tokens whether using past_key_values or not for the same sentence
#10322 commented on Jul 7, 2025 • 0 new comments
Onnxruntime multithread options [C++ CPU]
#10330 commented on Jul 7, 2025 • 0 new comments
Issues when trying to use Onnxruntime and Tensorrt execution provider in a java application
#10352 commented on Jul 7, 2025 • 0 new comments
build onnxruntime error linux
#10364 commented on Jul 7, 2025 • 0 new comments
Error happened while building onnxruntime
#10378 commented on Jul 7, 2025 • 0 new comments
Question about hidden states in onnx DistilGPT2
#10382 commented on Jul 7, 2025 • 0 new comments
Is TensorRT execution provider caching is thread-safe
#10412 commented on Jul 7, 2025 • 0 new comments
Different inference results from python and C#
#10863 commented on Jul 7, 2025 • 0 new comments
Does WebGL fail when network inputs are not dimensions in powers of two?
#10873 commented on Jul 7, 2025 • 0 new comments
TensorRT conversion support on Huggingface transformers quantized models.
#10888 commented on Jul 7, 2025 • 0 new comments
onnxruntime/capi/onnxruntime_inference_collection.py", line 370, in _create_inference_session sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model) onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from onnx_data/cpm_large_opt.onnx failed:Protobuf parsing failed.
#10892 commented on Jul 7, 2025 • 0 new comments
python3 -m onnxruntime_tools.transformers.optimizer when opt_level=1 comes error for BERT
#10893 commented on Jul 7, 2025 • 0 new comments
1 : Fail : Non-zero status code returned while running FusedConv node.
#10894 commented on Jul 7, 2025 • 0 new comments
After using onnxruntime.transformers.optimizer to optimize onnx, the optimized model fail to tensorrt
#10905 commented on Jul 7, 2025 • 0 new comments
TensorRT Execution [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization
#10914 commented on Jul 7, 2025 • 0 new comments
slow fp16 performance
#10919 commented on Jul 7, 2025 • 0 new comments
onnxruntime TensorRT Related Questions
#10930 commented on Jul 7, 2025 • 0 new comments
MinGW support (MSYS2)
#10976 commented on Jul 7, 2025 • 0 new comments
docker can't clone git repository for ARM64
#10991 commented on Jul 7, 2025 • 0 new comments
Xor with broadcasting computes error
#11000 commented on Jul 7, 2025 • 0 new comments
Inconsistent behavior between CPU and GPU on ReLU operator when input is NaN
#11010 commented on Jul 7, 2025 • 0 new comments
0xc00007b error, could not startup exe at all with onnxruntime1.7 win-x64 cpu on win10
#11016 commented on Jul 7, 2025 • 0 new comments
Failed to build onnxruntime-vitisai docker container due to missing NO_PUBKEY
#11017 commented on Jul 7, 2025 • 0 new comments
Huggingface Transformers Shape Inference Issue
#11019 commented on Jul 7, 2025 • 0 new comments
kalid-onnxruntime Fatal error: Gemm is not a registered function/op
#11021 commented on Jul 7, 2025 • 0 new comments
Updating state of the network
#11026 commented on Jul 7, 2025 • 0 new comments
Can't constant fold SequenceEmpty node
#11041 commented on Jul 7, 2025 • 0 new comments
How to use mimalloc in Linux?
#10629 commented on Jul 7, 2025 • 0 new comments
CPU & CUDA execution provider produce different value
#10636 commented on Jul 7, 2025 • 0 new comments
No libonnxruntime_providers_cuda.so generated?
#10639 commented on Jul 7, 2025 • 0 new comments
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION
#10657 commented on Jul 7, 2025 • 0 new comments
Get wrong result when use webgl backend
#10673 commented on Jul 7, 2025 • 0 new comments
onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs initializer_node_arg != nullptr was false.
#10677 commented on Jul 7, 2025 • 0 new comments
Need help on the following from wiki listed roadmap.
#10689 commented on Jul 7, 2025 • 0 new comments
Output shape is mismatched with ONNX SPEC about Resize_tf_crop_and_size with scale input
#10727 commented on Jul 7, 2025 • 0 new comments
gpu onnxruntime lib
#10731 commented on Jul 7, 2025 • 0 new comments
Onnx model consumes huge CPU memory
#10742 commented on Jul 7, 2025 • 0 new comments
inference qdq model failed with TRT EP.
#10743 commented on Jul 7, 2025 • 0 new comments
build on windows cup is fine，but cuda not
#10745 commented on Jul 7, 2025 • 0 new comments
Is there a version of onnxruntime that is compatible with windows 7?
#10749 commented on Jul 7, 2025 • 0 new comments
can build on windows with Geforce 1060 card, cuda 11.0 cudnn 8.0.2 successfully?
#10763 commented on Jul 7, 2025 • 0 new comments
very slow in inference
#10764 commented on Jul 7, 2025 • 0 new comments
ONNX models give slower inference in Python Multiprocessing
#10786 commented on Jul 7, 2025 • 0 new comments
Inference time of onnxruntime gpu increases at very high batch sizes
#10789 commented on Jul 7, 2025 • 0 new comments
Transformer optimizer outputs confusing error
#10838 commented on Jul 7, 2025 • 0 new comments
C++ is 10x slower compared with Python, CPU only
#10849 commented on Jul 7, 2025 • 0 new comments
Windows 32 bit performance much slower than 64bit?
#10855 commented on Jul 7, 2025 • 0 new comments
Inference Speed is slow on GPU
#8316 commented on Jul 7, 2025 • 0 new comments
After 8bit quantization, the GPU inference speed is very slow
#8330 commented on Jul 7, 2025 • 0 new comments
GPUs operate slower than CPUs
#8362 commented on Jul 7, 2025 • 0 new comments
error using C# tensorRT EP builded from source
#8367 commented on Jul 7, 2025 • 0 new comments
Why cuda provider allocator must be threadlocal?
#8378 commented on Jul 7, 2025 • 0 new comments
ERROR running model inference:Non-zero status code returned while running Cast node
#8424 commented on Jul 7, 2025 • 0 new comments
Implement Split for double or float64 data type
#8382 commented on Jul 7, 2025 • 0 new comments
Found regression on ORT 1.8.1
#8513 commented on Jul 7, 2025 • 0 new comments
Does the onnxruntime.quantization.quantize_dynamic support GPU quantization?
#8524 commented on Jul 7, 2025 • 0 new comments
gpu memory can not release.
#8544 commented on Jul 7, 2025 • 0 new comments
Build failure of onnxruntime Docker container with Vitis-AI
#8596 commented on Jul 7, 2025 • 0 new comments
PrepareForCompute Non concat axis dimensions must match: Axis 0 has mismatched dimensions of 1 and 0
#8685 commented on Jul 7, 2025 • 0 new comments
error with torch.sum or torch.tensor.mean operator on GPU
#8742 commented on Jul 7, 2025 • 0 new comments
Symbolic shape inference error for loop node & seq(tensor)
#8755 commented on Jul 7, 2025 • 0 new comments
onnxruntime Jetson tx2 cuda
#8771 commented on Jul 7, 2025 • 0 new comments
AttributeError: module 'onnxruntime' has no attribute 'set_default_logger_severity'
#8789 commented on Jul 7, 2025 • 0 new comments
IsNaN and Split have no double implementations
#8791 commented on Jul 7, 2025 • 0 new comments
Readily available Python wheels for ARM?
#8874 commented on Jul 7, 2025 • 0 new comments
cannot import name ‘get_all_providers‘
#8907 commented on Jul 7, 2025 • 0 new comments
TensorRT execution provider SEGFAULT
#7757 commented on Jul 7, 2025 • 0 new comments
CUDA kernel not found in registries for Op type: Pad
#7779 commented on Jul 7, 2025 • 0 new comments
ACL and ArmNN v21.02 EP has problem with GEMM
#7784 commented on Jul 7, 2025 • 0 new comments
get error when using a model with custom op
#7788 commented on Jul 7, 2025 • 0 new comments
Force fallback to CPU execution for Gather, Unsqueeze, Concat nodes - onnxruntime-gpu 1.7.0, opset 12 and 13
#7792 commented on Jul 7, 2025 • 0 new comments
How to get sparse tensor input in custom op?
#7838 commented on Jul 7, 2025 • 0 new comments
Non-zero status code returned while running FusedConv node. Name:'fused ' onnxruntime::OpKernelContext::Input Missing Input: input
#7853 commented on Jul 7, 2025 • 0 new comments
Build failure in onnxruntime/test/featurizers_ops/truncated_svdtransformer_test.cc
#7878 commented on Jul 7, 2025 • 0 new comments
onnxruntime::BroadcastIterator::Init axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 15 by 64
#7888 commented on Jul 7, 2025 • 0 new comments
undefined reference to `onnx::optimization::GetAvailablePasses() on Nvidia Jetson NX
#7970 commented on Jul 7, 2025 • 0 new comments
Runtime exception during initialization of SparkML model (One falsenode is pointing either to itself, either to another tree.)
#8008 commented on Jul 7, 2025 • 0 new comments
Running multiple input node onnx model using onnxrntime C/C++ API
#8019 commented on Jul 7, 2025 • 0 new comments
Memory leak in free-dimention model in C++
#8053 commented on Jul 7, 2025 • 0 new comments
CUDAExecutionProvider does not handle Clip on float16 tensor.
#8070 commented on Jul 7, 2025 • 0 new comments
Why ReduceSum get shape 0 for an empty input?
#8146 commented on Jul 7, 2025 • 0 new comments
System memory leak on cuda GPU backend.
#8147 commented on Jul 7, 2025 • 0 new comments
Does ONNX Runtime and its execution providers support FP16 inference?
#8173 commented on Jul 7, 2025 • 0 new comments
Reflect padding output seems incorrect when padding size larger than input dimension
#8265 commented on Jul 7, 2025 • 0 new comments
ai.onnxruntime.OrtException: Error code - ORT_FAIL - message: OrtSessionOptionsAppendExecutionProvider_Cuda: Failed to load shared library
#8283 commented on Jul 7, 2025 • 0 new comments
Eigen::ThreadPoolInterface*, const onnxruntime::ThreadOptions&) pthread_setaffinity_np failed
#8313 commented on Jul 7, 2025 • 0 new comments
yolov5 with the compiled onnxruntime by self，but is so slow, not with the GPU
#9689 commented on Jul 7, 2025 • 0 new comments
Unable to load shared library 'onnxruntime' on MacOS (DllNotFoundException)
#9707 commented on Jul 7, 2025 • 0 new comments
Support for int64 with webgl backend of the web runtime
#9724 commented on Jul 7, 2025 • 0 new comments
ouput of onnx model with custom op in the loop structrue is confusing
#9742 commented on Jul 7, 2025 • 0 new comments
How to build for multiple execution provider?
#9756 commented on Jul 7, 2025 • 0 new comments
Inference is slower when running inside Docker
#9767 commented on Jul 7, 2025 • 0 new comments
[ONNXRuntimeError] : 1 : FAIL : Fatal error: test_custom is not a registered function/op
#9831 commented on Jul 7, 2025 • 0 new comments
non-NEON Compatibility
#9849 commented on Jul 7, 2025 • 0 new comments
Yolov5 ORT train failed with onnxruntime backend
#9936 commented on Jul 7, 2025 • 0 new comments
Support for pip wheel tensorrt
#9986 commented on Jul 7, 2025 • 0 new comments
question about warnup long time
#10017 commented on Jul 7, 2025 • 0 new comments
Importing onnxruntime on AWS Lambdas with ARM64 processor causes crash
#10038 commented on Jul 7, 2025 • 0 new comments
how to forward with a batch images, oncetime?
#10071 commented on Jul 7, 2025 • 0 new comments
when my models input size is 3808, then i forward with yolov5, the memry is break.
#10074 commented on Jul 7, 2025 • 0 new comments
Same Pad_Head value in ORT for SAME_UPPER/SAME_LOWER if get negative odd pad value
#10086 commented on Jul 7, 2025 • 0 new comments
onnxruntime latest version segment fault
#10113 commented on Jul 7, 2025 • 0 new comments
ORTModule import error : with onnxruntime
#10127 commented on Jul 7, 2025 • 0 new comments
BatchNorm fails on CUDA EP with zero length sequences
#10128 commented on Jul 7, 2025 • 0 new comments
Do you have any plan to add 'Round' Operator for gradient builder registry for orttrainer?
#10138 commented on Jul 7, 2025 • 0 new comments
Performance question about some nodes generated by dynamic quantization
#10153 commented on Jul 7, 2025 • 0 new comments
Runetime Error: Decoder with dynamic axes does not work with Encoder output
#8910 commented on Jul 7, 2025 • 0 new comments
How to get the value of tensors in subgraph?
#8929 commented on Jul 7, 2025 • 0 new comments
The model run time become longer when i update the onnxruntime from version 1.7 to version 1.8
#8938 commented on Jul 7, 2025 • 0 new comments
ONNX inference result are different to pytorch model
#8977 commented on Jul 7, 2025 • 0 new comments
Type error when runs an control flow model in ORT
#8999 commented on Jul 7, 2025 • 0 new comments
[E:onnxruntime:, sequential_executor.cc:339 Execute] Non-zero status code returned while running Transpose node. Name:'model/unet3d_segmentation/conv3d_12/Conv3D__165' Status Message: CUDA error cudaErrorInvalidConfiguration:invalid configuration argument
#9083 commented on Jul 7, 2025 • 0 new comments
cross compile but onnx-ml.pb.cc error
#9093 commented on Jul 7, 2025 • 0 new comments
how to input 'None' in cpp-version
#9121 commented on Jul 7, 2025 • 0 new comments
InferenceSession.run in python is inconsistent in terms of performance
#9208 commented on Jul 7, 2025 • 0 new comments
Can't load Cuda Provider on Linux due symbol lookup error
#9309 commented on Jul 7, 2025 • 0 new comments
ONNXRuntime CPU - Memory spiking continuously (Memory leak)
#9313 commented on Jul 7, 2025 • 0 new comments
error: '_Frees_ptr_opt_' has not been declared
#9332 commented on Jul 7, 2025 • 0 new comments
QLinearConv per-channel result is wrong and it's seem overflow when input is big for my model
#9365 commented on Jul 7, 2025 • 0 new comments
ORT execution fails when a gradient builder is not registered for module-local functions
#9375 commented on Jul 7, 2025 • 0 new comments
Relu getting dropped during quantization
#9425 commented on Jul 7, 2025 • 0 new comments
OnnxRuntime Build Failure in Docker
#9530 commented on Jul 7, 2025 • 0 new comments
YAMNet model running on CudaExecutionProvider is 3x slower than running on tensorflow
#9657 commented on Jul 7, 2025 • 0 new comments
Gap in inference time between onnxruntime and torch vanishes when increasing the batch size
#9660 commented on Jul 7, 2025 • 0 new comments
libonnxruntime.so crash
#9684 commented on Jul 7, 2025 • 0 new comments
[ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for Conv(1) node with name 'Conv_0'
#9685 commented on Jul 7, 2025 • 0 new comments
[Build] ModuleNotFoundError: No module named 'onnxruntime'
#18966 commented on Jul 7, 2025 • 0 new comments
Error with finding onnxruntime_binding.node on Windows 10 on a bootcamp Macbook
#18971 commented on Jul 7, 2025 • 0 new comments
How to observe arena allocator memory request metrics
#18972 commented on Jul 7, 2025 • 0 new comments
Could not load library cudnn_cnn_infer64_8.dll. Error code 127
#18973 commented on Jul 7, 2025 • 0 new comments
[Build] Failure with OneDNN on Intel MacOS
#18976 commented on Jul 7, 2025 • 0 new comments
Cannot quantize yolov5 float to int8 onnx model
#18987 commented on Jul 7, 2025 • 0 new comments
Encounter unknown exception in initialize using Openvino EP
#19004 commented on Jul 7, 2025 • 0 new comments
ONNX Runtime inference on string input
#19006 commented on Jul 7, 2025 • 0 new comments
[Error: Exception in HostFunction: <unknown>] while running ort models in react-native
#19021 commented on Jul 7, 2025 • 0 new comments
[Performance] It is not possible to use a discrete graphics card with DML.
#19025 commented on Jul 7, 2025 • 0 new comments
[Build] deploying the EfficientAD anomaly detection algorithm, an error occurred while executing the "Run" command
#19030 commented on Jul 7, 2025 • 0 new comments
Freeing tensor data created via CreateTensor
#19034 commented on Jul 7, 2025 • 0 new comments
[Build] Linux x86_64 STATIC Build
#19035 commented on Jul 7, 2025 • 0 new comments
cudaMemcpyAsync throws exception in GPUDataTransfer
#19076 commented on Jul 7, 2025 • 0 new comments
[Training] On device training doesn't work with INT8 Models
#19078 commented on Jul 7, 2025 • 0 new comments
[Performance] The CUDA Stream cannot be set through Python API
#19094 commented on Jul 7, 2025 • 0 new comments
Longformer `convert_to_onnx.py` not working due to missing imports
#19149 commented on Jul 7, 2025 • 0 new comments
[Performance] Why run first inference so slow, although run one time in initialzation?
#19177 commented on Jul 7, 2025 • 0 new comments
ORT 1.17.0 Release Candidates available for testing
#19236 commented on Jul 7, 2025 • 0 new comments
Missprinted condition: head_size != num_heads * head_size
#18675 commented on Jul 7, 2025 • 0 new comments
Parallel inference of multiple models in different threads
#18806 commented on Jul 7, 2025 • 0 new comments
Onnxruntime using OpenVINO as execution provider encountered Exception during initialization problem on model candy.onnx
#18825 commented on Jul 7, 2025 • 0 new comments
[Performance] Java API lacks functionality to control allocator settings.
#18845 commented on Jul 7, 2025 • 0 new comments
[dynamo_export] starts_.size() == ends_.size() + 1 was false. No matching 'start' entry.
#18863 commented on Jul 7, 2025 • 0 new comments
[dynamo_export] MLFloat16 data type is not supported with ScatterElements opset 18 when reduction is 'max'.
#18864 commented on Jul 7, 2025 • 0 new comments
[Web] Non-zero status code returned while running Slice node `webgpu`
#18892 commented on Jul 7, 2025 • 0 new comments
compute_range not available
#18893 commented on Jul 7, 2025 • 0 new comments
the resout of onnx and trt engine is different?why?
#18902 commented on Jul 7, 2025 • 0 new comments
SafeIntOnOverflow() Integer overflow error when inferencing on too many samples with Python
#18905 commented on Jul 7, 2025 • 0 new comments
error 126 Onnx in ComfyUI[Performance] O
#18925 commented on Jul 7, 2025 • 0 new comments
ai.onnxruntime.OrtException: Unsupported type - FLOAT16
#18926 commented on Jul 7, 2025 • 0 new comments
How to use multiple inputs of different types in C++ session
#18932 commented on Jul 7, 2025 • 0 new comments
[Web] onnxruntime-web is not work in nodejs
#18933 commented on Jul 7, 2025 • 0 new comments
[Web] no available backend found. ERR: [wasm] TypeError: _ is not a function, [cpu] Error: previous call to 'initializeWebAssembly()' failed., [xnnpack] Error: previous call to 'initializeWebAssembly()' failed
#18938 commented on Jul 7, 2025 • 0 new comments
C# I need to run the program on NPU （OnnxRuntime + DirectML + NPU），but it failed
#19846 commented on Jul 7, 2025 • 0 new comments
How to set `trt_profile_min_shapes` for inputs with name containing colons?
#18939 commented on Jul 7, 2025 • 0 new comments
OP (Conv) inference results mismatch with PyTorch
#18946 commented on Jul 7, 2025 • 0 new comments
[Build] How to build onnxruntime with openvino statically?
#18950 commented on Jul 7, 2025 • 0 new comments
[Performance] 2x Regression in 1st Inference time cost
#18957 commented on Jul 7, 2025 • 0 new comments
[iOS] Output of type sequence<map<int64,float32>> causes crash on iOS
#19867 commented on Jul 7, 2025 • 0 new comments
[Build] Where is official build for Unity?
#19964 commented on Jul 7, 2025 • 0 new comments
[BUG] [OpenVino EP] Only first result in session is correct.
#19975 commented on Jul 7, 2025 • 0 new comments
Onnx Runtime EntryPointNotFoundException: OrtGetApiBase in Unity Application.
#20048 commented on Jul 7, 2025 • 0 new comments
Layer not supported in one provider (Tensorrt) not working with second provider (CUDA) in an inference problem.
#20058 commented on Jul 7, 2025 • 0 new comments
[Performance] Inference failed or unsupported using quantize_dynamic
#20060 commented on Jul 7, 2025 • 0 new comments
openvino with int8
#20072 commented on Jul 7, 2025 • 0 new comments
Unpredictable onnxruntime-node crash when using Electron
#20084 commented on Jul 7, 2025 • 0 new comments
In Aquatic mode links text “PyTorch and Hugging face” is not clearly visible: A11y_WCP URLs - ONNX Runtime_Home_Learn more about how to use ONNX Runtime with_usability
#20150 commented on Jul 7, 2025 • 0 new comments
multiple tests fail on Windows due to `ORT_ENABLE_STREAM` define logic error
#20180 commented on Jul 7, 2025 • 0 new comments
`convert_float_to_float16` results in `failed in shape inference <class 'Exception'>`
#20189 commented on Jul 7, 2025 • 0 new comments
Whether CUDA12.4 and cudnn9 matches onnxruntime-win-x64-cuda12-1.17.1
#20223 commented on Jul 7, 2025 • 0 new comments
[Training] Can we use ORTModule for inference?
#20281 commented on Jul 7, 2025 • 0 new comments
C API Seg Fault from OrtGetApiBase()->GetApi(ORT_API_VERSION);
#20283 commented on Jul 7, 2025 • 0 new comments
[Performance] ScatterND / GridSample operators are on CPU instead of GPU / CUDA
#20297 commented on Jul 7, 2025 • 0 new comments
DirectML returning empty result with ObjectDetection (Mobilinet V2 FPN Keras)
#20386 commented on Jul 7, 2025 • 0 new comments
[Build] Cmake install debug and release configuration
#20387 commented on Jul 7, 2025 • 0 new comments
[Performance] Profiling on CUDA shows confusing values
#20398 commented on Jul 7, 2025 • 0 new comments
[Performance] Massive Performance slowdown from v1.13.1 -> 1.14.0
#20400 commented on Jul 7, 2025 • 0 new comments
onnxruntime 1.17.3 is missing from cuda 12 artifacts feed
#20409 commented on Jul 7, 2025 • 0 new comments
`shape_inference.quant_pre_process` causes `AttributeError: module 'onnx.helper' has no attribute 'make_sequence_value_info'`
#19323 commented on Jul 7, 2025 • 0 new comments
[Training] How to update running_mean and running_var of BatchNormalization during training
#19370 commented on Jul 7, 2025 • 0 new comments
[Performance] In ONNX Runtime, the CPU consumption does not scale linearly with the number of threads
#19384 commented on Jul 7, 2025 • 0 new comments
Backwards convolution layers in CUDA provider should heed
#19391 commented on Jul 7, 2025 • 0 new comments
InferenceSession.run does not validate rank of scalar inputs
#19434 commented on Jul 7, 2025 • 0 new comments
[Web] Memory Access Out of Bounds Error When Using ONNX Runtime Web Inference in NPM Package (wasm)
#19443 commented on Jul 7, 2025 • 0 new comments
[Performance] CPU inference much slower from GPU runtime
#19451 commented on Jul 7, 2025 • 0 new comments
[On-device Training] Yolo custom loss
#19464 commented on Jul 7, 2025 • 0 new comments
[Performance]
#19479 commented on Jul 7, 2025 • 0 new comments
Errors about using c# and TensorRT
#19489 commented on Jul 7, 2025 • 0 new comments
Accuracy drops a lot when using fp16 with TensorRT EP
#19492 commented on Jul 7, 2025 • 0 new comments
quantize_dynamic : nodes_to_quantize(Gemm) is ignored
#19503 commented on Jul 7, 2025 • 0 new comments
ONNX Runtime OpenVINO EP is way behind
#19688 commented on Jul 7, 2025 • 0 new comments
Observed TDR on a low-end system
#19724 commented on Jul 7, 2025 • 0 new comments
Inconsistent Prediction Outputs for Onnx Model
#19834 commented on Jul 7, 2025 • 0 new comments
import InferenceSesseion and capi._pybind_state.
#19836 commented on Jul 7, 2025 • 0 new comments
[Performance] onnxruntime 1.17.1 version doesnt support CUDA 12.4
#19839 commented on Jul 7, 2025 • 0 new comments
[Performance] Accuracy dropped heavily using onnxruntime to inference a model quantized by QAT
#19850 commented on Jul 7, 2025 • 0 new comments
[Mobile] android prod crash: signal 11 (SIGSEGV), code 1 (SEGV_MAPERR)
#20828 commented on Jul 7, 2025 • 0 new comments
Inference speed problem even if using a high-end Hardware.
#19865 commented on Jul 7, 2025 • 0 new comments
[Performance] Data size of Batch Normalization using cuDNN in inference.
#17406 commented on Jul 7, 2025 • 0 new comments
Yolov8 Static Quantization
#17410 commented on Jul 7, 2025 • 0 new comments
CUDA Stream and Synchronous in custom operato
#17412 commented on Jul 7, 2025 • 0 new comments
[Performance] How much memory it needs to load a 3.4 GB model to GPU through DirectML?
#17413 commented on Jul 7, 2025 • 0 new comments
valgrind memcpy_chk overlap onnxruntime1.15.1
#17431 commented on Jul 7, 2025 • 0 new comments
Extract node info
#17444 commented on Jul 7, 2025 • 0 new comments
[Bug] FP16 conversion yields an unusable model
#17447 commented on Jul 7, 2025 • 0 new comments
[Mobile iOS] Run fp16 onnx model on CoreML EP
#17448 commented on Jul 7, 2025 • 0 new comments
C++ API, Memory Leak instantiating Ort::Sessions
#17451 commented on Jul 7, 2025 • 0 new comments
Failure with OpenvinoEP within ORT
#17499 commented on Jul 7, 2025 • 0 new comments
Resize of doesn't work well while the coordinate_transformation_mode is 'align_corners'.
#17564 commented on Jul 7, 2025 • 0 new comments
Inference speed of Quantized model not increased after static Quantization[Performance]
#17634 commented on Jul 7, 2025 • 0 new comments
DML EP One session but called in different threads. [Performance]
#17686 commented on Jul 7, 2025 • 0 new comments
SkipLayerNormFusion -- High Output Difference Between PyTorch and ONNX Runtime with Extended Optimizations
#17689 commented on Jul 7, 2025 • 0 new comments
[Web]
#17700 commented on Jul 7, 2025 • 0 new comments
[Mobile | iOS] I got "Unknown exception" error.
#17731 commented on Jul 7, 2025 • 0 new comments
[Web] Custom build packages
#17743 commented on Jul 7, 2025 • 0 new comments
[web] following-up work items for supporting uniform buffers
#17860 commented on Jul 7, 2025 • 0 new comments
[Web] Declaration is not emitted in onnxruntime-node package
#17979 commented on Jul 7, 2025 • 0 new comments
[Build] Why does TensorRT EP need the full version of protobuf?
#18040 commented on Jul 7, 2025 • 0 new comments
An error occurred when I used the TensorrtExecutionProvider in onnx runtime
#17047 commented on Jul 7, 2025 • 0 new comments
[Web] Cannot Convert to RGB when using Tensor.fromImage(image,{tensorFormat:'RGB'})
#17094 commented on Jul 7, 2025 • 0 new comments
[Performance] Pytorch Model converted to ONNX with CUDAProvider run slower 3x time than Using Pytorch with GPU
#17116 commented on Jul 7, 2025 • 0 new comments
How to release onnxruntime gpu memory
#17142 commented on Jul 7, 2025 • 0 new comments
[TOOLS]：Using transformers.optimizer optimize large model, segmentation fault (core dumped)
#17212 commented on Jul 7, 2025 • 0 new comments
Onnx model inference Fatal error: ai.onnx.contib:bev_pool_v2(-1) is not a registered function/op
#17214 commented on Jul 7, 2025 • 0 new comments
[C#] Invalid input name error
#17244 commented on Jul 7, 2025 • 0 new comments
AssertionError on num_heads > 0 for bert with specific optimization config
#17254 commented on Jul 7, 2025 • 0 new comments
windows10 x86 x64 inference time varies greatly
#17256 commented on Jul 7, 2025 • 0 new comments
[Performance] Operators assigned to CPU instead of CUDA, CPU thread management problem
#17268 commented on Jul 7, 2025 • 0 new comments
[Web] Error: no available backend found. ERR: [wasm] TypeError: Failed to parse URL from
#17274 commented on Jul 7, 2025 • 0 new comments
[Build] Error: cpuid.h: No such file or directory when cross-compiling ORT 1.15.1 with NNAPI for arm64
#17283 commented on Jul 7, 2025 • 0 new comments
Freeing heap block containing an active critical section
#17345 commented on Jul 7, 2025 • 0 new comments
[Performance] 3X slower inference on onnxruntime than pytorch(huggingface)
#17366 commented on Jul 7, 2025 • 0 new comments
[Performance] Memcpy leads to AllocationError for argmax
#17371 commented on Jul 7, 2025 • 0 new comments
[web/js] need for more methods on tensor object
#17372 commented on Jul 7, 2025 • 0 new comments
[Performance] Quantized model inference on CPU slower/same as FP32
#17389 commented on Jul 7, 2025 • 0 new comments
Default `tensorFormat` should RGBA for HTMLImageElement variant
#17395 commented on Jul 7, 2025 • 0 new comments
[Build] windows dll compilation error with versions above 1.14.0
#17404 commented on Jul 7, 2025 • 0 new comments
[Web] Add binary/where broadcast case when FXC issue got fixed in tint
#17405 commented on Jul 7, 2025 • 0 new comments
[Build]
#18570 commented on Jul 7, 2025 • 0 new comments
# Issue with Rounding Behavior in onnxruntime's Quantizelinear Layer
#18576 commented on Jul 7, 2025 • 0 new comments
Session Run throws an access violation exception when I recreate the session
#18578 commented on Jul 7, 2025 • 0 new comments
[Node.js] Support for loading models with external data in `onnxruntime-node`
#18586 commented on Jul 7, 2025 • 0 new comments
Cuda EP does not compute reduce with empty set correctly?
#18588 commented on Jul 7, 2025 • 0 new comments
[Mobile] Model with large input size cause Segmentation Fault while session->run()
#18595 commented on Jul 7, 2025 • 0 new comments
Session initialization stuck/crash in DMLCreateDevice while using DirectML EP
#18599 commented on Jul 7, 2025 • 0 new comments
Profiling multithreaded runs
#18600 commented on Jul 7, 2025 • 0 new comments
Segmentation Fault when some of node outputs is empty
#18601 commented on Jul 7, 2025 • 0 new comments
What is the recommended setup for running multiple models/sessions in parallel in C++?
#18610 commented on Jul 7, 2025 • 0 new comments
DirectML Resize Node error.
#18613 commented on Jul 7, 2025 • 0 new comments
[Build]
#18617 commented on Jul 7, 2025 • 0 new comments
Could not find an implementation for SkipGroupNorm(1) node with name 'SkipGroupNorm_0'
#18623 commented on Jul 7, 2025 • 0 new comments
Crash in ResizeHelper::Initialize executing a model on ARM64
#18628 commented on Jul 7, 2025 • 0 new comments
ONNXRuntime Segmentation Fault Crash on Inference (iOS and Mac)
#18632 commented on Jul 7, 2025 • 0 new comments
[Performance] dynamic batch infer cost time question
#18639 commented on Jul 7, 2025 • 0 new comments
ORT memory error with the graph from linspace
#18648 commented on Jul 7, 2025 • 0 new comments
Are there any benchmark tools for onnx mobile like Tensorflow Lite?
#18664 commented on Jul 7, 2025 • 0 new comments
Different results of consecutive runs for same input
#18672 commented on Jul 7, 2025 • 0 new comments
Strange condition size_t channel_rindex = is_nchw ? 2 : 2;
#18674 commented on Jul 7, 2025 • 0 new comments
[Web] Which node.js version is supposed to be supported?
#18078 commented on Jul 7, 2025 • 0 new comments
Microsoft.ML.OnnxRuntime.OpenVino Encountered unknown exception in Initialize
#18152 commented on Jul 7, 2025 • 0 new comments
ORT bug in Col2Im CPU 3D cases
#18156 commented on Jul 7, 2025 • 0 new comments
[Mobile|Android] Fatal error: ai.onnx.contrib:SentencepieceTokenizer(-1) is not a registered function/op
#18226 commented on Jul 7, 2025 • 0 new comments
The onnx.helper make_function command strips type information leading to inference errors
#18264 commented on Jul 7, 2025 • 0 new comments
[Web] onnxruntime-web and onnxruntime-node return different results for LSTM model
#18335 commented on Jul 7, 2025 • 0 new comments
[Performance] the speed with SetIntraOpNumThreads(1),SetIntraOpNumThreads(4),SetInterOpNumThreads(1),SetInterOpNumThreads(4)
#18385 commented on Jul 7, 2025 • 0 new comments
[Performance] Does `com.microsoft.Attention` use FlashAttention-2?
#18474 commented on Jul 7, 2025 • 0 new comments
Add ORT Extensions to Java and build with Gradle
#18503 commented on Jul 7, 2025 • 0 new comments
Model Run Session wasting time[Performance]
#18510 commented on Jul 7, 2025 • 0 new comments
Is there any way to convert a qdqmodel to qlinearmodel use ort?
#18511 commented on Jul 7, 2025 • 0 new comments
[Training] qat
#18534 commented on Jul 7, 2025 • 0 new comments
[Performance] GPU op placement control when some ops must be on the CPU
#23154 commented on Jul 7, 2025 • 0 new comments
[Build] manylinux_2_28 support
#18537 commented on Jul 7, 2025 • 0 new comments
RunAsync C# API crashes without any error
#19140 commented on Jul 7, 2025 • 0 new comments
[Build] TRT EP cannot be built without CUDA EP
#18542 commented on Jul 7, 2025 • 0 new comments
[Build] 1.20.2 Microsoft.ML.OnnxRuntime.Managed nuget package needs Microsoft.ML.OnnxRuntime 1.20.2 which is not available
#23640 commented on Jul 7, 2025 • 0 new comments
Call Session class method name Run failed,don't know why
#18548 commented on Jul 7, 2025 • 0 new comments
Does the computation order affect the computation result?
#18564 commented on Jul 7, 2025 • 0 new comments
[Web] How could I get the shape of the output tensor?
#18568 commented on Jul 7, 2025 • 0 new comments
[Build] Building for Mac Catalyst Fails When Installed Via Cocoapods
#23307 commented on Jul 7, 2025 • 0 new comments
Using separate cuda streams for one session
#23319 commented on Jul 7, 2025 • 0 new comments
[Performance] Max operator became 4.5X slower after Fixing NaN propagation for float16 min and max operators.
#23337 commented on Jul 7, 2025 • 0 new comments
memory.enable_memory_arena_shrinkage is not working in python
#23339 commented on Jul 7, 2025 • 0 new comments
Issue loading custom ONNX model with complex-valued operations in ONNX Runtime (C++)
#23341 commented on Jul 7, 2025 • 0 new comments
Memory creeping up
#23348 commented on Jul 7, 2025 • 0 new comments
No speedup from float16 with directml compared to cuda
#23359 commented on Jul 7, 2025 • 0 new comments
[Build] Possibly unintentional or misconfigured dependencies for QNN EP in onnxruntime_python.cmake
#23360 commented on Jul 7, 2025 • 0 new comments
[Performance] GPU Fallback to CPU Without Error When CUDA DLLs Are Missing
#23372 commented on Jul 7, 2025 • 0 new comments
[Performance] 40% slowdown in ONNX Resize Operator on CPU
#23391 commented on Jul 7, 2025 • 0 new comments
[Performance] Round node shows huge performance drop on Windows
#23430 commented on Jul 7, 2025 • 0 new comments
debug result is ok, release get NaN output
#23440 commented on Jul 7, 2025 • 0 new comments
[QUESTION]: onnxruntime with onednn backend
#23543 commented on Jul 7, 2025 • 0 new comments
[Performance] Speed-up TensorRT engine compilation
#23546 commented on Jul 7, 2025 • 0 new comments
Custom operators is not a registered function/op (python)
#23566 commented on Jul 7, 2025 • 0 new comments
[Performance] ORT-WebGPU Average Pooling is working too long in edge case
#23614 commented on Jul 7, 2025 • 0 new comments
TensorRT Provider "Attribute reduction is not supported"
#23618 commented on Jul 7, 2025 • 0 new comments
session.disable_fallback() has no effect, it always fallback to cpu
#23647 commented on Jul 7, 2025 • 0 new comments
[Build] CMake Error at onnxruntime_unittests.cmake:1026 (find_path): Could not find onnx_SOURCE_DIR using the following files: onnx/onnx-ml.proto3, onnx/onnx-ml.proto Call Stack (most recent call first): CMakeLists.txt:1789 (include)
#23684 commented on Jul 7, 2025 • 0 new comments
OnnxRuntimeGenAIException: CUDA execution provider is not enabled in this build.
#23715 commented on Jul 7, 2025 • 0 new comments
[Mobile] Error: Can't load a model: Error Code - ORT_INVALID_PROTOBUF
#22927 commented on Jul 7, 2025 • 0 new comments
[Training] RuntimeError: gradient_builder_base.h:123 onnxruntime::training::ArgDef onnxruntime::training::GradientBuilderBase::O(size_t, bool) const i < node_->OutputDefs().size() was false
#22955 commented on Jul 7, 2025 • 0 new comments
[WebGPU] `Kernel "[GroupQueryAttention] /model/layers.0/attn/GroupQueryAttention" failed. Error: Input "key" is expected to have 3, 4, or 5 dimensions".`
#22987 commented on Jul 7, 2025 • 0 new comments
Remove Python :: 3.7 Python :: 3.8 Python :: 3.9 from pypi metadata
#22993 commented on Jul 7, 2025 • 0 new comments
[WebGPU] `Error: [WebGPU] Kernel "[Mul] /head/istft/Mul_1" failed. Error: Failed to generate kernel's output[0] with dims [1,3520,3520]. If you are running with pre-allocated output, please make sure the output type/dims are correct. Error: 81415528.`
#22994 commented on Jul 7, 2025 • 0 new comments
To reduce the compiled binary size of ONNX Runtime at x86_64 linux with "create_reduced_build_config.py", but got a Failed to find kernel for com.microsoft.nchwc.Conv(1)
#23018 commented on Jul 7, 2025 • 0 new comments
[Build] Dotnet packages on nuget are not built with Release optimizations
#23053 commented on Jul 7, 2025 • 0 new comments
[Web] ORT format model not working on WebGPU EP + Wasm Static lib
#23072 commented on Jul 7, 2025 • 0 new comments
[Build] onnxruntime_gpu PiPy on a slow host
#23079 commented on Jul 7, 2025 • 0 new comments
Cannot resolve operator 'LSTM' with webgl backend
#23083 commented on Jul 7, 2025 • 0 new comments
[Bug][CUDAExecutionProvider] INVALID_ARGUMENT : unsupported conv activation mode "Sigmoid"
#23114 commented on Jul 7, 2025 • 0 new comments
Understanding max_mem option of OrtArenaCfg class
#23121 commented on Jul 7, 2025 • 0 new comments
[Bug] Inconsistent Results After ONNX Runtime Optimization
#23133 commented on Jul 7, 2025 • 0 new comments
Inconsistent Results After ONNX Runtime Optimization
#23142 commented on Jul 7, 2025 • 0 new comments
[Build] Better support for vcpkg
#23158 commented on Jul 7, 2025 • 0 new comments
ONNX 1.17.0 integration remaining work: fix QNN EP test failures
#23163 commented on Jul 7, 2025 • 0 new comments
Inconsistent Results After ONNX Runtime Optimization
#23199 commented on Jul 7, 2025 • 0 new comments
[Inference Error] The onnx inference result is inconsistent with the numpy inference result
#23202 commented on Jul 7, 2025 • 0 new comments
[Build] how to build onnxruntime with openvino EP for android
#23222 commented on Jul 7, 2025 • 0 new comments
[Build] Xcode unit tests fail with libc++abi: terminating due to uncaught exception of type onnxruntime::OnnxRuntimeException:
#23259 commented on Jul 7, 2025 • 0 new comments
[Build] Build script for MacOS fails for targets older than 13.4 because tests can not be built
#24277 commented on Jul 7, 2025 • 0 new comments
SIGSEGV when calling OrtSession.run()
#24288 commented on Jul 7, 2025 • 0 new comments
[Build] Onnxruntime v1.21.0 fails to build with GCC-13
#24290 commented on Jul 7, 2025 • 0 new comments
quantize onnx models to INT8
#24374 commented on Jul 7, 2025 • 0 new comments
[Performance] [QNN EP] Performance gap between onnxruntime QNN EP and Genie from QNN SDK.
#24417 commented on Jul 7, 2025 • 0 new comments
[Build] Python build fails because onnxruntime/capi/build_and_package_info.py is missing
#24570 commented on Jul 7, 2025 • 0 new comments
[MLAS] Plan to add RISC-V Vector (RVV) support to MLAS
#24596 commented on Jul 7, 2025 • 0 new comments
nuget package 1.21.2 causes conflicts in Solutions targeting .NET Framework 4.8
#24599 commented on Jul 7, 2025 • 0 new comments
[Mobile] Objective-C API for register onnxruntime-extensions as a custom ops library
#24613 commented on Jul 7, 2025 • 0 new comments
[DO NOT UNPIN] ORT 1.22.0 Release Candidates available for testing
#24671 commented on Jul 7, 2025 • 0 new comments
Scale in resize node becomes an identity node not a parameter inside resize node
#24824 commented on Jul 7, 2025 • 0 new comments
Import error in pytest with onnxruntime-directml 1.22.0
#24907 commented on Jul 7, 2025 • 0 new comments
[Web] Fail to link static Wasm library with WebNN EP support
#24936 commented on Jul 7, 2025 • 0 new comments
[Build] CMake Error related to onnxruntime_unittests.cmake
#24972 commented on Jul 7, 2025 • 0 new comments
[Performance] Openvino 2x slower than with OpenCV on an Intel HD Graphics 620 / 630
#25266 commented on Jul 6, 2025 • 0 new comments
onnxruntime with the CPUExecutionProvider errors out while processing the ReverseSequence operator
#24920 commented on Jul 5, 2025 • 0 new comments
[Performance] ORT takes ~11GB memory for quantizing a model of size ~1GB
#24954 commented on Jul 5, 2025 • 0 new comments
[Documentation]
#24958 commented on Jul 5, 2025 • 0 new comments
mutex issue on Mac only for release 1.21.X only
#24579 commented on Jul 4, 2025 • 0 new comments
Can not get USE_MIMALLOC activated in Windows
#25213 commented on Jul 4, 2025 • 0 new comments
terminate called after throwing an instance of 'Ort::Exception' what(): Invalid input name: serving_default_input_1:0
#23730 commented on Jul 7, 2025 • 0 new comments
Onnxruntime using OpenVINO for older version Intel UHD630
#23735 commented on Jul 7, 2025 • 0 new comments
[Build] Error building with ACL EP on aarch64 linux (Raspberry Pi 5)
#23741 commented on Jul 7, 2025 • 0 new comments
[Mobile] Onnxruntime react-native issue: [java.lang.ClassCastException: java.lang.String[][] cannot be cast to java.lang.String[]]
#23782 commented on Jul 7, 2025 • 0 new comments
[Mobile] Dynamic Shape Challenge: Enabling LLM on QNN-HTP
#23832 commented on Jul 7, 2025 • 0 new comments
[Performance] Why does inference occupy so much memory?
#23867 commented on Jul 7, 2025 • 0 new comments
The Pad operator has a calculation error in the "reflect" mode.
#23878 commented on Jul 7, 2025 • 0 new comments
Bad Allocation Error in ONNX Runtime on Windows x86 CPU When Processing Multiple Images Sequentially
#23938 commented on Jul 7, 2025 • 0 new comments
TensorRT Support for Multiple Profiles
#23965 commented on Jul 7, 2025 • 0 new comments
[Build] Unsupported AVX512-FP16 Instructions in MLAS (vcvtneeph2ps, vcvtneoph2ps)
#24025 commented on Jul 7, 2025 • 0 new comments
Application is getting crashed while creating session for the onnxruntime-qnn with QnnCpu backend option.
#24082 commented on Jul 7, 2025 • 0 new comments
ImportError: Unable to import dependency onnxruntime
#24120 commented on Jul 7, 2025 • 0 new comments
onnxruntime-mobile implementation on custom execution provider
#24135 commented on Jul 7, 2025 • 0 new comments
segmentation fault while using onnxruntime==1.21.0
#24144 commented on Jul 7, 2025 • 0 new comments
[Feature Request] A model with dynamic input and dynamic output。 will have a memory leak after inference with Openvino.
#24162 commented on Jul 7, 2025 • 0 new comments
Python Session.run_async Causes Program Exit
#24200 commented on Jul 7, 2025 • 0 new comments
OpenVINO EP not able to use CPU device
#24208 commented on Jul 7, 2025 • 0 new comments
Questions about using AMD VitisAI EP, how can i run my model on AMD NPU?
#24214 commented on Jul 7, 2025 • 0 new comments
[Build] OpenVINO ep for macOS
#24273 commented on Jul 7, 2025 • 0 new comments
[Build] Building v1.21.0: unsupported instruction 'vpdpbusds'
#24275 commented on Jul 7, 2025 • 0 new comments
[Performance] Severe performance penalty with transformer model and DirectML
#20983 commented on Jul 7, 2025 • 0 new comments
onnxruntime shape mismatch during quantization of yolov8 models
#21048 commented on Jul 7, 2025 • 0 new comments
DML cannot use device_id = 1 , run_with_iobinding failed.
#21092 commented on Jul 7, 2025 • 0 new comments
Symbolic Shape infer fails on onnx file without much logs
#21120 commented on Jul 7, 2025 • 0 new comments
[Performance] Whisper model inference results incorrect after Transformer Optimizer
#21150 commented on Jul 7, 2025 • 0 new comments
ORT 1.18.1 Release Candidates available for testing
#21173 commented on Jul 7, 2025 • 0 new comments
[Performance] Mapfile support for certain external data files is not working
#21195 commented on Jul 7, 2025 • 0 new comments
[Mobile] QNN failed to finalize QNN graph for attention layer
#21221 commented on Jul 7, 2025 • 0 new comments
TArray used for broadcast was limited to be within range [0, 8] on onnxruntime 1.16.3
#21254 commented on Jul 7, 2025 • 0 new comments
Not able to load onnx model multilingual-e5-large
#21321 commented on Jul 7, 2025 • 0 new comments
TensorRT EP's inference results are abnormal.
#21457 commented on Jul 7, 2025 • 0 new comments
[Build] Unable to build with --use_dml
#21568 commented on Jul 7, 2025 • 0 new comments
Memory leak in NPU inference after each one session.run
#21587 commented on Jul 7, 2025 • 0 new comments
[Performance]
#21635 commented on Jul 7, 2025 • 0 new comments
Quantized SeaLLM v2 Model Outputs Same as Input
#21636 commented on Jul 7, 2025 • 0 new comments
Same Model Hash Code Issue from different models
#21672 commented on Jul 7, 2025 • 0 new comments
[Bug]: Onnxruntime.CPU memoty leaks
#21723 commented on Jul 7, 2025 • 0 new comments
onnxruntime-directml import interference with sklearn
#21724 commented on Jul 7, 2025 • 0 new comments
Inferencing FP16 model using onnxruntime
#21737 commented on Jul 7, 2025 • 0 new comments
[Web] requested dist/*.mjs files for cdnjs
#21785 commented on Jul 7, 2025 • 0 new comments
Dockerfile does not work
#20458 commented on Jul 7, 2025 • 0 new comments
[Build] cross-compiling onnxruntime for arm32 and onnxruntime_ENABLE_CPUINFO not working.
#20461 commented on Jul 7, 2025 • 0 new comments
RUNTIME_EXCEPTION, 80070057 The parameter is incorrect in v1.17.3
#20464 commented on Jul 7, 2025 • 0 new comments
[Build] cmake duplicate target "memory" between abseil and xnnpack
#20469 commented on Jul 7, 2025 • 0 new comments
[Build] Error when load pf16 model
#20570 commented on Jul 7, 2025 • 0 new comments
DirectML Exception 80070057 "The parameter is incorrect"
#20575 commented on Jul 7, 2025 • 0 new comments
windows系统，Java中使用onnxruntime进行压测，cpu飙升很快，一直100%
#20593 commented on Jul 7, 2025 • 0 new comments
Missing dll cudnn_ops_infer64_8.dll does not generate a python error
#20605 commented on Jul 7, 2025 • 0 new comments
[BUG] Running operations over concat output rewrites it's values
#20606 commented on Jul 7, 2025 • 0 new comments
[Discussion] ORT GPU binaries do not contain DML
#20638 commented on Jul 7, 2025 • 0 new comments
[Build] TVM EP Build
#20665 commented on Jul 7, 2025 • 0 new comments
LayerNormalization doesnt' work as expected on Mac
#20676 commented on Jul 7, 2025 • 0 new comments
User-provided session logging function is not used for every log
#20680 commented on Jul 7, 2025 • 0 new comments
Broken multithreading inference session Onnxruntime-directml >= 1.18
#20713 commented on Jul 7, 2025 • 0 new comments
Windows ARM64 & X64 CLIP Image Encoder different results
#20722 commented on Jul 7, 2025 • 0 new comments
[Build] quantization unittest failed when run all tests
#20821 commented on Jul 7, 2025 • 0 new comments
[.NET] Update tensor implementations to new Tensor<T> type
#20874 commented on Jul 7, 2025 • 0 new comments
Java CreateTensor with NIO ByteBuffer for reuse purpose
#20882 commented on Jul 7, 2025 • 0 new comments
[Build] how to buid on openharmony?
#20895 commented on Jul 7, 2025 • 0 new comments
Stateful/Memory models
#20943 commented on Jul 7, 2025 • 0 new comments
Upcoming ORT 1.20 Release Overview
#22274 commented on Jul 7, 2025 • 0 new comments
[Performance] High CUDA memory usage with ONNX Runtime and inconsistent memory release
#22297 commented on Jul 7, 2025 • 0 new comments
Build failure on Windows 10 using OpenVino 2024.3 & 2024.4 both.
#22314 commented on Jul 7, 2025 • 0 new comments
`quant_pre_process SymbolicShapeInference` causes AttributeError: 'NoneType' object has no attribute 'HasField' when the model has a Constant node.
#22422 commented on Jul 7, 2025 • 0 new comments
The EP_CTX_BLOB seems to have both WRITE and EXECUTABLE permissions enabled
#22437 commented on Jul 7, 2025 • 0 new comments
External data is not loaded with custom allocator
#22468 commented on Jul 7, 2025 • 0 new comments
[Performance] C++ api: destroy the execution provider if the `Ort::Session` is destroyed
#22511 commented on Jul 7, 2025 • 0 new comments
DistilBERT model inference failure using ONNX Runtime QNNExecutionProvider on Snapdragon® X Elite NPU
#22532 commented on Jul 7, 2025 • 0 new comments
Caused by: java.lang.UnsatisfiedLinkError: /tmp/onnxruntime-java18295816951647233732/libonnxruntime.so: Error relocating /tmp/onnxruntime-java18295816951647233732/libonnxruntime.so: __vsnprintf_chk: symbol not found
#22539 commented on Jul 7, 2025 • 0 new comments
[DO NOT UNPIN] ORT Nightly Package Name Change
#22541 commented on Jul 7, 2025 • 0 new comments
Negative output for sigmoid
#22557 commented on Jul 7, 2025 • 0 new comments
[Performance] Model runtime spiky with TensorRT Execution Provider
#22664 commented on Jul 7, 2025 • 0 new comments
Exception during initialization: safeint.h:17 static void SafeIntExceptionHandler<onnxruntime::OnnxRuntimeException>::SafeIntOnOverflow() Integer overflow - caused by int64 index of -1?
#22694 commented on Jul 7, 2025 • 0 new comments
FP16 ONNX model outputs NaN after the first successful execution
#22723 commented on Jul 7, 2025 • 0 new comments
CUDA providers failed to build against 12.6 with error error #221-D
#22728 commented on Jul 7, 2025 • 0 new comments
why force max_length <= kMaxSequenceLength in beam_search_parameters.cc ?
#22735 commented on Jul 7, 2025 • 0 new comments
[TensorRT EP] How can I disable generating cache when using trt execution provider
#22822 commented on Jul 7, 2025 • 0 new comments
[Dev] "./onnxruntime_test_all --help" gives segmentation fault
#22838 commented on Jul 7, 2025 • 0 new comments
how to release gpu memory when use onnxruntime with fastapi
#22899 commented on Jul 7, 2025 • 0 new comments
[Performance] Binary operators using SSE on AVX systems
#22905 commented on Jul 7, 2025 • 0 new comments
run_async not running asynchronously
#21791 commented on Jul 7, 2025 • 0 new comments
[Bug] [onnxruntime-node] Error: no available backend found. ERR: [wasm] backend not found.
#21813 commented on Jul 7, 2025 • 0 new comments
Error when trying to run vision model onnx
#21869 commented on Jul 7, 2025 • 0 new comments
[Build] “onnxruntime_cxx_api.h”: No such file or directory
#21891 commented on Jul 7, 2025 • 0 new comments
Snapdragon X processor is unsupported
#21947 commented on Jul 7, 2025 • 0 new comments
[Mobile] IOS library crashes in Release configuration
#21960 commented on Jul 7, 2025 • 0 new comments
[Web] Uncaught WebGPU validation error on Snapdragon SM8450 but works on SM8250
#21970 commented on Jul 7, 2025 • 0 new comments
[Web] no available backend found [wasm] when importing `onnxruntime-web/wasm`
#22010 commented on Jul 7, 2025 • 0 new comments
[Build] onnxruntime-openvino library does not have python3.12 support
#22015 commented on Jul 7, 2025 • 0 new comments
onnxruntime-gpu(1.18.0) can not be install
#22028 commented on Jul 7, 2025 • 0 new comments
[Training] Implicit dependency of Python training API on 'torch' package
#22070 commented on Jul 7, 2025 • 0 new comments
GetElementType is not implemented after updating onnxruntime
#22075 commented on Jul 7, 2025 • 0 new comments
[Web] Error when using Web Workers on Next.js
#22113 commented on Jul 7, 2025 • 0 new comments
[Question or BUG] ONNX Runtime CUDA Sessions in Unity Produce Empty Outputs When Running Multiple Models Sequentially on a Single Graphic Card
#22146 commented on Jul 7, 2025 • 0 new comments
Warnings displayed as errors during TensorRT optimization.
#22164 commented on Jul 7, 2025 • 0 new comments
trt_weight_stripped_engine_enable does not work for all networks/size ranges.
#22165 commented on Jul 7, 2025 • 0 new comments
trt_weight_stripped_engine_enable does not work together with trt_dump_ep_context_model
#22179 commented on Jul 7, 2025 • 0 new comments
Filenames in OrtTensorRTProviderOptionsV2 should be std::filesystem::path or at least const ORTCHAR_T*
#22182 commented on Jul 7, 2025 • 0 new comments
[CANN] When using onnxruntime-cann for inference, it failed to utilize the NPU for inference
#22229 commented on Jul 7, 2025 • 0 new comments
[Performance] fp16 support and performance
#22242 commented on Jul 7, 2025 • 0 new comments
Basic Optimizer adds non-standard ONNX ops for roi_align
#14753 commented on Jul 7, 2025 • 0 new comments
Basic Optimizer adds non-standard ONNX ops for input tensor
#14754 commented on Jul 7, 2025 • 0 new comments
[Build] cmake install when --use_xnnpack is broken
#14757 commented on Jul 7, 2025 • 0 new comments
Failed to build CUDA docker image[Build]
#14765 commented on Jul 7, 2025 • 0 new comments
`onnx.checker.check_model` raises `Bad node spec` for custom nodes created from ORT `optimize_model`
#14768 commented on Jul 7, 2025 • 0 new comments
Dependency Problem (java onnxruntime)
#14787 commented on Jul 7, 2025 • 0 new comments
[Build] Can't access OrtSessionOptionsAppendExecutionProvider_Dnnl while using oneDNN
#14799 commented on Jul 7, 2025 • 0 new comments
[Build] Dockerfile.arm64 build fails
#14801 commented on Jul 7, 2025 • 0 new comments
[Build] Unable to load TensorRT Execution Provider
#14802 commented on Jul 7, 2025 • 0 new comments
Read access violation under OnnxRuntimeCpuSessionBuilder::Initialize during WinML operator tests for function operators
#14810 commented on Jul 7, 2025 • 0 new comments
[Web] how to reduce wasm file size
#14817 commented on Jul 7, 2025 • 0 new comments
onnxruntime with CUDA not releasing about 400 MB memory after the session and environment is destroyed
#14819 commented on Jul 7, 2025 • 0 new comments
working model with Resize node becomes invalid after using convert_float_to_float16
#14827 commented on Jul 7, 2025 • 0 new comments
How do I pass a list of tensors in onnxruntime-web?
#14829 commented on Jul 7, 2025 • 0 new comments
DML EP cannot load some quantized onnx files.
#14835 commented on Jul 7, 2025 • 0 new comments
[Performance] Performance degradation while using dynamic axes
#14863 commented on Jul 7, 2025 • 0 new comments
UndefinedBehaviorSanitizer reports problem in onnxruntime_global_thread_pools_test
#14882 commented on Jul 7, 2025 • 0 new comments
[Build] Error APPX1101 - Payload contains two or more files with the same destination path 'microsoft.ai.machinelearning.dll'
#14915 commented on Jul 7, 2025 • 0 new comments
[Performance]
#14919 commented on Jul 7, 2025 • 0 new comments
Is there a Python way to get the max supported ONNX IR version from ORT package?
#14932 commented on Jul 7, 2025 • 0 new comments
[Performance] Memory grows after reloading model
#14641 commented on Jul 7, 2025 • 0 new comments
[Build] Building for C++ On Jetson Nano CUDA 10.2
#14644 commented on Jul 7, 2025 • 0 new comments
TensorRT Execution Build Fails on Jetson Jetpack 4.6.1
#14658 commented on Jul 7, 2025 • 0 new comments
DEEPFACE LIVE Issue with onnxruntime_pybind_state.
#14667 commented on Jul 7, 2025 • 0 new comments
[Build]
#14674 commented on Jul 7, 2025 • 0 new comments
Custom Operater Output Tensor Shape Error
#14683 commented on Jul 7, 2025 • 0 new comments
[Web] Inference speed halves if you open DevTools after loading an inference session, even if you close DevTools afterwards
#14692 commented on Jul 7, 2025 • 0 new comments
`CleanUnusedInitializersAndNodeArgs` warnings are printed only with subgraphs
#14694 commented on Jul 7, 2025 • 0 new comments
[Performance]why is the inference latency of onnx QDQ quantized model converted from tflite quantized model (or from tensorflow Quantization-Aware training (QAT) model) as same as normal onnx float32 model?
#14707 commented on Jul 7, 2025 • 0 new comments
A runtime can run on cuda device 0 but fail on cuda device 1
#14710 commented on Jul 7, 2025 • 0 new comments
Non-zero status code returned while running Reshape node. Name:'Reshape_7411' The input tensor cannot be reshaped to the requested shape. Input shape:{51}, requested shape:{}
#14712 commented on Jul 7, 2025 • 0 new comments
How to inference with multiple batches and multiple inputs.
#14713 commented on Jul 7, 2025 • 0 new comments
Crash in JavaGPU on Windows
#14714 commented on Jul 7, 2025 • 0 new comments
The Microsoft.ML.OnnxRuntime.Gpu nuget on Visual studio latest version 1.14.0 has a bug when running with the tensorrt on run time.
#14730 commented on Jul 7, 2025 • 0 new comments
clog_vlog_fatal[Build]
#14740 commented on Jul 7, 2025 • 0 new comments
[Performance] How to create multiple tensors with consecutive addresses when the cuda memory is not occupied?
#14742 commented on Jul 7, 2025 • 0 new comments
Memory Leak
#14745 commented on Jul 7, 2025 • 0 new comments
[Build] macOS: cross compiling arm64 on intel fails
#14746 commented on Jul 7, 2025 • 0 new comments
[Performance] Can oneDNN EP accelerate the inference time of onnxruntime on x86 machines?
#14749 commented on Jul 7, 2025 • 0 new comments
Basic Optimizer adds non-standard ONNX ops
#14752 commented on Jul 7, 2025 • 0 new comments
[bug] error while loading shared libraries: libonnxruntime.so.1.8.1: cannot open shared object file: No such file or directory
#15053 commented on Jul 7, 2025 • 0 new comments
[Performance] Inference doubles VRAM (DirectML)
#15074 commented on Jul 7, 2025 • 0 new comments
[Web] Memory spike in ORT-web leading to app crash
#15086 commented on Jul 7, 2025 • 0 new comments
onnxruntime: bfc_arena.cc:361 void* onnxruntime::BFCArena::FindChunkPtr(onnxruntime::BFCArena::BinNum, size_t, size_t) !chunk->in_use() was false.
#15087 commented on Jul 7, 2025 • 0 new comments
The dimension of incides to ScatterND op is wrong during inference.
#15095 commented on Jul 7, 2025 • 0 new comments
[Performance] onnxruntime allocates lots of cuda memory on T4
#15098 commented on Jul 7, 2025 • 0 new comments
fail build with gcc 12.x in onnxruntime/contrib_ops/cuda/quantization/qordered_ops/qordered_qdq.cc
#15111 commented on Jul 7, 2025 • 0 new comments
How to reduce GPU memory usage when inference
#15127 commented on Jul 7, 2025 • 0 new comments
descriptor_table_tensorboard_2fcompat_2fproto_2fattr_5fvalue_2eproto not declared (TRT 8.5.0)
#15131 commented on Jul 7, 2025 • 0 new comments
how to inference with fp16 precise in python code?
#15134 commented on Jul 7, 2025 • 0 new comments
NOT_IMPLEMENTED GridSample(16) on onnxruntime 1.14.1
#15137 commented on Jul 7, 2025 • 0 new comments
onnxruntime::utils::ConstantNodeProtoToTensorProto Unsupported attribute value type of 9 in 'Constant' node 'Constant_35'
#15149 commented on Jul 7, 2025 • 0 new comments
Type Error: Type 'tensor(int64)' of input parameter (relative_position) of operator (Min) in node (Min_2286) is invalid.
#15167 commented on Jul 7, 2025 • 0 new comments
inference speed is very slow when using fp16 while using fp 32 is normal
#15170 commented on Jul 7, 2025 • 0 new comments
A bug occurs when the program terminates
#15174 commented on Jul 7, 2025 • 0 new comments
[Performance] Why is the Conv + Max-Pool model faster than the Conv model using GraphOptimizationLevel::ORT::ENABLE_ALL?
#15180 commented on Jul 7, 2025 • 0 new comments
[Performance] GPT NEO: better performance of python GPT NEO than its onnx runtime version in C++?
#15191 commented on Jul 7, 2025 • 0 new comments
[Build] segfault when run unitest (ctest)
#15224 commented on Jul 7, 2025 • 0 new comments
[Build] fail to build on Windows ARM64
#15252 commented on Jul 7, 2025 • 0 new comments
[Performance] How to debug/reduce GPU utilization?
#15254 commented on Jul 7, 2025 • 0 new comments
[Performance] 3-100x regression when opset 16 or 17 is used (CUDA EP)
#14956 commented on Jul 7, 2025 • 0 new comments
[Performance] Can not release memory in gpu.
#14957 commented on Jul 7, 2025 • 0 new comments
Reuse output tensors memory that was allocated by first call to Ort::Session.Run(...)
#14960 commented on Jul 7, 2025 • 0 new comments
Compatibility between Onnx and Blazor Webassembly
#14962 commented on Jul 7, 2025 • 0 new comments
Running T5 export ONNX example leads to shape inference error
#14963 commented on Jul 7, 2025 • 0 new comments
Microsoft.ML.OnnxRuntime.Gpu not working in MAUI project
#14974 commented on Jul 7, 2025 • 0 new comments
[Build] Failed to build in docker container
#14983 commented on Jul 7, 2025 • 0 new comments
conv throws safeint exception
#14985 commented on Jul 7, 2025 • 0 new comments
Static linkage of onnx_runtime and providers library
#14986 commented on Jul 7, 2025 • 0 new comments
[Build] static assertion fails when building from source with GCC 13.0.1
#14991 commented on Jul 7, 2025 • 0 new comments
[Performance] inference problems with io_binding: unexpected shape or unexpected data type
#14998 commented on Jul 7, 2025 • 0 new comments
[Performance] TensorRT provider produces (slightly) differently named engine files for the same model between runs
#14999 commented on Jul 7, 2025 • 0 new comments
CUDA Graph Error - CUDA failure 900: operation not permitted when stream is capturing
#15002 commented on Jul 7, 2025 • 0 new comments
ONNX does not support Dirichlet distribution?
#15016 commented on Jul 7, 2025 • 0 new comments
[Build] Problems with FP16 Layernorm
#15021 commented on Jul 7, 2025 • 0 new comments
[Build] api-ms-win-core-heap-l2-1-0.dll missing on windows server 2012 R2
#15025 commented on Jul 7, 2025 • 0 new comments
onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126
#15035 commented on Jul 7, 2025 • 0 new comments
accuracy reduced with multithreaded GPU prediction
#15038 commented on Jul 7, 2025 • 0 new comments
mT5 convert to ONNX and GPU inference problems
#15042 commented on Jul 7, 2025 • 0 new comments
[Build] Cannot specify compile definitions for target "onnx" which is not built by this project.
#15051 commented on Jul 7, 2025 • 0 new comments
[Performance] CUDA EP with Strange Inference Time
#14016 commented on Jul 7, 2025 • 0 new comments
[Performance] the speed and cpu utilization with SetIntraOpNumThreads(1) and SetIntraOpNumThreads(2)
#14018 commented on Jul 7, 2025 • 0 new comments
[Performance] onnx vs pt memory usage
#14029 commented on Jul 7, 2025 • 0 new comments
[Performance] High memory use by CUDAProvider in Jetson Xavier NX(JetPack 4.4)
#14038 commented on Jul 7, 2025 • 0 new comments
java onnxruntime_providers_cuda.dll
#14047 commented on Jul 7, 2025 • 0 new comments
There is a vulnerability in torch:1.12.0,upgrade recommended
#14059 commented on Jul 7, 2025 • 0 new comments
[Build] impossible to build onnxruntime with vs2022
#14086 commented on Jul 7, 2025 • 0 new comments
[Build] core/framework/fence.h not found while build upon CANN
#14121 commented on Jul 7, 2025 • 0 new comments
300% slower on MYRIAD_FP16 when using CustomVision fp16 model
#14125 commented on Jul 7, 2025 • 0 new comments
[Training] Does the current training code support RNN model like seq2seq and Transformer and GNN model?
#14139 commented on Jul 7, 2025 • 0 new comments
[Build] Dockerfile.arm64 - No module named 'packaging' error
#14140 commented on Jul 7, 2025 • 0 new comments
CUDNN error executing cudnnConvolutionForward
#14186 commented on Jul 7, 2025 • 0 new comments
ONNXRuntime outputs numerically incorrect results for mixed precision models.
#14189 commented on Jul 7, 2025 • 0 new comments
Infer shape incorrect for Split with opset 15
#14200 commented on Jul 7, 2025 • 0 new comments
ConvTranspose2d onnxruntime and pytorch forward results are inconsistent
#14208 commented on Jul 7, 2025 • 0 new comments
`onnxruntime.quantization` does not support `.onnx` files produced by `tf2onnx.convert.from_function` with the `large_model` option set to `True`
#14213 commented on Jul 7, 2025 • 0 new comments
No module named 'onnxruntime.transformers.io_binding_helper'
#14230 commented on Jul 7, 2025 • 0 new comments
The input tensor cannot be reshaped to the requested shape.
#14237 commented on Jul 7, 2025 • 0 new comments
Valgrind: Source and destination overlap in memcpy_chk
#14254 commented on Jul 7, 2025 • 0 new comments
[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running OpenVINO-EP-subgraph_3 node. Name:'OpenVINOExecutionProvider_OpenVINO-EP-subgraph_3_0'
#14280 commented on Jul 7, 2025 • 0 new comments
Cannot run inference on Integrated Graphics with OpenVino EP using C Sharp API
#13772 commented on Jul 7, 2025 • 0 new comments
QDQ not instrumenting inputs if first operator is a SUM
#13794 commented on Jul 7, 2025 • 0 new comments
how bring my hardware backend to onnxruntime framework
#13797 commented on Jul 7, 2025 • 0 new comments
[WebGL] cannot resolve operator 'DynamicQuantizeLinear' with opsets: ai.onnx v16, ...
#13800 commented on Jul 7, 2025 • 0 new comments
[CPUExecutionProvider] PyTorch/Numpy operations following InferenceSession.run() are 50x slower compared to using dummy inputs
#13808 commented on Jul 7, 2025 • 0 new comments
hello,how to improve [Performance] in batch inference with multicore cpu
#13820 commented on Jul 7, 2025 • 0 new comments
Dynamic quantization is useless on AMD cpus(AMD EPYC 7K62 48-Core Processor)
#13872 commented on Jul 7, 2025 • 0 new comments
SSDLite 320: RuntimeException on CUDA. TopK index assert was false.
#13876 commented on Jul 7, 2025 • 0 new comments
Segmentation Faults when using TensorRT on Jetson Orin Dev Kit
#13877 commented on Jul 7, 2025 • 0 new comments
Model run with `TensorrtExecutionProvider` outputs different results compared to `CPUExecutionProvider` / `CUDAExecutionProvider` when the ONNX `Loop` operator is used
#13894 commented on Jul 7, 2025 • 0 new comments
[Web] dynamic batch size doesn't work when use webgl provider
#13909 commented on Jul 7, 2025 • 0 new comments
[Web] ort-wasm-simd.wasm can't be loaded in Electron renderer (using webpack)
#13933 commented on Jul 7, 2025 • 0 new comments
[Build] Incomplete type used in nested name specifier, Ubuntu
#13942 commented on Jul 7, 2025 • 0 new comments
Do I need to convert data to device for TensorRTExecutionProvider?
#13952 commented on Jul 7, 2025 • 0 new comments
CUDA provider gives different result with respect to CPU
#13962 commented on Jul 7, 2025 • 0 new comments
bug: onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running OpenVINO-EP-subgraph_4 node.
#13973 commented on Jul 7, 2025 • 0 new comments
Non-zero status code returned while running Resize node
#13975 commented on Jul 7, 2025 • 0 new comments
Java Problematic Frame [libonnxruntime.dylib+0x8212be] onnxruntime::DataTypeImpl::ToString(onnxruntime::DataTypeImpl const*)+0xe
#13976 commented on Jul 7, 2025 • 0 new comments
[Performance] [webgl]bad performance of webgl
#13986 commented on Jul 7, 2025 • 0 new comments
[windows7] Unable to load DLL 'onnxruntime.dll': The specified module could not be found.
#14003 commented on Jul 7, 2025 • 0 new comments
[Performance] cuda_options.arena_extend_strategy = 1 does not free memory
#14474 commented on Jul 7, 2025 • 0 new comments
[Performance] DirectML cost more memory than CPU when process the Win32(X86) program (official demo).
#14479 commented on Jul 7, 2025 • 0 new comments
[Performance] CPU Usage is too high
#14490 commented on Jul 7, 2025 • 0 new comments
[Performance] cuDNN lib mismatch let to a underutilization of GPU
#14498 commented on Jul 7, 2025 • 0 new comments
missing headers and pkgconfig files in binary packages distribution (from github releases) (linux)
#14503 commented on Jul 7, 2025 • 0 new comments
[Web] Runtime error using `onnxruntime-node` with webpack
#14505 commented on Jul 7, 2025 • 0 new comments
[Performance] Find out why the GPU memory allocated with `CUDAExecutionProvider` is much larger than the ONNX size
#14526 commented on Jul 7, 2025 • 0 new comments
Non-zero status code returned while running DnnlCustomOp2 node
#14543 commented on Jul 7, 2025 • 0 new comments
Check and modify the weights of a layer of an onnx model at runtime
#14545 commented on Jul 7, 2025 • 0 new comments
[Performance] DirectML Dynamic Axes very slow
#14550 commented on Jul 7, 2025 • 0 new comments
[BUG] FusedConv node error
#14561 commented on Jul 7, 2025 • 0 new comments
fp32 model with autocast to fp16: Shape mismatch attempting to re-use buffer
#14582 commented on Jul 7, 2025 • 0 new comments
[Build] cuda dll wrap up
#14585 commented on Jul 7, 2025 • 0 new comments
different results with onnxruntime-gpu-1.10
#14587 commented on Jul 7, 2025 • 0 new comments
[Web] currently non-1 steps is not supported for Slice
#14588 commented on Jul 7, 2025 • 0 new comments
Destroying an inference session without exiting the python process
#14590 commented on Jul 7, 2025 • 0 new comments
C# - CUDA Nuget BUG : DefaultLogger Attempt to use DefaultLogger but none has been registered.
#14593 commented on Jul 7, 2025 • 0 new comments
Onnxruntime Arm NN Ep build error.
#14611 commented on Jul 7, 2025 • 0 new comments
[Performance]
#14615 commented on Jul 7, 2025 • 0 new comments
[Build] cpp_field.h(189,47): error C2059: 语法错误:“)”
#14627 commented on Jul 7, 2025 • 0 new comments
[Build] Docker arm64 build fails.
#14283 commented on Jul 7, 2025 • 0 new comments
ONNX Runtime support for the graph optimization of bigbird_pegasus model
#14295 commented on Jul 7, 2025 • 0 new comments
TensorRT EP same inference Time of INT 8 and FP 16
#14315 commented on Jul 7, 2025 • 0 new comments
STFT op has the wrong expected shape
#14316 commented on Jul 7, 2025 • 0 new comments
[Performance] running on xavier gpu but cpu usage high
#14676 commented on Jul 7, 2025 • 0 new comments
Program will stuck when creating 'Ort::Session'
#14317 commented on Jul 7, 2025 • 0 new comments
[Performance] ONNXruntime CPU is slower than Pytorch Tracing to Torchscript on CPU
#14326 commented on Jul 7, 2025 • 0 new comments
RemoveNode Should be unreachable if CanRemoveNodeAndMergeEdges is in sync with the logic
#14360 commented on Jul 7, 2025 • 0 new comments
[Bug] Attention and QAttention don't work properly in some cases
#14363 commented on Jul 7, 2025 • 0 new comments
Add some custom QlinearXXX Ops
#14365 commented on Jul 7, 2025 • 0 new comments
[Build] Error in builiding with Tensorrt EP
#14394 commented on Jul 7, 2025 • 0 new comments
Import Error " cannot import name 'get_all_providers' "
#14395 commented on Jul 7, 2025 • 0 new comments
[Training] The gradient builder has not been registered: ReduceMin
#14412 commented on Jul 7, 2025 • 0 new comments
Free allocated data of Ort::Value in C++
#14420 commented on Jul 7, 2025 • 0 new comments
Pad operator not quantizable?
#14422 commented on Jul 7, 2025 • 0 new comments
Different Python exceptions on OOM with `run_with_iobinding` and `run`
#14438 commented on Jul 7, 2025 • 0 new comments
Modifying QlinearADD
#14441 commented on Jul 7, 2025 • 0 new comments
[ONNXRuntimeError] Unsupported OrtValue type with CUDA EP
#14457 commented on Jul 7, 2025 • 0 new comments
[Performance] There is some confusion with onnx + oneDNN or onnx + OpenVINO
#14468 commented on Jul 7, 2025 • 0 new comments
[Build]
#14471 commented on Jul 7, 2025 • 0 new comments
[ onnxruntime::SequentialExecutor::Execute] Non-zero status code returned while running FusedMatMul node. Name:'MatMul_With_Transpose_token_14_FusedMatMulAndScale' Status Message: bad allocation unknown file: error: C++ exception with description "Non-zero status code returned while running FusedMatMul node. Name:'MatMul_With_Transpose_token_14_FusedMatMulAndScale' Status Message: bad allocation" thrown in the test body.
#16305 commented on Jul 7, 2025 • 0 new comments
issue running onnxruntime with pytest
#16306 commented on Jul 7, 2025 • 0 new comments
How to catch exception OOM.
#16307 commented on Jul 7, 2025 • 0 new comments
How to edit Clip Operator in OnnxRuntime?
#16315 commented on Jul 7, 2025 • 0 new comments
get error when using libonnxruntime with dnnl EP
#16320 commented on Jul 7, 2025 • 0 new comments
InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from d2net.onnx failed:This is an invalid model. Type Error: Type 'tensor(bool)' of input parameter (onnx::Min_109) of operator (Min) in node (Min_65) is invalid.
#16321 commented on Jul 7, 2025 • 0 new comments
Increase - decrease the maximum number of events during inference profiling.
#16334 commented on Jul 7, 2025 • 0 new comments
[Build] Fails to parse FP16 LayerNormalization in opset>=18
#16341 commented on Jul 7, 2025 • 0 new comments
[Build] Disable ORT_ENABLE_STREAM build error
#16345 commented on Jul 7, 2025 • 0 new comments
MaxPool: When Ceil_mode=1, MaxPool Generates Big Values.
#16350 commented on Jul 7, 2025 • 0 new comments
AveragePool: When Ceil_mode=1, AveragePool Generates Nan or 0 Values.
#16351 commented on Jul 7, 2025 • 0 new comments
[Training]
#16354 commented on Jul 7, 2025 • 0 new comments
multi-GPU inferencing
#16382 commented on Jul 7, 2025 • 0 new comments
Operator Pad reflect mode does not yield correct results
#16401 commented on Jul 7, 2025 • 0 new comments
[Web] Web ~40x slower than native
#16412 commented on Jul 7, 2025 • 0 new comments
[Performance] DML dynamic axes performance regression.
#16424 commented on Jul 7, 2025 • 0 new comments
C++ Runtime does not recognize supposedly correct input.
#16430 commented on Jul 7, 2025 • 0 new comments
Normalizer does not work as expected
#16451 commented on Jul 7, 2025 • 0 new comments
[Mobile] Unable to load models in Xamarin iOS
#16463 commented on Jul 7, 2025 • 0 new comments
[Performance] net.set_providers(['DmlExecutionProvider'], [{'device_id': 0}]) could get stuck forever (directml EP)
#16473 commented on Jul 7, 2025 • 0 new comments
no acceleration onnx on e5 2680v3
#16185 commented on Jul 7, 2025 • 0 new comments
[Performance] setIntraOpNumThreads doesn't offer enough parallelization in JAVA-API
#16192 commented on Jul 7, 2025 • 0 new comments
DmlExecutionProvider bound to PyTorch tensor stops running
#16197 commented on Jul 7, 2025 • 0 new comments
NullReferenceException when creating an object of class SessionOptions | Unity
#16205 commented on Jul 7, 2025 • 0 new comments
[quantization] Problem with QDQ of Pow/Sqrt/Div
#16219 commented on Jul 7, 2025 • 0 new comments
why the input doesn't place in cuda ?
#16225 commented on Jul 7, 2025 • 0 new comments
[Training][api:C++][feature request] Support Model Forward Output and Backward Gradient Extraction in ONNX runtime training
#16232 commented on Jul 7, 2025 • 0 new comments
TensorrtExecutionProvider::GetSupportedList graph_build.Resolve().IsOK() was false.
#16234 commented on Jul 7, 2025 • 0 new comments
Not returning anything for out-of-vocabulary text while batch inference using Tf-IDF ONNX Vectorizer model
#16251 commented on Jul 7, 2025 • 0 new comments
Inconsistent generation of vectors by TF-IDF ONNX Vectorizer Model
#16252 commented on Jul 7, 2025 • 0 new comments
[OOM] Unable to convert 30B Model
#16254 commented on Jul 7, 2025 • 0 new comments
[Performance] Evaluation behavior with external arrays (C API)
#16255 commented on Jul 7, 2025 • 0 new comments
onnx use more memory than pytorch for some model
#16264 commented on Jul 7, 2025 • 0 new comments
[Web/Build] Failed to consume onnxruntime-common because of JS parser not up-to-date
#16265 commented on Jul 7, 2025 • 0 new comments
how to trace the error "assert node is not None" when use the onnxruntime.transformers.optimizer
#16268 commented on Jul 7, 2025 • 0 new comments
ROCm EP: Errors when trying to infer, which GPUs are supported?
#16271 commented on Jul 7, 2025 • 0 new comments
[Accuracy/Performance]
#16275 commented on Jul 7, 2025 • 0 new comments
Does OnnxruntimeV1.14 still support the Python Operator, and which the highest version supports this feature？
#16277 commented on Jul 7, 2025 • 0 new comments
Seg faults when creating InferenceSession for SAM backbone
#16300 commented on Jul 7, 2025 • 0 new comments
[Mobile] Error: Non string type of a tensor data is not allowed
#16301 commented on Jul 7, 2025 • 0 new comments
[Bug] Coqui VITS ONNX model can't be statically quantized.
#16738 commented on Jul 7, 2025 • 0 new comments
CUDA Custom Op CUDA failure
#16748 commented on Jul 7, 2025 • 0 new comments
clean build v1.15.1 fails three fp16 tests due to `difference between... exceeds threshold"
#16775 commented on Jul 7, 2025 • 0 new comments
[Performance] FP16 models incur large cast latency when run on CPUs without FP16 support
#16778 commented on Jul 7, 2025 • 0 new comments
Incorrect Output from Java Model
#16781 commented on Jul 7, 2025 • 0 new comments
Segmentation Fault when using TensorRT execution provider
#16790 commented on Jul 7, 2025 • 0 new comments
[Performance] using onnxruntime with ray and also fix for memory footprint too high
#16793 commented on Jul 7, 2025 • 0 new comments
[Performance] [Web] Using the `onnxruntime-web` package (`wasm` backend) with Node.js is 1.6x to 2x faster than in browsers and Deno?
#16798 commented on Jul 7, 2025 • 0 new comments
[Training] Proposal: Implement back propagation algorithm for C#
#16809 commented on Jul 7, 2025 • 0 new comments
[Performance]
#16817 commented on Jul 7, 2025 • 0 new comments
[Mobile] Failure to load whisper model .ort with react-native, regular and quantized versions
#16819 commented on Jul 7, 2025 • 0 new comments
onnxruntime_providers_cuda.dll cannot be loaded due to "Can't find dependent libraries" under Windows 10 environment using Java
#16821 commented on Jul 7, 2025 • 0 new comments
op.SequenceEmpty(dtype=xxx) cannot be set to float16.
#16846 commented on Jul 7, 2025 • 0 new comments
[Performance]high latency variance
#16876 commented on Jul 7, 2025 • 0 new comments
[Performance] Convolution layer issue profiling
#16926 commented on Jul 7, 2025 • 0 new comments
[Mobile][Kotlin] OnnxTensor.createTensor from floatBuffer takes up 7 seconds
#16937 commented on Jul 7, 2025 • 0 new comments
Crash at winrt::Windows::AI::MachineLearning::implementation::LearningModelSession::GetResults during inferencing
#16988 commented on Jul 7, 2025 • 0 new comments
Native assemblies aren't copied when Onnx is a transitive dependency and using netstandard
#17010 commented on Jul 7, 2025 • 0 new comments
Why onnxruntime extracts only 483MB json file?
#17013 commented on Jul 7, 2025 • 0 new comments
Why some models are not profiling the input weights, bias etc and the node index in the json file properly?
#17022 commented on Jul 7, 2025 • 0 new comments
Automatic deallocation (?) of the Ort::Sessions, memory leak?
#16497 commented on Jul 7, 2025 • 0 new comments
m2m 100 418M
#16480 commented on Jul 7, 2025 • 0 new comments
[Performance] A model with a large TreeEnsembleClassifier node takes too long to be loaded
#16511 commented on Jul 7, 2025 • 0 new comments
Setting `CUBLAS_WORKSPACE_CONFIG=":4096:8"` leads to `CUBLAS_STATUS_ALLOC_FAILED`
#16512 commented on Jul 7, 2025 • 0 new comments
ONNXRuntimeError: Training mode does not support BN opset 14 (or higher) yet.
#16867 commented on Jul 7, 2025 • 0 new comments
[Build] INVALID_ARGUMENT : Invalid rank for input: input Got: 4 Expected: 2 Please fix either the inputs or the model.
#16557 commented on Jul 7, 2025 • 0 new comments
[Build] libonnxruntime_providers_dnnl.so: undefined symbol: omp_get_max_threads
#16561 commented on Jul 7, 2025 • 0 new comments
Large model >2GB save_to_ort
#16573 commented on Jul 7, 2025 • 0 new comments
[Build] fatal error: too many errors emitted, stopping now [-ferror-limit=]
#16576 commented on Jul 7, 2025 • 0 new comments
[Build] Cannot build onnxruntime
#16583 commented on Jul 7, 2025 • 0 new comments
Conv3d precision error between pytorch and onnx
#16589 commented on Jul 7, 2025 • 0 new comments
[Training] Define a custom training with some ONNX models
#16597 commented on Jul 7, 2025 • 0 new comments
[Performance] Performance degradation observed w.r.t DNNL-EP in v1.15.1 compared to v1.13.1
#16609 commented on Jul 7, 2025 • 0 new comments
[Build] No C++ library is generated after compilation completed
#16610 commented on Jul 7, 2025 • 0 new comments
[Performance] Computation time of iteratively applying neural network in a single ONNX model using CUDA Execution Provider dominated by Memcpy
#16625 commented on Jul 7, 2025 • 0 new comments
[Build] Dependency on OMP/MPI Runtime
#16631 commented on Jul 7, 2025 • 0 new comments
[Performance]
#16637 commented on Jul 7, 2025 • 0 new comments
The input tensor cannot be reshaped to the requested shape after adding Gather output to model's output
#16670 commented on Jul 7, 2025 • 0 new comments
Access violation reading location when I use CreateArenaCfgV2 and CUDA
#16686 commented on Jul 7, 2025 • 0 new comments
One path in the graph requests feature X(>Y) but input tensor has Y features
#16695 commented on Jul 7, 2025 • 0 new comments
InferenceSession fails with segmentation fault when fp16 model is loaded with CPUExecutionProvider
#15494 commented on Jul 7, 2025 • 0 new comments
[ErrorCode:Fail] Load model from [...]\latin_ipa_forward.onnx failed:invalid vector subscript
#15495 commented on Jul 7, 2025 • 0 new comments
[Build] Openvino debug build fails on VS2019
#15496 commented on Jul 7, 2025 • 0 new comments
[Web] probability is not returned: `error code = 1`
#15511 commented on Jul 7, 2025 • 0 new comments
SimplifiedLayerNormalization loading error for converted FP16 databricks/dolly-v2-3b model
#15531 commented on Jul 7, 2025 • 0 new comments
[Performance] FP16 model can not get acceleration on GPU with ONNXRuntime-GPU
#15534 commented on Jul 7, 2025 • 0 new comments
Get results from Mask RCNN model with C++
#15541 commented on Jul 7, 2025 • 0 new comments
fatal error: gsl/gsl: No such file or directory
#15554 commented on Jul 7, 2025 • 0 new comments
Error running quantize_dynamic: Failed to find proper ai.onnx domain
#15563 commented on Jul 7, 2025 • 0 new comments
[Build] 1.14.0-dev-20230120-0204-3d6cea14f4 (This build breaks model on Intel)
#15567 commented on Jul 7, 2025 • 0 new comments
[Performance] CUDA fp16 didn't get speed up
#15585 commented on Jul 7, 2025 • 0 new comments
Error with custom spconv class in onnx runtime
#15594 commented on Jul 7, 2025 • 0 new comments
[Build] Java Nightly build
#15600 commented on Jul 7, 2025 • 0 new comments
[Build] the Linux build config
#15621 commented on Jul 7, 2025 • 0 new comments
Can't use onnxruntime with DirectML built from source
#15628 commented on Jul 7, 2025 • 0 new comments
[Performance] CNN model exported by PyTorch runs slower than Tensorflow 1.0
#15647 commented on Jul 7, 2025 • 0 new comments
onnxRuntimeException and DefaultLogger issues in AWS Lambda runtime
#15650 commented on Jul 7, 2025 • 0 new comments
ONNXRuntime in Docker
#15652 commented on Jul 7, 2025 • 0 new comments
ONNX with FloatTensorType when inferred from C++ returns different label everytime
#15665 commented on Jul 7, 2025 • 0 new comments
[Build] Compile Error if path too long
#15674 commented on Jul 7, 2025 • 0 new comments
[Performance]
#15265 commented on Jul 7, 2025 • 0 new comments
ONNX model with FBNetv3 architecture Conversion to TensorRT Problem
#15269 commented on Jul 7, 2025 • 0 new comments
[Build] ONNX Java Runtime - Handle UnsatisfiedLinkError
#15281 commented on Jul 7, 2025 • 0 new comments
[Documentation Request] Estimating (or Checking) Allocated Memory
#15326 commented on Jul 7, 2025 • 0 new comments
[Performance] Timings feedback
#15328 commented on Jul 7, 2025 • 0 new comments
[Performance] Gemm op is slower after quantization
#15332 commented on Jul 7, 2025 • 0 new comments
[Mobile] onnxruntime-c and onnxruntime-extensions-c pod conflict with DocumentReader pod
#15333 commented on Jul 7, 2025 • 0 new comments
[Performance] ONNXRUNTIME sometime DEAD in python multiprocessing
#15345 commented on Jul 7, 2025 • 0 new comments
ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB: 2235646909
#15349 commented on Jul 7, 2025 • 0 new comments
[Web] custom ops
#15374 commented on Jul 7, 2025 • 0 new comments
[Performance] Running Large Language Models for dynamic input size is poor performance. (DirectML)
#15394 commented on Jul 7, 2025 • 0 new comments
Opset Coverage - Binary Size Tradeoff
#15397 commented on Jul 7, 2025 • 0 new comments
[Build] C++ API calling fail: error C2280: 'Ort::Value::Value(const Ort::Value &)' : attempt to reference a deleted function
#15418 commented on Jul 7, 2025 • 0 new comments
Mask-RCNN network is giving significantly different result with DirectML EP
#15459 commented on Jul 7, 2025 • 0 new comments
Error Unrecognized attribute: layout for operator DynamicQuantizeLSTM
#15465 commented on Jul 7, 2025 • 0 new comments
Please provide informative message on dlopen failures -- python API
#15476 commented on Jul 7, 2025 • 0 new comments
[Performance] WebAssembly 1x1 Conv almost 4x slower than native
#15483 commented on Jul 7, 2025 • 0 new comments
[Performance] Model converted to mixed precision results in higher latency
#15490 commented on Jul 7, 2025 • 0 new comments
Inference slows down on gpu.
#15491 commented on Jul 7, 2025 • 0 new comments
[Bug?] Casting int8-->float
#15492 commented on Jul 7, 2025 • 0 new comments
[Predict] Prediction from ONNX is same for all images
#16001 commented on Jul 7, 2025 • 0 new comments
The result of Col2Im operator not close with Torch result on fp16 dtype
#16007 commented on Jul 7, 2025 • 0 new comments
[Performance] QUInt8 vs a basic ONNX
#16009 commented on Jul 7, 2025 • 0 new comments
RunOptions.only_execute_path_to_fetches not working
#16013 commented on Jul 7, 2025 • 0 new comments
Cannot open include file: numpy/arrayobject.h
#16027 commented on Jul 7, 2025 • 0 new comments
[Web] The onnxruntime-web example is loading wasm file twice if set to local path
#16028 commented on Jul 7, 2025 • 0 new comments
[Web] [WebGPU] Uncaught (in promise) DOMException: Unable to instantiate a Device in Firefox Nightly/Linux
#16029 commented on Jul 7, 2025 • 0 new comments
inference time decreasing when increasing batch size to a certain point and them the inference time increasing again.
#16030 commented on Jul 7, 2025 • 0 new comments
can we customize memory allocation functions(like malloc/free) for inference in C api?
#16032 commented on Jul 7, 2025 • 0 new comments
[Performance] How to solve the problem of releasing GPU memory in onnxruntime
#16033 commented on Jul 7, 2025 • 0 new comments
[Performance] Huge gap between nn.Conv1d() and nn.Conv2d() - models exported by PyTorch
#16047 commented on Jul 7, 2025 • 0 new comments
Unexpected inference output from QLinearConv
#16105 commented on Jul 7, 2025 • 0 new comments
Memory leak in cpuinfo_x86_linux_init
#16117 commented on Jul 7, 2025 • 0 new comments
Segmentation Fault when optimizing Stable Diffusion models
#16140 commented on Jul 7, 2025 • 0 new comments
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] with swin-t
#16143 commented on Jul 7, 2025 • 0 new comments
Segmentation fault while loading CUDA Provider
#16146 commented on Jul 7, 2025 • 0 new comments
[Performance] ONNX Runtime doesn't parallelize operations in CPU models
#16158 commented on Jul 7, 2025 • 0 new comments
The prediction results from STFT has changed with a notable shift towards larger difference from PyTorch in ORT==1.15.0
#16163 commented on Jul 7, 2025 • 0 new comments
[MacOS] Unable to load libonnxruntime.dylib because binaries are not signed.
#16168 commented on Jul 7, 2025 • 0 new comments
[Build] line 2812, in <module> sys.exit(main())
#16179 commented on Jul 7, 2025 • 0 new comments
[CANN]EP: CANN cannot complete inference on Atlas200DK
#15677 commented on Jul 7, 2025 • 0 new comments
[Performance] Can't get GPU speed-up when exe program is located inside the path with chinese character
#15678 commented on Jul 7, 2025 • 0 new comments
[ErrorCode:InvalidArgument] Invalid Feed Input Name:image
#15692 commented on Jul 7, 2025 • 0 new comments
[Performance] Can we set model weight precision when converting keras model into onnx model?
#15695 commented on Jul 7, 2025 • 0 new comments
[Build] Onnxruntime-gpu for Jetpack 5.1.1 on Jetson Orin Nano Developer Kit
#15732 commented on Jul 7, 2025 • 0 new comments
GraphOptimization (ORT_ENABLE_ALL) is slower using ONNXRuntime-GPU
#15743 commented on Jul 7, 2025 • 0 new comments
Load onnx failed(segmentation fault) with version 1.14.1 (2)
#15745 commented on Jul 7, 2025 • 0 new comments
Inference using the CUDA EP returns nan
#15752 commented on Jul 7, 2025 • 0 new comments
[Build]
#15786 commented on Jul 7, 2025 • 0 new comments
How to set CalibrationDataReader when my datatype is time series?
#15836 commented on Jul 7, 2025 • 0 new comments
[Build]
#15863 commented on Jul 7, 2025 • 0 new comments
[Training] Training Onnx format Models
#15867 commented on Jul 7, 2025 • 0 new comments
Failed top create CUDAExecutionProvider
#15873 commented on Jul 7, 2025 • 0 new comments
[RunTimeError]Infer error shape in runtime and mismatch with onnx spec about Split opset 18
#15882 commented on Jul 7, 2025 • 0 new comments
[Performance] `CUDAExecutionProvider` uses 3x the memory of `CPUExecutionProvider`
#15886 commented on Jul 7, 2025 • 0 new comments
symbolic_shape_infer.py failure
#15898 commented on Jul 7, 2025 • 0 new comments
Linking executable with static libraries --> error LNK2038: mismatch detected
#15928 commented on Jul 7, 2025 • 0 new comments
Atlas200DK uses EP: CANN to infer resnet50 and reports "CANN errorEE9999: Inner Error!"
#15947 commented on Jul 7, 2025 • 0 new comments
[Performance] Redundant ReorderOutput / ReorderInput operators in Conv+Maxpool layers when graph optimization level is ALL
#15964 commented on Jul 7, 2025 • 0 new comments
float16 result not match with numpy or torch
#15977 commented on Jul 7, 2025 • 0 new comments