Insights: microsoft/onnxruntime
Overview
1 Release published by 1 person
-
ONNX Runtime v1.22.1
published
Jul 8, 2025
44 Pull requests merged by 28 people
-
QNN-EP: DSPQueue Polling
#25361 merged
Jul 11, 2025 -
[EP ABI] Add Node_GetEpType API
#25350 merged
Jul 11, 2025 -
[WebNN] Fix bug in Float16Array availability check
#25354 merged
Jul 11, 2025 -
add --client_package_build option
#25351 merged
Jul 11, 2025 -
[webgpu] a few optimizations to the WGSL template
#25333 merged
Jul 10, 2025 -
[EP ABI] Add Graph_GetGraphView API to get an OrtGraph from a subset of nodes
#25191 merged
Jul 10, 2025 -
Make TRT plugins optional
#25261 merged
Jul 10, 2025 -
[WebNN] Refactor webnn op input rank check and add validation for ops
#25185 merged
Jul 10, 2025 -
Added creation of QDQ for TopK node
#25309 merged
Jul 10, 2025 -
[QNN EP] Fix pool with reshape name conflicts
#25332 merged
Jul 10, 2025 -
Add PackageVersion parameter to NuGet packaging stage
#25315 merged
Jul 9, 2025 -
[CPU] GQA supports head_sink input for smooth softmax
#25269 merged
Jul 9, 2025 -
[MLAS] DequantizeLinear int8/uint8
#24818 merged
Jul 9, 2025 -
[web] Fix "npm run pull:wasm" script
#25330 merged
Jul 9, 2025 -
Move buffer release or cache from OnRefresh to ReleaseBuffer in BucketCacheManager
#25276 merged
Jul 9, 2025 -
Update vcpkg.json: remove optional-lite
#25339 merged
Jul 9, 2025 -
[EP ABI] Utility to serialize OrtGraph to GraphProto
#25292 merged
Jul 9, 2025 -
[webgpu] Move the early return after copying for ScatterND
#25345 merged
Jul 9, 2025 -
[webgpu] Update wgsl_templates README.md
#25336 merged
Jul 9, 2025 -
GatherBlockQuantized supports zero points and 8 bits for uint8 dtype
#25214 merged
Jul 9, 2025 -
fix build break when multi EP is enabled (inference_session_test.cc)
#25329 merged
Jul 8, 2025 -
Bump ruff from 0.11.13 to 0.12.2, clang-format from 19.1.7 to 20.1.7
#25301 merged
Jul 8, 2025 -
[WebGPU] allow WGSL template generation
#25130 merged
Jul 8, 2025 -
FIX printDebugInfo in ov_interface.cc
#25298 merged
Jul 8, 2025 -
fix 0 tensor issue in matmul and scatter_nd
#25326 merged
Jul 8, 2025 -
FIX c++17 compatibility in backend_utils.h
#25299 merged
Jul 8, 2025 -
Update deprecated CCCL API
#25246 merged
Jul 8, 2025 -
Delete .github/workflows/stale.yml
#25316 merged
Jul 8, 2025 -
[EP ABI] Infer OrtDevice for plugin EP from registered OrtMemoryInfo
#25308 merged
Jul 8, 2025 -
Fix cuda 12.9 windows build
#25317 merged
Jul 8, 2025 -
Fix Nv EP Build Break
#25311 merged
Jul 8, 2025 -
[webgpu] support smooth softmax for non-FA GQA implementation
#25285 merged
Jul 7, 2025 -
Exclude EPContext Op from Common Subexpression Elimination graph optimization
#25296 merged
Jul 7, 2025 -
Add RotaryEmbeddings(23) - CUDA
#25178 merged
Jul 7, 2025 -
Migrate stale bot workflow to updateStaleIssues.yml policy
#21660 merged
Jul 7, 2025 -
[webgpu] Optimize DP4AMatMulNBitsSmallMProgram for intel
#25192 merged
Jul 7, 2025 -
fix webgpu dequantize_linear ut
#25271 merged
Jul 7, 2025 -
Add a new ORT API `GetSessionOptionConfigEntries`
#25277 merged
Jul 7, 2025 -
[webgpu] a few optimizations to graph capture implementation
#25305 merged
Jul 7, 2025 -
[WebNN] Always create a new constant for zero_points
#25286 merged
Jul 7, 2025 -
[webgpu] Enable graph capture
#24900 merged
Jul 7, 2025 -
Add OrtEpFactory::GetVersion and store EP version in EP metadata.
#25272 merged
Jul 5, 2025 -
[OVEP] OpenVINO EP Features Release 1.23
#25262 merged
Jul 4, 2025
26 Pull requests opened by 23 people
-
FIX: dxcore include when compiling with older Windows SDK
#25297 opened
Jul 6, 2025 -
Add a new operator attribute type `ORT_OP_ATTR_BYTES` to the ORT C API
#25300 opened
Jul 7, 2025 -
Update updateStaleIssues.yml: remove the reopen issue logic
#25318 opened
Jul 7, 2025 -
[CPU] GQA supports attention scores output
#25319 opened
Jul 7, 2025 -
Convert Initializers to OrtValues Phase 2
#25320 opened
Jul 8, 2025 -
Iraut/update nv trt rtx ep doc
#25321 opened
Jul 8, 2025 -
[NvTensorRTRTX EP] Disable Fast GELU operator in base model used for NV EP Unit Tests
#25323 opened
Jul 8, 2025 -
Bump transformers from 4.48.0 to 4.52.1 in /onnxruntime/python/tools/transformers/models/llama
#25328 opened
Jul 8, 2025 -
[EP ABI] Get EP compiled model compatibility
#25331 opened
Jul 8, 2025 -
[MIGraphx EP] Sync AMD changes upstream
#25338 opened
Jul 9, 2025 -
Remove arm 32 references
#25341 opened
Jul 9, 2025 -
[QNN EP] Add EP-aware Reshape handler for Transpose optimization.
#25344 opened
Jul 9, 2025 -
Support read-only allocator for use with initializers
#25348 opened
Jul 9, 2025 -
Add patch for WebGPU on Android to handle fp16 in uniforms
#25349 opened
Jul 9, 2025 -
add build matrix for wgsl template
#25352 opened
Jul 10, 2025 -
[webgpu] Apply template to `MatMulNBitsWideTile`
#25353 opened
Jul 10, 2025 -
Add Compile API to set the location for the context binary file
#25356 opened
Jul 10, 2025 -
[CUDA] Update Flash Attention to support head_sink for smooth softmax in GQA
#25358 opened
Jul 10, 2025 -
Fix SigLIP causal mask bug
#25360 opened
Jul 10, 2025 -
[EP ABI] Update to use Node_GetEpName
#25363 opened
Jul 11, 2025 -
[JSEP] Fix inputShape index OOB in slice.ts
#25364 opened
Jul 11, 2025 -
Add vendor id to OrtEpFactory
#25365 opened
Jul 11, 2025 -
[webgpu] Enable per-run control for graph capture
#25367 opened
Jul 11, 2025 -
Enable CUDA Graph in nv_tensorrt_rtx EP
#25368 opened
Jul 11, 2025
28 Issues closed by 13 people
-
The inference results on PC and Android are inconsistent
#17016 closed
Jul 11, 2025 -
[Mobile] kokora.int8 Efficiency Below Expectations on iPhone 15
#25366 closed
Jul 11, 2025 -
[Build] how to correctly disable FORTIFY_SOURCE
#25337 closed
Jul 9, 2025 -
[Performance] gpu inference is much slower than cpu
#17489 closed
Jul 9, 2025 -
[Build] 'cudafe++' died with status 0xC0000005 (ACCESS_VIOLATION) when build onnxruntime_providers_cuda
#25239 closed
Jul 9, 2025 -
[Build] How to build static lib?
#24704 closed
Jul 8, 2025 -
Initializer duplication method in QDQQuantizer ignores existing `value_info` tensor with same name
#24705 closed
Jul 8, 2025 -
Inference session crashes using ONNX runtime.
#20043 closed
Jul 8, 2025 -
[Web] WASM sigmoid producing numbers below 0 or above 1
#23943 closed
Jul 7, 2025 -
[Web/WebGPU] Can't append execution provider: JS (v1.15.0)
#16137 closed
Jul 7, 2025 -
Component Governance Alert on `cmake/external/protobuf`
#10758 closed
Jul 7, 2025 -
Regression in TreeEnsembleRegressor if the provided graph is a DAG
#24636 closed
Jul 7, 2025 -
Segmentation fault in `AppendExecutionProvider_CUDA_V2` when no GPU is available
#24652 closed
Jul 7, 2025 -
Bug related to setting provider options for OpenVINO using Java API
#24658 closed
Jul 7, 2025 -
Is class Sigmoid op supported by CUDA 12.6?
#24670 closed
Jul 7, 2025 -
AveragePool v19+ ignores `end` padding in computation when count_include_pad=1
#24681 closed
Jul 7, 2025 -
When will v1.14 of the onnxruntime-openvino package be available?
#14773 closed
Jul 7, 2025 -
Crosscompiling using VS2017 from Windows for Raspberrypi4
#7962 closed
Jul 7, 2025 -
How to run YOLO with onnxruntime
#6236 closed
Jul 7, 2025 -
Build error in build.py
#5980 closed
Jul 7, 2025 -
DirectML error: The parameter is incorrect with KBNet S
#21583 closed
Jul 7, 2025 -
[Web] Ability to create/use multiple wasm web workers
#15735 closed
Jul 7, 2025 -
Cannot run Microsoft.SemanticKernel.Connectors.Onnx on RaspberryPi OS Lite
#25290 closed
Jul 7, 2025
19 Issues opened by 16 people
-
[Feature Request] The MatMulNBits matmul_nbits_quantizer does not support 3D weight tensors.
#25362 opened
Jul 11, 2025 -
[Web] Unable to build using WebGPU - `error: handleI64Signatures: signature too long for emwgpuWaitAny`
#25359 opened
Jul 10, 2025 -
[Feature Request] add webgpu support for a PowerPreference session option
#25357 opened
Jul 10, 2025 -
CudaProvider is skipped completely when TensorRT and OpenVINO are provided
#25347 opened
Jul 9, 2025 -
QNN EP fails on model that can run as a QNN context binary
#25335 opened
Jul 9, 2025 -
QNN EP appears to accept a model that QNN cannot execute
#25334 opened
Jul 9, 2025 -
[Bug] [Node.js binding] Memory leak after releasing inference session
#25325 opened
Jul 8, 2025 -
[Web] WebGPU Device Promise not defined
#25324 opened
Jul 8, 2025 -
[Feature Request] Add a more detailed OrtStatus for diagnosing model compilation incompatibilities
#25314 opened
Jul 7, 2025 -
[Feature Request] API for callers to determine if a compiled model is compatible with a given device
#25313 opened
Jul 7, 2025 -
[Feature Request] Add mechanism for helping identify what driver version was used in compiling a model
#25312 opened
Jul 7, 2025 -
[Feature Request] CPU EP `Where` data type registration, add int8 and uint32
#25306 opened
Jul 7, 2025 -
OpenVINO EP fails to run models with in-memory external data
#25304 opened
Jul 7, 2025 -
[Feature Request] `GetEpDevices()` returns a sorted EP devices list
#25302 opened
Jul 7, 2025 -
GetShape crashes on Linux
#25295 opened
Jul 5, 2025 -
[Mobile] TypeError: A bool tensor's data must be type of function Uint8Array() { [native code] }
#25294 opened
Jul 5, 2025 -
[Build] /install-utils.js Error: Failed to download build list. HTTP status code = 302
#25293 opened
Jul 5, 2025 -
Missing win-arm libraries
#25291 opened
Jul 5, 2025
1,279 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Attention Operator (CPU)
#25156 commented on
Jul 11, 2025 • 58 new comments -
Plugin EP data transfer and Stream support.
#25254 commented on
Jul 11, 2025 • 14 new comments -
[WebGPU EP] extend concat to handle large number of inputs
#25177 commented on
Jul 10, 2025 • 6 new comments -
Add Int4 and UInt4 support for Cast
#24973 commented on
Jul 10, 2025 • 5 new comments -
Fix Sign and Clip operation on int64 tensors
#25280 commented on
Jul 7, 2025 • 3 new comments -
KleidiAI SGEMM/IGEMM/Quantized MatMul - Modular MLAS API Changes for KleidiAI
#25187 commented on
Jul 11, 2025 • 1 new comment -
[MLAS] Add 8-bit weights ARM64 Gemm implementation
#25110 commented on
Jul 7, 2025 • 1 new comment -
Use NPU in NXP iMX8MP?
#11854 commented on
Jul 7, 2025 • 0 new comments -
What is the meaning of the holes in the tracing file
#11850 commented on
Jul 7, 2025 • 0 new comments -
Incompatible dimensions for matrix multiplication Error in StarNet model when doing InferenceSession
#11846 commented on
Jul 7, 2025 • 0 new comments -
[ONNXRuntimeError] FuseReluClip failure
#11836 commented on
Jul 7, 2025 • 0 new comments -
in cmake/CMakeLists.txt all AVX-related options are set off; do we need to do anything to use AVX features?
#11833 commented on
Jul 7, 2025 • 0 new comments -
[Bug] Mixing negative and positive paddings causes segfault/uninitialized memory values produced in reflected pad
#11828 commented on
Jul 7, 2025 • 0 new comments -
Issue importing onnxruntime
#11815 commented on
Jul 7, 2025 • 0 new comments -
When I use onnxruntime to run onnx model on GPU, it sucks up too much video memory. Is that normal?
#11809 commented on
Jul 7, 2025 • 0 new comments -
I do not get any performance improvement after using TensorRT provider for object detection model
#11806 commented on
Jul 7, 2025 • 0 new comments -
Failed to build onnxruntime on Apple Silicon
#11805 commented on
Jul 7, 2025 • 0 new comments -
How to use batch run?
#11852 commented on
Jul 7, 2025 • 0 new comments -
What is the meaning of src_arg_index and dst_arg_index in EdgeEndToMatch structure?
#11856 commented on
Jul 7, 2025 • 0 new comments -
Wrong output shape due to MergeShape failure
#11870 commented on
Jul 7, 2025 • 0 new comments -
Not clear quantization pipeline for tensorrt ep
#11873 commented on
Jul 7, 2025 • 0 new comments -
Pytorch -> Onnx custom Yolov5 model works in python but not in JS
#11874 commented on
Jul 7, 2025 • 0 new comments -
[ONNXRuntimeError] Load model from *** failed: Unsuported type proto value case
#11889 commented on
Jul 7, 2025 • 0 new comments -
Quantize specific ops per-tensor while per_channel=True
#11890 commented on
Jul 7, 2025 • 0 new comments -
Bug: MatMul fails for input shapes of [0, k] and [k, ]
#11895 commented on
Jul 7, 2025 • 0 new comments -
onnx and onnxruntime disagree on input with no known rank
#11891 commented on
Jul 7, 2025 • 0 new comments -
Immense GPU memory consumption
#11903 commented on
Jul 7, 2025 • 0 new comments -
Inference time for quantized onnx models, TensorRT > CUDA > CPU. Is this expected?
#11201 commented on
Jul 7, 2025 • 0 new comments -
build for c#
#11648 commented on
Jul 7, 2025 • 0 new comments -
output shape can not be specified in com.microsoft::GridSample op
#11652 commented on
Jul 7, 2025 • 0 new comments -
Installing ORTModule torch extension reports TypeError
#11663 commented on
Jul 7, 2025 • 0 new comments -
when inter_op_num=0 is set with ORT_PARALLEL mode, why is performance much worse than with inter_op_num=1?
#11668 commented on
Jul 7, 2025 • 0 new comments -
How to implement a new operator inference function?
#11678 commented on
Jul 7, 2025 • 0 new comments -
[web] `ort.InferenceSession.create` silently hangs/fails on iOS/iPad browsers if COEP/COOP headers are set
#11679 commented on
Jul 7, 2025 • 0 new comments -
which onnxruntime-gpu version is compatible for CUDA 11.1 ?
#11685 commented on
Jul 7, 2025 • 0 new comments -
Real-ESRGAN slow onnxruntime inference compared to Pytorch one
#11688 commented on
Jul 7, 2025 • 0 new comments -
Linux CI pipelines can't test unreleased versions of ONNX
#11693 commented on
Jul 7, 2025 • 0 new comments -
Dynamic quantization of Albert model
#11701 commented on
Jul 7, 2025 • 0 new comments -
Low level profiling for onnxrt Conv kernel(default backend)
#11702 commented on
Jul 7, 2025 • 0 new comments -
CUDA EP spending lots of time idling
#11706 commented on
Jul 7, 2025 • 0 new comments -
Race condition when setting do_copy_in_default_stream to false
#11713 commented on
Jul 7, 2025 • 0 new comments -
Reading back multidimensional output in C++
#11718 commented on
Jul 7, 2025 • 0 new comments -
how to get the remaining GPU memory to get the batch size?
#11735 commented on
Jul 7, 2025 • 0 new comments -
ssd_mobilenet_v1 infer error for TensorRT Execution Provider
#11736 commented on
Jul 7, 2025 • 0 new comments -
build rknpu backend error
#11738 commented on
Jul 7, 2025 • 0 new comments -
Pip installed Transformer Benchmark cannot run on TF
#11751 commented on
Jul 7, 2025 • 0 new comments -
Converted ONNX model works in Python but not in C++
#11761 commented on
Jul 7, 2025 • 0 new comments -
create op
#12017 commented on
Jul 7, 2025 • 0 new comments -
Resize with mode linear is missing output elements
#12019 commented on
Jul 7, 2025 • 0 new comments -
Microsoft.ML.OnnxRuntime.Tests.InferenceTest.TestPreTrainedModels should get opset version from the model file
#12040 commented on
Jul 7, 2025 • 0 new comments -
Builds C# bindings and creates nuget package
#12042 commented on
Jul 7, 2025 • 0 new comments -
GlobalAveragePool on large size of ones miscalculates
#12043 commented on
Jul 7, 2025 • 0 new comments -
Using onnxruntime server for model deployment
#12044 commented on
Jul 7, 2025 • 0 new comments -
Support pasts as inputs in gpt2 beam search operator
#12047 commented on
Jul 7, 2025 • 0 new comments -
Build wasm static library bug because of missing `testdata` folder.
#12048 commented on
Jul 7, 2025 • 0 new comments -
Performance in parallel session Run()
#12049 commented on
Jul 7, 2025 • 0 new comments -
Builds C# bindings and creates nuget package for vs2019 install
#12061 commented on
Jul 7, 2025 • 0 new comments -
ONNXRuntimeError for "Where" node when the input is too long
#12065 commented on
Jul 7, 2025 • 0 new comments -
Performance issue with beam search in onnxruntime
#12078 commented on
Jul 7, 2025 • 0 new comments -
Support for cmake's FetchContent()
#12081 commented on
Jul 7, 2025 • 0 new comments -
TensorRT Provider Vs TensorRT Native
#12083 commented on
Jul 7, 2025 • 0 new comments -
Resize with mode linear always produces 0.5 on GPU regardless of the input
#12091 commented on
Jul 7, 2025 • 0 new comments -
Resize with `nearest` mode have inconsistent results compared to PyTorch and TVM
#12098 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime tensorrt sometimes takes a very long time
#12120 commented on
Jul 7, 2025 • 0 new comments -
How do I call the same model in CUDA with many various inputs?
#12126 commented on
Jul 7, 2025 • 0 new comments -
Error in symbolic_shape_infer.py: assert name in self.sympy_data_ or ...
#12127 commented on
Jul 7, 2025 • 0 new comments -
GPU inference result not stable
#13178 commented on
Jul 7, 2025 • 0 new comments -
ConvTranspose with auto_pad attribute
#11927 commented on
Jul 7, 2025 • 0 new comments -
how to get inference time with c# onnxruntime-gpu-1.6.0
#11946 commented on
Jul 7, 2025 • 0 new comments -
execute dnnl provider error
#11947 commented on
Jul 7, 2025 • 0 new comments -
windows11+onnxruntime1.8.0+vs2019 inferencing crash
#11950 commented on
Jul 7, 2025 • 0 new comments -
Multi thread of single session Python vs C++ (end with core dumped)
#11951 commented on
Jul 7, 2025 • 0 new comments -
Inference_GPT2-OneStepSearch_OnnxRuntime_CPU.ipynb Error
#11959 commented on
Jul 7, 2025 • 0 new comments -
Question about quantize Gemm OP
#11961 commented on
Jul 7, 2025 • 0 new comments -
Got segmentation fault error when using 'InferenceSession' API
#11964 commented on
Jul 7, 2025 • 0 new comments -
how to configure global/shared threadpool with multithreading in the C# API?
#11966 commented on
Jul 7, 2025 • 0 new comments -
set gpu option failed
#11967 commented on
Jul 7, 2025 • 0 new comments -
quantized onnx model slower than pytorch with mish6 activation, however faster with relu6
#11975 commented on
Jul 7, 2025 • 0 new comments -
inference time is not stable
#11983 commented on
Jul 7, 2025 • 0 new comments -
Any interest in hosting the Rust bindings
#11992 commented on
Jul 7, 2025 • 0 new comments -
inference is different on linux and windows
#11993 commented on
Jul 7, 2025 • 0 new comments -
Inconsistent result to NumPy and PyTorch when consecutively casting a float tensor to int32 and then to bool
#11994 commented on
Jul 7, 2025 • 0 new comments -
failed to initialize a session in the GPU environment
#11996 commented on
Jul 7, 2025 • 0 new comments -
The test time of sess.run does not match the time of profile
#11997 commented on
Jul 7, 2025 • 0 new comments -
build C#api with cuda 11.0 /cudnn 8.0
#11999 commented on
Jul 7, 2025 • 0 new comments -
Issue with NeMo MTEncDecModel model in ONNX IOBinding
#12003 commented on
Jul 7, 2025 • 0 new comments -
how to build onnxruntime from source with dnnl?
#12011 commented on
Jul 7, 2025 • 0 new comments -
auto_set_affinity can't be set to true for parallel executor
#11205 commented on
Jul 7, 2025 • 0 new comments -
[web] ~100 seconds to load model/InferenceSession
#11217 commented on
Jul 7, 2025 • 0 new comments -
NonZero shape inference behavior with scalar input mismatches ONNX and PyTorch
#11232 commented on
Jul 7, 2025 • 0 new comments -
Unhandled exception at 0x00007FFABE6A9538 (cudnn_cnn_infer64_8.dll) in Onnx.exe
#11235 commented on
Jul 7, 2025 • 0 new comments -
[React Native .ort Model Loading Error] "Error: Can't load a model: No content provider: ..."
#11239 commented on
Jul 7, 2025 • 0 new comments -
I want to use the GPU on my Jetson NX2 platform with C++; how should I do it?
#11240 commented on
Jul 7, 2025 • 0 new comments -
Non-zero status code returned while running Slice node. Name:'Slice_24' Status Message: slice.cc:153 FillVectorsFromInput Starts must be a 1-D array
#11257 commented on
Jul 7, 2025 • 0 new comments -
Unsupported If operator in gradient builder for Hugging Face Transformers RoBERTa model
#11268 commented on
Jul 7, 2025 • 0 new comments -
optimize_model : new model types
#11270 commented on
Jul 7, 2025 • 0 new comments -
The onnx model of IMDN is slower than the original pytorch model and output many warnings
#11274 commented on
Jul 7, 2025 • 0 new comments -
pulled master 1.12 quantization get unexpected result
#11277 commented on
Jul 7, 2025 • 0 new comments -
LSTM export ONNX: Non-zero status code returned while running ScatterElements node. Name:'ScatterElements_880'
#11278 commented on
Jul 7, 2025 • 0 new comments -
Why is gpt2-xl (based on transformer-xl) onnx slower than the original pytorch
#11293 commented on
Jul 7, 2025 • 0 new comments -
is the effect of onnx on Bert affected by python version?
#11295 commented on
Jul 7, 2025 • 0 new comments -
TVM EP and TensorRT EP do not support dynamic inputs
#11333 commented on
Jul 7, 2025 • 0 new comments -
MacOS M1 binary compilation and possibility to fine tune a model in C++
#11343 commented on
Jul 7, 2025 • 0 new comments -
CUDAExecutionProvider optimized model adds incompatible node resulting in Failed to find kernel for MemcpyToHost
#11348 commented on
Jul 7, 2025 • 0 new comments -
Lower performance on Inceptionv3/4 model with TensorRT EP than TensorRT directly
#11356 commented on
Jul 7, 2025 • 0 new comments -
CUDAExecutionProvider not releasing memory after terminate session
#11362 commented on
Jul 7, 2025 • 0 new comments -
Incorrect TypeInferenceError on UNDEFINED tensor type
#6370 commented on
Jul 7, 2025 • 0 new comments -
Cuda EP parallelization issues for batches
#11047 commented on
Jul 7, 2025 • 0 new comments -
C++ API, "tried creating tensor with negative value in shape" error when 'permute' and 'reshape' functions are used
#11069 commented on
Jul 7, 2025 • 0 new comments -
Inference session creation freezes
#11087 commented on
Jul 7, 2025 • 0 new comments -
compile with cuda error:Couldn't find CUDA library root.
#11090 commented on
Jul 7, 2025 • 0 new comments -
Always getting "Failed to create CUDAExecutionProvider"
#11092 commented on
Jul 7, 2025 • 0 new comments -
Performance reduction due to copying of output OrtValues to numpy arrays
#11099 commented on
Jul 7, 2025 • 0 new comments -
Using DnnlExecutionProvider for inference is much slower than using CPUExecutionProvider.
#11122 commented on
Jul 7, 2025 • 0 new comments -
Different detection output values for C++ and Python with onnxruntime
#11123 commented on
Jul 7, 2025 • 0 new comments -
docker container linux run onnxruntime infer core dumped
#11135 commented on
Jul 7, 2025 • 0 new comments -
[question] yolov5-onnx-float16 not improve on GPU
#11151 commented on
Jul 7, 2025 • 0 new comments -
How to use Flask with onnxruntime
#11156 commented on
Jul 7, 2025 • 0 new comments -
Instruction level profiling in onnxruntime
#11159 commented on
Jul 7, 2025 • 0 new comments -
No c++ header files for building custom op
#11169 commented on
Jul 7, 2025 • 0 new comments -
A normal output of convolution layer multiplies infinity will result in NaN
#11173 commented on
Jul 7, 2025 • 0 new comments -
Build from source issue on Windows
#11178 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime-web is 11-17x slower than native inference
#11181 commented on
Jul 7, 2025 • 0 new comments -
Custom Op does not support dynamic input/output number
#11186 commented on
Jul 7, 2025 • 0 new comments -
Saving GPT2LMHeadModel_ConfigurableOneStepSearch error.
#11198 commented on
Jul 7, 2025 • 0 new comments -
How to compress the sparse matrix in onnx model
#11200 commented on
Jul 7, 2025 • 0 new comments -
how to use C# to call libonnxruntime.dll? I built the onnxruntime dynamic DLL; can it be wrapped as a C++ DLL so it can be called from C#?
#11550 commented on
Jul 7, 2025 • 0 new comments -
can C# call the onnxruntime C++ DLL without a third-party C# lib? I created an onnxruntime C++ project, but I want to call the DLL from C#
#11551 commented on
Jul 7, 2025 • 0 new comments -
T5-Large Export Results in ProtoBuf Error due to 2GB External Data when using padded inputs
#11558 commented on
Jul 7, 2025 • 0 new comments -
CUDA failure 100: no CUDA-capable device is detected ; error when inferencing on a GPUVM
#11561 commented on
Jul 7, 2025 • 0 new comments -
Specify CPUs to use for parallel inference when external CPU pinning is used
#11563 commented on
Jul 7, 2025 • 0 new comments -
[js/web] Inference is Broken in Safari when Cross Origin Isolation is active
#11567 commented on
Jul 7, 2025 • 0 new comments -
Header mismatch C/C++ - mac
#11570 commented on
Jul 7, 2025 • 0 new comments -
The effect of turning optimization on and off on quantized model performance
#11576 commented on
Jul 7, 2025 • 0 new comments -
ONNXRUNTIME + OpenVINO on ARM64
#11582 commented on
Jul 7, 2025 • 0 new comments -
cpu and gpu results is not the same
#11590 commented on
Jul 7, 2025 • 0 new comments -
can onnxruntime be built from source with any CUDA version, regardless of the onnxruntime version?
#11584 commented on
Jul 7, 2025 • 0 new comments -
CUDNN failure 4: CUDNN_STATUS_INTERNAL_ERROR ; error when inferencing on a GPUVM
#11592 commented on
Jul 7, 2025 • 0 new comments -
issues with pybind11 repository while installing
#11595 commented on
Jul 7, 2025 • 0 new comments -
Bad performance for QDQ model with openvino EP
#11604 commented on
Jul 7, 2025 • 0 new comments -
Unable to build onnxruntime_v1.10.0 C++ api with --enable_memory_profile --enable_cuda_profiling flags
#11607 commented on
Jul 7, 2025 • 0 new comments -
Shape inference fails
#11614 commented on
Jul 7, 2025 • 0 new comments -
building libonnxruntime_providers_cuda.so: Error running link command: No such file or directory
#11621 commented on
Jul 7, 2025 • 0 new comments -
how to set providers with onnx runtime-gpu1.70 ?
#11624 commented on
Jul 7, 2025 • 0 new comments -
using multithread to call onnxruntime inference,
#11628 commented on
Jul 7, 2025 • 0 new comments -
which tags should i download of onnxruntime-gpu 1.6 for c#
#11646 commented on
Jul 7, 2025 • 0 new comments -
ONNX Runtime compatibility for Jetson AGX Xavier
#11378 commented on
Jul 7, 2025 • 0 new comments -
About running onnxruntime in singularity container
#11397 commented on
Jul 7, 2025 • 0 new comments -
Benchmark code using torch.onnx.export
#11399 commented on
Jul 7, 2025 • 0 new comments -
About building onnxruntime singularity container with DockerFile
#11409 commented on
Jul 7, 2025 • 0 new comments -
Static quantization+per_channel is wrong for MobileNetV3
#11415 commented on
Jul 7, 2025 • 0 new comments -
Can I quantize TreeEnsembleClassifier op?
#11436 commented on
Jul 7, 2025 • 0 new comments -
Onnx T5 fp16 conversion without past_key_values
#11438 commented on
Jul 7, 2025 • 0 new comments -
How to run a double input onnx model
#11453 commented on
Jul 7, 2025 • 0 new comments -
InferenceSession giving different results than the original sklearn SVC model
#11490 commented on
Jul 7, 2025 • 0 new comments -
C#, How to access the different output layer of inference (semantic segmentation)
#11502 commented on
Jul 7, 2025 • 0 new comments -
[Documentation Request]
#11505 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime error
#11509 commented on
Jul 7, 2025 • 0 new comments -
[Documentation Request] tensorAt for Csharp?
#11510 commented on
Jul 7, 2025 • 0 new comments -
About Convolution Implementation
#11517 commented on
Jul 7, 2025 • 0 new comments -
Tile fails for scalars on CPU
#11523 commented on
Jul 7, 2025 • 0 new comments -
How to release a session properly?
#11529 commented on
Jul 7, 2025 • 0 new comments -
Fail to convert model with reusable blocks
#11530 commented on
Jul 7, 2025 • 0 new comments -
CPUExecutionProvider outputs wrong value for a quantized model
#11532 commented on
Jul 7, 2025 • 0 new comments -
TensorRT EP session creation fails with invalid weights type of Int8 when ORT_TENSORRT_INT8_ENABLE set to 1
#11535 commented on
Jul 7, 2025 • 0 new comments -
Using a model with float input types causes space issue
#11541 commented on
Jul 7, 2025 • 0 new comments -
[Performance] inference time much slower (1529ms vs. 20 ms) on GPU vs CPU.
#13199 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Performance issue on Linux vs Windows for BERT model.
#13224 commented on
Jul 7, 2025 • 0 new comments -
Contrib IRFFT operator output dimensions calculation
#13236 commented on
Jul 7, 2025 • 0 new comments -
Onnx create session takes a long time.
#13240 commented on
Jul 7, 2025 • 0 new comments -
Inference time spikes in UNET onnx
#13258 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Too Slow when i do inference
#13265 commented on
Jul 7, 2025 • 0 new comments -
[Mobile] .Net target Arm64
#13295 commented on
Jul 7, 2025 • 0 new comments -
[ONNXRuntimeError] : 1 : FAIL : This is an invalid model. Error: the graph is not acyclic.
#13322 commented on
Jul 7, 2025 • 0 new comments -
onnx Pad operator with negative pads value outputs 'nan'
#13332 commented on
Jul 7, 2025 • 0 new comments -
[Build] Upgrade to latest protobuf
#13335 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Comparing ONNX CPU execution profiles of two FasterRCNN checkpoints
#13341 commented on
Jul 7, 2025 • 0 new comments -
[Build] ONNX Runtime Build Error ZCU102 (DPUCZDX8G)
#13351 commented on
Jul 7, 2025 • 0 new comments -
quantize_dynamic results in initializer error
#13358 commented on
Jul 7, 2025 • 0 new comments -
[Performance] CNN Inference has latency spikes with TensorRT EP
#13366 commented on
Jul 7, 2025 • 0 new comments -
Onnxruntime crashes if setting cpu affinity fails in Ort::Session constructor
#13367 commented on
Jul 7, 2025 • 0 new comments -
Using GPU in c++
#13380 commented on
Jul 7, 2025 • 0 new comments -
Can't run qdq model with TRT EP
#13381 commented on
Jul 7, 2025 • 0 new comments -
Whether the .trt model can be loaded
#13394 commented on
Jul 7, 2025 • 0 new comments -
Does ORT support quantize
#13413 commented on
Jul 7, 2025 • 0 new comments -
Why the performance of onednn is worse than the common version
#12315 commented on
Jul 7, 2025 • 0 new comments -
How to set cpu_num to a specific value?
#12819 commented on
Jul 7, 2025 • 0 new comments -
AttentionPastState_dynamic test fails during building with CUDA EP from source
#12820 commented on
Jul 7, 2025 • 0 new comments -
Memory management
#12824 commented on
Jul 7, 2025 • 0 new comments -
error: package directory 'onnxruntime/backend' does not exist [Build]
#12922 commented on
Jul 7, 2025 • 0 new comments -
[Web] Failed to compile shader on WebGL
#12927 commented on
Jul 7, 2025 • 0 new comments -
Disabling optimization produces incorrect results on CUDAExecutionProvider in 1.12
#12946 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Dynamic model input prediction is slow
#12955 commented on
Jul 7, 2025 • 0 new comments -
Why is there not ParallelExecutionPlan like SequentialExecutionPlan in the ParallelExecutor of onnxruntime?
#13036 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime calculate gradients but no need for training
#13057 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime-gpu, cudaoptions, result is different
#13061 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime-node crash the electron app[Web]
#13086 commented on
Jul 7, 2025 • 0 new comments -
what's the differences between onnxruntime with openvino backend VS openvino directly?
#13087 commented on
Jul 7, 2025 • 0 new comments -
[Performance] a problem for Ort::IoBinding
#13090 commented on
Jul 7, 2025 • 0 new comments -
[Performance] ONNX Runtime GPT2 Model Running Significantly Slower than PyTorch
#13105 commented on
Jul 7, 2025 • 0 new comments -
[Test issue] Updated Ignore
#13109 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Multithreading performance tails off after 3 threads, possible memory issue
#13138 commented on
Jul 7, 2025 • 0 new comments -
Failed to create CUDAExecutionProvider
#13139 commented on
Jul 7, 2025 • 0 new comments -
Onnxruntime fails on GPU loading inference with int8 models
#13168 commented on
Jul 7, 2025 • 0 new comments -
Multilingual-MiniLM-L12-H384 ONNX inference in NodeJS
#13171 commented on
Jul 7, 2025 • 0 new comments -
[Build]
#13554 commented on
Jul 7, 2025 • 0 new comments -
ORT fails on CPU looking for LayerNormalization node, for mixed-precision ONNX
#13556 commented on
Jul 7, 2025 • 0 new comments -
[TVM] Exception during initialization
#13572 commented on
Jul 7, 2025 • 0 new comments -
unable to build onnxruntime for openvino execution provider to get nuget packages
#13577 commented on
Jul 7, 2025 • 0 new comments -
Does Microsoft.ML.OnnxRuntime have a dependency on System.CodeDom.dll ?
#13604 commented on
Jul 7, 2025 • 0 new comments -
[Build]
#13606 commented on
Jul 7, 2025 • 0 new comments -
[C++] Model output image different in C++ ORT vs. Python ORT & PyTorch
#13614 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Operators assigned to CPU instead of CUDA
#13615 commented on
Jul 7, 2025 • 0 new comments -
Dimension Padding problem in reduction_ops.cc
#13654 commented on
Jul 7, 2025 • 0 new comments -
[Performance] onnxruntime session uses 5x more system memory if torch is imported
#13662 commented on
Jul 7, 2025 • 0 new comments -
GPT2 Static Quantization Failed. Non-zero status code returned while running Reshape node. Name:'past_0_ReduceMax_Reshape'
#13667 commented on
Jul 7, 2025 • 0 new comments -
Help in running onnxruntime with SNPE as execution provider
#13693 commented on
Jul 7, 2025 • 0 new comments -
GPU with device_id=0 is always occupied no matter what device_id is specified when run the inference
#13697 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime-gpu get warning "Serializing optimized model with Graph Optimization level greater than ORT_ENABLE_EXTENDED and the NchwcTransformer enabled".
#13709 commented on
Jul 7, 2025 • 0 new comments -
[DML] reproducible bug on DML provider
#13714 commented on
Jul 7, 2025 • 0 new comments -
[Build] Avoid NEON when building on Raspberry Pi 4
#13718 commented on
Jul 7, 2025 • 0 new comments -
[Web] Uncaught (in promise) TypeError: cannot resolve operator 'Erf' with opsets: ai.onnx v15
#13729 commented on
Jul 7, 2025 • 0 new comments -
[Web] NPM package include ts files in the output
#13736 commented on
Jul 7, 2025 • 0 new comments -
[Web]
#13749 commented on
Jul 7, 2025 • 0 new comments -
High Output Difference between ONNX model with different optimizer settings
#18959 commented on
Jul 7, 2025 • 0 new comments -
ONNX Runtime Inference on GPU: Failed to create CUDAExecutionProvider
#13414 commented on
Jul 7, 2025 • 0 new comments -
Consecutive casting leads to wrong result
#13418 commented on
Jul 7, 2025 • 0 new comments -
Parameters are optimized out even if it is a needed return value
#13425 commented on
Jul 7, 2025 • 0 new comments -
[Web] Is it possible to use both webgl backend and wasm backend in onnxruntime-web
#13435 commented on
Jul 7, 2025 • 0 new comments -
run_with_iobinding is not outputting the expected result for batched input data for T5 model running on ort CUDA EP
#13463 commented on
Jul 7, 2025 • 0 new comments -
GPU Arena blocked session->Run()
#13464 commented on
Jul 7, 2025 • 0 new comments -
Consecutive call to Ort::Session::Run() crashes
#13476 commented on
Jul 7, 2025 • 0 new comments -
does onnxruntime-gpu support calling CUDA code or a custom kernel function to preprocess images?
#13491 commented on
Jul 7, 2025 • 0 new comments -
[Performance]
#13492 commented on
Jul 7, 2025 • 0 new comments -
ORT fails on Slice() when indices are of different integer types
#13497 commented on
Jul 7, 2025 • 0 new comments -
Init provider bridge failed when put onnxruntime folder under path which contains other Unicode character
#13499 commented on
Jul 7, 2025 • 0 new comments -
[Performance]
#13500 commented on
Jul 7, 2025 • 0 new comments -
[Performance] C# Gpu memory allocation
#13504 commented on
Jul 7, 2025 • 0 new comments -
Removing the semantic segmentation's bounding box
#13513 commented on
Jul 7, 2025 • 0 new comments -
How to transfer the Ort::Value obtained to cuda code for post-processing, such as a .cu file?
#13528 commented on
Jul 7, 2025 • 0 new comments -
Unable to use LSTM with mask of dynamic shape with TensorrtExecutionProvider
#16885 commented on
Jul 7, 2025 • 0 new comments -
[Training] Whether onnxruntime training can be used in Megatron.
#13532 commented on
Jul 7, 2025 • 0 new comments -
How can I load a model larger than 2G in memory
#13543 commented on
Jul 7, 2025 • 0 new comments -
Zero Result with DirectML Execution Provider
#13545 commented on
Jul 7, 2025 • 0 new comments -
Inference speed: Swintransformer torch vs onnxruntime-gpu
#13550 commented on
Jul 7, 2025 • 0 new comments -
ONNXRT default CPU EP vs Openvino EP Performance
#12316 commented on
Jul 7, 2025 • 0 new comments -
onnx graph partition optimize
#12318 commented on
Jul 7, 2025 • 0 new comments -
Wrong native library directory name for M1 Mac in the Java package
#12324 commented on
Jul 7, 2025 • 0 new comments -
MetaCommand exception from DirectML EP
#12328 commented on
Jul 7, 2025 • 0 new comments -
Windows 10 ORT with OpenVINO backend error
#12334 commented on
Jul 7, 2025 • 0 new comments -
unsafe exception code in C++ API, wrongly declaring exceptions, incomplete constructors
#12338 commented on
Jul 7, 2025 • 0 new comments -
Unable to build Onnxruntime 1.12.0 with OpenVINO 2020.3 on Windows 10
#12342 commented on
Jul 7, 2025 • 0 new comments -
Quantized ONNX model output
#12346 commented on
Jul 7, 2025 • 0 new comments -
Performance gains by ONNX inconsistent
#12348 commented on
Jul 7, 2025 • 0 new comments -
Integer quantization fails on Transformer-based vision model
#12362 commented on
Jul 7, 2025 • 0 new comments -
Setting Openvino EP to run on one core with one thread
#12365 commented on
Jul 7, 2025 • 0 new comments -
Unable to build tensorrt docker image
#12373 commented on
Jul 7, 2025 • 0 new comments -
Accept dictionary of tensor as input (python api)
#12380 commented on
Jul 7, 2025 • 0 new comments -
Fail to build onnxRT with oneDNN using official build command
#12382 commented on
Jul 7, 2025 • 0 new comments -
Segmentation fault
#12386 commented on
Jul 7, 2025 • 0 new comments -
While loading the onnx file with InferenceSession getting session ID 11 error
#12402 commented on
Jul 7, 2025 • 0 new comments -
Failed to build with ACL(and ARMnn)
#12407 commented on
Jul 7, 2025 • 0 new comments -
Can't build with OpenVINO 2022.1 ("onnxruntime_providers_shared" does not exist)
#12411 commented on
Jul 7, 2025 • 0 new comments -
`Env(OrtLoggingLevel, const char* logid, OrtLoggingFunction, ...` fails to pass `logid` param to log function
#12414 commented on
Jul 7, 2025 • 0 new comments -
Inference time vs torch w/regard to batch_size and BatchNorm
#12130 commented on
Jul 7, 2025 • 0 new comments -
When will Attention OP extra_add_qk input support automatic broadcast
#12149 commented on
Jul 7, 2025 • 0 new comments -
Query regarding timings under ONNXRT profiler
#12150 commented on
Jul 7, 2025 • 0 new comments -
Hi Does ONNX Runtime support FP16 and INT8 inference on Intel OneDNN ExecutionProvider?
#12160 commented on
Jul 7, 2025 • 0 new comments -
Eager mode generator support non-tensor return types
#12163 commented on
Jul 7, 2025 • 0 new comments -
symbolic_shape_infer.py not working with models quantized with 🤗 Optimum for TensorRT
#12173 commented on
Jul 7, 2025 • 0 new comments -
upgrading pip and wheels kills CUDAExecutionProvider
#12185 commented on
Jul 7, 2025 • 0 new comments -
why is the first session.run much slower than subsequent runs
#12197 commented on
Jul 7, 2025 • 0 new comments -
Performance issue of ConvInteger
#12206 commented on
Jul 7, 2025 • 0 new comments -
How to release memory after Inference session run in Python
#12207 commented on
Jul 7, 2025 • 0 new comments -
Regarding the dynamism for custom op in ONNXRT
#12211 commented on
Jul 7, 2025 • 0 new comments -
Quantized Model Running Slow Using Cuda as EP
#12229 commented on
Jul 7, 2025 • 0 new comments -
Exported beam search model consumes a lot of more memory
#12246 commented on
Jul 7, 2025 • 0 new comments -
Mismatch in the order of the column names in the benchmarking script for transformer models
#12265 commented on
Jul 7, 2025 • 0 new comments -
LoadLibrary failed with error 126 (DirectML)
#12269 commented on
Jul 7, 2025 • 0 new comments -
TRT EP failed to create model session with CUDA custom op
#12282 commented on
Jul 7, 2025 • 0 new comments -
Since ORT 1.12 ort.InferenceSession throws error when the last provider is not capable
#12287 commented on
Jul 7, 2025 • 0 new comments -
SafeIntOnOverflow() Integer overflow error when running inference in an ASGI server
#12288 commented on
Jul 7, 2025 • 0 new comments -
Resize op can't work well under Cubic mode with ORT 1.12.
#12302 commented on
Jul 7, 2025 • 0 new comments -
Details regarding ONNXRuntime inference with OpenVino Backend
#12305 commented on
Jul 7, 2025 • 0 new comments -
[TEST FAILED] Several tests fails while running onnxruntime_test_all on armv7 based device
#16387 commented on
Jul 7, 2025 • 0 new comments -
Does ortvalue_from_numpy support directml?
#15421 commented on
Jul 7, 2025 • 0 new comments -
Confusing exception about supported types
#12648 commented on
Jul 7, 2025 • 0 new comments -
get kill signal when quantize the ONNX model using quantize_static
#12652 commented on
Jul 7, 2025 • 0 new comments -
Enable Global Shared Threadpool and Memory Allocator For C#
#12654 commented on
Jul 7, 2025 • 0 new comments -
Non-zero status code returned while running TopK node. (ssdlite320_mobilenet_v3_large)
#12669 commented on
Jul 7, 2025 • 0 new comments -
Mixed Precision ValueError: validation failed for model with all nodes in node_block_list
#14235 commented on
Jul 7, 2025 • 0 new comments -
[CPU EP] GatherND crashes with division by zero when batch dimensions mismatch between input and indices
#23828 commented on
Jul 7, 2025 • 0 new comments -
perf_view shows nothing after json load
#15927 commented on
Jul 7, 2025 • 0 new comments -
GPU Memory allocation with multiple cuda stream
#12920 commented on
Jul 7, 2025 • 0 new comments -
Wrong Results for FP16 Models in CUDAExecutionProvider and TensorRTExecutionProvider
#12726 commented on
Jul 7, 2025 • 0 new comments -
`static inline Ort::Env onnx_env{nullptr}` easily leads to nullptr deref on app exit
#12736 commented on
Jul 7, 2025 • 0 new comments -
SystemError : 13 for transformers optimizer
#12745 commented on
Jul 7, 2025 • 0 new comments -
BatchNormalization produces all zeros for 1D input
#12754 commented on
Jul 7, 2025 • 0 new comments -
How to set the priority of ONNX in GPU?
#12760 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime-linux-x64-gpu-1.12.1
#12766 commented on
Jul 7, 2025 • 0 new comments -
Asynchronous Inference
#12768 commented on
Jul 7, 2025 • 0 new comments -
I want to use tensorrt as the back-end of onnx
#12781 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : CUDA error executing cudaSetDevice(GetDeviceId())
#12785 commented on
Jul 7, 2025 • 0 new comments -
cast op not support multithread
#12786 commented on
Jul 7, 2025 • 0 new comments -
CUDA support for longer-input models like BigBird
#12463 commented on
Jul 7, 2025 • 0 new comments -
I found that the OnnxRuntime used almost all of the instruction sets for the convolutional computations and I wanted to optimize for that
#12479 commented on
Jul 7, 2025 • 0 new comments -
How to exit abnormally in the Python Operator (PyOp)
#12481 commented on
Jul 7, 2025 • 0 new comments -
QDQ + Add nodes are not fused into QLinearAdd when the graph is optimized
#12487 commented on
Jul 7, 2025 • 0 new comments -
performance is poor when onnxruntime C++ run in intel cpu
#12489 commented on
Jul 7, 2025 • 0 new comments -
LSTM Y output is inconsistent with TF inference result when seq_len is effective
#12492 commented on
Jul 7, 2025 • 0 new comments -
Clarify NMS sorting strategy
#12493 commented on
Jul 7, 2025 • 0 new comments -
Attributes in nested function calls are zeroed out
#12506 commented on
Jul 7, 2025 • 0 new comments -
Computing loss within onnxrunitme inference (GPT2 model)
#12526 commented on
Jul 7, 2025 • 0 new comments -
java deploy in k8s Failed to load library libonnxruntime_providers_cuda.so with error
#12540 commented on
Jul 7, 2025 • 0 new comments -
engine decryption does not work in TensorRT EP
#12551 commented on
Jul 7, 2025 • 0 new comments -
Add execution provider selection for quantize_static
#12573 commented on
Jul 7, 2025 • 0 new comments -
Document beamsearch
#12584 commented on
Jul 7, 2025 • 0 new comments -
Name:'MatMul_32007' Status Message: matmul_helper.h:61 Compute MatMul dimension mismatch
#12594 commented on
Jul 7, 2025 • 0 new comments -
Run the onnx model converted from seq2seq and report an error
#12608 commented on
Jul 7, 2025 • 0 new comments -
Where is the definition of session.Run() in onnxruntime C++ api
#12623 commented on
Jul 7, 2025 • 0 new comments -
cuda_provider_options.h include non existing file?
#12636 commented on
Jul 7, 2025 • 0 new comments -
The quantization model reduces the accuracy compared to the TRT
#12638 commented on
Jul 7, 2025 • 0 new comments -
Failed to create TensorrtExecutionProvider using onnxruntime-gpu
#12639 commented on
Jul 7, 2025 • 0 new comments -
[C-Api] Dynamic Shape Error: Non-zero status code returned while running Sigmoid node.
#6372 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime v1.6.0 on Jetson Nano - Illegal Instruction (core dumped)
#6375 commented on
Jul 7, 2025 • 0 new comments -
How to extract dimension of inputs in onnxruntime/core/providers/cpu/math/matmul.cc
#6396 commented on
Jul 7, 2025 • 0 new comments -
Which executor to build when using: Intel® Deep Learning Boost (Intel® DL Boost)
#6400 commented on
Jul 7, 2025 • 0 new comments -
[question] Configure GPU arena with Python bindings
#6411 commented on
Jul 7, 2025 • 0 new comments -
Onnxruntime error when Relu-layer follows Dense-layer without activation and biases
#6423 commented on
Jul 7, 2025 • 0 new comments -
Reshape `requested_shape` forced to have leading dimension 1 when it should be -1
#6424 commented on
Jul 7, 2025 • 0 new comments -
Build failure in `orttraining_pybind_state.cc` when building with `--enable_training` and `--build_wheel`
#6536 commented on
Jul 7, 2025 • 0 new comments -
NaN in AveragePooling
#6543 commented on
Jul 7, 2025 • 0 new comments -
Loss of accuracy when GPT-2 based model is exported to ONNX
#6549 commented on
Jul 7, 2025 • 0 new comments -
Custom Op Registration and Implementation
#6564 commented on
Jul 7, 2025 • 0 new comments -
Inference error using migraphx-onnxruntime
#6605 commented on
Jul 7, 2025 • 0 new comments -
/onnxruntime/core/mlas/lib/quantize.cpp:50:62: error: ‘vminnmq_f32’ was not declared in this scope
#6638 commented on
Jul 7, 2025 • 0 new comments -
Failed to add Microsoft.AI.MachineLearning NuGet package to .NET Framework 4.6.1 projects
#6662 commented on
Jul 7, 2025 • 0 new comments -
INT8 quantized model is very slow
#6732 commented on
Jul 7, 2025 • 0 new comments -
Shape inference error for Range node
#6737 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime-gpu (cudaexecutionprovider) usage of cudnn autotuner
#6744 commented on
Jul 7, 2025 • 0 new comments -
Unable to compile on Linux with CUDA
#6749 commented on
Jul 7, 2025 • 0 new comments -
Onnxruntime inference with Integrated GPU Failed
#6755 commented on
Jul 7, 2025 • 0 new comments -
Check if GPU is available
#15942 commented on
Jul 9, 2025 • 0 new comments -
Onnx Batch Processing
#6044 commented on
Jul 7, 2025 • 0 new comments -
How to extract the size of a map type in c++?
#6077 commented on
Jul 7, 2025 • 0 new comments -
could the checkpoint of bert convert to onnx model? I have a bug that 'BertForPreTraining' object has no attribute 'layers, output'
#6089 commented on
Jul 7, 2025 • 0 new comments -
how to implement execution provider (EP) that allow onnx run on my hardware?
#6110 commented on
Jul 7, 2025 • 0 new comments -
32bit vs 64bit when compiling or something else?
#6144 commented on
Jul 7, 2025 • 0 new comments -
GPU memory consumption keeps increasing with multithreading in Java
#6181 commented on
Jul 7, 2025 • 0 new comments -
Not support rtx 3000 series
#6213 commented on
Jul 7, 2025 • 0 new comments -
sample c++ program just print "hello" does not start
#6243 commented on
Jul 7, 2025 • 0 new comments -
Cannot create OnnxTensor with UINT8 type.
#6261 commented on
Jul 7, 2025 • 0 new comments -
Referencing Microsoft.ML.OnnxRuntime and Microsoft.ML.OnnxRuntime.GPU in a c# project.
#6264 commented on
Jul 7, 2025 • 0 new comments -
Could onnxruntime be compiled into wasm using emsdk?
#6275 commented on
Jul 7, 2025 • 0 new comments -
Performance shaking
#6301 commented on
Jul 7, 2025 • 0 new comments -
[Bug] Wrong implementation in LpPool
#6302 commented on
Jul 7, 2025 • 0 new comments -
Memory corruption when using OnnxRuntime with OpenVINO on the Intel MyriadX and Raspberry Pi 4B
#6304 commented on
Jul 7, 2025 • 0 new comments -
[ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Error while inferencing DLRM onnx model
#6319 commented on
Jul 7, 2025 • 0 new comments -
Error: Running double precision model exported from pyTorch
#6320 commented on
Jul 7, 2025 • 0 new comments -
The output for GPT is NAN when fp16=True
#6328 commented on
Jul 7, 2025 • 0 new comments -
ROCm build seems broken: `error: ‘ncclComm_t’ does not name a type`
#6358 commented on
Jul 7, 2025 • 0 new comments -
Implementation of ONNX Functions
#6360 commented on
Jul 7, 2025 • 0 new comments -
Run a model containing CustomOp with TensorRT provider fails
#7314 commented on
Jul 7, 2025 • 0 new comments -
C# console app crash upon appending OpenVino execution provider
#7330 commented on
Jul 7, 2025 • 0 new comments -
Cannot save Tensorrt .engine model in v1.7.1
#7339 commented on
Jul 7, 2025 • 0 new comments -
openvino continued package by pyinstaller external dll issue
#7346 commented on
Jul 7, 2025 • 0 new comments -
Resize Operator rounds-down instead of round-to-even for int32/uint8
#7368 commented on
Jul 7, 2025 • 0 new comments -
How to compile the framework that can run in Windows XP?
#7444 commented on
Jul 7, 2025 • 0 new comments -
Please, update the docs. Provider parameter "cuda_mem_limit" was renamed to "gpu_mem_limit" in nightly build.
#7457 commented on
Jul 7, 2025 • 0 new comments -
How to release gpu memory without exiting the process?
#7463 commented on
Jul 7, 2025 • 0 new comments -
Running inference using GPU or TensorRT on Jetson
#7484 commented on
Jul 7, 2025 • 0 new comments -
Problem compiling ONNX RT with CUDA and TensorRT on Windows
#7562 commented on
Jul 7, 2025 • 0 new comments -
Use of torch InstanceNorm2d and dynamic tensor size causes crash
#7572 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime build is not compatible with onnx build. Protobuf loaded twice.
#7597 commented on
Jul 7, 2025 • 0 new comments -
Large GPU memory usage with EXHAUSTIVE cuDNN search
#7612 commented on
Jul 7, 2025 • 0 new comments -
Enable CUDA provider option configuration in Java
#7613 commented on
Jul 7, 2025 • 0 new comments -
Publish the providers with the release build
#7628 commented on
Jul 7, 2025 • 0 new comments -
Build fails with --use_rknpu
#7614 commented on
Jul 7, 2025 • 0 new comments -
int8 quantization on GPU support? (transformers)
#7634 commented on
Jul 7, 2025 • 0 new comments -
Does onnxruntime support bert with relative position embedding
#7713 commented on
Jul 7, 2025 • 0 new comments -
quantized model can't run on GPU?
#7745 commented on
Jul 7, 2025 • 0 new comments -
Loading a Keras model with custom layers into Microsoft.ML
#10419 commented on
Jul 7, 2025 • 0 new comments -
Onnxruntime.gpu is as slower than cpu mode
#6799 commented on
Jul 7, 2025 • 0 new comments -
Multiple input and multiple output models that create tensors in loops can cause serious crashes
#6821 commented on
Jul 7, 2025 • 0 new comments -
ONNXRuntime Inference with Finetuned BERT Model outputting odd results
#6830 commented on
Jul 7, 2025 • 0 new comments -
Unable to build onnxruntime with "--build_wheel" and "--enable_pybind" options
#6841 commented on
Jul 7, 2025 • 0 new comments -
[JAVA Bindings + Android arm64-v8a] ONNXRuntime build documentation
#6923 commented on
Jul 7, 2025 • 0 new comments -
dynamic shape input is much slower than fixed shape input in gpu
#6978 commented on
Jul 7, 2025 • 0 new comments -
CUDA header requested but missing in DNNL part of ORT 1.7.1
#7005 commented on
Jul 7, 2025 • 0 new comments -
Build fail for docker on MacOS. -NO GPU.
#7052 commented on
Jul 7, 2025 • 0 new comments -
Large Memory Allocations When Loading RandomForestRegressor Model
#7067 commented on
Jul 7, 2025 • 0 new comments -
Non-zero status code returned while running BatchNormalization node
#7095 commented on
Jul 7, 2025 • 0 new comments -
Memory and timing issue with onnxruntime python API with TensorFlow model
#7106 commented on
Jul 7, 2025 • 0 new comments -
Compile error in header onnxruntime_cxx_api.h when update ONNX runtime from 1.5.2 to 1.7.1
#7142 commented on
Jul 7, 2025 • 0 new comments -
Batch inference
#7178 commented on
Jul 7, 2025 • 0 new comments -
Segmentation fault when running onnxruntime inside docker with cpuset restrictions
#7207 commented on
Jul 7, 2025 • 0 new comments -
Significant difference in the performance of pytorch and exported onnx models
#7212 commented on
Jul 7, 2025 • 0 new comments -
TensorrtExecutionProvider slower than CUDAExecutionProvider: Transformers
#7230 commented on
Jul 7, 2025 • 0 new comments -
The speed of running the onnx model is 6x slower than the pytorch model on Jetson TX2
#7233 commented on
Jul 7, 2025 • 0 new comments -
[Python API + ARM64] Running ResNet50 on ARM board using ACL Error and Performance Issue
#7234 commented on
Jul 7, 2025 • 0 new comments -
ACL (32bit) Execution Provider fails on gemm node
#7255 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime gpu version can't be installed, how to fix it?
#7272 commented on
Jul 7, 2025 • 0 new comments -
System.ExecutionEngineException creating Microsoft.ML.OnnxRuntime.SessionOptions
#23263 commented on
Jul 9, 2025 • 0 new comments -
onnxruntime produces invalid results due to the wrong shape inference for the clip operator
#24971 commented on
Jul 9, 2025 • 0 new comments -
Focus is not visible on Scrolling cards under Trusted By section: A11y_ONNX Runtime & Ecosystem_Runtime_Focus Visible
#24996 commented on
Jul 9, 2025 • 0 new comments -
The behavior of Gather/GatherElements/GatherND when the indices values are out-of-bounds
#25251 commented on
Jul 9, 2025 • 0 new comments -
MaxPool produces results with wrong shape
#25234 commented on
Jul 9, 2025 • 0 new comments -
Multi-threaded GPU inferencing failing with whisper-small: Non-zero status code returned while running DecoderMaskedMultiHeadAttention node
#21413 commented on
Jul 9, 2025 • 0 new comments -
Significant loading time with TensorRT compared to not
#4018 commented on
Jul 9, 2025 • 0 new comments -
preprocess issues around MeanReduce/Reshape nodes and negative axes
#23868 commented on
Jul 8, 2025 • 0 new comments -
Reshape with a `0` dimension produces incorrect shape
#15203 commented on
Jul 8, 2025 • 0 new comments -
Creating ORT inference session from onnx model gives segmentation fault
#24087 commented on
Jul 8, 2025 • 0 new comments -
[Web] `Error: using ceil() in shape computation is not yet supported for AveragePool`
#21206 commented on
Jul 8, 2025 • 0 new comments -
[Build] .pc file asks for -lonnxruntime but onnxruntime.a isn't installed
#23959 commented on
Jul 8, 2025 • 0 new comments -
Onnx Runtime for Java is packaged with 200MB onnxruntime.pdb in the win-x64 native package
#12084 commented on
Jul 8, 2025 • 0 new comments -
[OpenVINO] SessionOptionsAppendExecutionProvider_OpenVINO API loads NULL config file
#23871 commented on
Jul 8, 2025 • 0 new comments -
[Performance] Increased memory usage when loading from bytes
#21165 commented on
Jul 8, 2025 • 0 new comments -
[Web] `Error: [WebGPU] Kernel "[Conv] /text_encoder/encoder/layers.0/feed_forward/conv_2/Conv" failed. Error: FILTER_IN_CHANNEL should be equal to DATA_CHANNEL`
#21108 commented on
Jul 8, 2025 • 0 new comments -
[Performance] Dynamic Shape performance
#13198 commented on
Jul 8, 2025 • 0 new comments -
[Build] Mismatched library directory in linux-x64 package: lib and lib64
#22267 commented on
Jul 7, 2025 • 0 new comments -
Use env. allocators for initializers (#25108)
#25281 commented on
Jul 11, 2025 • 0 new comments -
Upgrade xnnpack to latest
#25275 commented on
Jul 11, 2025 • 0 new comments -
[ARM CPU] SVE support for Elementwise kernels
#25238 commented on
Jul 10, 2025 • 0 new comments -
[webgpu] extend cast version to 23
#25235 commented on
Jul 9, 2025 • 0 new comments -
[Don't review][webgpu] Support sg_size=32 for dp4 shader
#25184 commented on
Jul 7, 2025 • 0 new comments -
[QNN_EP] Implement Efficient Mode API
#25146 commented on
Jul 11, 2025 • 0 new comments -
Compile API: support for OrtModel input and write output to stream
#24740 commented on
Jul 7, 2025 • 0 new comments -
Prototype getting EP graph partitioning info from OrtSession
#24688 commented on
Jul 9, 2025 • 0 new comments -
Migrate issue labeler workflow to issueLabeler.yml policy
#21659 commented on
Jul 7, 2025 • 0 new comments -
Output mismatch in QDQ model with optimizations enabled vs disabled (CPU Execution Provider)
#25259 commented on
Jul 11, 2025 • 0 new comments -
Memory safety for Nvidia GPU time-slicing
#24943 commented on
Jul 10, 2025 • 0 new comments -
[Build] Build fails: 'error : no operator "+=" matches these operands' with nv_bfloat16
#25162 commented on
Jul 10, 2025 • 0 new comments -
[CUDA] Acquiring a CUDA allocator without loading a session.
#19420 commented on
Jul 10, 2025 • 0 new comments -
[Performance] ONNX Runtime: Concat and Slice ops fallback to CPU even with float32 and static shapes
#24999 commented on
Jul 10, 2025 • 0 new comments -
[Bug] Invalid type for QuantizeLinear dtype post-ORT optimizations
#25001 commented on
Jul 10, 2025 • 0 new comments -
AMD GPU-NPU
#25142 commented on
Jul 10, 2025 • 0 new comments -
how to release gpu memory when keeping the onnxruntime session around.
#9509 commented on
Jul 10, 2025 • 0 new comments -
[OpenVINO EP] GetCapability shouldn't override the NPU device type as CPU
#25164 commented on
Jul 10, 2025 • 0 new comments -
[CPU EP] Fail to run some WPT WebNN argMin/argMax conformance tests of uint32/uint64 types with the default CPU EP
#25183 commented on
Jul 10, 2025 • 0 new comments -
[Build] Unable to build ONNX Runtime 1.22 due to dependency update
#25098 commented on
Jul 9, 2025 • 0 new comments -
Performance comparison
#5834 commented on
Jul 7, 2025 • 0 new comments -
IOBindings in C++ API are missing a way to SynchronizeInputs.
#5857 commented on
Jul 7, 2025 • 0 new comments -
How to compile with vs2019, with the platform tool set "Visual Studio 2015 - Windows XP (v140_xp)", i want use it in xp system
#5859 commented on
Jul 7, 2025 • 0 new comments -
Quantized model much slower than full precision model
#5865 commented on
Jul 7, 2025 • 0 new comments -
Performance issue with operator Where on CPU
#5896 commented on
Jul 7, 2025 • 0 new comments -
Performance issue with operators SVMRegressor and SVMClassifier for RBF kernel on CPU
#5898 commented on
Jul 7, 2025 • 0 new comments -
failed:/onnxruntime_src/onnxruntime/core/graph/model_load_utils.h:47 void onnxruntime::model_load_utils::ValidateOpsetForDomain
#5905 commented on
Jul 7, 2025 • 0 new comments -
Support GCN
#5910 commented on
Jul 7, 2025 • 0 new comments -
EyeLike with dynamic shape results in error
#5917 commented on
Jul 7, 2025 • 0 new comments -
Can't train mnist in parallel
#5918 commented on
Jul 7, 2025 • 0 new comments -
could not open "tensorrt_provider_factory.h", "mkldnn_provider_factory.h"
#5925 commented on
Jul 7, 2025 • 0 new comments -
Dynamic shape gives wrong output
#5928 commented on
Jul 7, 2025 • 0 new comments -
Issue with Multi-GPU and GPU memory limit
#5939 commented on
Jul 7, 2025 • 0 new comments -
Drop support for Python 3.5
#5961 commented on
Jul 7, 2025 • 0 new comments -
cannot get the expected speed in onnxruntime
#5953 commented on
Jul 7, 2025 • 0 new comments -
Error using onnx model containing Bidirectional layer with MatMulAddFusion
#5955 commented on
Jul 7, 2025 • 0 new comments -
No opset import for domain 'com.microsoft'
#5971 commented on
Jul 7, 2025 • 0 new comments -
"undefined symbol" error occured, when I use ort.SessionOptions.register_custom_ops_library
#5984 commented on
Jul 7, 2025 • 0 new comments -
Under TRT EP, custom op cannot fall back to CUDA EP
#6002 commented on
Jul 7, 2025 • 0 new comments -
Inconsistent inference time between C Python API [Megatron-LM]
#6025 commented on
Jul 7, 2025 • 0 new comments -
Microsoft.AI.MachineLearning cannot be used in UWP app on Windows 10 ARM64
#4686 commented on
Jul 7, 2025 • 0 new comments -
Debugging capability of onnxruntime in Visual Studio 2019 incapacitated
#4812 commented on
Jul 7, 2025 • 0 new comments -
[WinML] [C++/WinRT] Clarify how to share Ort::Env environments with WinRT/WinML instances
#4971 commented on
Jul 7, 2025 • 0 new comments -
C Sharp API for openvino doesn't run on GPU
#5011 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime-gpu installation issues
#5020 commented on
Jul 7, 2025 • 0 new comments -
program gets stuck when using multiple processes
#5093 commented on
Jul 7, 2025 • 0 new comments -
Exception thrown from Dispose method (When missing dependency)
#5250 commented on
Jul 7, 2025 • 0 new comments -
DLRM model failure to execute on GPU
#5295 commented on
Jul 7, 2025 • 0 new comments -
Running quantized models on GPU
#5359 commented on
Jul 7, 2025 • 0 new comments -
Can Session::Run be const?
#5558 commented on
Jul 7, 2025 • 0 new comments -
ML.NET issue while Using yolov4 onnx model
#5593 commented on
Jul 7, 2025 • 0 new comments -
Passing Non-Const pointer to Session::Run() using CPP Api
#5597 commented on
Jul 7, 2025 • 0 new comments -
How to reduce memory used?
#5711 commented on
Jul 7, 2025 • 0 new comments -
openvino nuget build failed
#5749 commented on
Jul 7, 2025 • 0 new comments -
How to load a pytorch model with input shape of (None, 32) using the C# inference API?
#5781 commented on
Jul 7, 2025 • 0 new comments -
Any support for double type tensors when loading a pytorch onnx model?
#5782 commented on
Jul 7, 2025 • 0 new comments -
memory keeps increasing with dynamic input shape of the network
#5796 commented on
Jul 7, 2025 • 0 new comments -
Memory usage with Cuda ExecutionProvider
#5801 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : The node is not placed on any Execution Provider. OneHot(11) (node while/cond_5/one_hot).
#5825 commented on
Jul 7, 2025 • 0 new comments -
Non-zero status code returned while running Div node
#5830 commented on
Jul 7, 2025 • 0 new comments -
ort-web Error: invalid input shape. when using webgl backend and there is a torch.nn.BatchNorm1d layer in the network
#10437 commented on
Jul 7, 2025 • 0 new comments -
cast BatchNorm2d to int32
#10440 commented on
Jul 7, 2025 • 0 new comments -
TensorRT input: 717 has no shape specified.
#10443 commented on
Jul 7, 2025 • 0 new comments -
raise Exception("Incomplete symbolic shape inference") when running "symbolic_shape_infer.py"
#10484 commented on
Jul 7, 2025 • 0 new comments -
C++ OnnxRuntime-GPU Slower than Python OnnxRuntime-GPU/C++ OnnxRuntime-CPU
#10492 commented on
Jul 7, 2025 • 0 new comments -
slower after graph optimization!
#10538 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime and onnxruntime-gpu produce different output for ReduceL1 operator
#10542 commented on
Jul 7, 2025 • 0 new comments -
Running maskrcnn onnx from pytorch with inference in c++ on gpu sometimes errors
#10543 commented on
Jul 7, 2025 • 0 new comments -
Exception in DirectML on second inference run
#10546 commented on
Jul 7, 2025 • 0 new comments -
Unit Tests failure while building on Windows with CUDA EP
#10561 commented on
Jul 7, 2025 • 0 new comments -
Building Error
#10600 commented on
Jul 7, 2025 • 0 new comments -
OpenVINO Execution Provider's CPU utilization is low
#10601 commented on
Jul 7, 2025 • 0 new comments -
How to use OpenVINO GetAvailableDevices?
#10602 commented on
Jul 7, 2025 • 0 new comments -
why it takes 200 seconds to run onnxruntime.InferenceSession
#10608 commented on
Jul 7, 2025 • 0 new comments -
Building OnnxRuntime v1.10.0 with CUDAExecutionProvider for sm_75 GPU fails in CUDA10.2 environment
#10610 commented on
Jul 7, 2025 • 0 new comments -
C++ onnxruntime GPU is ten times slower than CPU
#10611 commented on
Jul 7, 2025 • 0 new comments -
Optimization for T5 transformer models.
#10613 commented on
Jul 7, 2025 • 0 new comments -
[E:onnxruntime:, sequential_executor.cc:346 Execute] Non-zero status code returned while running Add node. Name:'Add_1363' Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/math/element_wise_ops.h:505 void onnxruntime::BroadcastIterator::Append(ptrdiff_t, ptrdiff_t) axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 9 by 505
#10618 commented on
Jul 7, 2025 • 0 new comments -
about providers and providers_options in InferenceSession
#10620 commented on
Jul 7, 2025 • 0 new comments -
Awful performance with LASER model when using TensorRT provider
#8315 commented on
Jul 7, 2025 • 0 new comments -
Sigmoid fails and output all zeros
#10154 commented on
Jul 7, 2025 • 0 new comments -
Why does onnxruntime run slower on C++?
#10155 commented on
Jul 7, 2025 • 0 new comments -
`InferenceSession` initialization hangs
#10166 commented on
Jul 7, 2025 • 0 new comments -
TensorRT EP failed to set INT8 dynamic range.
#10206 commented on
Jul 7, 2025 • 0 new comments -
how to use docker and onnxruntime to deploy an onnx model on GPU?
#10257 commented on
Jul 7, 2025 • 0 new comments -
Inconsistent inference timing on CPU
#10270 commented on
Jul 7, 2025 • 0 new comments -
Inference: time on GPU is similar to CPU. GPU gives no speedup
#10271 commented on
Jul 7, 2025 • 0 new comments -
multiple InferenceSessions slow down inference speed
#10273 commented on
Jul 7, 2025 • 0 new comments -
DnnlExecutionProvider is not visible in python API
#10275 commented on
Jul 7, 2025 • 0 new comments -
add QLinearMatMul do-not-quantize-per-channel flag to quantize_static extra options
#10283 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime inference is around 5 times slower than pytorch when using GPU
#10303 commented on
Jul 7, 2025 • 0 new comments -
Bug: pthread sent an error! undefined:undefined: ortWasmThreaded is not defined
#10311 commented on
Jul 7, 2025 • 0 new comments -
Quantized int8 onnx GPT2 model returns different tokens whether using past_key_values or not for the same sentence
#10322 commented on
Jul 7, 2025 • 0 new comments -
Onnxruntime multithread options [C++ CPU]
#10330 commented on
Jul 7, 2025 • 0 new comments -
Issues when trying to use Onnxruntime and Tensorrt execution provider in a java application
#10352 commented on
Jul 7, 2025 • 0 new comments -
Error building onnxruntime on Linux
#10364 commented on
Jul 7, 2025 • 0 new comments -
Error happened while building onnxruntime
#10378 commented on
Jul 7, 2025 • 0 new comments -
Question about hidden states in onnx DistilGPT2
#10382 commented on
Jul 7, 2025 • 0 new comments -
Is TensorRT execution provider caching thread-safe?
#10412 commented on
Jul 7, 2025 • 0 new comments -
Different inference results from python and C#
#10863 commented on
Jul 7, 2025 • 0 new comments -
Does WebGL fail when network input dimensions are not powers of two?
#10873 commented on
Jul 7, 2025 • 0 new comments -
TensorRT conversion support on Huggingface transformers quantized models.
#10888 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime/capi/onnxruntime_inference_collection.py", line 370, in _create_inference_session sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model) onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from onnx_data/cpm_large_opt.onnx failed:Protobuf parsing failed.
#10892 commented on
Jul 7, 2025 • 0 new comments -
python3 -m onnxruntime_tools.transformers.optimizer errors out for BERT when opt_level=1
#10893 commented on
Jul 7, 2025 • 0 new comments -
1 : Fail : Non-zero status code returned while running FusedConv node.
#10894 commented on
Jul 7, 2025 • 0 new comments -
After using onnxruntime.transformers.optimizer to optimize the onnx model, the optimized model fails to run with tensorrt
#10905 commented on
Jul 7, 2025 • 0 new comments -
TensorRT Execution [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization
#10914 commented on
Jul 7, 2025 • 0 new comments -
slow fp16 performance
#10919 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime TensorRT Related Questions
#10930 commented on
Jul 7, 2025 • 0 new comments -
MinGW support (MSYS2)
#10976 commented on
Jul 7, 2025 • 0 new comments -
docker can't clone git repository for ARM64
#10991 commented on
Jul 7, 2025 • 0 new comments -
Xor with broadcasting computes incorrect results
#11000 commented on
Jul 7, 2025 • 0 new comments -
Inconsistent behavior between CPU and GPU on ReLU operator when input is NaN
#11010 commented on
Jul 7, 2025 • 0 new comments -
0xc00007b error, could not start the exe at all with onnxruntime 1.7 win-x64 cpu on win10
#11016 commented on
Jul 7, 2025 • 0 new comments -
Failed to build onnxruntime-vitisai docker container due to missing NO_PUBKEY
#11017 commented on
Jul 7, 2025 • 0 new comments -
Huggingface Transformers Shape Inference Issue
#11019 commented on
Jul 7, 2025 • 0 new comments -
kalid-onnxruntime Fatal error: Gemm is not a registered function/op
#11021 commented on
Jul 7, 2025 • 0 new comments -
Updating state of the network
#11026 commented on
Jul 7, 2025 • 0 new comments -
Can't constant fold SequenceEmpty node
#11041 commented on
Jul 7, 2025 • 0 new comments -
How to use mimalloc in Linux?
#10629 commented on
Jul 7, 2025 • 0 new comments -
CPU & CUDA execution provider produce different value
#10636 commented on
Jul 7, 2025 • 0 new comments -
No libonnxruntime_providers_cuda.so generated?
#10639 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION
#10657 commented on
Jul 7, 2025 • 0 new comments -
Getting wrong results when using the webgl backend
#10673 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs initializer_node_arg != nullptr was false.
#10677 commented on
Jul 7, 2025 • 0 new comments -
Need help on the following items from the wiki-listed roadmap.
#10689 commented on
Jul 7, 2025 • 0 new comments -
Output shape mismatches the ONNX spec for Resize_tf_crop_and_size with scale input
#10727 commented on
Jul 7, 2025 • 0 new comments -
gpu onnxruntime lib
#10731 commented on
Jul 7, 2025 • 0 new comments -
Onnx model consumes huge CPU memory
#10742 commented on
Jul 7, 2025 • 0 new comments -
Inference of a QDQ model fails with the TRT EP.
#10743 commented on
Jul 7, 2025 • 0 new comments -
Build on Windows CPU is fine, but CUDA is not
#10745 commented on
Jul 7, 2025 • 0 new comments -
Is there a version of onnxruntime that is compatible with windows 7?
#10749 commented on
Jul 7, 2025 • 0 new comments -
Can it build successfully on Windows with a GeForce 1060 card, CUDA 11.0, cuDNN 8.0.2?
#10763 commented on
Jul 7, 2025 • 0 new comments -
very slow in inference
#10764 commented on
Jul 7, 2025 • 0 new comments -
ONNX models give slower inference in Python Multiprocessing
#10786 commented on
Jul 7, 2025 • 0 new comments -
Inference time of onnxruntime gpu increases at very high batch sizes
#10789 commented on
Jul 7, 2025 • 0 new comments -
Transformer optimizer outputs confusing error
#10838 commented on
Jul 7, 2025 • 0 new comments -
C++ is 10x slower compared with Python, CPU only
#10849 commented on
Jul 7, 2025 • 0 new comments -
Windows 32-bit performance much slower than 64-bit?
#10855 commented on
Jul 7, 2025 • 0 new comments -
Inference Speed is slow on GPU
#8316 commented on
Jul 7, 2025 • 0 new comments -
After 8-bit quantization, the GPU inference speed is very slow
#8330 commented on
Jul 7, 2025 • 0 new comments -
GPUs operate slower than CPUs
#8362 commented on
Jul 7, 2025 • 0 new comments -
error using C# TensorRT EP built from source
#8367 commented on
Jul 7, 2025 • 0 new comments -
Why must the cuda provider allocator be thread-local?
#8378 commented on
Jul 7, 2025 • 0 new comments -
ERROR running model inference:Non-zero status code returned while running Cast node
#8424 commented on
Jul 7, 2025 • 0 new comments -
Implement Split for double or float64 data type
#8382 commented on
Jul 7, 2025 • 0 new comments -
Found regression on ORT 1.8.1
#8513 commented on
Jul 7, 2025 • 0 new comments -
Does the onnxruntime.quantization.quantize_dynamic support GPU quantization?
#8524 commented on
Jul 7, 2025 • 0 new comments -
gpu memory cannot be released.
#8544 commented on
Jul 7, 2025 • 0 new comments -
Build failure of onnxruntime Docker container with Vitis-AI
#8596 commented on
Jul 7, 2025 • 0 new comments -
PrepareForCompute Non concat axis dimensions must match: Axis 0 has mismatched dimensions of 1 and 0
#8685 commented on
Jul 7, 2025 • 0 new comments -
error with torch.sum or torch.tensor.mean operator on GPU
#8742 commented on
Jul 7, 2025 • 0 new comments -
Symbolic shape inference error for loop node & seq(tensor)
#8755 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime Jetson tx2 cuda
#8771 commented on
Jul 7, 2025 • 0 new comments -
AttributeError: module 'onnxruntime' has no attribute 'set_default_logger_severity'
#8789 commented on
Jul 7, 2025 • 0 new comments -
IsNaN and Split have no double implementations
#8791 commented on
Jul 7, 2025 • 0 new comments -
Readily available Python wheels for ARM?
#8874 commented on
Jul 7, 2025 • 0 new comments -
cannot import name 'get_all_providers'
#8907 commented on
Jul 7, 2025 • 0 new comments -
TensorRT execution provider SEGFAULT
#7757 commented on
Jul 7, 2025 • 0 new comments -
CUDA kernel not found in registries for Op type: Pad
#7779 commented on
Jul 7, 2025 • 0 new comments -
ACL and ArmNN v21.02 EP has problem with GEMM
#7784 commented on
Jul 7, 2025 • 0 new comments -
get error when using a model with custom op
#7788 commented on
Jul 7, 2025 • 0 new comments -
Force fallback to CPU execution for Gather, Unsqueeze, Concat nodes - onnxruntime-gpu 1.7.0, opset 12 and 13
#7792 commented on
Jul 7, 2025 • 0 new comments -
How to get sparse tensor input in custom op?
#7838 commented on
Jul 7, 2025 • 0 new comments -
Non-zero status code returned while running FusedConv node. Name:'fused ' onnxruntime::OpKernelContext::Input Missing Input: input
#7853 commented on
Jul 7, 2025 • 0 new comments -
Build failure in onnxruntime/test/featurizers_ops/truncated_svdtransformer_test.cc
#7878 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime::BroadcastIterator::Init axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 15 by 64
#7888 commented on
Jul 7, 2025 • 0 new comments -
undefined reference to `onnx::optimization::GetAvailablePasses() on Nvidia Jetson NX
#7970 commented on
Jul 7, 2025 • 0 new comments -
Runtime exception during initialization of SparkML model (One falsenode is pointing either to itself, either to another tree.)
#8008 commented on
Jul 7, 2025 • 0 new comments -
Running a multiple-input-node onnx model using the onnxruntime C/C++ API
#8019 commented on
Jul 7, 2025 • 0 new comments -
Memory leak in free-dimension model in C++
#8053 commented on
Jul 7, 2025 • 0 new comments -
CUDAExecutionProvider does not handle Clip on float16 tensor.
#8070 commented on
Jul 7, 2025 • 0 new comments -
Why does ReduceSum get shape 0 for an empty input?
#8146 commented on
Jul 7, 2025 • 0 new comments -
System memory leak on cuda GPU backend.
#8147 commented on
Jul 7, 2025 • 0 new comments -
Does ONNX Runtime and its execution providers support FP16 inference?
#8173 commented on
Jul 7, 2025 • 0 new comments -
Reflect padding output seems incorrect when the padding size is larger than the input dimension
#8265 commented on
Jul 7, 2025 • 0 new comments -
ai.onnxruntime.OrtException: Error code - ORT_FAIL - message: OrtSessionOptionsAppendExecutionProvider_Cuda: Failed to load shared library
#8283 commented on
Jul 7, 2025 • 0 new comments -
Eigen::ThreadPoolInterface*, const onnxruntime::ThreadOptions&) pthread_setaffinity_np failed
#8313 commented on
Jul 7, 2025 • 0 new comments -
yolov5 with self-compiled onnxruntime is very slow and does not use the GPU
#9689 commented on
Jul 7, 2025 • 0 new comments -
Unable to load shared library 'onnxruntime' on MacOS (DllNotFoundException)
#9707 commented on
Jul 7, 2025 • 0 new comments -
Support for int64 with webgl backend of the web runtime
#9724 commented on
Jul 7, 2025 • 0 new comments -
output of onnx model with custom op in the loop structure is confusing
#9742 commented on
Jul 7, 2025 • 0 new comments -
How to build for multiple execution providers?
#9756 commented on
Jul 7, 2025 • 0 new comments -
Inference is slower when running inside Docker
#9767 commented on
Jul 7, 2025 • 0 new comments -
[ONNXRuntimeError] : 1 : FAIL : Fatal error: test_custom is not a registered function/op
#9831 commented on
Jul 7, 2025 • 0 new comments -
non-NEON Compatibility
#9849 commented on
Jul 7, 2025 • 0 new comments -
Yolov5 ORT train failed with onnxruntime backend
#9936 commented on
Jul 7, 2025 • 0 new comments -
Support for pip wheel tensorrt
#9986 commented on
Jul 7, 2025 • 0 new comments -
question about long warmup time
#10017 commented on
Jul 7, 2025 • 0 new comments -
Importing onnxruntime on AWS Lambdas with ARM64 processor causes crash
#10038 commented on
Jul 7, 2025 • 0 new comments -
how to forward a batch of images at once?
#10071 commented on
Jul 7, 2025 • 0 new comments -
when my model's input size is 3808 and I forward with yolov5, the memory breaks.
#10074 commented on
Jul 7, 2025 • 0 new comments -
Same Pad_Head value in ORT for SAME_UPPER/SAME_LOWER when a negative odd pad value occurs
#10086 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime latest version segmentation fault
#10113 commented on
Jul 7, 2025 • 0 new comments -
ORTModule import error with onnxruntime
#10127 commented on
Jul 7, 2025 • 0 new comments -
BatchNorm fails on CUDA EP with zero length sequences
#10128 commented on
Jul 7, 2025 • 0 new comments -
Do you have any plan to add 'Round' Operator for gradient builder registry for orttrainer?
#10138 commented on
Jul 7, 2025 • 0 new comments -
Performance question about some nodes generated by dynamic quantization
#10153 commented on
Jul 7, 2025 • 0 new comments -
Runtime Error: Decoder with dynamic axes does not work with Encoder output
#8910 commented on
Jul 7, 2025 • 0 new comments -
How to get the value of tensors in subgraph?
#8929 commented on
Jul 7, 2025 • 0 new comments -
The model run time becomes longer when I update onnxruntime from version 1.7 to version 1.8
#8938 commented on
Jul 7, 2025 • 0 new comments -
ONNX inference results are different from the pytorch model
#8977 commented on
Jul 7, 2025 • 0 new comments -
Type error when running a control flow model in ORT
#8999 commented on
Jul 7, 2025 • 0 new comments -
[E:onnxruntime:, sequential_executor.cc:339 Execute] Non-zero status code returned while running Transpose node. Name:'model/unet3d_segmentation/conv3d_12/Conv3D__165' Status Message: CUDA error cudaErrorInvalidConfiguration:invalid configuration argument
#9083 commented on
Jul 7, 2025 • 0 new comments -
cross-compile fails with onnx-ml.pb.cc error
#9093 commented on
Jul 7, 2025 • 0 new comments -
how to input 'None' in the cpp version
#9121 commented on
Jul 7, 2025 • 0 new comments -
InferenceSession.run in python is inconsistent in terms of performance
#9208 commented on
Jul 7, 2025 • 0 new comments -
Can't load Cuda Provider on Linux due to symbol lookup error
#9309 commented on
Jul 7, 2025 • 0 new comments -
ONNXRuntime CPU - Memory spiking continuously (Memory leak)
#9313 commented on
Jul 7, 2025 • 0 new comments -
error: '_Frees_ptr_opt_' has not been declared
#9332 commented on
Jul 7, 2025 • 0 new comments -
QLinearConv per-channel result is wrong and seems to overflow when the input is large for my model
#9365 commented on
Jul 7, 2025 • 0 new comments -
ORT execution fails when a gradient builder is not registered for module-local functions
#9375 commented on
Jul 7, 2025 • 0 new comments -
Relu getting dropped during quantization
#9425 commented on
Jul 7, 2025 • 0 new comments -
OnnxRuntime Build Failure in Docker
#9530 commented on
Jul 7, 2025 • 0 new comments -
YAMNet model running on CudaExecutionProvider is 3x slower than running on tensorflow
#9657 commented on
Jul 7, 2025 • 0 new comments -
Gap in inference time between onnxruntime and torch vanishes when increasing the batch size
#9660 commented on
Jul 7, 2025 • 0 new comments -
libonnxruntime.so crash
#9684 commented on
Jul 7, 2025 • 0 new comments -
[ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for Conv(1) node with name 'Conv_0'
#9685 commented on
Jul 7, 2025 • 0 new comments -
[Build] ModuleNotFoundError: No module named 'onnxruntime'
#18966 commented on
Jul 7, 2025 • 0 new comments -
Error with finding onnxruntime_binding.node on Windows 10 on a bootcamp Macbook
#18971 commented on
Jul 7, 2025 • 0 new comments -
How to observe arena allocator memory request metrics
#18972 commented on
Jul 7, 2025 • 0 new comments -
Could not load library cudnn_cnn_infer64_8.dll. Error code 127
#18973 commented on
Jul 7, 2025 • 0 new comments -
[Build] Failure with OneDNN on Intel MacOS
#18976 commented on
Jul 7, 2025 • 0 new comments -
Cannot quantize yolov5 float to int8 onnx model
#18987 commented on
Jul 7, 2025 • 0 new comments -
Encountered unknown exception in Initialize using OpenVINO EP
#19004 commented on
Jul 7, 2025 • 0 new comments -
ONNX Runtime inference on string input
#19006 commented on
Jul 7, 2025 • 0 new comments -
[Error: Exception in HostFunction: <unknown>] while running ort models in react-native
#19021 commented on
Jul 7, 2025 • 0 new comments -
[Performance] It is not possible to use a discrete graphics card with DML.
#19025 commented on
Jul 7, 2025 • 0 new comments -
[Build] While deploying the EfficientAD anomaly detection algorithm, an error occurred while executing the "Run" command
#19030 commented on
Jul 7, 2025 • 0 new comments -
Freeing tensor data created via CreateTensor
#19034 commented on
Jul 7, 2025 • 0 new comments -
[Build] Linux x86_64 STATIC Build
#19035 commented on
Jul 7, 2025 • 0 new comments -
cudaMemcpyAsync throws exception in GPUDataTransfer
#19076 commented on
Jul 7, 2025 • 0 new comments -
[Training] On device training doesn't work with INT8 Models
#19078 commented on
Jul 7, 2025 • 0 new comments -
[Performance] The CUDA Stream cannot be set through Python API
#19094 commented on
Jul 7, 2025 • 0 new comments -
Longformer `convert_to_onnx.py` not working due to missing imports
#19149 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Why is the first inference so slow, even though it was already run once during initialization?
#19177 commented on
Jul 7, 2025 • 0 new comments -
ORT 1.17.0 Release Candidates available for testing
#19236 commented on
Jul 7, 2025 • 0 new comments -
Misprinted condition: head_size != num_heads * head_size
#18675 commented on
Jul 7, 2025 • 0 new comments -
Parallel inference of multiple models in different threads
#18806 commented on
Jul 7, 2025 • 0 new comments -
Onnxruntime using OpenVINO as execution provider encountered an "Exception during initialization" problem on model candy.onnx
#18825 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Java API lacks functionality to control allocator settings.
#18845 commented on
Jul 7, 2025 • 0 new comments -
[dynamo_export] starts_.size() == ends_.size() + 1 was false. No matching 'start' entry.
#18863 commented on
Jul 7, 2025 • 0 new comments -
[dynamo_export] MLFloat16 data type is not supported with ScatterElements opset 18 when reduction is 'max'.
#18864 commented on
Jul 7, 2025 • 0 new comments -
[Web] Non-zero status code returned while running Slice node `webgpu`
#18892 commented on
Jul 7, 2025 • 0 new comments -
compute_range not available
#18893 commented on
Jul 7, 2025 • 0 new comments -
The results of the onnx model and the trt engine are different. Why?
#18902 commented on
Jul 7, 2025 • 0 new comments -
SafeIntOnOverflow() Integer overflow error when inferencing on too many samples with Python
#18905 commented on
Jul 7, 2025 • 0 new comments -
[Performance] error 126 with Onnx in ComfyUI
#18925 commented on
Jul 7, 2025 • 0 new comments -
ai.onnxruntime.OrtException: Unsupported type - FLOAT16
#18926 commented on
Jul 7, 2025 • 0 new comments -
How to use multiple inputs of different types in C++ session
#18932 commented on
Jul 7, 2025 • 0 new comments -
[Web] onnxruntime-web does not work in nodejs
#18933 commented on
Jul 7, 2025 • 0 new comments -
C# I need to run the program on an NPU (OnnxRuntime + DirectML + NPU), but it failed
#19846 commented on
Jul 7, 2025 • 0 new comments -
How to set `trt_profile_min_shapes` for inputs with name containing colons?
#18939 commented on
Jul 7, 2025 • 0 new comments -
OP (Conv) inference results mismatch with PyTorch
#18946 commented on
Jul 7, 2025 • 0 new comments -
[Build] How to build onnxruntime with openvino statically?
#18950 commented on
Jul 7, 2025 • 0 new comments -
[Performance] 2x Regression in 1st Inference time cost
#18957 commented on
Jul 7, 2025 • 0 new comments -
[iOS] Output of type sequence<map<int64,float32>> causes crash on iOS
#19867 commented on
Jul 7, 2025 • 0 new comments -
[Build] Where is official build for Unity?
#19964 commented on
Jul 7, 2025 • 0 new comments -
[BUG] [OpenVino EP] Only first result in session is correct.
#19975 commented on
Jul 7, 2025 • 0 new comments -
Onnx Runtime EntryPointNotFoundException: OrtGetApiBase in Unity Application.
#20048 commented on
Jul 7, 2025 • 0 new comments -
Layer not supported in one provider (Tensorrt) and not working with the second provider (CUDA) in an inference problem.
#20058 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Inference failed or unsupported using quantize_dynamic
#20060 commented on
Jul 7, 2025 • 0 new comments -
openvino with int8
#20072 commented on
Jul 7, 2025 • 0 new comments -
Unpredictable onnxruntime-node crash when using Electron
#20084 commented on
Jul 7, 2025 • 0 new comments -
In Aquatic mode links text “PyTorch and Hugging face” is not clearly visible: A11y_WCP URLs - ONNX Runtime_Home_Learn more about how to use ONNX Runtime with_usability
#20150 commented on
Jul 7, 2025 • 0 new comments -
multiple tests fail on Windows due to `ORT_ENABLE_STREAM` define logic error
#20180 commented on
Jul 7, 2025 • 0 new comments -
`convert_float_to_float16` results in `failed in shape inference <class 'Exception'>`
#20189 commented on
Jul 7, 2025 • 0 new comments -
Whether CUDA 12.4 and cuDNN 9 match onnxruntime-win-x64-cuda12-1.17.1
#20223 commented on
Jul 7, 2025 • 0 new comments -
[Training] Can we use ORTModule for inference?
#20281 commented on
Jul 7, 2025 • 0 new comments -
C API Seg Fault from OrtGetApiBase()->GetApi(ORT_API_VERSION);
#20283 commented on
Jul 7, 2025 • 0 new comments -
[Performance] ScatterND / GridSample operators are on CPU instead of GPU / CUDA
#20297 commented on
Jul 7, 2025 • 0 new comments -
DirectML returning empty result with ObjectDetection (Mobilinet V2 FPN Keras)
#20386 commented on
Jul 7, 2025 • 0 new comments -
[Build] Cmake install debug and release configuration
#20387 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Profiling on CUDA shows confusing values
#20398 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Massive Performance slowdown from v1.13.1 -> 1.14.0
#20400 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime 1.17.3 is missing from cuda 12 artifacts feed
#20409 commented on
Jul 7, 2025 • 0 new comments -
`shape_inference.quant_pre_process` causes `AttributeError: module 'onnx.helper' has no attribute 'make_sequence_value_info'`
#19323 commented on
Jul 7, 2025 • 0 new comments -
[Training] How to update running_mean and running_var of BatchNormalization during training
#19370 commented on
Jul 7, 2025 • 0 new comments -
[Performance] In ONNX Runtime, the CPU consumption does not scale linearly with the number of threads
#19384 commented on
Jul 7, 2025 • 0 new comments -
Backwards convolution layers in CUDA provider should heed
#19391 commented on
Jul 7, 2025 • 0 new comments -
InferenceSession.run does not validate rank of scalar inputs
#19434 commented on
Jul 7, 2025 • 0 new comments -
[Web] Memory Access Out of Bounds Error When Using ONNX Runtime Web Inference in NPM Package (wasm)
#19443 commented on
Jul 7, 2025 • 0 new comments -
[Performance] CPU inference much slower from GPU runtime
#19451 commented on
Jul 7, 2025 • 0 new comments -
[On-device Training] Yolo custom loss
#19464 commented on
Jul 7, 2025 • 0 new comments -
[Performance]
#19479 commented on
Jul 7, 2025 • 0 new comments -
Errors about using c# and TensorRT
#19489 commented on
Jul 7, 2025 • 0 new comments -
Accuracy drops a lot when using fp16 with TensorRT EP
#19492 commented on
Jul 7, 2025 • 0 new comments -
quantize_dynamic : nodes_to_quantize(Gemm) is ignored
#19503 commented on
Jul 7, 2025 • 0 new comments -
ONNX Runtime OpenVINO EP is way behind
#19688 commented on
Jul 7, 2025 • 0 new comments -
Observed TDR on a low-end system
#19724 commented on
Jul 7, 2025 • 0 new comments -
Inconsistent Prediction Outputs for Onnx Model
#19834 commented on
Jul 7, 2025 • 0 new comments -
import InferenceSession and capi._pybind_state.
#19836 commented on
Jul 7, 2025 • 0 new comments -
[Performance] onnxruntime 1.17.1 version doesn't support CUDA 12.4
#19839 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Accuracy dropped heavily when using onnxruntime to run inference on a model quantized by QAT
#19850 commented on
Jul 7, 2025 • 0 new comments -
[Mobile] android prod crash: signal 11 (SIGSEGV), code 1 (SEGV_MAPERR)
#20828 commented on
Jul 7, 2025 • 0 new comments -
Inference speed problem even when using high-end hardware.
#19865 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Data size of Batch Normalization using cuDNN in inference.
#17406 commented on
Jul 7, 2025 • 0 new comments -
Yolov8 Static Quantization
#17410 commented on
Jul 7, 2025 • 0 new comments -
CUDA Stream and Synchronization in custom operator
#17412 commented on
Jul 7, 2025 • 0 new comments -
[Performance] How much memory it needs to load a 3.4 GB model to GPU through DirectML?
#17413 commented on
Jul 7, 2025 • 0 new comments -
valgrind memcpy_chk overlap onnxruntime 1.15.1
#17431 commented on
Jul 7, 2025 • 0 new comments -
Extract node info
#17444 commented on
Jul 7, 2025 • 0 new comments -
[Bug] FP16 conversion yields an unusable model
#17447 commented on
Jul 7, 2025 • 0 new comments -
[Mobile iOS] Run fp16 onnx model on CoreML EP
#17448 commented on
Jul 7, 2025 • 0 new comments -
C++ API, Memory Leak instantiating Ort::Sessions
#17451 commented on
Jul 7, 2025 • 0 new comments -
Failure with OpenvinoEP within ORT
#17499 commented on
Jul 7, 2025 • 0 new comments -
Resize doesn't work well while the coordinate_transformation_mode is 'align_corners'.
#17564 commented on
Jul 7, 2025 • 0 new comments -
Inference speed of Quantized model not increased after static Quantization [Performance]
#17634 commented on
Jul 7, 2025 • 0 new comments -
DML EP One session but called in different threads. [Performance]
#17686 commented on
Jul 7, 2025 • 0 new comments -
SkipLayerNormFusion -- High Output Difference Between PyTorch and ONNX Runtime with Extended Optimizations
#17689 commented on
Jul 7, 2025 • 0 new comments -
[Web]
#17700 commented on
Jul 7, 2025 • 0 new comments -
[Mobile | iOS] I got "Unknown exception" error.
#17731 commented on
Jul 7, 2025 • 0 new comments -
[Web] Custom build packages
#17743 commented on
Jul 7, 2025 • 0 new comments -
[web] following-up work items for supporting uniform buffers
#17860 commented on
Jul 7, 2025 • 0 new comments -
[Web] Declaration is not emitted in onnxruntime-node package
#17979 commented on
Jul 7, 2025 • 0 new comments -
[Build] Why does TensorRT EP need the full version of protobuf?
#18040 commented on
Jul 7, 2025 • 0 new comments -
An error occurred when I used the TensorrtExecutionProvider in onnx runtime
#17047 commented on
Jul 7, 2025 • 0 new comments -
[Web] Cannot Convert to RGB when using Tensor.fromImage(image,{tensorFormat:'RGB'})
#17094 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Pytorch Model converted to ONNX with CUDAProvider runs 3x slower than using Pytorch with GPU
#17116 commented on
Jul 7, 2025 • 0 new comments -
How to release onnxruntime gpu memory
#17142 commented on
Jul 7, 2025 • 0 new comments -
[TOOLS]: Using transformers.optimizer to optimize a large model, segmentation fault (core dumped)
#17212 commented on
Jul 7, 2025 • 0 new comments -
Onnx model inference Fatal error: ai.onnx.contib:bev_pool_v2(-1) is not a registered function/op
#17214 commented on
Jul 7, 2025 • 0 new comments -
[C#] Invalid input name error
#17244 commented on
Jul 7, 2025 • 0 new comments -
AssertionError on num_heads > 0 for bert with specific optimization config
#17254 commented on
Jul 7, 2025 • 0 new comments -
windows10 x86 x64 inference time varies greatly
#17256 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Operators assigned to CPU instead of CUDA, CPU thread management problem
#17268 commented on
Jul 7, 2025 • 0 new comments -
[Web] Error: no available backend found. ERR: [wasm] TypeError: Failed to parse URL from
#17274 commented on
Jul 7, 2025 • 0 new comments -
[Build] Error: cpuid.h: No such file or directory when cross-compiling ORT 1.15.1 with NNAPI for arm64
#17283 commented on
Jul 7, 2025 • 0 new comments -
Freeing heap block containing an active critical section
#17345 commented on
Jul 7, 2025 • 0 new comments -
[Performance] 3X slower inference on onnxruntime than pytorch(huggingface)
#17366 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Memcpy leads to AllocationError for argmax
#17371 commented on
Jul 7, 2025 • 0 new comments -
[web/js] need for more methods on tensor object
#17372 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Quantized model inference on CPU slower/same as FP32
#17389 commented on
Jul 7, 2025 • 0 new comments -
Default `tensorFormat` should be RGBA for HTMLImageElement variant
#17395 commented on
Jul 7, 2025 • 0 new comments -
[Build] windows dll compilation error with versions above 1.14.0
#17404 commented on
Jul 7, 2025 • 0 new comments -
[Web] Add binary/where broadcast case when FXC issue got fixed in tint
#17405 commented on
Jul 7, 2025 • 0 new comments -
[Build]
#18570 commented on
Jul 7, 2025 • 0 new comments -
Issue with Rounding Behavior in onnxruntime's Quantizelinear Layer
#18576 commented on
Jul 7, 2025 • 0 new comments -
Session Run throws an access violation exception when I recreate the session
#18578 commented on
Jul 7, 2025 • 0 new comments -
[Node.js] Support for loading models with external data in `onnxruntime-node`
#18586 commented on
Jul 7, 2025 • 0 new comments -
Cuda EP does not compute reduce with empty set correctly?
#18588 commented on
Jul 7, 2025 • 0 new comments -
[Mobile] Model with large input size cause Segmentation Fault while session->run()
#18595 commented on
Jul 7, 2025 • 0 new comments -
Session initialization stuck/crash in DMLCreateDevice while using DirectML EP
#18599 commented on
Jul 7, 2025 • 0 new comments -
Profiling multithreaded runs
#18600 commented on
Jul 7, 2025 • 0 new comments -
Segmentation Fault when some of node outputs is empty
#18601 commented on
Jul 7, 2025 • 0 new comments -
What is the recommended setup for running multiple models/sessions in parallel in C++?
#18610 commented on
Jul 7, 2025 • 0 new comments -
DirectML Resize Node error.
#18613 commented on
Jul 7, 2025 • 0 new comments -
[Build]
#18617 commented on
Jul 7, 2025 • 0 new comments -
Could not find an implementation for SkipGroupNorm(1) node with name 'SkipGroupNorm_0'
#18623 commented on
Jul 7, 2025 • 0 new comments -
Crash in ResizeHelper::Initialize executing a model on ARM64
#18628 commented on
Jul 7, 2025 • 0 new comments -
ONNXRuntime Segmentation Fault Crash on Inference (iOS and Mac)
#18632 commented on
Jul 7, 2025 • 0 new comments -
[Performance] question about dynamic batch inference time cost
#18639 commented on
Jul 7, 2025 • 0 new comments -
ORT memory error with the graph from linspace
#18648 commented on
Jul 7, 2025 • 0 new comments -
Are there any benchmark tools for onnx mobile like Tensorflow Lite?
#18664 commented on
Jul 7, 2025 • 0 new comments -
Different results of consecutive runs for same input
#18672 commented on
Jul 7, 2025 • 0 new comments -
Strange condition size_t channel_rindex = is_nchw ? 2 : 2;
#18674 commented on
Jul 7, 2025 • 0 new comments -
[Web] Which node.js version is supposed to be supported?
#18078 commented on
Jul 7, 2025 • 0 new comments -
Microsoft.ML.OnnxRuntime.OpenVino Encountered unknown exception in Initialize
#18152 commented on
Jul 7, 2025 • 0 new comments -
ORT bug in Col2Im CPU 3D cases
#18156 commented on
Jul 7, 2025 • 0 new comments -
[Mobile|Android] Fatal error: ai.onnx.contrib:SentencepieceTokenizer(-1) is not a registered function/op
#18226 commented on
Jul 7, 2025 • 0 new comments -
The onnx.helper make_function command strips type information leading to inference errors
#18264 commented on
Jul 7, 2025 • 0 new comments -
[Web] onnxruntime-web and onnxruntime-node return different results for LSTM model
#18335 commented on
Jul 7, 2025 • 0 new comments -
[Performance] the speed with SetIntraOpNumThreads(1), SetIntraOpNumThreads(4), SetInterOpNumThreads(1), SetInterOpNumThreads(4)
#18385 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Does `com.microsoft.Attention` use FlashAttention-2?
#18474 commented on
Jul 7, 2025 • 0 new comments -
Add ORT Extensions to Java and build with Gradle
#18503 commented on
Jul 7, 2025 • 0 new comments -
Model Run Session wasting time [Performance]
#18510 commented on
Jul 7, 2025 • 0 new comments -
Is there any way to convert a QDQ model to a QLinear model using ort?
#18511 commented on
Jul 7, 2025 • 0 new comments -
[Training] qat
#18534 commented on
Jul 7, 2025 • 0 new comments -
[Performance] GPU op placement control when some ops must be on the CPU
#23154 commented on
Jul 7, 2025 • 0 new comments -
[Build] manylinux_2_28 support
#18537 commented on
Jul 7, 2025 • 0 new comments -
RunAsync C# API crashes without any error
#19140 commented on
Jul 7, 2025 • 0 new comments -
[Build] TRT EP cannot be built without CUDA EP
#18542 commented on
Jul 7, 2025 • 0 new comments -
[Build] 1.20.2 Microsoft.ML.OnnxRuntime.Managed nuget package needs Microsoft.ML.OnnxRuntime 1.20.2 which is not available
#23640 commented on
Jul 7, 2025 • 0 new comments -
Calling the Session class method Run failed, don't know why
#18548 commented on
Jul 7, 2025 • 0 new comments -
Does the computation order affect the computation result?
#18564 commented on
Jul 7, 2025 • 0 new comments -
[Web] How could I get the shape of the output tensor?
#18568 commented on
Jul 7, 2025 • 0 new comments -
[Build] Building for Mac Catalyst Fails When Installed Via Cocoapods
#23307 commented on
Jul 7, 2025 • 0 new comments -
Using separate cuda streams for one session
#23319 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Max operator became 4.5X slower after Fixing NaN propagation for float16 min and max operators.
#23337 commented on
Jul 7, 2025 • 0 new comments -
memory.enable_memory_arena_shrinkage is not working in python
#23339 commented on
Jul 7, 2025 • 0 new comments -
Issue loading custom ONNX model with complex-valued operations in ONNX Runtime (C++)
#23341 commented on
Jul 7, 2025 • 0 new comments -
Memory creeping up
#23348 commented on
Jul 7, 2025 • 0 new comments -
No speedup from float16 with directml compared to cuda
#23359 commented on
Jul 7, 2025 • 0 new comments -
[Build] Possibly unintentional or misconfigured dependencies for QNN EP in onnxruntime_python.cmake
#23360 commented on
Jul 7, 2025 • 0 new comments -
[Performance] GPU Fallback to CPU Without Error When CUDA DLLs Are Missing
#23372 commented on
Jul 7, 2025 • 0 new comments -
[Performance] 40% slowdown in ONNX Resize Operator on CPU
#23391 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Round node shows huge performance drop on Windows
#23430 commented on
Jul 7, 2025 • 0 new comments -
debug result is ok, release gets NaN output
#23440 commented on
Jul 7, 2025 • 0 new comments -
[QUESTION]: onnxruntime with onednn backend
#23543 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Speed-up TensorRT engine compilation
#23546 commented on
Jul 7, 2025 • 0 new comments -
Custom operator is not a registered function/op (python)
#23566 commented on
Jul 7, 2025 • 0 new comments -
[Performance] ORT-WebGPU Average Pooling is working too long in edge case
#23614 commented on
Jul 7, 2025 • 0 new comments -
TensorRT Provider "Attribute reduction is not supported"
#23618 commented on
Jul 7, 2025 • 0 new comments -
session.disable_fallback() has no effect, it always falls back to cpu
#23647 commented on
Jul 7, 2025 • 0 new comments -
OnnxRuntimeGenAIException: CUDA execution provider is not enabled in this build.
#23715 commented on
Jul 7, 2025 • 0 new comments -
[Mobile] Error: Can't load a model: Error Code - ORT_INVALID_PROTOBUF
#22927 commented on
Jul 7, 2025 • 0 new comments -
[Training] RuntimeError: gradient_builder_base.h:123 onnxruntime::training::ArgDef onnxruntime::training::GradientBuilderBase::O(size_t, bool) const i < node_->OutputDefs().size() was false
#22955 commented on
Jul 7, 2025 • 0 new comments -
[WebGPU] `Kernel "[GroupQueryAttention] /model/layers.0/attn/GroupQueryAttention" failed. Error: Input "key" is expected to have 3, 4, or 5 dimensions".`
#22987 commented on
Jul 7, 2025 • 0 new comments -
Remove Python :: 3.7 Python :: 3.8 Python :: 3.9 from pypi metadata
#22993 commented on
Jul 7, 2025 • 0 new comments -
[WebGPU] `Error: [WebGPU] Kernel "[Mul] /head/istft/Mul_1" failed. Error: Failed to generate kernel's output[0] with dims [1,3520,3520]. If you are running with pre-allocated output, please make sure the output type/dims are correct. Error: 81415528.`
#22994 commented on
Jul 7, 2025 • 0 new comments -
Tried to reduce the compiled binary size of ONNX Runtime on x86_64 linux with "create_reduced_build_config.py", but got "Failed to find kernel for com.microsoft.nchwc.Conv(1)"
#23018 commented on
Jul 7, 2025 • 0 new comments -
[Build] Dotnet packages on nuget are not built with Release optimizations
#23053 commented on
Jul 7, 2025 • 0 new comments -
[Web] ORT format model not working on WebGPU EP + Wasm Static lib
#23072 commented on
Jul 7, 2025 • 0 new comments -
[Build] onnxruntime_gpu PyPI on a slow host
#23079 commented on
Jul 7, 2025 • 0 new comments -
Cannot resolve operator 'LSTM' with webgl backend
#23083 commented on
Jul 7, 2025 • 0 new comments -
[Bug][CUDAExecutionProvider] INVALID_ARGUMENT : unsupported conv activation mode "Sigmoid"
#23114 commented on
Jul 7, 2025 • 0 new comments -
Understanding max_mem option of OrtArenaCfg class
#23121 commented on
Jul 7, 2025 • 0 new comments -
[Bug] Inconsistent Results After ONNX Runtime Optimization
#23133 commented on
Jul 7, 2025 • 0 new comments -
Inconsistent Results After ONNX Runtime Optimization
#23142 commented on
Jul 7, 2025 • 0 new comments -
[Build] Better support for vcpkg
#23158 commented on
Jul 7, 2025 • 0 new comments -
ONNX 1.17.0 integration remaining work: fix QNN EP test failures
#23163 commented on
Jul 7, 2025 • 0 new comments -
Inconsistent Results After ONNX Runtime Optimization
#23199 commented on
Jul 7, 2025 • 0 new comments -
[Inference Error] The onnx inference result is inconsistent with the numpy inference result
#23202 commented on
Jul 7, 2025 • 0 new comments -
[Build] how to build onnxruntime with openvino EP for android
#23222 commented on
Jul 7, 2025 • 0 new comments -
[Build] Xcode unit tests fail with libc++abi: terminating due to uncaught exception of type onnxruntime::OnnxRuntimeException:
#23259 commented on
Jul 7, 2025 • 0 new comments -
[Build] Build script for MacOS fails for targets older than 13.4 because tests can not be built
#24277 commented on
Jul 7, 2025 • 0 new comments -
SIGSEGV when calling OrtSession.run()
#24288 commented on
Jul 7, 2025 • 0 new comments -
[Build] Onnxruntime v1.21.0 fails to build with GCC-13
#24290 commented on
Jul 7, 2025 • 0 new comments -
quantize onnx models to INT8
#24374 commented on
Jul 7, 2025 • 0 new comments -
[Performance] [QNN EP] Performance gap between onnxruntime QNN EP and Genie from QNN SDK.
#24417 commented on
Jul 7, 2025 • 0 new comments -
[Build] Python build fails because onnxruntime/capi/build_and_package_info.py is missing
#24570 commented on
Jul 7, 2025 • 0 new comments -
[MLAS] Plan to add RISC-V Vector (RVV) support to MLAS
#24596 commented on
Jul 7, 2025 • 0 new comments -
nuget package 1.21.2 causes conflicts in Solutions targeting .NET Framework 4.8
#24599 commented on
Jul 7, 2025 • 0 new comments -
[Mobile] Objective-C API for register onnxruntime-extensions as a custom ops library
#24613 commented on
Jul 7, 2025 • 0 new comments -
[DO NOT UNPIN] ORT 1.22.0 Release Candidates available for testing
#24671 commented on
Jul 7, 2025 • 0 new comments -
Scale in resize node becomes an identity node not a parameter inside resize node
#24824 commented on
Jul 7, 2025 • 0 new comments -
Import error in pytest with onnxruntime-directml 1.22.0
#24907 commented on
Jul 7, 2025 • 0 new comments -
[Web] Fail to link static Wasm library with WebNN EP support
#24936 commented on
Jul 7, 2025 • 0 new comments -
[Build] CMake Error related to onnxruntime_unittests.cmake
#24972 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Openvino 2x slower than with OpenCV on an Intel HD Graphics 620 / 630
#25266 commented on
Jul 6, 2025 • 0 new comments -
onnxruntime with the CPUExecutionProvider errors out while processing the ReverseSequence operator
#24920 commented on
Jul 5, 2025 • 0 new comments -
[Performance] ORT takes ~11GB memory for quantizing a model of size ~1GB
#24954 commented on
Jul 5, 2025 • 0 new comments -
[Documentation]
#24958 commented on
Jul 5, 2025 • 0 new comments -
mutex issue on Mac for release 1.21.X only
#24579 commented on
Jul 4, 2025 • 0 new comments -
Cannot get USE_MIMALLOC activated on Windows
#25213 commented on
Jul 4, 2025 • 0 new comments -
terminate called after throwing an instance of 'Ort::Exception' what(): Invalid input name: serving_default_input_1:0
#23730 commented on
Jul 7, 2025 • 0 new comments -
Onnxruntime using OpenVINO for older version Intel UHD630
#23735 commented on
Jul 7, 2025 • 0 new comments -
[Build] Error building with ACL EP on aarch64 linux (Raspberry Pi 5)
#23741 commented on
Jul 7, 2025 • 0 new comments -
[Mobile] Onnxruntime react-native issue: [java.lang.ClassCastException: java.lang.String[][] cannot be cast to java.lang.String[]]
#23782 commented on
Jul 7, 2025 • 0 new comments -
[Mobile] Dynamic Shape Challenge: Enabling LLM on QNN-HTP
#23832 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Why does inference occupy so much memory?
#23867 commented on
Jul 7, 2025 • 0 new comments -
The Pad operator has a calculation error in the "reflect" mode.
#23878 commented on
Jul 7, 2025 • 0 new comments -
Bad Allocation Error in ONNX Runtime on Windows x86 CPU When Processing Multiple Images Sequentially
#23938 commented on
Jul 7, 2025 • 0 new comments -
TensorRT Support for Multiple Profiles
#23965 commented on
Jul 7, 2025 • 0 new comments -
[Build] Unsupported AVX512-FP16 Instructions in MLAS (vcvtneeph2ps, vcvtneoph2ps)
#24025 commented on
Jul 7, 2025 • 0 new comments -
Application crashes while creating a session for onnxruntime-qnn with the QnnCpu backend option.
#24082 commented on
Jul 7, 2025 • 0 new comments -
ImportError: Unable to import dependency onnxruntime
#24120 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime-mobile implementation on custom execution provider
#24135 commented on
Jul 7, 2025 • 0 new comments -
segmentation fault while using onnxruntime==1.21.0
#24144 commented on
Jul 7, 2025 • 0 new comments -
[Feature Request] A model with dynamic input and dynamic output will have a memory leak after inference with Openvino.
#24162 commented on
Jul 7, 2025 • 0 new comments -
Python Session.run_async Causes Program Exit
#24200 commented on
Jul 7, 2025 • 0 new comments -
OpenVINO EP not able to use CPU device
#24208 commented on
Jul 7, 2025 • 0 new comments -
Questions about using AMD VitisAI EP: how can I run my model on an AMD NPU?
#24214 commented on
Jul 7, 2025 • 0 new comments -
[Build] OpenVINO ep for macOS
#24273 commented on
Jul 7, 2025 • 0 new comments -
[Build] Building v1.21.0: unsupported instruction 'vpdpbusds'
#24275 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Severe performance penalty with transformer model and DirectML
#20983 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime shape mismatch during quantization of yolov8 models
#21048 commented on
Jul 7, 2025 • 0 new comments -
DML cannot use device_id = 1, run_with_iobinding failed.
#21092 commented on
Jul 7, 2025 • 0 new comments -
Symbolic Shape infer fails on onnx file without many logs
#21120 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Whisper model inference results incorrect after Transformer Optimizer
#21150 commented on
Jul 7, 2025 • 0 new comments -
ORT 1.18.1 Release Candidates available for testing
#21173 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Mapfile support for certain external data files is not working
#21195 commented on
Jul 7, 2025 • 0 new comments -
[Mobile] QNN failed to finalize QNN graph for attention layer
#21221 commented on
Jul 7, 2025 • 0 new comments -
TArray used for broadcast was limited to be within range [0, 8] on onnxruntime 1.16.3
#21254 commented on
Jul 7, 2025 • 0 new comments -
Not able to load onnx model multilingual-e5-large
#21321 commented on
Jul 7, 2025 • 0 new comments -
TensorRT EP's inference results are abnormal.
#21457 commented on
Jul 7, 2025 • 0 new comments -
[Build] Unable to build with --use_dml
#21568 commented on
Jul 7, 2025 • 0 new comments -
Memory leak in NPU inference after each session.run
#21587 commented on
Jul 7, 2025 • 0 new comments -
[Performance]
#21635 commented on
Jul 7, 2025 • 0 new comments -
Quantized SeaLLM v2 Model Outputs Same as Input
#21636 commented on
Jul 7, 2025 • 0 new comments -
Same Model Hash Code Issue from different models
#21672 commented on
Jul 7, 2025 • 0 new comments -
[Bug]: Onnxruntime.CPU memory leaks
#21723 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime-directml import interference with sklearn
#21724 commented on
Jul 7, 2025 • 0 new comments -
Inferencing FP16 model using onnxruntime
#21737 commented on
Jul 7, 2025 • 0 new comments -
[Web] requested dist/*.mjs files for cdnjs
#21785 commented on
Jul 7, 2025 • 0 new comments -
Dockerfile does not work
#20458 commented on
Jul 7, 2025 • 0 new comments -
[Build] cross-compiling onnxruntime for arm32 and onnxruntime_ENABLE_CPUINFO not working.
#20461 commented on
Jul 7, 2025 • 0 new comments -
RUNTIME_EXCEPTION, 80070057 The parameter is incorrect in v1.17.3
#20464 commented on
Jul 7, 2025 • 0 new comments -
[Build] cmake duplicate target "memory" between abseil and xnnpack
#20469 commented on
Jul 7, 2025 • 0 new comments -
[Build] Error when loading pf16 model
#20570 commented on
Jul 7, 2025 • 0 new comments -
DirectML Exception 80070057 "The parameter is incorrect"
#20575 commented on
Jul 7, 2025 • 0 new comments -
On a Windows system, load testing onnxruntime from Java: CPU usage spikes quickly and stays at 100%
#20593 commented on
Jul 7, 2025 • 0 new comments -
Missing dll cudnn_ops_infer64_8.dll does not generate a python error
#20605 commented on
Jul 7, 2025 • 0 new comments -
[BUG] Running operations over concat output rewrites its values
#20606 commented on
Jul 7, 2025 • 0 new comments -
[Discussion] ORT GPU binaries do not contain DML
#20638 commented on
Jul 7, 2025 • 0 new comments -
[Build] TVM EP Build
#20665 commented on
Jul 7, 2025 • 0 new comments -
LayerNormalization doesn't work as expected on Mac
#20676 commented on
Jul 7, 2025 • 0 new comments -
User-provided session logging function is not used for every log
#20680 commented on
Jul 7, 2025 • 0 new comments -
Broken multithreading inference session Onnxruntime-directml >= 1.18
#20713 commented on
Jul 7, 2025 • 0 new comments -
Windows ARM64 & X64 CLIP Image Encoder different results
#20722 commented on
Jul 7, 2025 • 0 new comments -
[Build] quantization unittest failed when running all tests
#20821 commented on
Jul 7, 2025 • 0 new comments -
[.NET] Update tensor implementations to new Tensor<T> type
#20874 commented on
Jul 7, 2025 • 0 new comments -
Java CreateTensor with NIO ByteBuffer for reuse purpose
#20882 commented on
Jul 7, 2025 • 0 new comments -
[Build] how to build on OpenHarmony?
#20895 commented on
Jul 7, 2025 • 0 new comments -
Stateful/Memory models
#20943 commented on
Jul 7, 2025 • 0 new comments -
Upcoming ORT 1.20 Release Overview
#22274 commented on
Jul 7, 2025 • 0 new comments -
[Performance] High CUDA memory usage with ONNX Runtime and inconsistent memory release
#22297 commented on
Jul 7, 2025 • 0 new comments -
Build failure on Windows 10 with both OpenVINO 2024.3 and 2024.4.
#22314 commented on
Jul 7, 2025 • 0 new comments -
`quant_pre_process SymbolicShapeInference` causes AttributeError: 'NoneType' object has no attribute 'HasField' when the model has a Constant node.
#22422 commented on
Jul 7, 2025 • 0 new comments -
The EP_CTX_BLOB seems to have both WRITE and EXECUTABLE permissions enabled
#22437 commented on
Jul 7, 2025 • 0 new comments -
External data is not loaded with custom allocator
#22468 commented on
Jul 7, 2025 • 0 new comments -
[Performance] C++ api: destroy the execution provider if the `Ort::Session` is destroyed
#22511 commented on
Jul 7, 2025 • 0 new comments -
DistilBERT model inference failure using ONNX Runtime QNNExecutionProvider on Snapdragon® X Elite NPU
#22532 commented on
Jul 7, 2025 • 0 new comments -
[DO NOT UNPIN] ORT Nightly Package Name Change
#22541 commented on
Jul 7, 2025 • 0 new comments -
Negative output for sigmoid
#22557 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Model runtime spiky with TensorRT Execution Provider
#22664 commented on
Jul 7, 2025 • 0 new comments -
Exception during initialization: safeint.h:17 static void SafeIntExceptionHandler<onnxruntime::OnnxRuntimeException>::SafeIntOnOverflow() Integer overflow - caused by int64 index of -1?
#22694 commented on
Jul 7, 2025 • 0 new comments -
FP16 ONNX model outputs NaN after the first successful execution
#22723 commented on
Jul 7, 2025 • 0 new comments -
CUDA providers failed to build against 12.6 with error #221-D
#22728 commented on
Jul 7, 2025 • 0 new comments -
Why force max_length <= kMaxSequenceLength in beam_search_parameters.cc?
#22735 commented on
Jul 7, 2025 • 0 new comments -
[TensorRT EP] How can I disable cache generation when using the TRT execution provider?
#22822 commented on
Jul 7, 2025 • 0 new comments -
[Dev] "./onnxruntime_test_all --help" gives segmentation fault
#22838 commented on
Jul 7, 2025 • 0 new comments -
How to release GPU memory when using onnxruntime with FastAPI
#22899 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Binary operators using SSE on AVX systems
#22905 commented on
Jul 7, 2025 • 0 new comments -
run_async not running asynchronously
#21791 commented on
Jul 7, 2025 • 0 new comments -
[Bug] [onnxruntime-node] Error: no available backend found. ERR: [wasm] backend not found.
#21813 commented on
Jul 7, 2025 • 0 new comments -
Error when trying to run vision model onnx
#21869 commented on
Jul 7, 2025 • 0 new comments -
[Build] “onnxruntime_cxx_api.h”: No such file or directory
#21891 commented on
Jul 7, 2025 • 0 new comments -
Snapdragon X processor is unsupported
#21947 commented on
Jul 7, 2025 • 0 new comments -
[Mobile] iOS library crashes in Release configuration
#21960 commented on
Jul 7, 2025 • 0 new comments -
[Web] Uncaught WebGPU validation error on Snapdragon SM8450 but works on SM8250
#21970 commented on
Jul 7, 2025 • 0 new comments -
[Web] no available backend found [wasm] when importing `onnxruntime-web/wasm`
#22010 commented on
Jul 7, 2025 • 0 new comments -
[Build] onnxruntime-openvino library does not have python3.12 support
#22015 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime-gpu (1.18.0) cannot be installed
#22028 commented on
Jul 7, 2025 • 0 new comments -
[Training] Implicit dependency of Python training API on 'torch' package
#22070 commented on
Jul 7, 2025 • 0 new comments -
GetElementType is not implemented after updating onnxruntime
#22075 commented on
Jul 7, 2025 • 0 new comments -
[Web] Error when using Web Workers on Next.js
#22113 commented on
Jul 7, 2025 • 0 new comments -
[Question or BUG] ONNX Runtime CUDA Sessions in Unity Produce Empty Outputs When Running Multiple Models Sequentially on a Single Graphics Card
#22146 commented on
Jul 7, 2025 • 0 new comments -
Warnings displayed as errors during TensorRT optimization.
#22164 commented on
Jul 7, 2025 • 0 new comments -
trt_weight_stripped_engine_enable does not work for all networks/size ranges.
#22165 commented on
Jul 7, 2025 • 0 new comments -
trt_weight_stripped_engine_enable does not work together with trt_dump_ep_context_model
#22179 commented on
Jul 7, 2025 • 0 new comments -
Filenames in OrtTensorRTProviderOptionsV2 should be std::filesystem::path or at least const ORTCHAR_T*
#22182 commented on
Jul 7, 2025 • 0 new comments -
[CANN] When using onnxruntime-cann, it fails to utilize the NPU for inference
#22229 commented on
Jul 7, 2025 • 0 new comments -
[Performance] fp16 support and performance
#22242 commented on
Jul 7, 2025 • 0 new comments -
Basic Optimizer adds non-standard ONNX ops for roi_align
#14753 commented on
Jul 7, 2025 • 0 new comments -
Basic Optimizer adds non-standard ONNX ops for input tensor
#14754 commented on
Jul 7, 2025 • 0 new comments -
[Build] cmake install is broken when --use_xnnpack is set
#14757 commented on
Jul 7, 2025 • 0 new comments -
[Build] Failed to build CUDA Docker image
#14765 commented on
Jul 7, 2025 • 0 new comments -
`onnx.checker.check_model` raises `Bad node spec` for custom nodes created from ORT `optimize_model`
#14768 commented on
Jul 7, 2025 • 0 new comments -
Dependency Problem (java onnxruntime)
#14787 commented on
Jul 7, 2025 • 0 new comments -
[Build] Can't access OrtSessionOptionsAppendExecutionProvider_Dnnl while using oneDNN
#14799 commented on
Jul 7, 2025 • 0 new comments -
[Build] Dockerfile.arm64 build fails
#14801 commented on
Jul 7, 2025 • 0 new comments -
[Build] Unable to load TensorRT Execution Provider
#14802 commented on
Jul 7, 2025 • 0 new comments -
Read access violation under OnnxRuntimeCpuSessionBuilder::Initialize during WinML operator tests for function operators
#14810 commented on
Jul 7, 2025 • 0 new comments -
[Web] how to reduce wasm file size
#14817 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime with CUDA not releasing about 400 MB of memory after the session and environment are destroyed
#14819 commented on
Jul 7, 2025 • 0 new comments -
working model with Resize node becomes invalid after using convert_float_to_float16
#14827 commented on
Jul 7, 2025 • 0 new comments -
How do I pass a list of tensors in onnxruntime-web?
#14829 commented on
Jul 7, 2025 • 0 new comments -
DML EP cannot load some quantized onnx files.
#14835 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Performance degradation while using dynamic axes
#14863 commented on
Jul 7, 2025 • 0 new comments -
UndefinedBehaviorSanitizer reports problem in onnxruntime_global_thread_pools_test
#14882 commented on
Jul 7, 2025 • 0 new comments -
[Build] Error APPX1101 - Payload contains two or more files with the same destination path 'microsoft.ai.machinelearning.dll'
#14915 commented on
Jul 7, 2025 • 0 new comments -
[Performance]
#14919 commented on
Jul 7, 2025 • 0 new comments -
Is there a Python way to get the max supported ONNX IR version from ORT package?
#14932 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Memory grows after reloading model
#14641 commented on
Jul 7, 2025 • 0 new comments -
[Build] Building for C++ On Jetson Nano CUDA 10.2
#14644 commented on
Jul 7, 2025 • 0 new comments -
TensorRT Execution Build Fails on Jetson Jetpack 4.6.1
#14658 commented on
Jul 7, 2025 • 0 new comments -
DeepFaceLive: issue with onnxruntime_pybind_state.
#14667 commented on
Jul 7, 2025 • 0 new comments -
[Build]
#14674 commented on
Jul 7, 2025 • 0 new comments -
Custom Operator Output Tensor Shape Error
#14683 commented on
Jul 7, 2025 • 0 new comments -
[Web] Inference speed halves if you open DevTools after loading an inference session, even if you close DevTools afterwards
#14692 commented on
Jul 7, 2025 • 0 new comments -
`CleanUnusedInitializersAndNodeArgs` warnings are printed only with subgraphs
#14694 commented on
Jul 7, 2025 • 0 new comments -
A runtime can run on cuda device 0 but fail on cuda device 1
#14710 commented on
Jul 7, 2025 • 0 new comments -
Non-zero status code returned while running Reshape node. Name:'Reshape_7411' The input tensor cannot be reshaped to the requested shape. Input shape:{51}, requested shape:{}
#14712 commented on
Jul 7, 2025 • 0 new comments -
How to run inference with multiple batches and multiple inputs.
#14713 commented on
Jul 7, 2025 • 0 new comments -
Crash in JavaGPU on Windows
#14714 commented on
Jul 7, 2025 • 0 new comments -
The Microsoft.ML.OnnxRuntime.Gpu NuGet package on Visual Studio, latest version 1.14.0, has a bug when running with TensorRT at runtime.
#14730 commented on
Jul 7, 2025 • 0 new comments -
[Build] clog_vlog_fatal
#14740 commented on
Jul 7, 2025 • 0 new comments -
[Performance] How to create multiple tensors with consecutive addresses when the cuda memory is not occupied?
#14742 commented on
Jul 7, 2025 • 0 new comments -
Memory Leak
#14745 commented on
Jul 7, 2025 • 0 new comments -
[Build] macOS: cross-compiling arm64 on Intel fails
#14746 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Can oneDNN EP accelerate the inference time of onnxruntime on x86 machines?
#14749 commented on
Jul 7, 2025 • 0 new comments -
Basic Optimizer adds non-standard ONNX ops
#14752 commented on
Jul 7, 2025 • 0 new comments -
[bug] error while loading shared libraries: libonnxruntime.so.1.8.1: cannot open shared object file: No such file or directory
#15053 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Inference doubles VRAM (DirectML)
#15074 commented on
Jul 7, 2025 • 0 new comments -
[Web] Memory spike in ORT-web leading to app crash
#15086 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime: bfc_arena.cc:361 void* onnxruntime::BFCArena::FindChunkPtr(onnxruntime::BFCArena::BinNum, size_t, size_t) !chunk->in_use() was false.
#15087 commented on
Jul 7, 2025 • 0 new comments -
The dimension of indices to the ScatterND op is wrong during inference.
#15095 commented on
Jul 7, 2025 • 0 new comments -
[Performance] onnxruntime allocates lots of cuda memory on T4
#15098 commented on
Jul 7, 2025 • 0 new comments -
Build fails with GCC 12.x in onnxruntime/contrib_ops/cuda/quantization/qordered_ops/qordered_qdq.cc
#15111 commented on
Jul 7, 2025 • 0 new comments -
How to reduce GPU memory usage during inference
#15127 commented on
Jul 7, 2025 • 0 new comments -
descriptor_table_tensorboard_2fcompat_2fproto_2fattr_5fvalue_2eproto not declared (TRT 8.5.0)
#15131 commented on
Jul 7, 2025 • 0 new comments -
How to run inference with fp16 precision in Python code?
#15134 commented on
Jul 7, 2025 • 0 new comments -
NOT_IMPLEMENTED GridSample(16) on onnxruntime 1.14.1
#15137 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime::utils::ConstantNodeProtoToTensorProto Unsupported attribute value type of 9 in 'Constant' node 'Constant_35'
#15149 commented on
Jul 7, 2025 • 0 new comments -
Type Error: Type 'tensor(int64)' of input parameter (relative_position) of operator (Min) in node (Min_2286) is invalid.
#15167 commented on
Jul 7, 2025 • 0 new comments -
Inference speed is very slow when using fp16 while fp32 is normal
#15170 commented on
Jul 7, 2025 • 0 new comments -
A bug occurs when the program terminates
#15174 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Why is the Conv + Max-Pool model faster than the Conv model using GraphOptimizationLevel::ORT::ENABLE_ALL?
#15180 commented on
Jul 7, 2025 • 0 new comments -
[Performance] GPT-Neo: better performance from Python GPT-Neo than its ONNX Runtime version in C++?
#15191 commented on
Jul 7, 2025 • 0 new comments -
[Build] Segfault when running unit tests (ctest)
#15224 commented on
Jul 7, 2025 • 0 new comments -
[Build] fail to build on Windows ARM64
#15252 commented on
Jul 7, 2025 • 0 new comments -
[Performance] How to debug/reduce GPU utilization?
#15254 commented on
Jul 7, 2025 • 0 new comments -
[Performance] 3-100x regression when opset 16 or 17 is used (CUDA EP)
#14956 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Cannot release GPU memory.
#14957 commented on
Jul 7, 2025 • 0 new comments -
Reuse output tensor memory that was allocated by the first call to Ort::Session::Run(...)
#14960 commented on
Jul 7, 2025 • 0 new comments -
Compatibility between ONNX and Blazor WebAssembly
#14962 commented on
Jul 7, 2025 • 0 new comments -
Running the T5 ONNX export example leads to a shape inference error
#14963 commented on
Jul 7, 2025 • 0 new comments -
Microsoft.ML.OnnxRuntime.Gpu not working in MAUI project
#14974 commented on
Jul 7, 2025 • 0 new comments -
[Build] Failed to build in docker container
#14983 commented on
Jul 7, 2025 • 0 new comments -
conv throws safeint exception
#14985 commented on
Jul 7, 2025 • 0 new comments -
Static linkage of onnx_runtime and providers library
#14986 commented on
Jul 7, 2025 • 0 new comments -
[Build] static assertion fails when building from source with GCC 13.0.1
#14991 commented on
Jul 7, 2025 • 0 new comments -
[Performance] inference problems with io_binding: unexpected shape or unexpected data type
#14998 commented on
Jul 7, 2025 • 0 new comments -
[Performance] TensorRT provider produces (slightly) differently named engine files for the same model between runs
#14999 commented on
Jul 7, 2025 • 0 new comments -
CUDA Graph Error - CUDA failure 900: operation not permitted when stream is capturing
#15002 commented on
Jul 7, 2025 • 0 new comments -
ONNX does not support Dirichlet distribution?
#15016 commented on
Jul 7, 2025 • 0 new comments -
[Build] Problems with FP16 Layernorm
#15021 commented on
Jul 7, 2025 • 0 new comments -
[Build] api-ms-win-core-heap-l2-1-0.dll missing on windows server 2012 R2
#15025 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126
#15035 commented on
Jul 7, 2025 • 0 new comments -
accuracy reduced with multithreaded GPU prediction
#15038 commented on
Jul 7, 2025 • 0 new comments -
mT5 conversion to ONNX and GPU inference problems
#15042 commented on
Jul 7, 2025 • 0 new comments -
[Build] Cannot specify compile definitions for target "onnx" which is not built by this project.
#15051 commented on
Jul 7, 2025 • 0 new comments -
[Performance] CUDA EP with Strange Inference Time
#14016 commented on
Jul 7, 2025 • 0 new comments -
[Performance] the speed and cpu utilization with SetIntraOpNumThreads(1) and SetIntraOpNumThreads(2)
#14018 commented on
Jul 7, 2025 • 0 new comments -
[Performance] onnx vs pt memory usage
#14029 commented on
Jul 7, 2025 • 0 new comments -
[Performance] High memory use by CUDAProvider in Jetson Xavier NX (JetPack 4.4)
#14038 commented on
Jul 7, 2025 • 0 new comments -
java onnxruntime_providers_cuda.dll
#14047 commented on
Jul 7, 2025 • 0 new comments -
There is a vulnerability in torch 1.12.0; upgrade recommended
#14059 commented on
Jul 7, 2025 • 0 new comments -
[Build] Impossible to build onnxruntime with VS2022
#14086 commented on
Jul 7, 2025 • 0 new comments -
[Build] core/framework/fence.h not found while building with CANN
#14121 commented on
Jul 7, 2025 • 0 new comments -
300% slower on MYRIAD_FP16 when using CustomVision fp16 model
#14125 commented on
Jul 7, 2025 • 0 new comments -
[Training] Does the current training code support RNN models like seq2seq and Transformer, and GNN models?
#14139 commented on
Jul 7, 2025 • 0 new comments -
[Build] Dockerfile.arm64 - No module named 'packaging' error
#14140 commented on
Jul 7, 2025 • 0 new comments -
CUDNN error executing cudnnConvolutionForward
#14186 commented on
Jul 7, 2025 • 0 new comments -
ONNXRuntime outputs numerically incorrect results for mixed precision models.
#14189 commented on
Jul 7, 2025 • 0 new comments -
Shape inference incorrect for Split with opset 15
#14200 commented on
Jul 7, 2025 • 0 new comments -
ConvTranspose2d forward results are inconsistent between onnxruntime and PyTorch
#14208 commented on
Jul 7, 2025 • 0 new comments -
`onnxruntime.quantization` does not support `.onnx` files produced by `tf2onnx.convert.from_function` with the `large_model` option set to `True`
#14213 commented on
Jul 7, 2025 • 0 new comments -
No module named 'onnxruntime.transformers.io_binding_helper'
#14230 commented on
Jul 7, 2025 • 0 new comments -
The input tensor cannot be reshaped to the requested shape.
#14237 commented on
Jul 7, 2025 • 0 new comments -
Valgrind: Source and destination overlap in memcpy_chk
#14254 commented on
Jul 7, 2025 • 0 new comments -
[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running OpenVINO-EP-subgraph_3 node. Name:'OpenVINOExecutionProvider_OpenVINO-EP-subgraph_3_0'
#14280 commented on
Jul 7, 2025 • 0 new comments -
Cannot run inference on integrated graphics with OpenVINO EP using the C# API
#13772 commented on
Jul 7, 2025 • 0 new comments -
QDQ not instrumenting inputs if first operator is a SUM
#13794 commented on
Jul 7, 2025 • 0 new comments -
How to bring my hardware backend to the onnxruntime framework
#13797 commented on
Jul 7, 2025 • 0 new comments -
[WebGL] cannot resolve operator 'DynamicQuantizeLinear' with opsets: ai.onnx v16, ...
#13800 commented on
Jul 7, 2025 • 0 new comments -
[CPUExecutionProvider] PyTorch/Numpy operations following InferenceSession.run() are 50x slower compared to using dummy inputs
#13808 commented on
Jul 7, 2025 • 0 new comments -
[Performance] How to improve batch inference performance with a multicore CPU
#13820 commented on
Jul 7, 2025 • 0 new comments -
Dynamic quantization is useless on AMD CPUs (AMD EPYC 7K62 48-Core Processor)
#13872 commented on
Jul 7, 2025 • 0 new comments -
SSDLite 320: RuntimeException on CUDA. TopK index assert was false.
#13876 commented on
Jul 7, 2025 • 0 new comments -
Segmentation Faults when using TensorRT on Jetson Orin Dev Kit
#13877 commented on
Jul 7, 2025 • 0 new comments -
Model run with `TensorrtExecutionProvider` outputs different results compared to `CPUExecutionProvider` / `CUDAExecutionProvider` when the ONNX `Loop` operator is used
#13894 commented on
Jul 7, 2025 • 0 new comments -
[Web] Dynamic batch size doesn't work when using the WebGL provider
#13909 commented on
Jul 7, 2025 • 0 new comments -
[Web] ort-wasm-simd.wasm can't be loaded in Electron renderer (using webpack)
#13933 commented on
Jul 7, 2025 • 0 new comments -
[Build] Incomplete type used in nested name specifier, Ubuntu
#13942 commented on
Jul 7, 2025 • 0 new comments -
Do I need to convert data to device for TensorRTExecutionProvider?
#13952 commented on
Jul 7, 2025 • 0 new comments -
CUDA provider gives different results with respect to CPU
#13962 commented on
Jul 7, 2025 • 0 new comments -
bug: onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running OpenVINO-EP-subgraph_4 node.
#13973 commented on
Jul 7, 2025 • 0 new comments -
Non-zero status code returned while running Resize node
#13975 commented on
Jul 7, 2025 • 0 new comments -
Java Problematic Frame [libonnxruntime.dylib+0x8212be] onnxruntime::DataTypeImpl::ToString(onnxruntime::DataTypeImpl const*)+0xe
#13976 commented on
Jul 7, 2025 • 0 new comments -
[Performance] [webgl] Bad performance of WebGL
#13986 commented on
Jul 7, 2025 • 0 new comments -
[windows7] Unable to load DLL 'onnxruntime.dll': The specified module could not be found.
#14003 commented on
Jul 7, 2025 • 0 new comments -
[Performance] cuda_options.arena_extend_strategy = 1 does not free memory
#14474 commented on
Jul 7, 2025 • 0 new comments -
[Performance] DirectML costs more memory than CPU when running the Win32 (x86) program (official demo).
#14479 commented on
Jul 7, 2025 • 0 new comments -
[Performance] CPU Usage is too high
#14490 commented on
Jul 7, 2025 • 0 new comments -
[Performance] cuDNN lib mismatch led to underutilization of the GPU
#14498 commented on
Jul 7, 2025 • 0 new comments -
missing headers and pkgconfig files in binary packages distribution (from github releases) (linux)
#14503 commented on
Jul 7, 2025 • 0 new comments -
[Web] Runtime error using `onnxruntime-node` with webpack
#14505 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Find out why the GPU memory allocated with `CUDAExecutionProvider` is much larger than the ONNX size
#14526 commented on
Jul 7, 2025 • 0 new comments -
Non-zero status code returned while running DnnlCustomOp2 node
#14543 commented on
Jul 7, 2025 • 0 new comments -
Check and modify the weights of a layer of an onnx model at runtime
#14545 commented on
Jul 7, 2025 • 0 new comments -
[Performance] DirectML Dynamic Axes very slow
#14550 commented on
Jul 7, 2025 • 0 new comments -
[BUG] FusedConv node error
#14561 commented on
Jul 7, 2025 • 0 new comments -
fp32 model with autocast to fp16: Shape mismatch attempting to re-use buffer
#14582 commented on
Jul 7, 2025 • 0 new comments -
[Build] cuda dll wrap up
#14585 commented on
Jul 7, 2025 • 0 new comments -
different results with onnxruntime-gpu-1.10
#14587 commented on
Jul 7, 2025 • 0 new comments -
[Web] currently non-1 steps is not supported for Slice
#14588 commented on
Jul 7, 2025 • 0 new comments -
Destroying an inference session without exiting the python process
#14590 commented on
Jul 7, 2025 • 0 new comments -
C# - CUDA NuGet bug: DefaultLogger: "Attempt to use DefaultLogger but none has been registered."
#14593 commented on
Jul 7, 2025 • 0 new comments -
Onnxruntime Arm NN EP build error.
#14611 commented on
Jul 7, 2025 • 0 new comments -
[Performance]
#14615 commented on
Jul 7, 2025 • 0 new comments -
[Build] cpp_field.h(189,47): error C2059: syntax error: ")"
#14627 commented on
Jul 7, 2025 • 0 new comments -
[Build] Docker arm64 build fails.
#14283 commented on
Jul 7, 2025 • 0 new comments -
ONNX Runtime support for the graph optimization of bigbird_pegasus model
#14295 commented on
Jul 7, 2025 • 0 new comments -
TensorRT EP: same inference time for INT8 and FP16
#14315 commented on
Jul 7, 2025 • 0 new comments -
STFT op has the wrong expected shape
#14316 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Running on Xavier GPU but CPU usage is high
#14676 commented on
Jul 7, 2025 • 0 new comments -
Program gets stuck when creating 'Ort::Session'
#14317 commented on
Jul 7, 2025 • 0 new comments -
[Performance] ONNXRuntime CPU is slower than PyTorch tracing to TorchScript on CPU
#14326 commented on
Jul 7, 2025 • 0 new comments -
RemoveNode Should be unreachable if CanRemoveNodeAndMergeEdges is in sync with the logic
#14360 commented on
Jul 7, 2025 • 0 new comments -
[Bug] Attention and QAttention don't work properly in some cases
#14363 commented on
Jul 7, 2025 • 0 new comments -
Add some custom QlinearXXX Ops
#14365 commented on
Jul 7, 2025 • 0 new comments -
[Build] Error in building with TensorRT EP
#14394 commented on
Jul 7, 2025 • 0 new comments -
Import error: "cannot import name 'get_all_providers'"
#14395 commented on
Jul 7, 2025 • 0 new comments -
[Training] The gradient builder has not been registered: ReduceMin
#14412 commented on
Jul 7, 2025 • 0 new comments -
Free allocated data of Ort::Value in C++
#14420 commented on
Jul 7, 2025 • 0 new comments -
Pad operator not quantizable?
#14422 commented on
Jul 7, 2025 • 0 new comments -
Different Python exceptions on OOM with `run_with_iobinding` and `run`
#14438 commented on
Jul 7, 2025 • 0 new comments -
Modifying QlinearADD
#14441 commented on
Jul 7, 2025 • 0 new comments -
[ONNXRuntimeError] Unsupported OrtValue type with CUDA EP
#14457 commented on
Jul 7, 2025 • 0 new comments -
[Performance] There is some confusion with onnx + oneDNN or onnx + OpenVINO
#14468 commented on
Jul 7, 2025 • 0 new comments -
[Build]
#14471 commented on
Jul 7, 2025 • 0 new comments -
[ onnxruntime::SequentialExecutor::Execute] Non-zero status code returned while running FusedMatMul node. Name:'MatMul_With_Transpose_token_14_FusedMatMulAndScale' Status Message: bad allocation unknown file: error: C++ exception with description "Non-zero status code returned while running FusedMatMul node. Name:'MatMul_With_Transpose_token_14_FusedMatMulAndScale' Status Message: bad allocation" thrown in the test body.
#16305 commented on
Jul 7, 2025 • 0 new comments -
issue running onnxruntime with pytest
#16306 commented on
Jul 7, 2025 • 0 new comments -
How to catch an OOM exception.
#16307 commented on
Jul 7, 2025 • 0 new comments -
How to edit Clip Operator in OnnxRuntime?
#16315 commented on
Jul 7, 2025 • 0 new comments -
Error when using libonnxruntime with the DNNL EP
#16320 commented on
Jul 7, 2025 • 0 new comments -
Increase/decrease the maximum number of events during inference profiling.
#16334 commented on
Jul 7, 2025 • 0 new comments -
[Build] Fails to parse FP16 LayerNormalization in opset>=18
#16341 commented on
Jul 7, 2025 • 0 new comments -
[Build] Disable ORT_ENABLE_STREAM build error
#16345 commented on
Jul 7, 2025 • 0 new comments -
MaxPool: When Ceil_mode=1, MaxPool Generates Big Values.
#16350 commented on
Jul 7, 2025 • 0 new comments -
AveragePool: When Ceil_mode=1, AveragePool Generates Nan or 0 Values.
#16351 commented on
Jul 7, 2025 • 0 new comments -
[Training]
#16354 commented on
Jul 7, 2025 • 0 new comments -
multi-GPU inferencing
#16382 commented on
Jul 7, 2025 • 0 new comments -
Operator Pad reflect mode does not yield correct results
#16401 commented on
Jul 7, 2025 • 0 new comments -
[Web] Web ~40x slower than native
#16412 commented on
Jul 7, 2025 • 0 new comments -
[Performance] DML dynamic axes performance regression.
#16424 commented on
Jul 7, 2025 • 0 new comments -
C++ Runtime does not recognize supposedly correct input.
#16430 commented on
Jul 7, 2025 • 0 new comments -
Normalizer does not work as expected
#16451 commented on
Jul 7, 2025 • 0 new comments -
[Mobile] Unable to load models in Xamarin iOS
#16463 commented on
Jul 7, 2025 • 0 new comments -
[Performance] net.set_providers(['DmlExecutionProvider'], [{'device_id': 0}]) could get stuck forever (directml EP)
#16473 commented on
Jul 7, 2025 • 0 new comments -
No ONNX acceleration on E5-2680 v3
#16185 commented on
Jul 7, 2025 • 0 new comments -
[Performance] setIntraOpNumThreads doesn't offer enough parallelization in the Java API
#16192 commented on
Jul 7, 2025 • 0 new comments -
DmlExecutionProvider bound to PyTorch tensor stops running
#16197 commented on
Jul 7, 2025 • 0 new comments -
NullReferenceException when creating an object of class SessionOptions | Unity
#16205 commented on
Jul 7, 2025 • 0 new comments -
[quantization] Problem with QDQ of Pow/Sqrt/Div
#16219 commented on
Jul 7, 2025 • 0 new comments -
Why isn't the input placed on CUDA?
#16225 commented on
Jul 7, 2025 • 0 new comments -
[Training][api:C++][feature request] Support Model Forward Output and Backward Gradient Extraction in ONNX runtime training
#16232 commented on
Jul 7, 2025 • 0 new comments -
TensorrtExecutionProvider::GetSupportedList graph_build.Resolve().IsOK() was false.
#16234 commented on
Jul 7, 2025 • 0 new comments -
Not returning anything for out-of-vocabulary text during batch inference using the TF-IDF ONNX Vectorizer model
#16251 commented on
Jul 7, 2025 • 0 new comments -
Inconsistent generation of vectors by TF-IDF ONNX Vectorizer Model
#16252 commented on
Jul 7, 2025 • 0 new comments -
[OOM] Unable to convert 30B Model
#16254 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Evaluation behavior with external arrays (C API)
#16255 commented on
Jul 7, 2025 • 0 new comments -
ONNX uses more memory than PyTorch for some models
#16264 commented on
Jul 7, 2025 • 0 new comments -
[Web/Build] Failed to consume onnxruntime-common because of JS parser not up-to-date
#16265 commented on
Jul 7, 2025 • 0 new comments -
How to trace the error "assert node is not None" when using onnxruntime.transformers.optimizer
#16268 commented on
Jul 7, 2025 • 0 new comments -
ROCm EP: Errors when trying to infer, which GPUs are supported?
#16271 commented on
Jul 7, 2025 • 0 new comments -
[Accuracy/Performance]
#16275 commented on
Jul 7, 2025 • 0 new comments -
Does onnxruntime v1.14 still support the Python Operator, and what is the highest version that supports this feature?
#16277 commented on
Jul 7, 2025 • 0 new comments -
Seg faults when creating InferenceSession for SAM backbone
#16300 commented on
Jul 7, 2025 • 0 new comments -
[Mobile] Error: Non string type of a tensor data is not allowed
#16301 commented on
Jul 7, 2025 • 0 new comments -
[Bug] Coqui VITS ONNX model can't be statically quantized.
#16738 commented on
Jul 7, 2025 • 0 new comments -
CUDA Custom Op CUDA failure
#16748 commented on
Jul 7, 2025 • 0 new comments -
clean build v1.15.1 fails three fp16 tests due to `difference between... exceeds threshold`
#16775 commented on
Jul 7, 2025 • 0 new comments -
[Performance] FP16 models incur large cast latency when run on CPUs without FP16 support
#16778 commented on
Jul 7, 2025 • 0 new comments -
Incorrect Output from Java Model
#16781 commented on
Jul 7, 2025 • 0 new comments -
Segmentation Fault when using TensorRT execution provider
#16790 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Using onnxruntime with Ray, plus a fix for an overly high memory footprint
#16793 commented on
Jul 7, 2025 • 0 new comments -
[Performance] [Web] Using the `onnxruntime-web` package (`wasm` backend) with Node.js is 1.6x to 2x faster than in browsers and Deno?
#16798 commented on
Jul 7, 2025 • 0 new comments -
[Training] Proposal: Implement back propagation algorithm for C#
#16809 commented on
Jul 7, 2025 • 0 new comments -
[Performance]
#16817 commented on
Jul 7, 2025 • 0 new comments -
[Mobile] Failure to load whisper model .ort with react-native, regular and quantized versions
#16819 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime_providers_cuda.dll cannot be loaded due to "Can't find dependent libraries" under Windows 10 environment using Java
#16821 commented on
Jul 7, 2025 • 0 new comments -
op.SequenceEmpty(dtype=xxx) cannot be set to float16.
#16846 commented on
Jul 7, 2025 • 0 new comments -
[Performance] High latency variance
#16876 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Convolution layer issue profiling
#16926 commented on
Jul 7, 2025 • 0 new comments -
[Mobile][Kotlin] OnnxTensor.createTensor from floatBuffer takes up 7 seconds
#16937 commented on
Jul 7, 2025 • 0 new comments -
Crash at winrt::Windows::AI::MachineLearning::implementation::LearningModelSession::GetResults during inferencing
#16988 commented on
Jul 7, 2025 • 0 new comments -
Native assemblies aren't copied when Onnx is a transitive dependency and using netstandard
#17010 commented on
Jul 7, 2025 • 0 new comments -
Why does onnxruntime extract only a 483 MB JSON file?
#17013 commented on
Jul 7, 2025 • 0 new comments -
Why do some models not properly profile the input weights, bias, etc., and the node index in the JSON file?
#17022 commented on
Jul 7, 2025 • 0 new comments -
Automatic deallocation (?) of the Ort::Sessions, memory leak?
#16497 commented on
Jul 7, 2025 • 0 new comments -
m2m 100 418M
#16480 commented on
Jul 7, 2025 • 0 new comments -
[Performance] A model with a large TreeEnsembleClassifier node takes too long to be loaded
#16511 commented on
Jul 7, 2025 • 0 new comments -
Setting `CUBLAS_WORKSPACE_CONFIG=":4096:8"` leads to `CUBLAS_STATUS_ALLOC_FAILED`
#16512 commented on
Jul 7, 2025 • 0 new comments -
ONNXRuntimeError: Training mode does not support BN opset 14 (or higher) yet.
#16867 commented on
Jul 7, 2025 • 0 new comments -
[Build] INVALID_ARGUMENT : Invalid rank for input: input Got: 4 Expected: 2 Please fix either the inputs or the model.
#16557 commented on
Jul 7, 2025 • 0 new comments -
[Build] libonnxruntime_providers_dnnl.so: undefined symbol: omp_get_max_threads
#16561 commented on
Jul 7, 2025 • 0 new comments -
Large model >2GB save_to_ort
#16573 commented on
Jul 7, 2025 • 0 new comments -
[Build] fatal error: too many errors emitted, stopping now [-ferror-limit=]
#16576 commented on
Jul 7, 2025 • 0 new comments -
[Build] Cannot build onnxruntime
#16583 commented on
Jul 7, 2025 • 0 new comments -
Conv3d precision error between pytorch and onnx
#16589 commented on
Jul 7, 2025 • 0 new comments -
[Training] Define a custom training with some ONNX models
#16597 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Performance degradation observed w.r.t DNNL-EP in v1.15.1 compared to v1.13.1
#16609 commented on
Jul 7, 2025 • 0 new comments -
[Build] No C++ library is generated after compilation completes
#16610 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Computation time of iteratively applying neural network in a single ONNX model using CUDA Execution Provider dominated by Memcpy
#16625 commented on
Jul 7, 2025 • 0 new comments -
[Build] Dependency on OMP/MPI Runtime
#16631 commented on
Jul 7, 2025 • 0 new comments -
[Performance]
#16637 commented on
Jul 7, 2025 • 0 new comments -
The input tensor cannot be reshaped to the requested shape after adding Gather output to model's output
#16670 commented on
Jul 7, 2025 • 0 new comments -
Access violation reading location when I use CreateArenaCfgV2 and CUDA
#16686 commented on
Jul 7, 2025 • 0 new comments -
One path in the graph requests feature X(>Y) but input tensor has Y features
#16695 commented on
Jul 7, 2025 • 0 new comments -
InferenceSession fails with segmentation fault when fp16 model is loaded with CPUExecutionProvider
#15494 commented on
Jul 7, 2025 • 0 new comments -
[ErrorCode:Fail] Load model from [...]\latin_ipa_forward.onnx failed:invalid vector subscript
#15495 commented on
Jul 7, 2025 • 0 new comments -
[Build] OpenVINO debug build fails on VS2019
#15496 commented on
Jul 7, 2025 • 0 new comments -
[Web] probability is not returned: `error code = 1`
#15511 commented on
Jul 7, 2025 • 0 new comments -
SimplifiedLayerNormalization loading error for converted FP16 databricks/dolly-v2-3b model
#15531 commented on
Jul 7, 2025 • 0 new comments -
[Performance] FP16 model can not get acceleration on GPU with ONNXRuntime-GPU
#15534 commented on
Jul 7, 2025 • 0 new comments -
Get results from Mask RCNN model with C++
#15541 commented on
Jul 7, 2025 • 0 new comments -
fatal error: gsl/gsl: No such file or directory
#15554 commented on
Jul 7, 2025 • 0 new comments -
Error running quantize_dynamic: Failed to find proper ai.onnx domain
#15563 commented on
Jul 7, 2025 • 0 new comments -
[Build] 1.14.0-dev-20230120-0204-3d6cea14f4 (This build breaks model on Intel)
#15567 commented on
Jul 7, 2025 • 0 new comments -
[Performance] CUDA fp16 didn't get a speedup
#15585 commented on
Jul 7, 2025 • 0 new comments -
Error with custom spconv class in onnx runtime
#15594 commented on
Jul 7, 2025 • 0 new comments -
[Build] Java Nightly build
#15600 commented on
Jul 7, 2025 • 0 new comments -
[Build] the Linux build config
#15621 commented on
Jul 7, 2025 • 0 new comments -
Can't use onnxruntime with DirectML built from source
#15628 commented on
Jul 7, 2025 • 0 new comments -
[Performance] CNN model exported by PyTorch runs slower than Tensorflow 1.0
#15647 commented on
Jul 7, 2025 • 0 new comments -
onnxRuntimeException and DefaultLogger issues in AWS Lambda runtime
#15650 commented on
Jul 7, 2025 • 0 new comments -
ONNXRuntime in Docker
#15652 commented on
Jul 7, 2025 • 0 new comments -
ONNX with FloatTensorType returns a different label every time when inferred from C++
#15665 commented on
Jul 7, 2025 • 0 new comments -
[Build] Compile Error if path too long
#15674 commented on
Jul 7, 2025 • 0 new comments -
[Performance]
#15265 commented on
Jul 7, 2025 • 0 new comments -
ONNX model with FBNetv3 architecture Conversion to TensorRT Problem
#15269 commented on
Jul 7, 2025 • 0 new comments -
[Build] ONNX Java Runtime - Handle UnsatisfiedLinkError
#15281 commented on
Jul 7, 2025 • 0 new comments -
[Documentation Request] Estimating (or Checking) Allocated Memory
#15326 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Timings feedback
#15328 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Gemm op is slower after quantization
#15332 commented on
Jul 7, 2025 • 0 new comments -
[Mobile] onnxruntime-c and onnxruntime-extensions-c pod conflict with DocumentReader pod
#15333 commented on
Jul 7, 2025 • 0 new comments -
[Performance] onnxruntime sometimes hangs dead in Python multiprocessing
#15345 commented on
Jul 7, 2025 • 0 new comments -
ValueError: Message onnx.ModelProto exceeds maximum protobuf size of 2GB: 2235646909
#15349 commented on
Jul 7, 2025 • 0 new comments -
[Web] custom ops
#15374 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Running large language models with dynamic input size performs poorly. (DirectML)
#15394 commented on
Jul 7, 2025 • 0 new comments -
Opset Coverage - Binary Size Tradeoff
#15397 commented on
Jul 7, 2025 • 0 new comments -
[Build] C++ API calling fail: error C2280: 'Ort::Value::Value(const Ort::Value &)' : attempt to reference a deleted function
#15418 commented on
Jul 7, 2025 • 0 new comments -
Mask-RCNN network is giving significantly different result with DirectML EP
#15459 commented on
Jul 7, 2025 • 0 new comments -
Error Unrecognized attribute: layout for operator DynamicQuantizeLSTM
#15465 commented on
Jul 7, 2025 • 0 new comments -
Please provide an informative message on dlopen failures -- Python API
#15476 commented on
Jul 7, 2025 • 0 new comments -
[Performance] WebAssembly 1x1 Conv almost 4x slower than native
#15483 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Model converted to mixed precision results in higher latency
#15490 commented on
Jul 7, 2025 • 0 new comments -
Inference slows down on gpu.
#15491 commented on
Jul 7, 2025 • 0 new comments -
[Bug?] Casting int8-->float
#15492 commented on
Jul 7, 2025 • 0 new comments -
[Predict] Prediction from ONNX is the same for all images
#16001 commented on
Jul 7, 2025 • 0 new comments -
The result of the Col2Im operator is not close to the Torch result for the fp16 dtype
#16007 commented on
Jul 7, 2025 • 0 new comments -
[Performance] QUInt8 vs a basic ONNX
#16009 commented on
Jul 7, 2025 • 0 new comments -
RunOptions.only_execute_path_to_fetches not working
#16013 commented on
Jul 7, 2025 • 0 new comments -
Cannot open include file: numpy/arrayobject.h
#16027 commented on
Jul 7, 2025 • 0 new comments -
[Web] The onnxruntime-web example is loading wasm file twice if set to local path
#16028 commented on
Jul 7, 2025 • 0 new comments -
[Web] [WebGPU] Uncaught (in promise) DOMException: Unable to instantiate a Device in Firefox Nightly/Linux
#16029 commented on
Jul 7, 2025 • 0 new comments -
Inference time decreases as batch size increases up to a certain point, then increases again.
#16030 commented on
Jul 7, 2025 • 0 new comments -
Can we customize memory allocation functions (like malloc/free) for inference in the C API?
#16032 commented on
Jul 7, 2025 • 0 new comments -
[Performance] How to solve the problem of releasing GPU memory in onnxruntime
#16033 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Huge gap between nn.Conv1d() and nn.Conv2d() - models exported by PyTorch
#16047 commented on
Jul 7, 2025 • 0 new comments -
Unexpected inference output from QLinearConv
#16105 commented on
Jul 7, 2025 • 0 new comments -
Memory leak in cpuinfo_x86_linux_init
#16117 commented on
Jul 7, 2025 • 0 new comments -
Segmentation Fault when optimizing Stable Diffusion models
#16140 commented on
Jul 7, 2025 • 0 new comments -
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] with swin-t
#16143 commented on
Jul 7, 2025 • 0 new comments -
Segmentation fault while loading CUDA Provider
#16146 commented on
Jul 7, 2025 • 0 new comments -
[Performance] ONNX Runtime doesn't parallelize operations in CPU models
#16158 commented on
Jul 7, 2025 • 0 new comments -
The prediction results from STFT have changed, with a notable shift towards larger differences from PyTorch, in ORT==1.15.0
#16163 commented on
Jul 7, 2025 • 0 new comments -
[MacOS] Unable to load libonnxruntime.dylib because binaries are not signed.
#16168 commented on
Jul 7, 2025 • 0 new comments -
[Build] line 2812, in <module> sys.exit(main())
#16179 commented on
Jul 7, 2025 • 0 new comments -
[CANN] EP: CANN cannot complete inference on Atlas200DK
#15677 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Can't get GPU speed-up when the exe program is located in a path with Chinese characters
#15678 commented on
Jul 7, 2025 • 0 new comments -
[ErrorCode:InvalidArgument] Invalid Feed Input Name:image
#15692 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Can we set model weight precision when converting a Keras model into an ONNX model?
#15695 commented on
Jul 7, 2025 • 0 new comments -
[Build] Onnxruntime-gpu for Jetpack 5.1.1 on Jetson Orin Nano Developer Kit
#15732 commented on
Jul 7, 2025 • 0 new comments -
GraphOptimization (ORT_ENABLE_ALL) is slower using ONNXRuntime-GPU
#15743 commented on
Jul 7, 2025 • 0 new comments -
Load onnx failed (segmentation fault) with version 1.14.1 (2)
#15745 commented on
Jul 7, 2025 • 0 new comments -
Inference using the CUDA EP returns nan
#15752 commented on
Jul 7, 2025 • 0 new comments -
[Build]
#15786 commented on
Jul 7, 2025 • 0 new comments -
How to set CalibrationDataReader when my datatype is time series?
#15836 commented on
Jul 7, 2025 • 0 new comments -
[Build]
#15863 commented on
Jul 7, 2025 • 0 new comments -
[Training] Training Onnx format Models
#15867 commented on
Jul 7, 2025 • 0 new comments -
Failed to create CUDAExecutionProvider
#15873 commented on
Jul 7, 2025 • 0 new comments -
[RunTimeError] Shape inference error at runtime and mismatch with the ONNX spec for Split opset 18
#15882 commented on
Jul 7, 2025 • 0 new comments -
[Performance] `CUDAExecutionProvider` uses 3x the memory of `CPUExecutionProvider`
#15886 commented on
Jul 7, 2025 • 0 new comments -
symbolic_shape_infer.py failure
#15898 commented on
Jul 7, 2025 • 0 new comments -
Linking executable with static libraries --> error LNK2038: mismatch detected
#15928 commented on
Jul 7, 2025 • 0 new comments -
Atlas200DK uses EP: CANN to infer resnet50 and reports "CANN errorEE9999: Inner Error!"
#15947 commented on
Jul 7, 2025 • 0 new comments -
[Performance] Redundant ReorderOutput / ReorderInput operators in Conv+Maxpool layers when graph optimization level is ALL
#15964 commented on
Jul 7, 2025 • 0 new comments -
float16 result does not match numpy or torch
#15977 commented on
Jul 7, 2025 • 0 new comments