forked from microsoft/onnxruntime
Backmerging With msft commits #730
Merged
Conversation
### Description Fixes NPM packaging pipeline.
…icrosoft#25249) ### Description Add 2% more tolerance for `MatMulNBits` accuracy level int8 compared with f32/f16, to fix microsoft#25231. ### Motivation and Context See above.
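The looser comparison described above can be illustrated with a small sketch. The numbers and the helper below are made up for illustration; the actual thresholds and comparison logic live in the ORT test suite.

```python
# Illustrative sketch: accept int8 accuracy-level results that fall within a
# 2% relative tolerance of the f32 baseline. Values are invented examples.
def within_rtol(actual, expected, rtol):
    """True if every element of `actual` is within rtol * |expected| of `expected`."""
    return all(abs(a - e) <= rtol * abs(e) for a, e in zip(actual, expected))

baseline_f32 = [1.000, 2.000, 3.000]
int8_result = [1.015, 1.990, 3.020]
print(within_rtol(int8_result, baseline_f32, rtol=0.02))  # True
```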
### Description This pull request refactors the logging of provider options in the ONNX Runtime framework to improve telemetry functionality. The changes include consolidating logging logic, introducing platform-independent methods, and enhancing the telemetry interface for better extensibility.
### Description This makes the QAIRT/QNN version available in the Python client as `onnxruntime.capi.build_and_package_info.qnn_version`, similar to how it's already done for `cuda_version` and `rocm_version`. ### Motivation and Context Users in some situations need to bring their own QAIRT/QNN SDK. In these cases, it is important to know the correct version to supply to ensure compatibility.
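A hedged sketch of how a consumer might read this field from an installed wheel. The attribute path comes from the description above; the guard is added here so the snippet degrades gracefully when onnxruntime is not installed or the attribute is absent.

```python
# Sketch: read a build-time SDK version from the onnxruntime package, if present.
def get_build_sdk_version(name):
    """Return the named build_and_package_info attribute, or None if unavailable."""
    try:
        from onnxruntime.capi import build_and_package_info as info
    except ImportError:
        return None  # onnxruntime is not installed
    return getattr(info, name, None)

print("qnn_version:", get_build_sdk_version("qnn_version"))
```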
### Description This enables building TRT and TRT RTX in the same ORT build.
### Description - Add handling for input DQ and output Q. To avoid translating dummy QNN Quantize/Dequantize nodes, move UDO translation from the builder to qnn_node_group. - Change the dlopen flag; otherwise HTP is unable to locate the symbol. ### Motivation and Context Improve UDO support for QDQ models.
…25043) ### Description Use the non-CPU device type and id for host-accessible memory to make the link between the CPU and the non-CPU device explicit. Update the data transfer implementations to check the vendor id.
### Description
Previously we had two artifacts:
- `${{ inputs.build_config }}_wasm`
- `${{ inputs.build_config }}_wasm_webgpu`
Now that we use a different file name, we can simplify this part into a single artifact.
### Description - Add BaseOpBuilder::ProcessDataTypes. - Add CheckCpuDataTypes, CheckHtpDataTypes, and CheckGpuDataTypes. - Check whether data types are supported on QnnCpu and QnnHtp for BatchNorm. - Add corresponding unit tests for BatchNorm on QnnCpu and QnnHtp. ### Motivation and Context Because data type support varies per op across backends (QnnCpu, QnnHtp, QnnGpu), we need infrastructure to check data types against the documentation: https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/operations.html#backend-supplements
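The shape of such a per-backend check can be sketched in a few lines. Everything below is hypothetical: the function name, the table, and the dtype sets are invented for illustration; the real checks are implemented in C++ inside the QNN EP op builders against Qualcomm's backend-supplement tables.

```python
# Hypothetical sketch of a per-backend datatype support table and check.
# The sets below are invented examples, not the real QNN support matrix.
SUPPORTED = {
    "QnnCpu": {"BatchNorm": {"float32"}},
    "QnnHtp": {"BatchNorm": {"float32", "float16", "uint8", "uint16"}},
}

def check_data_types(backend, op_type, dtypes):
    """True if every given dtype is supported for op_type on the backend."""
    allowed = SUPPORTED.get(backend, {}).get(op_type, set())
    return all(dt in allowed for dt in dtypes)

print(check_data_types("QnnHtp", "BatchNorm", ["float16"]))  # True
print(check_data_types("QnnCpu", "BatchNorm", ["float16"]))  # False
```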
### Description
Fixes build on macOS.
Always include `"core/graph/onnx_protobuf.h"` before including other onnx/protobuf headers.
```
.../build/MacOS/Debug/_deps/protobuf-src/src/google/protobuf/parse_context.h:328:47: error: implicit conversion loses integer precision: 'long' to 'int' [-Werror,-Wshorten-64-to-32]
328 | int chunk_size = buffer_end_ + kSlopBytes - ptr;
| ~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~
```
### Description New binary `onnxruntime_ep_graph_test` uses `test_main.cc`, which contains deprecated declarations from `NvInfer.h`. Ignore these warnings when building TRT EP. ### Motivation and Context Fixes build for TRT EP. Signed-off-by: Kevin Chen <kevinch@nvidia.com>
…microsoft#25247) ### Description
- Remove `OrtArrayOfConstObjects` from the C API.
- Rework graph APIs that return an array of objects to take pre-allocated buffers as input.
- Rename `Node_GetParentGraph` to `Node_GetGraph`.
- Fixes C/C++ API documentation generation: https://github.com/microsoft/onnxruntime/actions/runs/16029991022

### Motivation and Context
Make the graph C APIs easier to use. The `OrtArrayOfConstObjects` approach was too verbose and made the API harder to understand because the function signatures did not show the element data types.

Example usage with `OrtArrayOfConstObjects`:

```c++
const OrtGraph* graph;  // Assumed initialized.
OrtArrayOfConstObjects* nodes = nullptr;
RETURN_IF_ERROR(ort_api.Graph_GetNodes(graph, &nodes));  // Get array.

size_t num_nodes = 0;
RETURN_IF_ERROR(ort_api.ArrayOfConstObjects_GetSize(nodes, &num_nodes));  // Get size.

// Use the nodes.
for (size_t i = 0; i < num_nodes; i++) {
  const OrtNode* node = nullptr;
  RETURN_IF_ERROR(ort_api.ArrayOfConstObjects_GetElementAt(nodes, i, reinterpret_cast<const void**>(&node)));
  // Inspect OrtNode properties ...
}

// Have to manually release the OrtArrayOfConstObjects.
// A C++ ORT wrapper class would help via RAII, but the same C API calls are made under the hood.
ort_api.ReleaseArrayOfConstObjects(nodes);
```

Example usage with the "pre-allocated buffers" style:

```c++
const OrtGraph* graph;  // Assumed initialized.

// Get number of nodes.
size_t num_nodes = 0;
RETURN_IF_ERROR(ort_api.Graph_GetNumNodes(graph, &num_nodes));

// Pre-allocate a buffer of OrtNode* and get the nodes.
std::vector<const OrtNode*> nodes(num_nodes);
RETURN_IF_ERROR(ort_api.Graph_GetNodes(graph, nodes.data(), nodes.size()));

// Use the nodes.
for (size_t i = 0; i < num_nodes; i++) {
  const OrtNode* node = nodes[i];
  // Inspect OrtNode properties.
}

// The std::vector destructor cleans up for us.
```

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
) ### Description This commit adds support for running on POWER9 and POWER10 processors on FreeBSD. The only major difference from Linux is that FreeBSD uses elf_aux_info() instead of getauxval(). ### Motivation and Context Support POWER9 and POWER10 on FreeBSD.
### Description Fix the Windows CUDA build with CUDA 12.8, where nvcc raises an error that C:/DNDEBUG is not found. The root cause is that the current build script passes flags starting with /D directly to nvcc. This does not cause an issue in the CI build, because CI uses CUDA 12.2 and we set a flag that lets nvcc forward unknown flags to the host compiler. That is not ideal, since different versions of nvcc may behave differently. Here we distinguish nvcc flags (starting with -) from MSVC flags (starting with /), and pass the MSVC flags to the host compiler explicitly.
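The flag-splitting idea can be sketched as follows. This is an illustrative snippet, not the actual build script; `partition_flags` is an invented helper, though `-Xcompiler` is nvcc's real mechanism for forwarding flags to the host compiler.

```python
# Sketch: separate MSVC-style host flags (/...) from nvcc-native flags (-...),
# then forward the host flags explicitly via nvcc's -Xcompiler option.
def partition_flags(flags):
    """Split a flag list into (nvcc-native flags, MSVC host-compiler flags)."""
    nvcc = [f for f in flags if not f.startswith("/")]
    msvc = [f for f in flags if f.startswith("/")]
    return nvcc, msvc

nvcc, msvc = partition_flags(["-O2", "/DNDEBUG", "-lineinfo", "/MT"])
cmd = nvcc + (["-Xcompiler", ",".join(msvc)] if msvc else [])
print(cmd)  # ['-O2', '-lineinfo', '-Xcompiler', '/DNDEBUG,/MT']
```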
### Description Fix an issue with the Stream notification function. The stream can be nullptr, so using a reference was incorrect. Also improve readability. ### Motivation and Context Fix an incorrect function signature.
### Description - Change the entries container type from `std::unordered_map` to `std::map`. This enables deterministic iteration order. - Enforce internal container state consistency. `OrtKeyValuePairs` has several internal containers that must stay consistent. Previously, the internal containers were public. This change makes them private and also fixes the copy/move behavior. - Add unit tests. ### Motivation and Context Some fixes and improvements.
### Description Update Qnn default version to 2.36.0.250627
ankitm3k approved these changes on Jul 4, 2025