@jatinwadhwa921
Backmerging With msft commits

fs-eire and others added 20 commits July 1, 2025 23:16
### Description

Fixes NPM packaging pipeline.
…icrosoft#25249)

### Description
Add 2% more tolerance to `MatMulNBits` accuracy level int8 compared with
f32/f16, to fix microsoft#25231.

### Motivation and Context
See above.
### Description
This pull request refactors the logging of provider options in the ONNX
Runtime framework to improve telemetry functionality. The changes
include consolidating logging logic, introducing platform-independent
methods, and enhancing the telemetry interface for better extensibility.
### Description

This makes the QAIRT/QNN version available in the Python client as `onnxruntime.capi.build_and_package_info.qnn_version`, similar to how it's already done for `cuda_version` and `rocm_version`.

### Motivation and Context
Users in some situations need to bring their own QAIRT/QNN SDK. In these cases, it is important to know the correct version to supply to ensure compatibility.
### Description

This enables building TRT and TRT RTX in the same ORT build.
### Description
- Add handling for input DQ and output Q. To avoid translating dummy QNN Quantize/Dequantize nodes, move UDO translation from the builder to qnn_node_group.
- Change the dlopen flag; otherwise HTP cannot locate the symbol.

### Motivation and Context
Improve UDO support for QDQ models.
…25043)

### Description
Use the non-CPU device type and id for host accessible memory to make
the link between CPU and the non-CPU device explicit.

Update the data transfer implementations to check vendor id.

### Description

Previously we had two artifacts:
- `${{ inputs.build_config }}_wasm`
- `${{ inputs.build_config }}_wasm_webgpu`

Now that the files use different names, we can simplify this part and
produce a single artifact.
### Description
- Add BaseOpBuilder::ProcessDataTypes
- Add CheckCpuDataTypes, CheckHtpDataTypes and CheckGpuDataTypes
- Check if datatypes are supported on QnnCpu and QnnHtp for BatchNorm
- Add corresponding unit test for BatchNorm on QnnCpu and QnnHtp

### Motivation and Context
- Datatype support for each op varies across backends (QnnCpu, QnnHtp, QnnGpu),
so we need infrastructure to check datatypes against the backend supplements documentation:

https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/operations.html#backend-supplements
### Description

Fixes build on macOS.

Always include `"core/graph/onnx_protobuf.h"` before including other
onnx/protobuf headers.

```
.../build/MacOS/Debug/_deps/protobuf-src/src/google/protobuf/parse_context.h:328:47: error: implicit conversion loses integer precision: 'long' to 'int' [-Werror,-Wshorten-64-to-32]
  328 |     int chunk_size = buffer_end_ + kSlopBytes - ptr;
      |         ~~~~~~~~~~   ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~
```
### Description
New binary `onnxruntime_ep_graph_test` uses `test_main.cc`, which
contains deprecated declarations from `NvInfer.h`. Ignore these warnings
when building TRT EP.

### Motivation and Context
Fixes build for TRT EP.

Signed-off-by: Kevin Chen <kevinch@nvidia.com>
…microsoft#25247)

### Description
- Remove `OrtArrayOfConstObjects` from C API
- Rework graph APIs that return an array of objects to take
pre-allocated buffers as input.
- Rename `Node_GetParentGraph` to `Node_GetGraph`
- Fixes C/C++ API documentation generation:
https://github.com/microsoft/onnxruntime/actions/runs/16029991022


### Motivation and Context
Make the graph C APIs easier to use. The `OrtArrayOfConstObjects`
approach was too verbose to use and made the API harder to understand
because the function signatures did not show the element data types.

Example usage with `OrtArrayOfConstObjects`:

```c++
const OrtGraph* graph;  // Assumed to be initialized

OrtArrayOfConstObjects* nodes = nullptr;
RETURN_IF_ERROR(ort_api.Graph_GetNodes(graph, &nodes));  // Get array

size_t num_nodes = 0;
RETURN_IF_ERROR(ort_api.ArrayOfConstObjects_GetSize(nodes, &num_nodes)); // Get size

// Use the nodes.
for (size_t i = 0; i < num_nodes; i++) {
  const OrtNode* node = nullptr;
  RETURN_IF_ERROR(ort_api.ArrayOfConstObjects_GetElementAt(nodes, i,
                                                           reinterpret_cast<const void**>(&node)));

  // Inspect OrtNode properties ...
}

// Have to manually release the OrtArrayOfConstObjects
// A C++ ORT wrapper class would help via RAII, but the same C api calls are made under the hood.
ort_api.ReleaseArrayOfConstObjects(nodes);
```

Example usage with "pre-allocated" buffers style:

```c++
const OrtGraph* graph;  // Assumed to be initialized

// Get number of nodes.
size_t num_nodes = 0;
RETURN_IF_ERROR(ort_api.Graph_GetNumNodes(graph, &num_nodes));

// Pre-allocate buffer of OrtNode* and get nodes.
std::vector<const OrtNode*> nodes(num_nodes);
RETURN_IF_ERROR(ort_api.Graph_GetNodes(graph, nodes.data(), nodes.size()));

// Use the nodes.
for (size_t i = 0; i < num_nodes; i++) {
  const OrtNode* node = nodes[i];
  // Inspect OrtNode properties.
}

// std::vector destructor cleans up for us.
```

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>

### Description
This commit adds support for running on POWER9 and POWER10 processors on
FreeBSD. The only major difference from Linux is that FreeBSD uses
elf_aux_info() instead of getauxval().
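
As a rough illustration of that difference, the sketch below reads `AT_HWCAP2` through whichever call the platform provides and then tests the POWER9/POWER10 bits. It is not the code from this commit: `ReadHwcap2`, `IsPower9OrNewer`, and `IsPower10OrNewer` are hypothetical names, and the two bit values are assumed to match `PPC_FEATURE2_ARCH_3_00` and `PPC_FEATURE2_ARCH_3_1` from the platform headers.

```c++
#if defined(__FreeBSD__) || defined(__linux__)
#include <sys/auxv.h>  // elf_aux_info() on FreeBSD, getauxval() on Linux
#endif

// Assumed values of PPC_FEATURE2_ARCH_3_00 (POWER9) and
// PPC_FEATURE2_ARCH_3_1 (POWER10); check them against the platform headers.
constexpr unsigned long kPpcFeature2Arch300 = 0x00800000UL;
constexpr unsigned long kPpcFeature2Arch31 = 0x00040000UL;

// Hypothetical helper: returns the AT_HWCAP2 bits, or 0 if unavailable.
// The bits only carry the meaning above on PowerPC targets.
inline unsigned long ReadHwcap2() {
#if defined(__FreeBSD__)
  unsigned long hwcap2 = 0;
  // elf_aux_info() writes into a caller-provided buffer and returns 0 on success.
  if (elf_aux_info(AT_HWCAP2, &hwcap2, static_cast<int>(sizeof(hwcap2))) != 0) {
    return 0;
  }
  return hwcap2;
#elif defined(__linux__)
  // getauxval() returns the value directly (0 if the entry is missing).
  return getauxval(AT_HWCAP2);
#else
  return 0;
#endif
}

constexpr bool IsPower10OrNewer(unsigned long hwcap2) {
  return (hwcap2 & kPpcFeature2Arch31) != 0;
}

constexpr bool IsPower9OrNewer(unsigned long hwcap2) {
  return (hwcap2 & (kPpcFeature2Arch300 | kPpcFeature2Arch31)) != 0;
}
```

A caller on a PowerPC build would use it as `IsPower10OrNewer(ReadHwcap2())`; keeping the bit tests separate from the OS call means only `ReadHwcap2` needs a per-platform branch.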


### Motivation and Context
Supporting POWER9 and POWER10 on FreeBSD.
### Description

Fix the Windows CUDA build when using CUDA 12.8, where nvcc fails with an
error that `C:/DNDEBUG` is not found.

The root cause is that the current build script passes flags starting with
`/D` directly to nvcc. This does not cause problems in the CI build because
CI uses CUDA 12.2 and we set a flag that lets nvcc forward unknown flags to
the host compiler. That is not ideal, since different versions of nvcc may
behave differently here.

This change distinguishes nvcc flags (starting with `-`) from MSVC flags
(starting with `/`) and passes the MSVC flags to the host compiler explicitly.
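
The flag-splitting idea can be sketched as follows. This is only an illustration in C++, not the build-script change itself: `BuildNvccArgs` is a hypothetical helper, and forwarding through nvcc's `-Xcompiler` option is an assumption about the mechanism, which the description does not name.

```c++
#include <string>
#include <vector>

// Hypothetical helper: flags starting with '/' are treated as MSVC flags and
// forwarded to the host compiler via -Xcompiler (assumed mechanism);
// everything else is passed to nvcc unchanged.
std::vector<std::string> BuildNvccArgs(const std::vector<std::string>& flags) {
  std::vector<std::string> args;
  for (const std::string& flag : flags) {
    if (flag.empty()) {
      continue;  // Nothing to pass along.
    }
    if (flag[0] == '/') {
      // MSVC-style flag such as /DNDEBUG: hand it to the host compiler
      // explicitly so nvcc never tries to interpret it as a path.
      args.push_back("-Xcompiler");
      args.push_back(flag);
    } else {
      args.push_back(flag);
    }
  }
  return args;
}
```

With this split, the result no longer depends on how a particular nvcc version treats flags it does not recognize.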
### Description
Fix issue with Stream notification function. The stream can be nullptr
so using a reference was incorrect.

Try and improve readability.

### Motivation and Context
Fix incorrect function signature.
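
A minimal sketch of why the signature matters, using hypothetical stand-ins (`Stream` and `NotifyStream` here are not the ORT types or functions): a reference parameter cannot represent "no stream", so the caller would have to dereference a possibly null pointer, whereas a pointer parameter lets the callee handle that case.

```c++
// Hypothetical stand-in for a stream object; not the ORT type.
struct Stream {
  int notifications = 0;
};

// Taking `Stream&` would force callers to write `NotifyStream(*stream)`,
// which is undefined behavior when `stream` is nullptr. Taking a pointer
// makes the "no stream" case explicit and checkable.
// Returns true if a notification was recorded, false if there is no stream.
constexpr bool NotifyStream(Stream* stream) {
  if (stream == nullptr) {
    return false;
  }
  ++stream->notifications;
  return true;
}
```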
### Description

- Change the entries container type from `std::unordered_map` to
`std::map`. This enables deterministic iteration order.

- Enforce internal container state consistency. `OrtKeyValuePairs` has
several internal containers that must stay consistent. Previously, the
internal containers were public. This change makes them private and also
fixes the copy/move behavior.

- Add unit tests.
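
The two container changes above can be illustrated with a small sketch. It assumes the internal containers are a map plus vectors of C-string pointers into it, which is a guess for illustration; `KeyValuePairs` below is a hypothetical class, not the actual `OrtKeyValuePairs` definition.

```c++
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration only; not the actual OrtKeyValuePairs.
class KeyValuePairs {
 public:
  KeyValuePairs() = default;

  // Copying the map alone would leave keys_/values_ pointing into `other`,
  // so the pointer views are rebuilt against this object's own entries.
  KeyValuePairs(const KeyValuePairs& other) : entries_(other.entries_) { Sync(); }

  KeyValuePairs(KeyValuePairs&& other) : entries_(std::move(other.entries_)) {
    Sync();
    other.Sync();  // Keep the moved-from object self-consistent too.
  }

  KeyValuePairs& operator=(KeyValuePairs other) {  // Copy-and-swap.
    entries_.swap(other.entries_);
    Sync();
    return *this;
  }

  void Add(const std::string& key, const std::string& value) {
    entries_[key] = value;
    Sync();
  }

  const std::vector<const char*>& Keys() const { return keys_; }
  const std::vector<const char*>& Values() const { return values_; }

 private:
  // Rebuild the pointer views; std::map iterates in sorted key order,
  // so the resulting order is deterministic.
  void Sync() {
    keys_.clear();
    values_.clear();
    for (const auto& entry : entries_) {
      keys_.push_back(entry.first.c_str());
      values_.push_back(entry.second.c_str());
    }
    assert(keys_.size() == entries_.size() && values_.size() == entries_.size());
  }

  std::map<std::string, std::string> entries_;
  std::vector<const char*> keys_;
  std::vector<const char*> values_;
};
```

Because the containers are private, every mutation goes through a method that re-synchronizes them, which is the property the public-member layout could not guarantee.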

### Motivation and Context

Some fixes and improvements.
### Description

Update Qnn default version to 2.36.0.250627
@jatinwadhwa921 jatinwadhwa921 requested a review from ankitm3k July 4, 2025 11:51
@ankitm3k ankitm3k merged commit aa1730c into ovep-develop Jul 4, 2025
6 of 8 checks passed
@ankitm3k ankitm3k deleted the sync_msft_4_7_25 branch July 4, 2025 12:35