@Jaswanth51

Description

Synchronizing intel/onnxruntime ovep-develop branch with latest changes from microsoft/onnxruntime master branch.

fs-eire and others added 11 commits August 5, 2025 08:07
### Description

Fix the WebAssembly build on macOS/arm64 by no longer appending
"-Donnxruntime_USE_KLEIDIAI=ON" to the cmake_args.

KleidiAI should not be enabled for WebAssembly build.
### Description

Fix the build when at least one delay-load DLL is needed for onnxruntime.dll.

The old code contained a non-standard macro definition, which the latest
VC++ treats as a build error.
This currently holds 2 major improvements:
- dynamic shape models should have much lower memory usage, and in
addition the memory management is moved to ORT allocators
- the overhead for shape binding and address updates is reduced per
inference

---------

Co-authored-by: Gaurav Garg <gaugarg@nvidia.com>
…ataTransfer registered by a plugin EP in the Environment (microsoft#25346)

### Description
Add the ability to get a shared allocator from the environment.
Add the ability to create a MemCpyFunc using the IDataTransfer from the
environment.

…5461)

### Description

Upgrade Dawn to the latest version.


### Description
Fixes microsoft#24679.

### Motivation and Context
The original check for a leaf node was insufficient because a branch
child and a leaf child could have the same index. The bug described in
issue microsoft#24679 is not rare; it is likely to occur in estimators
with small, balanced trees. I encountered it myself in a unit test.

The corrected check ensures that for a node to be considered a leaf,
both of its children must be leaves and share the same index.
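
A minimal sketch of the difference between the two checks, assuming a flat node layout in the style of the ONNX `TreeEnsemble` attributes (`nodes_truenodeids`, `nodes_falsenodeids`, `nodes_trueleafs`, `nodes_falseleafs`). The function names and the example arrays are hypothetical and are not taken from the actual fix:

```python
from typing import Sequence


def old_is_leaf(i: int, true_ids: Sequence[int], false_ids: Sequence[int]) -> bool:
    # Assumed shape of the original check: only the child indices are compared.
    return true_ids[i] == false_ids[i]


def new_is_leaf(
    i: int,
    true_ids: Sequence[int],
    false_ids: Sequence[int],
    true_leafs: Sequence[int],
    false_leafs: Sequence[int],
) -> bool:
    # Corrected check as described above: both children must be leaves
    # AND they must share the same index.
    both_children_are_leaves = bool(true_leafs[i]) and bool(false_leafs[i])
    return both_children_are_leaves and true_ids[i] == false_ids[i]


# Hypothetical two-node tree:
#   node 0: true child is branch node 1, false child is leaf 1
#   node 1: true child is leaf 0,        false child is leaf 2
true_ids = [1, 0]
false_ids = [1, 2]
true_leafs = [0, 1]
false_leafs = [1, 1]

# Node 0 is a real branch, but its two children happen to share index 1.
node0_old = old_is_leaf(0, true_ids, false_ids)
node0_new = new_is_leaf(0, true_ids, false_ids, true_leafs, false_leafs)
print(node0_old, node0_new)
```

In this example the index-only comparison misclassifies node 0 as a leaf, while the check that also inspects the leaf flags does not.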
…rosoft#25451)

### Description
Remove support for multiple package variants (`Full` and `Training`) in
the Apple packaging pipeline, consolidating the codebase to only support
the `Full` variant. The changes simplify the code by eliminating the
`PackageVariant` enum, related logic, and configuration files for the
`Training` variant.

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
### Description
Tests previously did not run correctly in ADO pipelines and would
instead time out.
### Description
Add support for onnxruntime_perf_test to register plugin EP DLLs and run
plugin EPs.

As support for plugin execution providers (EPs) requires additional
options and most single-character options have already been used,
multi-character options are now necessary to ensure clarity and
readability. Therefore, support for `Abseil flags` is added, which
enables multi-character options and provides cross-platform
compatibility.


**New options:**

- `--plugin_ep_libs [registration names and libraries]` Specifies a list
of plugin execution provider (EP) registration names and their
corresponding shared libraries to register.
[Usage]: `--plugin_ep_libs "plugin_ep_name_1|plugin_ep_1.dll
plugin_ep_name_2|plugin_ep_2.dll ... "`

  
- `--plugin_eps [Plugin EPs]` Specifies a semicolon-separated list of
plugin execution providers (EPs) to use.
      [Usage]: `--plugin_eps "plugin_ep_1;plugin_ep_2;... "`

- `--plugin_ep_options [EP options]` Specifies provider options for each
EP listed in `--plugin_eps`. Options (key-value pairs) for each EP are
separated by spaces, and EPs are separated by semicolons. An EP's entry
may be left empty, as in the second and third forms below.
      [Usage]:
`--plugin_ep_options "ep_1_option_1_key|ep_1_option_1_value
...;ep_2_option_1_key|ep_2_option_1_value ...;..."` or
`--plugin_ep_options ";ep_2_option_1_key|ep_2_option_1_value ...;..."`
or
`--plugin_ep_options "ep_1_option_1_key|ep_1_option_1_value
...;;ep_3_option_1_key|ep_3_option_1_value ...;..."`

- `--list_ep_devices` Prints all available device indices and their
properties (including metadata). This option makes the program exit
early without performing inference.

- `--select_ep_devices [list of device indices]` A semicolon-separated
list of device indices to add to the session and run with.
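
The pairing between `--plugin_eps` and `--plugin_ep_options` can be sketched as follows. This is an illustration of the string format described above, not the parser used by `onnxruntime_perf_test`; the function name `parse_plugin_ep_options` and the EP and option names in the example are hypothetical:

```python
from typing import Dict


def parse_plugin_ep_options(eps: str, options: str) -> Dict[str, Dict[str, str]]:
    """Pair each EP in a --plugin_eps string with its group from --plugin_ep_options."""
    # EP names are separated by semicolons; ignore empty trailing entries.
    ep_names = [name.strip() for name in eps.split(";") if name.strip()]
    # Option groups are positional, so empty groups must be kept.
    groups = options.split(";")

    result: Dict[str, Dict[str, str]] = {}
    for position, name in enumerate(ep_names):
        group = groups[position] if position < len(groups) else ""
        ep_options: Dict[str, str] = {}
        # Inside a group, key|value pairs are separated by whitespace.
        for pair in group.split():
            key, separator, value = pair.partition("|")
            if not separator or not key:
                raise ValueError(f"expected key|value, got {pair!r}")
            ep_options[key] = value
        result[name] = ep_options
    return result


parsed = parse_plugin_ep_options("ep_1;ep_2;ep_3", "a|1 b|2;;c|3")
print(parsed)
```

Because the groups are matched by position, an empty group (two consecutive semicolons, or a leading semicolon) leaves the corresponding EP with no options in this sketch.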

**Usage:**

1. Use `--plugin_ep_libs` and `--list_ep_devices` to list all the
devices.

````sh
--list_ep_devices --plugin_ep_libs "TensorRTEp|C:\TensorRTEp.dll example_ep|C:\example_plugin_ep.dll"
````
   This prints the device information:
````
===== EP device id 0 ======
name: CPUExecutionProvider
vendor: Microsoft
metadata:
  version: 1.23.0

===== EP device id 1 ======
name: example_ep
vendor: Contoso
metadata:
  supported_devices: CrackGriffin 7+
  version: 0.1.0

===== EP device id 2 ======
name: TensorRTEp
vendor: Nvidia
metadata:
  gpu_type: data center
  version: 0.1.0
````

2. Use `--select_ep_devices` to select the device by index, and add
`--plugin_eps` to specify the EP name. The EP name should match the name
that the EP library passes in when creating the EP factory.

````sh
--plugin_ep_libs "TensorRTEp|C:\TensorRTEp.dll" --select_ep_devices 2 --plugin_eps TensorRTEp -r 1 C:\mul_op\mul_1.onnx
````

3. Or simply use `-e` to specify the EP name. ORT perf test will add all
the devices created by the plugin EP.
The EP name should match the name that the EP library passes in when
creating the EP factory.

````sh
--plugin_ep_libs "TensorRTEp|C:\TensorRTEp.dll" --plugin_eps TensorRTEp -r 1 C:\mul_op\mul_1.onnx
````
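
The value formats used in the commands above can be illustrated with a short sketch. It only mirrors the documented string layout (`name|library` entries separated by spaces, device indices separated by semicolons); the helper names are hypothetical, and library paths containing spaces are not handled:

```python
from typing import List, Tuple


def parse_plugin_ep_libs(value: str) -> List[Tuple[str, str]]:
    """Split a --plugin_ep_libs value into (registration name, library path) pairs."""
    pairs: List[Tuple[str, str]] = []
    for entry in value.split():
        name, separator, path = entry.partition("|")
        if not separator or not name or not path:
            raise ValueError(f"expected name|library, got {entry!r}")
        pairs.append((name, path))
    return pairs


def parse_device_indices(value: str) -> List[int]:
    """Split a --select_ep_devices value into integer device indices."""
    return [int(item) for item in value.split(";") if item.strip()]


libs = parse_plugin_ep_libs(
    r"TensorRTEp|C:\TensorRTEp.dll example_ep|C:\example_plugin_ep.dll"
)
devices = parse_device_indices("1;2")
print(libs, devices)
```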
### Description
Relax the restriction on the DML EP so that other CPU-based EPs can be used.


### Motivation and Context
microsoft#25504
@Jaswanth51 Jaswanth51 requested a review from ankitm3k August 7, 2025 03:08
@ankitm3k ankitm3k merged commit 055300f into ovep-develop Aug 7, 2025
6 of 8 checks passed
@ankitm3k ankitm3k deleted the sync_msft_07082025 branch August 7, 2025 06:00
10 participants