forked from microsoft/onnxruntime
Backmerging with Msft commits #692
Merged
### Description Integrate some neural compressor code since the ORT side in the repo is in maintenance mode. ### Motivation and Context Enable k-quant quantization.
### Description
Add initial selection policy implementations.
Update device discovery
- get vendor and vendor id for CPU from cpuid_info
- trim metadata to known useful fields
- NPU detection via dxcore only

Bug fixes/updates from PRs for C# and Python bindings
Add some tests for selection policy
- TODO: Add more tests
### Motivation and Context
Desire to boil oceans.
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
### Description C# API updates for auto ep selection and the compilation API. Also includes a bugfix to OrtKeyValuePairs::Remove. --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
### Description As titled. ### Motivation and Context The dependency is no longer needed.
…icrosoft#24629)
### Description
- Enables automatic selection of QNN EP for PREFER_NPU policy
- Fixes cpuid vendor id for Qualcomm to be `'Q' | ('C' << 8) | ('O' << 16) | ('M' << 24);`

Sample code from unit test:
```c++
// Tests autoEP feature to automatically select an EP that supports the NPU.
// Currently only works on Windows.
TEST_F(QnnHTPBackendTests, AutoEp_PreferNpu) {
  ASSERT_ORTSTATUS_OK(Ort::GetApi().RegisterExecutionProviderLibrary(*ort_env, kQnnExecutionProvider,
                                                                     ORT_TSTR("onnxruntime_providers_qnn.dll")));
  Ort::SessionOptions so;
  so.SetEpSelectionPolicy(OrtExecutionProviderDevicePolicy_PREFER_NPU);
  const ORTCHAR_T* ort_model_path = ORT_MODEL_FOLDER "nhwc_resize_sizes_opset18.quant.onnx";
  Ort::Session session(*ort_env, ort_model_path, so);
  EXPECT_TRUE(SessionHasEp(session, kQnnExecutionProvider));
  ASSERT_ORTSTATUS_OK(Ort::GetApi().UnregisterExecutionProviderLibrary(*ort_env, kQnnExecutionProvider));
}
```
### Motivation and Context
A recent feature allows ORT to automatically select an EP according to policies set by the user (e.g., prefer npu or prefer gpu). This PR allows QNN EP to be potentially selected when the user sets the `PREFER_NPU` policy.
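The Qualcomm vendor id expression in the description packs the four ASCII characters of "QCOM" into one 32-bit value, with the first character in the lowest byte. A minimal Python sketch of that packing follows; the helper name `pack_vendor_id` is hypothetical and is not part of ORT.

```python
def pack_vendor_id(tag: str) -> int:
    """Pack a 4-character ASCII tag into a 32-bit integer.

    The first character lands in the lowest byte, mirroring
    'Q' | ('C' << 8) | ('O' << 16) | ('M' << 24).
    Hypothetical helper for illustration only.
    """
    if len(tag) != 4:
        raise ValueError("tag must be exactly 4 characters")
    value = 0
    for i, ch in enumerate(tag):
        value |= ord(ch) << (8 * i)
    return value


qcom_vendor_id = pack_vendor_id("QCOM")
print(hex(qcom_vendor_id))  # 0x4d4f4351
```

Read as little-endian bytes, the resulting integer spells "QCOM" again, which is a quick way to sanity-check such a constant.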
…the same type… (microsoft#24633) ### Description Fix debug assertion when there are two devices of the same type that don't match the vendor, e.g. WebGPU and DML.
Under wasm we can't check for Metal by looking at the backend, because it will always be WEBGPU. As a result we take the DP4A path on Metal, which gives sub-optimal performance. Use the vendor to check for Metal instead.
…microsoft#24640) ### Description Enable use_vcpkg for the QNN NuGet package build and the Python arm64ec build
### Description
Python API updates for auto ep selection and the compilation API.
- Adds Python API `SessionOptions.add_provider()` (equivalent to C API's
`SessionOptionsAppendExecutionProvider`)
- Adds Python API `SessionOptions.add_provider_for_devices()`
(equivalent to C API's `SessionOptionsAppendExecutionProvider_V2`)
- Adds Python API `SessionOptions.set_provider_selection_policy()`
(equivalent to C API's `SessionOptionsSetEpSelectionPolicy`)
- Adds Python API class `ModelCompiler` to compile models (wraps C API's
`OrtModelCompilationOptions` and `CompileModel()`)
- TODO: Finish delegate callback. Need to add a `void*` parameter to
delegate function.
### Sample program that uses autoep APIs
Adapted from a unit test.
```python
def test_cuda_prefer_gpu_and_inference(self):
    """
    Test selecting CUDA EP via the PREFER_GPU policy and running inference.
    """
    ep_lib_path = "onnxruntime_providers_cuda.dll"
    ep_registration_name = "CUDAExecutionProvider"
    if sys.platform != "win32":
        self.skipTest("Skipping test because device discovery is only supported on Windows")
    if not os.path.exists(ep_lib_path):
        self.skipTest(f"Skipping test because EP library '{ep_lib_path}' cannot be found")
    onnxrt.register_execution_provider_library(ep_registration_name, os.path.realpath(ep_lib_path))
    # Set a policy to prefer GPU. Cuda should be selected.
    sess_options = onnxrt.SessionOptions()
    sess_options.set_provider_selection_policy(onnxrt.OrtExecutionProviderDevicePolicy.PREFER_GPU)
    self.assertTrue(sess_options.has_providers())
    # Run sample model and check output
    sess = onnxrt.InferenceSession(get_name("mul_1.onnx"), sess_options=sess_options)
    x = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], dtype=np.float32)
    input_name = sess.get_inputs()[0].name
    res = sess.run([], {input_name: x})
    output_expected = np.array([[1.0, 4.0], [9.0, 16.0], [25.0, 36.0]], dtype=np.float32)
    np.testing.assert_allclose(output_expected, res[0], rtol=1e-05, atol=1e-08)
```
### Sample program that uses compile APIs
Adapted from a unit test that compiles using EP selection policy.
```python
def test_compile_with_files_prefer_npu_policy(self):
    """
    Tests compiling a model (to/from files) using an EP selection policy (PREFER_NPU).
    """
    ep_lib_path = "onnxruntime_providers_qnn.dll"
    ep_registration_name = "QNNExecutionProvider"
    onnxrt.register_execution_provider_library(ep_registration_name, ep_lib_path)
    input_model_path = get_name("nhwc_resize_scales_opset18.onnx")
    output_model_path = os.path.join(self._tmp_dir_path, "model.compiled0.onnx")
    session_options = onnxrt.SessionOptions()
    session_options.set_provider_selection_policy(onnxrt.OrtExecutionProviderDevicePolicy.PREFER_NPU)
    model_compiler = onnxrt.ModelCompiler(
        session_options,
        input_model_path,
        embed_compiled_data_into_model=True,
        external_initializers_file_path=None,
    )
    model_compiler.compile_to_file(output_model_path)
    self.assertTrue(os.path.exists(output_model_path))
    onnxrt.unregister_execution_provider_library(ep_registration_name)
```
Adapted from a unit test that compiles using explicit EPs.
```python
def test_compile_with_input_and_output_files(self):
    """
    Tests compiling a model (to/from files) using explicit EP.
    """
    provider = None
    provider_options = dict()
    if "QNNExecutionProvider" in available_providers:
        provider = "QNNExecutionProvider"
        provider_options["backend_type"] = "htp"
    # TODO(adrianlizarraga): Allow test to run for other compiling EPs (e.g., OpenVINO)
    input_model_path = get_name("nhwc_resize_scales_opset18.onnx")
    output_model_path = os.path.join(self._tmp_dir_path, "model.compiled1.onnx")
    session_options = onnxrt.SessionOptions()
    if provider:
        session_options.add_provider(provider, provider_options)
    model_compiler = onnxrt.ModelCompiler(
        session_options,
        input_model_path,
        embed_compiled_data_into_model=True,
        external_initializers_file_path=None,
    )
    model_compiler.compile_to_file(output_model_path)
    self.assertTrue(os.path.exists(output_model_path))
```
### Description As titled.
…ite-default (microsoft#24607) Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 6.2.6 to 6.3.4. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/vitejs/vite/releases">vite's releases</a>.</em></p> <blockquote> <h2>v6.3.4</h2> <p>Please refer to <a href="https://github.com/vitejs/vite/blob/v6.3.4/packages/vite/CHANGELOG.md">CHANGELOG.md</a> for details.</p> <h2>v6.3.3</h2> <p>Please refer to <a href="https://github.com/vitejs/vite/blob/v6.3.3/packages/vite/CHANGELOG.md">CHANGELOG.md</a> for details.</p> <h2>v6.3.2</h2> <p>Please refer to <a href="https://github.com/vitejs/vite/blob/v6.3.2/packages/vite/CHANGELOG.md">CHANGELOG.md</a> for details.</p> <h2>create-vite@6.3.1</h2> <p>Please refer to <a href="https://github.com/vitejs/vite/blob/create-vite@6.3.1/packages/create-vite/CHANGELOG.md">CHANGELOG.md</a> for details.</p> <h2>v6.3.1</h2> <p>Please refer to <a href="https://github.com/vitejs/vite/blob/v6.3.1/packages/vite/CHANGELOG.md">CHANGELOG.md</a> for details.</p> <h2>create-vite@6.3.0</h2> <p>Please refer to <a href="https://github.com/vitejs/vite/blob/create-vite@6.3.0/packages/create-vite/CHANGELOG.md">CHANGELOG.md</a> for details.</p> <h2>v6.3.0</h2> <p>Please refer to <a href="https://github.com/vitejs/vite/blob/v6.3.0/packages/vite/CHANGELOG.md">CHANGELOG.md</a> for details.</p> <h2>v6.3.0-beta.2</h2> <p>Please refer to <a href="https://github.com/vitejs/vite/blob/v6.3.0-beta.2/packages/vite/CHANGELOG.md">CHANGELOG.md</a> for details.</p> <h2>v6.3.0-beta.1</h2> <p>Please refer to <a href="https://github.com/vitejs/vite/blob/v6.3.0-beta.1/packages/vite/CHANGELOG.md">CHANGELOG.md</a> for details.</p> <h2>v6.3.0-beta.0</h2> <p>Please refer to <a href="https://github.com/vitejs/vite/blob/v6.3.0-beta.0/packages/vite/CHANGELOG.md">CHANGELOG.md</a> for details.</p> <h2>v6.2.7</h2> <p>Please refer to <a href="https://github.com/vitejs/vite/blob/v6.2.7/packages/vite/CHANGELOG.md">CHANGELOG.md</a> 
for details.</p> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md">vite's changelog</a>.</em></p> <blockquote> <h2><!-- raw HTML omitted -->6.3.4 (2025-04-30)<!-- raw HTML omitted --></h2> <ul> <li>fix: check static serve file inside sirv (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19965">#19965</a>) (<a href="https://github.com/vitejs/vite/commit/c22c43de612eebb6c182dd67850c24e4fab8cacb">c22c43d</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19965">#19965</a></li> <li>fix(optimizer): return plain object when using <code>require</code> to import externals in optimized dependenci (<a href="https://github.com/vitejs/vite/commit/efc5eab253419fde0a6a48b8d2f233063d6a9643">efc5eab</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19940">#19940</a></li> <li>refactor: remove duplicate plugin context type (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19935">#19935</a>) (<a href="https://github.com/vitejs/vite/commit/d6d01c2292fa4f9603e05b95d81c8724314c20e0">d6d01c2</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19935">#19935</a></li> </ul> <h2><!-- raw HTML omitted -->6.3.3 (2025-04-24)<!-- raw HTML omitted --></h2> <ul> <li>fix: ignore malformed uris in tranform middleware (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19853">#19853</a>) (<a href="https://github.com/vitejs/vite/commit/e4d520141bcd83ad61f16767348b4a813bf9340a">e4d5201</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19853">#19853</a></li> <li>fix(assets): ensure ?no-inline is not included in the asset url in the production environment (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/1949">#1949</a> (<a href="https://github.com/vitejs/vite/commit/16a73c05d35daa34117a173784895546212db5f4">16a73c0</a>), closes <a 
href="https://redirect.github.com/vitejs/vite/issues/19496">#19496</a></li> <li>fix(css): resolve relative imports in sass properly on Windows (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19920">#19920</a>) (<a href="https://github.com/vitejs/vite/commit/ffab44270488f54ae344801024474b597249071b">ffab442</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19920">#19920</a></li> <li>fix(deps): update all non-major dependencies (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19899">#19899</a>) (<a href="https://github.com/vitejs/vite/commit/a4b500ef9ccc9b19a2882156a9ba8397e69bc6b2">a4b500e</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19899">#19899</a></li> <li>fix(ssr): fix execution order of re-export (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19841">#19841</a>) (<a href="https://github.com/vitejs/vite/commit/ed29dee2eb2e3573b2bc337e1a9124c65222a1e5">ed29dee</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19841">#19841</a></li> <li>fix(ssr): fix live binding of default export declaration and hoist exports getter (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19842">#19842</a>) (<a href="https://github.com/vitejs/vite/commit/80a91ff82426a4c88d54b9f5ec9a4205cb13899b">80a91ff</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19842">#19842</a></li> <li>perf: skip sourcemap generation for renderChunk hook of import-analysis-build plugin (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19921">#19921</a>) (<a href="https://github.com/vitejs/vite/commit/55cfd04b10f98cde7a96814a69b9813543ea79c2">55cfd04</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19921">#19921</a></li> <li>test(ssr): test <code>ssrTransform</code> re-export deps and test stacktrace with first line (<a 
href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19629">#19629</a>) (<a href="https://github.com/vitejs/vite/commit/9399cdaf8c3b2efd5f4015d57dc3b0e4e5b91a9d">9399cda</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19629">#19629</a></li> </ul> <h2><!-- raw HTML omitted -->6.3.2 (2025-04-18)<!-- raw HTML omitted --></h2> <ul> <li>fix: match default asserts case insensitive (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19852">#19852</a>) (<a href="https://github.com/vitejs/vite/commit/cbdab1d6a30e07263ec51b2ca042369e736adec6">cbdab1d</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19852">#19852</a></li> <li>fix: open first url if host does not match any urls (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19886">#19886</a>) (<a href="https://github.com/vitejs/vite/commit/6abbdce3d77990409e12380e72c7ec9dd3f8bec5">6abbdce</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19886">#19886</a></li> <li>fix(css): respect <code>css.lightningcss</code> option in css minification process (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19879">#19879</a>) (<a href="https://github.com/vitejs/vite/commit/b5055e0dd4c0e084115c3dbfead5736a54807e0c">b5055e0</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19879">#19879</a></li> <li>fix(deps): update all non-major dependencies (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19698">#19698</a>) (<a href="https://github.com/vitejs/vite/commit/bab4cb92248adf6b9b18df12b2bf03889b0bd1eb">bab4cb9</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19698">#19698</a></li> <li>feat(css): improve lightningcss messages (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19880">#19880</a>) (<a href="https://github.com/vitejs/vite/commit/c713f79b5a4bd98542d8dbe4c85ba4cce9b1f358">c713f79</a>), closes <a 
href="https://redirect.github.com/vitejs/vite/issues/19880">#19880</a></li> </ul> <h2><!-- raw HTML omitted -->6.3.1 (2025-04-17)<!-- raw HTML omitted --></h2> <ul> <li>fix: avoid using <code>Promise.allSettled</code> in preload function (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19805">#19805</a>) (<a href="https://github.com/vitejs/vite/commit/35c7f35e2b67f2158ededf2af58ecec53b3f16c5">35c7f35</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19805">#19805</a></li> <li>fix: backward compat for internal plugin <code>transform</code> calls (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19878">#19878</a>) (<a href="https://github.com/vitejs/vite/commit/a152b7cbac72e05668f8fc23074d531ecebb77a5">a152b7c</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19878">#19878</a></li> </ul> <h2>6.3.0 (2025-04-16)</h2> <ul> <li>fix(hmr): avoid infinite loop happening with <code>hot.invalidate</code> in circular deps (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19870">#19870</a>) (<a href="https://github.com/vitejs/vite/commit/d4ee5e8655a85f4d6bebc695b063d69406ab53ac">d4ee5e8</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19870">#19870</a></li> <li>fix(preview): use host url to open browser (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19836">#19836</a>) (<a href="https://github.com/vitejs/vite/commit/50034340401b4043bb0b158f18ffb7ae1b7f5c86">5003434</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19836">#19836</a></li> </ul> <h2>6.3.0-beta.2 (2025-04-11)</h2> <ul> <li>fix: addWatchFile doesn't work if base is specified (fixes <a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19792">#19792</a>) (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19794">#19794</a>) (<a 
href="https://github.com/vitejs/vite/commit/8bed1de5710f2a097af0e22a196545446d98f988">8bed1de</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19792">#19792</a> <a href="https://redirect.github.com/vitejs/vite/issues/19794">#19794</a></li> <li>fix: correct the behavior when multiple transform filter options are specified (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19818">#19818</a>) (<a href="https://github.com/vitejs/vite/commit/7200deec91a501fb84734e23906f80808734540c">7200dee</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19818">#19818</a></li> <li>fix: fs check with svg and relative paths (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19782">#19782</a>) (<a href="https://github.com/vitejs/vite/commit/62d7e81ee189d65899bb65f3263ddbd85247b647">62d7e81</a>), closes <a href="https://redirect.github.com/vitejs/vite/issues/19782">#19782</a></li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... 
(truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/vitejs/vite/commit/b040d547a17c4bfe8aba44534228667a50612318"><code>b040d54</code></a> release: v6.3.4</li> <li><a href="https://github.com/vitejs/vite/commit/c22c43de612eebb6c182dd67850c24e4fab8cacb"><code>c22c43d</code></a> fix: check static serve file inside sirv (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19965">#19965</a>)</li> <li><a href="https://github.com/vitejs/vite/commit/efc5eab253419fde0a6a48b8d2f233063d6a9643"><code>efc5eab</code></a> fix(optimizer): return plain object when using <code>require</code> to import externals ...</li> <li><a href="https://github.com/vitejs/vite/commit/d6d01c2292fa4f9603e05b95d81c8724314c20e0"><code>d6d01c2</code></a> refactor: remove duplicate plugin context type (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19935">#19935</a>)</li> <li><a href="https://github.com/vitejs/vite/commit/db9eb97b2f530a3985b29c5d1a529772f1ab1893"><code>db9eb97</code></a> release: v6.3.3</li> <li><a href="https://github.com/vitejs/vite/commit/e4d520141bcd83ad61f16767348b4a813bf9340a"><code>e4d5201</code></a> fix: ignore malformed uris in tranform middleware (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19853">#19853</a>)</li> <li><a href="https://github.com/vitejs/vite/commit/55cfd04b10f98cde7a96814a69b9813543ea79c2"><code>55cfd04</code></a> perf: skip sourcemap generation for renderChunk hook of import-analysis-build...</li> <li><a href="https://github.com/vitejs/vite/commit/ffab44270488f54ae344801024474b597249071b"><code>ffab442</code></a> fix(css): resolve relative imports in sass properly on Windows (<a href="https://github.com/vitejs/vite/tree/HEAD/packages/vite/issues/19920">#19920</a>)</li> <li><a href="https://github.com/vitejs/vite/commit/16a73c05d35daa34117a173784895546212db5f4"><code>16a73c0</code></a> fix(assets): ensure ?no-inline is not included in the 
asset url in the produc...</li> <li><a href="https://github.com/vitejs/vite/commit/9399cdaf8c3b2efd5f4015d57dc3b0e4e5b91a9d"><code>9399cda</code></a> test(ssr): test <code>ssrTransform</code> re-export deps and test stacktrace with first ...</li> <li>Additional commits viewable in <a href="https://github.com/vitejs/vite/commits/v6.3.4/packages/vite">compare view</a></li> </ul> </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ft#24641) ### Description This PR adds a package version check for the dev channel. It should help avoid automatically publishing packages like "-rc.*" to the dev channel.
### Description
Add support for selection policy delegate
- split API function into one for the policy enum and one for the delegate
- add `void*` for user state
  - required to wire up using the delegate in other languages.

Add C# support for specifying the selection policy delegate.
Address comments from initial C# autoep support PR.
### Description This PR adds support for 8-bit quantization in the `MatMulNBits` operation in WebGPU. It does the following: 1. Unifies on `MatMulNBitsProgram` as the fallback path. This was originally the generation path for block size = 32; it now supports any block size without limitations, and the original, more complicated programs are removed. 2. Enables `MatMulNBitsWideTileProgram` on all platforms.
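For readers unfamiliar with block-wise quantization, the following is a scalar Python sketch of dequantizing one row of 8-bit weights with one scale per block. It is not the WebGPU shader from the PR. The function name, the argument layout, and the default zero point of `2^(bits-1)` (128 for 8 bits) when none is supplied are assumptions made for illustration.

```python
def dequantize_blockwise(q_values, scales, block_size, zero_points=None, bits=8):
    """Dequantize one row of unsigned `bits`-bit weights, block by block.

    Hypothetical reference helper: each run of `block_size` values shares
    one scale (and optionally one zero point).
    """
    default_zp = 1 << (bits - 1)  # assumption: 128 for 8-bit when no zero point is given
    out = []
    for i, q in enumerate(q_values):
        block = i // block_size  # index of the block this weight belongs to
        zp = default_zp if zero_points is None else zero_points[block]
        out.append((q - zp) * scales[block])
    return out


row = [128, 130, 126, 128, 0, 255, 128, 129]  # two blocks of four weights
scales = [0.5, 0.25]                          # one scale per block
dequantized = dequantize_blockwise(row, scales, block_size=4)
print(dequantized)
```

Because the block index is just `i // block_size`, nothing in this formulation ties it to a block size of 32, which is consistent with the generalization the PR describes.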
### Description If indices is a scalar (0-dimensional tensor), the Gather op produces an incorrect output shape. Fix this Gather op bug in the VSINPU EP. Signed-off-by: Kee <xuke537@hotmail.com>
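The expected shape follows from the ONNX Gather rule: the output shape is the data shape with the `axis` dimension replaced by the full shape of `indices`. A small sketch of that rule (the helper name `gather_output_shape` is hypothetical) shows why a scalar index removes the dimension entirely rather than leaving a dimension of size 1:

```python
def gather_output_shape(data_shape, indices_shape, axis=0):
    """Output shape of ONNX Gather: data_shape[:axis] + indices_shape + data_shape[axis+1:].

    Hypothetical helper for illustration; shapes are tuples of ints.
    """
    rank = len(data_shape)
    if not -rank <= axis < rank:
        raise ValueError("axis out of range")
    axis %= rank  # normalize a negative axis
    return tuple(data_shape[:axis]) + tuple(indices_shape) + tuple(data_shape[axis + 1:])


# Scalar indices have shape (), so the gathered axis disappears.
print(gather_output_shape((3, 4), (), axis=0))    # (4,)
# A 1-element index tensor keeps a dimension of size 1.
print(gather_output_shape((3, 4), (1,), axis=0))  # (1, 4)
```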
### Description Fix type mismatch using float in place of unsigned int.
fix shader compile; don't know how this made it past ci
…24645) ### Description Python Cuda Publishing pipeline references old test pipeline
### Description The random failure on Web CI is hard to investigate because it's not reproducible. Add this step to upload the log to help investigate the issue.
…ft#24650) ### Description Fix the outputSize computation, which was causing duplicate indices. The outputSize should be the size of the indices tensor without counting the last dimension. ### Motivation and Context Fixes issue microsoft#24070
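The corrected size can be sketched in a few lines of Python, assuming an operator such as ScatterND in which the last dimension of `indices` holds the coordinates of one update, so each index tuple should be processed exactly once. The function name is hypothetical.

```python
from math import prod


def num_index_tuples(indices_shape):
    """Number of index tuples: the product of all dims except the last.

    Hypothetical helper. The last dimension is the length of each
    coordinate tuple, so it must not be counted as extra work items.
    """
    if len(indices_shape) < 1:
        raise ValueError("indices must have rank >= 1")
    return prod(indices_shape[:-1])


# 4 updates, each addressed by a 2-element coordinate.
print(num_index_tuples((4, 2)))  # 4, not 8
```

Counting the last dimension as well would visit each tuple several times, which matches the "duplicate indices" symptom described above.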
### Description The header file "dawn/dawn_proc.h" is only used in a non-monolithic build of Dawn.
The patch optimizes pool operators when the output size is small and the kernel size is large.
…crosoft#24634)
### Description
Follow up to microsoft#24614
Example Python program (adapted from unit tests) that specifies a custom EP selection function to select OrtEpDevice(s) for compiling:
```python
def test_compile_with_ep_selection_delegate(self):
    # ...
    # User's custom EP selection function.
    def my_delegate(
        ep_devices: Sequence[onnxrt.OrtEpDevice],
        model_metadata: dict[str, str],
        runtime_metadata: dict[str, str],
        max_selections: int,
    ) -> Sequence[onnxrt.OrtEpDevice]:
        self.assertTrue(len(model_metadata) > 0)
        self.assertTrue(ep_devices and max_selections > 0)
        # Select the first and last devices (if there are more than one)
        selected_devices = [ep_devices[0]]
        if max_selections > 2 and len(ep_devices) > 1:
            selected_devices.append(ep_devices[-1])  # ORT CPU EP is always last
        return selected_devices

    session_options = onnxrt.SessionOptions()
    session_options.set_provider_selection_policy_delegate(my_delegate)
    model_compiler = onnxrt.ModelCompiler(
        session_options,
        input_model_path,
        embed_compiled_data_into_model=True,
        external_initializers_file_path=None,
    )
    model_compiler.compile_to_file(output_model_path)
```
How to raise an exception from the Python EP selection function:
```python
# User's custom EP selection function.
custom_error_message = "MY ERROR"

def my_delegate_that_fails(
    ep_devices: Sequence[onnxrt.OrtEpDevice],
    model_metadata: dict[str, str],
    runtime_metadata: dict[str, str],
    max_selections: int,
) -> Sequence[onnxrt.OrtEpDevice]:
    self.assertTrue(len(ep_devices) >= 1)
    raise ValueError(custom_error_message)

sess_options = onnxrt.SessionOptions()
sess_options.set_provider_selection_policy_delegate(my_delegate_that_fails)
# Create session and expect ORT to raise a Fail exception that contains our message.
with self.assertRaises(Fail) as context:
    onnxrt.InferenceSession(get_name("mul_1.onnx"), sess_options=sess_options)
self.assertIn(custom_error_message, str(context.exception))
```
…e APIs (microsoft#24661)
### Description
Fixes documentation errors in comments within onnxruntime_c_api.h and onnxruntime_cxx_api.h.
### Motivation and Context
The [Generate C/C++ API docs](https://github.com/microsoft/onnxruntime/actions/runs/14855108283/job/41706460753#logs) action is failing with error:
```shell
Run mkdir -p build/doxygen
/mnt/vss/_work/onnxruntime/onnxruntime/include/onnxruntime/core/session/onnxruntime_cxx_api.h:775: error: explicit link request to 'OrtKeyValuePair' could not be resolved (warning treated as error, aborting now)
```
### Description Added ScatterND operator to Native WebGPU EP. ### Motivation and Context Required to increase coverage.
…icrosoft#24666) ### Description In the C# code, handle the user's selection policy delegate throwing or returning too many selections, and create an error message.
… unit tests (microsoft#24667) ### Description Cleans up the usage of `ep_name` and `ep_registration_name` in the autoEP Python unit tests. ### Motivation and Context Addresses comments from a previous PR: microsoft#24634 > nit: the registration name and EP names don't need to match. could we call this 'ep_name' to avoid potentially creating an assumption that they always do?
- Use ResizeNearestNeighbor Op for Resize with interpolation_mode=Nearest and rank-4 inputs. - Add a unit test to verify the modified translation. ### Description ResizeNearestNeighbor Op is faster for Resize with interpolation_mode=Nearest and rank-4 inputs. ### Motivation and Context This commit aligns Resize Op behavior in QNN-EP with the QNN Offline converter path. It also improves inference time.
### Description Builds are not reproducible; remove local information from the build. ### Motivation and Context Reproducible builds are important to ensure the package is reliable. Signed-off-by: Andrew Davis <afd@ti.com> Signed-off-by: Clément Péron <peron.clem@gmail.com> Co-authored-by: Andrew Davis <afd@ti.com>
…4746) fixes error for https://huggingface.co/Xenova/musicgen-small on webgpu native 
### Description Publish Windows debug symbols not only to Azure DevOps but also to msdl.microsoft.com. See https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/symbol-path for how to consume them.
### Description Fix typos in multiple files
### Description Expose and test GetTensorSizeInBytes in C# and Python ### Motivation and Context microsoft#24680
To include the following change: dmlc/dlpack#165
### Description
See the [example repository](https://github.com/jordanozang/onnxruntime_minimal_static) for a minimal example of using the static CMake Config.
- Add libraries built during the static onnxruntime build (onnxruntime_common, onnxruntime_mlas, etc.) to the onnxruntime export set. Additionally, add an onnxruntime::onnxruntime interface target that behaves much the same as that target in the shared build case. `find_package(onnxruntime REQUIRED)` followed by `target_link_libraries(example PRIVATE onnxruntime::onnxruntime)` should now work.
- Minor modifications to ensure that dependency targets like Boost::mp11 are treated as imported targets and not part of the build interface.
- Static webgpu builds will currently not generate this CMake export
### Motivation and Context
- Resolves Issue microsoft#21351
- Builds on Pull Request microsoft#21348
WebNN doesn't provide a dedicated op for `MatMulInteger`. This PR supports `MatMulInteger` by decomposing it into `DequantizeLinear A, B -> MatMul -> Cast (to int32)` and also makes some code optimizations.
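The decomposition described above can be sketched in plain Python on nested lists. This is an illustration only, not the WebNN EP code: the helper names are hypothetical, and using a scale of 1.0 in the dequantize step is an assumption made here so that the float result matches the integer reference exactly.

```python
def dequantize_linear(x, scale, zero_point):
    # (x - zero_point) * scale, applied element-wise to a 2-D list.
    return [[(v - zero_point) * scale for v in row] for row in x]


def matmul(a, b):
    # Naive 2-D matrix multiply on nested lists.
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def cast_to_int32(x):
    # Stand-in for Cast(to=int32); values are assumed to fit in int32.
    return [[int(v) for v in row] for row in x]


def matmul_integer_decomposed(a, b, a_zero_point=0, b_zero_point=0):
    # DequantizeLinear A, B -> MatMul -> Cast (to int32).
    # Assumption: scale = 1.0, so dequantizing only subtracts the zero point.
    a_f = dequantize_linear(a, 1.0, a_zero_point)
    b_f = dequantize_linear(b, 1.0, b_zero_point)
    return cast_to_int32(matmul(a_f, b_f))


def matmul_integer_reference(a, b, a_zero_point=0, b_zero_point=0):
    # Direct integer computation: (A - a_zp) @ (B - b_zp).
    a_i = [[v - a_zero_point for v in row] for row in a]
    b_i = [[v - b_zero_point for v in row] for row in b]
    return matmul(a_i, b_i)
```

For small 8-bit inputs the float path is exact, so both functions return the same integers. Very large accumulations could exceed float precision in a real backend, which this sketch does not model.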
…24753) ### Description Some CPUs don't show up in SetupApi info for some reason. Create a default entry if that is the case. Manually tested by disabling the lookup of GUID_DEVCLASS_PROCESSOR info. Not sure of a better way to test. ### Motivation and Context Fix a crash, as other code assumes there will always be a CPU device.
This PR is a follow-up of microsoft#24547. Previously the pipelines had some issues that prevented me from modifying these files. Now the issue is solved.
The [RotaryEmbedding](https://onnx.ai/onnx/operators/onnx__RotaryEmbedding.html#rotaryembedding) op has been released in opset 23 and has some differences compared to the original contributed op: - The order of input indexes changed - The position_ids input is optional - If the input is 3D, the num_heads must be provided - If it is full rotation, we need to slice the gathered cosine/sine to get the shape [batch_size, sequence_length, rotary_embedding_dim / 2]
### Description Fix typos in multiple files ### Motivation and Context Signed-off-by: co63oc <co63oc@users.noreply.github.com>
### Description Fix typos in bert_defs.cc and contrib_defs.cc ### Motivation and Context Signed-off-by: co63oc <co63oc@users.noreply.github.com>
### Description Use aligned load and preloading. There is ~10% token generation speed up. ### Motivation and Context Optimize perf
- Currently QNN EP only supports MaxPool for rank-4 inputs. - This change adds support for rank-3 inputs by adding Reshapes before and after the Op so that MaxPool always receives a rank-4 input. - Update the stride, pads, dilations and kernel size attributes when converting a rank-3 input to rank 4. - Add unit tests that take rank-3 inputs to validate MaxPool on NPU. ### Description This change extends QNN EP's MaxPool support to input tensors of rank 3. To achieve this, Reshape Ops are added before and after the MaxPool Op to ensure that the input to MaxPool is always of rank 4, as required. The stride, pads, dilations, and kernel size attributes are updated to match the conversion from rank 3 to rank 4. Unit tests have been added to validate MaxPool on NPU with rank-3 inputs. ### Motivation and Context Previously, QNN EP's MaxPool was limited to rank-4 inputs, which prevented it from being used in some models. Supporting rank-3 inputs removes that limitation and makes sure that the MaxPool op is offloaded to the NPU (QNN HTP backend).
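A minimal sketch of the rank-3 to rank-4 attribute conversion described above. The function names are hypothetical, and inserting the unit axis as the height dimension (`[N, C, L] -> [N, C, 1, L]`) is an assumption made here; the commit message does not say which axis the QNN EP code inserts. Pads follow the ONNX layout of all begin values followed by all end values.

```python
def maxpool_rank3_to_rank4(input_shape, kernel_shape, strides, pads, dilations):
    """Expand 1-D MaxPool shape and attributes so a rank-4 pool can run them."""
    if len(input_shape) != 3:
        raise ValueError("expected a rank-3 input shape [N, C, L]")
    n, c, length = input_shape
    return {
        # Reshape before the pool: add a unit height axis (assumed position).
        "input_shape": [n, c, 1, length],
        # The unit axis gets a neutral kernel, stride and dilation of 1.
        "kernel_shape": [1, kernel_shape[0]],
        "strides": [1, strides[0]],
        "dilations": [1, dilations[0]],
        # ONNX pads are [begin_h, begin_w, end_h, end_w]; no padding on height.
        "pads": [0, pads[0], 0, pads[1]],
    }


def restore_rank3(output_shape_rank4):
    """Reshape after the pool: drop the unit height axis again."""
    n, c, h, w = output_shape_rank4
    if h != 1:
        raise ValueError("height axis must still be 1 after pooling")
    return [n, c, w]
```

Because the added axis has size 1 with kernel, stride and dilation of 1 and zero padding, pooling over it is a no-op, so the rank-4 pool computes the same values as the original 1-D pool.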
Remove onnxruntime-mlas section
The `symbolFolder` parameter in publish-symbolrequestprod-api.yml was actually unused. The YAML was copied from another project and I didn't check the code. Delete tools/ci_build/github/azure-pipelines/templates/py-packaging-training-cuda-stage.yml.
…iven. (microsoft#24781) ### Description Always write to the profiling file if `profiling_file_path` is given. ### Motivation and Context Previously, on Windows, if the ETW path was enabled, the profiling data was not written to the file even if `profiling_file_path` was given. I thought that this behavior was confusing.
…ft#24779) ### Description Adds MultiHeadAttention operator support to MIGraphX EP to leverage the existing MIGraphX parser and implementation ### Motivation and Context Needed for model enablement
### Description Enables MIGraphX EP to use MIGraphX's QuickGelu parser and op ### Motivation and Context Required for model support
### Description Memtype MemHandle is applicable only to graph I/O tensors. For other tensors we can leave it as RAW. ### Motivation and Context Compose failed for some models because Memtype was set to MemHandle for static tensors.
…soft#24752) Enable MaxPool Op with the "auto_pad" param set to VALID. VALID runs with all pad values set to 0. ### Description Remove the assert from QNN_EP for MaxPool Op with "auto_pad" set to VALID, since the Op with this config is supported on the QNN backend. ### Motivation and Context QNN_EP rejects MaxPool Op with "auto_pad" set to VALID with a message that the QNN Pool does not support this config. QNN Pool Op supports auto_pad=VALID, and all the pad values are set to 0. Signed-off-by: quic-ankus <quic_ankus@quicinc.com>
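To illustrate what "all pad values set to 0" means for the output size, here is a small sketch of the per-axis output length of a pool with `auto_pad=VALID`. The function name is hypothetical and this is the standard unpadded pooling formula, not code taken from QNN EP.

```python
def valid_pool_output_size(input_size, kernel_size, stride=1, dilation=1):
    """Output length along one spatial axis when no padding is applied."""
    # Effective kernel extent once dilation is taken into account.
    effective_kernel = (kernel_size - 1) * dilation + 1
    if input_size < effective_kernel:
        raise ValueError("kernel does not fit into the unpadded input")
    # Number of full windows that fit without padding.
    return (input_size - effective_kernel) // stride + 1
```

With VALID, any trailing input elements that do not fill a complete window are simply dropped rather than padded.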
ankitm3k
approved these changes
May 16, 2025