@Jaswanth51
Description

Synchronizing intel/onnxruntime ovep-develop branch with latest changes from microsoft/onnxruntime master branch.

dependabot bot and others added 23 commits August 25, 2025 09:13
### Description
Fixed the macro `ORT_API_CALL` by replacing `_stdcall` with `__stdcall`


### Motivation and Context
Recently, I found an issue that prevents ONNX Runtime from being built using the MinGW toolchain on Windows.

After investigating, I discovered that the ONNX Runtime C API header contains a typo in the `ORT_API_CALL` preprocessor macro.
It is incorrectly defined as `_stdcall` instead of the correct `__stdcall` (with two leading underscores).
This causes build failures on compilers like MinGW that are strict about this syntax.
…ft#25782)

### Description

Allow custom `CMAKE_C_STANDARD` and `CMAKE_CXX_STANDARD`.

Fixes microsoft#25756



### Description
This change adds support for Q2-quantized MatMulNBits in WebGPU.


### Motivation and Context
An alternate way to support bitnets is to add support for lower bit
widths in matmulnbits; this reuses our shaders and is more maintainable
than a separate op. The model size grows a bit; however, for a 2B
parameter model using 1.58bpw vs 2bpw, the size difference is just 100MB.
The simpler dequantization also improves perf: on an Intel Xe, matmul
looks to be 20% faster using q2 weights vs q4 weights for the same
matrix dimensions.
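
The size figure above can be checked with quick arithmetic. The sketch below assumes exactly 2 billion quantized weights and ignores scales, zero points, and any tensors that stay unquantized, so it is only a rough estimate:

```python
# Rough estimate of the extra storage needed for 2 bpw vs. 1.58 bpw weights.
# Assumption: exactly 2e9 quantized parameters; scales and other tensors ignored.
PARAMS = 2_000_000_000


def weight_bytes(params: int, bits_per_weight: float) -> float:
    """Bytes needed to store `params` weights at the given bit width."""
    return params * bits_per_weight / 8


extra_mb = (weight_bytes(PARAMS, 2.0) - weight_bytes(PARAMS, 1.58)) / 1e6
print(f"extra size: about {extra_mb:.0f} MB")
```

Under these assumptions the difference comes out at roughly 105 MB, in line with the figure quoted above.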

Q2 version of the bitnet model is here
https://huggingface.co/sushraja/bitnet-b1.58-2B-4T-fp16-onnx/tree/main/bitnet_q2
In the flash attention algorithm, each thread in a subgroup needs to
access the same range (0-15) of data in workgroup memory `q_tile` and
`v_tile`. If we use `subgroupShuffle`, there will be bank conflicts for
`var k_local = k_tile[capped_sg_id][i];`, since the sg_size is 32 and
threads 16-31 access the same bank address.

To avoid the bank conflicts, all threads can directly read the same
address in workgroup memory, which is a broadcast and is well optimized
on NVIDIA GPUs. This gives a ~10% improvement for phi4 prefill (1K) on
an NVIDIA RTX 2000 Ada, and the gain grows as the input
(total_sequence_length) gets longer (~12% for 2K).

Before
```
Batch size: 1, prompt tokens: 1000, tokens to generate: 128
Prompt processing (time to first token):
        avg (us):       2.0991e+06
        avg (tokens/s): 476.394
        p50 (us):       2.08457e+06
        stddev (us):    36140.3
        n:              5 * 1000 token(s)
Token generation:
        avg (us):       25477.8
        avg (tokens/s): 39.2498
        p50 (us):       25028.2
        stddev (us):    4841.89
        n:              635 * 1 token(s)
```
After
```
Batch size: 1, prompt tokens: 1000, tokens to generate: 128
Prompt processing (time to first token):
        avg (us):       1.91138e+06
        avg (tokens/s): 523.183
        p50 (us):       1.92379e+06
        stddev (us):    44768
        n:              5 * 1000 token(s)
Token generation:
        avg (us):       25237.2
        avg (tokens/s): 39.624
        p50 (us):       24860.9
        stddev (us):    4874.52
        n:              635 * 1 token(s)
```
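
The ~10% prefill figure can be reproduced from the two runs above. This is a small sketch that only redoes the arithmetic on the numbers copied from the logs; it does not rerun the benchmark:

```python
# Figures copied from the "Before" and "After" prompt-processing sections above.
before_us, after_us = 2.0991e6, 1.91138e6
before_tps, after_tps = 476.394, 523.183

# Relative change in time to first token and in prefill throughput.
latency_reduction = (before_us - after_us) / before_us * 100
throughput_gain = (after_tps - before_tps) / before_tps * 100

print(f"time to first token: {latency_reduction:.1f}% lower")
print(f"prefill throughput:  {throughput_gain:.1f}% higher")
```

Token generation is essentially unchanged between the two runs, which is consistent with the change targeting prefill.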
Quoting cppreference.com:
```
  (the [[noreturn]] attribute) Indicates that the function will not
  return control flow to the calling function after it finishes (e.g.
  functions that terminate the application, throw exceptions, loop
  indefinitely, etc.). This attribute applies to the name of the
  function being declared in function declarations only.
  
  If a function previously declared with `[[noreturn]]` is invoked and
  that invocation eventually returns, the behavior is runtime-undefined.
```

The `SafeIntOn*` member functions immediately throw, so if they are used
in a function with non-void return type, g++ 14 issues a warning that
there exist control paths in the function where no value is returned.

Fix this by marking the member functions explicitly noreturn.

This is needed so onnxruntime builds correctly with `-Wall -Wextra`.
…icrosoft#25832)

This PR addresses accessibility issues with focus indicators on the ONNX
Runtime website documentation where contrast ratios were insufficient
for keyboard navigation users. The accessibility audit revealed that
focus states for key navigation elements like "Learn more about ONNX
Runtime & Generative AI", "Quickstart", "Tutorials", "Install ONNX
Runtime", and "Hardware Acceleration" had contrast ratios as low as
1.152:1, well below the WCAG 2.1 AA requirement of 3:1 for UI
components.

## Changes Made

### 1. Enhanced List Group Item Focus Contrast
- **Before**: `color: #555` on `background-color: #f5f5f5` (6.8:1 ratio)
- **After**: `color: #333` on `background-color: #f5f5f5` (**11.6:1
ratio**)

### 2. Improved Info List Group Item Focus Contrast  
- **Before**: `color: #31708f` on `background-color: #c4e3f3` (4.1:1
ratio)
- **After**: `color: #1e4a5f` on `background-color: #c4e3f3` (**7.1:1
ratio**)

### 3. Added Visible Focus Indicators for Form Inputs
Previously, search and filter inputs only removed the default outline
(`outline: 0`) without providing alternative focus indicators, making
them inaccessible to keyboard users.

- **Added**: `border: 2px solid #0050C5` and `background-color: #f8f9fa`
on focus
- **Contrast ratio**: **6.7:1** (exceeds requirements)
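
The ratios quoted in this section can be checked against the WCAG 2.x definition of contrast ratio, which is based on relative luminance. Below is a minimal sketch in Python; the helper names are illustrative and are not part of the site's code:

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a colour written as #rgb or #rrggbb."""
    h = hex_color.lstrip("#")
    if len(h) == 3:  # expand shorthand such as #555 -> #555555
        h = "".join(ch * 2 for ch in h)
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(fg: str, bg: str) -> float:
    """WCAG contrast ratio (L1 + 0.05) / (L2 + 0.05), with L1 the lighter colour."""
    hi, lo = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)


print(round(contrast_ratio("#555", "#f5f5f5"), 1))  # list group item, before
print(round(contrast_ratio("#333", "#f5f5f5"), 1))  # list group item, after
```

For the list group item colours this gives approximately 6.8 and 11.6, matching the before and after values listed above.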

## Accessibility Compliance

All changes now exceed WCAG 2.1 AA standards:
- ✅ **3:1 minimum** for UI components and focus indicators
- ✅ **4.5:1 minimum** for normal text (all exceed 7:1)
- ✅ **Keyboard navigation** fully supported with visible focus
indicators
- ✅ **Screen reader compatibility** improved with clear focus states

## Impact

- Low vision users can now clearly see focused elements during keyboard
navigation
- All mentioned navigation elements meet accessibility standards
- No functionality broken - purely visual accessibility enhancements
- Compliance with MAS 1.4.11 Non-text Contrast requirements

## Files Modified

- `csharp/ApiDocs/_exported_templates/default/styles/docfx.css` -
Enhanced input focus indicators
- `csharp/ApiDocs/_exported_templates/default/styles/docfx.vendor.css` -
Improved text contrast ratios

Fixes microsoft#24995.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: MaanavD <24942306+MaanavD@users.noreply.github.com>
…ft#25819)

The DocFX tab controls on onnxruntime.ai were not accessible via
keyboard navigation, violating MAS 2.1.1 keyboard accessibility
requirements. Users could not navigate between language tabs (Python,
C#, Java, JavaScript, C++) using keyboard-only input.

## Problem
The existing implementation in `docfx.js` only handled mouse click
events but lacked keyboard event handlers. This prevented keyboard users
from:
- Navigating between tabs using arrow keys
- Activating tabs using Enter/Space keys
- Jumping to first/last tabs using Home/End keys

## Solution
Added comprehensive keyboard navigation support following the WAI-ARIA
tabs design pattern:

```javascript
// Added keyboard event listener alongside existing click handler
container.addEventListener('keydown', function (event) { return handleKeyDown(event, state); });
```

The `handleKeyDown` function implements:
- **Arrow key navigation**: Left/Right and Up/Down keys move focus
between tabs with wrapping
- **Tab activation**: Enter and Space keys activate the focused tab
- **Quick navigation**: Home/End keys jump to first/last tabs
- **Proper focus management**: Only the active tab has `tabIndex="0"`,
others have `tabIndex="-1"`
- **Event handling**: `preventDefault()` and `stopPropagation()` for
handled keys
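
The focus-movement rules listed above are independent of the DOM, so they can be sketched as pure functions. The following Python sketch is illustrative only: `next_tab_index` and `tab_indices` are hypothetical names, not functions from `docfx.js`, and the key strings follow the standard `KeyboardEvent.key` values.

```python
def next_tab_index(current: int, count: int, key: str) -> int:
    """Return the index of the tab that should receive focus after `key`."""
    if key in ("ArrowRight", "ArrowDown"):
        return (current + 1) % count  # wrap from the last tab to the first
    if key in ("ArrowLeft", "ArrowUp"):
        return (current - 1) % count  # wrap from the first tab to the last
    if key == "Home":
        return 0
    if key == "End":
        return count - 1
    return current  # unhandled keys leave focus where it is


def tab_indices(active: int, count: int) -> list[int]:
    """Roving tabindex: 0 for the active tab, -1 for every other tab."""
    return [0 if i == active else -1 for i in range(count)]


print(next_tab_index(4, 5, "ArrowRight"))
print(tab_indices(1, 3))
```

Keeping the index arithmetic separate from event handling makes the wrapping behaviour easy to test without a browser.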

## Accessibility Features
- Follows WAI-ARIA tabs pattern specifications
- Maintains proper ARIA attributes (`role="tab"`, `aria-selected`, etc.)
- Provides visual focus indicators via existing CSS
- Supports both horizontal and vertical arrow key navigation
- Implements circular navigation (wrapping at boundaries)

## Testing
Validated functionality with comprehensive keyboard navigation tests:
- ✅ Arrow keys navigate between tabs with proper wrapping
- ✅ Enter/Space keys activate focused tabs and switch content panels
- ✅ Home/End keys jump to first/last tabs correctly  
- ✅ Focus management works with proper `tabIndex` handling
- ✅ Visual feedback shows focused vs selected tab states

This ensures keyboard users can fully access all tab functionality
without requiring mouse interaction.

Fixes microsoft#24997.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: MaanavD <24942306+MaanavD@users.noreply.github.com>
This pull request introduces several improvements and refactorings to
the quantized Mixture-of-Experts (QMoE) operator in ONNX Runtime,
focusing on enhanced support for FP32 mode, improved SwiGLU activation
handling, and better test coverage. The most important changes are
grouped below by theme.

### Operator Registration and Type Support

- Added explicit registration and support for `QMoE` operator with both
`MLFloat16` and `float` data types, enabling FP32 (non-quantized) mode
in addition to quantized modes. This includes updates to kernel
registration and schema/type constraints.
[[1]](diffhunk://#diff-fd949b2a9885f634c37c2048da9e35d227ed20adf1d7baf5de488f304a78bde9L109-R110)
[[2]](diffhunk://#diff-fd949b2a9885f634c37c2048da9e35d227ed20adf1d7baf5de488f304a78bde9L275-R277)
[[3]](diffhunk://#diff-81f57d9adc2cce94f85a2949a895b7ff82efcc13d05e23ee6567661f0fecb7c0L1467-R1467)
[[4]](diffhunk://#diff-81f57d9adc2cce94f85a2949a895b7ff82efcc13d05e23ee6567661f0fecb7c0L1548-R1548)

### SwiGLU Activation Improvements

- Refactored `ApplySwiGLUActivation` to accept configurable
`activation_alpha` and `activation_beta` parameters, matching CUDA
behavior and allowing flexibility in activation function tuning. Also,
dropped support for non-interleaved memory layouts (now not
implemented).
[[1]](diffhunk://#diff-4e4afb8dcdade0abe18bd8bea68b148b4090cd86d60a1b1422c049960231737dR49-R60)
[[2]](diffhunk://#diff-edb344a38502bba9a0083ab98e274ec1b5b2606639a61df7be474a600a7b99d2L29-R61)
[[3]](diffhunk://#diff-f85806c745243652a0336da094126687a6c0d14b19fe760abe73df1d940dc4cbL12-R13)
- Now reads `activation_alpha` and `activation_beta` attributes from
operator parameters, defaulting to values appropriate for SwiGLU.
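
The description does not spell out the activation formula. As a rough illustration only, the sketch below assumes one common parameterization, `gate * sigmoid(alpha * gate) * (linear + beta)`, applied to an interleaved `[gate, linear, gate, linear, ...]` layout. The function name, the layout order, and the defaults (`alpha=1.0`, `beta=0.0`, which reduce to the textbook SwiGLU gate) are assumptions and are not taken from the PR:

```python
import math


def swiglu_interleaved(x: list[float], alpha: float = 1.0, beta: float = 0.0) -> list[float]:
    """Apply a parameterized SwiGLU to interleaved (gate, linear) pairs.

    Assumed layout: x = [gate0, linear0, gate1, linear1, ...].
    Assumed formula: gate * sigmoid(alpha * gate) * (linear + beta).
    """
    if len(x) % 2 != 0:
        raise ValueError("interleaved input must have an even length")
    out = []
    for i in range(0, len(x), 2):
        gate, linear = x[i], x[i + 1]
        sigmoid = 1.0 / (1.0 + math.exp(-alpha * gate))
        out.append(gate * sigmoid * (linear + beta))
    return out


# One (gate, linear) pair in, one activation value out.
print(swiglu_interleaved([1.0, 2.0]))
```

The output has half as many elements as the input, since each pair collapses to a single value.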

### QMoE Operator Implementation Refactor

- Refactored the QMoE operator to clarify separation between quantized
and FP32 implementations, and restructured internal methods for better
maintainability. Added template parameterization for data types and
improved handling of expert weights and biases.
[[1]](diffhunk://#diff-e54124baa488af74400fae0f0dbd5cf7d4f1e307c0a5ba0e9dc79622e1315cd5R13-R35)
[[2]](diffhunk://#diff-e54124baa488af74400fae0f0dbd5cf7d4f1e307c0a5ba0e9dc79622e1315cd5L38-R55)
[[3]](diffhunk://#diff-e54124baa488af74400fae0f0dbd5cf7d4f1e307c0a5ba0e9dc79622e1315cd5L58-L59)

### Shape Checking and Layout

- Removed legacy shape/layout support in QMoE input validation,
enforcing only the new memory layout for expert weights and improving
consistency and forward compatibility.

### Test and Documentation Updates

- Updated unit tests for QMoE to use correct zero-point values for
quantized weights (e.g., 0x88 for int4, 128 for int8), ensuring that
test cases accurately reflect expected zero-output behavior for zero
weights. Also clarified comments and expected outputs for SwiGLU and
quantized scenarios.
[[1]](diffhunk://#diff-27ea1ef8d40401d116e653d6b935304a7ad68ee8300d04ea98e814c585abee75L1340-R1349)
[[2]](diffhunk://#diff-27ea1ef8d40401d116e653d6b935304a7ad68ee8300d04ea98e814c585abee75L1379-R1380)
[[3]](diffhunk://#diff-27ea1ef8d40401d116e653d6b935304a7ad68ee8300d04ea98e814c585abee75L1404-R1413)
[[4]](diffhunk://#diff-27ea1ef8d40401d116e653d6b935304a7ad68ee8300d04ea98e814c585abee75L1525-R1538)
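
To see why these zero-point values produce zero output for zero weights, consider the usual affine dequantization `(q - zero_point) * scale`. The sketch below assumes unsigned storage with a zero point of 8 for int4 (two values packed per byte, low nibble first) and 128 for int8; the helper names and the scale are illustrative:

```python
def unpack_int4(byte: int) -> tuple[int, int]:
    """Split one byte into two unsigned 4-bit values (low nibble first, assumed)."""
    return byte & 0x0F, byte >> 4


def dequantize(q: int, zero_point: int, scale: float) -> float:
    """Affine dequantization of a single quantized value."""
    return (q - zero_point) * scale


scale = 0.05  # arbitrary illustrative scale

# 0x88 packs two nibbles equal to 8, which is the assumed int4 zero point.
int4_values = [dequantize(q, 8, scale) for q in unpack_int4(0x88)]
int8_value = dequantize(128, 128, scale)
print(int4_values, int8_value)
```

Every stored value equals its zero point, so each dequantized weight is exactly zero regardless of the scale.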

These changes collectively improve the flexibility, correctness, and
maintainability of the QMoE operator in ONNX Runtime.


Unit test result
```
sRunning test: batch_size=1, sequence_length=8, quant_bits=4, use_swiglu=True, swiglu_interleaved=True
Parity check - SwiGLU(interleaved=True) 4-bit: max_diff = 0.000372
.Running test: batch_size=1, sequence_length=8, quant_bits=8, use_swiglu=True, swiglu_interleaved=True
Parity check - SwiGLU(interleaved=True) 8-bit: max_diff = 0.000392
.Running test: batch_size=1, sequence_length=32, quant_bits=4, use_swiglu=True, swiglu_interleaved=True
Parity check - SwiGLU(interleaved=True) 4-bit: max_diff = 0.000470
.Running test: batch_size=1, sequence_length=32, quant_bits=8, use_swiglu=True, swiglu_interleaved=True
Parity check - SwiGLU(interleaved=True) 8-bit: max_diff = 0.000442
.Running test: batch_size=4, sequence_length=8, quant_bits=4, use_swiglu=True, swiglu_interleaved=True
Parity check - SwiGLU(interleaved=True) 4-bit: max_diff = 0.000470
.Running test: batch_size=4, sequence_length=8, quant_bits=8, use_swiglu=True, swiglu_interleaved=True
Parity check - SwiGLU(interleaved=True) 8-bit: max_diff = 0.000442
.Running test: batch_size=4, sequence_length=32, quant_bits=4, use_swiglu=True, swiglu_interleaved=True
Parity check - SwiGLU(interleaved=True) 4-bit: max_diff = 0.000609
.Running test: batch_size=4, sequence_length=32, quant_bits=8, use_swiglu=True, swiglu_interleaved=True
Parity check - SwiGLU(interleaved=True) 8-bit: max_diff = 0.000702
.
----------------------------------------------------------------------
Ran 9 tests in 46.754s

OK (skipped=1)
```

---------

Co-authored-by: Tianlei Wu <tlwu@microsoft.com>
…ile build (microsoft#25849)

### Description
`ABSL_FLAGS_STRIP_NAMES` is set to 1 by default to disable flag
registration when building for Android, iPhone, and "embedded devices".
As a result, flags are not registered when running
onnxruntime_perf_test on Android.

<img width="872" height="182" alt="image (2)"
src="https://github.com/user-attachments/assets/eb6a6772-cdff-4d60-a3c7-4352477e956c"
/>

Set `ABSL_FLAGS_STRIP_NAMES` to 0 by default for all builds.
### Description
The phi4 mini in Edge is using ai.onnx v21. Without this change, a
`MemcpyToHost` is inserted, which slows the generation speed.
This change uses subgroupShuffle for sg_size=64 to perform the matmul.
It also uses a loop instead of loop unrolling to reduce the register
pressure.

Phi4 prefill for 1K tokens drops from 11.32s to 8.8s on a Qualcomm
Adreno X1-85 GPU.
### Description
This change adds CUDA Graph support to the NV TensorRT RTX Execution
Provider (EP).

### Motivation and Context
Integrating CUDA Graphs into the NV TRT RTX EP provides:
- Lower latency by minimizing per-kernel launch overhead.
- Better throughput for repeated inference runs.
- Improved efficiency on GPUs that are sensitive to kernel launch
overhead.

---------

Co-authored-by: Maximilian Mueller <maximilianm@nvidia.com>
Co-authored-by: Gaurav Garg <gaugarg@nvidia.com>
### Description
Enable einsum op with QK equations for attention in QNN EP.


### Motivation and Context
The current einsum op in QNN doesn't support equations with capital letters. Loosen this constraint to allow more use cases.

Signed-off-by: Mu-Chein Hsu <quic_muchhsu@quicinc.com>
…#25833)

### Description
While memory profiling some models, I noticed multiple file mapping
failures.
`WindowsEnv::MapFileIntoMemory()` properly checks that the mapping
offset is granularity aligned, but it calculates the offset as page
aligned.
Also, when saving external tensors we do not need to align big tensors
to the Windows allocation granularity or anything else that is platform
dependent. Set the alignment to 4096 for all platforms.
Granularity matters only for calculating the mapping address.
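
The mismatch between page alignment and granularity alignment can be illustrated with plain integer arithmetic. The sketch below assumes a 4096-byte page and a 65536-byte allocation granularity, which are typical values on Windows but are queried at runtime in real code; `align_down` and the example offset are hypothetical and not taken from the ONNX Runtime sources:

```python
PAGE_SIZE = 4096                # assumed page size
ALLOCATION_GRANULARITY = 65536  # assumed (typical) Windows allocation granularity


def align_down(offset: int, alignment: int) -> int:
    """Round `offset` down to the nearest multiple of `alignment`."""
    return offset - (offset % alignment)


offset = 70_000  # hypothetical byte offset of a tensor inside the file

page_aligned = align_down(offset, PAGE_SIZE)
granularity_aligned = align_down(offset, ALLOCATION_GRANULARITY)

# A page-aligned offset is not necessarily a multiple of the granularity,
# so a granularity check applied to it can reject the mapping.
print(page_aligned, page_aligned % ALLOCATION_GRANULARITY == 0)
print(granularity_aligned, granularity_aligned % ALLOCATION_GRANULARITY == 0)

# Bytes to skip inside the mapped view to reach the requested data.
print(offset - granularity_aligned)
```

With these assumed values the page-aligned offset fails the granularity check, while the granularity-aligned offset passes and the remaining distance is handled by offsetting the pointer inside the mapped view.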

### Motivation and Context
File mapping failed repeatedly for certain models.
This change also saves hundreds of MB for some models.
### Description
Fix packaging pipelines


### Motivation and Context
During CI and local builds, `Ort::Status()` gets inherited from the base
due to using directives; however, that does not work for packaging
pipelines.
Having a default ctor is important for storing Status in containers if
needed.
Bumps [actions/setup-java](https://github.com/actions/setup-java) from 4
to 5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/setup-java/releases">actions/setup-java's
releases</a>.</em></p>
<blockquote>
<h2>v5.0.0</h2>
<h2>What's Changed</h2>
<h3>Breaking Changes</h3>
<ul>
<li>Upgrade to node 24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/setup-java/pull/888">actions/setup-java#888</a></li>
</ul>
<p>Make sure your runner is updated to this version or newer to use this
release. v2.327.1 <a
href="https://github.com/actions/runner/releases/tag/v2.327.1">Release
Notes</a></p>
<h3>Dependency Upgrades</h3>
<ul>
<li>Upgrade Publish Immutable Action by <a
href="https://github.com/HarithaVattikuti"><code>@​HarithaVattikuti</code></a>
in <a
href="https://redirect.github.com/actions/setup-java/pull/798">actions/setup-java#798</a></li>
<li>Upgrade eslint-plugin-jest from 27.9.0 to 28.11.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-java/pull/730">actions/setup-java#730</a></li>
<li>Upgrade undici from 5.28.5 to 5.29.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-java/pull/833">actions/setup-java#833</a></li>
<li>Upgrade form-data to bring in fix for critical vulnerability by <a
href="https://github.com/gowridurgad"><code>@​gowridurgad</code></a> in
<a
href="https://redirect.github.com/actions/setup-java/pull/887">actions/setup-java#887</a></li>
<li>Upgrade actions/checkout from 4 to 5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-java/pull/896">actions/setup-java#896</a></li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li>Prevent default installation of JetBrains pre-releases by <a
href="https://github.com/priyagupta108"><code>@​priyagupta108</code></a>
in <a
href="https://redirect.github.com/actions/setup-java/pull/859">actions/setup-java#859</a></li>
<li>Improve Error Handling for Setup-Java Action to Help Debug
Intermittent Failures by <a
href="https://github.com/gowridurgad"><code>@​gowridurgad</code></a> in
<a
href="https://redirect.github.com/actions/setup-java/pull/848">actions/setup-java#848</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/gowridurgad"><code>@​gowridurgad</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-java/pull/848">actions/setup-java#848</a></li>
<li><a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-java/pull/888">actions/setup-java#888</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-java/compare/v4...v5.0.0">https://github.com/actions/setup-java/compare/v4...v5.0.0</a></p>
<h2>v4.7.1</h2>
<h2>What's Changed</h2>
<h3>Documentation changes</h3>
<ul>
<li>Add Documentation to Recommend Using GraalVM JDK 17 Version to
17.0.12 to Align with GFTC License Terms by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-java/pull/704">actions/setup-java#704</a></li>
<li>Remove duplicated GraalVM section in documentation by <a
href="https://github.com/Marcono1234"><code>@​Marcono1234</code></a> in
<a
href="https://redirect.github.com/actions/setup-java/pull/716">actions/setup-java#716</a></li>
</ul>
<h3>Dependency updates:</h3>
<ul>
<li>Upgrade <code>@​action/cache</code> from 4.0.0 to 4.0.2 by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-java/pull/766">actions/setup-java#766</a></li>
<li>Upgrade <code>@​actions/glob</code> from 0.4.0 to 0.5.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-java/pull/744">actions/setup-java#744</a></li>
<li>Upgrade ts-jest from 29.1.2 to 29.2.5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-java/pull/743">actions/setup-java#743</a></li>
<li>Upgrade <code>@​action/cache</code> to 4.0.3 by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-java/pull/773">actions/setup-java#773</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-java/compare/v4...v4.7.1">https://github.com/actions/setup-java/compare/v4...v4.7.1</a></p>
<h2>v4.7.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Configure Dependabot settings by <a
href="https://github.com/HarithaVattikuti"><code>@​HarithaVattikuti</code></a>
in <a
href="https://redirect.github.com/actions/setup-java/pull/722">actions/setup-java#722</a></li>
<li>README Update: Added a permissions section by <a
href="https://github.com/benwells"><code>@​benwells</code></a> in <a
href="https://redirect.github.com/actions/setup-java/pull/723">actions/setup-java#723</a></li>
<li>Upgrade <code>cache</code> from version 3.2.4 to 4.0.0 by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-java/pull/724">actions/setup-java#724</a></li>
<li>Upgrade <code>@actions/http-client</code> from 2.2.1 to 2.2.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-java/pull/728">actions/setup-java#728</a></li>
<li>Upgrade <code>actions/publish-immutable-action</code> from 0.0.3 to
0.0.4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-java/pull/727">actions/setup-java#727</a></li>
<li>Upgrade <code>@types/jest</code> from 29.5.12 to 29.5.14 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-java/pull/729">actions/setup-java#729</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/actions/setup-java/commit/dded0888837ed1f317902acf8a20df0ad188d165"><code>dded088</code></a>
Bump actions/checkout from 4 to 5 (<a
href="https://redirect.github.com/actions/setup-java/issues/896">#896</a>)</li>
<li><a
href="https://github.com/actions/setup-java/commit/0913e9a06eb8b69c62db76aa61f580c2b3a5b4e0"><code>0913e9a</code></a>
Upgrade to node 24 (<a
href="https://redirect.github.com/actions/setup-java/issues/888">#888</a>)</li>
<li><a
href="https://github.com/actions/setup-java/commit/e9343db97e09d87a3c50e544105d99fe912c204b"><code>e9343db</code></a>
Bumps form-data (<a
href="https://redirect.github.com/actions/setup-java/issues/887">#887</a>)</li>
<li><a
href="https://github.com/actions/setup-java/commit/ae2b61dbc685e60e4427b2e8ed4f0135c6ea8597"><code>ae2b61d</code></a>
Bump undici from 5.28.5 to 5.29.0 (<a
href="https://redirect.github.com/actions/setup-java/issues/833">#833</a>)</li>
<li><a
href="https://github.com/actions/setup-java/commit/c190c18febcf6c040d80b10ea201a05a2c320263"><code>c190c18</code></a>
Bump eslint-plugin-jest from 27.9.0 to 29.0.1 (<a
href="https://redirect.github.com/actions/setup-java/issues/730">#730</a>)</li>
<li><a
href="https://github.com/actions/setup-java/commit/67aec007b3fcabe15ca665bfccc1e255dd52e30d"><code>67aec00</code></a>
Fix: prevent default installation of JetBrains pre-releases (<a
href="https://redirect.github.com/actions/setup-java/issues/859">#859</a>)</li>
<li><a
href="https://github.com/actions/setup-java/commit/ebb356cc4e59bcf94f518203228485f5d40e4b58"><code>ebb356c</code></a>
Improve Error Handling for Setup-Java Action to Help Debug Intermittent
Failu...</li>
<li><a
href="https://github.com/actions/setup-java/commit/f4f1212c880fdec8162ea9a6493f4495191887b4"><code>f4f1212</code></a>
Update publish-immutable-actions.yml (<a
href="https://redirect.github.com/actions/setup-java/issues/798">#798</a>)</li>
<li>See full diff in <a
href="https://github.com/actions/setup-java/compare/v4...v5">compare
view</a></li>
</ul>
</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…at info (microsoft#25841)

### Description
This PR adds a new API that applications can use to verify compatibility
of a precompiled model with the underlying system, using only the
compatibility info string from the model's metadata.

### Motivation and Context
- This is a feature to enable apps to check compatibility of a
precompiled model without necessarily having the model locally on the
device. This enables precompiled models to be stored remotely and
downloaded once the application has been able to confirm the validity of
a given model with EPs on the device.

### Testing
- New unit tests pass 
- For regression testing, built a private version of WinML + AMD NPU EP
with these changes. Ran the Cpp Selfcontained Desktop sample
successfully; ran with compilation and also re-ran using the
already-compiled model to verify that session initialization continued
to work as expected.

---------

Co-authored-by: Aditya Rastogi <adityar@ntdev.microsoft.com>
)

### Description
According to the [WebNN
spec](https://www.w3.org/TR/webnn/#api-mlgraphbuilder-batchnorm), the
batchNorm should have input names "mean" and "variance" instead of
"input_mean" and "input_var".


### Motivation and Context
This issue causes any BatchNorm with mean/variance inputs to fall back
to wasm.
…dLayerNorm (microsoft#25850)

### Description
Use similar shaders as SkipSimplifiedLayerNorm in SimplifiedLayerNorm,
to fix the performance issues with SimplifiedLayerNorm.

### Motivation and Context
Prior to this change, generation in Bitnet was bottlenecked on
SimplifiedLayerNorm:
<img width="332" height="378" alt="image"
src="https://github.com/user-attachments/assets/3bc16ac1-ef7d-46bf-b403-92fc9192a2df"
/>
With this change, performance has improved to match
SkipSimplifiedLayerNorm:
<img width="699" height="179" alt="image"
src="https://github.com/user-attachments/assets/30009d85-d5d9-4585-987a-b39ecf52e0b5"
/>
…s int32 (microsoft#25646)

### Description
This PR makes DequantizeLinear support non-zero zero_point when input
data type is int32.



### Motivation and Context
For the WebNN use case, there are scenarios where the input data type
is int32 and the zero_point for DequantizeLinear is not zero.
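
As a reminder of what a non-zero zero_point changes, DequantizeLinear computes `y = (x - zero_point) * scale` per element. The sketch below is a plain-Python illustration with made-up values and a hypothetical helper name; it is not the WebNN EP implementation:

```python
def dequantize_linear(x: list[int], scale: float, zero_point: int = 0) -> list[float]:
    """Element-wise y = (x - zero_point) * scale for integer inputs."""
    return [(v - zero_point) * scale for v in x]


x = [100, 128, 156]  # illustrative int32 inputs

print(dequantize_linear(x, 0.5))                  # zero_point of 0
print(dequantize_linear(x, 0.5, zero_point=128))  # non-zero zero_point
```

With a zero_point of 128 the middle element maps to 0.0 and the other two become symmetric around it, which a zero-only implementation cannot express.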
@Jaswanth51 Jaswanth51 requested a review from ankitm3k August 28, 2025 03:58
@preetha-intel preetha-intel left a comment

Backmerging with Master

@Jaswanth51 Jaswanth51 merged commit b9a1885 into ovep-develop Aug 28, 2025
6 of 8 checks passed
@preetha-intel preetha-intel deleted the sync_msft_28082025 branch August 28, 2025 04:43