
Sync with Microsoft ONNX Runtime - 26042026 #1064

Merged
ankitm3k merged 7 commits into ovep-develop from sync_msft_26042026
Apr 28, 2026
Conversation

@ai-fw-intg

Automated daily backmerge from ORT main to ovep-develop. No conflicts detected. Do NOT squash or rebase - use merge commit only.

vraspar and others added 7 commits April 27, 2026 11:40
…uested (microsoft#28027)

### Description
When MultiHeadAttention has only 1 output (no present_key/present_value
outputs), past key/value inputs should be completely ignored, matching
CPU EP semantics. The WebGPU EP was passing pastKey/pastValue
TensorViews to shader creation functions even when outputCount <= 1,
which affected shader cache keys and allowed past data to leak into the
attention computation.

This caused the test "MultiHeadAttention Basic, one head and head-size=4
with pastKey and pastValue" to fail with output [17,18,19,20] (pastValue
data) instead of expected [9,10,11,12] (V data). The failing output
matches exactly what happens when past IS used: Q·pastKey=75 dominates
Q·K=35, so softmax gives ~100% weight to pastValue.

### Fix
In `applyAttention()`, introduce `effectivePastKey`/`effectivePastValue`
that are set to `undefined` when `outputCount <= 1`. All downstream
usage (shader creation, input arrays) uses these effective values
instead of the raw parameters. This ensures:
- Shader cache keys correctly reflect the "no past" configuration
- Past tensors are never passed to any shader creation function
- Behavior matches CPU EP (which ignores past when present outputs are
null)
- GQA is unaffected (always has outputCount >= 3)
- Vanilla Attention is unaffected (always passes undefined for past)
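The failure mode described above can be checked with a few lines of arithmetic. This is a hedged sketch, not the WebGPU shader code: it assumes the standard scaled dot-product softmax with head_size = 4 (scale = 1/sqrt(4) = 0.5) and uses the dot products and value rows quoted in the description to show why the leaked past data dominates the output.

```cpp
#include <cmath>
#include <vector>

// Softmax weight assigned to the pastValue row, given the two raw
// dot products Q.pastKey and Q.K and the attention scale factor.
inline double past_weight(double q_dot_past_key, double q_dot_k, double scale) {
  const double sp = q_dot_past_key * scale;
  const double sk = q_dot_k * scale;
  const double m = std::max(sp, sk);  // subtract max for numerical stability
  const double wp = std::exp(sp - m);
  const double wk = std::exp(sk - m);
  return wp / (wp + wk);
}

// Attention output row: convex mix of the pastValue row and the V row.
inline std::vector<double> attn_row(double p_past,
                                    const std::vector<double>& past_v,
                                    const std::vector<double>& v) {
  std::vector<double> out(v.size());
  for (std::size_t i = 0; i < v.size(); ++i)
    out[i] = p_past * past_v[i] + (1.0 - p_past) * v[i];
  return out;
}
```

With Q·pastKey = 75 and Q·K = 35, the softmax puts essentially all weight on the past row, so the output reproduces pastValue [17,18,19,20] instead of V [9,10,11,12], matching the observed test failure.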
### Description
In the CPU RNN operator's `Assign_Y_h` function, when `sequence_lens`
contains a value of 0, the computation `sequence_lens[batch] - 1 = -1`
produces a negative offset into the Y output buffer. `CopyVector` then
reads `hidden_size` floats from heap memory before the buffer, leaking
heap data into the `Y_h` output tensor.

LSTM and GRU already handle zero-length sequences correctly (early
return + zero-fill in compute path), but the basic RNN operator had
neither protection.


### Changes
- **rnn.cc `Compute()`**: add an early return when `max_sequence_length == 0`, zero-filling the Y and Y_h outputs and returning immediately (matches the existing LSTM/GRU pattern)
- **rnn.cc `Assign_Y_h()`**: add a bounds check on `last_time_step` before computing the buffer offset, guarding against both a negative index (`seq_lens = 0`) and an index >= `seq_length`, and zero-filling Y_h for invalid entries
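The bounds check can be sketched as follows. This is an illustrative stand-in for the actual `Assign_Y_h` code, not a copy of rnn.cc: the function name, signature, and the assumed Y layout of [seq_length, batch_size, hidden_size] are simplifications for the example.

```cpp
#include <algorithm>
#include <vector>

// Sketch: copy each batch entry's last valid time step of Y into Y_h,
// zero-filling when sequence_lens[batch] is 0 or out of range instead
// of computing a negative (or too-large) offset into the Y buffer.
inline void assign_y_h_sketch(const std::vector<float>& Y,
                              std::vector<float>& Y_h,
                              const std::vector<int>& sequence_lens,
                              int seq_length, int batch_size, int hidden_size) {
  for (int b = 0; b < batch_size; ++b) {
    const int last_time_step = sequence_lens[b] - 1;
    float* dst = Y_h.data() + static_cast<std::size_t>(b) * hidden_size;
    if (last_time_step < 0 || last_time_step >= seq_length) {
      // seq_lens = 0 (or an invalid length): zero-fill Y_h rather than
      // read hidden_size floats from before the start of Y.
      std::fill(dst, dst + hidden_size, 0.0f);
      continue;
    }
    const float* src = Y.data() +
        static_cast<std::size_t>(last_time_step * batch_size + b) * hidden_size;
    std::copy(src, src + hidden_size, dst);
  }
}
```

Without the `last_time_step < 0` branch, a zero-length sequence makes `src` point before `Y.data()`, which is the heap read the fix removes.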

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…icrosoft#28241)

### Description

CI Python packaging pipelines now specify their packaging type (nightly
vs. release) via an explicit pipeline parameter rather than the
implicitly defined pipeline var `NIGHTLY_BUILD`.

### Motivation and Context

Much less error prone than an implicitly defined pipeline variable.
### Description
Fixes three ICM incidents:


https://portal.microsofticm.com/imp/v5/incidents/details/31000000572208/summary

https://portal.microsofticm.com/imp/v5/incidents/details/31000000573313/summary

https://portal.microsofticm.com/imp/v5/incidents/details/31000000575583/summary


### Motivation and Context
Fix ICM issues

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This pull request makes a small change to the CUDA label encoder kernel
to address unused parameter warnings. The change marks the `attr_name`
parameter as unused in the `TryGetScalarTensorAttribute` function when
building with the plugin execution provider.

* Code quality improvement: marked the `attr_name` parameter as unused with `ORT_UNUSED_PARAMETER(attr_name);` to suppress compiler warnings when building with `BUILD_CUDA_EP_AS_PLUGIN`.
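The pattern is a one-liner; a minimal self-contained sketch follows. The macro body matches the void-cast idiom commonly used for this purpose, but the function name and signature here are simplified stand-ins, not the real kernel code.

```cpp
// Simplified stand-in for ONNX Runtime's macro: a void-cast marks a
// parameter as intentionally unused, silencing -Wunused-parameter.
#define ORT_UNUSED_PARAMETER(x) (void)(x)

// Illustrative shape of the fix: under the plugin-EP build, attr_name
// is never referenced, so it is explicitly marked unused.
inline bool TryGetScalarTensorAttributeSketch(const char* attr_name, float* out) {
  ORT_UNUSED_PARAMETER(attr_name);  // plugin build path: parameter unused
  *out = 0.0f;
  return false;  // sketch: no attribute found
}
```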
### Description
Pass the base timestamp for VitisAI profiling.

Notify the EP that profiling has started, providing the base timestamp (in
nanoseconds since epoch). The VitisAI EP can use this to:
1. Calculate relative timestamps (event_ts - base_ts) for the profiling
timeline
2. Store the absolute base timestamp if needed for other purposes
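The timeline-merging arithmetic this enables is simple; a hedged sketch (the function name is illustrative, not the EP's API):

```cpp
#include <cstdint>

// Given the absolute profiling start (nanoseconds since epoch) and an
// absolute event timestamp, derive the same relative offset that the
// onnxruntime profiling JSON records, so both timelines can be aligned.
inline int64_t relative_ts(int64_t event_ts_ns, int64_t base_ts_ns) {
  return event_ts_ns - base_ts_ns;
}
```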



### Motivation and Context
The default onnxruntime profiling JSON file contains only offset
timestamps; it does not provide the base timestamp to the VitisAI EP. To
combine the VitisAI timeline profiling info with the default onnxruntime
profiling JSON, we need to pass the base timestamp to the VitisAI EP.

---------

Signed-off-by: Andrew Luo <junpengl@amd.com>
Co-authored-by: Andrew Luo <junpengl@amd.com>
@ankitm3k ankitm3k merged commit 6750358 into ovep-develop Apr 28, 2026
6 of 8 checks passed
@ankitm3k ankitm3k deleted the sync_msft_26042026 branch April 28, 2026 18:29
