[OVEP] OpenVINO EP 1.26.0 Development Release Updates by ankitm3k · Pull Request #28297 · microsoft/onnxruntime

ankitm3k · 2026-04-30T17:41:25Z

Summary

Periodic upstream sync of Intel's OVEP branch (ovep_1_26_release) into ORT main. All changes are scoped to the OpenVINO EP and its tests.

OpenVINO 2026.0 / 2026.1 support

Add V_2026_0 / V_2026_1 version enums; capability.cc default bumped to V_2026_1.
Register FLOAT8E4M3FN / FLOAT8E5M2 initializer types on CPU / GPU / NPU.
Disable OVEP-level QDQ-stripping on OV ≥ 2026.1 (OV handles it internally).
Add ReduceSum to no-dimension-supported ops.

KV-cache / stateful CausalLM

Rename ReorderKVCache → SetReorderKVCacheStatus across backend interfaces.
Populate src_idx / dst_idx in PreProcessInferRequest with shape validation; clean state after inference and on RewindKVCache.
FuseCacheReorder: beam_idx and src_idx/dst_idx paths are now mutually exclusive; reject models that already carry reorder inputs.
Behavior change: RewindKVCache(index > 0) now throws when reorder is enabled (physical KV-cache eviction pass is a TODO).

NPU / provider options

Force disable_dynamic_shapes=true on NPU unless enable_causallm is set.
Preserve user-supplied NPU_COMPILATION_MODE_PARAMS; skip it when importing precompiled blobs.
Preserve factory-level device_type when session options don't override it (fixes NPU mis-selection from Python).
Behavior change: removed the ORT_OPENVINO_NPU_COMPILER_TYPE env override — OV's default NPU compiler is used now.
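
The NPU dynamic-shape default listed above can be pictured as a small rule. The following is a minimal sketch under stated assumptions, not the PR's code: `OvepOptions`, `ApplyNpuShapeDefault`, and the substring match on "NPU" are hypothetical; only the option names `device_type`, `enable_causallm`, and `disable_dynamic_shapes` come from the description.

```cpp
#include <string>

// Hypothetical option bag; the real provider-options struct is not shown in
// the PR description.
struct OvepOptions {
  std::string device_type;
  bool enable_causallm = false;
  bool disable_dynamic_shapes = false;
};

// Apply the default described above: dynamic shapes are disabled on NPU
// unless the stateful CausalLM path is enabled.
// Assumption: a device string that mentions "NPU" counts as targeting NPU.
inline void ApplyNpuShapeDefault(OvepOptions& opts) {
  const bool targets_npu = opts.device_type.find("NPU") != std::string::npos;
  if (targets_npu && !opts.enable_causallm) {
    opts.disable_dynamic_shapes = true;
  }
}
```

In this sketch a user-supplied `disable_dynamic_shapes=false` on NPU is overridden, which matches the word "Force" in the description; other device types are left untouched.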

External initializers / weight sharing

Drop the 32 MB embed threshold — always externalize when multiple external initializers are in memory.
DumpOpenVINOEPModel rebuilds a self-contained proto when initializer data was stripped.
AddExternalWeight validates re-adds against existing offset/size/location (parity with ABI EP); fix race in device-tensor mapping.
ov_bin_manager: bounds-checked pointer view over mapped weights (fixes read-only blob import).
qdq_stripping: use std::from_chars so offsets/lengths > 4 GB parse correctly.
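
As an illustration of the `std::from_chars` item above, here is a minimal sketch of parsing an external-data offset or length into a 64-bit value, so that numbers past the 32-bit range survive intact. `ParseU64` is a hypothetical helper name and the strict rejection of trailing characters is an assumption; the PR's actual parsing code is not shown in the description.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper (not the PR's code): parse a decimal byte offset or
// length from external-data metadata into a 64-bit unsigned value.
inline std::optional<std::uint64_t> ParseU64(std::string_view text) {
  std::uint64_t value = 0;
  const char* first = text.data();
  const char* last = first + text.size();
  const auto [ptr, ec] = std::from_chars(first, last, value);
  // Reject empty input, out-of-range values, and trailing non-digit characters.
  if (ec != std::errc{} || ptr != last) return std::nullopt;
  return value;
}
```

`std::from_chars` reports failure through its return value rather than exceptions or `errno`, and it does not depend on the locale, which makes the error path easy to check at the call site.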

Perf-count dump

New ORT_OPENVINO_PERF_COUNT=<dir> env var writes per-subgraph CSV (Layer Name,Status,Layer Type,Real Time (us),Exec Type), replacing the old stdout-only debug dump. Requires ov::enable_profiling on the compiled model; logs a warning and no-ops otherwise.
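
The CSV layout described above can be sketched as follows. This is an illustrative formatter only: `PerfRow`, `CsvField`, and `FormatPerfCsv` are hypothetical names, and the quoting rule for fields that contain commas is an assumption; the column header is the one quoted in the description.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical row type; the fields mirror the CSV columns named above.
struct PerfRow {
  std::string layer_name;
  std::string status;
  std::string layer_type;
  std::int64_t real_time_us;
  std::string exec_type;
};

// Quote a field only when it contains a comma, a double quote, or a newline.
// Embedded double quotes are doubled. (Assumed quoting rule.)
inline std::string CsvField(const std::string& s) {
  if (s.find_first_of(",\"\n") == std::string::npos) return s;
  std::string out = "\"";
  for (char c : s) {
    if (c == '"') out += '"';
    out += c;
  }
  out += '"';
  return out;
}

// Build the CSV text for one subgraph: header line, then one line per layer.
inline std::string FormatPerfCsv(const std::vector<PerfRow>& rows) {
  std::ostringstream os;
  os << "Layer Name,Status,Layer Type,Real Time (us),Exec Type\n";
  for (const auto& r : rows) {
    os << CsvField(r.layer_name) << ',' << CsvField(r.status) << ','
       << CsvField(r.layer_type) << ',' << r.real_time_us << ','
       << CsvField(r.exec_type) << '\n';
  }
  return os.str();
}
```

Returning a string keeps the sketch independent of where the file is written; a caller would read the directory from `ORT_OPENVINO_PERF_COUNT` and write one such file per subgraph.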

Misc

API: IBackend::Infer is no longer const (needed for perf-dump bookkeeping).
Filter orphaned graph outputs from OVEP sub-graphs.
Better error message for "cannot export dynamically compiled model" (points to reshape_input).
Human-readable ovep_exception::type strings.
ov::shutdown() on DLL unload.

Tests

Add OVEP_ExtInit_DynamicEmbed_Tests and OVEP_ExtInit_EmptyRawData_Tests; refactor setup into SetUpTestSuite.
Narrow OVEP exclusions in embed_layer_norm, fused_matmul, matmul_4bits, quantize_linear (skip only unsupported sub-cases).
perftest: reset outputs per run to support data-dependent output shapes (e.g. NonZero).

Testing

Validated against the OpenVINO versions this release targets (2025.3 – 2026.1) on CPU / GPU / NPU:

New OVEP tests pass: OVEP_ExtInit_Tests, OVEP_ExtInit_DynamicEmbed_Tests, OVEP_ExtInit_EmptyRawData_Tests
Narrowed contrib-op exclusions verified against EmbedLayerNorm, FusedMatMul, MatMulNBits, QuantizeLinear
Stateful CausalLM flow exercised for KV-cache reorder + rewind
ORT_OPENVINO_PERF_COUNT=<dir> verified to produce per-subgraph CSVs
2+ GB external-initializers-in-memory model loads on CPU / GPU / NPU

* Add on-the-fly bfloat16->float16 conversion pass
* Fix undetected bfloat16 initializers
* Remove the option and make the logic implicit
* Add tests
* Rename detection function
* Fix CI for strict aliasing rules
---------
Co-authored-by: Vishnudas Thaniel S <vishnudas.thaniel.s@intel.com>

#776)
* Fix failing case where input onnx model is used with shared context enabled
* Update onnxruntime/core/providers/openvino/openvino_execution_provider.cc
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: MayureshV1 <47039074+MayureshV1@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

#1046)
* Fix incorrect device selection(NPU) with python app running OVEP GPU backend
* apply reviwer comment
* apply new change
* remove last change and extra space
* remove space
* cleaning
* fix the name
* Revert "cleaning" This reverts commit c58f3d6.
* Revert "remove space" This reverts commit ba1939b.
* Revert "remove last change and extra space" This reverts commit 0cc2294.
* Revert "apply new change" This reverts commit bddb18b.
* Revert "fix the name" This reverts commit c800fdc.
* revert back new changes
---------
Co-authored-by: Ankit Maheshkar <ankit.maheshkar@intel.com>
Co-authored-by: MayureshV1 <47039074+MayureshV1@users.noreply.github.com>

ankitm3k · 2026-04-30T17:44:06Z

@adrianlizarraga Please review & merge this PR.

Copilot

Pull request overview

Upstream sync of Intel’s OpenVINO EP development branch into ONNX Runtime main, adding OpenVINO 2026.0/2026.1 support and updating OVEP behavior/tests around external initializers, stateful CausalLM KV-cache handling, and profiling/perf-count dumping.

Changes:

Add OV 2026.0/2026.1 version handling and expand supported initializer/op capability metadata.
Update KV-cache reorder/stateful infer-request interfaces and logic (incl. new failure behavior on RewindKVCache(index>0) when reorder is enabled).
Improve external initializer handling/weight sharing + add/adjust OVEP tests; add env-driven perf-count CSV dumping; adjust perftest to reset outputs for data-dependent shapes.

Reviewed changes

Copilot reviewed 27 out of 27 changed files in this pull request and generated 5 comments.

File	Description
onnxruntime/test/providers/openvino/openvino_ep_ext_init.cc	Refactors/extends OVEP external-initializer tests; adds new edge-case coverage (dynamic embed + empty raw_data).
onnxruntime/test/providers/cpu/tensor/quantize_linear_test.cc	Adjusts skips/exclusions for QuantizeLinear coverage across EPs.
onnxruntime/test/perftest/ort_test_session.cc	Resets outputs between runs to handle data-dependent output shapes in perf tests.
onnxruntime/test/contrib_ops/matmul_4bits_test.cc	Updates tolerances and removes OpenVINO build guard to broaden test coverage.
onnxruntime/test/contrib_ops/fused_matmul_op_test.cc	Narrows OpenVINO exclusion to specific failing cases; keeps TensorRT excluded for unsupported dtype.
onnxruntime/test/contrib_ops/embed_layer_norm_op_test.cc	Only excludes OpenVINO when 3rd output is requested; otherwise runs normally.
onnxruntime/core/providers/openvino/qdq_transformations/qdq_stripping.cc	Switches offset/length parsing to `from_chars` for large (>4GB) external-data metadata.
onnxruntime/core/providers/openvino/ov_versions/data_ops.h	Adds `V_2026_0` / `V_2026_1` version enums.
onnxruntime/core/providers/openvino/ov_versions/data_ops.cc	Updates supported types/ops by OV version; adds FLOAT8 initializer types and `ReduceSum` no-dimension support.
onnxruntime/core/providers/openvino/ov_versions/capability.cc	Bumps default OV version mapping and filters orphaned graph outputs for subgraph outputs.
onnxruntime/core/providers/openvino/ov_stateful_patch_utils.cc	Updates cache reorder fusion to enforce mutual exclusivity of `beam_idx` vs `src_idx`/`dst_idx` paths.
onnxruntime/core/providers/openvino/ov_shared_context.h	Adds external-weight re-add validation and improves metadata insertion under lock.
onnxruntime/core/providers/openvino/ov_shared_context.cc	Adds locking around device-tensor mapping creation to fix races.
onnxruntime/core/providers/openvino/ov_interface.h	Renames KV-cache reorder API and adds cleanup hook for reorder status.
onnxruntime/core/providers/openvino/ov_interface.cc	Implements reorder tensor population/validation, cleanup after inference, and updated rewind behavior.
onnxruntime/core/providers/openvino/ov_bin_manager.cc	Adds bounds checks and changes mapped-blob tensor view construction for import.
onnxruntime/core/providers/openvino/openvino_provider_factory.cc	Updates provider option handling (NPU dynamic-shape defaulting; preserve user compilation params; preserve factory device_type).
onnxruntime/core/providers/openvino/openvino_provider_dllmain.cc	Calls `ov::shutdown()` on DLL unload.
onnxruntime/core/providers/openvino/openvino_execution_provider.cc	Renames dynamic option hook to call `SetReorderKVCacheStatus`.
onnxruntime/core/providers/openvino/ibackend.h	Makes `Infer` non-const and renames reorder interface.
onnxruntime/core/providers/openvino/exceptions.h	Adds human-readable exception type strings and adjusts error formatting.
onnxruntime/core/providers/openvino/backends/basic_backend.h	Adds perf-count dump helpers/state and updates interfaces for non-const infer + KV reorder rename.
onnxruntime/core/providers/openvino/backends/basic_backend.cc	Implements env-driven perf-count CSV dumping and skips compilation params on precompiled blob import.
onnxruntime/core/providers/openvino/backend_utils.h	Adds perf-count dump path accessor and changes perf-count print signatures.
onnxruntime/core/providers/openvino/backend_utils.cc	Implements `ORT_OPENVINO_PERF_COUNT` handling and CSV formatting for profiling output.
onnxruntime/core/providers/openvino/backend_manager.h	Renames backend-manager reorder interface.
onnxruntime/core/providers/openvino/backend_manager.cc	Updates export error messaging, gates OVEP QDQ stripping by OV version, adjusts external-initializer embedding heuristic, and updates debug model dumping.

Comments suppressed due to low confidence (1)

onnxruntime/core/providers/openvino/ov_interface.cc:534

PreProcessInferRequest() sets src_idx/dst_idx twice: the block starting at if (is_kvcache_reorder_added) is duplicated later in the same function. This duplicates allocations/fills and makes the logic harder to maintain. Remove the second block (or factor into a helper) so the reorder tensors are prepared exactly once per inference.

  if (is_kvcache_reorder_added) {
    ov::Shape dst_idx_shape = ovInfReq.get_tensor("dst_idx").get_shape();
    const auto kv_num_heads = dst_idx_shape[1];
    const auto kv_head_size = dst_idx_shape[3];
    if (kv_src_indices.size() > 0) {
      ov::Tensor src_idx_tensor = ov::Tensor(ov::element::i32, {kv_src_indices.size()});
      const auto src_idx_ptr = src_idx_tensor.data<int32_t>();
      for (size_t i = 0; i < kv_src_indices.size(); ++i) {
        src_idx_ptr[i] = static_cast<int32_t>(kv_src_indices[i]);
      }
      ovInfReq.set_tensor("src_idx", src_idx_tensor);

      ov::Tensor dst_idx_tensor = ov::Tensor(ov::element::i32, {1, kv_num_heads, kv_dst_indices.size(), kv_head_size});
      const auto dst_idx_ptr = dst_idx_tensor.data<int32_t>();
      for (size_t i = 0; i < kv_num_heads; ++i) {
        for (size_t j = 0; j < kv_dst_indices.size(); ++j) {
          std::fill_n(dst_idx_ptr + (i * kv_dst_indices.size() + j) * kv_head_size, kv_head_size, kv_dst_indices[j]);
        }
      }
      ovInfReq.set_tensor("dst_idx", dst_idx_tensor);
    } else {
      FillTensor("src_idx", ov::element::i32, {0}, 0);
      FillTensor("dst_idx", ov::element::i32, {1, kv_num_heads, 0, kv_head_size}, 0);
    }
  }
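
One way to act on the "factor into a helper" suggestion is to isolate the `dst_idx` fill pattern in a single function. The sketch below is a stand-in that uses a plain `std::vector` instead of `ov::Tensor`, so it shows only the layout logic; `BuildDstIdx` is a hypothetical name, and the `int32_t` element type of the destination indices is an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the tensor fill above: returns the flattened
// contents of a {1, num_heads, dst.size(), head_size} int32 buffer in which
// every element of row [0, h, j, :] equals dst[j].
inline std::vector<std::int32_t> BuildDstIdx(const std::vector<std::int32_t>& dst,
                                             std::size_t num_heads,
                                             std::size_t head_size) {
  std::vector<std::int32_t> out(num_heads * dst.size() * head_size);
  for (std::size_t h = 0; h < num_heads; ++h) {
    for (std::size_t j = 0; j < dst.size(); ++j) {
      // Same offset arithmetic as the snippet above: (h * n_dst + j) * head_size.
      std::fill_n(out.data() + (h * dst.size() + j) * head_size, head_size, dst[j]);
    }
  }
  return out;
}
```

With the fill logic in one place, `PreProcessInferRequest` would call it once per inference, and an empty `dst` naturally yields an empty buffer, which corresponds to the zero-length shape used in the `else` branch.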

MayureshV1 · 2026-05-01T06:43:57Z

@adrianlizarraga Addressed your review comments.

Signed-off-by: Jonathan Clohessy <jonathan.clohessy@arm.com>
Signed-off-by: bfilipek <bartlomiej.filipek@intel.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Christian Bourjau <christian.bourjau@quantco.com>
Co-authored-by: Ryan Metcalfe <107415876+RyanMetcalfeInt8@users.noreply.github.com>
Co-authored-by: jatinwadhwa921 <110383850+jatinwadhwa921@users.noreply.github.com>
Co-authored-by: Jaswanth Gannamaneni <jaswanth.gannamaneni@intel.com>
Co-authored-by: Klimenko, Mikhail <mikhail.klimenko@intel.com>
Co-authored-by: Vishnudas Thaniel S <vishnudas.thaniel.s@intel.com>
Co-authored-by: n1harika <niharika.sathish@intel.com>
Co-authored-by: TejalKhade28 <tejal.khade@intel.com>
Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com>
Co-authored-by: liang <gxgaoliang@126.com>
Co-authored-by: Javier Martinez <javier.e.martinez@intel.com>
Co-authored-by: MayureshV1 <47039074+MayureshV1@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: sfatimar <sahar.fatima@intel.com>
Co-authored-by: Garth Long <garth.long@intel.com>
Co-authored-by: Eric Crawford <eric.r.crawford@intel.com>
Co-authored-by: derdeljan-msft <derdeljan@microsoft.com>
Co-authored-by: Jonathan Clohessy <jonathan.clohessy@arm.com>
Co-authored-by: Akshay Sonawane <111780983+apsonawane@users.noreply.github.com>
Co-authored-by: Christopher Warrington <chwarr@microsoft.com>
Co-authored-by: Ishwar Raut <iraut@nvidia.com>
Co-authored-by: Gaurav Garg <gaugarg@nvidia.com>
Co-authored-by: Xinpeng Dou <15529241576@163.com>
Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com>
Co-authored-by: adrastogi <aditya.rastogi@microsoft.com>
Co-authored-by: Aditya Rastogi <adityar@ntdev.microsoft.com>
Co-authored-by: qti-hungjuiw <hungjuiw@qti.qualcomm.com>
Co-authored-by: qti-yuduo <yuduow@qti.qualcomm.com>
Co-authored-by: Pradeep Sakhamoori <psakhamoori@microsoft.com>
Co-authored-by: Adam Pocock <adam.pocock@oracle.com>
Co-authored-by: Changming Sun <chasun@microsoft.com>
Co-authored-by: mingyue <131847423+mingyueliuh@users.noreply.github.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Susanta Bhattacharjee <susanta.bhattacharjee@intel.com>
Co-authored-by: jatinwadhwa921 <jatin.wadhwa@intel.com>
Co-authored-by: Jozef Wludzik <jozef.wludzik@intel.com>
Co-authored-by: Bartlomiej Filipek <bartlomiej.filipek@intel.com>
Co-authored-by: Kotomi-Du <yaru.du@intel.com>
Co-authored-by: Rajeev Sekar <rajeevsekar21@gmail.com>
Co-authored-by: Mayuresh M Varerkar <mayuresh.m.varerkar@intel.com>
Co-authored-by: Mikhail Dvoretckii <mikhail.dvoretckii@intel.com>
Co-authored-by: bopeng1234 <bo.peng@intel.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: fs-eire <7679871+fs-eire@users.noreply.github.com>
Co-authored-by: Wenqin Yang <wenqin.yang@intel.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: xieofxie <xieofxie@126.com>
Co-authored-by: hualxie <hualxie@microsoft.com>
Co-authored-by: Jiajia Qin <jiajiaqin@microsoft.com>
Co-authored-by: Joshua Lochner <admin@xenova.com>
Co-authored-by: Christian Bourjau <cbourjau@users.noreply.github.com>
Co-authored-by: Xiaofei Han <xiaofeihan@microsoft.com>
Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com>
Co-authored-by: chunghow-qti <chunghow@qti.qualcomm.com>
Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com>
Co-authored-by: Jiawei Shao <jiawei.shao@intel.com>
Co-authored-by: Tianlei Wu <tlwu@microsoft.com>
Co-authored-by: czekun <chen.zekun@intel.com>
Co-authored-by: Ryan Metcalfe <ryan.metcalfe@intel.com>
Co-authored-by: Jaskaran Singh Nagi <jaskaran.singh.nagi@intel.com>
Co-authored-by: ai-fw-intg <sys_ai_fw_intg@intel.com>
Co-authored-by: Rajeev Sekar <rajeev.sekar@intel.com>
Co-authored-by: RajeevSekar <117911837+RajeevSekar@users.noreply.github.com>
Co-authored-by: Nazanin Beheshti <nazanin.beheshti@intel.com>

RyanMetcalfeInt8 and others added 30 commits July 23, 2025 22:42

ov_factory: Use 'GPU_DEVICE_ID' property to match with ORT device_id (#759)

3faf7d9

* ov_factory: Use 'GPU_DEVICE_ID' property to match with ORT device_id
* clean up comment

Merge branch 'master' into sync_msft_24_7_25

99e516b

Merge pull request #760 from intel/sync_msft_24_7_25

9c7a151

Sync msft 24 7 25

[OVEP] Fix for upsample optype (#761)

2306b4a

Merge branch 'master' into synccc_msft_29_7_25

1833f04

Merge pull request #762 from intel/synccc_msft_29_7_25

420ec3a

Backmerging with Msft commits

Merge branch 'master' into sync_msft_31072025

5a583f9

Merge pull request #764 from intel/sync_msft_31072025

ba8e3e7

Sync with Microsoft ONNX Runtime - 31/07/2025

[OVEP] Remove checks from load_config (#765)

f4da9f1

[OVEP] Mild weight sharing- quantization paramters are kept as initialisers

47a231a

Merge pull request #766 from intel/niharika/mild_weight_sharing

c533007

Mild weight as input implemented to keep quantization parameters as initializers for QDQ nodes

Merge branch 'master' into sync_msft_05082025

ddc64b9

Merge pull request #769 from intel/sync_msft_05082025

71f8877

Sync with Microsoft ONNX Runtime - 05/08/2025

Merge branch 'master' into sync_msft_07082025

1170738

Merge pull request #770 from intel/sync_msft_07082025

055300f

Sync with Microsoft ONNX Runtime - 07/08/2025

Merge branch 'master' into sync_msft_08082025

7f7091e

Merge pull request #772 from intel/sync_msft_08082025

c0c1ed7

Sync with Microsoft ONNX Runtime - 08/08/2025

Cluster Change to avoid Dangling DQLinear

e4f8acb

Error in subgraph

aa31709

Merge pull request #774 from intel/sahar/psu_lora

e77892e

DeQuantizeLinear is dangling which needs to be handled in capability.cc

Merge branch 'master' into sync_msft_12082025

7ce9c96

Merge pull request #775 from intel/sync_msft_12082025

07bf616

Sync with Microsoft ONNX Runtime - 12/08/2025

Fix to set precision from config (#778)

725744a

Not setting default precision if it is not set via provider option.

Fix the load_config not work when set INFERENCE_PRECISION_HINT (#777)

609dfbf

[OVEP] Support for providing layout to input/output to OpenVINO (#767)

a6359ee

* [OVEP] Support for providing layout to input/output to OpenVINO
* [OVEP] Minor bug fixes for layout feature

Merge branch 'master' into sync_msft_18082025

f05d669

Merge pull request #780 from intel/sync_msft_18082025

78e46e2

Sync with Microsoft ONNX Runtime - [18/08/2025]

Merge branch 'master' into sync_msft_20082025

6dd04a5

RajeevSekar and others added 14 commits April 22, 2026 09:50

Merge branch 'ovep-develop' into rajeev/contrib_ops_test

8156acd

Merge pull request #1050 from intel/rajeev/contrib_ops_test

e662bf1

CVS-165537 Minor fixes to partially enable contrib op tests

Merge remote-tracking branch 'origin/master' into sync_msft_24042026

9f07ece

Merge remote-tracking branch 'origin/master' into sync_msft_26042026

19efec4

Merge pull request #1063 from intel/sync_msft_24042026

9ab8a5d

Sync with Microsoft ONNX Runtime - 2404202

Merge pull request #1064 from intel/sync_msft_26042026

6750358

Sync with Microsoft ONNX Runtime - 26042026

lint fixes

6533ad6

Merge pull request #1065 from intel/rajeev/lint-runner

def12b3

lint fixes for OVEP develop branch

Merge pull request #1069 from intel/jatin_fix_resize_op

c01cf58

Merge branch 'master' into syncing_msft_30_4_2026

53de33a

Merge pull request #1070 from intel/syncing_msft_30_4_2026

1a4e22d

Backmerging PR

undo OVEP CI yml

4f4e365

undo OVEP CI yml v2

37d9ebe

adrianlizarraga reviewed Apr 30, 2026

Comment thread onnxruntime/test/providers/cpu/tensor/quantize_linear_test.cc

adrianlizarraga reviewed Apr 30, 2026

Comment thread onnxruntime/test/perftest/ort_test_session.cc Outdated

adrianlizarraga requested a review from Copilot April 30, 2026 20:57

Copilot started reviewing on behalf of adrianlizarraga April 30, 2026 20:57

adrianlizarraga reviewed Apr 30, 2026

Comment thread onnxruntime/core/providers/openvino/ov_interface.cc

Copilot AI reviewed Apr 30, 2026

Copilot AI mentioned this pull request May 1, 2026

Address Adrian's review comments on PR #28297 (perftest output reset + DML EP skip) intel/onnxruntime#1072

Merged

MayureshV1 approved these changes May 1, 2026

adrianlizarraga approved these changes May 1, 2026

adrianlizarraga merged commit d6c363c into microsoft:main May 1, 2026
87 checks passed

adrianlizarraga merged 372 commits into microsoft:main from intel:ovep_1_26_release

20 participants
