
Conversation

@Kotomi-Du commented Oct 10, 2025

GQA is natively supported by OV starting from the 2025.1 release. This PR aligns with that OV support.

"beam_idx",
"past_key_values",
"present",
"total_seq_len",


@Kotomi-Du Does the stateful model, post-translation into OVIR, always include the total_seq_len input? Is this now a general case for all LLMs (and since which OV toolkit version was this added)?

Author


It is the input name from the Microsoft generic model (specifically the Phi Silica model), not from the EPCtx OVIR model generated by the OV toolkit.

{"Atanh", V_2020_4, {"CPU"}},
{"Atanh", V_2022_1, {"GPU"}},
{"Attention", V_2023_0, {"CPU", "GPU"}},
{"GroupQueryAttention", V_2025_1, {"CPU", "GPU"}},


Please add to the PR description the JIRA that enables the GQA op for the CPU and GPU plugins in the OV ONNX frontend. Please make sure this change doesn't conflict with GQA support for NPU in your validation process (FYI, we are not currently targeting GQA support for NPU).

Author


Will do.


If GQA is enabled for CPU, it will also be marked as supported for NPU. Can you revert the change?


@MayureshV1 commented Oct 17, 2025


GQA should NOT be marked as supported on CPU. NPU uses CPU's capability list. As @preetha-intel mentioned, if it is marked as supported on CPU, the op would be targeted to run on NPU, and in case of a compilation failure it would fall back to OV CPU instead of MLAS, which is currently a CP+ production requirement.
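For illustration, if the intent is to keep GQA off the CPU (and therefore NPU) path while still enabling it for GPU, the capability entry would presumably be narrowed to GPU only, along these lines (a sketch of one possible change, not the final diff):

{"GroupQueryAttention", V_2025_1, {"GPU"}},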


@ankitm3k left a comment


LGTM
