[WebGPU] Unexpected Output with Phi-3 Mini 4K Instruct Model from ORT GenAI #25180

Open
@Honry

Describe the issue

The WebNN developer preview provides a text-generation demo with several LLMs (Phi-3 Mini 4K Instruct, DeepSeek R1 Distill Qwen, TinyLlama, Qwen2), all generated with ONNX Runtime GenAI.

These models share a similar architecture (GQA, MatMulNBits, RotaryEmbedding, ...). When I tested them with the WebGPU EP, only Phi-3 Mini 4K Instruct produced unexpected output; the other models worked fine.
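
One possible way to narrow this down is to compare the logits of a single decoding step between the WebGPU EP and a reference EP such as 'wasm'. The snippet below is only a sketch under that assumption: `compareLogits` and `argmax` are hypothetical helpers, not part of the demo or of ONNX Runtime, and capturing the logits from each session is left to the caller.

```typescript
// Hypothetical helper (not part of the demo or of ONNX Runtime):
// compares two logits vectors for the same prompt and decoding step,
// e.g. one captured with the 'webgpu' EP and one with a reference EP.
interface LogitsDiff {
  maxAbsDiff: number;    // largest element-wise absolute difference (NaN if any NaN is seen)
  argmaxA: number;       // index of the top token in the first vector
  argmaxB: number;       // index of the top token in the second vector
  sameTopToken: boolean; // true when both vectors pick the same token
}

// Returns the index of the largest value (first one wins on ties).
function argmax(values: ArrayLike<number>): number {
  let best = 0;
  for (let i = 1; i < values.length; i++) {
    if (values[i] > values[best]) best = i;
  }
  return best;
}

function compareLogits(a: ArrayLike<number>, b: ArrayLike<number>): LogitsDiff {
  if (a.length !== b.length) {
    throw new Error(`length mismatch: ${a.length} vs ${b.length}`);
  }
  if (a.length === 0) {
    throw new Error("empty logits");
  }
  let maxAbsDiff = 0;
  for (let i = 0; i < a.length; i++) {
    const d = Math.abs(a[i] - b[i]);
    if (Number.isNaN(d)) {
      // A NaN in either vector is itself a strong hint of a kernel problem.
      maxAbsDiff = NaN;
      break;
    }
    if (d > maxAbsDiff) maxAbsDiff = d;
  }
  const argmaxA = argmax(a);
  const argmaxB = argmax(b);
  return { maxAbsDiff, argmaxA, argmaxB, sameTopToken: argmaxA === argmaxB };
}
```

If the first decoding step already shows a large `maxAbsDiff` or a different top token for Phi-3 Mini 4K Instruct but not for the other models, that would suggest the divergence comes from an operator or configuration specific to that model rather than from the generation loop.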

To reproduce

Test Phi-3 Mini 4K Instruct:

Test others:

Urgency

No response

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.23.0-dev.20250612-70f14d7670

Execution Provider

'webgpu' (WebGPU)

Metadata

Assignees

No one assigned

Labels

ep:WebGPU (ort-web webgpu provider)
ep:WebNN (WebNN execution provider)
platform:web (issues related to ONNX Runtime web; typically submitted using template)
