Failed to load qwen3.5-9b – com.microsoft:CausalConvWithState not registered #700

@suki-lqh

Bug Report: Failed to load qwen3.5-9b-generic-gpu:2 – com.microsoft:CausalConvWithState not registered

1. Environment Info

  • OS: Windows 11 25H2
  • .NET Version: .NET 8.0
  • NuGet Packages:
    • Microsoft.AI.Foundry.Local.WinML 1.1.0
    • Microsoft.ML.OnnxRuntime.Gpu.Linux 1.25.1
    • OpenAI 2.10.0
  • Affected Model: qwen3.5-9b-generic-gpu:2
  • Model Path: D:\foundry_models\Microsoft\qwen3.5-9b-generic-gpu-2

2. Issue Description

When loading the ONNX model qwen3.5-9b-generic-gpu:2 via Microsoft.AI.Foundry.Local, the application crashes with an unhandled exception.
ONNX Runtime GenAI fails to load text.onnx because the custom operator com.microsoft:CausalConvWithState is not registered.

3. Full Exception Stack Trace

Unhandled exception. Microsoft.AI.Foundry.Local.FoundryLocalException: Error loading model qwen3.5-9b-generic-gpu:2: Error: Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException: Load model from D:\foundry_models\Microsoft\qwen3.5-9b-generic-gpu-2\v2\text.onnx failed:Fatal error: com.microsoft:CausalConvWithState(-1) is not a registered function/op
   at Microsoft.ML.OnnxRuntimeGenAI.Result.VerifySuccess(IntPtr) + 0x47
   at Microsoft.ML.OnnxRuntimeGenAI.Model..ctor(Config) + 0x26
   at Microsoft.Neutron.OpenAI.Provider.OnnxLoadedModel..ctor(String, Config, GenAIConfig, InferenceModel, OnnxEP) + 0x56
   at Microsoft.AI.Foundry.Local.ModelManager.<LoadModelAsync>d__9.MoveNext() + 0xc8c
--- End of stack trace from previous location ---
   at Microsoft.AI.Foundry.Local.NativeInterop.<>c__DisplayClass10_0.<<ExecuteCommandManaged>b__5>d.MoveNext() + 0xa0
--- End of stack trace from previous location ---
   at Microsoft.AI.Foundry.Local.NativeInterop.<ExecuteWithTracker>d__9.MoveNext() + 0xb8

4. Root Cause Analysis

  1. Platform mismatch: the project references Microsoft.ML.OnnxRuntime.Gpu.Linux, a Linux build, while running on Windows, so the custom operator binaries are missing.
  2. The ONNX model qwen3.5-9b-generic-gpu relies on the custom operator com.microsoft:CausalConvWithState, which is not registered in the current OnnxRuntimeGenAI runtime.
  3. Version incompatibility between Microsoft.AI.Foundry.Local.WinML 1.1.0 and OnnxRuntime 1.25.1.

5. Suggested Resolution

  1. Replace Microsoft.ML.OnnxRuntime.Gpu.Linux with the Windows-compatible Microsoft.ML.OnnxRuntime.Gpu or Microsoft.ML.OnnxRuntime.DirectML.
  2. Upgrade Microsoft.AI.Foundry.Local.WinML to a newer version that supports Qwen3.5 series ONNX models.
  3. Ensure the runtime extensions and custom operator libraries for CausalConvWithState are properly registered in OnnxRuntime GenAI.
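
To make the platform-mismatch check from the first root cause easy to repeat, here is a minimal sketch in Python that scans a project file for package references whose ID ends in an OS suffix that does not match the host. The helper name `find_platform_mismatches`, the suffix table, and the sample project XML are all hypothetical and written for illustration; only the three package IDs and versions come from the environment info above. It does not confirm that this mismatch is the actual cause of the load failure.

```python
import xml.etree.ElementTree as ET

# Assumption: a package ID ending in one of these suffixes targets that OS only.
PLATFORM_SUFFIXES = {".linux": "linux", ".windows": "windows"}


def find_platform_mismatches(csproj_xml: str, host_os: str) -> list[str]:
    """Return package IDs whose OS suffix does not match host_os."""
    root = ET.fromstring(csproj_xml)
    mismatched = []
    for element in root.iter():
        # Drop an XML namespace prefix, if the project file declares one.
        tag = element.tag.rsplit("}", 1)[-1]
        if tag != "PackageReference":
            continue
        package_id = element.get("Include", "")
        for suffix, target_os in PLATFORM_SUFFIXES.items():
            if package_id.lower().endswith(suffix) and target_os != host_os:
                mismatched.append(package_id)
    return mismatched


# Hypothetical project file built from the packages listed in this report.
SAMPLE = """<Project Sdk="Microsoft.NET.Sdk">
  <ItemGroup>
    <PackageReference Include="Microsoft.AI.Foundry.Local.WinML" Version="1.1.0" />
    <PackageReference Include="Microsoft.ML.OnnxRuntime.Gpu.Linux" Version="1.25.1" />
    <PackageReference Include="OpenAI" Version="2.10.0" />
  </ItemGroup>
</Project>"""

print(find_platform_mismatches(SAMPLE, host_os="windows"))
# prints ['Microsoft.ML.OnnxRuntime.Gpu.Linux']
```

A check like this only looks at direct references in one project file; packages pulled in transitively would need a separate look at the restored dependency graph.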
