
qwen3-vl does not work. #2376

@jclab-joseph


Describe the bug

I tried `auto-opt` on Qwen3-VL with the change from #2345 applied, but it failed.

To Reproduce

% pip install 'olive-ai[auto-opt] @ git+https://github.com/microsoft/Olive.git@6c1a86971c6b5a9df513410979dc67984c992397'

% python -m olive auto-opt  --model_name_or_path Qwen/Qwen3-VL-Embedding-2B --trust_remote_code  --output_path models/Qwen3-VL-Embedding-2B-int8-webgpu  --device gpu --provider WebGpuExecutionProvider  --use_ort_genai --precision int8 --log_level 1
Loading HuggingFace model from Qwen/Qwen3-VL-Embedding-2B
[2026-03-31 11:22:31,736] [INFO] [run.py:99:run_engine] Running workflow default_workflow
[2026-03-31 11:22:31,770] [INFO] [cache.py:142:__init__] Using cache directory: <DIR>/.olive-cache/default_workflow
[2026-03-31 11:22:31,771] [INFO] [accelerator_creator.py:200:create_accelerator] Running workflow on accelerator spec: gpu-webgpu
[2026-03-31 11:22:31,781] [INFO] [engine.py:208:run] Running Olive on accelerator: gpu-webgpu
[2026-03-31 11:22:31,781] [INFO] [engine.py:836:_create_system] Creating target system ...
[2026-03-31 11:22:31,781] [INFO] [engine.py:839:_create_system] Target system created in 0.000029 seconds
[2026-03-31 11:22:31,781] [INFO] [engine.py:842:_create_system] Creating host system ...
[2026-03-31 11:22:31,781] [INFO] [engine.py:845:_create_system] Host system created in 0.000018 seconds
[2026-03-31 11:22:32,048] [INFO] [engine.py:668:_run_pass] Running pass conversion:onnxconversion
`torch_dtype` is deprecated! Use `dtype` instead!
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/__main__.py", line 11, in <module>
    main(called_as_console_script=False)
    ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/cli/launcher.py", line 75, in main
    service.run()
    ~~~~~~~~~~~^^
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/telemetry/telemetry_extensions.py", line 137, in wrapper
    return func(*args, **kwargs)
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/cli/auto_opt.py", line 177, in run
    return self._run_workflow()
           ~~~~~~~~~~~~~~~~~~^^
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/cli/base.py", line 45, in _run_workflow
    workflow_output = olive_run(run_config)
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/workflows/run/run.py", line 178, in run
    return run_engine(package_config, run_config)
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/workflows/run/run.py", line 139, in run_engine
    return engine.run(
           ~~~~~~~~~~^
        run_config.input_model,
        ^^^^^^^^^^^^^^^^^^^^^^^
    ...<5 lines>...
        run_config.engine.log_severity_level,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/telemetry/telemetry_extensions.py", line 137, in wrapper
    return func(*args, **kwargs)
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/engine/engine.py", line 210, in run
    self.run_accelerator(
    ~~~~~~~~~~~~~~~~~~~~^
        input_model_config,
        ^^^^^^^^^^^^^^^^^^^
    ...<2 lines>...
        accelerator_spec,
        ^^^^^^^^^^^^^^^^^
    )
    ^
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/engine/engine.py", line 287, in run_accelerator
    self._run_no_search(input_model_config, input_model_id, accelerator_spec, artifacts_dir)
    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/engine/engine.py", line 331, in _run_no_search
    should_prune, signal, model_ids = self._run_passes(input_model_config, input_model_id, accelerator_spec)
                                      ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/engine/engine.py", line 624, in _run_passes
    model_config, model_id = self._run_pass(
                             ~~~~~~~~~~~~~~^
        pass_name,
        ^^^^^^^^^^
    ...<2 lines>...
        accelerator_spec,
        ^^^^^^^^^^^^^^^^^
    )
    ^
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/engine/engine.py", line 719, in _run_pass
    output_model_config = host.run_pass(p, input_model_config, output_model_path)
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/systems/local.py", line 45, in run_pass
    output_model = the_pass.run(model, output_model_path)
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/passes/olive_pass.py", line 243, in run
    output_model = self._run_for_config(model, self.config, output_model_path)
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/passes/onnx/conversion.py", line 652, in _run_for_config
    output_model = self._run_for_config_internal(model, config, output_model_path)
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/passes/onnx/conversion.py", line 695, in _run_for_config_internal
    return self._convert_model_on_device(model, config, output_model_path, device, torch_dtype)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/passes/onnx/conversion.py", line 713, in _convert_model_on_device
    pytorch_model = model.load_model(cache_model=False)
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/model/handler/hf.py", line 75, in load_model
    model = load_model_from_task(self.task, self.model_path, **self.get_load_kwargs())
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/common/hf/utils.py", line 62, in load_model_from_task
    model = from_pretrained(model_class, model_name_or_path, "model", **kwargs)
  File "<DIR>/.venv/lib/python3.13/site-packages/olive/common/hf/utils.py", line 94, in from_pretrained
    return cls.from_pretrained(get_pretrained_name_or_path(model_name_or_path, mlflow_dir), **kwargs)
           ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<DIR>/.venv/lib/python3.13/site-packages/transformers/models/auto/auto_factory.py", line 384, in from_pretrained
    raise ValueError(
    ...<2 lines>...
    )
ValueError: Unrecognized configuration class <class 'transformers.models.qwen3_vl.configuration_qwen3_vl.Qwen3VLConfig'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of AfmoeConfig, ApertusConfig, ArceeConfig, AriaTextConfig, BambaConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BitNetConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, BltConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CohereConfig, Cohere2Config, CpmAntConfig, CTRLConfig, CwmConfig, Data2VecTextConfig, DbrxConfig, DeepseekV2Config, DeepseekV3Config, DiffLlamaConfig, DogeConfig, Dots1Config, ElectraConfig, Emu3Config, ErnieConfig, Ernie4_5Config, Ernie4_5_MoeConfig, Exaone4Config, ExaoneMoeConfig, FalconConfig, FalconH1Config, FalconMambaConfig, FlexOlmoConfig, FuyuConfig, GemmaConfig, Gemma2Config, Gemma3Config, Gemma3TextConfig, Gemma3nConfig, Gemma3nTextConfig, GitConfig, GlmConfig, Glm4Config, Glm4MoeConfig, Glm4MoeLiteConfig, GlmMoeDsaConfig, GotOcr2Config, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GptOssConfig, GPTJConfig, GraniteConfig, GraniteMoeConfig, GraniteMoeHybridConfig, GraniteMoeSharedConfig, HeliumConfig, HunYuanDenseV1Config, HunYuanMoEV1Config, Jais2Config, JambaConfig, JetMoeConfig, Lfm2Config, Lfm2MoeConfig, LlamaConfig, Llama4Config, Llama4TextConfig, LongcatFlashConfig, MambaConfig, Mamba2Config, MarianConfig, MBartConfig, MegatronBertConfig, MiniMaxConfig, MiniMaxM2Config, MinistralConfig, Ministral3Config, MistralConfig, MixtralConfig, MllamaConfig, ModernBertDecoderConfig, MoshiConfig, MptConfig, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, NanoChatConfig, NemotronConfig, NemotronHConfig, OlmoConfig, Olmo2Config, Olmo3Config, OlmoHybridConfig, OlmoeConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PersimmonConfig, PhiConfig, Phi3Config, Phi4MultimodalConfig, PhimoeConfig, PLBartConfig, ProphetNetConfig, Qwen2Config, Qwen2MoeConfig, Qwen3Config, Qwen3_5Config, Qwen3_5MoeConfig, Qwen3_5MoeTextConfig, Qwen3_5TextConfig, Qwen3MoeConfig, Qwen3NextConfig, RecurrentGemmaConfig, 
ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, SeedOssConfig, SmolLM3Config, SolarOpenConfig, StableLmConfig, Starcoder2Config, TrOCRConfig, VaultGemmaConfig, WhisperConfig, XGLMConfig, XLMConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, xLSTMConfig, XmodConfig, YoutuConfig, ZambaConfig, Zamba2Config.
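For context on the error above: each transformers Auto class (here `AutoModelForCausalLM`) keeps a mapping from config classes to model classes and raises `ValueError` for any config not registered in it; `Qwen3VLConfig` is not in the causal-LM mapping, so Olive would presumably need to dispatch Qwen3-VL through a vision-language auto class instead. A minimal pure-Python sketch of that dispatch mechanism (names abridged and hypothetical, not the actual transformers internals):

```python
# Abridged stand-in for AutoModelForCausalLM's config-to-model mapping.
# Real transformers builds this mapping from registered model types.
SUPPORTED_CAUSAL_LM = {
    "Qwen3Config": "Qwen3ForCausalLM",
    "LlamaConfig": "LlamaForCausalLM",
}

def from_pretrained(config_cls_name: str) -> str:
    """Mimic Auto class dispatch: look up the model class for a config,
    raising ValueError when the config class is not registered."""
    model_cls = SUPPORTED_CAUSAL_LM.get(config_cls_name)
    if model_cls is None:
        raise ValueError(
            f"Unrecognized configuration class {config_cls_name} "
            "for this kind of AutoModel: AutoModelForCausalLM."
        )
    return model_cls
```

With this sketch, `from_pretrained("Qwen3Config")` resolves normally while `from_pretrained("Qwen3VLConfig")` reproduces the `ValueError` seen in the traceback.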


Other information

  • OS: macOS (M4)
  • Olive version: main
  • ONNXRuntime package and version: X
  • Transformers version: 5.4.0

