Compatibility Issues with Qwen Series Models (VL, QVQ-max) via LiteLLM Proxy #161

@Uc207Pr4f57t9-251

Description

I've been attempting to integrate Alibaba Cloud's Qwen series models (specifically dashscope/qwen-vl-max and dashscope/qvq-max) into Bytebot using the recommended LiteLLM proxy setup (local Docker Compose). While basic connectivity was established after significant debugging (related to agent authentication and build caching), severe compatibility issues remain with these specific models, preventing their effective use.

1. Qwen-VL Models (e.g., qwen-vl-max) - Non-Standard Tool Calling:

  • Problem: When bytebot-agent sends a request with tools defined and tool_choice: "auto", qwen-vl-max (via LiteLLM) does not populate the standard tool_calls field in the response. Instead, it returns tool_calls: null and embeds the intended tool calls as JSON code blocks (e.g., ```json { "name": "...", "input": {...} } ```) directly within the message.content field, often mixed with natural language "thinking" text.
  • Impact: bytebot-agent's current response parser (formatChatCompletionResponse in proxy.service.ts) only checks the message.tool_calls field. Since it's null, the agent fails to recognize or execute the tool calls, treating the entire content (including the JSON blocks) as plain text output to the user.
  • Debugging Done:
    • Confirmed via direct curl/Python requests to litellm-proxy that the issue persists even when bytebot-agent is bypassed, proving it's an incompatibility between Qwen-VL's output format and the standard expected by the agent.
    • Ensured the reasoning_effort parameter was removed from the bytebot-agent source (proxy.service.ts) via --no-cache builds, ruling it out as the cause.
    • Configured tool_choice: "auto" via LiteLLM UI ("Default Parameters") for the model, which did not resolve the issue.
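For illustration, the mismatch described above can be captured in a small check. This is a minimal sketch with hypothetical type and function names (not Bytebot's actual internals); it only distinguishes the standard `tool_calls` shape from Qwen-VL's embedded-JSON shape:

```typescript
// Sketch: detect when a model has embedded its tool calls as fenced JSON
// blocks inside message.content instead of populating the standard
// tool_calls field. Names and types here are illustrative, not Bytebot's API.
interface ChatMessage {
  content: string | null;
  tool_calls?: unknown[] | null;
}

function hasEmbeddedToolCalls(message: ChatMessage): boolean {
  if (message.tool_calls && message.tool_calls.length > 0) {
    return false; // standard shape; nothing embedded to recover
  }
  // Qwen-VL mixes natural-language "thinking" text with ```json ... ``` blocks.
  return /```json\s*[\s\S]*?```/.test(message.content ?? "");
}

// Example modeled on the behavior described above (tool name is made up):
const qwenStyle: ChatMessage = {
  tool_calls: null,
  content:
    'I should look at the screen first.\n```json\n{ "name": "take_screenshot", "input": {} }\n```',
};
console.log(hasEmbeddedToolCalls(qwenStyle)); // true
```

A check like this is where a fallback parser would hook in on the agent side.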

2. QVQ/QVQ-max Models - HTTPS Connection Error:

  • Problem: Attempts to use dashscope/qvq-max consistently fail with the error litellm.BadRequestError: DashscopeException - current user api does not support http call.
  • Debugging Done:
    • Verified multiple times that the api_base configured in the LiteLLM UI for this model is correctly set to HTTPS: https://dashscope.aliyuncs.com/compatible-mode/v1.
    • This error occurs even when the Docker Desktop global proxy and any system-level proxies (like Clash) are completely disabled, ruling out proxy HTTPS-stripping.
    • Other Dashscope models (like qwen-vl-max) connect successfully over HTTPS using the same LiteLLM proxy setup.
  • Conclusion: This suggests a specific issue either with the Dashscope endpoint for qvq-max when accessed via the OpenAI compatibility layer, or with how the LiteLLM adapter handles HTTPS requests only for this specific model variant.

Environment:

  • Bytebot: Built locally from source (equivalent to a recent edge build).
  • LiteLLM: Running via Docker using ghcr.io/berriai/litellm:main-stable image.
  • Models Tested: dashscope/qwen-vl-max, dashscope/qvq-max.
  • Setup: Local Docker Compose on Windows, using the project's provided postgres container for both bytebot-agent and litellm-proxy databases (bytebotdb and litellm_logs_db respectively). LiteLLM configured with master_key and encryption_key.

Workarounds Attempted:

  • Qwen-VL: Manually modified bytebot-agent's formatChatCompletionResponse function in proxy.service.ts to parse ```json ... ``` blocks from message.content when message.tool_calls is null. (Code based on our discussion can be provided if needed). This works but requires modifying agent source.
  • QVQ-max: No workaround found. Model remains unusable due to the persistent HTTPS error.
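The Qwen-VL workaround above can be sketched roughly as follows. This is a simplified standalone version of the idea, not the exact patch to proxy.service.ts; the ExtractedToolCall shape and function name are assumptions for illustration:

```typescript
// Sketch of the fallback parsing idea: when message.tool_calls is null or
// empty, recover tool calls from fenced ```json blocks embedded in
// message.content, and return the remaining "thinking" text separately.
// Types and names are illustrative, not Bytebot's actual internals.
interface ExtractedToolCall {
  name: string;
  input: Record<string, unknown>;
}

function extractToolCallsFromContent(content: string): {
  toolCalls: ExtractedToolCall[];
  text: string; // natural-language text left after removing tool-call blocks
} {
  const toolCalls: ExtractedToolCall[] = [];
  const blockRe = /```json\s*([\s\S]*?)```/g;
  const text = content
    .replace(blockRe, (whole: string, body: string) => {
      try {
        const parsed = JSON.parse(body);
        if (parsed && typeof parsed.name === "string") {
          toolCalls.push({ name: parsed.name, input: parsed.input ?? {} });
          return ""; // drop the recognized block from the visible text
        }
      } catch {
        // not valid JSON: leave the block in place as ordinary text
      }
      return whole;
    })
    .trim();
  return { toolCalls, text };
}
```

The agent-side fallback would invoke this only when message.tool_calls is null or empty, so models that conform to the standard response format are unaffected.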

Suggestions:

  1. Enhance bytebot-agent Parser: Update formatChatCompletionResponse to include fallback logic that parses JSON code blocks from message.content if message.tool_calls is null/empty. This would provide out-of-the-box compatibility with models like Qwen-VL.
  2. Investigate QVQ HTTPS Issue: This seems like a deeper issue, potentially within LiteLLM's Dashscope adapter or the Dashscope endpoint itself. Collaboration with the LiteLLM team might be needed.
  3. Document Compatibility: Update Bytebot documentation regarding known compatibility issues with specific Qwen models and potential workarounds (like the agent code modification).
  4. Agent Authentication: Address the underlying issue where bytebot-agent ignores standard proxy keys and requires hardcoding or the OPENAI_API_KEY workaround.

Thanks for looking into this. Qwen models are important in certain regions, and improving compatibility would be very beneficial.
