Compatibility Issues with Qwen Series Models (VL, QVQ-max) via LiteLLM Proxy #161

@Uc207Pr4f57t9-251

Description

I've been attempting to integrate Alibaba Cloud's Qwen series models (specifically dashscope/qwen-vl-max and dashscope/qvq-max) into Bytebot using the recommended LiteLLM proxy setup (local Docker Compose). While basic connectivity was established after significant debugging (related to agent authentication and build caching), severe compatibility issues remain with these specific models, preventing their effective use.

1. Qwen-VL Models (e.g., qwen-vl-max) - Non-Standard Tool Calling:

  • Problem: When bytebot-agent sends a request with tools defined and tool_choice: "auto", qwen-vl-max (via LiteLLM) does not populate the standard tool_calls field in the response. Instead, it returns tool_calls: null and embeds the intended tool calls as JSON code blocks (e.g., ```json { "name": "...", "input": {...} } ```) directly within the message.content field, often mixed with natural language "thinking" text.
  • Impact: bytebot-agent's current response parser (formatChatCompletionResponse in proxy.service.ts) only checks the message.tool_calls field. Since it's null, the agent fails to recognize or execute the tool calls, treating the entire content (including the JSON blocks) as plain text output to the user.
  • Debugging Done:
    • Confirmed via direct curl/Python requests to litellm-proxy that the issue persists even when bytebot-agent is bypassed, proving it's an incompatibility between Qwen-VL's output format and the standard expected by the agent.
    • Ensured the reasoning_effort parameter was removed from the bytebot-agent source (proxy.service.ts) via --no-cache builds, ruling it out as the cause.
    • Configured tool_choice: "auto" via LiteLLM UI ("Default Parameters") for the model, which did not resolve the issue.
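For illustration, the mismatch described above can be captured in a small check. This is a minimal sketch with hypothetical type and function names (not Bytebot's actual internals); it only distinguishes the standard `tool_calls` shape from Qwen-VL's embedded-JSON shape:

```typescript
// Sketch: detect when a model has embedded its tool calls as fenced JSON
// blocks inside message.content instead of populating the standard
// tool_calls field. Names and types here are illustrative, not Bytebot's API.
interface ChatMessage {
  content: string | null;
  tool_calls?: unknown[] | null;
}

function hasEmbeddedToolCalls(message: ChatMessage): boolean {
  if (message.tool_calls && message.tool_calls.length > 0) {
    return false; // standard shape; nothing embedded to recover
  }
  // Qwen-VL mixes natural-language "thinking" text with ```json ... ``` blocks.
  return /```json\s*[\s\S]*?```/.test(message.content ?? "");
}

// Example modeled on the behavior described above (tool name is made up):
const qwenStyle: ChatMessage = {
  tool_calls: null,
  content:
    'I should look at the screen first.\n```json\n{ "name": "take_screenshot", "input": {} }\n```',
};
console.log(hasEmbeddedToolCalls(qwenStyle)); // true
```

A check like this is where a fallback parser would hook in on the agent side.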

2. QVQ/QVQ-max Models - HTTPS Connection Error:

  • Problem: Attempts to use dashscope/qvq-max consistently fail with the error litellm.BadRequestError: DashscopeException - current user api does not support http call.
  • Debugging Done:
    • Verified multiple times that the api_base configured in the LiteLLM UI for this model is correctly set to HTTPS: https://dashscope.aliyuncs.com/compatible-mode/v1.
    • This error occurs even when the Docker Desktop global proxy and any system-level proxies (like Clash) are completely disabled, ruling out proxy HTTPS-stripping.
    • Other Dashscope models (like qwen-vl-max) connect successfully over HTTPS using the same LiteLLM proxy setup.
  • Conclusion: This suggests a specific issue either with the Dashscope endpoint for qvq-max when accessed via the OpenAI compatibility layer, or with how the LiteLLM adapter handles HTTPS requests only for this specific model variant.

Environment:

  • Bytebot: Built locally from source (equivalent to a recent edge build).
  • LiteLLM: Running via Docker using ghcr.io/berriai/litellm:main-stable image.
  • Models Tested: dashscope/qwen-vl-max, dashscope/qvq-max.
  • Setup: Local Docker Compose on Windows, using the project's provided postgres container for both bytebot-agent and litellm-proxy databases (bytebotdb and litellm_logs_db respectively). LiteLLM configured with master_key and encryption_key.

Workarounds Attempted:

  • Qwen-VL: Manually modified bytebot-agent's formatChatCompletionResponse function in proxy.service.ts to parse ```json ... ``` blocks from message.content when message.tool_calls is null. (Code based on our discussion can be provided if needed). This works but requires modifying agent source.
  • QVQ-max: No workaround found. Model remains unusable due to the persistent HTTPS error.
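The Qwen-VL workaround above can be sketched roughly as follows. This is a simplified standalone version of the idea, not the exact patch to proxy.service.ts; the ExtractedToolCall shape and function name are assumptions for illustration:

```typescript
// Sketch of the fallback parsing idea: when message.tool_calls is null or
// empty, recover tool calls from fenced ```json blocks embedded in
// message.content, and return the remaining "thinking" text separately.
// Types and names are illustrative, not Bytebot's actual internals.
interface ExtractedToolCall {
  name: string;
  input: Record<string, unknown>;
}

function extractToolCallsFromContent(content: string): {
  toolCalls: ExtractedToolCall[];
  text: string; // natural-language text left after removing tool-call blocks
} {
  const toolCalls: ExtractedToolCall[] = [];
  const blockRe = /```json\s*([\s\S]*?)```/g;
  const text = content
    .replace(blockRe, (whole: string, body: string) => {
      try {
        const parsed = JSON.parse(body);
        if (parsed && typeof parsed.name === "string") {
          toolCalls.push({ name: parsed.name, input: parsed.input ?? {} });
          return ""; // drop the recognized block from the visible text
        }
      } catch {
        // not valid JSON: leave the block in place as ordinary text
      }
      return whole;
    })
    .trim();
  return { toolCalls, text };
}
```

The agent-side fallback would invoke this only when message.tool_calls is null or empty, so models that conform to the standard response format are unaffected.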

Suggestions:

  1. Enhance bytebot-agent Parser: Update formatChatCompletionResponse to include fallback logic that parses JSON code blocks from message.content if message.tool_calls is null/empty. This would provide out-of-the-box compatibility with models like Qwen-VL.
  2. Investigate QVQ HTTPS Issue: This seems like a deeper issue, potentially within LiteLLM's Dashscope adapter or the Dashscope endpoint itself. Collaboration with the LiteLLM team might be needed.
  3. Document Compatibility: Update Bytebot documentation regarding known compatibility issues with specific Qwen models and potential workarounds (like the agent code modification).
  4. Agent Authentication: Address the underlying issue where bytebot-agent ignores standard proxy keys and requires hardcoding or the OPENAI_API_KEY workaround.

Thanks for looking into this. Qwen models are important in certain regions, and improving compatibility would be very beneficial.
