
fix: Fix vLLM provider sending hard-coded reasoning_effort values that fail server-side validation #2170

Merged
Hmbown merged 1 commit into Hmbown:main from idling11:fix/vllm-reasoning-effort
May 26, 2026

Conversation

@idling11

@idling11 idling11 commented May 26, 2026

Commit Message — vLLM reasoning_effort fix (#2169)

Summary

Fix vLLM provider sending hard-coded reasoning_effort values that
fail server-side validation.

Closes: #2169

Problem

CodeWhale's apply_reasoning_effort function hard-coded invalid values
for the vLLM provider:

  • "max" branch sent reasoning_effort: "max" — vLLM only supports
    none, low, medium, high
  • "low"/"medium"/"high" branch always sent "high", ignoring the
    user's actual configured value

Result: 400 Bad Request on every chat completion attempt.

Fix

Two hard-coded lines replaced in crates/tui/src/client.rs:

| Branch | Before | After |
|---|---|---|
| low / medium / high | hard-coded "high" | pass through actual low / medium / high |
| max | hard-coded "max" | downgrade to "high" (vLLM-compatible) |
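
The before/after behavior can be sketched as two small functions. This is a minimal illustration rather than the code in crates/tui/src/client.rs: the function names `vllm_effort_before` and `vllm_effort_after` are hypothetical, the input is assumed to be already normalized to lowercase, and the off/disabled/none case (which only toggles enable_thinking) is left out.

```rust
/// Hypothetical stand-in for the vLLM branches before the fix.
fn vllm_effort_before(normalized: &str) -> &'static str {
    match normalized {
        // "max" is not in vLLM's accepted set, so the server answered 400.
        "xhigh" | "max" | "highest" => "max",
        // Every other effort collapsed to "high", ignoring the config.
        _ => "high",
    }
}

/// Hypothetical stand-in for the vLLM branches after the fix.
fn vllm_effort_after(normalized: &str) -> &'static str {
    match normalized {
        "low" | "minimal" => "low",
        "medium" | "mid" => "medium",
        // "high", empty, and the max tier all map to "high",
        // the highest value vLLM accepts.
        _ => "high",
    }
}
```

With these stand-ins, a configured "medium" yields "high" before the fix and "medium" after it, while "max" yields the invalid "max" before and "high" after.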

Files Changed

| File | Change |
|---|---|
| crates/tui/src/client.rs | +11 / -2 lines |

How to Test

cargo test -p codewhale-tui -- client

Config:

[providers.vllm]
reasoning_effort = "medium"

Send any chat message — should no longer receive 400.

Greptile Summary

Fixes two bugs in apply_reasoning_effort for ApiProvider::Vllm: the low/medium/high branch was always sending "high" regardless of the user's config, and the max branch was sending "max" which vLLM rejects with a 400.

  • low/medium/high branch: a new inner match now maps "low"/"minimal" → "low", "medium"/"mid" → "medium", and all others → "high", mirroring the existing Openrouter/Novita logic.
  • max/xhigh branch: now sends "high" (vLLM's maximum) instead of the invalid "max" string, with a clarifying comment.

Confidence Score: 4/5

The fix is targeted and correct; both vLLM branches now send valid values. The only gap is missing test coverage for the new code paths.

The two-line bug — always sending 'high' for configurable efforts and the invalid 'max' for the top tier — is correctly patched. The inner match mirrors proven logic already used for Openrouter/Novita. No vLLM-specific unit tests accompany the change, so if someone later edits the inner match arms, there's nothing to catch a regression.

crates/tui/src/client.rs — the two new vLLM branches in apply_reasoning_effort have no dedicated tests.

Important Files Changed

| Filename | Overview |
|---|---|
| crates/tui/src/client.rs | Fixes hard-coded reasoning_effort values for vLLM: low/medium/high are now passed through correctly, and 'max' is downgraded to 'high'. Logic is correct; no vLLM-specific tests were added to cover the new branches. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[apply_reasoning_effort called\nwith vLLM provider] --> B{normalized effort}
    B -- off/disabled/none --> C[chat_template_kwargs:\nenable_thinking: false]
    B -- low/minimal/medium/mid/high/empty --> D[chat_template_kwargs:\nenable_thinking: true]
    D --> E{inner match}
    E -- low or minimal --> F[reasoning_effort: 'low']
    E -- medium or mid --> G[reasoning_effort: 'medium']
    E -- high or empty or other --> H[reasoning_effort: 'high']
    B -- xhigh/max/highest --> I[chat_template_kwargs:\nenable_thinking: true]
    I --> J[reasoning_effort: 'high'\n downgraded from max]

Comments Outside Diff (1)

  1. crates/tui/src/client.rs, lines 936-947

    P2 No tests added for vLLM reasoning effort mapping

    Every other provider in apply_reasoning_effort has at least one test (Deepseek, NvidiaNim, Fireworks, Openrouter each have dedicated #[test] cases), but the two vLLM branches changed in this PR have no coverage at all. Without tests, a future refactor of the inner match arms or a mistaken copy-paste could silently reintroduce the hard-coded "high" regression. A test along the lines of the existing reasoning_effort_maps_openrouter_scale_without_deepseek_max_label test that covers low → "low", medium → "medium", high → "high", and max → "high" for ApiProvider::Vllm would lock in the fix.
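
    A test of the kind suggested above might look like the following sketch. It is self-contained and therefore uses stand-ins: the `ApiProvider` enum, the `Body` map, the `apply_reasoning_effort` signature, the `sent_effort` helper, and the test name are all assumptions made for illustration, since the real definitions in crates/tui/src/client.rs are not shown in this PR.

```rust
use std::collections::BTreeMap;

// Stand-in types; the real ones in crates/tui/src/client.rs will differ.
#[derive(Clone, Copy)]
enum ApiProvider {
    Vllm,
}

type Body = BTreeMap<&'static str, String>;

// Stand-in for the patched vLLM logic described in this PR.
fn apply_reasoning_effort(provider: ApiProvider, effort: &str, body: &mut Body) {
    let normalized = effort.trim().to_ascii_lowercase();
    match provider {
        ApiProvider::Vllm => match normalized.as_str() {
            "off" | "disabled" | "none" => {
                body.insert("enable_thinking", "false".to_string());
            }
            other => {
                body.insert("enable_thinking", "true".to_string());
                let value = match other {
                    "low" | "minimal" => "low",
                    "medium" | "mid" => "medium",
                    // "high", empty, and max/xhigh/highest all send "high".
                    _ => "high",
                };
                body.insert("reasoning_effort", value.to_string());
            }
        },
    }
}

// Returns the reasoning_effort value that would be sent, if any.
fn sent_effort(effort: &str) -> Option<String> {
    let mut body = Body::new();
    apply_reasoning_effort(ApiProvider::Vllm, effort, &mut body);
    body.get("reasoning_effort").cloned()
}

#[test]
fn reasoning_effort_maps_vllm_scale_without_max_label() {
    let cases = [("low", "low"), ("medium", "medium"), ("high", "high"), ("max", "high")];
    for (input, expected) in cases {
        assert_eq!(sent_effort(input).as_deref(), Some(expected));
    }
    // Disabling thinking sends no reasoning_effort in this stand-in.
    assert_eq!(sent_effort("off"), None);
}
```

    A table-driven loop keeps all four mappings in one place, so a regression to the hard-coded "high" or the invalid "max" would fail a single, easy-to-read test.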


Reviews (1): Last reviewed commit: "fix: vLLM provider — pass through reason..."


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates the reasoning effort handling for the vLLM provider in crates/tui/src/client.rs. Instead of hardcoding "high", it now dynamically maps the user-chosen reasoning effort value ("low"/"minimal" to "low", "medium"/"mid" to "medium", and defaults to "high"). Additionally, it downgrades "max" to "high" to prevent sending an invalid value to vLLM. There are no review comments to address, and I have no additional feedback to provide.

@Hmbown Hmbown merged commit 81480ba into Hmbown:main May 26, 2026
9 checks passed

Development

Successfully merging this pull request may close these issues.

[Bug] vLLM provider validation error: reasoning_effort "max" is invalid

2 participants