fix: Fix vLLM provider sending hard-coded `reasoning_effort` values that fail server-side validation by idling11 · Pull Request #2170 · Hmbown/CodeWhale

idling11 · 2026-05-26T08:44:23Z

Commit Message — vLLM reasoning_effort fix (#2169)

Summary

Fix vLLM provider sending hard-coded reasoning_effort values that
fail server-side validation.

Closes: #2169

Problem

CodeWhale's apply_reasoning_effort function hard-coded invalid values
for the vLLM provider:

- The "max" branch sent reasoning_effort: "max", but vLLM only supports none, low, medium, high.
- The "low"/"medium"/"high" branch always sent "high", ignoring the user's actual configured value.

Result: 400 Bad Request on every chat completion attempt.

Fix

Two-line change in crates/tui/src/client.rs:

Branch	Before	After
`low` / `medium` / `high`	hard-coded `"high"`	pass through actual `low` / `medium` / `high`
`max`	hard-coded `"max"`	downgrade to `"high"` (vLLM-compatible)
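
The before/after behavior in the table can be sketched as two small functions. This is only an illustration: `vllm_effort_before` and `vllm_effort_after` are hypothetical names, and the real `apply_reasoning_effort` in crates/tui/src/client.rs has a different signature and also sets other request fields. The effort aliases (`minimal`, `mid`, `xhigh`, `highest`) are taken from the review flowchart on this page.

```rust
/// Hypothetical stand-in for the vLLM arms BEFORE the fix.
/// Takes an already-normalized, enabled effort level and returns the
/// `reasoning_effort` string that would be sent.
fn vllm_effort_before(effort: &str) -> &'static str {
    match effort {
        // Bug 1: "max" is not in vLLM's accepted set, so the server answers 400.
        "xhigh" | "max" | "highest" => "max",
        // Bug 2: the configured level is ignored and "high" is always sent.
        _ => "high",
    }
}

/// Hypothetical stand-in for the vLLM arms AFTER the fix.
fn vllm_effort_after(effort: &str) -> &'static str {
    match effort {
        // "high" is the strongest value vLLM accepts, so the top tier is downgraded.
        "xhigh" | "max" | "highest" => "high",
        // Inner match: pass the configured level through.
        other => match other {
            "low" | "minimal" => "low",
            "medium" | "mid" => "medium",
            _ => "high",
        },
    }
}
```

With `reasoning_effort = "medium"` configured, the old sketch yields `"high"` while the new one yields `"medium"`; for `"max"`, the old sketch yields the rejected `"max"` while the new one yields `"high"`.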

Files Changed

File	Change
`crates/tui/src/client.rs`	+11 / -2 lines

How to Test

cargo test -p codewhale-tui -- client

Config:

[providers.vllm]
reasoning_effort = "medium"

Send any chat message — should no longer receive 400.

Greptile Summary

Fixes two bugs in apply_reasoning_effort for ApiProvider::Vllm: the low/medium/high branch was always sending "high" regardless of the user's config, and the max branch was sending "max" which vLLM rejects with a 400.

- low/medium/high branch: a new inner match now maps "low"/"minimal" → "low", "medium"/"mid" → "medium", and all others → "high", mirroring the existing Openrouter/Novita logic.
- max/xhigh branch: now sends "high" (vLLM's maximum) instead of the invalid "max" string, with a clarifying comment.

Confidence Score: 4/5

The fix is targeted and correct; both vLLM branches now send valid values. The only gap is missing test coverage for the new code paths.

The two-line bug — always sending 'high' for configurable efforts and the invalid 'max' for the top tier — is correctly patched. The inner match mirrors proven logic already used for Openrouter/Novita. No vLLM-specific unit tests accompany the change, so if someone later edits the inner match arms, there's nothing to catch a regression.

crates/tui/src/client.rs — the two new vLLM branches in apply_reasoning_effort have no dedicated tests.

Important Files Changed

Filename	Overview
crates/tui/src/client.rs	Fixes hard-coded reasoning_effort values for vLLM: low/medium/high are now passed through correctly, and 'max' is downgraded to 'high'. Logic is correct; no vLLM-specific tests were added to cover the new branches.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[apply_reasoning_effort called\nwith vLLM provider] --> B{normalized effort}
    B -- off/disabled/none --> C[chat_template_kwargs:\nenable_thinking: false]
    B -- low/minimal/medium/mid/high/empty --> D[chat_template_kwargs:\nenable_thinking: true]
    D --> E{inner match}
    E -- low or minimal --> F[reasoning_effort: 'low']
    E -- medium or mid --> G[reasoning_effort: 'medium']
    E -- high or empty or other --> H[reasoning_effort: 'high']
    B -- xhigh/max/highest --> I[chat_template_kwargs:\nenable_thinking: true]
    I --> J[reasoning_effort: 'high'\n downgraded from max]
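
Read as code, the flowchart corresponds roughly to the sketch below. It is a simplified model, not the project's implementation: `VllmReasoningFields` and `vllm_reasoning_fields` are hypothetical names, the real function mutates a request body instead of returning a struct, and two points are assumptions the flowchart does not settle: that no `reasoning_effort` is sent when thinking is disabled, and that unrecognized effort strings leave the request untouched (modeled here as `None`).

```rust
/// Hypothetical summary of the request fields named in the flowchart.
#[derive(Debug, PartialEq)]
struct VllmReasoningFields {
    /// Value for `chat_template_kwargs.enable_thinking`.
    enable_thinking: bool,
    /// Value for `reasoning_effort`; `None` means the field is not sent (assumption).
    reasoning_effort: Option<&'static str>,
}

/// Maps a normalized effort string to the fields the flowchart shows for vLLM.
fn vllm_reasoning_fields(normalized: &str) -> Option<VllmReasoningFields> {
    match normalized {
        // Node C: thinking is switched off.
        "off" | "disabled" | "none" => Some(VllmReasoningFields {
            enable_thinking: false,
            reasoning_effort: None,
        }),
        // Nodes D to H: thinking on, inner match picks the level.
        "low" | "minimal" | "medium" | "mid" | "high" | "" => {
            let effort = match normalized {
                "low" | "minimal" => "low",
                "medium" | "mid" => "medium",
                _ => "high", // "high" and the empty string
            };
            Some(VllmReasoningFields {
                enable_thinking: true,
                reasoning_effort: Some(effort),
            })
        }
        // Nodes I and J: top tier is downgraded to "high".
        "xhigh" | "max" | "highest" => Some(VllmReasoningFields {
            enable_thinking: true,
            reasoning_effort: Some("high"),
        }),
        // Not covered by the flowchart; treated as "no change" in this sketch.
        _ => None,
    }
}
```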

Comments Outside Diff (1)

crates/tui/src/client.rs, line 936-947 (link)

No tests added for vLLM reasoning effort mapping

Every other provider in apply_reasoning_effort has at least one test (Deepseek, NvidiaNim, Fireworks, Openrouter each have dedicated #[test] cases), but the two vLLM branches changed in this PR have no coverage at all. Without tests, a future refactor of the inner match arms or a mistaken copy-paste could silently reintroduce the hard-coded "high" regression. A test along the lines of the existing reasoning_effort_maps_openrouter_scale_without_deepseek_max_label test that covers low → "low", medium → "medium", high → "high", and max → "high" for ApiProvider::Vllm would lock in the fix.
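
A regression test of the kind suggested above might look like the following sketch. It does not call the real `apply_reasoning_effort`, whose signature is not shown on this page; `vllm_reasoning_effort` is a hypothetical helper standing in for the vLLM path, and the test name is likewise illustrative.

```rust
/// Hypothetical helper standing in for the vLLM path of `apply_reasoning_effort`.
fn vllm_reasoning_effort(effort: &str) -> &'static str {
    match effort {
        "low" | "minimal" => "low",
        "medium" | "mid" => "medium",
        // "high", the top-tier aliases, and anything else all map to "high".
        _ => "high",
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn vllm_reasoning_effort_passes_through_and_downgrades_max() {
        // (configured value, value expected on the wire)
        let cases = [
            ("low", "low"),
            ("medium", "medium"),
            ("high", "high"),
            ("max", "high"),
        ];
        for (configured, expected) in cases {
            assert_eq!(
                vllm_reasoning_effort(configured),
                expected,
                "unexpected mapping for {configured:?}"
            );
        }
    }
}
```

A table-driven test keeps each configured value next to its expected wire value, so a reintroduced hard-coded `"high"` would fail on the `low` and `medium` rows.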


gemini-code-assist

Code Review

This pull request updates the reasoning effort handling for the vLLM provider in crates/tui/src/client.rs. Instead of hardcoding "high", it now dynamically maps the user-chosen reasoning effort value ("low"/"minimal" to "low", "medium"/"mid" to "medium", and defaults to "high"). Additionally, it downgrades "max" to "high" to prevent sending an invalid value to vLLM. There are no review comments to address, and I have no additional feedback to provide.

fix: vLLM provider — pass through reasoning_effort, downgrade max to high (Hmbown#2169)

2a5db58

gemini-code-assist Bot reviewed May 26, 2026


idling11 mentioned this pull request May 26, 2026

[Bug] vLLM provider validation error: reasoning_effort "max" is invalid #2169

Closed

Hmbown merged commit 81480ba into Hmbown:main May 26, 2026
9 checks passed

Hmbown merged 1 commit into Hmbown:main from idling11:fix/vllm-reasoning-effort

idling11 commented May 26, 2026 • edited by greptile-apps Bot
