fix: Fix vLLM provider sending hard-coded reasoning_effort values that fail server-side validation#2170
Merged
Conversation
Contributor
There was a problem hiding this comment.
Code Review
This pull request updates the reasoning effort handling for the vLLM provider in crates/tui/src/client.rs. Instead of hardcoding "high", it now dynamically maps the user-chosen reasoning effort value ("low"/"minimal" to "low", "medium"/"mid" to "medium", and defaults to "high"). Additionally, it downgrades "max" to "high" to prevent sending an invalid value to vLLM. There are no review comments to address, and I have no additional feedback to provide.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Commit Message — vLLM reasoning_effort fix (#2169)
Summary
Fix vLLM provider sending hard-coded
reasoning_effortvalues thatfail server-side validation.
Closes: #2169
Problem
CodeWhale's
apply_reasoning_effortfunction hard-coded invalid valuesfor the vLLM provider:
"max"branch sentreasoning_effort: "max"— vLLM only supportsnone,low,medium,high"low"/"medium"/"high"branch always sent"high", ignoring theuser's actual configured value
Result: 400 Bad Request on every chat completion attempt.
Fix
Two-line change in
crates/tui/src/client.rs:medium/high"high"low/medium/highmax"max""high"(vLLM-compatible)Files Changed
crates/tui/src/client.rsHow to Test
cargo test -p codewhale-tui -- clientConfig:
Send any chat message — should no longer receive 400.
Greptile Summary
Fixes two bugs in
apply_reasoning_effortforApiProvider::Vllm: thelow/medium/highbranch was always sending"high"regardless of the user's config, and themaxbranch was sending"max"which vLLM rejects with a 400.low/medium/highbranch: a new innermatchnow maps"low"/"minimal" → "low","medium"/"mid" → "medium", and all others →"high", mirroring the existing Openrouter/Novita logic.max/xhighbranch: now sends"high"(vLLM's maximum) instead of the invalid"max"string, with a clarifying comment.Confidence Score: 4/5
The fix is targeted and correct; both vLLM branches now send valid values. The only gap is missing test coverage for the new code paths.
The two-line bug — always sending 'high' for configurable efforts and the invalid 'max' for the top tier — is correctly patched. The inner match mirrors proven logic already used for Openrouter/Novita. No vLLM-specific unit tests accompany the change, so if someone later edits the inner match arms, there's nothing to catch a regression.
crates/tui/src/client.rs — the two new vLLM branches in apply_reasoning_effort have no dedicated tests.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A[apply_reasoning_effort called\nwith vLLM provider] --> B{normalized effort} B -- off/disabled/none --> C[chat_template_kwargs:\nenable_thinking: false] B -- low/minimal/medium/mid/high/empty --> D[chat_template_kwargs:\nenable_thinking: true] D --> E{inner match} E -- low or minimal --> F[reasoning_effort: 'low'] E -- medium or mid --> G[reasoning_effort: 'medium'] E -- high or empty or other --> H[reasoning_effort: 'high'] B -- xhigh/max/highest --> I[chat_template_kwargs:\nenable_thinking: true] I --> J[reasoning_effort: 'high'\n downgraded from max]Comments Outside Diff (1)
crates/tui/src/client.rs, line 936-947 (link)Every other provider in
apply_reasoning_efforthas at least one test (Deepseek, NvidiaNim, Fireworks, Openrouter each have dedicated#[test]cases), but the two vLLM branches changed in this PR have no coverage at all. Without tests, a future refactor of the innermatcharms or a mistaken copy-paste could silently reintroduce the hard-coded"high"regression. A test along the lines of the existingreasoning_effort_maps_openrouter_scale_without_deepseek_max_labeltest that coverslow → "low",medium → "medium",high → "high", andmax → "high"forApiProvider::Vllmwould lock in the fix.Reviews (1): Last reviewed commit: "fix: vLLM provider — pass through reason..." | Re-trigger Greptile