UN-3478 [FIX] Support OpenAI gpt-5 / o-series in openai-compatible adapter#1983
Conversation
OpenAI gpt-5 / o1 / o3 routed through the openai-compatible adapter were rejected by the upstream API with "temperature does not support 0.1" and "max_tokens not supported, use max_completion_tokens" because LiteLLM's `custom_openai` provider does not apply the reasoning-model parameter transformations that the `openai` provider does. Add an `Enable Reasoning` toggle (matching the OpenAI adapter pattern) that drops `temperature` and `max_tokens` from the top-level kwargs and routes `reasoning_effort` and `max_completion_tokens` via LiteLLM's `extra_body` (which is forwarded as-is to the upstream API). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Enable Reasoning toggle alone is not enough: users dropping a gpt-5 or o-series model name into the adapter will not know they need to flip a switch, so the upstream API still returns `temperature does not support 0.1` / `max_tokens is not supported`. Pattern-match OpenAI's known reasoning families (gpt-5, o1, o3, o4) after stripping the `custom_openai/` / `openai/` prefixes and route the request through the reasoning code path automatically. Keep the `enable_reasoning` opt-in for gateway aliases that hide the underlying reasoning model behind a custom name. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds OpenAI reasoning-model detection, a schema toggle to enable reasoning, reasoning-aware validation that routes reasoning parameters into ChangesOpenAI Reasoning Model Support
🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/adapters/base1.py`:
- Around line 405-420: The code currently uses the raw adapter_metadata
max_tokens when building extra_body for reasoning, risking an unvalidated value
leaking into the payload; instead, after validating adapter_metadata via
OpenAICompatibleLLMParameters (the validated dict), pull the validated
max_tokens from validated (e.g., validated.get("max_tokens")) and use that value
when setting extra_body["max_completion_tokens"] (only add it when not None),
keeping other logic (reasoning_effort, enable_reasoning, extra_body assignment)
the same.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 448295e2-7dca-4140-8cef-be840aba9ade
📒 Files selected for processing (3)
unstract/sdk1/src/unstract/sdk1/adapters/base1.pyunstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.jsonunstract/sdk1/tests/test_openai_compatible_adapter.py
|
| Filename | Overview |
|---|---|
| unstract/sdk1/src/unstract/sdk1/adapters/base1.py | Core reasoning-model logic added to OpenAICompatibleLLMParameters.validate(); module constants defining provider prefixes and regex pattern are placed after OpenAILLMParameters which already references _OPENAI_PROVIDER_PREFIX — works at runtime but unusual ordering. |
| unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json | Adds enable_reasoning toggle and conditional reasoning_effort field via JSON Schema allOf/if/then; both if-branches correctly include required:[enable_reasoning] following prior review fixes. |
| unstract/sdk1/tests/test_openai_compatible_adapter.py | Comprehensive new tests for reasoning auto-detection, explicit opt-in/out, re-validation idempotency, and schema structure; addresses all prior review feedback with correctly isolated test scenarios. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[validate called with adapter_metadata] --> B[validate_model: add custom_openai/ prefix]
B --> C{enable_reasoning explicitly True?}
C -- Yes --> G[reasoning_path = True]
C -- No --> D{enable_reasoning absent AND reasoning_effort or extra_body.reasoning_effort present?}
D -- Yes --> G
D -- No --> E{_is_openai_reasoning_model: gpt-5, o1, o3, o4?}
E -- Yes --> G
E -- No --> F[Non-reasoning path]
F --> F1[Pydantic validates params]
F1 --> F2[Pop reasoning_effort]
F2 --> Z[Return validated dict with temperature + max_tokens]
G --> H[Pydantic validates params excluding enable_reasoning]
H --> I[Build extra_body with reasoning_effort + max_completion_tokens]
I --> J[Pop temperature, max_tokens, reasoning_effort from top-level]
J --> K[Return validated dict with extra_body only]
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
unstract/sdk1/src/unstract/sdk1/adapters/base1.py:286-287
The three module constants (`_OPENAI_PROVIDER_PREFIX`, `_CUSTOM_OPENAI_PROVIDER_PREFIX`, `_OPENAI_REASONING_MODEL_PATTERN`) are defined at line 352, but `OpenAILLMParameters.validate_model` — defined earlier at line 340 — already references `_OPENAI_PROVIDER_PREFIX`. Python resolves the name at call time, so there is no runtime error, but the forward-reference is easy to miss during maintenance and is inconsistent with the module's convention of defining constants before their first use. Moving the three lines above `OpenAILLMParameters` eliminates the forward dependency.
```suggestion
# LiteLLM provider prefixes hoisted to module constants so the same literal is
# not duplicated across `OpenAILLMParameters`, `OpenAICompatibleLLMParameters`,
# and the reasoning-model detector below.
_OPENAI_PROVIDER_PREFIX = "openai/"
_CUSTOM_OPENAI_PROVIDER_PREFIX = "custom_openai/"
_OPENAI_REASONING_MODEL_PATTERN = re.compile(r"^(o1|o3|o4|gpt-5)(?:[-/]|$)")
class OpenAILLMParameters(BaseChatCompletionParameters):
"""See https://docs.litellm.ai/docs/providers/openai/."""
```
### Issue 2 of 2
unstract/sdk1/src/unstract/sdk1/adapters/base1.py:425-426
**Auto-detection silently overrides an explicit `enable_reasoning: false` for known model families**
The PR description explains this is intentional — the schema default for `enable_reasoning` is `false`, so a fresh form submission for `gpt-5` arrives as `{"enable_reasoning": false, ...}` and auto-detection must still fire. However, this means there is no way for a user to opt out of reasoning mode for a model whose name matches the auto-detect pattern (e.g. a vLLM deployment named `gpt-5`). The behavior is not covered by any test, leaving it invisible to future maintainers who may try to "fix" it by adding the same `"enable_reasoning" not in adapter_metadata` guard that was applied to the inference branch above. A focused test for this exact scenario — `{"model": "gpt-5", "enable_reasoning": false}` → reasoning path still active — would both document the intent and prevent accidental regression.
Reviews (5): Last reviewed commit: "[FIX] Preserve reasoning state across re..." | Re-trigger Greptile
- base1.py: read max_tokens for extra_body from the Pydantic-validated dict so `int | None` coercion is applied before the value is forwarded to the upstream API (CodeRabbit). - custom_openai.json: anchor `allOf` `if` branches on `required: [enable_reasoning]` so the conditional does not fire vacuously when the property is absent from the submitted instance (Greptile). - test_openai_compatible_adapter.py: switch the `infers_reasoning_from_effort_field` test to a non-auto-detected model name so the test actually exercises the `has_reasoning_effort` branch instead of passing via auto-detection (Greptile). Also assert the new `required: [enable_reasoning]` clause on both schema branches. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
chandrasekharan-zipstack
left a comment
There was a problem hiding this comment.
LGTM, @pk-zipstack try to make the comments concise and generic to make sense when read independently
- base1.py: hoist `openai/` and `custom_openai/` LiteLLM provider prefixes into `_OPENAI_PROVIDER_PREFIX` / `_CUSTOM_OPENAI_PROVIDER_PREFIX` module constants and use them in `_is_openai_reasoning_model`, `OpenAILLMParameters.validate_model`, and `OpenAICompatibleLLMParameters.validate_model`. Clears Sonar python:S1192 (the new helper pushed each literal past the duplication threshold). - test_openai_compatible_adapter.py: replace bare `== 0.1` float-equality assertions on the Pydantic-default `temperature` with a `pytest.approx(0.1)` constant. Clears Sonar python:S1244. No behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Greptile P1: the inference branch ("if reasoning_effort is present,
infer reasoning enabled") fired even when the user explicitly submitted
`enable_reasoning: false`, silently overriding the opt-out — for
example, when editing an adapter that previously had reasoning on and
still has a leftover `reasoning_effort` in its stored metadata.
Guard the inference branch with `"enable_reasoning" not in
adapter_metadata` so it only fires on a re-validation pass where the
field has been consumed/stripped, never when the user deliberately
opted out. Auto-detection is intentionally NOT gated by this guard:
the schema default for `enable_reasoning` is `false`, so a fresh form
submission for `gpt-5` arrives as `enable_reasoning: false` and the
adapter must still rescue the call.
Adds a regression test using a non-auto-detected model name to isolate
the inference branch from the auto-detect branch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
unstract/sdk1/src/unstract/sdk1/adapters/base1.py (1)
401-440:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPreserve reasoning state when re-validating a reasoning-enabled alias.
On the reasoning path this method strips both
enable_reasoningand top-levelreasoning_effort, leaving onlyextra_body. A latervalidate()on that returned dict for a non-auto-detected alias will miss this guarded inference, drop the existingextra_body, and fall back to normal top-level params. That makes explicitenable_reasoning=Truenon-idempotent across re-validation.Suggested fix
has_reasoning_effort = ( "reasoning_effort" in adapter_metadata and adapter_metadata.get("reasoning_effort") is not None ) + existing_extra_body = adapter_metadata.get("extra_body") + has_reasoning_extra_body = ( + isinstance(existing_extra_body, dict) + and existing_extra_body.get("reasoning_effort") is not None + ) # Infer reasoning only when `enable_reasoning` is ABSENT (e.g. on a # re-validation pass that already stripped the field). Skip the inference # if the user explicitly submitted `enable_reasoning: false` with a # leftover `reasoning_effort` — that's an explicit opt-out, not an # implicit opt-in. if ( not enable_reasoning - and has_reasoning_effort and "enable_reasoning" not in adapter_metadata + and (has_reasoning_effort or has_reasoning_extra_body) ): enable_reasoning = True if not enable_reasoning and _is_openai_reasoning_model(adapter_metadata["model"]): enable_reasoning = True - reasoning_effort = adapter_metadata.get("reasoning_effort") or "medium" + reasoning_effort = ( + adapter_metadata.get("reasoning_effort") + or ( + existing_extra_body.get("reasoning_effort") + if isinstance(existing_extra_body, dict) + else None + ) + or "medium" + ) @@ if enable_reasoning: # Read max_tokens from the validated dict so Pydantic's `int | None` # coercion (e.g. "4096" -> 4096) has already been applied before the # value is forwarded to the upstream API via extra_body. max_tokens = validated.get("max_tokens") + if max_tokens is None and isinstance(existing_extra_body, dict): + max_tokens = existing_extra_body.get("max_completion_tokens") extra_body = {"reasoning_effort": reasoning_effort} if max_tokens is not None: extra_body["max_completion_tokens"] = max_tokens🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@unstract/sdk1/src/unstract/sdk1/adapters/base1.py` around lines 401 - 440, The validation currently strips enable_reasoning and top-level reasoning_effort (moving them into validated["extra_body"]) which makes a later call to this same validation lose the original reasoning flags; to fix, when you detect reasoning is enabled (either via auto-detection or because adapter_metadata originally contained enable_reasoning/reasoning_effort) preserve that state in the validated dict so re-validation is idempotent — e.g., when enable_reasoning is True add enable_reasoning=True and the original reasoning_effort (or ensure extra_body with reasoning_effort is left intact) into validated before returning, and avoid removing both enable_reasoning and reasoning_effort unconditionally; update the logic around adapter_metadata, validated, extra_body, and the pop calls so subsequent OpenAICompatibleLLMParameters(**validation_metadata).model_dump() calls still see the reasoning flags.
🧹 Nitpick comments (1)
unstract/sdk1/tests/test_openai_compatible_adapter.py (1)
257-260: ⚡ Quick winStrengthen the regression contract for explicit disable.
This test should also assert that leftover
reasoning_effortis removed and thattemperatureremains the expected default value, not just present.Suggested test hardening
- assert "extra_body" not in validated - assert validated["max_tokens"] == 1024 - assert "temperature" in validated + assert "extra_body" not in validated + assert "reasoning_effort" not in validated + assert validated["max_tokens"] == 1024 + assert validated["temperature"] == _DEFAULT_TEMPERATURE🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@unstract/sdk1/tests/test_openai_compatible_adapter.py` around lines 257 - 260, Add two stronger assertions to the test: ensure the removed field "reasoning_effort" is not present (assert "reasoning_effort" not in validated) and assert that validated["temperature"] equals the adapter's canonical default rather than merely existing (e.g. assert validated["temperature"] == openai_compatible_adapter.DEFAULT_TEMPERATURE or the appropriate DEFAULT_TEMPERATURE constant used by the code under test). This uses the existing validated variable and the module-level DEFAULT_TEMPERATURE symbol to make the regression contract explicit.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@unstract/sdk1/src/unstract/sdk1/adapters/base1.py`:
- Around line 401-440: The validation currently strips enable_reasoning and
top-level reasoning_effort (moving them into validated["extra_body"]) which
makes a later call to this same validation lose the original reasoning flags; to
fix, when you detect reasoning is enabled (either via auto-detection or because
adapter_metadata originally contained enable_reasoning/reasoning_effort)
preserve that state in the validated dict so re-validation is idempotent — e.g.,
when enable_reasoning is True add enable_reasoning=True and the original
reasoning_effort (or ensure extra_body with reasoning_effort is left intact)
into validated before returning, and avoid removing both enable_reasoning and
reasoning_effort unconditionally; update the logic around adapter_metadata,
validated, extra_body, and the pop calls so subsequent
OpenAICompatibleLLMParameters(**validation_metadata).model_dump() calls still
see the reasoning flags.
---
Nitpick comments:
In `@unstract/sdk1/tests/test_openai_compatible_adapter.py`:
- Around line 257-260: Add two stronger assertions to the test: ensure the
removed field "reasoning_effort" is not present (assert "reasoning_effort" not
in validated) and assert that validated["temperature"] equals the adapter's
canonical default rather than merely existing (e.g. assert
validated["temperature"] == openai_compatible_adapter.DEFAULT_TEMPERATURE or the
appropriate DEFAULT_TEMPERATURE constant used by the code under test). This uses
the existing validated variable and the module-level DEFAULT_TEMPERATURE symbol
to make the regression contract explicit.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 452462e8-eb5e-4b73-9dff-c3d58e631a48
📒 Files selected for processing (2)
unstract/sdk1/src/unstract/sdk1/adapters/base1.pyunstract/sdk1/tests/test_openai_compatible_adapter.py
CodeRabbit (MAJOR): the previous fix moved `reasoning_effort` and `max_completion_tokens` into `extra_body` and stripped both `enable_reasoning` and `reasoning_effort` from the top level. Feeding that output back into `validate()` for a non-auto-detected alias lost the reasoning state entirely — the second pass emitted `temperature` and `max_tokens` and the upstream API rejected the request. Recover the original reasoning fields from `extra_body` on the re- validation pass: - Treat `extra_body.reasoning_effort` as equivalent to a top-level `reasoning_effort` when inferring `enable_reasoning`. - Fall back to `extra_body.reasoning_effort` and `extra_body.max_completion_tokens` when the top-level values are absent. Adds `test_reasoning_state_survives_revalidation_for_custom_alias` that feeds `validate()` output back into `validate()` for a custom alias and asserts the reasoning payload is identical. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@coderabbitai — the MAJOR idempotency finding on Skipping the nitpick about strengthening assertions in |
|
Tip For best results, initiate chat on the files or code changes.
Understood on skipping the assertion-strengthening nitpick — the existing assertions on If you found this review helpful, would you consider giving us a shout-out on X? Thank you for using CodeRabbit! |
|
Test ResultsSummary
Runner Tests - Full Report
SDK1 Tests - Full Report
|



What
OpenAI CompatibleLLM adapter (unstract/sdk1OpenAICompatibleLLMParameters).gpt-5,o1,o3,o4) after stripping the LiteLLM provider prefix.Enable Reasoningopt-in toggle (withreasoning_effortlow/medium/high) for gateway aliases that hide the underlying reasoning model behind a custom name.temperatureandmax_tokensfrom the top-level kwargs and routesreasoning_effort+max_completion_tokensvia LiteLLM'sextra_body.Why
PR #1895 added a dedicated
OpenAI Compatibleadapter that routes through LiteLLM'scustom_openai/provider. That provider is generic and does not apply the reasoning-model parameter transformations that LiteLLM'sopenai/provider does (auto-striptemperature, rewritemax_tokens->max_completion_tokens). The adapter's defaults (temperature=0.1,max_tokens=4096) are forwarded raw and rejected by OpenAI's GPT-5 API:Custom_openaiException - Unsupported value: 'temperature' does not support 0.1 with this model. Only the default (1) value is supported.Custom_openaiException - Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.Result: any user creating an OpenAI Compatible adapter pointed at OpenAI for a reasoning model fails Test Connection.
How
OpenAICompatibleLLMParameters.validate()consults a small regex (_OPENAI_REASONING_MODEL_PATTERN = ^(o1|o3|o4|gpt-5)(?:[-/]|$)) after strippingcustom_openai/and optionalopenai/sub-prefix. Conservative on purpose — catchesgpt-5,gpt-5-mini,gpt-5-2024-12-01,o1,o1-mini,o1-preview,o3,o3-mini,o4-mini, but notgpt-4o/gpt-4o-mini/gpt-3.5-turbo/ arbitrary gateway aliases.enable_reasoningORreasoning_effortalready in metadata from a re-validation pass), the validator:temperaturefrom the validated kwargs (OpenAI gpt-5 only accepts the default1).max_tokensfrom the validated kwargs.extra_body = {"reasoning_effort": <user value or "medium">, "max_completion_tokens": <max_tokens if provided>}. LiteLLM'scustom_openaiprovider forwardsextra_bodyas-is to the upstream endpoint, bypassing its own param rewrites.temperature=0.1andmax_tokensflow through unchanged so existing vLLM / LM Studio / generic-gateway users see no behavior change.custom_openai.jsonexposes theEnable Reasoningboolean (defaultfalse) and conditionalreasoning_effortenum, mirroring the OpenAI adapter's UX.Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
No. The reasoning code path only activates for known OpenAI reasoning model prefixes (
gpt-5,o1,o3,o4) or when the user explicitly togglesEnable Reasoning. Non-reasoning OpenAI models and arbitrary gateway model names hit the original code path withtemperature/max_tokenspreserved exactly as before — verified bytest_openai_compatible_validate_preserves_non_reasoning_modelsandtest_openai_compatible_validate_no_reasoning_unchanged.Database Migrations
None.
Env Config
None.
Relevant Docs
None required.
Related Issues or PRs
Dependencies Versions
None.
Notes on Testing
https://api.openai.com/v1, Modelgpt-5, leaveEnable Reasoningunchecked, click Test Connection -> succeeds (auto-detected).o1/o3-mini-> succeeds.gpt-4o-> succeeds (non-reasoning path, unchanged behavior).my-gateway-aliaspointing at gpt-5 on a proxy -> succeeds.Screenshots
N/A — only adds the
Enable Reasoningtoggle andReasoning Effortdropdown to the existing OpenAI Compatible adapter form.Checklist
I have read and understood the Contribution Guidelines.