UN-3478 [FIX] Support OpenAI gpt-5 / o-series in openai-compatible adapter by pk-zipstack · Pull Request #1983 · Zipstack/unstract

pk-zipstack · 2026-05-22T08:26:22Z

What

Adds OpenAI reasoning-model handling to the OpenAI Compatible LLM adapter (unstract/sdk1 OpenAICompatibleLLMParameters).
Auto-detects known OpenAI reasoning families (gpt-5, o1, o3, o4) after stripping the LiteLLM provider prefix.
Adds an Enable Reasoning opt-in toggle (with reasoning_effort low/medium/high) for gateway aliases that hide the underlying reasoning model behind a custom name.
When the reasoning path is active: drops temperature and max_tokens from the top-level kwargs and routes reasoning_effort + max_completion_tokens via LiteLLM's extra_body.

Why

PR #1895 added a dedicated OpenAI Compatible adapter that routes through LiteLLM's custom_openai/ provider. That provider is generic and does not apply the reasoning-model parameter transformations that LiteLLM's openai/ provider does (auto-strip temperature, rewrite max_tokens -> max_completion_tokens). The adapter's defaults (temperature=0.1, max_tokens=4096) are forwarded raw and rejected by OpenAI's GPT-5 API:

Custom_openaiException - Unsupported value: 'temperature' does not support 0.1 with this model. Only the default (1) value is supported.
Custom_openaiException - Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.

Result: any user creating an OpenAI Compatible adapter pointed at OpenAI for a reasoning model fails Test Connection.

How

OpenAICompatibleLLMParameters.validate() consults a small regex (_OPENAI_REASONING_MODEL_PATTERN = ^(o1|o3|o4|gpt-5)(?:[-/]|$)) after stripping custom_openai/ and optional openai/ sub-prefix. Conservative on purpose — catches gpt-5, gpt-5-mini, gpt-5-2024-12-01, o1, o1-mini, o1-preview, o3, o3-mini, o4-mini, but not gpt-4o / gpt-4o-mini / gpt-3.5-turbo / arbitrary gateway aliases.
When the reasoning path activates (auto-detect OR explicit enable_reasoning OR reasoning_effort already in metadata from a re-validation pass), the validator:
- Drops temperature from the validated kwargs (OpenAI gpt-5 only accepts the default 1).
- Drops max_tokens from the validated kwargs.
- Sets extra_body = {"reasoning_effort": <user value or "medium">, "max_completion_tokens": <max_tokens if provided>}. LiteLLM's custom_openai provider forwards extra_body as-is to the upstream endpoint, bypassing its own param rewrites.
The non-reasoning path is byte-identical to before — temperature=0.1 and max_tokens flow through unchanged so existing vLLM / LM Studio / generic-gateway users see no behavior change.
custom_openai.json exposes the Enable Reasoning boolean (default false) and conditional reasoning_effort enum, mirroring the OpenAI adapter's UX.
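The detection step described above can be sketched as follows. This is a minimal illustration, not the adapter's actual code: the regex is copied from the description, while the constant and function names here are stand-ins chosen for the example.

```python
import re

# Assumed names for illustration; the pattern itself is quoted from the PR description.
CUSTOM_OPENAI_PREFIX = "custom_openai/"
OPENAI_PREFIX = "openai/"
REASONING_MODEL_PATTERN = re.compile(r"^(o1|o3|o4|gpt-5)(?:[-/]|$)")


def is_openai_reasoning_model(model: str) -> bool:
    """Return True when the bare model name starts with a known reasoning family."""
    name = model
    # Strip the LiteLLM provider prefix, then an optional openai/ sub-prefix.
    if name.startswith(CUSTOM_OPENAI_PREFIX):
        name = name[len(CUSTOM_OPENAI_PREFIX):]
    if name.startswith(OPENAI_PREFIX):
        name = name[len(OPENAI_PREFIX):]
    # The family name must be followed by "-", "/", or the end of the string,
    # so "gpt-4o" and arbitrary gateway aliases do not match.
    return REASONING_MODEL_PATTERN.match(name) is not None


for candidate in ("gpt-5", "custom_openai/openai/o3-mini", "gpt-4o", "my-gateway-alias"):
    print(candidate, is_openai_reasoning_model(candidate))
```

The trailing `(?:[-/]|$)` group is what keeps the match conservative: a family name only counts when it is the whole name or is followed by a separator.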

Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

No. The reasoning code path only activates for known OpenAI reasoning model prefixes (gpt-5, o1, o3, o4) or when the user explicitly toggles Enable Reasoning. Non-reasoning OpenAI models and arbitrary gateway model names hit the original code path with temperature / max_tokens preserved exactly as before — verified by test_openai_compatible_validate_preserves_non_reasoning_models and test_openai_compatible_validate_no_reasoning_unchanged.
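A rough sketch of the two paths, under the assumption that the routing reduces to a plain dict transformation. `route_params` is a hypothetical helper written for this example and is not part of the adapter:

```python
from typing import Any


def route_params(params: dict[str, Any], reasoning: bool) -> dict[str, Any]:
    """Hypothetical helper: shape the kwargs for the upstream call."""
    out = dict(params)
    if not reasoning:
        # Non-reasoning path: temperature and max_tokens pass through untouched.
        return out
    # Reasoning path: drop the parameters the upstream API rejects ...
    out.pop("temperature", None)
    max_tokens = out.pop("max_tokens", None)
    # ... and carry the reasoning settings in extra_body instead.
    extra_body: dict[str, Any] = {
        "reasoning_effort": out.pop("reasoning_effort", None) or "medium"
    }
    if max_tokens is not None:
        extra_body["max_completion_tokens"] = max_tokens
    out["extra_body"] = extra_body
    return out


defaults = {"model": "custom_openai/gpt-4o", "temperature": 0.1, "max_tokens": 4096}
print(route_params(defaults, reasoning=False))
print(route_params({**defaults, "model": "custom_openai/gpt-5"}, reasoning=True))
```

The non-reasoning call returns an equal copy of its input, which is the property the two regression tests named above are meant to pin down.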

Database Migrations

None.

Env Config

None.

Relevant Docs

None required.

Related Issues or PRs

Jira: UN-3478
Follow-up to #1895 ("Add a dedicated OpenAI-compatible LLM adapter"), which introduced the OpenAI Compatible adapter.

Dependencies Versions

None.

Notes on Testing

Add LLM adapter -> OpenAI Compatible, API Base https://api.openai.com/v1, Model gpt-5, leave Enable Reasoning unchecked, click Test Connection -> succeeds (auto-detected).
Same as above with model o1 / o3-mini -> succeeds.
Same as above with model gpt-4o -> succeeds (non-reasoning path, unchanged behavior).
Enable Reasoning on, model my-gateway-alias pointing at gpt-5 on a proxy -> succeeds.
Existing vLLM / LM Studio adapter with a non-reasoning model -> Test Connection still works exactly as before.

Screenshots

N/A — only adds the Enable Reasoning toggle and Reasoning Effort dropdown to the existing OpenAI Compatible adapter form.

Checklist

I have read and understood the Contribution Guidelines.

OpenAI gpt-5 / o1 / o3 routed through the openai-compatible adapter were rejected by the upstream API with "temperature does not support 0.1" and "max_tokens not supported, use max_completion_tokens" because LiteLLM's `custom_openai` provider does not apply the reasoning-model parameter transformations that the `openai` provider does. Add an `Enable Reasoning` toggle (matching the OpenAI adapter pattern) that drops `temperature` and `max_tokens` from the top-level kwargs and routes `reasoning_effort` and `max_completion_tokens` via LiteLLM's `extra_body` (which is forwarded as-is to the upstream API). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The Enable Reasoning toggle alone is not enough: users dropping a gpt-5 or o-series model name into the adapter will not know they need to flip a switch, so the upstream API still returns `temperature does not support 0.1` / `max_tokens is not supported`. Pattern-match OpenAI's known reasoning families (gpt-5, o1, o3, o4) after stripping the `custom_openai/` / `openai/` prefixes and route the request through the reasoning code path automatically. Keep the `enable_reasoning` opt-in for gateway aliases that hide the underlying reasoning model behind a custom name. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-22T08:26:38Z

Walkthrough

Adds OpenAI reasoning-model detection, a schema toggle to enable reasoning, reasoning-aware validation that routes reasoning parameters into extra_body (mapping max_tokens to max_completion_tokens), and tests covering these behaviors.

Changes

OpenAI Reasoning Model Support

Layer / File(s)	Summary
Model detection and provider constants `unstract/sdk1/src/unstract/sdk1/adapters/base1.py`	Hoists `_OPENAI_PROVIDER_PREFIX` and `_CUSTOM_OPENAI_PROVIDER_PREFIX`, adds `_OPENAI_REASONING_MODEL_PATTERN` and `_is_openai_reasoning_model()`, and updates model formatting to use hoisted constants.
Parameters and validation `unstract/sdk1/src/unstract/sdk1/adapters/base1.py`	Adds `OpenAICompatibleLLMParameters.reasoning_effort` and rewrites `validate()` to compute `enable_reasoning` (flag, model detection, or presence of effort), exclude/route fields accordingly, and map `max_tokens` to `extra_body.max_completion_tokens` when reasoning is enabled.
Schema: enable_reasoning toggle `unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json`	Updates `max_tokens` docs and adds `enable_reasoning: boolean` plus `allOf` conditional rules requiring `reasoning_effort` (`"low"
Validation test suite `unstract/sdk1/tests/test_openai_compatible_adapter.py`	Adds schema exposure test and eight validation cases: auto-detection routing, non-reasoning preservation, explicit enablement routing, omission of derived tokens when unset, inference from `reasoning_effort`, explicit disable regression, and unchanged non-reasoning behavior; introduces `_DEFAULT_TEMPERATURE` helper.

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 7.14% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly identifies the specific feature and fix (OpenAI gpt-5/o-series reasoning support) and the component being modified (openai-compatible adapter), directly reflecting the main change.
Description check	✅ Passed	The description is comprehensive and complete, covering all template sections including What, Why, How, backward compatibility assurance, related issues, testing notes, and contribution guidelines acknowledgment.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

for more information, see https://pre-commit.ci

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/adapters/base1.py`:
- Around line 405-420: The code currently uses the raw adapter_metadata
max_tokens when building extra_body for reasoning, risking an unvalidated value
leaking into the payload; instead, after validating adapter_metadata via
OpenAICompatibleLLMParameters (the validated dict), pull the validated
max_tokens from validated (e.g., validated.get("max_tokens")) and use that value
when setting extra_body["max_completion_tokens"] (only add it when not None),
keeping other logic (reasoning_effort, enable_reasoning, extra_body assignment)
the same.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 448295e2-7dca-4140-8cef-be840aba9ade

📥 Commits

Reviewing files that changed from the base of the PR and between ff50bd1 and 67f9dec.

📒 Files selected for processing (3)

unstract/sdk1/src/unstract/sdk1/adapters/base1.py
unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json
unstract/sdk1/tests/test_openai_compatible_adapter.py

greptile-apps · 2026-05-22T08:31:15Z

Greptile Summary

This PR fixes a compatibility issue with OpenAI reasoning models (gpt-5, o1, o3, o4) routed through LiteLLM's custom_openai/ provider, which does not auto-strip temperature or rewrite max_tokens → max_completion_tokens the way the openai/ provider does. The fix auto-detects known reasoning model families and reroutes the incompatible parameters via extra_body.

base1.py: Adds _is_openai_reasoning_model helper and rewrites OpenAICompatibleLLMParameters.validate() to strip temperature/max_tokens and inject extra_body for the reasoning path; introduces three new module-level constants (_OPENAI_PROVIDER_PREFIX, _CUSTOM_OPENAI_PROVIDER_PREFIX, _OPENAI_REASONING_MODEL_PATTERN).
custom_openai.json: Adds the enable_reasoning toggle and a conditional reasoning_effort dropdown using JSON Schema allOf/if/then; both if-branches correctly anchor on \"required\": [\"enable_reasoning\"] following prior review fixes.
test_openai_compatible_adapter.py: Adds eleven focused tests covering auto-detection, explicit opt-in/out, re-validation idempotency, and regressions from prior review findings.

Confidence Score: 5/5

Safe to merge — the reasoning path is isolated behind model-name detection and an explicit toggle, and the non-reasoning path is byte-identical to before.

The change is well-scoped: all existing callers with non-reasoning models hit the original code path unchanged, and the reasoning branch is guarded by both name-pattern detection and an explicit opt-in toggle. Prior review feedback was addressed thoroughly (inference guard, JSON Schema anchor, test isolation). The two observations here are both organisational or documentation concerns with no effect on correctness.

No files require special attention; the constant-ordering note in base1.py is cosmetic.

Important Files Changed

Filename	Overview
unstract/sdk1/src/unstract/sdk1/adapters/base1.py	Core reasoning-model logic added to OpenAICompatibleLLMParameters.validate(); module constants defining provider prefixes and regex pattern are placed after OpenAILLMParameters which already references _OPENAI_PROVIDER_PREFIX — works at runtime but unusual ordering.
unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json	Adds enable_reasoning toggle and conditional reasoning_effort field via JSON Schema allOf/if/then; both if-branches correctly include required:[enable_reasoning] following prior review fixes.
unstract/sdk1/tests/test_openai_compatible_adapter.py	Comprehensive new tests for reasoning auto-detection, explicit opt-in/out, re-validation idempotency, and schema structure; addresses all prior review feedback with correctly isolated test scenarios.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[validate called with adapter_metadata] --> B[validate_model: add custom_openai/ prefix]
    B --> C{enable_reasoning explicitly True?}
    C -- Yes --> G[reasoning_path = True]
    C -- No --> D{enable_reasoning absent AND reasoning_effort or extra_body.reasoning_effort present?}
    D -- Yes --> G
    D -- No --> E{_is_openai_reasoning_model: gpt-5, o1, o3, o4?}
    E -- Yes --> G
    E -- No --> F[Non-reasoning path]
    F --> F1[Pydantic validates params]
    F1 --> F2[Pop reasoning_effort]
    F2 --> Z[Return validated dict with temperature + max_tokens]
    G --> H[Pydantic validates params excluding enable_reasoning]
    H --> I[Build extra_body with reasoning_effort + max_completion_tokens]
    I --> J[Pop temperature, max_tokens, reasoning_effort from top-level]
    J --> K[Return validated dict with extra_body only]
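The decision order in the flowchart can be approximated in a few lines. This is a sketch under assumptions, not the code in base1.py: the function names are invented for the example and the prefix handling is simplified.

```python
import re
from typing import Any

# Pattern quoted from the PR description; helper names below are hypothetical.
_PATTERN = re.compile(r"^(o1|o3|o4|gpt-5)(?:[-/]|$)")


def _bare_model(model: str) -> str:
    """Strip the custom_openai/ prefix and an optional openai/ sub-prefix."""
    for prefix in ("custom_openai/", "openai/"):
        if model.startswith(prefix):
            model = model[len(prefix):]
    return model


def use_reasoning_path(meta: dict[str, Any]) -> bool:
    # 1. Explicit opt-in always wins.
    if meta.get("enable_reasoning") is True:
        return True
    # 2. Infer from a leftover reasoning_effort, but only when the toggle is
    #    absent, so an explicit enable_reasoning=False is not overridden here.
    extra_body = meta.get("extra_body")
    has_effort = meta.get("reasoning_effort") is not None or (
        isinstance(extra_body, dict)
        and extra_body.get("reasoning_effort") is not None
    )
    if "enable_reasoning" not in meta and has_effort:
        return True
    # 3. Auto-detect known families; deliberately not gated by the toggle.
    return _PATTERN.match(_bare_model(meta["model"])) is not None


print(use_reasoning_path({"model": "gpt-5", "enable_reasoning": False}))
print(use_reasoning_path({"model": "my-alias", "enable_reasoning": False, "reasoning_effort": "high"}))
```

The two printed cases are the ones that are easy to confuse: auto-detection still fires for `gpt-5` with the toggle off, while a leftover `reasoning_effort` on an unknown alias does not.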

Prompt To Fix All With AI

Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
unstract/sdk1/src/unstract/sdk1/adapters/base1.py:286-287
The three module constants (`_OPENAI_PROVIDER_PREFIX`, `_CUSTOM_OPENAI_PROVIDER_PREFIX`, `_OPENAI_REASONING_MODEL_PATTERN`) are defined at line 352, but `OpenAILLMParameters.validate_model` — defined earlier at line 340 — already references `_OPENAI_PROVIDER_PREFIX`. Python resolves the name at call time, so there is no runtime error, but the forward-reference is easy to miss during maintenance and is inconsistent with the module's convention of defining constants before their first use. Moving the three lines above `OpenAILLMParameters` eliminates the forward dependency.

```suggestion
# LiteLLM provider prefixes hoisted to module constants so the same literal is
# not duplicated across `OpenAILLMParameters`, `OpenAICompatibleLLMParameters`,
# and the reasoning-model detector below.
_OPENAI_PROVIDER_PREFIX = "openai/"
_CUSTOM_OPENAI_PROVIDER_PREFIX = "custom_openai/"
_OPENAI_REASONING_MODEL_PATTERN = re.compile(r"^(o1|o3|o4|gpt-5)(?:[-/]|$)")

class OpenAILLMParameters(BaseChatCompletionParameters):
    """See https://docs.litellm.ai/docs/providers/openai/."""
```
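The call-time name resolution that Issue 1 relies on can be seen in a small standalone sketch. The names are hypothetical and unrelated to the real module:

```python
def add_prefix(model: str) -> str:
    # PREFIX is looked up in the module globals when the function runs,
    # not when the function is defined.
    return model if model.startswith(PREFIX) else PREFIX + model


# Defined after the function that uses it; this still works at call time.
PREFIX = "openai/"

print(add_prefix("gpt-4o"))
```

Calling `add_prefix` before the `PREFIX` assignment has executed would raise `NameError`, which is why defining constants before their first use is the safer convention.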

### Issue 2 of 2
unstract/sdk1/src/unstract/sdk1/adapters/base1.py:425-426
**Auto-detection silently overrides an explicit `enable_reasoning: false` for known model families**

The PR description explains this is intentional — the schema default for `enable_reasoning` is `false`, so a fresh form submission for `gpt-5` arrives as `{"enable_reasoning": false, ...}` and auto-detection must still fire. However, this means there is no way for a user to opt out of reasoning mode for a model whose name matches the auto-detect pattern (e.g. a vLLM deployment named `gpt-5`). The behavior is not covered by any test, leaving it invisible to future maintainers who may try to "fix" it by adding the same `"enable_reasoning" not in adapter_metadata` guard that was applied to the inference branch above. A focused test for this exact scenario — `{"model": "gpt-5", "enable_reasoning": false}` → reasoning path still active — would both document the intent and prevent accidental regression.

Reviews (5): Last reviewed commit: "[FIX] Preserve reasoning state across re..."

- base1.py: read max_tokens for extra_body from the Pydantic-validated dict so `int | None` coercion is applied before the value is forwarded to the upstream API (CodeRabbit).
- custom_openai.json: anchor `allOf` `if` branches on `required: [enable_reasoning]` so the conditional does not fire vacuously when the property is absent from the submitted instance (Greptile).
- test_openai_compatible_adapter.py: switch the `infers_reasoning_from_effort_field` test to a non-auto-detected model name so the test actually exercises the `has_reasoning_effort` branch instead of passing via auto-detection (Greptile). Also assert the new `required: [enable_reasoning]` clause on both schema branches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
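Why an unanchored `if` fires vacuously: in JSON Schema, `properties` only constrains keys that are present in the instance. The sketch below hand-rolls a tiny evaluator for just `required` and `properties` with `const`. It is not a JSON Schema library, and the `if` shape shown is an assumption about what the schema's condition looks like.

```python
from typing import Any


def matches(instance: dict[str, Any], schema: dict[str, Any]) -> bool:
    """Evaluate a tiny JSON Schema subset: `required` and `properties` with `const`."""
    for key in schema.get("required", []):
        if key not in instance:
            return False
    for key, sub in schema.get("properties", {}).items():
        # `properties` says nothing about keys missing from the instance.
        if key in instance and "const" in sub and instance[key] != sub["const"]:
            return False
    return True


# Assumed shape of the condition, before and after the anchor was added.
unanchored_if = {"properties": {"enable_reasoning": {"const": True}}}
anchored_if = {**unanchored_if, "required": ["enable_reasoning"]}

print(matches({}, unanchored_if))  # toggle absent, condition still matches
print(matches({}, anchored_if))    # toggle absent, condition no longer matches
```

With the `required` anchor, the `then` branch only applies when the submitted instance actually carries the toggle.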

chandrasekharan-zipstack

LGTM. @pk-zipstack, try to keep the comments concise and generic so they make sense when read on their own.

- base1.py: hoist `openai/` and `custom_openai/` LiteLLM provider prefixes into `_OPENAI_PROVIDER_PREFIX` / `_CUSTOM_OPENAI_PROVIDER_PREFIX` module constants and use them in `_is_openai_reasoning_model`, `OpenAILLMParameters.validate_model`, and `OpenAICompatibleLLMParameters.validate_model`. Clears Sonar python:S1192 (the new helper pushed each literal past the duplication threshold).
- test_openai_compatible_adapter.py: replace bare `== 0.1` float-equality assertions on the Pydantic-default `temperature` with a `pytest.approx(0.1)` constant. Clears Sonar python:S1244.

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Greptile P1: the inference branch ("if reasoning_effort is present, infer reasoning enabled") fired even when the user explicitly submitted `enable_reasoning: false`, silently overriding the opt-out — for example, when editing an adapter that previously had reasoning on and still has a leftover `reasoning_effort` in its stored metadata. Guard the inference branch with `"enable_reasoning" not in adapter_metadata` so it only fires on a re-validation pass where the field has been consumed/stripped, never when the user deliberately opted out. Auto-detection is intentionally NOT gated by this guard: the schema default for `enable_reasoning` is `false`, so a fresh form submission for `gpt-5` arrives as `enable_reasoning: false` and the adapter must still rescue the call. Adds a regression test using a non-auto-detected model name to isolate the inference branch from the auto-detect branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

unstract/sdk1/src/unstract/sdk1/adapters/base1.py (1)

401-440: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve reasoning state when re-validating a reasoning-enabled alias.

On the reasoning path this method strips both enable_reasoning and top-level reasoning_effort, leaving only extra_body. A later validate() on that returned dict for a non-auto-detected alias will miss this guarded inference, drop the existing extra_body, and fall back to normal top-level params. That makes explicit enable_reasoning=True non-idempotent across re-validation.

Suggested fix

         has_reasoning_effort = (
             "reasoning_effort" in adapter_metadata
             and adapter_metadata.get("reasoning_effort") is not None
         )
+        existing_extra_body = adapter_metadata.get("extra_body")
+        has_reasoning_extra_body = (
+            isinstance(existing_extra_body, dict)
+            and existing_extra_body.get("reasoning_effort") is not None
+        )
         # Infer reasoning only when `enable_reasoning` is ABSENT (e.g. on a
         # re-validation pass that already stripped the field). Skip the inference
         # if the user explicitly submitted `enable_reasoning: false` with a
         # leftover `reasoning_effort` — that's an explicit opt-out, not an
         # implicit opt-in.
         if (
             not enable_reasoning
-            and has_reasoning_effort
             and "enable_reasoning" not in adapter_metadata
+            and (has_reasoning_effort or has_reasoning_extra_body)
         ):
             enable_reasoning = True
         if not enable_reasoning and _is_openai_reasoning_model(adapter_metadata["model"]):
             enable_reasoning = True

-        reasoning_effort = adapter_metadata.get("reasoning_effort") or "medium"
+        reasoning_effort = (
+            adapter_metadata.get("reasoning_effort")
+            or (
+                existing_extra_body.get("reasoning_effort")
+                if isinstance(existing_extra_body, dict)
+                else None
+            )
+            or "medium"
+        )
@@
         if enable_reasoning:
             # Read max_tokens from the validated dict so Pydantic's `int | None`
             # coercion (e.g. "4096" -> 4096) has already been applied before the
             # value is forwarded to the upstream API via extra_body.
             max_tokens = validated.get("max_tokens")
+            if max_tokens is None and isinstance(existing_extra_body, dict):
+                max_tokens = existing_extra_body.get("max_completion_tokens")
             extra_body = {"reasoning_effort": reasoning_effort}
             if max_tokens is not None:
                 extra_body["max_completion_tokens"] = max_tokens

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@unstract/sdk1/src/unstract/sdk1/adapters/base1.py` around lines 401 - 440,
The validation currently strips enable_reasoning and top-level reasoning_effort
(moving them into validated["extra_body"]) which makes a later call to this same
validation lose the original reasoning flags; to fix, when you detect reasoning
is enabled (either via auto-detection or because adapter_metadata originally
contained enable_reasoning/reasoning_effort) preserve that state in the
validated dict so re-validation is idempotent — e.g., when enable_reasoning is
True add enable_reasoning=True and the original reasoning_effort (or ensure
extra_body with reasoning_effort is left intact) into validated before
returning, and avoid removing both enable_reasoning and reasoning_effort
unconditionally; update the logic around adapter_metadata, validated,
extra_body, and the pop calls so subsequent
OpenAICompatibleLLMParameters(**validation_metadata).model_dump() calls still
see the reasoning flags.

🧹 Nitpick comments (1)

unstract/sdk1/tests/test_openai_compatible_adapter.py (1)

257-260: ⚡ Quick win

Strengthen the regression contract for explicit disable.

This test should also assert that leftover reasoning_effort is removed and that temperature remains the expected default value, not just present.

Suggested test hardening

-    assert "extra_body" not in validated
-    assert validated["max_tokens"] == 1024
-    assert "temperature" in validated
+    assert "extra_body" not in validated
+    assert "reasoning_effort" not in validated
+    assert validated["max_tokens"] == 1024
+    assert validated["temperature"] == _DEFAULT_TEMPERATURE

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@unstract/sdk1/tests/test_openai_compatible_adapter.py` around lines 257 -
260, Add two stronger assertions to the test: ensure the removed field
"reasoning_effort" is not present (assert "reasoning_effort" not in validated)
and assert that validated["temperature"] equals the adapter's canonical default
rather than merely existing (e.g. assert validated["temperature"] ==
openai_compatible_adapter.DEFAULT_TEMPERATURE or the appropriate
DEFAULT_TEMPERATURE constant used by the code under test). This uses the
existing validated variable and the module-level DEFAULT_TEMPERATURE symbol to
make the regression contract explicit.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 452462e8-eb5e-4b73-9dff-c3d58e631a48

📥 Commits

Reviewing files that changed from the base of the PR and between 4cf0e47 and fbd7177.

📒 Files selected for processing (2)

unstract/sdk1/src/unstract/sdk1/adapters/base1.py
unstract/sdk1/tests/test_openai_compatible_adapter.py

CodeRabbit (MAJOR): the previous fix moved `reasoning_effort` and `max_completion_tokens` into `extra_body` and stripped both `enable_reasoning` and `reasoning_effort` from the top level. Feeding that output back into `validate()` for a non-auto-detected alias lost the reasoning state entirely: the second pass emitted `temperature` and `max_tokens` and the upstream API rejected the request.

Recover the original reasoning fields from `extra_body` on the re-validation pass:
- Treat `extra_body.reasoning_effort` as equivalent to a top-level `reasoning_effort` when inferring `enable_reasoning`.
- Fall back to `extra_body.reasoning_effort` and `extra_body.max_completion_tokens` when the top-level values are absent.

Adds `test_reasoning_state_survives_revalidation_for_custom_alias` that feeds `validate()` output back into `validate()` for a custom alias and asserts the reasoning payload is identical.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
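The idempotency contract can be illustrated with a simplified stand-in for `validate()`. This sketch assumes a custom alias, so auto-detection is left out, and it models only the fields discussed in this commit. The real method also runs Pydantic validation.

```python
from typing import Any


def validate(meta: dict[str, Any]) -> dict[str, Any]:
    """Simplified stand-in: infer the reasoning state, then rebuild extra_body."""
    existing = meta.get("extra_body")
    existing = existing if isinstance(existing, dict) else {}
    # Top-level value first, then whatever a previous pass stored in extra_body.
    effort = meta.get("reasoning_effort") or existing.get("reasoning_effort")
    reasoning = bool(meta.get("enable_reasoning")) or (
        "enable_reasoning" not in meta and effort is not None
    )
    out: dict[str, Any] = {"model": meta["model"]}
    if not reasoning:
        out["temperature"] = meta.get("temperature", 0.1)
        if meta.get("max_tokens") is not None:
            out["max_tokens"] = meta["max_tokens"]
        return out
    max_tokens = meta.get("max_tokens")
    if max_tokens is None:
        max_tokens = existing.get("max_completion_tokens")
    extra_body: dict[str, Any] = {"reasoning_effort": effort or "medium"}
    if max_tokens is not None:
        extra_body["max_completion_tokens"] = max_tokens
    out["extra_body"] = extra_body
    return out


first = validate({
    "model": "custom_openai/my-gateway-alias",
    "enable_reasoning": True,
    "reasoning_effort": "high",
    "max_tokens": 2048,
})
second = validate(first)  # feed the output back in, as a re-validation would
print(first == second)
```

Without the two `existing` fallbacks, the second pass would see neither the toggle nor a top-level `reasoning_effort` and would fall through to the non-reasoning branch.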

pk-zipstack · 2026-05-22T09:15:19Z

@coderabbitai — the MAJOR idempotency finding on base1.py:401-440 is fixed in 6c2a574. validate() now reads existing_extra_body = adapter_metadata.get('extra_body') before the Pydantic dump and uses extra_body.reasoning_effort / extra_body.max_completion_tokens as fallbacks when the top-level fields are absent. New test_reasoning_state_survives_revalidation_for_custom_alias feeds the first validate() output back into validate() for a non-auto-detected alias and asserts the reasoning payload is identical across passes.

Skipping the nitpick about strengthening assertions in test_explicit_disable_overrides_leftover_reasoning_effort — the regression intent is captured by the existing assertions (no extra_body, max_tokens preserved, temperature present), and per the reviewer-batching policy I'm not bundling nice-to-haves.

coderabbitai · 2026-05-22T09:15:44Z

@pk-zipstack: The idempotency fix in 6c2a574b looks well-targeted — reading existing_extra_body before the Pydantic dump and using it as a fallback cleanly handles the case where top-level reasoning_effort/max_completion_tokens have already been promoted into extra_body on a prior pass. The new test_reasoning_state_survives_revalidation_for_custom_alias directly pins the contract that multiple validate() passes produce identical payloads, which is exactly the right regression anchor for this scenario.

Understood on skipping the assertion-strengthening nitpick — the existing assertions on test_explicit_disable_overrides_leftover_reasoning_effort do cover the regression intent adequately.

sonarqubecloud · 2026-05-22T09:16:46Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

github-actions · 2026-05-22T09:18:19Z

Test Results

Summary

✅ Runner Tests: 11 passed, 0 failed (11 total)
✅ SDK1 Tests: 343 passed, 0 failed (343 total)

Runner Tests - Full Report

filepath	function	passed	SUBTOTAL
runner/src/unstract/runner/clients/test_docker.py	test_logs	1	1
runner/src/unstract/runner/clients/test_docker.py	test_cleanup	1	1
runner/src/unstract/runner/clients/test_docker.py	test_cleanup_skip	1	1
runner/src/unstract/runner/clients/test_docker.py	test_client_init	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image_exists	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_container_run_config	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_container_run_config_without_mount	1	1
runner/src/unstract/runner/clients/test_docker.py	test_run_container	1	1
runner/src/unstract/runner/clients/test_docker.py	test_get_image_for_sidecar	1	1
runner/src/unstract/runner/clients/test_docker.py	test_sidecar_container	1	1
TOTAL		11	11

SDK1 Tests - Full Report

pk-zipstack and others added 2 commits May 22, 2026 12:45

[pre-commit.ci] auto fixes from pre-commit.com hooks

67f9dec

for more information, see https://pre-commit.ci

coderabbitai Bot reviewed May 22, 2026

Comment thread unstract/sdk1/src/unstract/sdk1/adapters/base1.py Outdated

greptile-apps Bot reviewed May 22, 2026

Comment thread unstract/sdk1/tests/test_openai_compatible_adapter.py

Comment thread unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json

pk-zipstack and others added 2 commits May 22, 2026 14:05

[pre-commit.ci] auto fixes from pre-commit.com hooks

4bcc1a9

for more information, see https://pre-commit.ci

chandrasekharan-zipstack approved these changes May 22, 2026

chandrasekharan-zipstack mentioned this pull request May 22, 2026

[MISC] Handle reasoning models in OpenAI-compatible LLM adapter #1984

Closed

pk-zipstack and others added 2 commits May 22, 2026 14:15

[pre-commit.ci] auto fixes from pre-commit.com hooks

4cf0e47

for more information, see https://pre-commit.ci

greptile-apps Bot reviewed May 22, 2026

Comment thread unstract/sdk1/src/unstract/sdk1/adapters/base1.py Outdated

pk-zipstack and others added 2 commits May 22, 2026 14:25

[pre-commit.ci] auto fixes from pre-commit.com hooks

17a021c

for more information, see https://pre-commit.ci

coderabbitai Bot reviewed May 22, 2026

vishnuszipstack approved these changes May 22, 2026

chandrasekharan-zipstack merged commit c2fb931 into main May 22, 2026
6 checks passed

chandrasekharan-zipstack deleted the fix/openai-compatible-gpt5 branch May 22, 2026 09:17

chandrasekharan-zipstack mentioned this pull request May 22, 2026

[MISC] Fix zero cost tracking for OpenAI-compatible adapter #1985

Merged

Conversation

coderabbitai Bot commented May 22, 2026

Reviews paused

Walkthrough

Changes

❌ Failed checks (1 warning)

greptile-apps Bot commented May 22, 2026

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

sonarqubecloud Bot commented May 22, 2026

Quality Gate passed

github-actions Bot commented May 22, 2026

Test Results

3 participants
