
LCORE-1583: Clear reasoning and max_output_tokens in responses#1432

Merged
tisnik merged 1 commit into lightspeed-core:main from asimurka:clear_reasoning_and_max_out_tokens
Mar 30, 2026

Conversation

@asimurka
Contributor

@asimurka asimurka commented Mar 30, 2026

Description

This PR adds explicit clearing of the `reasoning` and `max_output_tokens` attributes on the responses endpoint, because their validation is currently swapped in LLS. The attributes will be supported starting with LCORE 0.6.0.
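The clearing behavior described above can be sketched roughly as follows. This is an illustrative sketch only: the function name, the dict-based request shape, and the log message are assumptions, not the actual implementation in `src/app/endpoints/responses.py`.

```python
import copy
import logging

logger = logging.getLogger(__name__)


def sanitize_responses_request(request: dict) -> dict:
    """Return a copy of the request with unsupported fields cleared.

    `reasoning` and `max_output_tokens` are accepted for OpenResponses
    compatibility but cleared before further processing, with a warning
    logged so clients can see the fields were ignored.
    """
    sanitized = copy.deepcopy(request)
    for field in ("reasoning", "max_output_tokens"):
        if sanitized.get(field) is not None:
            logger.warning(
                "Field %r is not supported until LCORE 0.6.0; clearing it.",
                field,
            )
            sanitized[field] = None
    return sanitized
```

The original request is left untouched; only the returned copy is modified.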

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: N/A

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • Bug Fixes

    • Implemented proper handling for reasoning and max_output_tokens parameters, with warning notifications logged when provided in requests.
  • Documentation

    • Added documentation detailing parameter handling behavior and compatibility information for the responses endpoint.

@asimurka asimurka requested a review from tisnik March 30, 2026 12:27
@coderabbitai
Contributor

coderabbitai bot commented Mar 30, 2026

Walkthrough

The changes add handling for a known Llama Stack (LCORE) issue where reasoning and max_output_tokens parameters are accepted for OpenResponses compatibility but must be cleared before processing. Pre-processing logic is implemented in the responses endpoint handler to clear these fields and log warnings, with corresponding documentation added.

Changes

Cohort / File(s) Summary
Documentation
docs/responses.md
Added note to OpenAPI-attribute request table documenting that reasoning and max_output_tokens are accepted for compatibility but cleared before LCORE processing with a logged warning.
Request Preprocessing
src/app/endpoints/responses.py
Added pre-processing logic in responses_endpoint_handler to detect, log warnings for, and clear reasoning and max_output_tokens fields before the request is deep-copied and processed.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the main changes: clearing reasoning and max_output_tokens in the responses endpoint to address LCORE-1583.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.



Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/responses.md`:
- Around line 99-100: Update the docs to consistently mark the `reasoning`
parameter and `max_output_tokens` as "accepted but ignored until LCORE 0.6.0"
wherever they are referenced: update the standalone "reasoning" section, all
examples that show active behavior, and any descriptions that imply they are
supported to explicitly state they are currently cleared/ignored and a warning
is logged; alternatively remove examples demonstrating active `reasoning`
behavior until LCORE 0.6.0 to avoid client confusion.

In `@src/app/endpoints/responses.py`:
- Around line 162-171: Create and sanitize a copy of the incoming
responses_request instead of mutating the parameter in place: make a shallow
copy (e.g., sanitized_request) at the start of the handler where
responses_request is used, apply the existing checks and warnings to
sanitized_request (clearing reasoning and max_output_tokens there), and then use
or return sanitized_request for downstream processing; update all subsequent
references in this function from responses_request to sanitized_request so the
original argument is not modified.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: d5326af3-ed8a-446f-969a-fc992e5bdff3

📥 Commits

Reviewing files that changed from the base of the PR and between ee0bcf9 and fb50736.

📒 Files selected for processing (2)
  • docs/responses.md
  • src/app/endpoints/responses.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: integration_tests (3.12)
  • GitHub Check: unit_tests (3.13)
  • GitHub Check: Pylinter
  • GitHub Check: mypy
  • GitHub Check: build-pr
  • GitHub Check: unit_tests (3.12)
  • GitHub Check: Pyright
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: E2E: server mode / ci
  • GitHub Check: E2E: library mode / ci
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use absolute imports for internal modules from the Lightspeed Core Stack (e.g., from authentication import get_auth_dependency)

Files:

  • src/app/endpoints/responses.py
src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use FastAPI imports from the fastapi module: from fastapi import APIRouter, HTTPException, Request, status, Depends
Use from llama_stack_client import AsyncLlamaStackClient for Llama Stack client imports
Check constants.py for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
Define type aliases at module level for clarity in Lightspeed Core Stack
All functions require docstrings with brief descriptions following Google Python docstring conventions
Use complete type annotations for function parameters and return types
Use modern union type syntax str | int instead of Union[str, int] for type annotations
Use Optional[Type] for nullable types in type annotations
Use snake_case with descriptive, action-oriented names for functions (e.g., get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead
Use async def for functions performing I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling logic
Use standard log levels appropriately: debug for diagnostic info, info for general execution, warning for unexpected events, error for serious problems
All classes require descriptive docstrings explaining their purpose
Use PascalCase for class names with descriptive names and standard suffixes (Configuration, Error/Exception, Resolver, Interface)
Use ABC (Abstract Base Classes) with @abstractmethod decorators for interface implementations
Use complete type annotations for all class attributes; avoid using Any type
Follow Google Python docstring conventions with Parameters, Returns, Raises, and Attributes sections as needed

Files:

  • src/app/endpoints/responses.py
src/app/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

  • src/app/endpoints/responses.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:39.608Z
Learning: In the lightspeed-stack codebase, src/models/requests.py uses OpenAIResponseInputTool as Tool while src/models/responses.py uses OpenAIResponseTool as Tool. This type difference is intentional - input tools and output/response tools have different schemas in llama-stack-api.

Applied to files:

  • src/app/endpoints/responses.py

@tisnik
Contributor

tisnik commented Mar 30, 2026

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 1 🔵⚪⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ No major issues detected

Contributor

@tisnik tisnik left a comment


LGTM

@tisnik tisnik merged commit 9c9410c into lightspeed-core:main Mar 30, 2026
27 checks passed
