LCORE-1583: Clear reasoning and max_output_tokens in responses #1432
Conversation
Walkthrough: The changes add handling for a known Llama Stack (LCORE) issue: the `reasoning` and `max_output_tokens` parameters of the responses endpoint are cleared before the request is forwarded, and a warning is logged when either is provided. The parameters will be supported from LCORE 0.6.0.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes. 🚥 Pre-merge checks: ✅ 3 passed.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/responses.md`:
- Around line 99-100: Update the docs to consistently mark the `reasoning`
parameter and `max_output_tokens` as "accepted but ignored until LCORE 0.6.0"
wherever they are referenced: update the standalone "reasoning" section, all
examples that show active behavior, and any descriptions that imply they are
supported to explicitly state they are currently cleared/ignored and a warning
is logged; alternatively remove examples demonstrating active `reasoning`
behavior until LCORE 0.6.0 to avoid client confusion.
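The suggested documentation fix could look like the following sketch (the section name and exact wording are illustrative, not taken from the actual `docs/responses.md`):

```markdown
### reasoning

> **Note:** `reasoning` and `max_output_tokens` are accepted but ignored until
> LCORE 0.6.0. Values supplied for either field are cleared before the request
> is forwarded to Llama Stack, and a warning is logged.
```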
In `@src/app/endpoints/responses.py`:
- Around line 162-171: Create and sanitize a copy of the incoming
responses_request instead of mutating the parameter in place: make a shallow
copy (e.g., sanitized_request) at the start of the handler where
responses_request is used, apply the existing checks and warnings to
sanitized_request (clearing reasoning and max_output_tokens there), and then use
or return sanitized_request for downstream processing; update all subsequent
references in this function from responses_request to sanitized_request so the
original argument is not modified.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: d5326af3-ed8a-446f-969a-fc992e5bdff3
📒 Files selected for processing (2)
- docs/responses.md
- src/app/endpoints/responses.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: integration_tests (3.12)
- GitHub Check: unit_tests (3.13)
- GitHub Check: Pylinter
- GitHub Check: mypy
- GitHub Check: build-pr
- GitHub Check: unit_tests (3.12)
- GitHub Check: Pyright
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: E2E: server mode / ci
- GitHub Check: E2E: library mode / ci
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use absolute imports for internal modules from the Lightspeed Core Stack (e.g.,
from authentication import get_auth_dependency)
Files:
src/app/endpoints/responses.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use FastAPI imports from the fastapi module: `from fastapi import APIRouter, HTTPException, Request, status, Depends`
Use `from llama_stack_client import AsyncLlamaStackClient` for Llama Stack client imports
Check `constants.py` for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use `logger = get_logger(__name__)` from `log.py` for module logging
Define type aliases at module level for clarity in Lightspeed Core Stack
All functions require docstrings with brief descriptions following Google Python docstring conventions
Use complete type annotations for function parameters and return types
Use modern union type syntax `str | int` instead of `Union[str, int]` for type annotations
Use `Optional[Type]` for nullable types in type annotations
Use snake_case with descriptive, action-oriented names for functions (e.g., get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead
Use `async def` for functions performing I/O operations and external API calls
Handle `APIConnectionError` from Llama Stack in error handling logic
Use standard log levels appropriately: debug for diagnostic info, info for general execution, warning for unexpected events, error for serious problems
All classes require descriptive docstrings explaining their purpose
Use PascalCase for class names with descriptive names and standard suffixes (Configuration, Error/Exception, Resolver, Interface)
Use ABC (Abstract Base Classes) with `@abstractmethod` decorators for interface implementations
Use complete type annotations for all class attributes; avoid using `Any` type
Follow Google Python docstring conventions with Parameters, Returns, Raises, and Attributes sections as needed
Files:
src/app/endpoints/responses.py
src/app/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling
Files:
src/app/endpoints/responses.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:39.608Z
Learning: In the lightspeed-stack codebase, src/models/requests.py uses OpenAIResponseInputTool as Tool while src/models/responses.py uses OpenAIResponseTool as Tool. This type difference is intentional - input tools and output/response tools have different schemas in llama-stack-api.
📚 Learning: 2026-02-25T07:46:39.608Z (same learning as above, from asimurka in PR 1211)
Applied to files:
src/app/endpoints/responses.py
PR Reviewer Guide 🔍: Here are some key observations to aid the review process:
Description
This PR adds explicit clearing of the `reasoning` and `max_output_tokens` attributes of the `responses` endpoint due to their swapped validation in LLS. The attributes will be supported from LCORE 0.6.0.

Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
Bug Fixes
- `reasoning` and `max_output_tokens` parameters are now cleared, with warning notifications logged when provided in requests.

Documentation