RSPEED-2978: refactor Responses API shared side effects by major · Pull Request #1649 · lightspeed-core/lightspeed-stack

major · 2026-04-30T17:31:41Z

Description

Refactor the Responses API endpoint so streaming and non-streaming handlers reuse the same side-effect helpers for error telemetry, blocked response persistence, previous response continuation, result storage, and completed telemetry. This brings handle_non_streaming_response down to radon A while keeping the behavior exercised through the existing handler tests.

Type of change

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

Assisted-by: Hephaestus (OpenAI GPT-5.5), CodeRabbit CLI
Generated by: N/A

Related Tickets & Documents

Related Issue # RSPEED-2978
Closes #

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

uv run pyright src/app/endpoints/responses.py tests/unit/app/endpoints/test_responses_splunk.py
uv run pytest tests/unit/app/endpoints/test_responses.py tests/unit/app/endpoints/test_responses_splunk.py, 85 passed
uv run make verify
git diff --check origin/main...HEAD
uv run radon cc -s -a src/app/endpoints/responses.py, average A, handle_non_streaming_response A (5)
Local CodeRabbit review found one minor completed-telemetry user_agent finding, fixed in this branch with assertions in test_responses_splunk.py.

Summary by CodeRabbit

Release Notes

Bug Fixes & Improvements
- Enhanced error handling for connection failures and capacity constraints with improved exception mapping
- Improved handling of blocked requests with consistent logging across streaming and non-streaming flows
Tests
- Strengthened telemetry validation tests with enhanced parameter coverage

major · 2026-04-30T17:34:27Z

@tisnik I meant to do this when I added the splunk stuff, but then I got caught up in Summit deliverables. 🤪

coderabbitai · 2026-04-30T17:34:52Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: c9b2f1a9-52c9-4db9-a429-8163ba6c1105

📥 Commits

Reviewing files that changed from the base of the PR and between 0c79c65 and 26ce9b9.

📒 Files selected for processing (2)

src/app/endpoints/responses.py
tests/unit/app/endpoints/test_responses_splunk.py

📜 Recent review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)

GitHub Check: build-pr
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E: server mode / ci / group 1
GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E: library mode / ci / group 2
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: E2E: library mode / ci / group 3

🧰 Additional context used

📓 Path-based instructions (5)

tests/unit/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/unit/**/*.py: Use pytest for all unit and integration tests
Use pytest-mock for AsyncMock objects in unit tests
Use marker pytest.mark.asyncio for async tests

Files:

tests/unit/app/endpoints/test_responses_splunk.py

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Do not use unittest - pytest is the standard testing framework for this project

Files:

tests/unit/app/endpoints/test_responses_splunk.py

src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
All modules start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
Type aliases defined at module level for clarity
Use Final[type] as type hint for all constants
All functions require docstrings with brief descriptions
Complete type annotations for function parameters and return types
Use Union types with modern syntax: str | int
Use Optional[Type] for optional type hints
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters
Use async def for I/O operations and external API calls
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes use ABC with @abstractmethod decorators
Complete type annotations for all class attributes, use specific types, not Any
Follow Google Python docstring conventions for all modules, classes, and functions
Docstring Parameters section documents function parameters
Docstring Returns section documents function return values
Docstring Raises section documents exceptions that may be raised
Use black for code formatting
Use pylint for static analysis with source-roots configuration set to "src"
Use pyright for type checking
Use ruff for fast linting
Use pydocstyle for docstring style validation
Use mypy for additional type checking
Use bandit for security issue detection

Files:

src/app/endpoints/responses.py

src/app/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use FastAPI dependencies: from fastapi import APIRouter, HTTPException, Request, status, Depends

Files:

src/app/endpoints/responses.py

src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

src/app/endpoints/responses.py

🧠 Learnings (1)

📚 Learning: 2026-04-06T20:18:07.852Z

Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.

Applied to files:

src/app/endpoints/responses.py

🔇 Additional comments (16)

src/app/endpoints/responses.py (14)

372-388: Topic-summary generation may still be called for shield-blocked requests.

This was raised in a previous review and marked as addressed, but the current guard at line 385 only checks generate_topic_summary without verifying moderation_result.decision == "passed". Both blocked code paths continue execution to where this helper is called, potentially invoking get_topic_summary() on rejected input.

If the omission is intentional per the parity discussion with query endpoints, consider adding a comment documenting the design decision.

7-16: LGTM!

Import additions align with the new helper functions: Sequence and OpenAIResponseOutput for _append_previous_response_turn, and NoReturn for _raise_response_api_http_exception.

228-251: LGTM!

The helper correctly uses fire_and_forget=True for error telemetry, since FastAPI discards BackgroundTasks on non-2xx responses. Parameters are properly sourced from api_params and context.

254-280: LGTM!

The error mapping logic correctly uses the utility functions from utils/query.py shown in the relevant code snippets. Returning None for unrecognized RuntimeError allows the caller to re-raise the original exception unchanged.

283-303: LGTM!

The NoReturn type hint correctly indicates the function always raises. Exception chaining with from error preserves the traceback context, and telemetry is properly queued before the exception is raised.

306-323: LGTM!

The helper correctly guards on api_params.store and the cast to ShieldModerationBlocked is safe since both call sites verify moderation_result.decision == "blocked" first.

326-348: LGTM!

Correctly uses fire_and_forget=False (default) since shield-blocked requests return a valid 200 response with refusal content, so BackgroundTasks will be executed normally.

351-369: LGTM!

The dual guard condition correctly handles the case where conversation context is being continued from a previous response. The Sequence[OpenAIResponseOutput] type is appropriately specific.

391-421: LGTM!

The helper correctly guards on api_params.store and properly extracts auth tuple components. The empty attachments=[] is appropriate for the Responses API path.

424-454: LGTM!

The guard at lines 440-441 correctly prevents duplicate telemetry—blocked requests emit responses_shield_blocked via _queue_blocked_response_event instead. Token counts and user_agent are properly propagated.

658-682: LGTM!

The streaming handler correctly delegates to the shared helpers for both the blocked path and exception handling. The exception types in the except clause align with what _raise_response_api_http_exception handles.

1035-1040: LGTM!

The guard on latest_response_object before calling _append_previous_response_turn is correct—it prevents attempting to append when no valid response was received.

1067-1082: LGTM!

The ordering is correct—_store_response_query_results executes before _queue_completed_response_event, ensuring telemetry is only emitted after successful persistence. This addresses the previous review concern about ordering consistency.

1099-1182: LGTM!

The non-streaming handler correctly delegates to all shared helpers. The ordering at lines 1169-1182 is correct—_store_response_query_results before _queue_completed_response_event. The simplified user_id = context.auth[0] at line 1099 is appropriate since only the user_id is needed at that point.

tests/unit/app/endpoints/test_responses_splunk.py (2)

425-449: LGTM!

The test correctly verifies user_agent propagation through the non-streaming success path. The assertion at line 449 confirms the parameter reaches _queue_responses_splunk_event.

655-684: LGTM!

The streaming success test mirrors the non-streaming test coverage for user_agent propagation. Both paths are now verified to pass the user agent through to telemetry.

Walkthrough

Refactored responses.py to extract error handling, shield-blocked request persistence, topic-summary generation, response storage, and completion telemetry into reusable private helper functions. Both streaming and non-streaming endpoints now delegate to these consolidated helpers instead of duplicating logic. Tests updated to verify user_agent parameter propagation.

Changes

Error Handling & Telemetry Consolidation

Layer / File(s)	Summary
Helper Function Definitions `src/app/endpoints/responses.py` (lines 228–455)	Introduced `_http_exception_for_response_api_error`, `_raise_response_api_http_exception`, and `_queue_responses_error_event` for centralized exception-to-`HTTPException` mapping and `responses_error` telemetry. Added `_persist_blocked_response_turn` and `_queue_blocked_response_event` for shield-blocked request handling. Added `_append_previous_response_turn`, `_maybe_get_topic_summary`, `_store_response_query_results`, and `_queue_completed_response_event` for response persistence and completion telemetry (suppressed unless moderation passed).
Streaming Flow Refactoring `src/app/endpoints/responses.py` (lines 658–1082)	Updated streaming shield-blocked branch to use `_persist_blocked_response_turn()` and `_queue_blocked_response_event()`. Consolidated streaming exception handling into `_raise_response_api_http_exception(e, api_params, context)`. Replaced inline response appending, topic summary, storage, and completion telemetry with calls to `_append_previous_response_turn()`, `_maybe_get_topic_summary()`, `_store_response_query_results(...)`, and `_queue_completed_response_event(...)`.
Non-Streaming Flow Refactoring `src/app/endpoints/responses.py` (lines 1099–1182)	Simplified `user_id` extraction from tuple unpacking to `context.auth[0]`. Refactored shield-blocked branch to use consolidated helpers. Consolidated exception handling into `_raise_response_api_http_exception(...)`. Replaced inline storage and completion telemetry with calls to shared helpers.
Test Verification `tests/unit/app/endpoints/test_responses_splunk.py` (lines 432–449, 662–684)	Updated `test_non_streaming_success` and `test_streaming_success` to pass explicit `user_agent="test-agent/1.0"` and assert `user_agent` is present in `_queue_responses_splunk_event` call arguments.

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title clearly and specifically summarizes the main change: refactoring shared side effects in the Responses API endpoint, which aligns directly with the primary objective of reducing code duplication between streaming and non-streaming handlers.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

✨ Simplify code

Create PR with simplified code

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Review rate limit: 0/1 reviews remaining, refill in 60 minutes.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/app/endpoints/responses.py`:
- Around line 1168-1181: The completion side effects are run in the wrong order
here—call _store_response_query_results(...) before
_queue_completed_response_event(...) to match generate_response()'s streaming
path; update the sequence in the non-streaming branch so
_store_response_query_results(api_params, context, turn_summary, completed_at,
topic_summary) executes first and only then call
_queue_completed_response_event(api_params, context, turn_summary, completed_at,
output_text) with the same arguments and ordering.
- Around line 384-387: The snippet unconditionally calls get_topic_summary when
context.generate_topic_summary is true, which can trigger another model call for
requests already blocked by the shield; update the guard to also check the
shield-block indicator on the context (e.g., context.shield_blocked or the
existing moderation/shield flag present on Context) and return None if the
request was shield-blocked before calling get_topic_summary(context.input_text,
context.client, api_params.model) so that topic-summary generation is skipped
for rejected inputs.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7c607f27-fb46-4faa-8854-81479813f8cb

📥 Commits

Reviewing files that changed from the base of the PR and between be92b89 and 64670e6.

📒 Files selected for processing (2)

src/app/endpoints/responses.py
tests/unit/app/endpoints/test_responses_splunk.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)

GitHub Check: E2E: library mode / ci / group 2
GitHub Check: E2E: library mode / ci / group 3
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: E2E: server mode / ci / group 1
GitHub Check: build-pr

🧰 Additional context used

📓 Path-based instructions (5)

tests/unit/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/unit/**/*.py: Use pytest for all unit and integration tests
Use pytest-mock for AsyncMock objects in unit tests
Use marker pytest.mark.asyncio for async tests

Files:

tests/unit/app/endpoints/test_responses_splunk.py

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Do not use unittest - pytest is the standard testing framework for this project

Files:

tests/unit/app/endpoints/test_responses_splunk.py

src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
All modules start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
Type aliases defined at module level for clarity
Use Final[type] as type hint for all constants
All functions require docstrings with brief descriptions
Complete type annotations for function parameters and return types
Use Union types with modern syntax: str | int
Use Optional[Type] for optional type hints
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters
Use async def for I/O operations and external API calls
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes use ABC with @abstractmethod decorators
Complete type annotations for all class attributes, use specific types, not Any
Follow Google Python docstring conventions for all modules, classes, and functions
Docstring Parameters section documents function parameters
Docstring Returns section documents function return values
Docstring Raises section documents exceptions that may be raised
Use black for code formatting
Use pylint for static analysis with source-roots configuration set to "src"
Use pyright for type checking
Use ruff for fast linting
Use pydocstyle for docstring style validation
Use mypy for additional type checking
Use bandit for security issue detection

Files:

src/app/endpoints/responses.py

src/app/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use FastAPI dependencies: from fastapi import APIRouter, HTTPException, Request, status, Depends

Files:

src/app/endpoints/responses.py

src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

src/app/endpoints/responses.py

🧠 Learnings (2)

📚 Learning: 2026-04-29T15:45:22.854Z

Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-29T15:45:22.854Z
Learning: Applies to src/app/**/*.py : Use FastAPI dependencies: `from fastapi import APIRouter, HTTPException, Request, status, Depends`

Applied to files:

src/app/endpoints/responses.py

📚 Learning: 2026-04-06T20:18:07.852Z

Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.

Applied to files:

src/app/endpoints/responses.py

🔇 Additional comments (1)

src/app/endpoints/responses.py (1)
298-300: This review comment is based on an incorrect understanding of Python exception re-raise semantics.

The verification script demonstrates that raise error (raising by name) in Python 3.12+ actually preserves the full exception traceback, including both the original exception site and the re-raise site. A bare raise (without the exception variable) would show less traceback context, not more.

Additionally, the function has no internal except block, making a bare raise syntactically impossible in this context. The current raise error is appropriate for re-raising unmapped exceptions unchanged, and the code correctly uses raise http_exception from error to establish explicit chaining for mapped exceptions.
			> Likely an incorrect or invalid review comment.

asimurka

Great work with refactoring in overall.

For discussion: The endpoint has become quite long. We should consider creating separate module for all the utility functions dedicated to this endpoint to keep it lightweight. Anyway, it is not necessary for this PR.

Signed-off-by: Major Hayden <major@redhat.com>

asimurka · 2026-05-05T06:01:15Z

/ok-to-test

tisnik

LGTM

coderabbitai Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread src/app/endpoints/responses.py

Comment thread src/app/endpoints/responses.py Outdated

major force-pushed the refactor/responses-shared-side-effects branch from 64670e6 to 21779f4 Compare April 30, 2026 18:04

tisnik requested a review from asimurka May 4, 2026 07:17

asimurka requested changes May 4, 2026

View reviewed changes

Comment thread src/app/endpoints/responses.py Outdated

Comment thread src/app/endpoints/responses.py Outdated

refactor(responses): add shared response helpers

6782625

Signed-off-by: Major Hayden <major@redhat.com>

major force-pushed the refactor/responses-shared-side-effects branch from 21779f4 to 0c79c65 Compare May 4, 2026 11:53

major requested a review from asimurka May 4, 2026 18:04

refactor(responses): reuse shared response helpers

26ce9b9

Signed-off-by: Major Hayden <major@redhat.com>

major force-pushed the refactor/responses-shared-side-effects branch from 0c79c65 to 26ce9b9 Compare May 4, 2026 18:32

asimurka approved these changes May 5, 2026

View reviewed changes

openshift-ci Bot added the ok-to-test label May 5, 2026

tisnik approved these changes May 5, 2026

View reviewed changes

tisnik merged commit fa19cbd into lightspeed-core:main May 5, 2026
33 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RSPEED-2978: refactor Responses API shared side effects#1649

RSPEED-2978: refactor Responses API shared side effects#1649
tisnik merged 2 commits intolightspeed-core:mainfrom
major:refactor/responses-shared-side-effects

major commented Apr 30, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

major commented Apr 30, 2026

Uh oh!

coderabbitai Bot commented Apr 30, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

asimurka left a comment

Uh oh!

Uh oh!

Uh oh!

asimurka commented May 5, 2026

Uh oh!

tisnik left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

major commented Apr 30, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Tools used to create PR

Related Tickets & Documents

Checklist before requesting a review

Testing

Summary by CodeRabbit

Release Notes

Uh oh!

major commented Apr 30, 2026

Uh oh!

coderabbitai Bot commented Apr 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

asimurka left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

asimurka commented May 5, 2026

Uh oh!

tisnik left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

major commented Apr 30, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 30, 2026 •

edited

Loading