
RSPEED-2885: filter internal details from /v1/responses stream #1524

Merged
tisnik merged 7 commits into lightspeed-core:main from Lifto:feat/rspeed-2885-filter-responses-internals on Apr 20, 2026

Conversation

@Lifto (Contributor) commented Apr 16, 2026

Summary

Prevent internal implementation details from leaking through the /v1/responses SSE stream to clients.

What Was Leaking

The /v1/responses endpoint was including the following in every SSE response object:

  1. instructions — the full server-side system prompt (~3KB of guardrails, RHEL lifecycle dates, formatting rules, search instructions)
  2. tools — server-deployed MCP tool definitions including server_url, server_label, auth headers, and full tool schemas
  3. model — full provider path like google-vertex/publishers/google/models/gemini-2.5-flash instead of a clean model name

Additionally, MCP streaming events (response.output_item.added, response.mcp_call.*, response.mcp_list_tools.*, response.mcp_approval_request.*) were only filtered when the X-LCS-Merge-Server-Tools: true header was present, leaking them in the default case.

What This PR Does

  • Strips instructions from all response objects in the SSE stream via pop("instructions", None)
  • Sanitizes model by extracting the last path segment after / (e.g. gemini-2.5-flash)
  • Strips server-deployed MCP tool definitions from the tools array in response objects; client-provided tools are preserved
  • Makes MCP event filtering unconditional: configured_mcp_labels is now always populated from configuration.mcp_servers, so server-deployed MCP streaming events are always filtered regardless of the X-LCS-Merge-Server-Tools header

The X-LCS-Merge-Server-Tools header continues to control tool merging behavior (resolve_client_tool_choice vs resolve_tool_choice) unchanged.

All sanitization is done via a new _sanitize_response_dict() helper, applied in both the streaming and non-streaming response paths.
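The sanitization steps above can be sketched as a minimal, self-contained stand-in. The function name below is hypothetical; the real helper is _sanitize_response_dict in src/app/endpoints/responses.py, and its actual implementation appears in the review thread.

```python
from typing import Any


def sanitize_sketch(
    response_dict: dict[str, Any], configured_mcp_labels: set[str]
) -> None:
    """Simplified sketch of the sanitization steps listed above (in-place)."""
    # 1. Strip the server-side system prompt.
    response_dict.pop("instructions", None)
    # 2. Truncate deep provider paths (2+ slashes) to the last segment.
    model = response_dict.get("model")
    if isinstance(model, str) and model.count("/") > 1:
        response_dict["model"] = model.rsplit("/", 1)[-1]
    # 3. Drop server-deployed MCP tool definitions; keep client tools.
    tools = response_dict.get("tools")
    if tools is not None:
        response_dict["tools"] = [
            t for t in tools if t.get("server_label") not in configured_mcp_labels
        ]
```

For example, a dict with model "google-vertex/publishers/google/models/gemini-2.5-flash" comes out with model "gemini-2.5-flash", no instructions key, and only tools whose server_label is not configured server-side.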

Testing

Added 13 new unit tests:

  • TestSanitizeResponseDict (11 tests) — unit tests for each sanitization behavior
  • TestMcpEventsFilteredUnconditionally (2 tests) — integration tests verifying unconditional MCP event filtering

Jira

RSPEED-2885

Summary by CodeRabbit

  • New Features

    • Responses now replace server-applied system instructions with a literal placeholder and filter out configured server-side tools/events from responses.
  • Bug Fixes

    • Sanitization consistently applied to streamed chunks and final responses to prevent exposure of substituted instructions or server-only tools.
  • Documentation

    • Updated docs to explain placeholder behavior and server tool filtering.
  • Tests

    • Added unit and integration tests for sanitization and event/tool filtering.

@coderabbitai Bot (Contributor) commented Apr 16, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds a new constant and a helper that can replace substituted instructions with a placeholder and remove server-deployed MCP tools from serialized response dicts; adds tests exercising that helper and updates documentation to reflect the new placeholder and tool-filtering behavior.

Changes

  • Responses helper (src/app/endpoints/responses.py): Introduces _sanitize_response_dict(response_dict, configured_mcp_labels, instructions_substituted=False), which mutates a serialized response dict in-place: it replaces instructions with SUBSTITUTED_INSTRUCTIONS_PLACEHOLDER when instructions_substituted=True, and filters tools by removing entries whose server_label is in configured_mcp_labels. No other invocation points are shown in the provided diff.
  • Constants (src/constants.py): Adds SUBSTITUTED_INSTRUCTIONS_PLACEHOLDER = "<server prompt applied>".
  • Tests (tests/unit/app/endpoints/test_responses.py): Adds TestSanitizeResponseDict unit tests for instructions placeholder behavior and tool filtering (handles missing fields, empty configured labels); adds async tests (TestMcpEventsFilteredUnconditionally) covering streaming event filtering when MCP servers are configured vs. not configured; updates imports to include the new helper/constant/types.
  • Docs (docs/responses.md): Updates "System Prompt Resolution" to state that when the server substitutes the system prompt, the response instructions field is replaced with the literal "<server prompt applied>", and that server-deployed MCP tool definitions are filtered out of response tools while client-provided tools are preserved.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks: ✅ 3 passed
  • Description Check: ✅ Passed (check skipped; CodeRabbit's high-level summary is enabled)
  • Title Check: ✅ Passed. The title clearly and specifically describes the main change: filtering internal details from the /v1/responses stream, which aligns with the PR's core objective of sanitizing response objects to prevent implementation details from reaching clients.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 88.89%, which meets the required 80.00% threshold.


@Lifto Lifto force-pushed the feat/rspeed-2885-filter-responses-internals branch from 5b571d9 to 31279ad Compare April 16, 2026 18:18
@tisnik tisnik requested a review from asimurka April 16, 2026 18:39
@tisnik (Contributor) commented Apr 16, 2026

@coderabbitai full review

@coderabbitai Bot (Contributor) commented Apr 16, 2026

✅ Actions performed

Full review triggered.

@tisnik (Contributor) commented Apr 16, 2026

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 PR contains tests
🔒 Security concerns: No
✅ Followed all instructions

⚡ No major issues detected

@coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/app/endpoints/responses.py (2)

675-699: ⚠️ Potential issue | 🟠 Major

Keep persisted turn summaries aligned with the sanitized response.

The client payload is now sanitized unconditionally, but turn_summary is still built from the raw response_object and remains gated by filter_server_tools. With the header absent, server-deployed MCP calls disappear from the returned response yet still flow into turn_summary/store_query_results, so stored state no longer matches what the client actually saw.

Also applies to: 758-764, 921-950

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/responses.py` around lines 675 - 699, The persisted turn
summary is still derived from the raw response_object while the outgoing chunk
was sanitized (via configured_mcp_labels, server_mcp_output_indices,
_should_filter_mcp_chunk and _sanitize_response_dict), so stored state can
diverge from what the client saw; update the code paths that build turn_summary
and call store_query_results (the places around the response handling where
response_object / filter_server_tools are used) to use the same sanitized
chunk_dict/response (or re-run the same filtering and _sanitize_response_dict
logic) before constructing turn_summary and persisting results so persisted
summaries match the sanitized client payload.

675-699: ⚠️ Potential issue | 🔴 Critical

Apply the same sanitization to the shield-blocked SSE path.

These lines only sanitize chunks that flow through response_generator(). When moderation blocks a streaming request, handle_streaming_response() switches to shield_violation_generator(), which still serializes request.echoed_params() directly into response.created and response.completed. That means blocked SSE responses can still leak instructions, merged server MCP tools, and unsanitized model IDs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/responses.py` around lines 675 - 699, Shielded/blocked SSE
responses from shield_violation_generator are not being sanitized, so ensure the
same sanitization applied in response_generator runs for the shield path: where
handle_streaming_response/ shield_violation_generator serializes
request.echoed_params() into response.created and response.completed, call
_sanitize_response_dict(...) with the same configured_mcp_labels (and preserve
setting response["conversation"]=normalized_conv_id) and update any model
IDs/instructions/mcp tool lists before sending; reuse the existing helpers
(_should_filter_mcp_chunk, _sanitize_response_dict, configured_mcp_labels,
server_mcp_output_indices, normalized_conv_id) to perform this sanitization so
blocked SSE chunks cannot leak instructions, merged server MCP tools, or raw
model IDs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/app/endpoints/responses.py`:
- Around line 531-537: The current filter removes any tool whose server_label
matches configured_mcp_labels, which can accidentally drop client-supplied
tools; change the predicate in the response_dict "tools" cleanup so it only
excludes tools that both have server_label in configured_mcp_labels and are
explicitly server-injected (e.g., check a marker like tool.get("mcp_injected")
or tool.get("source") == "server"). Update the logic around the tools variable
and response_dict["tools"] so client-provided tools without the injected marker
are preserved while real injected MCP tools are still filtered.
- Around line 501-537: The helper _sanitize_response_dict currently mutates the
input response_dict; change it to be pure by returning a sanitized copy instead:
create a shallow copy of response_dict at the start (e.g., new =
dict(response_dict)), remove "instructions" from the copy, transform the "model"
value on the copy if it's a string with more than one "/", and build a new
"tools" list by filtering and copying each tool dict so callers are not mutated;
update the function signature to return dict[str, Any] and adjust all callers of
_sanitize_response_dict to use the returned sanitized dict instead of relying on
in-place modification.

---

Outside diff comments:
In `@src/app/endpoints/responses.py`:
- Around line 675-699: The persisted turn summary is still derived from the raw
response_object while the outgoing chunk was sanitized (via
configured_mcp_labels, server_mcp_output_indices, _should_filter_mcp_chunk and
_sanitize_response_dict), so stored state can diverge from what the client saw;
update the code paths that build turn_summary and call store_query_results (the
places around the response handling where response_object / filter_server_tools
are used) to use the same sanitized chunk_dict/response (or re-run the same
filtering and _sanitize_response_dict logic) before constructing turn_summary
and persisting results so persisted summaries match the sanitized client
payload.
- Around line 675-699: Shielded/blocked SSE responses from
shield_violation_generator are not being sanitized, so ensure the same
sanitization applied in response_generator runs for the shield path: where
handle_streaming_response/ shield_violation_generator serializes
request.echoed_params() into response.created and response.completed, call
_sanitize_response_dict(...) with the same configured_mcp_labels (and preserve
setting response["conversation"]=normalized_conv_id) and update any model
IDs/instructions/mcp tool lists before sending; reuse the existing helpers
(_should_filter_mcp_chunk, _sanitize_response_dict, configured_mcp_labels,
server_mcp_output_indices, normalized_conv_id) to perform this sanitization so
blocked SSE chunks cannot leak instructions, merged server MCP tools, or raw
model IDs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2c1847a9-af28-472f-85be-df3128ccba58

📥 Commits

Reviewing files that changed from the base of the PR and between 0f95aca and 31279ad.

📒 Files selected for processing (2)
  • src/app/endpoints/responses.py
  • tests/unit/app/endpoints/test_responses.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (4)
src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules (e.g., from authentication import get_auth_dependency)
Use from llama_stack_client import AsyncLlamaStackClient for Llama Stack imports
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Use complete type annotations for function parameters and return types
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional types in type annotations
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use complete type annotations for all class attributes; avoid using Any
Follow Google Python docstring conventions for all modules, classes, and functions
Include Parameters:, Returns:, Raises: sections in function docstrings as needed

Files:

  • src/app/endpoints/responses.py
src/app/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/app/**/*.py: Use from fastapi import APIRouter, HTTPException, Request, status, Depends for FastAPI dependencies
Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

  • src/app/endpoints/responses.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async unit tests

Files:

  • tests/unit/app/endpoints/test_responses.py
tests/unit/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use pytest-mock for AsyncMock objects in unit tests

Files:

  • tests/unit/app/endpoints/test_responses.py
🧠 Learnings (2)
📚 Learning: 2026-02-25T07:46:39.608Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:39.608Z
Learning: In the lightspeed-stack codebase, src/models/requests.py uses OpenAIResponseInputTool as Tool while src/models/responses.py uses OpenAIResponseTool as Tool. This type difference is intentional - input tools and output/response tools have different schemas in llama-stack-api.

Applied to files:

  • src/app/endpoints/responses.py
📚 Learning: 2026-04-06T20:18:07.852Z
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.

Applied to files:

  • src/app/endpoints/responses.py

Comment on lines +501 to +537
```python
def _sanitize_response_dict(
    response_dict: dict[str, Any],
    configured_mcp_labels: set[str],
) -> None:
    """Sanitize a serialized response object in-place to remove internal details.

    Strips fields that expose server-side implementation details from the
    response object before it is forwarded to the client:

    - ``instructions``: the server-side system prompt
    - ``model``: deep provider paths (2+ slashes) are truncated to the
      last segment (e.g.
      ``google-vertex/publishers/google/models/gemini`` → ``gemini``);
      simple ``provider/model`` strings are left intact
    - ``tools``: server-deployed MCP tool definitions are removed; client-
      provided tools (those whose ``server_label`` is not in
      ``configured_mcp_labels``) are preserved

    Args:
        response_dict: Mutable dict produced by ``model_dump`` on a response
            object. Modified in-place.
        configured_mcp_labels: Set of ``server_label`` values that identify
            server-deployed MCP servers.
    """
    response_dict.pop("instructions", None)

    model = response_dict.get("model")
    if isinstance(model, str) and model.count("/") > 1:
        response_dict["model"] = model.rsplit("/", 1)[-1]

    tools = response_dict.get("tools")
    if tools is not None:
        response_dict["tools"] = [
            tool
            for tool in tools
            if tool.get("server_label") not in configured_mcp_labels
        ]
```
@coderabbitai Bot (Contributor) commented Apr 16, 2026


🛠️ Refactor suggestion | 🟠 Major

Return a sanitized copy instead of mutating the input dict.

Making this helper pure will avoid hidden caller side effects and make the two response paths easier to reason about.

As per coding guidelines "Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/responses.py` around lines 501 - 537, The helper
_sanitize_response_dict currently mutates the input response_dict; change it to
be pure by returning a sanitized copy instead: create a shallow copy of
response_dict at the start (e.g., new = dict(response_dict)), remove
"instructions" from the copy, transform the "model" value on the copy if it's a
string with more than one "/", and build a new "tools" list by filtering and
copying each tool dict so callers are not mutated; update the function signature
to return dict[str, Any] and adjust all callers of _sanitize_response_dict to
use the returned sanitized dict instead of relying on in-place modification.

@Lifto (Contributor Author) replied:

In-place mutation is intentional here. The dict is created by model_dump() on the line immediately before the call — it is a throwaway serialization dict, not a shared data structure. Returning a copy would allocate a new dict for every SSE chunk with zero safety benefit, since nothing else holds a reference to the original.

The AGENTS.md guideline about avoiding in-place modification targets a different anti-pattern: functions that silently mutate a caller's long-lived data structure via a parameter. That does not apply here — the dict is born, sanitized, serialized to JSON, and discarded within the same async for iteration.

@coderabbitai Bot (Contributor) replied:

@Lifto, that's a well-reasoned clarification — you're right. The guideline is aimed at functions that silently mutate a caller's long-lived/shared data structure through a parameter reference, which is a genuine source of bugs. Here the dict is born from model_dump(), passed straight into _sanitize_response_dict, serialized, and discarded — there's no aliasing risk and no wasted allocation. The in-place mutation is appropriate and intentional. I'll withdraw the suggestion.

(ᵔᴥᵔ) 🐇✏️


✏️ Learnings added
Learnt from: Lifto
Repo: lightspeed-core/lightspeed-stack PR: 1524
File: src/app/endpoints/responses.py:501-529
Timestamp: 2026-04-16T19:08:26.366Z
Learning: In `src/app/endpoints/responses.py`, `_sanitize_response_dict(response_dict, configured_mcp_labels)` intentionally mutates `response_dict` in-place. The dict is always a fresh throwaway produced by `model_dump()` on the immediately preceding line (both in the streaming `async for` loop and in the non-streaming path); no other reference to it exists. The AGENTS.md guideline "avoid in-place parameter modification" applies to mutating a caller's long-lived/shared data structures, not to ephemeral serialization dicts. Do not flag this as an anti-pattern in future reviews.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to src/**/*.py : Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters

Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.

Comment thread src/app/endpoints/responses.py Outdated
Prevent internal implementation details from leaking through the /v1/responses
SSE stream:

- Strip 'instructions' field (system prompt) from response objects
- Sanitize 'model' field by removing provider prefix
  (e.g. google-vertex/publishers/google/models/gemini → gemini)
- Strip server-deployed MCP tool definitions from the 'tools' array
  in response objects (client-provided tools are preserved)
- Make MCP streaming event filtering unconditional — mcp_call,
  mcp_list_tools, and mcp_approval_request events for server-deployed
  tools are now always filtered, not only when X-LCS-Merge-Server-Tools
  header is present

The X-LCS-Merge-Server-Tools header continues to control tool merging
behavior (resolve_client_tool_choice vs resolve_tool_choice) unchanged.

Fixes: https://redhat.atlassian.net/browse/RSPEED-2885

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
@Lifto Lifto force-pushed the feat/rspeed-2885-filter-responses-internals branch from 31279ad to da6ae27 Compare April 16, 2026 18:53
@Lifto (Contributor Author) commented Apr 16, 2026

Replies to CodeRabbit review comments

Comment 3 — "Keep persisted turn summaries aligned with the sanitized response"

This divergence is intentional. The sanitization controls what the client sees — we strip implementation details from the wire format. The server-side turn summary, metrics, and conversation storage intentionally retain the full context (including server-deployed MCP tool calls) because that data is needed for debugging, audit trails, and accurate token accounting. Aligning the persisted state with the sanitized client view would lose information the server needs.

The filter_server_tools flag in build_turn_summary (from PR #1435) already handles the case where the client explicitly requests tool merging — in that mode, the summary is also filtered. Our change does not alter that behavior.

Comment 4 — "Apply sanitization to shield-blocked SSE path"

This is a valid observation. When moderation blocks a request, shield_violation_generator() builds a synthetic response object using **echoed_params from the request. The echoed_params() method (in src/models/requests.py) dumps _ECHOED_FIELDS, which includes "instructions". Because responses_endpoint_handler overwrites responses_request.instructions with get_system_prompt() at line 182 before the moderation check, the shield response will echo back the server-side system prompt.

This is a pre-existing leak that exists on main today, independent of this PR. Our diff does not touch shield_violation_generator(), echoed_params(), or _ECHOED_FIELDS. We could demonstrate this leak with a test against the current main branch:

```python
@pytest.mark.asyncio
async def test_shield_blocked_response_leaks_instructions(mocker, minimal_config):
    """Shield-blocked responses echo server-set instructions via echoed_params."""
    # 1. Create request, let responses_endpoint_handler set instructions via get_system_prompt()
    # 2. Trigger moderation block (mock moderation_result.decision = "blocked")
    # 3. Collect SSE output from shield_violation_generator()
    # 4. Assert "instructions" IS present in response.created / response.completed
    #    → This test PASSES on main, proving the leak is pre-existing
```

That said — this PR makes the leak more visible by establishing the pattern that instructions should be stripped from client-facing responses. Now that response_generator() strips it, it is more obvious that shield_violation_generator() does not.

We think the right fix is to either remove "instructions" from _ECHOED_FIELDS in requests.py, or call _sanitize_response_dict() on the shield response dict before serialization. But that is a separate change to code outside this diff. We will file a follow-up ticket for it.

@Lifto (Contributor Author) commented Apr 16, 2026

Follow-up filed: RSPEED-2889 — shield-blocked /v1/responses SSE leaks server-side system prompt via echoed_params. Pre-existing bug in the moderation path, out of scope for this PR.

@Lifto (Contributor Author) commented Apr 16, 2026

Correction: follow-up ticket moved to the correct project. LCORE-1851 — shield-blocked /v1/responses SSE leaks server-side system prompt via echoed_params. Pre-existing bug in the moderation path, out of scope for this PR. (RSPEED-2889 closed as duplicate.)

@dprince (Contributor) commented Apr 17, 2026

@Lifto is this for rhel-lightspeed? Initially I think I had the filtering always on, so this seems reasonable, but I would like to review. Also curious: if this is using Goose, are you not using X-LCS-Merge-Server-Tools? (That would leave Goose extensions mostly useless.)

@asimurka (Contributor) left a comment

@Lifto Tools filtering is reasonable, as currently client tools are leaked in the tools attribute.
I agree that echoing huge system prompts is unnecessary. However, removing them from the response entirely can be surprising, as echoing is native Responses API behavior.
Wdyt about echoing a fixed-length prefix of the system prompt, with a predefined constant (e.g. 100) in constants.py?
Anyway, please add a short comment to docs/responses.md to document this (there is already a section dedicated to system prompt/instructions resolution). Thanks.

@jrobertboos (Contributor) left a comment

100% agree with @asimurka. Also, I was wondering about some style changes and whether they are necessary for this PR.

Comment thread src/app/endpoints/responses.py Outdated
```python
responses_request.tool_choice,
auth[1],
mcp_headers,
request.headers,
```

is there a reason behind this change?

Comment thread src/app/endpoints/responses.py Outdated
```python
responses_request.tool_choice,
auth[1],
mcp_headers,
request.headers,
```

ditto

@Lifto (Contributor Author) commented Apr 17, 2026

@dprince Yes, this is for rhel-lightspeed — the filtering prevents leaking the server-side system prompt (instructions) and server-deployed MCP tool definitions to the client.

On the Goose question: the filtering doesn't disable server tools, it hides the internal events from the client's SSE stream. When X-LCS-Merge-Server-Tools: true is set, the server still merges its tools, the model still calls them, and the server still executes them internally — the model's response still benefits from the server tools. The filtering just strips the server tool definitions and call events from what the client sees, since the client doesn't execute those anyway.

Goose's own extensions (client-provided tools) are unaffected — their server_label doesn't match configured_mcp_labels, so they pass through the filter.

The two concerns are complementary: X-LCS-Merge-Server-Tools controls what the model has access to; the filtering controls what the client sees.

Assisted by Claude Opus 4.6

When the server substitutes its own system prompt, the response now
sets instructions to '<server prompt applied>' instead of stripping
the field entirely. Client-provided instructions are echoed back
unchanged, preserving native Responses API behavior.

Adds documentation to docs/responses.md covering the filtering
behavior for instructions and server-deployed MCP tools.

RSPEED-2885

Signed-off-by: Ellis Low <elow@redhat.com>
@Lifto (Contributor Author) commented Apr 17, 2026

@asimurka Thanks for the feedback. Updated the approach — instead of stripping instructions entirely or echoing a truncated prefix:

  • When the server substituted its own system prompt (client sent None), the response sets instructions to the placeholder <server prompt applied>. This avoids leaking any part of the server prompt while still signaling that a prompt was applied.
  • When the client provides their own instructions and per-request customization is allowed, they are echoed back unchanged.

This preserves native Responses API behavior for clients that send instructions, while preventing the leak when the server injects its own.

Also added documentation to docs/responses.md in the existing System Prompt Resolution section.

Assisted by Claude Opus 4.6

Comment thread src/app/endpoints/responses.py Outdated
def _sanitize_response_dict(
response_dict: dict[str, Any],
configured_mcp_labels: set[str],
instructions_substituted: bool,
Contributor

This could probably have a default value of False.

Comment thread src/app/endpoints/responses.py Outdated
Comment on lines +542 to +548
tools = response_dict.get("tools")
if tools is not None:
response_dict["tools"] = [
tool
for tool in tools
if tool.get("server_label") not in configured_mcp_labels
]
Contributor

Nit: this could be simplified a bit.

Suggested change
tools = response_dict.get("tools")
if tools is not None:
response_dict["tools"] = [
tool
for tool in tools
if tool.get("server_label") not in configured_mcp_labels
]
if tools := response_dict.get("tools"):
response_dict["tools"] = [
tool
for tool in tools
if tool.get("server_label") not in configured_mcp_labels
]

- Revert unintended tuple unpacking style change (jrobertboos)
- Add default value for instructions_substituted param (samdoran)
- Use walrus operator for tools filtering (samdoran)

RSPEED-2885

Signed-off-by: Ellis Low <elow@redhat.com>
@Lifto
Contributor Author

Lifto commented Apr 17, 2026

@jrobertboos Unintended style change — not from the formatter. Reverted.

@samdoran Both nits addressed: added default value for instructions_substituted, and switched to walrus operator for tools filtering.

Assisted by Claude Opus 4.6

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/app/endpoints/responses.py`:
- Around line 500-537: The _sanitize_response_dict helper is defined but never
  used; call it on the serialized response in both streaming and non-streaming
  code paths:
  • Streaming path: after building chunk_dict and before yielding (i.e., where
    chunk_dict["response"] is set, alongside the existing
    _should_filter_mcp_chunk/_is_server_mcp_output_item checks), invoke
    _sanitize_response_dict(chunk_dict["response"], configured_mcp_labels,
    instructions_substituted=...).
  • Non-streaming path: after api_response_dict =
    api_response.model_dump(exclude_none=True) and before translating tool IDs,
    call _sanitize_response_dict(api_response_dict, configured_mcp_labels,
    instructions_substituted=...).
  • Also wire the correct boolean for instructions_substituted (set it where
    the server decides to substitute instructions and pass that variable into
    both calls) so SUBSTITUTED_INSTRUCTIONS_PLACEHOLDER is applied when
    appropriate.
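A self-contained sketch of the streaming-path wiring described here, with simplified stand-ins for the real helper and chunk shapes (the PR's helper and event handling are more involved):

```python
import asyncio
from typing import Any, AsyncIterator

SUBSTITUTED_INSTRUCTIONS_PLACEHOLDER = "<server prompt applied>"


def _sanitize_response_dict(
    response_dict: dict[str, Any],
    configured_mcp_labels: set[str],
    instructions_substituted: bool = False,
) -> None:
    """Simplified stand-in: hide server internals from a response dict."""
    if instructions_substituted:
        response_dict["instructions"] = SUBSTITUTED_INSTRUCTIONS_PLACEHOLDER
    if tools := response_dict.get("tools"):
        response_dict["tools"] = [
            t for t in tools
            if t.get("server_label") not in configured_mcp_labels
        ]


async def stream_filtered(
    upstream: AsyncIterator[dict[str, Any]],
    configured_mcp_labels: set[str],
    instructions_substituted: bool,
) -> AsyncIterator[dict[str, Any]]:
    """Streaming path: sanitize each chunk's response before yielding."""
    async for chunk_dict in upstream:
        if response := chunk_dict.get("response"):
            _sanitize_response_dict(
                response, configured_mcp_labels, instructions_substituted
            )
        yield chunk_dict


async def _demo() -> list[dict[str, Any]]:
    async def upstream():
        yield {"response": {"instructions": "server prompt",
                            "tools": [{"server_label": "rhel-docs"}]}}

    return [c async for c in stream_filtered(upstream(), {"rhel-docs"}, True)]


chunks = asyncio.run(_demo())
```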
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 92d5d4bd-7daa-4084-9c61-3c3d1ca57f16

📥 Commits

Reviewing files that changed from the base of the PR and between 68c0057 and 3d970d4.

📒 Files selected for processing (1)
  • src/app/endpoints/responses.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: E2E: library mode / ci / group 1
  • GitHub Check: E2E: library mode / ci / group 2
  • GitHub Check: E2E: server mode / ci / group 2
  • GitHub Check: E2E: library mode / ci / group 3
  • GitHub Check: E2E: server mode / ci / group 3
  • GitHub Check: E2E: server mode / ci / group 1
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
📓 Path-based instructions (2)
src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules (e.g., from authentication import get_auth_dependency)
Use from llama_stack_client import AsyncLlamaStackClient for Llama Stack imports
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Use complete type annotations for function parameters and return types
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional types in type annotations
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use complete type annotations for all class attributes; avoid using Any
Follow Google Python docstring conventions for all modules, classes, and functions
Include Parameters:, Returns:, Raises: sections in function docstrings as needed

Files:

  • src/app/endpoints/responses.py
src/app/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/app/**/*.py: Use from fastapi import APIRouter, HTTPException, Request, status, Depends for FastAPI dependencies
Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

  • src/app/endpoints/responses.py
🧠 Learnings (7)
📓 Common learnings
Learnt from: Lifto
Repo: lightspeed-core/lightspeed-stack PR: 1524
File: src/app/endpoints/responses.py:523-529
Timestamp: 2026-04-16T19:08:35.441Z
Learning: In lightspeed-stack (`src/app/endpoints/responses.py`), the predicate `server_label in configured_mcp_labels` is the established, intentional pattern for identifying server-deployed MCP tools across `_sanitize_response_dict`, `_is_server_mcp_output_item`, and `_should_filter_mcp_chunk`. Client-supplied tools cannot collide with configured server labels because `server_label` is a server-side field set by lightspeed-stack during tool injection; clients send `function` tools or MCP tools pointing at their own servers with different labels. Do not flag this predicate as a false-positive collision risk in code review.
Learnt from: Lifto
Repo: lightspeed-core/lightspeed-stack PR: 1524
File: src/app/endpoints/responses.py:501-529
Timestamp: 2026-04-16T19:08:26.366Z
Learning: In `src/app/endpoints/responses.py`, `_sanitize_response_dict(response_dict, configured_mcp_labels)` intentionally mutates `response_dict` in-place. The dict is always a fresh throwaway produced by `model_dump()` on the immediately preceding line (both in the streaming `async for` loop and in the non-streaming path); no other reference to it exists. The AGENTS.md guideline "avoid in-place parameter modification" applies to mutating a caller's long-lived/shared data structures, not to ephemeral serialization dicts. Do not flag this as an anti-pattern in future reviews.
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to src/**/*.py : Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters

Applied to files:

  • src/app/endpoints/responses.py
📚 Learning: 2026-04-06T20:18:07.852Z
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.

Applied to files:

  • src/app/endpoints/responses.py
📚 Learning: 2026-02-25T07:46:39.608Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:39.608Z
Learning: In the lightspeed-stack codebase, src/models/requests.py uses OpenAIResponseInputTool as Tool while src/models/responses.py uses OpenAIResponseTool as Tool. This type difference is intentional - input tools and output/response tools have different schemas in llama-stack-api.

Applied to files:

  • src/app/endpoints/responses.py
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to src/**/*.py : Use `from llama_stack_client import AsyncLlamaStackClient` for Llama Stack imports

Applied to files:

  • src/app/endpoints/responses.py
🔇 Additional comments (2)
src/app/endpoints/responses.py (2)

500-537: Consider returning the sanitized dict rather than mutating in place.

Per AGENTS.md ("Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters"), a pure helper is generally preferred. I understand from prior discussion on this PR that the input is always a fresh model_dump() throwaway so there's no aliasing risk, and the decision to mutate in-place is intentional — acknowledging that and not asking for a change. Noting here only so future readers aren't surprised by the -> None signature on a function named _sanitize_….

Based on learnings: "_sanitize_response_dict(...) intentionally mutates response_dict in-place. The dict is always a fresh throwaway produced by model_dump() on the immediately preceding line … Do not flag this as an anti-pattern in future reviews."


531-536: 🧹 Nitpick | 🔵 Trivial

Tool filtering assumes each tool is a dict.

tool.get("server_label") will raise AttributeError if any entry in response_dict["tools"] is not a mapping (e.g., already-unserialized pydantic model, or an unexpected shape from a future Llama Stack schema change). Since response_dict comes from model_dump() today that's fine in practice, but a defensive isinstance(tool, dict) guard (or getattr(tool, "get", lambda _ : None)) would make this safer against schema drift and the non-streaming call path where response_dict may be assembled differently.

Not blocking — flagging as optional hardening.
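The optional hardening could look like this sketch (not the PR's code; tool shapes are assumed):

```python
from typing import Any


def filter_tools_defensive(
    tools: list[Any], configured_mcp_labels: set[str]
) -> list[Any]:
    """Pass through anything that isn't a dict (unknown future shapes),
    and dicts whose server_label isn't a configured server label."""
    return [
        tool
        for tool in tools
        if not isinstance(tool, dict)
        or tool.get("server_label") not in configured_mcp_labels
    ]


tools = [{"server_label": "rhel-docs"}, {"type": "function"}, "unexpected"]
filtered = filter_tools_defensive(tools, {"rhel-docs"})
# filtered drops only the server-deployed dict entry
```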

⛔ Skipped due to learnings

Comment thread src/app/endpoints/responses.py
The previous nit-fix commit (3d970d4) accidentally removed:
- _sanitize_response_dict call sites in both streaming and non-streaming paths
- instructions_substituted computation and threading
- unconditional configured_mcp_labels computation

Restored from working commit 68c0057 and reapplied only the three
nit fixes (tuple unpacking style, default param value, walrus operator).

RSPEED-2885

Signed-off-by: Ellis Low <elow@redhat.com>
@Lifto Lifto force-pushed the feat/rspeed-2885-filter-responses-internals branch from 071ec8e to 90b0d0e on April 17, 2026 at 20:20
Comment thread src/app/endpoints/responses.py Outdated
instructions_substituted = (
client_instructions is None
or client_instructions != responses_request.instructions
)
Contributor

I think this logic is too complicated. Checking the value of the original instructions before applying get_system_prompt() is sufficient:

  • if instructions is None -> the server-side system prompt will be applied
  • if instructions are specified per-request, they have the highest priority -> no override

Contributor Author

Good call — simplified to just client_instructions is None. The second condition was always redundant: when the client provides instructions, get_system_prompt() returns them unchanged, and when per-request customization is disabled it raises 422 before reaching this line.

Lifto and others added 2 commits April 20, 2026 10:37
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Contributor

@asimurka asimurka left a comment


LGTM

Contributor

@tisnik tisnik left a comment


LGTM

@tisnik tisnik merged commit 95fced9 into lightspeed-core:main Apr 20, 2026
26 of 27 checks passed
major pushed a commit to major/lightspeed-stack that referenced this pull request Apr 20, 2026
…ilter-responses-internals

RSPEED-2885: filter internal details from /v1/responses stream
Anxhela21 pushed a commit to Anxhela21/lightspeed-stack that referenced this pull request Apr 27, 2026
…ilter-responses-internals

RSPEED-2885: filter internal details from /v1/responses stream