LCORE-577: SSE content type for streaming_query endpoints #837
Conversation
Important: Review skipped (draft detected). Please check the settings in the CodeRabbit UI; this status message can be disabled in the CodeRabbit configuration.

Walkthrough: This PR updates the streaming endpoints' OpenAPI documentation to properly represent Server-Sent Events (SSE) responses with the text/event-stream media type. A new StreamingQueryResponse documentation-only model supplies the SSE schema and example.
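To ground the summary, here is a minimal, self-contained sketch (not the project's actual endpoint) of what an SSE response with the text/event-stream media type looks like in FastAPI; the path and event payloads mirror the documented example but are illustrative only:

```python
# Minimal SSE endpoint sketch; path and payloads are illustrative.
import json
from collections.abc import AsyncIterator

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()


@app.post("/v1/streaming_query")
async def streaming_query() -> StreamingResponse:
    """Stream tokens back to the client as Server-Sent Events."""

    async def event_stream() -> AsyncIterator[str]:
        # Each SSE event is a "data: <json>" line terminated by a blank line.
        yield f'data: {json.dumps({"event": "start"})}\n\n'
        yield f'data: {json.dumps({"event": "token", "data": {"id": 0, "token": "Hello"}})}\n\n'
        yield f'data: {json.dumps({"event": "end"})}\n\n'

    return StreamingResponse(event_stream(), media_type="text/event-stream")
```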
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20–25 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/openapi.json (1)
1503-1513: Add response headers to OpenAPI documentation to match SSE best practices.

Server code correctly uses the text/event-stream media type, and the OpenAPI examples already follow proper SSE format with data: lines. However, the OpenAPI response definitions should explicitly document standard SSE headers to guide clients and reverse proxies.

Add response headers to both /v1/streaming_query and /v2/streaming_query (currently at OpenAPI lines 1517–1524 and 3734–3741):

```diff
 "responses": {
   "200": {
     "description": "Successful response",
+    "headers": {
+      "Cache-Control": { "schema": { "type": "string" }, "example": "no-cache" },
+      "Connection": { "schema": { "type": "string" }, "example": "keep-alive" },
+      "X-Accel-Buffering": { "schema": { "type": "string" }, "example": "no" }
+    },
     "content": { "text/event-stream": { ... } }
   }
 }
```

Note: If the headers are intended to be set by the framework/proxy layer rather than explicitly in code, document this assumption to avoid client confusion. The example test data (empty tokens, truncated: null) are acceptable as fixtures.
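For reference, if the headers were instead set in application code rather than documented in OpenAPI or added by a proxy, a FastAPI StreamingResponse accepts them directly. This is only an illustration; the function name and generator parameter are placeholders:

```python
from collections.abc import AsyncIterator

from fastapi.responses import StreamingResponse


def sse_response(event_stream: AsyncIterator[str]) -> StreamingResponse:
    """Wrap an SSE generator and attach the headers suggested above."""
    return StreamingResponse(
        event_stream,
        media_type="text/event-stream",
        headers={
            "Cache-Control": "no-cache",  # keep intermediaries from caching events
            "Connection": "keep-alive",  # hint that the connection stays open
            "X-Accel-Buffering": "no",  # ask nginx not to buffer the stream
        },
    )
```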
🧹 Nitpick comments (1)
docs/openapi.json (1)
3720-3731: v1/v2 200 responses are identical; refactor suggestion remains valid.

The verification confirms /v1/streaming_query and /v2/streaming_query have matching 200 responses, so there is no divergence risk. However, the refactor to extract the inline SSE example to components/examples and any shared headers to components/headers remains a valid optimization to simplify maintenance and prevent future drift.

Currently both endpoints define the example and schema inline. Moving these to reusable component references would improve consistency across future updates.
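One way this refactor could look at the FastAPI layer is sketched below. It assumes both endpoints are POST operations that already expose a 200 text/event-stream response; the component name "SSEStream" and the example value are illustrative, not part of the PR:

```python
from fastapi import FastAPI
from fastapi.openapi.utils import get_openapi

app = FastAPI()


def custom_openapi() -> dict:
    """Hoist the shared SSE example into components/examples and $ref it."""
    if app.openapi_schema:
        return app.openapi_schema
    schema = get_openapi(title=app.title, version="0.1.0", routes=app.routes)
    # Register the shared SSE example once under components/examples...
    examples = schema.setdefault("components", {}).setdefault("examples", {})
    examples["SSEStream"] = {
        "summary": "SSE token stream",
        "value": 'data: {"event": "start", "data": {"conversation_id": "..."}}\n\n',
    }
    # ...and have both streaming endpoints reference it via $ref.
    for path in ("/v1/streaming_query", "/v2/streaming_query"):
        content = schema["paths"][path]["post"]["responses"]["200"]["content"]
        content["text/event-stream"]["examples"] = {
            "stream": {"$ref": "#/components/examples/SSEStream"}
        }
    app.openapi_schema = schema
    return app.openapi_schema


app.openapi = custom_openapi  # type: ignore[method-assign]
```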
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- docs/openapi.json (4 hunks)
- src/app/endpoints/streaming_query.py (4 hunks)
- src/app/endpoints/streaming_query_v2.py (4 hunks)
- src/models/responses.py (4 hunks)
- tests/unit/models/responses/test_successful_responses.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py:
- Use absolute imports for internal modules in the LCS project (e.g., `from auth import get_auth_dependency`)
- All modules must start with descriptive docstrings explaining their purpose
- Use the `logger = logging.getLogger(__name__)` pattern for module logging
- All functions must include complete type annotations for parameters and return types, using modern syntax (`str | int`) and `Optional[Type]` or `Type | None`
- All functions must have docstrings with brief descriptions following Google Python docstring conventions
- Function names must use snake_case with descriptive, action-oriented names (get_, validate_, check_)
- Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying input parameters
- Use `async def` for I/O operations and external API calls
- All classes must include descriptive docstrings explaining their purpose following Google Python docstring conventions
- Class names must use PascalCase with descriptive names and standard suffixes: `Configuration` for config classes, `Error`/`Exception` for exceptions, `Resolver` for strategy patterns, `Interface` for abstract base classes
- Abstract classes must use ABC with `@abstractmethod` decorators
- Include complete type annotations for all class attributes in Python classes
- Use `import logging` and the module logger pattern with standard log levels: debug, info, warning, error

(A small illustrative module following these conventions appears after the file list below.)
Files:
- src/app/endpoints/streaming_query_v2.py
- src/models/responses.py
- src/app/endpoints/streaming_query.py
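The following sketch shows how these conventions combine in a single module; the function and parameter names are hypothetical and only the patterns (module docstring, module logger, type annotations, Google-style docstrings, snake_case, async I/O) come from the guidelines:

```python
"""Illustrative module demonstrating the LCS coding conventions (sketch only)."""

import logging

logger = logging.getLogger(__name__)


async def get_streamed_answer(question: str, timeout: float | None = None) -> str:
    """Fetch a streamed answer for the given question.

    Args:
        question: The user question to send to the backend.
        timeout: Optional request timeout in seconds.

    Returns:
        The concatenated answer text.
    """
    logger.debug("Fetching answer for a question of length %d", len(question))
    # Real code would await an external API call here and return a new value
    # rather than mutating its inputs.
    return f"echo: {question}"
```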
src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling (see the sketch after the file list below).
Files:
- src/app/endpoints/streaming_query_v2.py
- src/app/endpoints/streaming_query.py
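A short illustration of this guideline; the route, condition, and detail payload are placeholders, not the project's actual error contract:

```python
from fastapi import APIRouter, HTTPException, status

router = APIRouter()


@router.post("/v1/streaming_query")
async def streaming_query_stub() -> dict[str, str]:
    """Reject the request when no backend is configured (illustrative only)."""
    backend_available = False  # placeholder condition
    if not backend_available:
        raise HTTPException(
            status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
            detail="Llama Stack backend is unavailable",
        )
    return {"status": "ok"}
```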
src/**/{client,app/endpoints/**}.py
📄 CodeRabbit inference engine (CLAUDE.md)
Handle `APIConnectionError` from Llama Stack in integration code (sketched after the file list below).
Files:
- src/app/endpoints/streaming_query_v2.py
- src/app/endpoints/streaming_query.py
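A sketch of this guideline follows. The import location of `APIConnectionError` and the client class name are assumptions about the llama-stack client package, and the placeholder call is not the project's actual integration code:

```python
import logging

from fastapi import HTTPException, status
from llama_stack_client import APIConnectionError, AsyncLlamaStackClient  # assumed import path

logger = logging.getLogger(__name__)


async def query_backend(client: AsyncLlamaStackClient, prompt: str) -> str:
    """Call the Llama Stack backend, translating connection failures to HTTP 500."""
    try:
        # Placeholder call; real integration code would issue an inference request.
        response = await client.models.list()
    except APIConnectionError as err:
        logger.error("Unable to connect to Llama Stack: %s", err)
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail="Unable to connect to Llama Stack",
        ) from err
    return str(response)
```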
tests/{unit,integration}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/{unit,integration}/**/*.py: Use pytest for all unit and integration tests; do not use unittest framework
Unit tests must achieve 60% code coverage; integration tests must achieve 10% coverage
Files:
tests/unit/models/responses/test_successful_responses.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use `pytest-mock` with `AsyncMock` objects for mocking in tests (see the sketch after the file list below).
Files:
tests/unit/models/responses/test_successful_responses.py
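A minimal sketch of these testing guidelines, assuming pytest-asyncio (or an equivalent async plugin) is configured; the test and attribute names are hypothetical:

```python
import pytest
from pytest_mock import MockerFixture


@pytest.mark.asyncio
async def test_stream_answer_is_awaited(mocker: MockerFixture) -> None:
    """AsyncMock lets the test await the mocked backend call directly."""
    fake_client = mocker.AsyncMock()
    fake_client.stream_answer.return_value = 'data: {"event": "end"}\n\n'

    result = await fake_client.stream_answer("hello")

    assert result.startswith("data:")
    fake_client.stream_answer.assert_awaited_once_with("hello")
```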
src/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/models/**/*.py:
- Use `@field_validator` and `@model_validator` for custom validation in Pydantic models
- Pydantic configuration classes must extend `ConfigurationBase`; data models must extend `BaseModel`

(A short illustrative model appears after the file list below.)
Files:
src/models/responses.py
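A sketch of these model guidelines. `ConfigurationBase` is project-specific, so this example extends `BaseModel` directly; the class and field names are illustrative:

```python
from pydantic import BaseModel, field_validator, model_validator


class StreamingSettings(BaseModel):
    """Settings controlling how streamed tokens are emitted (illustrative)."""

    chunk_size: int = 1
    media_type: str = "text/event-stream"

    @field_validator("chunk_size")
    @classmethod
    def chunk_size_must_be_positive(cls, value: int) -> int:
        """Reject non-positive chunk sizes."""
        if value <= 0:
            raise ValueError("chunk_size must be positive")
        return value

    @model_validator(mode="after")
    def check_media_type(self) -> "StreamingSettings":
        """Ensure the configured media type is SSE-compatible."""
        if self.media_type != "text/event-stream":
            raise ValueError("streaming responses must use text/event-stream")
        return self
```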
🧬 Code graph analysis (3)
src/app/endpoints/streaming_query_v2.py (1)
- src/models/responses.py (5)
  - StreamingQueryResponse (453-518)
  - openapi_response (46-59)
  - openapi_response (457-479)
  - openapi_response (857-881)
  - openapi_response (1186-1209)

tests/unit/models/responses/test_successful_responses.py (1)
- src/models/responses.py (5)
  - StreamingQueryResponse (453-518)
  - openapi_response (46-59)
  - openapi_response (457-479)
  - openapi_response (857-881)
  - openapi_response (1186-1209)

src/app/endpoints/streaming_query.py (1)
- src/models/responses.py (4)
  - openapi_response (46-59)
  - openapi_response (457-479)
  - openapi_response (857-881)
  - openapi_response (1186-1209)
🪛 GitHub Actions: Unit tests
tests/unit/models/responses/test_successful_responses.py
[error] 969-969: TestStreamingQueryResponse.openapi_response_structure failed. Expected description 'Streaming response (Server-Sent Events)' but got 'Successful response'.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build-pr
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: e2e_tests (azure)
- GitHub Check: e2e_tests (ci)
🔇 Additional comments (9)
src/app/endpoints/streaming_query.py (2)
926-930: LGTM! Comprehensive SSE documentation improvements.

The decorator and docstring changes properly document the SSE streaming behavior:
- `response_class=StreamingResponse` correctly indicates a FastAPI streaming response
- `responses=streaming_query_responses` provides comprehensive OpenAPI error documentation
- The docstring accurately describes the SSE format with the `text/event-stream` content type
- All documented HTTP error codes match the configured responses (lines 79-94)
- Documentation aligns with the actual implementation (line 886 uses `media_type="text/event-stream"`)

Also applies to: 941-947, 950-957
58-58: No issues found; StreamingQueryResponse is properly configured with examples.

Verification confirms that `StreamingQueryResponse` has examples correctly defined in its `model_config` using `json_schema_extra` (lines 481-503 in src/models/responses.py). The `openapi_response()` method will successfully retrieve these examples and generate the correct OpenAPI documentation for the SSE streaming response. The endpoint at line 886 in streaming_query.py correctly uses `media_type="text/event-stream"`, matching the documented response format.

tests/unit/models/responses/test_successful_responses.py (1)
981-987: LGTM!

The test correctly validates that the `StreamingQueryResponse` model schema includes examples in the expected format (a single string example representing the SSE stream).

src/app/endpoints/streaming_query_v2.py (4)
43-43: LGTM! The import follows the coding guidelines for absolute imports in the LCS project.
61-76: LGTM! The response dictionary is well-structured with comprehensive HTTP error mappings and specific examples for each error scenario. The use of `StreamingQueryResponse.openapi_response()` centralizes the SSE response schema definition, improving maintainability.
294-298: LGTM! The decorator correctly combines `response_class=StreamingResponse` for runtime behavior with `responses=streaming_query_v2_responses` for OpenAPI documentation. This is the proper FastAPI pattern for streaming endpoints; a condensed sketch follows.
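The sketch below condenses the pattern the review describes: `response_class` controls the runtime return type, while `responses=` feeds the OpenAPI document. The response dictionary, endpoint function name, and generator body are placeholders for the real ones in the PR:

```python
from collections.abc import AsyncIterator
from typing import Any

from fastapi import APIRouter
from fastapi.responses import StreamingResponse

router = APIRouter()

streaming_query_v2_responses: dict[int | str, dict[str, Any]] = {
    200: {
        "description": "Streaming response (Server-Sent Events)",
        "content": {"text/event-stream": {"schema": {"type": "string"}}},
    },
    500: {"description": "Internal server error"},
}


@router.post(
    "/v2/streaming_query",
    response_class=StreamingResponse,
    responses=streaming_query_v2_responses,
)
async def streaming_query_endpoint_v2() -> StreamingResponse:
    """Return the SSE stream; OpenAPI metadata comes from the decorator."""

    async def events() -> AsyncIterator[str]:
        yield 'data: {"event": "end"}\n\n'

    return StreamingResponse(events(), media_type="text/event-stream")
```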
306-326: LGTM! The docstring updates accurately describe the SSE streaming behavior with the `text/event-stream` content type and provide comprehensive error documentation that aligns with the response dictionary. The documentation follows Google Python docstring conventions.

src/models/responses.py (2)
14-14: LGTM! Good refactoring to centralize the description. The new constant follows DRY principles and is used consistently throughout the file.
56-56: LGTM! Consistent use of the new constant. Both updates correctly use the centralized `SUCCESSFUL_RESPONSE_DESCRIPTION` constant.

Also applies to: 878-878
```python
class StreamingQueryResponse(AbstractSuccessfulResponse):
    """Documentation-only model for streaming query responses using Server-Sent Events (SSE)."""

    @classmethod
    def openapi_response(cls) -> dict[str, Any]:
        """Generate FastAPI response dict for SSE streaming with examples.

        Note: This is used for OpenAPI documentation only. The actual endpoint
        returns a StreamingResponse object, not this Pydantic model.
        """
        schema = cls.model_json_schema()
        model_examples = schema.get("examples")
        if not model_examples:
            raise SchemaError(f"Examples not found in {cls.__name__}")
        example_value = model_examples[0]
        content = {
            "text/event-stream": {
                "schema": {"type": "string", "format": "text/event-stream"},
                "example": example_value,
            }
        }

        return {
            "description": SUCCESSFUL_RESPONSE_DESCRIPTION,
            "content": content,
            # Note: No "model" key since we're not actually serializing this model
        }

    model_config = {
        "json_schema_extra": {
            "examples": [
                (
                    'data: {"event": "start", "data": {'
                    '"conversation_id": "123e4567-e89b-12d3-a456-426614174000"}}\n\n'
                    'data: {"event": "token", "data": {'
                    '"id": 0, "token": "No Violation"}}\n\n'
                    'data: {"event": "token", "data": {'
                    '"id": 1, "token": ""}}\n\n'
                    'data: {"event": "token", "data": {'
                    '"id": 2, "token": "Hello"}}\n\n'
                    'data: {"event": "token", "data": {'
                    '"id": 3, "token": "!"}}\n\n'
                    'data: {"event": "token", "data": {'
                    '"id": 4, "token": " How"}}\n\n'
                    'data: {"event": "token", "data": {'
                    '"id": 5, "token": " can"}}\n\n'
                    'data: {"event": "token", "data": {'
                    '"id": 6, "token": " I"}}\n\n'
                    'data: {"event": "token", "data": {'
                    '"id": 7, "token": " assist"}}\n\n'
                    'data: {"event": "token", "data": {'
                    '"id": 8, "token": " you"}}\n\n'
                    'data: {"event": "token", "data": {'
                    '"id": 9, "token": " today"}}\n\n'
                    'data: {"event": "token", "data": {'
                    '"id": 10, "token": "?"}}\n\n'
                    'data: {"event": "turn_complete", "data": {'
                    '"token": "Hello! How can I assist you today?"}}\n\n'
                    'data: {"event": "end", "data": {'
                    '"rag_chunks": [], "referenced_documents": [], '
                    '"truncated": null, "input_tokens": 11, "output_tokens": 19}, '
                    '"available_quotas": {}}\n\n'
                ),
            ]
        }
    }
```
Fix JSON structure in the SSE example.
Lines 513-514 have a structural issue: `"available_quotas": {}` is placed outside the `"data"` object. Based on the QueryResponse model (lines 410-414), `available_quotas` should be a field within the `"data"` object of the `"end"` event.
Apply this diff to correct the JSON structure:
'data: {"event": "end", "data": {'
'"rag_chunks": [], "referenced_documents": [], '
- '"truncated": null, "input_tokens": 11, "output_tokens": 19}, '
- '"available_quotas": {}}\n\n'
+ '"truncated": null, "input_tokens": 11, "output_tokens": 19, '
+ '"available_quotas": {}}}\n\n'Additional observations:
-
This Pydantic model has no fields defined, which is unusual even for documentation-only models. Consider adding a comment explaining why this diverges from typical Pydantic usage.
-
Line 470 uses
"format": "text/event-stream", which is not a standard JSON Schema format. While this may work for documentation purposes, be aware it's non-standard.
🤖 Prompt for AI Agents
In src/models/responses.py around lines 453-518, the SSE example string has a
malformed "end" event where "available_quotas": {} sits outside the "data"
object; move "available_quotas": {} inside the "data" object of the "end" event
and adjust surrounding commas/braces so the JSON is valid, then update the
example string accordingly; also add a one-line comment on the
StreamingQueryResponse class explaining why it defines no Pydantic fields
(documentation-only model) and add a short comment noting that
"text/event-stream" in the schema format is non-standard but kept for docs (or
replace it with a neutral string format if preferred).
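A possible regression test for the structural issue flagged above (not part of the PR): it parses each `data:` line of the documented SSE example and checks that the `end` event nests `available_quotas` inside `data`. The test name is new, and the import path assumes the project's absolute-import layout with src on the path:

```python
import json

from models.responses import StreamingQueryResponse  # assumed import path


def test_sse_end_event_keeps_quotas_inside_data() -> None:
    """The documented end event should nest available_quotas inside data."""
    example = StreamingQueryResponse.model_json_schema()["examples"][0]
    events = [
        json.loads(chunk.removeprefix("data: "))
        for chunk in filter(None, example.split("\n\n"))
    ]
    end_event = next(event for event in events if event["event"] == "end")
    assert "available_quotas" in end_event["data"]
```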
```python
    def test_openapi_response_structure(self) -> None:
        """Test that openapi_response() returns correct structure."""
        result = StreamingQueryResponse.openapi_response()

        assert "description" in result
        assert "content" in result
        assert result["description"] == "Streaming response (Server-Sent Events)"
        assert "model" not in result

        assert "text/event-stream" in result["content"]
        content = result["content"]["text/event-stream"]
        assert "schema" in content
        assert "example" in content

        schema = content["schema"]
        assert schema["type"] == "string"
        assert schema["format"] == "text/event-stream"
```
Critical: Test failure caused by incorrect description in src/models/responses.py implementation.
The pipeline failure at line 969 reveals that StreamingQueryResponse.openapi_response() returns "Successful response" (from SUCCESSFUL_RESPONSE_DESCRIPTION) instead of the expected "Streaming response (Server-Sent Events)".
The test expectation is correct based on the PR objectives. The issue is in src/models/responses.py at approximately line 474, where the implementation should use a streaming-specific description constant rather than the generic SUCCESSFUL_RESPONSE_DESCRIPTION.
Fix required in src/models/responses.py:
```python
# Define a new constant for streaming responses
STREAMING_RESPONSE_DESCRIPTION = "Streaming response (Server-Sent Events)"

# Then in StreamingQueryResponse.openapi_response():
return {
    "description": STREAMING_RESPONSE_DESCRIPTION,  # change from SUCCESSFUL_RESPONSE_DESCRIPTION
    "content": content,
}
```

🧰 Tools
}🧰 Tools
🪛 GitHub Actions: Unit tests
[error] 969-969: TestStreamingQueryResponse.openapi_response_structure failed. Expected description 'Streaming response (Server-Sent Events)' but got 'Successful response'.
🤖 Prompt for AI Agents
In src/models/responses.py around lines ~470-480,
StreamingQueryResponse.openapi_response() is using the generic
SUCCESSFUL_RESPONSE_DESCRIPTION instead of a streaming-specific description,
causing the test to expect "Streaming response (Server-Sent Events)" but receive
"Successful response"; define a new constant STREAMING_RESPONSE_DESCRIPTION =
"Streaming response (Server-Sent Events)" near the other description constants
and change StreamingQueryResponse.openapi_response() to return that constant as
the "description" key (replace SUCCESSFUL_RESPONSE_DESCRIPTION), ensuring the
constant is in scope for the function.
Force-pushed from f65712d to 17b285f (compare).
Description
Declared the SSE content type for the streaming_query endpoints, updated the documentation, and added a new response model with a unit test.
Type of change
Related Tickets & Documents
Checklist before requesting a review
Testing