Skip to content

Conversation

@tisnik
Copy link
Contributor

@tisnik tisnik commented Nov 16, 2025

Description

LCORE-946: updated REST API documentation

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

  • Related Issue #LCORE-946

Summary by CodeRabbit

  • New Features

    • Added v2 streaming query endpoint to the API.
  • Documentation

    • Updated API documentation to clarify v1 and v2 streaming query endpoints with enhanced descriptions of implementations and error handling.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 16, 2025

Walkthrough

This PR updates API documentation to clarify the v1 streaming_query endpoint uses the Agent API wrapper and adds a new v2 streaming_query endpoint using the Responses API wrapper. Both endpoints are documented in the OpenAPI specification and markdown files with mirrored request/response structures.

Changes

Cohort / File(s) Summary
API Documentation
docs/openapi.json, docs/openapi.md, docs/output.md
Updated v1 /streaming_query endpoint description to reference Agent API wrapper and added new v2 /streaming_query endpoint referencing Responses API wrapper; both endpoints mirror each other in request/response structure with distinct operationId naming and tags

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Documentation-only changes with consistent pattern across files
  • Straightforward updates to endpoint descriptions and addition of v2 endpoint documentation
  • No code logic or functionality changes to evaluate

Possibly related PRs

Suggested labels

ok-to-test

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly indicates the main change is updating REST API documentation, which aligns with all file summaries (openapi.json, openapi.md, output.md).
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (11)
docs/openapi.md (4)

230-234: Hyphenate compound modifier “API-specific”.

Use “Agent API‑specific … response generator functions.” Same for Responses API elsewhere for consistency.

-This is a wrapper around streaming_query_endpoint_handler_base that provides
-the Agent API specific retrieve_response and response generator functions.
+This is a wrapper around streaming_query_endpoint_handler_base that provides
+the Agent API-specific retrieve_response and response generator functions.

255-256: Fix broken table row (MD055/MD056) for 200 response.

The row is split across two lines and missing a trailing pipe.

-| 200 | Streaming response with Server-Sent Events | string
-string |
+| 200 | Streaming response with Server-Sent Events | string |

Add blank lines before and after the table to satisfy MD058 if missing.


586-601: Hyphenate “Responses API‑specific”.

Mirror the v1 wording with proper hyphenation.

-This is a wrapper around streaming_query_endpoint_handler_base that provides
-the Responses API specific retrieve_response and response generator functions.
+This is a wrapper around streaming_query_endpoint_handler_base that provides
+the Responses API-specific retrieve_response and response generator functions.

613-621: Fix v2 table formatting and add missing 429 parity.

The response table for POST /v2/streaming_query (lines 615-616) has split row formatting with missing pipes, confirmed by markdownlint errors. Additionally, the v1 endpoint includes a 429 response for quota exceeded scenarios, which v2 lacks.

-| 200 | Streaming response with Server-Sent Events | string
-string |
+| 200 | Streaming response with Server-Sent Events | string |
 | 400 | Missing or invalid credentials provided by client | [UnauthorizedResponse](#unauthorizedresponse) |
 | 401 | Unauthorized: Invalid or missing Bearer token for k8s auth | [UnauthorizedResponse](#unauthorizedresponse) |
 | 403 | User is not authorized | [ForbiddenResponse](#forbiddenresponse) |
+| 429 | The quota has been exceeded | [QuotaExceededResponse](#quotaexceededresponse) |
 | 500 | Internal Server Error |  |
docs/openapi.json (3)

414-415: Hyphenate “Agent API‑specific” in v1 streaming description.

Keep wording consistent with md and style.

-"description": "Handle request to the /streaming_query endpoint using Agent API.\n\nThis is a wrapper around streaming_query_endpoint_handler_base that provides\nthe Agent API specific retrieve_response and response generator functions.\n\nReturns:\n    StreamingResponse: An HTTP streaming response yielding\n    SSE-formatted events for the query lifecycle.\n\nRaises:\n    HTTPException: Returns HTTP 500 if unable to connect to the\n    Llama Stack server.",
+"description": "Handle request to the /streaming_query endpoint using Agent API.\n\nThis is a wrapper around streaming_query_endpoint_handler_base that provides\nthe Agent API-specific retrieve_response and response generator functions.\n\nReturns:\n    StreamingResponse: An HTTP streaming response yielding\n    SSE-formatted events for the query lifecycle.\n\nRaises:\n    HTTPException: Returns HTTP 500 if unable to connect to the\n    Llama Stack server.",

1315-1316: Hyphenate “Responses API‑specific” in v2 streaming description.

-This is a wrapper around streaming_query_endpoint_handler_base that provides
-the Responses API specific retrieve_response and response generator functions.
+This is a wrapper around streaming_query_endpoint_handler_base that provides
+the Responses API-specific retrieve_response and response generator functions.

Also applies to: 1316-1316


427-443: Optional: note SSE content type.

Consider adding “text/event-stream” to content types to reflect SSE default.

docs/output.md (4)

230-234: Hyphenate “API‑specific”.

Same wording as openapi.md: “Agent API‑specific …”.

-... that provides
-the Agent API specific retrieve_response and response generator functions.
+... that provides
+the Agent API-specific retrieve_response and response generator functions.

255-256: Fix broken 200 response table row and add trailing pipe.

-| 200 | Streaming response with Server-Sent Events | string
-string |
+| 200 | Streaming response with Server-Sent Events | string |

Ensure a blank line before and after the table to satisfy MD rules.


590-594: Hyphenate “Responses API‑specific”.

-... that provides
-the Responses API specific retrieve_response and response generator functions.
+... that provides
+the Responses API-specific retrieve_response and response generator functions.

615-621: Fix v2 streaming_query table row and add 429 parity.

-| 200 | Streaming response with Server-Sent Events | string
-string |
+| 200 | Streaming response with Server-Sent Events | string |
 ...
 | 401 | Unauthorized: Invalid or missing Bearer token for k8s auth | [UnauthorizedResponse](#unauthorizedresponse) |
 | 403 | User is not authorized | [ForbiddenResponse](#forbiddenresponse) |
+| 429 | The quota has been exceeded | [QuotaExceededResponse](#quotaexceededresponse) |
 | 500 | Internal Server Error |  |
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 41e89f6 and 8d2a657.

📒 Files selected for processing (3)
  • docs/openapi.json (2 hunks)
  • docs/openapi.md (2 hunks)
  • docs/output.md (2 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/output.md

[grammar] ~233-~233: Use a hyphen to join words.
Context: ...handler_base that provides the Agent API specific retrieve_response and response ...

(QB_NEW_EN_HYPHEN)


[grammar] ~593-~593: Use a hyphen to join words.
Context: ...ler_base that provides the Responses API specific retrieve_response and response ...

(QB_NEW_EN_HYPHEN)

docs/openapi.md

[grammar] ~233-~233: Use a hyphen to join words.
Context: ...handler_base that provides the Agent API specific retrieve_response and response ...

(QB_NEW_EN_HYPHEN)


[grammar] ~593-~593: Use a hyphen to join words.
Context: ...ler_base that provides the Responses API specific retrieve_response and response ...

(QB_NEW_EN_HYPHEN)

🪛 markdownlint-cli2 (0.18.1)
docs/openapi.md

615-615: Table pipe style
Expected: leading_and_trailing; Actual: leading_only; Missing trailing pipe

(MD055, table-pipe-style)


616-616: Table pipe style
Expected: leading_and_trailing; Actual: trailing_only; Missing leading pipe

(MD055, table-pipe-style)


616-616: Table column count
Expected: 3; Actual: 1; Too few cells, row will be missing data

(MD056, table-column-count)


621-621: Tables should be surrounded by blank lines

(MD058, blanks-around-tables)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: build-pr
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: e2e_tests (azure)
  • GitHub Check: e2e_tests (ci)
🔇 Additional comments (1)
docs/openapi.json (1)

1345-1392: Incorrect suggestion: the v2 streaming_query endpoint intentionally does not enforce user/cluster quotas.

The review recommends adding a 429 response code to match "v1 streaming_query," but:

  1. v1 endpoint doesn't exist: The /streaming_query path is not present in openapi.json (only v2's /v2/streaming_query exists).
  2. No user/cluster quota enforcement: The v2 streaming_query handler (streaming_query_endpoint_handler_base) does not call check_tokens_available() or consume_tokens(), unlike the /query endpoints which do enforce quotas. This is an intentional architectural difference.
  3. Model rate limiting already covered: The endpoint catches RateLimitError (model provider rate limits) and returns HTTP 429, so rate-limit scenarios are handled.

The streaming_query_v2_responses dict omits both 429 and 422 because the endpoint doesn't validate request payloads like the query endpoint does and doesn't enforce user quotas. Adding 429 would misrepresent the endpoint's behavior.

Likely an incorrect or invalid review comment.

Comment on lines +1314 to +1316
"summary": "Streaming Query Endpoint Handler V2",
"description": "Handle request to the /streaming_query endpoint using Responses API.\n\nThis is a wrapper around streaming_query_endpoint_handler_base that provides\nthe Responses API specific retrieve_response and response generator functions.\n\nReturns:\n StreamingResponse: An HTTP streaming response yielding\n SSE-formatted events for the query lifecycle.\n\nRaises:\n HTTPException: Returns HTTP 500 if unable to connect to the\n Llama Stack server.",
"operationId": "streaming_query_endpoint_handler_v2_v2_streaming_query_post",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix duplicated “v2_v2” in operationId.

Stabilize the name to match the pattern used elsewhere; clients may depend on it.

-"operationId": "streaming_query_endpoint_handler_v2_v2_streaming_query_post",
+"operationId": "streaming_query_endpoint_handler_v2_streaming_query_post",
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"summary": "Streaming Query Endpoint Handler V2",
"description": "Handle request to the /streaming_query endpoint using Responses API.\n\nThis is a wrapper around streaming_query_endpoint_handler_base that provides\nthe Responses API specific retrieve_response and response generator functions.\n\nReturns:\n StreamingResponse: An HTTP streaming response yielding\n SSE-formatted events for the query lifecycle.\n\nRaises:\n HTTPException: Returns HTTP 500 if unable to connect to the\n Llama Stack server.",
"operationId": "streaming_query_endpoint_handler_v2_v2_streaming_query_post",
"summary": "Streaming Query Endpoint Handler V2",
"description": "Handle request to the /streaming_query endpoint using Responses API.\n\nThis is a wrapper around streaming_query_endpoint_handler_base that provides\nthe Responses API specific retrieve_response and response generator functions.\n\nReturns:\n StreamingResponse: An HTTP streaming response yielding\n SSE-formatted events for the query lifecycle.\n\nRaises:\n HTTPException: Returns HTTP 500 if unable to connect to the\n Llama Stack server.",
"operationId": "streaming_query_endpoint_handler_v2_streaming_query_post",
🤖 Prompt for AI Agents
In docs/openapi.json around lines 1314 to 1316, the operationId contains a
duplicated segment "v2_v2"
("streaming_query_endpoint_handler_v2_v2_streaming_query_post"); update it to
remove the duplicate so it matches the project's operationId pattern (e.g.,
"streaming_query_endpoint_handler_v2_streaming_query_post") ensuring the new
operationId is unique and consistent with other endpoints.

@tisnik tisnik merged commit 8760feb into lightspeed-core:main Nov 16, 2025
21 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant