-
Notifications
You must be signed in to change notification settings - Fork 54
LCORE-946: updated REST API documentation #798
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LCORE-946: updated REST API documentation #798
Conversation
WalkthroughThis PR updates API documentation to clarify the v1 streaming_query endpoint uses the Agent API wrapper and adds a new v2 streaming_query endpoint using the Responses API wrapper. Both endpoints are documented in the OpenAPI specification and markdown files with mirrored request/response structures. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes
Possibly related PRs
Suggested labels
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (11)
docs/openapi.md (4)
230-234: Hyphenate compound modifier “API-specific”.Use “Agent API‑specific … response generator functions.” Same for Responses API elsewhere for consistency.
-This is a wrapper around streaming_query_endpoint_handler_base that provides -the Agent API specific retrieve_response and response generator functions. +This is a wrapper around streaming_query_endpoint_handler_base that provides +the Agent API-specific retrieve_response and response generator functions.
255-256: Fix broken table row (MD055/MD056) for 200 response.The row is split across two lines and missing a trailing pipe.
-| 200 | Streaming response with Server-Sent Events | string -string | +| 200 | Streaming response with Server-Sent Events | string |Add blank lines before and after the table to satisfy MD058 if missing.
586-601: Hyphenate “Responses API‑specific”.Mirror the v1 wording with proper hyphenation.
-This is a wrapper around streaming_query_endpoint_handler_base that provides -the Responses API specific retrieve_response and response generator functions. +This is a wrapper around streaming_query_endpoint_handler_base that provides +the Responses API-specific retrieve_response and response generator functions.
613-621: Fix v2 table formatting and add missing 429 parity.The response table for POST
/v2/streaming_query(lines 615-616) has split row formatting with missing pipes, confirmed by markdownlint errors. Additionally, the v1 endpoint includes a 429 response for quota exceeded scenarios, which v2 lacks.-| 200 | Streaming response with Server-Sent Events | string -string | +| 200 | Streaming response with Server-Sent Events | string | | 400 | Missing or invalid credentials provided by client | [UnauthorizedResponse](#unauthorizedresponse) | | 401 | Unauthorized: Invalid or missing Bearer token for k8s auth | [UnauthorizedResponse](#unauthorizedresponse) | | 403 | User is not authorized | [ForbiddenResponse](#forbiddenresponse) | +| 429 | The quota has been exceeded | [QuotaExceededResponse](#quotaexceededresponse) | | 500 | Internal Server Error | |docs/openapi.json (3)
414-415: Hyphenate “Agent API‑specific” in v1 streaming description.Keep wording consistent with md and style.
-"description": "Handle request to the /streaming_query endpoint using Agent API.\n\nThis is a wrapper around streaming_query_endpoint_handler_base that provides\nthe Agent API specific retrieve_response and response generator functions.\n\nReturns:\n StreamingResponse: An HTTP streaming response yielding\n SSE-formatted events for the query lifecycle.\n\nRaises:\n HTTPException: Returns HTTP 500 if unable to connect to the\n Llama Stack server.", +"description": "Handle request to the /streaming_query endpoint using Agent API.\n\nThis is a wrapper around streaming_query_endpoint_handler_base that provides\nthe Agent API-specific retrieve_response and response generator functions.\n\nReturns:\n StreamingResponse: An HTTP streaming response yielding\n SSE-formatted events for the query lifecycle.\n\nRaises:\n HTTPException: Returns HTTP 500 if unable to connect to the\n Llama Stack server.",
1315-1316: Hyphenate “Responses API‑specific” in v2 streaming description.-This is a wrapper around streaming_query_endpoint_handler_base that provides -the Responses API specific retrieve_response and response generator functions. +This is a wrapper around streaming_query_endpoint_handler_base that provides +the Responses API-specific retrieve_response and response generator functions.Also applies to: 1316-1316
427-443: Optional: note SSE content type.Consider adding “text/event-stream” to content types to reflect SSE default.
docs/output.md (4)
230-234: Hyphenate “API‑specific”.Same wording as openapi.md: “Agent API‑specific …”.
-... that provides -the Agent API specific retrieve_response and response generator functions. +... that provides +the Agent API-specific retrieve_response and response generator functions.
255-256: Fix broken 200 response table row and add trailing pipe.-| 200 | Streaming response with Server-Sent Events | string -string | +| 200 | Streaming response with Server-Sent Events | string |Ensure a blank line before and after the table to satisfy MD rules.
590-594: Hyphenate “Responses API‑specific”.-... that provides -the Responses API specific retrieve_response and response generator functions. +... that provides +the Responses API-specific retrieve_response and response generator functions.
615-621: Fix v2 streaming_query table row and add 429 parity.-| 200 | Streaming response with Server-Sent Events | string -string | +| 200 | Streaming response with Server-Sent Events | string | ... | 401 | Unauthorized: Invalid or missing Bearer token for k8s auth | [UnauthorizedResponse](#unauthorizedresponse) | | 403 | User is not authorized | [ForbiddenResponse](#forbiddenresponse) | +| 429 | The quota has been exceeded | [QuotaExceededResponse](#quotaexceededresponse) | | 500 | Internal Server Error | |
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
docs/openapi.json(2 hunks)docs/openapi.md(2 hunks)docs/output.md(2 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/output.md
[grammar] ~233-~233: Use a hyphen to join words.
Context: ...handler_base that provides the Agent API specific retrieve_response and response ...
(QB_NEW_EN_HYPHEN)
[grammar] ~593-~593: Use a hyphen to join words.
Context: ...ler_base that provides the Responses API specific retrieve_response and response ...
(QB_NEW_EN_HYPHEN)
docs/openapi.md
[grammar] ~233-~233: Use a hyphen to join words.
Context: ...handler_base that provides the Agent API specific retrieve_response and response ...
(QB_NEW_EN_HYPHEN)
[grammar] ~593-~593: Use a hyphen to join words.
Context: ...ler_base that provides the Responses API specific retrieve_response and response ...
(QB_NEW_EN_HYPHEN)
🪛 markdownlint-cli2 (0.18.1)
docs/openapi.md
615-615: Table pipe style
Expected: leading_and_trailing; Actual: leading_only; Missing trailing pipe
(MD055, table-pipe-style)
616-616: Table pipe style
Expected: leading_and_trailing; Actual: trailing_only; Missing leading pipe
(MD055, table-pipe-style)
616-616: Table column count
Expected: 3; Actual: 1; Too few cells, row will be missing data
(MD056, table-column-count)
621-621: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build-pr
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: e2e_tests (azure)
- GitHub Check: e2e_tests (ci)
🔇 Additional comments (1)
docs/openapi.json (1)
1345-1392: Incorrect suggestion: the v2 streaming_query endpoint intentionally does not enforce user/cluster quotas.The review recommends adding a 429 response code to match "v1 streaming_query," but:
- v1 endpoint doesn't exist: The
/streaming_querypath is not present in openapi.json (only v2's/v2/streaming_queryexists).- No user/cluster quota enforcement: The v2 streaming_query handler (
streaming_query_endpoint_handler_base) does not callcheck_tokens_available()orconsume_tokens(), unlike the/queryendpoints which do enforce quotas. This is an intentional architectural difference.- Model rate limiting already covered: The endpoint catches
RateLimitError(model provider rate limits) and returns HTTP 429, so rate-limit scenarios are handled.The
streaming_query_v2_responsesdict omits both 429 and 422 because the endpoint doesn't validate request payloads like the query endpoint does and doesn't enforce user quotas. Adding 429 would misrepresent the endpoint's behavior.Likely an incorrect or invalid review comment.
| "summary": "Streaming Query Endpoint Handler V2", | ||
| "description": "Handle request to the /streaming_query endpoint using Responses API.\n\nThis is a wrapper around streaming_query_endpoint_handler_base that provides\nthe Responses API specific retrieve_response and response generator functions.\n\nReturns:\n StreamingResponse: An HTTP streaming response yielding\n SSE-formatted events for the query lifecycle.\n\nRaises:\n HTTPException: Returns HTTP 500 if unable to connect to the\n Llama Stack server.", | ||
| "operationId": "streaming_query_endpoint_handler_v2_v2_streaming_query_post", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix duplicated “v2_v2” in operationId.
Stabilize the name to match the pattern used elsewhere; clients may depend on it.
-"operationId": "streaming_query_endpoint_handler_v2_v2_streaming_query_post",
+"operationId": "streaming_query_endpoint_handler_v2_streaming_query_post",📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "summary": "Streaming Query Endpoint Handler V2", | |
| "description": "Handle request to the /streaming_query endpoint using Responses API.\n\nThis is a wrapper around streaming_query_endpoint_handler_base that provides\nthe Responses API specific retrieve_response and response generator functions.\n\nReturns:\n StreamingResponse: An HTTP streaming response yielding\n SSE-formatted events for the query lifecycle.\n\nRaises:\n HTTPException: Returns HTTP 500 if unable to connect to the\n Llama Stack server.", | |
| "operationId": "streaming_query_endpoint_handler_v2_v2_streaming_query_post", | |
| "summary": "Streaming Query Endpoint Handler V2", | |
| "description": "Handle request to the /streaming_query endpoint using Responses API.\n\nThis is a wrapper around streaming_query_endpoint_handler_base that provides\nthe Responses API specific retrieve_response and response generator functions.\n\nReturns:\n StreamingResponse: An HTTP streaming response yielding\n SSE-formatted events for the query lifecycle.\n\nRaises:\n HTTPException: Returns HTTP 500 if unable to connect to the\n Llama Stack server.", | |
| "operationId": "streaming_query_endpoint_handler_v2_streaming_query_post", |
🤖 Prompt for AI Agents
In docs/openapi.json around lines 1314 to 1316, the operationId contains a
duplicated segment "v2_v2"
("streaming_query_endpoint_handler_v2_v2_streaming_query_post"); update it to
remove the duplicate so it matches the project's operationId pattern (e.g.,
"streaming_query_endpoint_handler_v2_streaming_query_post") ensuring the new
operationId is unique and consistent with other endpoints.
Description
LCORE-946: updated REST API documentation
Type of change
Related Tickets & Documents
Summary by CodeRabbit
New Features
Documentation