LCORE-837: Exception handling refactor #830
Conversation
Walkthrough

Standardizes API responses across the codebase by introducing rich response models and OpenAPI helpers, replacing inline HTTPException detail payloads with model-based responses, updating endpoint response mappings, adding a global exception middleware, and updating tests to expect the new structured error formats and status codes.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Endpoint
    participant ResponseModel
    participant Middleware
    participant ClientResp as Client Response
    rect rgba(200,220,255,0.6)
        Note over Endpoint,ResponseModel: Success flow (typed responses)
        Client->>Endpoint: HTTP request
        Endpoint->>ResponseModel: build ConfigurationResponse / ToolsResponse ...
        ResponseModel-->>Endpoint: typed object
        Endpoint->>ClientResp: 200 + JSON (model serialized)
    end
    rect rgba(255,220,220,0.6)
        Note over Endpoint,ResponseModel: Auth error (old vs new)
        Client->>Endpoint: HTTP request without auth
        alt Old
            Endpoint-->>ClientResp: 400 {"detail": "No Authorization header found"}
        else New
            Endpoint->>ResponseModel: UnauthorizedResponse(cause="No Authorization header found")
            ResponseModel-->>Endpoint: object (status_code=401)
            Endpoint->>Middleware: raise HTTPException(status_code=401, detail=response.model_dump())
            Middleware-->>ClientResp: 401 {"detail":{"response":"Missing or invalid credentials provided by client","cause":"No Authorization header found"}}
        end
    end
    rect rgba(220,255,220,0.6)
        Note over Endpoint,Middleware: Unhandled exception -> global middleware
        Client->>Endpoint: HTTP request
        Endpoint->>Endpoint: raises ValueError (uncaught)
        Middleware->>ResponseModel: InternalServerErrorResponse.generic()
        ResponseModel-->>Middleware: object
        Middleware-->>ClientResp: 500 {"detail":{"response":"Internal server error","cause":"<trace/cause>"}}
    end
```
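Both error lanes in the diagram follow the same backbone: build a typed response model, then unpack it into `HTTPException`. A minimal sketch of that pattern, using the model names from this PR (the import path is assumed from src/models/responses.py):

```python
from fastapi import HTTPException

from models.responses import UnauthorizedResponse

# The typed model carries both the status code and the structured detail
# ({"response": ..., "cause": ...}); HTTPException just transports them.
response = UnauthorizedResponse(cause="No Authorization header found")
raise HTTPException(**response.model_dump())
```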
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes

Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
| "user_data_collection": { | ||
| "feedback_enabled": True, |
Check warning
Code scanning / Bandit
Probable insecure usage of temp file/directory. Warning
| "user_data_collection": { | ||
| "feedback_enabled": True, |
Check warning
Code scanning / Bandit
Probable insecure usage of temp file/directory. Warning
| "transcripts_storage": "/tmp/data/transcripts", | ||
| }, | ||
| "database": { | ||
| "sqlite": {"db_path": "/tmp/lightspeed-stack.db"}, |
Check warning
Code scanning / Bandit
Probable insecure usage of temp file/directory. Warning
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (10)
tests/unit/authentication/test_noop_with_token.py (1)
60-89: Bring docstring in line with 401 + structured error detail

The test correctly asserts a 401 with a `{response, cause}` detail for the missing-header case, but the docstring still refers to a 400 status and a flat "No Authorization header found" message. Updating the text will avoid confusion for future readers.

```diff
 Test that NoopWithTokenAuthDependency raises an HTTPException
 when no Authorization header is present in the request.
-Asserts that the exception has a status code of 400 and the detail message
-"No Authorization header found".
+Asserts that the exception has a status code of 401 and a structured
+detail message indicating that no Authorization header was found.
```

src/authorization/middleware.py (1)
77-95: Forbidden response probably should use `auth[0]` (user ID) instead of username

`auth` is a 4-tuple `(user_id, username, skip_userid_check, token)` (see `ensure_test_user_id_and_name` in tests/unit/authentication/test_jwk_token.py), but the forbidden branch calls:

```python
response = ForbiddenResponse.endpoint(user_id=auth[1])
```

`ForbiddenResponse.endpoint`'s message is phrased in terms of a user ID. Passing the username here makes the error text say "User {username} is not authorized to access this endpoint", which doesn't match the model's field naming and may be confusing.

Switching to `auth[0]` will align the payload with the documented user ID.

```diff
-    if not access_resolver.check_access(action, user_roles):
-        response = ForbiddenResponse.endpoint(user_id=auth[1])
-        raise HTTPException(**response.model_dump())
+    if not access_resolver.check_access(action, user_roles):
+        response = ForbiddenResponse.endpoint(user_id=auth[0])
+        raise HTTPException(**response.model_dump())
```

src/authentication/k8s.py (2)
174-205: `get_user_info` docstring and error mapping could be tightened

The new behavior—raising a 503 `ServiceUnavailableResponse` only when `K8sClientSingleton.get_authn_api()` fails, and treating `create_token_review` exceptions as "token invalid" by returning `None`—is coherent and works with the caller's 401 mapping.

However, the `Raises:` section still suggests that any unexpected error triggers an `HTTPException`, which is no longer true for TokenReview API failures. Consider updating the docstring to clarify that only failures to initialize the Kubernetes auth API result in an immediate 503, while TokenReview errors are logged and treated as an invalid/expired token at the call site.
223-285: SubjectAccessReview 403s are being turned into 503s by the broad `except` block

In `K8SAuthDependency.__call__`, the forbidden case:

```python
if not response.status.allowed:
    response = ForbiddenResponse.endpoint(user_id=user_info.user.uid)
    raise HTTPException(**response.model_dump())
```

is inside a `try` that's immediately followed by:

```python
except Exception as e:
    logger.error("API exception during SubjectAccessReview: %s", e)
    response = ServiceUnavailableResponse(...)
    raise HTTPException(**response.model_dump()) from e
```

Because `HTTPException` is a subclass of `Exception`, every 403 you raise for a denied SubjectAccessReview is caught and rethrown as a 503 `ServiceUnavailableResponse`. Callers will never see a proper 403 Forbidden from this path.

Narrowing the `except` to `ApiException` (or restricting the `try` just to the client call) fixes this while still mapping genuine Kubernetes API failures to 503.

```diff
-        try:
-            authorization_api = K8sClientSingleton.get_authz_api()
-            sar = kubernetes.client.V1SubjectAccessReview(
-                spec=kubernetes.client.V1SubjectAccessReviewSpec(
-                    user=user_info.user.username,
-                    groups=user_info.user.groups,
-                    non_resource_attributes=kubernetes.client.V1NonResourceAttributes(
-                        path=self.virtual_path, verb="get"
-                    ),
-                )
-            )
-            response = authorization_api.create_subject_access_review(sar)
-
-            if not response.status.allowed:
-                response = ForbiddenResponse.endpoint(user_id=user_info.user.uid)
-                raise HTTPException(**response.model_dump())
-        except Exception as e:
-            logger.error("API exception during SubjectAccessReview: %s", e)
-            response = ServiceUnavailableResponse(
-                backend_name="Kubernetes API",
-                cause="Unable to perform authorization check",
-            )
-            raise HTTPException(**response.model_dump()) from e
+        authorization_api = K8sClientSingleton.get_authz_api()
+        sar = kubernetes.client.V1SubjectAccessReview(
+            spec=kubernetes.client.V1SubjectAccessReviewSpec(
+                user=user_info.user.username,
+                groups=user_info.user.groups,
+                non_resource_attributes=kubernetes.client.V1NonResourceAttributes(
+                    path=self.virtual_path,
+                    verb="get",
+                ),
+            )
+        )
+
+        try:
+            sar_response = authorization_api.create_subject_access_review(sar)
+        except ApiException as e:
+            logger.error("API exception during SubjectAccessReview: %s", e)
+            error_response = ServiceUnavailableResponse(
+                backend_name="Kubernetes API",
+                cause="Unable to perform authorization check",
+            )
+            raise HTTPException(**error_response.model_dump()) from e
+
+        if not sar_response.status.allowed:
+            error_response = ForbiddenResponse.endpoint(user_id=user_info.user.uid)
+            raise HTTPException(**error_response.model_dump())
```

Also, the class docstring still claims only a 403 is raised "if the token is invalid, expired, or the user is not authorized". After introducing `UnauthorizedResponse` (401) and `ServiceUnavailableResponse` (503) paths, it would be good to update that summary to reflect the broader set of outcomes.

tests/unit/app/endpoints/test_shields.py (1)
24-42: Reset AppConfig before simulating unloaded config

`AppConfig` is a process-wide singleton, so if any earlier test initialized it, `_configuration` stays populated. In that state this test silently exercises the "configuration loaded" branch and never raises the expected 500. Clear `_configuration` on the instance you patch in so the test reliably hits the failure path.

```diff
-    mock_config = AppConfig()
-    mocker.patch("app.endpoints.shields.configuration", mock_config)
+    mock_config = AppConfig()
+    mock_config._configuration = None  # pylint: disable=protected-access
+    mocker.patch("app.endpoints.shields.configuration", mock_config)
```

tests/unit/app/endpoints/test_config.py (1)
20-44: Ensure the singleton really mimics "not loaded" state

Because `AppConfig()` returns the shared singleton, any prior `init_from_dict` call leaves `_configuration` non-None. Without resetting it here, this test won't raise and will incorrectly succeed. Clear `_configuration` on the patched instance before invoking the handler so the guard fires.

```diff
-    mock_config = AppConfig()
-    mocker.patch("app.endpoints.config.configuration", mock_config)
+    mock_config = AppConfig()
+    mock_config._configuration = None  # pylint: disable=protected-access
+    mocker.patch("app.endpoints.config.configuration", mock_config)
```

tests/unit/app/endpoints/test_streaming_query.py (1)
153-171: Override the autouse config fixture when testing the failure path

The autouse `setup_configuration` fixture initializes `AppConfig` before every test, so `AppConfig()` here already has `_configuration` populated. As a result, the guard never trips and this assertion is false. Explicitly null out `_configuration` on the instance you patch into the module so this test actually exercises the "configuration not loaded" branch.

```diff
-    mock_config = AppConfig()
-    mocker.patch("app.endpoints.streaming_query.configuration", mock_config)
+    mock_config = AppConfig()
+    mock_config._configuration = None  # pylint: disable=protected-access
+    mocker.patch("app.endpoints.streaming_query.configuration", mock_config)
```

tests/unit/app/endpoints/test_query.py (1)
1777-1860: Fix inaccurate type hints in `test_evaluate_model_hints`

The parametrized test passes `UserConversation | None` for `user_conversation` and tuples for `request_values`/`expected_values`, but the function signature annotates them all as `list`. This is misleading for readers and static analyzers, even though it's harmless at runtime.

Consider tightening the annotations, e.g.:

```diff
-def test_evaluate_model_hints(
-    user_conversation: list,
-    request_values: list,
-    expected_values: list,
-) -> None:
+def test_evaluate_model_hints(
+    user_conversation: UserConversation | None,
+    request_values: tuple[str | None, str | None],
+    expected_values: tuple[str | None, str | None],
+) -> None:
```

src/app/endpoints/query.py (1)
577-607: Fix return type mismatch in `parse_metadata_from_text_item`

`parse_metadata_from_text_item` is annotated to return `Optional[ReferencedDocument]`, but when `text_item` is not a `TextContentItem` it returns `docs`, which is a `list[ReferencedDocument]`. That's inconsistent with the signature and could surprise future callers or type checkers.

You can simplify and fix this by removing the unused `docs` list and returning `None` in the non-`TextContentItem` case:

```diff
 def parse_metadata_from_text_item(
     text_item: TextContentItem,
 ) -> Optional[ReferencedDocument]:
@@
-    docs: list[ReferencedDocument] = []
-    if not isinstance(text_item, TextContentItem):
-        return docs
+    if not isinstance(text_item, TextContentItem):
+        return None
@@
     return None
```

This matches the declared return type and the way the function is used (single document or `None`).

tests/unit/app/endpoints/test_conversations.py (1)
859-861: Incorrect use of `set(Action.X)` when patching `get_actions`

In tests where you restrict authorized actions, e.g.:

```python
mocker.patch(
    "authorization.resolvers.NoopAccessResolver.get_actions",
    return_value=set(Action.GET_CONVERSATION),
)
```

and similarly for `Action.DELETE_CONVERSATION`, `set(Action.GET_CONVERSATION)` creates a set of characters from the enum's string value (e.g., `{"g", "e", "t", ...}`), not `{Action.GET_CONVERSATION}`.

This can cause the `authorize` decorator to see a completely different set of actions than intended and potentially short-circuit with an endpoint-level 403 instead of exercising the conversation-level forbidden path you're trying to test.

You likely want:

```python
return_value={Action.GET_CONVERSATION}
# and
return_value={Action.DELETE_CONVERSATION}
```

instead of `set(Action.…)`.
🧹 Nitpick comments (28)
tests/unit/authentication/test_utils.py (1)
23-41: Prefer `cast` over `# type: ignore` for `exc.detail`

The new 401 + structured-detail checks are correct, but `test_extract_user_token_no_header` still relies on `# type: ignore` while the next test uses `cast`. Mirroring the pattern improves consistency and type-checking.

```diff
 def test_extract_user_token_no_header() -> None:
@@
-    try:
-        extract_user_token(headers)
-    except HTTPException as exc:
-        assert exc.status_code == 401
-        assert exc.detail["response"] == (  # type: ignore
-            "Missing or invalid credentials provided by client"
-        )
-        assert exc.detail["cause"] == "No Authorization header found"  # type: ignore
+    try:
+        extract_user_token(headers)
+    except HTTPException as exc:
+        assert exc.status_code == 401
+        detail = cast(dict[str, str], exc.detail)
+        assert detail["response"] == "Missing or invalid credentials provided by client"
+        assert detail["cause"] == "No Authorization header found"
```

src/authentication/jwk_token.py (1)
37-46: Avoid holding the cache lock across network I/O in `get_jwk_set`

The error mapping and use of `UnauthorizedResponse` here looks consistent with the new error model surface, but `get_jwk_set` currently holds `_jwk_cache_lock` while performing the entire `aiohttp` request and JSON/JWK parsing. This unnecessarily serializes all concurrent requests that miss the cache.

Consider narrowing the critical section with a double-checked pattern so that slow or failing key servers don't become a bottleneck for requests that hit the cache:

```diff
 async def get_jwk_set(url: str) -> KeySet:
-    """Fetch the JWK set from the cache, or fetch it from the URL if not cached."""
-    async with _jwk_cache_lock:
-        if url not in _jwk_cache:
-            async with aiohttp.ClientSession() as session:
-                async with session.get(url) as resp:
-                    resp.raise_for_status()
-                    _jwk_cache[url] = JsonWebKey.import_key_set(await resp.json())
-        return _jwk_cache[url]
+    """Fetch the JWK set from the cache, or fetch it from the URL if not cached."""
+    cached = _jwk_cache.get(url)
+    if cached is not None:
+        return cached
+
+    async with _jwk_cache_lock:
+        # Re-check under the lock in case another task populated it
+        cached = _jwk_cache.get(url)
+        if cached is not None:
+            return cached
+
+        async with aiohttp.ClientSession() as session:
+            async with session.get(url) as resp:
+                resp.raise_for_status()
+                jwk_set = JsonWebKey.import_key_set(await resp.json())
+
+        _jwk_cache[url] = jwk_set
+        return jwk_set
```

This keeps correctness while improving concurrency characteristics.

Also applies to: 132-165
src/models/responses.py (2)
41-58: Be aware that `AbstractSuccessfulResponse.openapi_response` now hard-fails when examples are missing

`AbstractSuccessfulResponse.openapi_response()` pulls examples from `model_json_schema()` and raises `SchemaError` if none are present. This is a nice guardrail, but it means any future subclass that forgets to define `model_config["json_schema_extra"]["examples"]` will cause an exception at import/OpenAPI generation time.

If that strictness is intentional, it's fine; otherwise, consider a softer fallback (e.g., returning a response spec without `example` instead of raising).

Also applies to: 69-87
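If you opt for the softer behavior, a minimal sketch of what such a fallback could look like (the method body is an assumption for illustration, not the current implementation; it builds a FastAPI `responses`-style dict):

```python
@classmethod
def openapi_response(cls) -> dict:
    """Hypothetical softer variant: omit examples instead of raising SchemaError."""
    schema = cls.model_json_schema()
    examples = schema.get("examples")
    spec: dict = {"model": cls, "description": cls.__doc__}
    if examples:  # only attach content when examples are actually defined
        spec["content"] = {"application/json": {"examples": examples}}
    return spec
```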
1087-1141: Validate that `AbstractErrorResponse.openapi_response` matches how FastAPI actually serializes `HTTPException`

`AbstractErrorResponse` models errors as:

- `status_code: int`
- `detail: DetailModel` (`response`, `cause`)

and `openapi_response()` builds examples like:

```python
named_examples[label] = {"value": {"detail": detail}}
```

while routes raise errors via `HTTPException(**response.model_dump())`, meaning the actual HTTP body is `{"detail": {"response": ..., "cause": ...}}`.

That's coherent, but a few things to double-check:

- The schema for `cls` (with `status_code` + `detail`) will not match the example values (which omit `status_code`). OpenAPI tooling may complain about examples not matching the schema.
- When no `examples` are defined in `json_schema_extra`, `named_examples` stays empty and you emit `"examples": None`, which is a bit odd; you might prefer to either raise (like `AbstractSuccessfulResponse`) or omit the `examples` key entirely.

If you want tighter alignment, you could either:

- Include `status_code` in the example values (sketched after this comment), or
- Change `model` in the response spec to represent the actual body shape (i.e., just the `detail` object) rather than the full helper model.

This is largely a documentation/contract concern; runtime behavior with `HTTPException(**model_dump())` looks fine.

Also applies to: 1439-1542, 1555-1685
tests/e2e/features/feedback.feature (1)
111-115: Feedback error responses correctly aligned with new response models

The updated 403/401/500 expectations (nested `detail` with `response`/`cause`) match `ForbiddenResponse.feedback_disabled`, `UnauthorizedResponse` ("missing header"), and `InternalServerErrorResponse.feedback_path_invalid("/invalid")` from src/models/responses.py, so the E2E specs look consistent with the refactored error handling.

One small consistency nit: the `ForbiddenResponse` OpenAPI example for the `"feedback"` label still uses `"response": "Storing feedback is disabled."` while this scenario (and `feedback_disabled()`) use `"Feedback is disabled"`. Consider aligning the example text so docs, models, and tests all show the same message.

Also applies to: 262-269, 277-284, 308-310
src/app/endpoints/shields.py (1)
16-22: Shields endpoint now correctly wired into standardized response models

The `shields_responses` mapping and the APIConnectionError → `ServiceUnavailableResponse` translation (raised via `HTTPException(**response.model_dump())`) look consistent with the new error-handling pattern and the `ServiceUnavailableResponse` contract.

One optional consideration: `cause=str(e)` will surface the raw `APIConnectionError` message to clients. If you prefer to avoid leaking backend details, you might want to replace this with a more generic cause (similar to the `"Connection error while trying to reach backend service."` example) and log the full exception only server-side.

Also applies to: 29-37, 79-82
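A sketch of that swap, reusing the generic message from the model's own OpenAPI example (the `client.shields.list()` call stands in for the endpoint's actual backend call and is an assumption):

```python
try:
    shields = await client.shields.list()
except APIConnectionError as e:
    # Full details stay in the server log; the client gets a generic cause.
    logger.exception("Unable to reach Llama Stack")
    response = ServiceUnavailableResponse(
        backend_name="Llama Stack",
        cause="Connection error while trying to reach backend service.",
    )
    raise HTTPException(**response.model_dump()) from e
```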
tests/unit/app/endpoints/test_feedback.py (1)
33-53: Config mocking for `is_feedback_enabled`/`is_feedback_disabled` is sound; consider isolating from real FS checks

The pattern of creating an `AppConfig`, setting its `_configuration.user_data_collection`, and patching `app.endpoints.feedback.configuration` is appropriate and keeps the test focused on `is_feedback_enabled()`.

Minor suggestion: `UserDataCollection`'s validator will call `checks.directory_check(Path("/tmp"), ...)` when `feedback_enabled=True`. To avoid any dependency on the host filesystem, you could patch `checks.directory_check` in these tests.
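For example (the dotted patch target is an assumption about where `checks` is imported from; adjust it to the real module path):

```python
# Keep UserDataCollection's validator off the real filesystem.
mocker.patch("models.config.checks.directory_check", return_value=None)
```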
src/app/endpoints/tools.py (1)

29-140: Tools endpoint refactor correctly standardizes responses; minor optional optimizations

The new implementation cleanly:

- Documents 200/401/403/500/503 via `tools_responses` with `ToolsResponse` and the error response models.
- Maps `APIConnectionError` from both `toolgroups.list()` and `tools.list()` to an HTTP 503 via `ServiceUnavailableResponse`, matching the PR's error-handling goals.
- Skips missing toolgroups on `BadRequestError` and aggregates tools with a clear `server_source` split between MCP and builtin using `configuration.mcp_servers`.

Two small, optional refinements (a sketch of the first follows this list):

- For large numbers of MCP servers/toolgroups, consider precomputing a `name -> server` dict once instead of using `next(... for s in configuration.mcp_servers ...)` on each tool (Lines 107–119).
- The "toolgroup is not found" case (Lines 89–91) may be more of a warning than an error; downgrading the log level to `logger.warning` could better reflect that it's a client/config issue rather than a server failure.

These are non-blocking and can be deferred.
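A sketch of the precomputed lookup (the attribute names `.name` and `.toolgroup_id`, and the `all_tools` variable, are assumptions based on the prose above):

```python
# Build the mapping once per request instead of a linear scan per tool.
server_by_name = {server.name: server for server in configuration.mcp_servers}

for tool in all_tools:
    mcp_server = server_by_name.get(tool.toolgroup_id)  # None -> builtin tool
```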
tests/unit/app/endpoints/test_conversations_v2.py (2)
93-107: `ReferencedDocument` construction in transform tests looks good; `# type: ignore` likely unnecessary

The `docs` list using `ReferencedDocument(doc_title="Test Doc", doc_url="http://example.com")` correctly exercises the `referenced_documents` path in `transform_chat_message`, including URL normalization.

The trailing `# type: ignore` on the list definition seems unnecessary here and could mask real type issues in the future. You can probably drop it without changing behavior.
582-628: Inconsistent semantics between `success` flag and "cannot be deleted" message

In `test_unsuccessful_deletion`, when `conversation_cache.delete` returns `False`, the test asserts:

- `response.success is True`
- `response.response == "Conversation cannot be deleted"`

This combination is semantically confusing: the `success` flag indicates success while the message states it "cannot be deleted". Unless the intent is to treat "already deleted/no-op" as a success case, consider either:

- Setting `success` to `False` and using a clearly error-like message, or
- Keeping `success` as `True` but rephrasing the message to something like "Conversation already deleted" or "No conversation to delete".

Right now the API surface is ambiguous for clients interpreting this response.
src/utils/endpoints.py (1)
159-173: System-prompt disabling behavior is explicit and user-friendly

Raising an `UnprocessableEntityResponse` when `disable_query_system_prompt` is set and a `system_prompt` is supplied provides a precise 422 with a helpful cause message telling clients to remove the field. This is a solid UX improvement.

If this pattern is reused elsewhere, you might later factor a small helper like `UnprocessableEntityResponse.system_prompt_disabled()` to avoid hard-coding the message in multiple places, but it's not urgent.
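Such a helper could mirror the factory-classmethod style the other error models in this PR already use; a sketch (the constructor kwargs and message text are assumptions):

```python
@classmethod
def system_prompt_disabled(cls) -> "UnprocessableEntityResponse":
    # Hypothetical factory on UnprocessableEntityResponse, analogous to
    # ForbiddenResponse.endpoint(); keeps the message text in one place.
    return cls(
        response="System prompt customization is disabled",
        cause="Remove the system_prompt field from the request.",
    )
```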
src/app/endpoints/query.py (2)

214-243: Update docstring to reflect 503 on backend connection failures

`query_endpoint_handler_base`'s docstring still says connection errors to Llama Stack result in an HTTP 500, but the implementation now wraps `APIConnectionError` in `ServiceUnavailableResponse("Llama Stack", cause)` and raises it via `HTTPException(**response.model_dump())`, i.e., HTTP 503.

Consider updating the docstring text to mention 503 Service Unavailable instead of 500 so behavior and documentation stay in sync.

Also applies to: 417-427
427-435: Handle RateLimitError more defensively when model name is missing

The `RateLimitError` branch builds a `QuotaExceededResponse.model(used_model)` using `used_model = getattr(e, "model", "")`. If `RateLimitError` lacks a `model` attribute or it's empty, this yields a slightly odd message ("for model ").

A safer pattern would be to fall back to `QuotaExceededResponse.from_exception(e)` when `used_model` is falsy, e.g.:

```diff
-    except RateLimitError as e:
-        used_model = getattr(e, "model", "")
-        response = QuotaExceededResponse.model(used_model)
-        raise HTTPException(**response.model_dump()) from e
+    except RateLimitError as e:
+        used_model = getattr(e, "model", "") or ""
+        if used_model:
+            response = QuotaExceededResponse.model(used_model)
+        else:
+            response = QuotaExceededResponse.from_exception(e)
+        raise HTTPException(**response.model_dump()) from e
```

This keeps messages clean even if the upstream error doesn't expose a model name.

src/app/endpoints/streaming_query_v2.py (1)
src/app/endpoints/streaming_query_v2.py (1)
316-337: Update streaming v2 docstring to mention 503 instead of 500The
streaming_query_endpoint_handler_v2docstring still says it “Returns HTTP 500 if unable to connect to the Llama Stack server,” but the responses mapping (and underlying shared handler) now useServiceUnavailableResponse(503) for backend connectivity issues.Recommend updating the docstring’s “Raises” section to reference HTTP 503 Service Unavailable instead of 500 to avoid confusion.
src/app/endpoints/conversations_v2.py (3)
96-103: Consider DRY-ing and tightening conversation cache configuration checks

The repeated pattern:

```python
if configuration.conversation_cache_configuration.type is None:
    logger.warning("Converastion cache is not configured")
    response = InternalServerErrorResponse.cache_unavailable()
    raise HTTPException(**response.model_dump())
```

appears in multiple handlers, with a typo in the log message in three places and a slightly different spelling in the update handler. You may want to:

- Extract a small helper (e.g., `ensure_conversation_cache_configured()`) that performs this check and raises the standardized 500 response.
- Fix the log message spelling uniformly.

This would reduce duplication and keep cache-unavailable behavior consistent across endpoints.

```diff
-    if configuration.conversation_cache_configuration.type is None:
-        logger.warning("Converastion cache is not configured")
-        response = InternalServerErrorResponse.cache_unavailable()
-        raise HTTPException(**response.model_dump())
+    ensure_conversation_cache_configured(logger)
```

and in a shared helper module:

```python
def ensure_conversation_cache_configured(logger: logging.Logger) -> None:
    if configuration.conversation_cache_configuration.type is None:
        logger.warning("Conversation cache is not configured")
        response = InternalServerErrorResponse.cache_unavailable()
        raise HTTPException(**response.model_dump())
```

Also applies to: 125-129, 160-163, 194-197
176-180: Missing `Request` parameter on `update_conversation_endpoint_handler` may affect `authorize` middleware

Unlike the other endpoints in this module, `update_conversation_endpoint_handler` does not accept a `Request` parameter, yet it is decorated with `@authorize(Action.UPDATE_CONVERSATION)`. If the `authorize` middleware relies on a `Request` argument to check and attach `request.state.authorized_actions`, this handler may bypass or break that logic.

Consider adding `request: Request` as the first parameter for consistency with the other handlers and to avoid surprises in the authorization stack.

```diff
-@router.put("/conversations/{conversation_id}", responses=conversation_update_responses)
-@authorize(Action.UPDATE_CONVERSATION)
-async def update_conversation_endpoint_handler(
-    conversation_id: str,
-    update_request: ConversationUpdateRequest,
-    auth: Any = Depends(get_auth_dependency()),
-) -> ConversationUpdateResponse:
+@router.put("/conversations/{conversation_id}", responses=conversation_update_responses)
+@authorize(Action.UPDATE_CONVERSATION)
+async def update_conversation_endpoint_handler(
+    request: Request,  # pylint: disable=unused-argument
+    conversation_id: str,
+    update_request: ConversationUpdateRequest,
+    auth: Any = Depends(get_auth_dependency()),
+) -> ConversationUpdateResponse:
```
229-241: Double-check conversation existence check with `skip_userid_check` semantics

`check_conversation_existence` hard-codes:

```python
if configuration.conversation_cache_configuration.type is None:
    return
conversations = configuration.conversation_cache.list(user_id, False)
```

so it ignores the `skip_userid_check` flag that the list/get/delete/update handlers pass when they call `configuration.conversation_cache.list(...)` directly.

If `skip_userid_check=True` can legitimately broaden visibility (e.g., service accounts, admin flows), this existence check could return a false negative for conversations that are accessible but not owned by `user_id`. If v2 is strictly "per-user only" that may be fine, but it would be good to confirm and, if needed, thread `skip_userid_check` through this helper.

```diff
-def check_conversation_existence(user_id: str, conversation_id: str) -> None:
+def check_conversation_existence(
+    user_id: str, conversation_id: str, skip_userid_check: bool = False
+) -> None:
@@
-    conversations = configuration.conversation_cache.list(user_id, False)
+    conversations = configuration.conversation_cache.list(user_id, skip_userid_check)
```

and update callers accordingly.
tests/unit/models/responses/test_successful_responses.py (1)
696-739: SchemaError message assertions are tightly coupled to implementation details

Tests like:

```python
with pytest.raises(SchemaError, match="has no label"):
    InvalidResponse.openapi_response()
```

and:

```python
with pytest.raises(SchemaError, match="Examples not found"):
    NoExamplesResponse.openapi_response()
```

depend on the exact wording of the `SchemaError` messages. Any minor phrasing change in src/models/responses.py will break these tests even though behavior is still correct.

If you want the tests to be more resilient while still validating behavior, you could:

- Assert only on the exception type, or
- Use a more generic `match` that looks for a stable token (e.g., the class name) rather than the full English sentence.

Not blocking, but worth considering to reduce test brittleness.

Also applies to: 962-973
tests/unit/app/endpoints/test_conversations.py (2)
299-323: Assertions on exact `detail["response"]`/`detail["cause"]` strings may be brittle

Several tests assert substrings on the structured error payload, e.g.:

```python
assert "Invalid conversation ID format" in detail["response"]
assert INVALID_CONVERSATION_ID in detail["cause"]

assert "Conversation not found" in detail["response"]
assert "does not exist" in detail["cause"]
```

These are generally aligned with the current implementations of `BadRequestResponse` and `NotFoundResponse`, but they are quite sensitive to wording and casing changes in src/models/responses.py.

Given this PR's goal of centralizing message construction in response models, it may be more robust to:

- Assert against the specific fields you care about (e.g., that `resource` and `resource_id` are correct if they are exposed), or
- At least loosen the expectations (e.g., check only for `"conversation"` and the ID, or for a stable prefix like `"Invalid"`/`"not found"`) rather than full phrases.

Not mandatory, but it would make tests less fragile to minor copy changes in the response messages.

Also applies to: 731-733, 397-400, 804-807, 765-773, 781-787, 1004-1008, 1256-1260
808-842: Comment/docstring drift around "general exception" vs `APIConnectionError`

In `test_session_deletion_exception` the comment says:

```python
# Mock AsyncLlamaStackClientHolder to raise a general exception
mock_client.agents.session.delete.side_effect = APIConnectionError(...)
```

but the implementation and assertions now specifically exercise the `APIConnectionError` → 503 path via `ServiceUnavailableResponse`.

The behavior is correct and matches the endpoint code, but updating the comment/docstring to mention `APIConnectionError` explicitly would avoid confusion for future readers.

src/app/endpoints/conversations.py (2)
84-123: `simplify_session_data` matches tests and intent; consider skipping missing output messages

The transformation logic cleanly strips Llama Stack session turns down to:

- a `messages` list of `{content, type}` pairs (role → type),
- `started_at`/`completed_at` timestamps.

This aligns with the tests in `TestSimplifySessionData` and with the documented intent.

One minor refinement you might consider: if `output_message` is absent for a turn, you currently append an assistant message with `content=None` and `type=None`. Skipping the output message entirely in that case would produce a cleaner history, but it's not required by the current tests.
209-288: GET conversation endpoint correctly standardized to new response models

The refactored `get_conversation_endpoint_handler` now uses:

- `BadRequestResponse` for invalid IDs,
- `ForbiddenResponse.conversation` for access-denied cases,
- `NotFoundResponse` for missing conversations or missing sessions,
- `ServiceUnavailableResponse` for `APIConnectionError`,
- `InternalServerErrorResponse.database_error()` for `SQLAlchemyError`.

This lines up with the unit tests in tests/unit/app/endpoints/test_conversations.py and the broader error-model strategy. The only small oddity is catching `SQLAlchemyError` inside the Llama Stack client `try` block, whereas DB access primarily occurs in helpers like `retrieve_conversation` that are called before the block; but it's harmless and may still guard future DB usage in that region.

src/app/endpoints/query_v2.py (1)
198-216: Consider making `get_topic_summary` less dependent on globals

`get_topic_summary` currently pulls the system prompt from the global `configuration` singleton and assumes `client.responses.create` and `response.output` always succeed. That's fine given the current `check_configuration_loaded(configuration)` at the endpoint level, but it makes this helper harder to reuse or unit test in isolation.

If you expect additional call sites in the future, consider optionally passing `AppConfig` (or the prompt string) as a parameter and adding a small try/except around `client.responses.create` to surface a clearer failure (e.g., log + rethrow) while still letting higher-level handlers map to the appropriate `InternalServerErrorResponse`/`ServiceUnavailableResponse`.

Also applies to: 218-234
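A rough sketch of the injectable variant (the parameter list, the `create()` kwargs, and the output handling are all assumptions for illustration, not the actual module contents):

```python
async def get_topic_summary(
    question: str,
    client: AsyncLlamaStackClient,
    model_id: str,
    system_prompt: str,  # injected instead of read from the global AppConfig
) -> str:
    try:
        response = await client.responses.create(
            input=question, model=model_id, instructions=system_prompt
        )
    except APIConnectionError:
        # Let the endpoint handler map this to ServiceUnavailableResponse.
        logger.exception("Topic summary request to Llama Stack failed")
        raise
    return str(response.output)
```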
src/app/endpoints/rags.py (1)
147-154: BadRequestError→404 mapping might be too broad

`BadRequestError` from `client.vector_stores.retrieve` is treated unconditionally as "rag not found" and mapped to `NotFoundResponse(resource="rag", resource_id=rag_id)`. If llama_stack_client ever uses `BadRequestError` for other client-side issues (e.g., malformed IDs, validation errors), those would be surfaced as 404s instead of 400-type failures.

If the client library exposes a more specific signal (status code, error type, or message) for the not-found case, it would be safer to branch on that and only construct `NotFoundResponse` when you're sure it's a genuine 404-equivalent scenario. Otherwise this behavior is acceptable but a bit lossy.
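A sketch of a narrower mapping (the message-text check is an assumption; prefer a structured error code if the client exposes one):

```python
try:
    vector_store = await client.vector_stores.retrieve(rag_id)
except BadRequestError as e:
    # Only map a genuine not-found to 404; re-raise everything else so it
    # surfaces as a client/validation failure instead.
    if "not found" in str(e).lower():
        response = NotFoundResponse(resource="rag", resource_id=rag_id)
        raise HTTPException(**response.model_dump()) from e
    raise
```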
tests/unit/app/endpoints/test_tools.py (1)

211-253: Partial-failure test currently duplicates the full-failure scenario

`test_tools_endpoint_partial_failure`'s docstring says "one MCP server failing with APIConnectionError", but `mock_client.tools.list.side_effect = api_error` causes every tools call to raise the same `APIConnectionError`, making this effectively identical in behavior to `test_tools_endpoint_api_connection_error`.

If you want to exercise a genuine partial-failure scenario (where some toolgroups succeed and one fails, and the endpoint's behavior differs from the all-fail case), consider adjusting the side effect list, for example:

```diff
-    api_error = APIConnectionError(request=mocker.Mock())
-    mock_client.tools.list.side_effect = api_error
+    api_error = APIConnectionError(request=mocker.Mock())
+    mock_client.tools.list.side_effect = [
+        [mock_tools_response[0]],  # first toolgroup succeeds
+        api_error,  # second toolgroup fails
+    ]
```

and then asserting whatever aggregate behavior you expect (e.g., fail-fast with 503 vs partial results).
Also applies to: 255-289
src/app/endpoints/streaming_query.py (1)
797-824: Docstrings and behavior diverge on connection errors and quotas

`streaming_query_endpoint_handler_base` and `streaming_query_endpoint_handler` docstrings still say "HTTP 500 if unable to connect to Llama Stack", but the implementation now:

- Maps `APIConnectionError` to `ServiceUnavailableResponse` → HTTP 503.
- Maps `RateLimitError` to `QuotaExceededResponse` → HTTP 429.

It would be good to update the docstrings to accurately describe 503 for backend connectivity issues and 429 for quota/RateLimitError conditions, and to mention the standardized error payloads.

```diff
-    Raises:
-        HTTPException: Returns HTTP 500 if unable to connect to Llama Stack
+    Raises:
+        HTTPException:
+            - 429: Quota exceeded (RateLimitError from backend)
+            - 500: Lightspeed Stack configuration not loaded
+            - 503: Unable to connect to Llama Stack
```

and similarly adjust the wrapper endpoint's docstring.
Also applies to: 828-836, 903-921
src/app/endpoints/providers.py (1)
54-93: Align `group_providers` docstring with its type hint

The implementation and return shape used to build `ProvidersListResponse` look correct, but the parameter docstring still talks about a generic "list of ProviderInfo objects" while the type hint is `ProviderListResponse`. Consider tightening the docstring to match the actual alias to avoid confusion for future readers.

```diff
-def group_providers(providers: ProviderListResponse) -> dict[str, list[dict[str, Any]]]:
-    """Group a list of ProviderInfo objects by their API type.
+def group_providers(providers: ProviderListResponse) -> dict[str, list[dict[str, Any]]]:
+    """Group providers returned by Llama Stack by their API type.
@@
-    Args:
-        providers: List of ProviderInfo objects.
+    Args:
+        providers: ProviderListResponse from the Llama Stack client.
```
ProvidersListResponselook correct, but the parameter docstring still talks about a generic "list of ProviderInfo objects" while the type hint isProviderListResponse. Consider tightening the docstring to match the actual alias to avoid confusion for future readers.-def group_providers(providers: ProviderListResponse) -> dict[str, list[dict[str, Any]]]: - """Group a list of ProviderInfo objects by their API type. +def group_providers(providers: ProviderListResponse) -> dict[str, list[dict[str, Any]]]: + """Group providers returned by Llama Stack by their API type. @@ - Args: - providers: List of ProviderInfo objects. + Args: + providers: ProviderListResponse from the Llama Stack client.Also applies to: 95-113
src/app/endpoints/feedback.py (1)
117-155: Configuration checks and feedback storage error handling; consider guarding `/status` as well

Using `check_configuration_loaded(configuration)` in `feedback_endpoint_handler` and `update_feedback_status`, and mapping `OSError` in `store_feedback` to `InternalServerErrorResponse.feedback_path_invalid(...)`, gives you consistent 500 responses when configuration or storage is broken. One follow-up you might consider is adding the same configuration guard to the `/feedback/status` endpoint so all three feedback routes return the standardized "configuration not loaded" error when config is missing, rather than depending on whatever exception `is_feedback_enabled()` might raise.

```diff
 @router.get("/status", responses=feedback_get_response)
 def feedback_status() -> StatusResponse:
     """
@@
-    """
-    logger.debug("Feedback status requested")
+    """
+    check_configuration_loaded(configuration)
+    logger.debug("Feedback status requested")
```

Also applies to: 158-173, 192-193
```python
try:
    model_list = await AsyncLlamaStackClientHolder().get_client().models.list()
except APIConnectionError as e:
    response = ServiceUnavailableResponse(backend_name="Llama Stack", cause=str(e))
    raise HTTPException(**response.model_dump()) from e
```
🧩 Analysis chain
Verify the error handling pattern and exception coverage.
The error handling follows the PR's standardized approach. Please verify:
- HTTPException construction: Confirm that `ServiceUnavailableResponse.model_dump()` returns a dict with the correct keys (`status_code`, `detail`) expected by `HTTPException.__init__()`.
- Exception coverage: Only `APIConnectionError` is caught. Verify that other exceptions from `models.list()` (e.g., timeout, authentication errors, unexpected API errors) are either handled by upstream middleware or should be caught here as well.
Run the following script to verify the ServiceUnavailableResponse implementation and broader error handling:
🏁 Script executed:

```shell
#!/bin/bash
# Description: Verify ServiceUnavailableResponse structure and error handling patterns

# Search for ServiceUnavailableResponse class definition
ast-grep --pattern 'class ServiceUnavailableResponse($$$) {
  $$$
}'

# Search for model_dump usage with HTTPException across the codebase
rg -n -C2 'HTTPException\(\*\*.*\.model_dump\(\)' --type=py

# Search for other exceptions that might be raised by models.list()
rg -n -C3 'models\.list\(\)' --type=py
```

Length of output: 28635
🏁 Script executed:

```shell
#!/bin/bash
# Find ServiceUnavailableResponse class definition
fd -type f -name "*.py" | xargs rg -l "class ServiceUnavailableResponse" --type=py

# Show the class definition
rg -n -A 20 "class ServiceUnavailableResponse" --type=py
```

Length of output: 1480
🏁 Script executed:

```shell
#!/bin/bash
# Find AbstractErrorResponse class definition
rg -n -A 30 "class AbstractErrorResponse" src/models/responses.py

# Also check if model_dump is overridden
rg -n "def model_dump" src/models/responses.py -A 5
```

Length of output: 1209
🏁 Script executed:

```shell
#!/bin/bash
# Check HTTPException imports
rg -n "from.*HTTPException|import.*HTTPException" src/metrics/utils.py

# Check what exceptions are imported in metrics/utils.py
rg -n "from|import" src/metrics/utils.py -B 5 -A 5 | head -50

# Look for other exception handlers for models.list() calls to see patterns
rg -n -B 5 -A 10 "models\.list\(\)" src/app/endpoints/
```

Length of output: 5243
🏁 Script executed:

```shell
#!/bin/bash
# Check exception handling context in query.py around models.list()
rg -n -B 15 -A 20 "await client.models.list\(\)" src/app/endpoints/query.py | head -80

# Check exception handling context in streaming_query.py around models.list()
rg -n -B 15 -A 30 "await client.models.list\(\)" src/app/endpoints/streaming_query.py | head -80
```

Length of output: 3369
🏁 Script executed:

```shell
#!/bin/bash
# Check the full try/except structure in query.py that contains models.list()
rg -n "try:" src/app/endpoints/query.py -A 200 | grep -A 200 "282:"

# Check the full try/except structure in streaming_query.py that contains models.list()
rg -n "try:" src/app/endpoints/streaming_query.py -A 200 | grep -A 200 "858:"
```

Length of output: 17203
🌐 Web query:
llama_stack_client models.list() exception types what errors
💡 Result:
Short answer — exceptions you should expect from client.models.list():

- Base classes:
  - llama_stack_client.APIError (base). [1][2]
  - llama_stack_client.APIConnectionError (network / I/O / httpx underlying). [1][2]
  - llama_stack_client.APIStatusError (non-2xx HTTP responses; has .status_code and .response). [1][2]
- Status-specific subclasses (raised for 4xx/5xx codes):
  - BadRequestError (400)
  - AuthenticationError (401)
  - PermissionDeniedError (403)
  - NotFoundError (404)
  - UnprocessableEntityError (422)
  - RateLimitError (429)
  - InternalServerError (>=500)
  - other / unexpected status → other APIStatusError variants. [1][2]
- Client-side parse/serialization errors / invalid data are reported as APIError subclasses (e.g., invalid response parsing). Retries: connection/timeouts, 408, 409, 429 and >=500 are retried by default (configurable). [1][2]

References: official llama-stack-client Python docs / repo and PyPI package docs. [1][2]
Add exception handling for APIStatusError and other APIError subclasses.
The error handling pattern is correct for HTTPException construction, but exception coverage is incomplete. Models.list() from llama_stack_client can raise APIConnectionError, APIStatusError (with subclasses like BadRequestError, AuthenticationError, NotFoundError, InternalServerError), and APIError. Currently, only APIConnectionError is caught at line 30. Other exceptions—particularly API status errors (4xx/5xx responses) and serialization errors—will propagate uncaught.
Add exception handling for APIStatusError and map appropriate HTTP status responses:
- 5xx errors → `ServiceUnavailableResponse`
- 4xx errors → an appropriate response type (e.g., `NotFoundResponse`, `UnauthorizedResponse`)
- Generic `APIError` → `InternalServerErrorResponse`
This is critical for metrics setup, as uncaught exceptions during initialization could crash the service startup.
🤖 Prompt for AI Agents
In src/metrics/utils.py around lines 28 to 32, the current try/except only
catches APIConnectionError from
AsyncLlamaStackClientHolder().get_client().models.list(); extend the exception
handling to also catch APIStatusError and APIError (and their subclasses) and
map them to appropriate HTTPException responses: for APIStatusError check the
status code and raise ServiceUnavailableResponse for 5xx, NotFoundResponse for
404, UnauthorizedResponse for 401/403, BadRequestResponse for 400, etc.; catch
generic APIError and raise InternalServerErrorResponse; ensure each branch
constructs the matching response object (backend_name="Llama Stack",
cause=str(e)) and raises HTTPException(**response.model_dump()) from e so
initialization failures are handled without crashing.
```python
@pytest.fixture(name="minimal_config")
def minimal_config_fixture() -> AppConfig:
    """Create a minimal AppConfig with only required fields.

    This fixture provides a minimal valid configuration that can be used
    in tests that don't need specific configuration values. It includes
    only the required fields to avoid unnecessary instantiation.

    Returns:
        AppConfig: A minimal AppConfig instance with required fields only.
    """
    cfg = AppConfig()
    cfg.init_from_dict(
        {
            "name": "test",
            "service": {"host": "localhost", "port": 8080},
            "llama_stack": {
                "api_key": "test-key",
                "url": "http://test.com:1234",
                "use_as_library_client": False,
            },
            "user_data_collection": {},
            "authentication": {"module": "noop"},
            "authorization": {"access_rules": []},
        }
    )
    return cfg
```
minimal_config leaks global AppConfig state across tests
AppConfig is a singleton; calling cfg.init_from_dict(...) here mutates the global configuration for the whole process. Because the fixture doesn’t restore the previous state, any test that uses minimal_config can affect later tests that rely on a different configuration or on “no configuration loaded”, making the suite order-dependent.
You can keep tests isolated by snapshotting and restoring the internal state in the fixture teardown.
```diff
-@pytest.fixture(name="minimal_config")
-def minimal_config_fixture() -> AppConfig:
+@pytest.fixture(name="minimal_config")
+def minimal_config_fixture() -> Generator[AppConfig, None, None]:
@@
-    cfg = AppConfig()
-    cfg.init_from_dict(
+    cfg = AppConfig()
+
+    # Snapshot current singleton state so we can restore it after the test
+    # pylint: disable=protected-access
+    original_configuration = cfg._configuration
+    original_conversation_cache = cfg._conversation_cache
+    original_quota_limiters = cfg._quota_limiters
+
+    cfg.init_from_dict(
         {
@@
-            "authorization": {"access_rules": []},
+            "authorization": {"access_rules": []},
         }
     )
-    return cfg
+    try:
+        yield cfg
+    finally:
+        # Restore previous configuration and cached resources
+        cfg._configuration = original_configuration
+        cfg._conversation_cache = original_conversation_cache
+        cfg._quota_limiters = original_quota_limiters
```

(Based on learnings about AppConfig being a singleton with cached state.)
Committable suggestion skipped: line range outside the PR's diff.
tisnik left a comment:
LGTM, very nice work!
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/authentication/k8s.py (1)
133-158: Confirm initialization of `version_data` before the try block in `_get_cluster_id`.

The review comment correctly identifies a potential `UnboundLocalError` at line 150. If `get_cluster_custom_object()` raises a `TypeError` before the variable assignment completes, `version_data` remains undefined. The subsequent log statement at line 150 would then fail with `UnboundLocalError`, bypassing the intended `ClusterIDUnavailableError` handling.

The suggested fix is valid:

```python
@classmethod
def _get_cluster_id(cls) -> str:
    version_data: dict[str, Any] | None = None
    try:
        custom_objects_api = cls.get_custom_objects_api()
        version_data = custom_objects_api.get_cluster_custom_object(
            "config.openshift.io", "v1", "clusterversions", "version"
        )
        cluster_id = version_data["spec"]["clusterID"]
        cls._cluster_id = cluster_id
        return cluster_id
    except TypeError as e:
        logger.error(
            "Failed to get cluster_id, version object is: %s", version_data
        )
        raise ClusterIDUnavailableError("Failed to get cluster ID") from e
```

Initialize `version_data` to `None` before the try block to ensure it is always defined and safe to log.

tests/unit/app/endpoints/test_streaming_query.py (1)
116-145: Fix assertion placement and connection-error mocking in streaming tests.

Two issues in the new/updated tests:

1. Assertions inside `pytest.raises` block (Lines 166-168)

   In `test_streaming_query_endpoint_handler_configuration_not_loaded`, the assertions are placed after the awaited call but still inside the `with pytest.raises(...)` block. Once `streaming_query_endpoint_handler` raises, the rest of the block is skipped, so the status/detail assertions never execute. Move the asserts outside the `with` block:

```python
with pytest.raises(HTTPException) as e:
    await streaming_query_endpoint_handler(request, query_request, auth=MOCK_AUTH)

assert e.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
assert e.value.detail["response"] == "Configuration is not loaded"  # type: ignore
```

2. Mocking the wrong attribute for connection errors (Lines 189-191)

   Production code at src/app/endpoints/streaming_query.py:862 calls `await client.models.list()`, but the test sets `mock_client.models.side_effect = APIConnectionError(...)`. That won't affect `models.list()`. Change to:

```python
mock_client.models.list.side_effect = APIConnectionError(request=query_request)  # type: ignore
```

so that the `models.list()` call actually raises and exercises the connection-error path.

The rest of the refactor in this file (real `QueryRequest` usage in multiple tests, the configuration autouse fixture, and the MCP/streaming flow assertions) looks coherent with the new error-handling design.
♻️ Duplicate comments (3)
tests/unit/app/endpoints/test_providers.py (1)
182-182: Docstring still says 500 while test now expects 503.

The test asserts `HTTP_503_SERVICE_UNAVAILABLE` and `"Unable to connect to Llama Stack"` at lines 199 and 202, but the docstring still references HTTP 500.

Apply this diff to update the docstring:

```diff
-    """Test that /providers/{provider_id} raises HTTP 500 if Llama Stack connection fails."""
+    """Test that /providers/{provider_id} raises HTTP 503 if Llama Stack connection fails."""
```

src/metrics/utils.py (1)
5-6: Metrics error handling improved; consider also catching generic `APIError`.

The new flow (`check_configuration_loaded` + catching `APIConnectionError`/`APIStatusError` and mapping them to `ServiceUnavailableResponse` via `HTTPException(**response.model_dump())`) is a solid alignment with the standardized error model.

You still don't cover generic `APIError` subclasses from `llama_stack_client` (e.g., parsing/serialization or unexpected API errors). Those will surface as generic 500s instead of the structured 503-style backend error you now use for the other failure modes. If you want full consistency, consider extending the `except` clause to include `APIError` as well and mapping it either to `ServiceUnavailableResponse` or `InternalServerErrorResponse`.

Also applies to: 9-10, 16-19, 23-32
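A sketch of the extended clause, building on the snippet quoted earlier and assuming `APIError` is the llama_stack_client base class (per the library notes above; the subclass clauses must come first):

```python
try:
    model_list = await AsyncLlamaStackClientHolder().get_client().models.list()
except (APIConnectionError, APIStatusError) as e:
    response = ServiceUnavailableResponse(backend_name="Llama Stack", cause=str(e))
    raise HTTPException(**response.model_dump()) from e
except APIError as e:
    # Parsing/serialization and other unexpected client-side failures.
    error = InternalServerErrorResponse.generic()
    raise HTTPException(**error.model_dump()) from e
```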
src/models/responses.py (1)
720-782: Fix `ConversationDeleteResponse.success` and align the "not found" example

`ConversationDeleteResponse.__init__` ignores the `deleted` flag and always sets `success=True`, so clients see `success: True` even when the conversation was not deleted. The `"not found"` example also shows `success: True` and uses a different message text than the runtime path.

Consider updating both the initializer and the example to reflect the real outcome:

```diff
 def __init__(self, *, deleted: bool, conversation_id: str) -> None:
@@
     response_msg = (
         "Conversation deleted successfully"
         if deleted
         else "Conversation cannot be deleted"
     )
     super().__init__(
         conversation_id=conversation_id,  # type: ignore[call-arg]
-        success=True,  # type: ignore[call-arg]
+        success=deleted,  # type: ignore[call-arg]
         response=response_msg,  # type: ignore[call-arg]
     )
@@
     {
         "label": "not found",
         "value": {
             "conversation_id": "123e4567-e89b-12d3-a456-426614174000",
-            "success": True,
-            "response": "Conversation can not be deleted",
+            "success": False,
+            "response": "Conversation cannot be deleted",
         },
     },
```
🧹 Nitpick comments (5)
tests/unit/app/endpoints/test_providers.py (1)
184-184: Consider removing redundant authorization mock.

Unlike other tests that use `minimal_config` (lines 43, 67, 114, 151), this test additionally calls `mock_authorization_resolvers`. Since `minimal_config` is already being used, this call may be redundant.

If the mock is unnecessary, apply this diff:

```diff
-    mocker.patch("app.endpoints.providers.configuration", minimal_config)
-    mock_authorization_resolvers(mocker)
+    mocker.patch("app.endpoints.providers.configuration", minimal_config)
```

tests/e2e/features/feedback.feature (1)
111-115: Structured error response looks good.

The change to a nested detail object with "response" and "cause" fields aligns with the PR's standardization goal.

Minor inconsistency: Line 113 ends with a period (`"Storing feedback is disabled."`) while line 112 doesn't (`"Storing feedback is disabled"`). Consider making the punctuation consistent.

src/authentication/k8s.py (1)
174-207: K8s auth flow is consistent with standardized responses; consider distinguishing backend failures from invalid tokens.

The new authentication flow looks good overall:

- `extract_user_token(request.headers)` centralizes header parsing and maps missing/malformed headers to structured `UnauthorizedResponse`s.
- `get_user_info` now maps failures to obtain the authn API to a `ServiceUnavailableResponse("Kubernetes API", ...)` via `HTTPException(**response.model_dump())`, which is appropriate.
- `K8SAuthDependency.__call__`:
  - Treats `user_info is None` as 401 via `UnauthorizedResponse(cause="Invalid or expired Kubernetes token")`.
  - Maps kube:admin cluster-ID failures to a 500 `InternalServerErrorResponse`.
  - Maps SAR API failures to a 503 `ServiceUnavailableResponse` and authorization denials to `ForbiddenResponse.endpoint`.

One nuance: in `get_user_info`, any exception raised by `auth_api.create_token_review(token_review)` is caught broadly and results in `None`, which then becomes a 401 "Invalid or expired Kubernetes token". That conflates Kubernetes API outages or server-side errors with bad credentials. Given the rest of this PR's push toward differentiating backend failures (5xx/503) from client errors (401/403), you might want to (see the sketch after this list):

- Catch `ApiException` (and possibly connection-level exceptions) from `create_token_review` and map them to a `ServiceUnavailableResponse("Kubernetes API", "Unable to validate token")` via `HTTPException`.
- Reserve the `user_info is None` → 401 path for genuinely unauthenticated tokens.

If you're comfortable preserving the previous behavior where TokenReview failures looked like invalid tokens, the current implementation is still consistent—but calling out this distinction now may save debugging time later.
Also applies to: 228-285
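A sketch of that split inside `get_user_info` (assumes `ApiException` from `kubernetes.client` is available in the module, as it is elsewhere in this file; the TokenReview status check is an assumption about the k8s response shape):

```python
try:
    response = auth_api.create_token_review(token_review)
except ApiException as e:
    # Backend outage: surface as 503 instead of pretending the token is bad.
    logger.error("API exception during TokenReview: %s", e)
    error = ServiceUnavailableResponse(
        backend_name="Kubernetes API", cause="Unable to validate token"
    )
    raise HTTPException(**error.model_dump()) from e

if not response.status.authenticated:
    return None  # genuinely unauthenticated token -> caller maps it to 401
```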
src/models/responses.py (1)
367-370: Prefer `default_factory=list` for `rag_chunks`

`rag_chunks` currently uses `Field([], ...)` as the default. While Pydantic deep-copies defaults, using `default_factory=list` is clearer and avoids any confusion about shared mutable defaults.

```diff
-    rag_chunks: list[RAGChunk] = Field(
-        [],
-        description="List of RAG chunks used to generate the response",
-    )
+    rag_chunks: list[RAGChunk] = Field(
+        default_factory=list,
+        description="List of RAG chunks used to generate the response",
+    )
```

You might apply the same pattern anywhere else you later add list defaults for consistency.
tests/unit/app/endpoints/test_shields.py (1)
170-216: Tighten shields "list" tests and remove unused `cfg` setup

In both `test_shields_endpoint_handler_unable_to_retrieve_shields_list` and `test_shields_endpoint_handler_success_with_shields_data`:

- A real `AppConfig` (`cfg`) is constructed but never used; instead, `configuration` is patched with a plain `Mock`.
- Assertions are very weak (`response is not None` / minimal shape checks), so these tests would still pass even if the handler returned an unexpected status or payload type.

Consider:

- Either patch `app.endpoints.shields.configuration` with `cfg` (for consistency with other tests), or drop the unused `cfg` creation.
- Strengthen assertions along the lines of:
  - `assert isinstance(response, ShieldsResponse)`
  - `assert response.shields == []` for the "unable_to_retrieve_shields_list" case.
  - For the "success_with_shields_data" case, assert identifiers and list length (similar to what you already do, but you can also assert the response type).

This will better validate the endpoint behavior and keep the test setup aligned with the configuration usage pattern.
Also applies to: 285-352
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
- src/authentication/k8s.py (4 hunks)
- src/authentication/utils.py (1 hunks)
- src/authorization/middleware.py (4 hunks)
- src/metrics/utils.py (2 hunks)
- src/models/responses.py (19 hunks)
- tests/e2e/features/feedback.feature (4 hunks)
- tests/unit/app/endpoints/test_config.py (4 hunks)
- tests/unit/app/endpoints/test_feedback.py (6 hunks)
- tests/unit/app/endpoints/test_providers.py (7 hunks)
- tests/unit/app/endpoints/test_shields.py (8 hunks)
- tests/unit/app/endpoints/test_streaming_query.py (6 hunks)
- tests/unit/authentication/test_noop_with_token.py (4 hunks)
- tests/unit/authentication/test_utils.py (3 hunks)
- tests/unit/models/responses/test_error_responses.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/unit/authentication/test_utils.py
- tests/unit/app/endpoints/test_config.py
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-09-02T11:09:40.404Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/features/environment.py:87-95
Timestamp: 2025-09-02T11:09:40.404Z
Learning: In the lightspeed-stack e2e tests, noop authentication tests use the default lightspeed-stack.yaml configuration, while noop-with-token tests use the Authorized tag to trigger a config swap to the specialized noop-with-token configuration file.
Applied to files:
tests/unit/authentication/test_noop_with_token.py
📚 Learning: 2025-09-02T11:15:02.411Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/test_list.txt:2-3
Timestamp: 2025-09-02T11:15:02.411Z
Learning: In the lightspeed-stack e2e tests, the Authorized tag is intentionally omitted from noop authentication tests because they are designed to test against the default lightspeed-stack.yaml configuration rather than the specialized noop-with-token configuration.
Applied to files:
tests/unit/authentication/test_noop_with_token.py
📚 Learning: 2025-08-19T08:57:27.714Z
Learnt from: onmete
Repo: lightspeed-core/lightspeed-stack PR: 417
File: src/lightspeed_stack.py:60-63
Timestamp: 2025-08-19T08:57:27.714Z
Learning: In the lightspeed-stack project, file permission hardening (chmod 0o600) for stored configuration JSON files is not required as it's not considered a security concern in their deployment environment.
Applied to files:
src/models/responses.py
🧬 Code graph analysis (10)
src/authentication/k8s.py (2)
- src/authentication/utils.py (1)
  - `extract_user_token` (8-30)
- src/models/responses.py (5)
  - `ForbiddenResponse` (1253-1358)
  - `InternalServerErrorResponse` (1544-1652)
  - `ServiceUnavailableResponse` (1655-1685)
  - `UnauthorizedResponse` (1178-1250)
  - `endpoint` (1325-1329)

tests/unit/app/endpoints/test_providers.py (2)
- src/app/endpoints/providers.py (2)
  - `get_provider_endpoint_handler` (118-160)
  - `providers_endpoint_handler` (56-92)
- src/configuration.py (2)
  - `configuration` (73-77)
  - `AppConfig` (39-181)

src/authentication/utils.py (1)
- src/models/responses.py (1)
  - `UnauthorizedResponse` (1178-1250)

tests/unit/app/endpoints/test_shields.py (4)
- src/app/endpoints/shields.py (1)
  - `shields_endpoint_handler` (42-82)
- src/configuration.py (2)
  - `configuration` (73-77)
  - `AppConfig` (39-181)
- src/models/responses.py (1)
  - `ShieldsResponse` (128-152)
- src/client.py (1)
  - `get_client` (49-55)

tests/unit/app/endpoints/test_streaming_query.py (2)
- src/configuration.py (2)
  - `configuration` (73-77)
  - `AppConfig` (39-181)
- tests/unit/utils/auth_helpers.py (1)
  - `mock_authorization_resolvers` (8-26)

src/authorization/middleware.py (2)
- src/models/responses.py (4)
  - `ForbiddenResponse` (1253-1358)
  - `InternalServerErrorResponse` (1544-1652)
  - `generic` (1599-1604)
  - `endpoint` (1325-1329)
- src/authorization/resolvers.py (3)
  - `check_access` (123-124)
  - `check_access` (134-138)
  - `check_access` (171-188)

src/metrics/utils.py (3)
- src/configuration.py (1)
  - `configuration` (73-77)
- src/models/responses.py (1)
  - `ServiceUnavailableResponse` (1655-1685)
- src/utils/endpoints.py (1)
  - `check_configuration_loaded` (116-130)

src/models/responses.py (2)
- src/quota/quota_exceed_error.py (1)
  - `QuotaExceedError` (6-38)
- src/models/config.py (3)
  - `config` (140-146)
  - `Action` (329-377)
  - `Configuration` (623-649)

tests/unit/models/responses/test_error_responses.py (3)
- src/models/responses.py (23)
  - `AbstractErrorResponse` (1087-1140)
  - `BadRequestResponse` (1143-1175)
  - `DetailModel` (1080-1084)
  - `ForbiddenResponse` (1253-1358)
  - `InternalServerErrorResponse` (1544-1652)
  - `QuotaExceededResponse` (1461-1541)
  - `ServiceUnavailableResponse` (1655-1685)
  - `UnauthorizedResponse` (1178-1250)
  - `UnprocessableEntityResponse` (1419-1458)
  - `openapi_response` (45-58)
  - `openapi_response` (788-812)
  - `openapi_response` (1117-1140)
  - `conversation` (1313-1322)
  - `endpoint` (1325-1329)
  - `feedback_disabled` (1332-1337)
  - `model` (1522-1526)
  - `from_exception` (1529-1533)
  - `generic` (1599-1604)
  - `configuration_not_loaded` (1607-1612)
  - `feedback_path_invalid` (1615-1620)
  - `query_failed` (1623-1628)
  - `cache_unavailable` (1631-1636)
  - `database_error` (1639-1644)
- src/quota/quota_exceed_error.py (1)
  - `QuotaExceedError` (6-38)
- docs/demos/lcore/js/reveal.js (1)
  - `result` (2444-2444)

tests/unit/app/endpoints/test_feedback.py (5)
- src/app/endpoints/feedback.py (6)
  - `assert_feedback_enabled` (74-89)
  - `feedback_endpoint_handler` (94-123)
  - `feedback_status` (159-173)
  - `is_feedback_enabled` (61-71)
  - `store_feedback` (126-155)
  - `update_feedback_status` (178-215)
- src/configuration.py (2)
  - `configuration` (73-77)
  - `AppConfig` (39-181)
- src/models/config.py (2)
  - `config` (140-146)
  - `UserDataCollection` (223-256)
- src/models/requests.py (2)
  - `FeedbackRequest` (252-395)
  - `FeedbackStatusUpdateRequest` (398-423)
- tests/unit/utils/auth_helpers.py (1)
  - `mock_authorization_resolvers` (8-26)
🪛 GitHub Check: Bandit
src/models/responses.py
[warning] 1032-1033: Probable insecure usage of temp file/directory.
[warning] 1032-1033: Probable insecure usage of temp file/directory.
[warning] 1039-1039: Probable insecure usage of temp file/directory.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: e2e_tests (azure)
- GitHub Check: e2e_tests (ci)
- GitHub Check: build-pr
🔇 Additional comments (10)
tests/e2e/features/feedback.feature (3)
262-262: Correct status code for authentication failure.

The change from 400 to 401 is correct per HTTP semantics—401 (Unauthorized) is the appropriate status for missing or invalid authentication credentials. The structured error response with distinct "response" (general) and "cause" (specific) fields provides clearer diagnostic information.
Also applies to: 267-268
277-277: Consistent unauthorized handling.

The status code correction to 401 and structured error format match the previous unauthorized scenario, ensuring consistency across the feature file.
Also applies to: 282-283
308-309: Error messages verified—no issues found.

The E2E test error messages match the backend implementation. The backend's `InternalServerErrorResponse.feedback_path_invalid()` method generates exactly the same messages: "Failed to store feedback" with cause "Failed to store feedback at directory: {path}". Unit and integration tests confirm this format is correctly returned by the endpoint.

src/authorization/middleware.py (1)
4-24: Standardized authorization error responses look correct.

Using `InternalServerErrorResponse.generic()` for unknown auth modules / missing `auth` and `ForbiddenResponse.endpoint(user_id=auth[0])` for access denial aligns with the new error model and keeps semantics clear. The `HTTPException(**response.model_dump())` pattern here is fine as long as the response model continues to dump `status_code` and `detail` in the shape FastAPI expects (a sketch of the pattern follows below).

Also applies to: 66-69, 84-85, 93-94
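For reference, a minimal sketch of that raise pattern, assuming the error models dump exactly the `status_code`/`detail` keywords `HTTPException` accepts:

```python
from fastapi import HTTPException

from models.responses import ForbiddenResponse  # import path assumed


def deny_access(user_id: str) -> None:
    response = ForbiddenResponse.endpoint(user_id=user_id)
    # Unpacks to HTTPException(status_code=403, detail={"response": ..., "cause": ...})
    raise HTTPException(**response.model_dump())
```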
tests/unit/app/endpoints/test_streaming_query.py (1)
1715-1755: Quota-exceeded test expectations align with the new error response.

`test_query_endpoint_quota_exceeded` correctly asserts HTTP 429 and checks that the structured detail includes the updated `"response": "The model quota has been exceeded"` text and mentions the model name in `"cause"`. This matches the standardized quota-exceeded error semantics introduced in the PR.

tests/unit/app/endpoints/test_feedback.py (2)
33-52: Feedback tests correctly exercise new configuration and structured error responses.
- `test_is_feedback_enabled` / `test_is_feedback_disabled` and the new `feedback_status_*` tests mock `AppConfig._configuration.user_data_collection` with `UserDataCollection` and verify that `is_feedback_enabled` / `feedback_status` return the expected booleans and status payloads; this mirrors how production code reads configuration.
- `test_assert_feedback_enabled_disabled` now asserts that the 403 detail is a dict with `"response": "Storing feedback is disabled"` and `"cause": "Storing feedback is disabled."`, matching `ForbiddenResponse.feedback_disabled()`.
- `test_feedback_endpoint_handler_error` and `test_store_feedback_on_io_error` both simulate filesystem failures and correctly assert a 500 with detail containing `"response": "Failed to store feedback"` and a `"cause"` mentioning the directory, in line with `InternalServerErrorResponse.feedback_path_invalid(...)`.
- `test_store_feedback` and `test_feedback_endpoint_valid_requests` validate successful persistence (and the stored JSON shape) and successful endpoint behavior for various payloads.

These updates keep the tests aligned with the new standardized response models without over-mocking production behavior.
Also applies to: 65-67, 123-155, 177-245, 247-285, 296-310, 312-345
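For reference, a hedged sketch of the configuration-mocking pattern these tests follow; the import paths, the patch target, and the no-argument `is_feedback_enabled()` signature are assumptions drawn from this review, not the exact test code:

```python
from models.config import UserDataCollection  # import path assumed
from app.endpoints.feedback import is_feedback_enabled


def test_is_feedback_disabled_sketch(mocker):
    # Swap in a real UserDataCollection with feedback turned off, so the
    # function under test reads configuration the same way production does.
    udc = UserDataCollection(feedback_enabled=False)
    mocker.patch(
        "app.endpoints.feedback.configuration.user_data_collection",
        new=udc,
    )
    assert is_feedback_enabled() is False
```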
55-77: The review comment is based on incorrect assumptions about the pytest configuration.

The codebase has `asyncio_mode = "auto"` enabled in `pyproject.toml` (under `[tool.pytest.ini_options]`). With this setting, pytest-asyncio automatically handles async tests without requiring the `@pytest.mark.asyncio` decorator. The tests `test_assert_feedback_enabled_disabled` and `test_assert_feedback_enabled` will be properly awaited and executed, despite lacking the decorator.

While there is some stylistic inconsistency in the file (some async tests have the decorator, some don't), this has no functional impact on test execution. The review's warning only applies if `asyncio_mode` were set to something other than `"auto"`, which is not the case here.

Likely an incorrect or invalid review comment.
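A hypothetical illustration (not one of the tests named above) of why the decorator is optional under this configuration:

```python
import asyncio

# With asyncio_mode = "auto" under [tool.pytest.ini_options] in pyproject.toml,
# pytest-asyncio collects and awaits coroutine tests without any marker.


async def test_runs_without_decorator():
    await asyncio.sleep(0)  # no @pytest.mark.asyncio needed in auto mode
    assert True
```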
src/authentication/utils.py (1)
5-28: `extract_user_token` error handling now matches the documented contract

The function's behavior (raising a structured 401 `HTTPException` via `UnauthorizedResponse` for missing/malformed Authorization headers) is consistent with the updated docstring and the response model. No further changes needed here.
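A minimal sketch of that contract, assuming the repo's response-model import path; the header parsing and message strings here are illustrative, not the exact source:

```python
from fastapi import HTTPException
from starlette.datastructures import Headers

from models.responses import UnauthorizedResponse  # import path assumed


def extract_user_token(headers: Headers) -> str:
    authorization = headers.get("Authorization")
    if authorization is None:
        # Missing header -> structured 401 rather than a bare string detail.
        response = UnauthorizedResponse(cause="No Authorization header found")
        raise HTTPException(**response.model_dump())
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "bearer" or not token:
        # Malformed header -> same structured 401 shape with a specific cause.
        response = UnauthorizedResponse(cause="No token found in Authorization header")
        raise HTTPException(**response.model_dump())
    return token
```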
tests/unit/authentication/test_noop_with_token.py (1)

67-88: Tests correctly assert 401 and structured `detail` for noop-with-token auth failures

The updated tests now verify both the 401 status code and the `{response, cause}` structure on `HTTPException.detail`, matching `UnauthorizedResponse` behavior. The use of `cast(dict[str, str], exc_info.value.detail)` is appropriate for type checking.

Also applies to: 108-111
tests/unit/models/responses/test_error_responses.py (1)
1-687: Strong, comprehensive coverage for error response models

This test module does a good job exercising constructors, factory helpers, and `openapi_response()` for all error types, including edge cases like missing labels. It tightly couples tests to the documented examples and status codes, which should make future regressions in error payloads or OpenAPI docs very visible.
```diff
     with pytest.raises(HTTPException) as e:
         await shields_endpoint_handler(request=request, auth=auth)
         assert e.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
-        assert e.detail["response"] == "Configuration is not loaded"
+        assert e.value.detail["response"] == "Configuration is not loaded"  # type: ignore
```
Move assertions outside the pytest.raises block
In test_shields_endpoint_handler_configuration_not_loaded, the assertions are inside the with pytest.raises(...) block. Once shields_endpoint_handler raises, the remaining lines in the block are not executed, so the status and detail aren’t actually checked.
Consider:
```diff
     with pytest.raises(HTTPException) as e:
         await shields_endpoint_handler(request=request, auth=auth)
-        assert e.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
-        assert e.value.detail["response"] == "Configuration is not loaded"  # type: ignore
+    assert e.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
+    assert e.value.detail["response"] == "Configuration is not loaded"  # type: ignore
```
+ assert e.value.detail["response"] == "Configuration is not loaded" # type: ignore🤖 Prompt for AI Agents
In tests/unit/app/endpoints/test_shields.py around lines 39-43, the test places
assertions inside the with pytest.raises(HTTPException) block so they never run
after the exception is raised; move the assert lines outside the with block: use
the context manager only to capture the exception (e.g., with pytest.raises(...)
as e: await shields_endpoint_handler(...)), then after the block assert
e.value.status_code and e.value.detail["response"] == "Configuration is not
loaded" so the exception is both raised and validated.
```diff
     with pytest.raises(HTTPException) as e:
         await shields_endpoint_handler(request=request, auth=auth)
-        assert e.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
-        assert e.detail["response"] == "Unable to connect to Llama Stack"
+        assert e.value.status_code == status.HTTP_503_SERVICE_UNAVAILABLE
+        assert e.value.detail["response"] == "Service unavailable"  # type: ignore
+        assert "Unable to connect to Llama Stack" in e.value.detail["cause"]  # type: ignore
```
Same pytest.raises indentation issue in connection-error test
test_shields_endpoint_llama_stack_connection_error has the same pattern: assertions live inside the with pytest.raises(...) block and are never executed after the exception is raised.
Suggested fix:
```diff
     with pytest.raises(HTTPException) as e:
         await shields_endpoint_handler(request=request, auth=auth)
-        assert e.value.status_code == status.HTTP_503_SERVICE_UNAVAILABLE
-        assert e.value.detail["response"] == "Service unavailable"  # type: ignore
-        assert "Unable to connect to Llama Stack" in e.value.detail["cause"]  # type: ignore
+    assert e.value.status_code == status.HTTP_503_SERVICE_UNAVAILABLE
+    assert e.value.detail["response"] == "Service unavailable"  # type: ignore
+    assert "Unable to connect to Llama Stack" in e.value.detail["cause"]  # type: ignore
```
+ assert "Unable to connect to Llama Stack" in e.value.detail["cause"] # type: ignore📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| with pytest.raises(HTTPException) as e: | |
| await shields_endpoint_handler(request=request, auth=auth) | |
| assert e.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR | |
| assert e.detail["response"] == "Unable to connect to Llama Stack" | |
| assert e.value.status_code == status.HTTP_503_SERVICE_UNAVAILABLE | |
| assert e.value.detail["response"] == "Service unavailable" # type: ignore | |
| assert "Unable to connect to Llama Stack" in e.value.detail["cause"] # type: ignore | |
| with pytest.raises(HTTPException) as e: | |
| await shields_endpoint_handler(request=request, auth=auth) | |
| assert e.value.status_code == status.HTTP_503_SERVICE_UNAVAILABLE | |
| assert e.value.detail["response"] == "Service unavailable" # type: ignore | |
| assert "Unable to connect to Llama Stack" in e.value.detail["cause"] # type: ignore |
🤖 Prompt for AI Agents
In tests/unit/app/endpoints/test_shields.py around lines 271 to 276, the
assertions for the connection-error test are placed inside the with
pytest.raises(HTTPException) block so they never run; move the awaits that are
expected to raise into the with block, then dedent the assert lines so they
execute after the context manager (i.e., use with pytest.raises(...) as e: await
shields_endpoint_handler(...), then on the next lines outside the with assert
e.value.status_code == ..., assert e.value.detail["response"] == ..., and assert
"Unable to connect to Llama Stack" in e.value.detail["cause"]).
Description
Type of change
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
Bug Fixes
Improvements
New Features