RSPEED-2827: validate model exists before rlsapi_v1 inference by major · Pull Request #1471 · lightspeed-core/lightspeed-stack

major · 2026-04-07T17:37:09Z

Description

The rlsapi_v1 /infer endpoint skips model existence validation when
default_model and default_provider are both configured. If either
value has a typo or the model was removed from Llama Stack, inference
fails with an opaque 500 instead of a useful error.

This adds check_model_configured() before inference, returning a
clean 404 with NotFoundResponse when the model isn't found. Matches
the existing pattern in responses.py (line 208).

Validation is extracted into _resolve_validated_model_id() so
infer_endpoint complexity stays at B(9).

Type of change

Bug fix

Tools used to create PR

Assisted-by: Claude (opencode)
Generated by: N/A

Related Tickets & Documents

Related Issue: RSPEED-2827

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

uv run pytest tests/unit/app/endpoints/test_rlsapi_v1.py - 64 tests pass
uv run make verify - all linters clean
uv run radon cc src/app/endpoints/rlsapi_v1.py -s -n C - no C+ ratings
New test test_infer_model_not_found_returns_404 verifies the 404 path

Summary by CodeRabbit

Bug Fixes
- Inference requests now validate that the selected model exists and return a clear 404 if the model is not available.
Documentation
- The infer endpoint’s API documentation now includes the 404 "model not found" response.
Tests
- Added tests and fixtures to verify the 404 behavior when a model is missing.

coderabbitai · 2026-04-07T17:37:26Z

Walkthrough

Adds model-existence validation to the inference endpoint: a helper resolves the default model ID and checks it with Llama Stack, causing the endpoint to return 404 if the model is not configured. OpenAPI responses updated; unit and integration tests mock and verify the new behavior.

Changes

Cohort / File(s)	Summary
Endpoint — model validation `src/app/endpoints/rlsapi_v1.py`	Added `_resolve_validated_model_id()` which calls `_get_default_model_id()` and `check_model_configured()`; `infer` endpoint now uses this helper and short-circuits with a 404 `NotFoundResponse` when the model is not configured. Added `NotFoundResponse` import and extended OpenAPI `infer_responses` with `404`.
Unit tests — infer 404 case & autouse fixture `tests/unit/app/endpoints/test_rlsapi_v1.py`	Added an `autouse` fixture `mock_model_configured` that patches `check_model_configured` to return `True` by default; added `test_infer_model_not_found_returns_404` which patches it to `False` and asserts the endpoint raises `HTTPException` 404 with expected detail.
Integration tests — autouse fixture `tests/integration/endpoints/test_rlsapi_v1_integration.py`	Added an `autouse` fixture `mock_model_configured` (AsyncMock returning `True`) so integration tests bypass model-existence checks by default.

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant API as API (rlsapi_v1)
  participant Llama as Llama Stack

  Client->>API: POST /infer (no model specified)
  API->>API: _get_default_model_id()
  API->>Llama: check_model_configured(model_id)
  alt model configured
    Llama-->>API: true
    API->>API: proceed with inference flow
    API-->>Client: 200 response (inference result)
  else model not configured
    Llama-->>API: false
    API-->>Client: 404 NotFoundResponse (model not found)
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and specifically describes the main change: adding model validation before rlsapi_v1 inference to return 404 errors instead of opaque 500s.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

✨ Simplify code

Create PR with simplified code

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/app/endpoints/rlsapi_v1.py`:
- Line 695: The OpenAPI responses for infer_endpoint are missing the 404 path
that can be returned by _resolve_validated_model_id(); update the
infer_responses declaration to include a 404 response (e.g., status code "404"
with an appropriate description and error schema or reference used by other
endpoints) so the infer_endpoint contract advertises "model not found" responses
returned by _resolve_validated_model_id().

In `@tests/unit/app/endpoints/test_rlsapi_v1.py`:
- Around line 563-565: The test currently asserts on a lowercased string of the
exception detail which is brittle; update the assertions in test_rlsapi_v1
(around the failing block using exc_info) to treat exc_info.value.detail as a
structured dict and assert on specific fields: check that detail["response"]
equals status.HTTP_404_NOT_FOUND (or the expected response text) and assert
detail["cause"] contains or equals the expected cause message (use
case-insensitive comparison if needed). Reference the existing exc_info variable
and the detail attribute to locate and replace the two string-based asserts with
direct assertions on detail["response"] and detail["cause"].

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: ca91af11-5b3a-416c-8a7b-84a0c1291f51

📥 Commits

Reviewing files that changed from the base of the PR and between 6767256 and 153d6ae.

📒 Files selected for processing (2)

src/app/endpoints/rlsapi_v1.py
tests/unit/app/endpoints/test_rlsapi_v1.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: E2E Tests for Lightspeed Evaluation job
GitHub Check: build-pr
GitHub Check: E2E: server mode / ci
GitHub Check: E2E: library mode / ci

🧰 Additional context used

📓 Path-based instructions (4)

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async unit tests

Files:

tests/unit/app/endpoints/test_rlsapi_v1.py

tests/unit/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use pytest-mock for AsyncMock objects in unit tests

Files:

tests/unit/app/endpoints/test_rlsapi_v1.py

src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules (e.g., from authentication import get_auth_dependency)
Use from llama_stack_client import AsyncLlamaStackClient for Llama Stack imports
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Use complete type annotations for function parameters and return types
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional types in type annotations
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use complete type annotations for all class attributes; avoid using Any
Follow Google Python docstring conventions for all modules, classes, and functions
Include Parameters:, Returns:, Raises: sections in function docstrings as needed

Files:

src/app/endpoints/rlsapi_v1.py

src/app/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/app/**/*.py: Use from fastapi import APIRouter, HTTPException, Request, status, Depends for FastAPI dependencies
Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

src/app/endpoints/rlsapi_v1.py

🧠 Learnings (3)

📓 Common learnings

Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1469
File: src/models/config.py:1928-1933
Timestamp: 2026-04-07T14:44:42.022Z
Learning: In lightspeed-core/lightspeed-stack, `allow_verbose_infer` (previously `customization.allow_verbose_infer`, now `rlsapi_v1.allow_verbose_infer`) is only used internally by the `rlsapi_v1` `/infer` endpoint and has a single known consumer (the PR author). Backward compatibility for this config field relocation is intentionally not required and should not be flagged in future reviews.

📚 Learning: 2026-04-05T12:19:36.009Z

Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to tests/unit/**/*.py : Use `pytest-mock` for AsyncMock objects in unit tests

Applied to files:

tests/unit/app/endpoints/test_rlsapi_v1.py

📚 Learning: 2026-04-06T20:18:07.852Z

Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.

Applied to files:

src/app/endpoints/rlsapi_v1.py

🪛 GitHub Actions: Integration tests

src/app/endpoints/rlsapi_v1.py

[error] 256-256: infer_endpoint failed while resolving validated model id: 'if not await check_model_configured(client, model_id)' raised TypeError from mocked client.

[warning] authentication.noop: No-op authentication dependency in insecure mode (development-only).

🔇 Additional comments (2)

tests/unit/app/endpoints/test_rlsapi_v1.py (1)

149-160: Good default isolation for model validation path.

The autouse fixture is a solid baseline and correctly uses pytest-mock AsyncMock, making endpoint tests deterministic after the new validation call.

src/app/endpoints/rlsapi_v1.py (1)

254-257: No changes needed. The code already properly handles validation-time connection failures.

check_model_configured() explicitly catches APIConnectionError and converts it to HTTPException with ServiceUnavailableResponse (503). When _resolve_validated_model_id() calls this function, the exception is already handled within check_model_configured(), so any APIConnectionError will propagate as an HTTPException before reaching the caller. The docstring correctly documents that HTTPException: 503 can be raised during validation, and the endpoint's response schema includes the 503 status code.

The implementation is correct and compliant with the coding guidelines for handling APIConnectionError from Llama Stack.

Add check_model_configured() call to the /infer handler so a misconfigured default_model/default_provider gets a clear 404 instead of an opaque 500 from the inference call. Matches the existing pattern in responses.py. Extract validation into _resolve_validated_model_id() to keep infer_endpoint complexity at B(9). Signed-off-by: Major Hayden <major@redhat.com>

tisnik

LGTM

tisnik · 2026-04-08T16:42:38Z

@coderabbitai review

coderabbitai · 2026-04-08T16:42:46Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unit/app/endpoints/test_rlsapi_v1.py`:
- Around line 538-565: The test test_infer_model_not_found_returns_404 should
also assert that LLM inference is never invoked; patch the internal _call_llm
used by infer_endpoint (e.g. patch "app.endpoints.rlsapi_v1._call_llm") with a
mock/AsyncMock before calling infer_endpoint, then after the HTTPException
assertion add an assertion that that mock was not called (or not awaited) to
prove the function exited early when check_model_configured returned False.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f940374a-2e86-4aea-b5a1-1105785b0177

📥 Commits

Reviewing files that changed from the base of the PR and between 153d6ae and 0984fb5.

📒 Files selected for processing (3)

src/app/endpoints/rlsapi_v1.py
tests/integration/endpoints/test_rlsapi_v1_integration.py
tests/unit/app/endpoints/test_rlsapi_v1.py

📜 Review details

🧰 Additional context used

📓 Path-based instructions (4)

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async unit tests

Files:

tests/integration/endpoints/test_rlsapi_v1_integration.py
tests/unit/app/endpoints/test_rlsapi_v1.py

tests/unit/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Use pytest-mock for AsyncMock objects in unit tests

Files:

tests/unit/app/endpoints/test_rlsapi_v1.py

src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules (e.g., from authentication import get_auth_dependency)
Use from llama_stack_client import AsyncLlamaStackClient for Llama Stack imports
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Use complete type annotations for function parameters and return types
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional types in type annotations
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use complete type annotations for all class attributes; avoid using Any
Follow Google Python docstring conventions for all modules, classes, and functions
Include Parameters:, Returns:, Raises: sections in function docstrings as needed

Files:

src/app/endpoints/rlsapi_v1.py

src/app/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/app/**/*.py: Use from fastapi import APIRouter, HTTPException, Request, status, Depends for FastAPI dependencies
Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

src/app/endpoints/rlsapi_v1.py

🧠 Learnings (3)

📓 Common learnings

Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1469
File: src/models/config.py:1928-1933
Timestamp: 2026-04-07T14:44:42.022Z
Learning: In lightspeed-core/lightspeed-stack, `allow_verbose_infer` (previously `customization.allow_verbose_infer`, now `rlsapi_v1.allow_verbose_infer`) is only used internally by the `rlsapi_v1` `/infer` endpoint and has a single known consumer (the PR author). Backward compatibility for this config field relocation is intentionally not required and should not be flagged in future reviews.

📚 Learning: 2026-04-05T12:19:36.009Z

Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to tests/unit/**/*.py : Use `pytest-mock` for AsyncMock objects in unit tests

Applied to files:

tests/integration/endpoints/test_rlsapi_v1_integration.py
tests/unit/app/endpoints/test_rlsapi_v1.py

📚 Learning: 2026-04-06T20:18:07.852Z

Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.

Applied to files:

src/app/endpoints/rlsapi_v1.py

🔇 Additional comments (3)

tests/integration/endpoints/test_rlsapi_v1_integration.py (1)

93-99: Good isolation of model-validation side effects in integration tests.

This keeps existing integration coverage deterministic while the new /infer validation behavior is exercised in unit tests.

tests/unit/app/endpoints/test_rlsapi_v1.py (1)

149-159: Solid default fixture for new model validation behavior.

Using an autouse check_model_configured=True baseline keeps unrelated unit tests focused and stable.

src/app/endpoints/rlsapi_v1.py (1)

242-261: Nice bug fix: explicit model validation before inference with clear 404 semantics.

The new _resolve_validated_model_id() flow plus endpoint integration cleanly converts misconfigured models into deterministic API behavior.

Also applies to: 696-696

coderabbitai · 2026-04-08T16:47:11Z

+@pytest.mark.asyncio
+async def test_infer_model_not_found_returns_404(
+    mocker: MockerFixture,
+    mock_configuration: AppConfig,
+    mock_llm_response: None,
+    mock_auth_resolvers: None,
+    mock_request_factory: Callable[..., Any],
+    mock_background_tasks: Any,
+) -> None:
+    """Test /infer returns HTTP 404 when configured model does not exist in Llama Stack."""
+    mocker.patch(
+        "app.endpoints.rlsapi_v1.check_model_configured",
+        new=mocker.AsyncMock(return_value=False),
+    )
+
+    infer_request = RlsapiV1InferRequest(question="How do I list files?")
+    mock_request = mock_request_factory()
+
+    with pytest.raises(HTTPException) as exc_info:
+        await infer_endpoint(
+            infer_request=infer_request,
+            request=mock_request,
+            background_tasks=mock_background_tasks,
+            auth=MOCK_AUTH,
+        )
+    assert exc_info.value.status_code == status.HTTP_404_NOT_FOUND
+    assert exc_info.value.detail["response"] == "Model not found"  # type: ignore[index]
+    assert "gpt-4-turbo" in exc_info.value.detail["cause"]  # type: ignore[index]


🧹 Nitpick | 🔵 Trivial

Strengthen the 404 test by asserting LLM inference is not called.

The test validates error shape/status, but it should also prove early exit before _call_llm.

Suggested test hardening

async def test_infer_model_not_found_returns_404( mocker: MockerFixture, mock_configuration: AppConfig, - mock_llm_response: None, mock_auth_resolvers: None, mock_request_factory: Callable[..., Any], mock_background_tasks: Any, ) -> None: @@ mocker.patch( "app.endpoints.rlsapi_v1.check_model_configured", new=mocker.AsyncMock(return_value=False), ) + mock_call_llm = mocker.patch( + "app.endpoints.rlsapi_v1._call_llm", + new=mocker.AsyncMock(), + ) @@ assert exc_info.value.status_code == status.HTTP_404_NOT_FOUND assert exc_info.value.detail["response"] == "Model not found" # type: ignore[index] assert "gpt-4-turbo" in exc_info.value.detail["cause"] # type: ignore[index] + mock_call_llm.assert_not_called()

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/app/endpoints/test_rlsapi_v1.py` around lines 538 - 565, The test test_infer_model_not_found_returns_404 should also assert that LLM inference is never invoked; patch the internal _call_llm used by infer_endpoint (e.g. patch "app.endpoints.rlsapi_v1._call_llm") with a mock/AsyncMock before calling infer_endpoint, then after the HTTPException assertion add an assertion that that mock was not called (or not awaited) to prove the function exited early when check_model_configured returned False.

coderabbitai Bot reviewed Apr 7, 2026

View reviewed changes

Comment thread src/app/endpoints/rlsapi_v1.py

Comment thread tests/unit/app/endpoints/test_rlsapi_v1.py

major force-pushed the rlsapi-v1-model-validation branch from 153d6ae to 6c6a083 Compare April 7, 2026 17:44

major force-pushed the rlsapi-v1-model-validation branch from 6c6a083 to 0984fb5 Compare April 7, 2026 17:45

tisnik approved these changes Apr 8, 2026

View reviewed changes

coderabbitai Bot reviewed Apr 8, 2026

View reviewed changes

tisnik merged commit db40e0b into lightspeed-core:main Apr 8, 2026
22 of 24 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RSPEED-2827: validate model exists before rlsapi_v1 inference#1471

RSPEED-2827: validate model exists before rlsapi_v1 inference#1471
tisnik merged 1 commit into
lightspeed-core:mainfrom
major:rlsapi-v1-model-validation

major commented Apr 7, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 7, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

tisnik left a comment

Uh oh!

tisnik commented Apr 8, 2026

Uh oh!

coderabbitai Bot commented Apr 8, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Apr 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

major commented Apr 7, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Tools used to create PR

Related Tickets & Documents

Checklist before requesting a review

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

tisnik left a comment

Choose a reason for hiding this comment

Uh oh!

tisnik commented Apr 8, 2026

Uh oh!

coderabbitai Bot commented Apr 8, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

major commented Apr 7, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 7, 2026 •

edited

Loading