
Conversation


@omertuc omertuc commented Aug 11, 2025

Description

In 3f7ed75 we accidentally stored the model ID as "<provider>/<model>", which is not what we want. We want to store the model ID as-is so that, when the user omits it in a later request, we can reuse the stored value without any string manipulation.
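
In short, the intended contract is (simplified sketch; the real helper in src/app/endpoints/query.py also validates the requested model/provider against the available models):

def select_model_and_provider_id(models, model_id, provider_id):
    # Qualified form, used only for the LlamaStack call.
    llama_stack_model_id = f"{provider_id}/{model_id}"
    # The plain model_id/provider_id are what get persisted, so a later request
    # that omits them can reuse the stored values without string manipulation.
    return llama_stack_model_id, model_id, provider_id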

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

3f7ed75

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Modified the unit tests. Verified that omitting the model ID and provider ID while providing the conversation ID now works as expected.

Summary by CodeRabbit

  • New Features

    • None; no user-facing changes.
  • Refactor

    • Improved model selection to use a provider-qualified model identifier for response retrieval in query and streaming endpoints while preserving existing identifiers for logging and persistence.
    • No changes to public APIs.
  • Tests

    • Updated unit tests to unpack and validate the new triple return (qualified model ID, model ID, provider ID).
    • Adjusted mocks and assertions to cover the qualified model identifier behavior.


coderabbitai bot commented Aug 11, 2025

Walkthrough

The model selection utility now returns (llama_stack_model_id, model_id, provider_id). Both query and streaming_query handlers unpack the triple and pass llama_stack_model_id to retrieve_response, while continuing to use model_id/provider_id for metrics and persistence. Tests were updated to reflect the new return signature and assertions.
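
A condensed sketch of the call-site change described above (names taken from this summary; the remaining retrieve_response arguments are elided):

llama_stack_model_id, model_id, provider_id = select_model_and_provider_id(
    model_list, query_request.model, query_request.provider
)
# The fully-qualified id is what goes to LlamaStack...
response = retrieve_response(client, llama_stack_model_id, query_request, token)
# ...while the plain model_id and provider_id keep feeding metrics and persistence.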

Changes

  • Endpoint: query (src/app/endpoints/query.py): select_model_and_provider_id now returns (llama_stack_model_id, model_id, provider_id). Call sites updated to unpack three values and use llama_stack_model_id for retrieve_response. Return annotations and internal returns adjusted accordingly.
  • Endpoint: streaming_query (src/app/endpoints/streaming_query.py): updated to unpack three values from select_model_and_provider_id; retrieve_response now uses llama_stack_model_id. No public signature changes.
  • Unit tests: query (tests/unit/app/endpoints/test_query.py): tests updated to destructure three values and assert on llama_stack_model_id alongside model_id and provider_id. Mocks adjusted to return triples.
  • Unit tests: streaming_query (tests/unit/app/endpoints/test_streaming_query.py): test patches updated to return triples from select_model_and_provider_id; assertions and tuple unpacking adapted.

Sequence Diagram(s)

sequenceDiagram
  actor Client
  participant API as Query/Streaming Handlers
  participant Selector as select_model_and_provider_id
  participant Retriever as retrieve_response

  Client->>API: request(model_id?, provider_id?)
  API->>Selector: select_model_and_provider_id(models, model_id, provider_id)
  Selector-->>API: (llama_stack_model_id, model_id, provider_id)
  API->>Retriever: retrieve_response(model_id=llama_stack_model_id, ...)
  Retriever-->>API: response
  API-->>Client: response (metrics use model_id/provider_id)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~15–20 minutes

Suggested labels

bug

Suggested reviewers

  • manstis
  • tisnik

Poem

In burrows of code I twitch my ear,
A triple hops where a double was near.
Llama tracks guide the model’s way,
While metrics nibble on IDs each day.
Thump-thump! Retrieval’s on the right trail—
Provider/model never fails. 🐇✨


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🔭 Outside diff range comments (1)
src/app/endpoints/query.py (1)

263-272: Bug: fallback returns provider-prefixed identifier as model_id when selecting first LLM

When no model/provider hints are provided and you pick the first available LLM, you set:

  • model_id = model.identifier
  • return model_id, model_id, provider_id

If model.identifier is "provider/model", this persists "provider/model" as the plain model_id, reintroducing the bug this PR aims to fix. Extract the plain model name and return the proper triple.

Apply this diff:

         try:
             model = next(
                 m
                 for m in models
                 if m.model_type == "llm"  # pyright: ignore[reportAttributeAccessIssue]
             )
-            model_id = model.identifier
-            provider_id = model.provider_id
-            logger.info("Selected model: %s", model)
-            return model_id, model_id, provider_id
+            # Fully-qualified identifier as exposed by LlamaStack
+            llama_stack_model_id = model.identifier
+            provider_id = model.provider_id
+            # Persist the plain model id (without provider prefix) to DB/metrics
+            plain_model_id = llama_stack_model_id.split("/", 1)[-1]
+            logger.info("Selected model: %s", model)
+            return llama_stack_model_id, plain_model_id, provider_id

This ensures:

  • llama_stack_model_id: provider/model
  • model_id (plain): model
  • provider_id: provider
🧹 Nitpick comments (7)
src/app/endpoints/streaming_query.py (1)

424-429: Triple unpack aligns with the new contract

Destructuring (llama_stack_model_id, model_id, provider_id) here is correct and keeps model_id/provider_id available for persistence/metrics.

Add a brief inline comment to clarify the semantics of each tuple element to avoid future misuse.
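
For example, something like (exact wording up to the author):

(
    llama_stack_model_id,  # "<provider>/<model>", passed to retrieve_response
    model_id,              # plain model name, used for metrics and persistence
    provider_id,           # provider name, used for metrics and persistence
) = select_model_and_provider_id(model_list, query_request.model, query_request.provider)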

src/app/endpoints/query.py (2)

242-243: Signature now returns a triple (llama_stack_model_id, model_id, provider_id)

Type hint update looks good. Consider clarifying the docstring to explicitly define each element’s semantics (full vs. plain identifiers).
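
For instance, the Returns section could read (illustrative wording only):

    Returns:
        llama_stack_model_id: fully-qualified "<provider>/<model>" identifier
            used for LlamaStack calls.
        model_id: plain model name, persisted and used in metrics.
        provider_id: provider name, persisted and used in metrics.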


300-301: Return ordering and values are correct in the validated path

Returning (llama_stack_model_id, model_id, provider_id) after validation matches the new contract and preserves the plain model_id for persistence.

Optionally, defensively guard against model_id containing '/' here by normalizing it before persistence to avoid future regressions if callers pass a provider-prefixed model.
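
A possible sketch of that optional guard before returning/persisting:

# Defensive normalization; only relevant if a caller ever passes a
# provider-prefixed model id.
if "/" in model_id:
    model_id = model_id.split("/", 1)[1]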

tests/unit/app/endpoints/test_streaming_query.py (3)

268-269: Strengthen the mock to better validate tuple semantics

Currently the mock returns identical values for the first two positions, making it hard to catch regressions if the wrong element is passed to retrieve_response. Use distinct values and (in a test) assert the first element is used for the agent call.

Consider:

-        return_value=("fake_model_id", "fake_model_id", "fake_provider_id"),
+        # First element is the fully-qualified llama id; second is the plain model id
+        return_value=("provX/modelX", "modelX", "provX"),

Additionally, in test_auth_tuple_unpacking_in_streaming_query_endpoint_handler, add an assertion that retrieve_response was called with the fully-qualified id:

# After the handler call:
# args: (client, model_id, query_request, token, ...)
assert mock_retrieve_response.call_args[0][1] == "test_model"  # or "provX/modelX" if you adjust the mock here too

1281-1283: Updated mock triple looks consistent

Return shape matches the new contract. Consider the suggestion above to use distinct values for first/second to catch mis-wiring.


1327-1329: Updated mock triple looks consistent

Same note: distinct first/second values would improve regression detection.

tests/unit/app/endpoints/test_query.py (1)

135-136: Mock triple matches the new return signature

Good. To make tests more robust, prefer using a distinct first/second element (e.g., "provX/modelX" vs "modelX") when you want to assert which one is used where.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 93f1079 and 202aa1c.

📒 Files selected for processing (4)
  • src/app/endpoints/query.py (4 hunks)
  • src/app/endpoints/streaming_query.py (1 hunks)
  • tests/unit/app/endpoints/test_query.py (7 hunks)
  • tests/unit/app/endpoints/test_streaming_query.py (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: e2e_tests
🔇 Additional comments (8)
src/app/endpoints/streaming_query.py (1)

430-436: Correct identifier used for LlamaStack call

Passing llama_stack_model_id into retrieve_response ensures the agent receives the fully-qualified identifier while DB/metrics continue to use the plain model and provider IDs.

src/app/endpoints/query.py (1)

186-195: Using llama_stack_model_id for retrieval is correct; keep model/provider for metrics

The triple destructuring and forwarding llama_stack_model_id into retrieve_response preserves the intent: API calls use the fully-qualified id, while DB/metrics use the plain model and provider.

tests/unit/app/endpoints/test_streaming_query.py (1)

1374-1376: Updated mock triple looks consistent

No issues.

tests/unit/app/endpoints/test_query.py (5)

217-223: Asserting tuple decomposition from request inputs is correct

The assertions reflect the desired behavior: llama id includes provider/model, plain model is stored separately.


253-260: Asserting tuple decomposition from configuration defaults is correct

Looks good and aligned with the contract.


1140-1142: Mock triple matches the new return signature

Consistent with updated code paths.


1179-1181: Mock triple matches the new return signature

Consistent with updated code paths.


1218-1220: Mock triple matches the new return signature

Consistent with updated code paths.

Comment on lines +279 to 287
llama_stack_model_id, model_id, provider_id = select_model_and_provider_id(
model_list, query_request.model, query_request.provider
)

# Assert return the first available LLM model when no model/provider is
# specified in the request or in the configuration
assert llama_stack_model_id == "first_model"
assert model_id == "first_model"
assert provider_id == "provider1"

💡 Verification agent

🧩 Analysis chain

Add coverage for fallback with provider-prefixed identifiers

This test uses identifiers without provider prefixes, so it won’t catch the regression where the fallback returns a provider-prefixed model as the plain model. Add a companion test where the first available LLM has identifier "provider1/model1" and assert that the plain model is "model1".

Proposed additional test:

def test_select_model_and_provider_id_first_from_list_with_prefixed_identifier(mocker):
    """Ensure fallback splits provider/model correctly for persistence."""
    model_list = [
        mocker.Mock(identifier="provider1/model1", model_type="llm", provider_id="provider1"),
        mocker.Mock(identifier="provider2/model2", model_type="llm", provider_id="provider2"),
    ]
    query_request = QueryRequest(query="What is OpenStack?")
    llama_stack_model_id, model_id, provider_id = select_model_and_provider_id(
        model_list, query_request.model, query_request.provider
    )
    assert llama_stack_model_id == "provider1/model1"
    assert model_id == "model1"        # plain model id
    assert provider_id == "provider1"

🏁 Script executed:

#!/bin/bash
# Locate the implementation of select_model_and_provider_id
rg "def select_model_and_provider_id" -n -A 10

Length of output: 920


🏁 Script executed:

#!/bin/bash
# Display the full implementation of select_model_and_provider_id for review
sed -n '240,340p' src/app/endpoints/query.py

Length of output: 3986


Ensure fallback handles provider-prefixed model identifiers

The current fallback branch in select_model_and_provider_id unconditionally assigns both returned IDs to model.identifier, so if that identifier already contains a provider prefix (e.g. "provider1/model1"), the plain model_id also comes back as "provider1/model1". We should:

  • Add a unit test to catch this case.
  • Update the fallback implementation to split out the plain model ID.

Locations to update:

  • tests/unit/app/endpoints/test_query.py
  • src/app/endpoints/query.py (fallback branch around line 250)

Proposed test addition:

def test_select_model_and_provider_id_first_with_prefixed_identifier(mocker):
    """Fallback should split provider/model correctly when identifier is prefixed."""
    model_list = [
        mocker.Mock(identifier="provider1/model1", model_type="llm", provider_id="provider1"),
        mocker.Mock(identifier="provider2/model2", model_type="llm", provider_id="provider2"),
    ]
    query_request = QueryRequest(query="foo")
    llama_stack_model_id, model_id, provider_id = select_model_and_provider_id(
        model_list, query_request.model, query_request.provider
    )
    assert llama_stack_model_id == "provider1/model1"
    assert model_id == "model1"        # plain model id
    assert provider_id == "provider1"

Proposed implementation diff in src/app/endpoints/query.py:

         model = next(m for m in models if m.model_type == "llm")
-        model_id = model.identifier
-        provider_id = model.provider_id
-        return model_id, model_id, provider_id
+        full_id = model.identifier
+        provider_id = model.provider_id
+        # split off plain model ID if a prefix is present
+        plain_id = full_id.split("/", 1)[1] if "/" in full_id else full_id
+        return full_id, plain_id, provider_id
🤖 Prompt for AI Agents
In tests/unit/app/endpoints/test_query.py around lines 279-287, add a unit test
that supplies a model_list whose first model.identifier is provider-prefixed
(e.g. "provider1/model1") and asserts that select_model_and_provider_id returns
llama_stack_model_id as the full "provider1/model1", model_id as the plain
"model1", and provider_id as "provider1"; and in src/app/endpoints/query.py
around line ~250 update the fallback branch so that when using model.identifier
you detect a provider prefix (split on the first '/' if present), return the
full identifier as llama_stack_model_id, the right-hand segment as model_id, and
the provider part as provider_id (fall back to existing model.provider_id when
no prefix), keeping other behavior unchanged.


@tisnik tisnik left a comment


LGTM

@tisnik tisnik merged commit 32ec6dd into lightspeed-core:main Aug 11, 2025
18 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Sep 2, 2025