Skip to content

Conversation

@tisnik
Copy link
Contributor

@tisnik tisnik commented Aug 19, 2025

Description

LCORE-437: check Llama Stack version on startup

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

  • Related Issue #LCORE-437

Summary by CodeRabbit

  • New Features

    • Startup now validates Llama Stack compatibility; the service won’t start on incompatible versions and provides a clear admin-facing message.
  • Documentation

    • Added guidance on the Llama Stack version check.
    • Expanded service configuration docs with CORS defaults.
  • Tests

    • Introduced unit tests covering version compatibility scenarios.
  • Chores

    • Updated project dependencies to support semantic version parsing and improved testing utilities.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Aug 19, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds startup-time Llama Stack version validation using new constants and a utility module. Integrates the check into the stack initialization flow. Updates documentation and service configuration notes. Declares new dependencies (semver, pytest-subtests). Adds unit tests covering version boundaries and error messaging.

Changes

Cohort / File(s) Summary
Documentation updates
README.md
Documents Llama Stack version check and CORS defaults; updates TOC.
Dependencies
pyproject.toml
Adds semver (runtime) and pytest-subtests (dev) dependencies.
Version constants
src/constants.py
Introduces MINIMAL_SUPPORTED_LLAMA_STACK_VERSION and MAXIMAL_SUPPORTED_LLAMA_STACK_VERSION (both "0.2.17").
Startup integration
src/lightspeed_stack.py
Imports and invokes async version check after obtaining AsyncLlamaStack client; blocks startup on incompatibility.
Version check utility
src/utils/llama_stack_version.py
Adds async check_llama_stack_version and compare_versions; validates current version within [min, max] via semver and raises on mismatch.
Unit tests
tests/unit/utils/test_llama_stack_version.py
Tests in-range and out-of-range scenarios, including boundary and larger versions using subtests and AsyncMock.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Admin as Admin/Operator
  participant Service as Service Startup
  participant Llama as Llama Stack (Client)
  participant Util as Version Check Utility

  Admin->>Service: Start service
  Service->>Llama: Initialize client
  Service->>Util: check_llama_stack_version(client)
  Util->>Llama: inspect.version()
  Llama-->>Util: version string
  Util->>Util: compare_versions(current, MIN, MAX)
  alt Version within [MIN, MAX]
    Util-->>Service: OK
    Service->>Service: Continue startup (dump config / start server)
  else Version out of range
    Util-->>Service: Raise Exception (incompatible)
    Service-->>Admin: Fail startup with message
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

A rabbit taps the version drum,
“Semver sings—does it succumb?”
If bounds align, we leap and run;
If not, we pause—the boot is done.
Two ears up, checks complete—
Stack and service now can meet. 🐇✨

✨ Finishing Touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (9)
pyproject.toml (1)

39-39: Constrain semver to a compatible release range to avoid accidental major upgrades

semver>=3.0.4 allows future 4.x which may include breaking changes. Prefer a compatible cap.

Apply:

-    "semver>=3.0.4",
+    "semver>=3.0.4,<4.0",
README.md (1)

30-41: Fix TOC list indentation to satisfy MD007 (unordered list indentation)

markdownlint flags inconsistent nesting indentation on these new entries. Use 2 spaces for each nested level.

Apply:

-    * [Llama Stack version check](#llama-stack-version-check)
+  * [Llama Stack version check](#llama-stack-version-check)
...
-    * [CORS](#cors)
-        * [Default values](#default-values)
-    * [Allow credentials](#allow-credentials)
+  * [CORS](#cors)
+    * [Default values](#default-values)
+  * [Allow credentials](#allow-credentials)
src/utils/llama_stack_version.py (3)

5-6: Avoid importing from a private module path for typing

llama_stack_client._client is an internal/private path and may break across releases. Prefer the public import or gate the import under TYPE_CHECKING to avoid a hard runtime dependency on the private path.

Apply:

-from llama_stack_client._client import AsyncLlamaStackClient
+from typing import TYPE_CHECKING, Any
+if TYPE_CHECKING:
+    # Prefer public API; adjust if the package exposes a different stable path
+    from llama_stack_client import AsyncLlamaStackClient  # type: ignore
+else:
+    AsyncLlamaStackClient = Any  # runtime-agnostic

14-23: Log the detected version and allowed range at startup

A short log line helps with diagnostics when startup fails due to version skew.

Apply:

+import logging
+logger = logging.getLogger(__name__)
@@
     version_info = await client.inspect.version()
+    logger.info(
+        "Llama Stack version detected: %s (supported range: %s..%s)",
+        version_info.version,
+        MINIMAL_SUPPORTED_LLAMA_STACK_VERSION,
+        MAXIMAL_SUPPORTED_LLAMA_STACK_VERSION,
+    )

31-38: Use a dedicated exception type instead of bare Exception

This makes failures easier to catch and reason about at higher levels.

Apply:

+class LlamaStackVersionError(RuntimeError):
+    """Raised when Llama Stack version is out of the supported range."""
@@
-    if current_version < minimal_version:
-        raise Exception(
+    if current_version < minimal_version:
+        raise LlamaStackVersionError(
             f"Llama Stack version >= {minimal_version} is required, but {current_version} is used"
         )
-    if current_version > maximal_version:
-        raise Exception(
+    if current_version > maximal_version:
+        raise LlamaStackVersionError(
             f"Llama Stack version <= {maximal_version} is required, but {current_version} is used"
         )
tests/unit/utils/test_llama_stack_version.py (4)

45-59: Tighten the assertion to avoid false positives and make failures more actionable.

The regex ">= .*" is permissive and could pass on a message that doesn't include the expected minimal bound verbatim. Construct a precise pattern using re.escape and bind it to match. Keep Exception type if you're not ready to lock to a specific type.

Apply this diff within this block:

-    with pytest.raises(
-        Exception, match="Llama Stack version >= .* is required, but 0.0.0 is used"
-    ):
+    pattern = rf"Llama Stack version >= {re.escape(MINIMAL_SUPPORTED_LLAMA_STACK_VERSION)} is required, but {re.escape('0.0.0')} is used"
+    with pytest.raises(Exception, match=pattern):
         await check_llama_stack_version(mock_client)

Note: This requires importing re once at the top. See separate suggestion on the import. Optionally, if compare_versions raises a specific type (e.g., ValueError or a custom error), assert that for stronger tests; I can update the tests once you confirm the exception type.


3-5: Add missing import for regex escaping used in assertions.

Required by the refined regex patterns suggested in other comments.

 import pytest
+import re
 from semver import Version

61-70: Make the “too big” message assertion precise as well.

Mirror the lower-bound improvement for the upper-bound check using re.escape to avoid brittle tests.

-    with pytest.raises(
-        Exception,
-        match=f"Llama Stack version <= .* is required, but {bigger_version} is used",
-    ):
+    pattern = rf"Llama Stack version <= {re.escape(MAXIMAL_SUPPORTED_LLAMA_STACK_VERSION)} is required, but {re.escape(str(bigger_version))} is used"
+    with pytest.raises(Exception, match=pattern):
         await check_llama_stack_version(mock_client)

81-103: Simplify version bumps using semver’s bump_ helpers.*

Slightly cleaner and intention-revealing than manual arithmetic; avoids relying on positional ctor args.

-    with subtests.test(msg="Increased patch number"):
-        bigger_version = Version(
-            max_version.major, max_version.minor, max_version.patch + 1
-        )
+    with subtests.test(msg="Increased patch number"):
+        bigger_version = max_version.bump_patch()
         await _check_version_must_fail(mock_client, bigger_version)

-    with subtests.test(msg="Increased minor number"):
-        bigger_version = Version(
-            max_version.major, max_version.minor + 1, max_version.patch
-        )
+    with subtests.test(msg="Increased minor number"):
+        bigger_version = max_version.bump_minor()
         await _check_version_must_fail(mock_client, bigger_version)

-    with subtests.test(msg="Increased major number"):
-        bigger_version = Version(
-            max_version.major + 1, max_version.minor, max_version.patch
-        )
+    with subtests.test(msg="Increased major number"):
+        bigger_version = max_version.bump_major()
         await _check_version_must_fail(mock_client, bigger_version)

-    with subtests.test(msg="Increased all numbers"):
-        bigger_version = Version(
-            max_version.major + 1, max_version.minor + 1, max_version.patch + 1
-        )
+    with subtests.test(msg="Increased all numbers"):
+        bigger_version = max_version.bump_major().bump_minor().bump_patch()
         await _check_version_must_fail(mock_client, bigger_version)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6291eb9 and 0db2125.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • README.md (2 hunks)
  • pyproject.toml (2 hunks)
  • src/constants.py (1 hunks)
  • src/lightspeed_stack.py (2 hunks)
  • src/utils/llama_stack_version.py (1 hunks)
  • tests/unit/utils/test_llama_stack_version.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/lightspeed_stack.py (2)
src/utils/llama_stack_version.py (1)
  • check_llama_stack_version (14-23)
src/client.py (2)
  • AsyncLlamaStackClientHolder (18-50)
  • get_client (44-50)
tests/unit/utils/test_llama_stack_version.py (1)
src/utils/llama_stack_version.py (1)
  • check_llama_stack_version (14-23)
🪛 markdownlint-cli2 (0.17.2)
README.md

30-30: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


31-31: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


32-32: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


33-33: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


34-34: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


35-35: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


36-36: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


37-37: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


38-38: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


39-39: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


40-40: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

🪛 LanguageTool
README.md

[grammar] ~252-~252: Ensure spelling is correct
Context: ...nge defined by these two constants, the serivce won't start and administrator will be i...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~252-~252: There might be a mistake here.
Context: ... two constants, the serivce won't start and administrator will be informed about th...

(QB_NEW_EN)

🔇 Additional comments (6)
pyproject.toml (1)

95-95: Add of pytest-subtests looks good

Dev-only dependency aligns with the new boundary subtests.

src/lightspeed_stack.py (2)

15-15: Import of the version check is correct and placed appropriately

Fits the boot sequence right after client holder import.


67-72: Avoid sharing the same async client across multiple asyncio event loops

Calling asyncio.run() twice spins up two separate event loops, but your AsyncLlamaStackClient (or its internal aiohttp.ClientSession) is most likely bound to the first loop. Handing that same client to a second loop can lead to “attached to a different loop” runtime errors.

Apply this pattern to run both load() and check_llama_stack_version() inside one loop:

-    asyncio.run(
-        AsyncLlamaStackClientHolder().load(configuration.configuration.llama_stack)
-    )
-    client = AsyncLlamaStackClientHolder().get_client()
-    asyncio.run(check_llama_stack_version(client))
+    async def _bootstrap() -> None:
+        await AsyncLlamaStackClientHolder().load(
+            configuration.configuration.llama_stack
+        )
+        client = AsyncLlamaStackClientHolder().get_client()
+        await check_llama_stack_version(client)
+
+    asyncio.run(_bootstrap())

• If you later use this client in Uvicorn’s event loop, move both initialization and the version check into your app’s startup/lifespan event so everything lives on one loop.
• Please verify that AsyncLlamaStackClient truly binds resources (e.g., aiohttp.ClientSession) to its creating loop—if not, this change may be unnecessary.

src/constants.py (1)

3-6: Version bounds added — confirm intentional strictness

Both minimal and maximal are set to “0.2.17”, effectively allowing only that exact version. This is fine if you want to hard-gate, but consider whether patch-level updates (0.2.18, etc.) should be allowed without code changes.

If you want to allow patch bumps within 0.2.x, we can change MAXIMAL to the next minor boundary internally (e.g., compare against <= 0.2.x). I can propose a small helper to compute that safely with semver if helpful.

tests/unit/utils/test_llama_stack_version.py (2)

16-28: LGTM: Minimal boundary test is correct and focused.

Covers the inclusive lower bound and ensures no exception is raised. Good use of AsyncMock to simulate the client call.


30-42: LGTM: Maximal boundary test is correct and complements lower bound coverage.

Validates the inclusive upper bound path with a clean await.

Comment on lines 1 to 17
"""Unit tests for utility function to check Llama Stack version."""

import pytest
from semver import Version

from llama_stack_client.types import VersionInfo

from utils.llama_stack_version import check_llama_stack_version

from constants import (
MINIMAL_SUPPORTED_LLAMA_STACK_VERSION,
MAXIMAL_SUPPORTED_LLAMA_STACK_VERSION,
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Verify fixture/plugins and import path configuration to keep tests stable across environments.

  • The tests rely on:
    • pytest-asyncio (for @pytest.mark.asyncio)
    • pytest-mock (for the mocker fixture and AsyncMock)
    • pytest-subtests (for the subtests fixture)
    • semver (python-semver v3+ to use Version.parse and bump_* if you adopt the refactor)
  • Imports use src-layout-style modules (utils., constants.). Ensure the test runner has PYTHONPATH configured or package_dir={"": "src"} so these imports resolve.

Run this script to confirm dependency declarations and src-layout configuration:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Check declared dependencies =="
rg -n --no-heading -C2 'pytest-asyncio|pytest-?mock|pytest-subtests|(^|\s)semver(\s|==|>=|$)'

echo
echo "== Check src-layout configuration =="
# Look for src layout hints in common config files
rg -n --no-heading -C2 'package_dir\s*=\s*\{[^}]*""\s*:\s*"src"|[tool.setuptools].*package-dir.*src' pyproject.toml setup.cfg setup.py 2>/dev/null || true
rg -n --no-heading -C2 'pythonpath|addopts.*--pyargs' pytest.ini tox.ini 2>/dev/null || true

echo
echo "== Identify compare_versions exception type (to optionally assert a specific type) =="
rg -nP -C3 '(?m)^\s*def\s+compare_versions\s*\(|raise\s+(?P<exc>\w+Error|ValueError|RuntimeError)\b' -g '!**/tests/**'

Length of output: 21702


Action Required: Declare semver Dependency & Configure src/ Layout for Tests

  • ✅ Confirmed: pytest-asyncio, pytest-mock and pytest-subtests are declared in pyproject.toml.
  • ❌ Missing: semver (python-semver) is not listed—yet both your tests and src/utils/llama_stack_version.py import Version from semver.
  • ❌ No src-layout hints found in pyproject.toml, setup.cfg, setup.py (e.g. package_dir = {"": "src"}) nor PYTHONPATH/--pyargs in pytest.ini or tox.ini. Imports like utils.* and constants.* will fail unless the test runner’s Python path includes src/.

Please address:

  • Add "semver>=<minimum-required-version>" under [project.dependencies] in pyproject.toml.
  • Configure your package to use the src/ layout (e.g. package_dir = {"": "src"}) or update your test runner to include src/ on PYTHONPATH.
🤖 Prompt for AI Agents
In tests/unit/utils/test_llama_stack_version.py lines 1-14: the test and the
implementation import Version from the semver package and rely on project
modules under src/, but pyproject.toml is missing the semver dependency and the
repo lacks a src/ layout configuration for tests; add "semver>=X.Y.Z" (replace
X.Y.Z with the minimum supported semver version) to [project.dependencies] in
pyproject.toml, and ensure the package uses a src/ layout by adding package_dir
= {"": "src"} in setup.cfg/setup.py or configure pytest to include src/ on
PYTHONPATH (e.g., add pythonpath = src in pytest.ini or set env PYTHONPATH) so
imports like utils.* and constants.* resolve during tests.

@tisnik tisnik force-pushed the lcore-437-check-llama-stack-version-on-startup branch 2 times, most recently from e1a1096 to dc83ef7 Compare August 19, 2025 06:56
@tisnik tisnik marked this pull request as draft August 19, 2025 06:58
@tisnik tisnik changed the title LCORE-437: check Llama Stack version on startup [Draft] LCORE-437: check Llama Stack version on startup Aug 19, 2025
@tisnik tisnik force-pushed the lcore-437-check-llama-stack-version-on-startup branch from dc83ef7 to 28005f5 Compare August 19, 2025 16:31
@tisnik tisnik closed this Aug 19, 2025
@tisnik tisnik force-pushed the lcore-437-check-llama-stack-version-on-startup branch from 28005f5 to 6291eb9 Compare August 19, 2025 17:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant