@radofuchs radofuchs commented Aug 19, 2025

Description

Added healthchecks for llama-stack and lightspeed-stack to docker-compose.yaml.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

  • Related Issue #
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • Chores
    • Introduced health checks for core services to automatically detect readiness and failures.
    • Improved startup sequencing: dependent services now wait until upstream services are healthy, reducing race conditions and boot-time errors.
    • Improved stability and made service status clearer in docker-compose. Under plain Docker Compose, a healthcheck only marks a container as unhealthy; it does not restart the container by itself.
    • No changes to user-facing functionality.

coderabbitai bot commented Aug 19, 2025

Walkthrough

Adds healthcheck blocks for llama-stack and lightspeed-stack in docker-compose.yaml and changes lightspeed-stack’s depends_on to wait for llama-stack to be healthy via condition: service_healthy. No other files changed.

Changes

Cohort / File(s): Docker Compose orchestration (docker-compose.yaml)
Summary of Changes: Added healthchecks: llama-stack (GET http://localhost:8321/v1/health) and lightspeed-stack (GET http://localhost:8080/liveness); set intervals/timeouts/retries/start_periods. Updated lightspeed-stack depends_on to use condition: service_healthy for llama-stack.
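
For orientation, the change described above could look roughly like the following fragment. This is a sketch assembled from the endpoints in this summary and the timing values in the sequence diagram, not a copy of the merged file; the image/build, ports, and other service keys are omitted, so it is not a complete Compose file on its own.

```yaml
# Sketch only: reconstructs the healthcheck-related keys described in this PR.
# Other service keys (image/build, ports, volumes, ...) are intentionally omitted.
services:
  llama-stack:
    healthcheck:
      # Exec form: runs curl directly, so curl must exist in the image.
      test: ["CMD", "curl", "-f", "http://localhost:8321/v1/health"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 15s

  lightspeed-stack:
    depends_on:
      llama-stack:
        # Start lightspeed-stack only after llama-stack reports healthy.
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/liveness"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 5s
```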

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant DC as docker-compose
    participant Llama as llama-stack
    participant Light as lightspeed-stack

    DC->>Llama: Start container
    Note over Llama: Healthcheck: /v1/health<br/>interval 10s, timeout 5s, retries 3, start_period 15s
    DC-->>Llama: Monitor health

    DC->>Light: Start gated by<br/>llama-stack service_healthy
    Note over Light: Healthcheck: /liveness<br/>interval 10s, timeout 5s, retries 3, start_period 5s

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I thump my paws at rising stacks,
Health pings hop—no time to relax!
Llama hums “I’m well,” in time,
Lightspeed waits the healthy chime.
Compose conducts the burrow’s show—
Green checks blink, and off we go! 🐇✨


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
docker-compose.yaml (3)

15-20: Healthcheck looks good; confirm curl availability and consider adding timeouts

The exec-form healthcheck is fine, but it assumes curl is present in the llama-stack image. If curl isn’t installed, the container will be marked unhealthy regardless of service state. Also, adding connection and overall timeouts makes failures faster and less noisy.

Consider this tweak:

-      test: ["CMD", "curl", "-f", "http://localhost:8321/v1/health"]
+      test: ["CMD-SHELL", "curl -fsS --connect-timeout 2 --max-time 3 http://127.0.0.1:8321/v1/health || exit 1"]
  • If curl isn’t available, either switch to whatever tool you already have (wget/busybox wget) or install curl in the Dockerfile.
  • Verify that 15s start_period is sufficient for llama initialization in your environment; if the model warms up longer, bump this value to avoid early unhealthy marks.

I can propose Dockerfile snippets for your base image to add curl if needed.
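
If curl turns out to be missing from the image, one possible wget-based variant of the same check is sketched below. It assumes a wget binary (GNU or BusyBox) is present in the llama-stack image, which this PR does not confirm; the timing values are the ones from the walkthrough.

```yaml
# Sketch only: wget-based healthcheck, assuming wget exists in the image.
services:
  llama-stack:
    healthcheck:
      # -q: quiet, -T 3: network timeout in seconds, -O /dev/null: discard the body.
      # "|| exit 1" maps any wget failure to the "unhealthy" exit code.
      test: ["CMD-SHELL", "wget -q -T 3 -O /dev/null http://127.0.0.1:8321/v1/health || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 15s
```

CMD-SHELL runs the command through the container's shell, so the image also needs a shell for this form to work.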


34-35: Ensure your Compose/CLI version supports depends_on.condition

Using depends_on: { llama-stack: { condition: service_healthy } } is supported by modern Docker Compose (Compose Spec). Some older Compose v3 implementations and Swarm ignore the condition. Please confirm the team’s local and CI environments run a Compose version that honors health-based conditions.

Note: This only gates initial startup. If llama later becomes unhealthy, lightspeed won’t restart automatically because of this dependency. Make sure lightspeed handles reconnects, or consider service-level retry logic.
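
The difference between the two depends_on forms can be sketched as follows. The short form is shown commented out for comparison only; whether the file used it before this PR is an assumption.

```yaml
# Sketch only: short vs. long depends_on syntax for lightspeed-stack.
services:
  lightspeed-stack:
    # Short form: waits only until the llama-stack container has been started,
    # not until its healthcheck passes.
    # depends_on:
    #   - llama-stack

    # Long form: waits until llama-stack's healthcheck reports healthy.
    depends_on:
      llama-stack:
        condition: service_healthy
```

The long form is only meaningful when llama-stack defines a healthcheck, as this PR does.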


38-43: Add timeouts to the lightspeed healthcheck and verify the endpoint exists

Same curl availability concern applies here. Also, adding timeouts is recommended. If the app sometimes boots slowly, consider increasing start_period from 5s.

Suggested tweak:

-      test: ["CMD", "curl", "-f", "http://localhost:8080/liveness"]
+      test: ["CMD-SHELL", "curl -fsS --connect-timeout 2 --max-time 3 http://127.0.0.1:8080/liveness || exit 1"]

Please verify that /liveness is the intended path (versus e.g., /health, /livez, or /readyz) and that the server listens on 0.0.0.0:8080 inside the container.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6182cd7 and 50d5c80.

📒 Files selected for processing (1)
  • docker-compose.yaml (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build-pr


@tisnik tisnik left a comment

LGTM, thank you

@tisnik tisnik merged commit b51226d into lightspeed-core:main Aug 19, 2025
18 of 19 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Sep 3, 2025
18 tasks