lightspeed-stack

About The Project


Lightspeed Core Stack (LCS) is an AI-powered assistant that provides answers to product questions using backend LLM services, agents, and RAG databases.

Prerequisites

  • Python 3.12 or 3.13
    • Python 3.14 is not yet officially supported
    • all sources are kept backward compatible with Python 3.12; this is checked on CI

Installation

Installation steps depend on the operating system. Please see the instructions for your system:

Configuration

Llama Stack can be run as a standalone server and accessed via its REST API. However, instead of communicating directly over the REST API (with JSON payloads), you can use the Llama Stack Client. It is a library available for Python, Swift, Node.js, and Kotlin that wraps the REST API and is more convenient for many applications.

Llama Stack as separate server

If Llama Stack runs as a separate server, the Lightspeed service must be configured to access it. For example, if the server runs on localhost:8321, the service configuration should look like:

name: foo bar baz
service:
  host: localhost
  port: 8080
  auth_enabled: false
  workers: 1
  color_log: true
  access_log: true
llama_stack:
  use_as_library_client: false
  url: http://localhost:8321
user_data_collection:
  feedback_disabled: false
  feedback_storage: "/tmp/data/feedback"
  transcripts_disabled: false
  transcripts_storage: "/tmp/data/transcripts"

Llama Stack as client library

In some situations it is not advisable to run two processes (one for Llama Stack, the other for the service). In these cases, the stack can be run directly within the client application. The configuration file could then look like:

name: foo bar baz
service:
  host: localhost
  port: 8080
  auth_enabled: false
  workers: 1
  color_log: true
  access_log: true
llama_stack:
  use_as_library_client: true
  library_client_config_path: <path-to-llama-stack-run.yaml-file>
user_data_collection:
  feedback_disabled: false
  feedback_storage: "/tmp/data/feedback"
  transcripts_disabled: false
  transcripts_storage: "/tmp/data/transcripts"
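
The two configurations above differ only in the llama_stack section. As a rough illustration of that difference, the following sketch checks an already-parsed configuration (a plain dict, as a YAML loader would produce) for the keys each mode appears to need. The function name check_llama_stack_section and the validation rules themselves are assumptions made for this example; the service's actual validation may be stricter or different.

```python
from typing import Any


def check_llama_stack_section(config: dict[str, Any]) -> list[str]:
    """Return a list of problems found in the llama_stack section (empty if none)."""
    section = config.get("llama_stack")
    if not isinstance(section, dict):
        return ["missing 'llama_stack' section"]

    problems: list[str] = []
    if section.get("use_as_library_client", False):
        # Library mode: the stack runs in-process and needs its own run file.
        if not section.get("library_client_config_path"):
            problems.append("library mode requires 'library_client_config_path'")
    else:
        # Server mode: the service talks to a separate Llama Stack over REST.
        url = str(section.get("url") or "")
        if not url.startswith(("http://", "https://")):
            problems.append("server mode requires an http(s) 'url'")
    return problems


# Hypothetical parsed configurations mirroring the two examples above.
server_mode = {"llama_stack": {"use_as_library_client": False, "url": "http://localhost:8321"}}
library_mode = {"llama_stack": {"use_as_library_client": True}}

print(check_llama_stack_section(server_mode))   # []
print(check_llama_stack_section(library_mode))  # reports the missing path
```

A check like this can be run before starting the service to catch a configuration that mixes the two modes.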

Usage

usage: lightspeed_stack.py [-h] [-v] [-d] [-c CONFIG_FILE]

options:
  -h, --help            show this help message and exit
  -v, --verbose         make it verbose
  -d, --dump-configuration
                        dump actual configuration into JSON file and quit
  -c CONFIG_FILE, --config CONFIG_FILE
                        path to configuration file (default: lightspeed-stack.yaml)
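
The help text above has the shape of Python argparse output. The following sketch shows one way such a command line could be declared; it is an assumption based only on the help text, not a copy of lightspeed_stack.py, and the function name create_argument_parser and the dest names are chosen for illustration.

```python
import argparse


def create_argument_parser() -> argparse.ArgumentParser:
    """Build a parser with the same options as the help text above."""
    parser = argparse.ArgumentParser(prog="lightspeed_stack.py")
    parser.add_argument(
        "-v", "--verbose", dest="verbose", action="store_true",
        help="make it verbose",
    )
    parser.add_argument(
        "-d", "--dump-configuration", dest="dump_configuration", action="store_true",
        help="dump actual configuration into JSON file and quit",
    )
    parser.add_argument(
        "-c", "--config", dest="config_file", default="lightspeed-stack.yaml",
        help="path to configuration file (default: lightspeed-stack.yaml)",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the example is repeatable.
args = create_argument_parser().parse_args(["-v", "-c", "my-config.yaml"])
print(args.verbose, args.dump_configuration, args.config_file)
# True False my-config.yaml
```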

Make targets

Usage: make <OPTIONS> ... <TARGETS>

Available targets are:

run                               Run the service locally
test-unit                         Run the unit tests
test-integration                  Run integration tests
test-e2e                          Run BDD tests for the service
check-types                       Checks type hints in sources
security-check                    Check the project for security issues
format                            Format the code into unified format
schema                            Generate OpenAPI schema file
requirements.txt                  Generate requirements.txt file containing hashes for all non-devel packages
shellcheck                        Run shellcheck
help                              Show this help screen

Running Linux container image

Stable release images are tagged with versions like 0.1.0. The latest tag always points to the latest stable release.

Development images are built from the main branch every time a new pull request is merged. Image tags for dev images use the template dev-YYYYMMDD-SHORT_SHA, e.g. dev-20250704-eaa27fb.

The dev-latest tag always points to the most recent dev image built from the main branch.

To pull and run the image with your own configuration:

  1. podman pull quay.io/lightspeed-core/lightspeed-stack:IMAGE_TAG
  2. podman run -it -p 8080:8080 -v my-lightspeed-stack-config.yaml:/app-root/lightspeed-stack.yaml:Z quay.io/lightspeed-core/lightspeed-stack:IMAGE_TAG
  3. Open localhost:8080 in your browser

If the connection from your browser does not work, check that the host option in the config file is set to host: 0.0.0.0.

Endpoints

The service provides health check endpoints that can be used for monitoring, load balancing, and orchestration systems like Kubernetes.

Readiness Endpoint

Endpoint: GET /v1/readiness

The readiness endpoint checks if the service is ready to handle requests by verifying the health status of all configured LLM providers.

Response:

  • 200 OK: Service is ready - all providers are healthy
  • 503 Service Unavailable: Service is not ready - one or more providers are unhealthy

Response Body:

{
  "ready": true,
  "reason": "All providers are healthy",
  "providers": []
}

Response Fields:

  • ready (boolean): Indicates if the service is ready to handle requests
  • reason (string): Human-readable explanation of the readiness state
  • providers (array): List of unhealthy providers (empty when service is ready)
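
A monitoring script could consume this endpoint roughly as follows. This is a minimal sketch using only the Python standard library; the function names interpret_readiness and check_readiness are made up for the example, and it assumes the 503 response carries the same JSON body as the 200 response.

```python
import json
import urllib.error
import urllib.request


def interpret_readiness(status: int, body: str) -> tuple[bool, list]:
    """Map a readiness response to (ready, unhealthy_providers)."""
    try:
        payload = json.loads(body) if body else {}
    except json.JSONDecodeError:
        # Treat a non-JSON body as "no details available".
        payload = {}
    ready = status == 200 and bool(payload.get("ready", False))
    return ready, payload.get("providers", [])


def check_readiness(base_url: str, timeout: float = 5.0) -> tuple[bool, list]:
    """Call GET /v1/readiness; urllib raises HTTPError for the 503 case."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/readiness", timeout=timeout) as resp:
            return interpret_readiness(resp.status, resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        return interpret_readiness(err.code, err.read().decode("utf-8"))


# Offline demonstration with the response body documented above.
sample = '{"ready": true, "reason": "All providers are healthy", "providers": []}'
print(interpret_readiness(200, sample))  # (True, [])
```

With a running service, check_readiness("http://localhost:8080") would perform the actual request. Connection errors (urllib.error.URLError) are deliberately left unhandled here so the caller can decide how to treat an unreachable service.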

Liveness Endpoint

Endpoint: GET /v1/liveness

The liveness endpoint performs a basic health check to verify the service is alive and responding.

Response:

  • 200 OK: Service is alive

Response Body:

{
  "alive": true
}

Contributing

License

Published under the Apache 2.0 License

Additional tools

Utility to generate OpenAPI schema

This script regenerates the OpenAPI schema for the Lightspeed Service REST API.

Path

scripts/generate_openapi_schema.py

Usage

make schema
