Configure OGX with RAG #104

Merged
openshift-merge-bot[bot] merged 2 commits into openstack-lightspeed:lcore-migration from lpiwowar:lpiwowar/rag
May 19, 2026

@lpiwowar
Contributor

@lpiwowar lpiwowar commented May 18, 2026

This PR ensures that the deployed OGX and Lightspeed Stack are configured to run with RAG, using either the image provided in the OpenStackLightspeed instance or the default one set in the operator code.

The main logic is built on top of initContainers, which makes it easy to extend when we decide to introduce BYOK later. It works as follows.

Before the Lightspeed Stack and OGX main containers are executed, the following initContainers are run:

  1. vector-database-collect: Copies all vector database data (faiss_store.db, llama-stack.yaml, embeddings_model) to a shared volume across all containers in the pod.

  2. vector-store-build: Generates the final ogx_config.yaml and lightspeed-stack.yaml files with injected RAG configuration based on the data from step 1.

After the init containers complete, both Lightspeed Stack and OGX consume the config files produced by the init containers.
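
The merge performed in step 2 can be pictured roughly as follows. This is a minimal sketch, not the operator's code: the helper `inject_rag` and the keys `vector_stores`, `embedding_models`, `db_path`, and `model_path` are hypothetical stand-ins, since the real schema handled by vector_database_build.py is not shown in this conversation.

```python
from copy import deepcopy


def inject_rag(base_config, discovered):
    """Return a copy of base_config extended with the discovered vector DBs.

    `discovered` is a list of dicts, one per vector database found on the
    shared volume. All key names here are hypothetical.
    """
    merged = deepcopy(base_config)  # never mutate the operator-provided base
    stores = merged.setdefault("vector_stores", [])
    models = merged.setdefault("embedding_models", [])
    seen_models = {model["name"] for model in models}

    for db in discovered:
        stores.append(
            {
                "id": db["id"],
                "db_path": db["db_path"],
                "embedding_model": db["embedding_model"],
            }
        )
        # Each embedding model is registered once, keyed by name only.
        if db["embedding_model"] not in seen_models:
            models.append({"name": db["embedding_model"], "path": db["model_path"]})
            seen_models.add(db["embedding_model"])

    return merged
```

Registering each embedding model once by name is meant to mirror the current behaviour that the BYOK notes below identify as a limitation.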

Later, when introducing BYOK support, we should address the following:

  • We should prevent copying the same embedding model multiple times into the shared volume when customers provide multiple container images with the same embedding model.

  • The embedding model configuration logic for OGX needs to be updated to support BYOK images. Currently, when an embedding model with the same name appears in multiple images, it is only introduced once into the final OGX config, which causes errors related to missing embedding models for some vector databases.
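
One possible way to address the first point is to store each embedding model under a content digest, so that identical models coming from different images map to the same directory. The following is a sketch under that assumption; `dir_digest` and `copy_model_once` are hypothetical helpers and are not part of the operator.

```python
import hashlib
import shutil
from pathlib import Path


def dir_digest(root: Path) -> str:
    """Hash the relative file names and contents of a directory tree."""
    digest = hashlib.sha256()
    files = sorted(p for p in root.rglob("*") if p.is_file())
    for path in files:
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so large model files are not loaded at once.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        digest.update(b"\0")
    return digest.hexdigest()


def copy_model_once(src: Path, shared: Path) -> Path:
    """Copy src into shared/<digest> unless an identical copy already exists."""
    dest = shared / dir_digest(src)
    if not dest.exists():
        shutil.copytree(src, dest)
    return dest
```

With this layout, two images shipping the same model would resolve to one directory on the shared volume, while models with the same name but different contents would stay separate.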

The default vector database has been changed to:

quay.io/openstack-lightspeed/rag-content-openstack:alpha-ogx-os-docs-2025.2

This image was built MANUALLY and contains ONLY data for the nova project. The default image should be replaced once the upstream pipeline is ready to build llamastack-faiss compatible vector databases.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added dynamic Vector Database configuration generation workflow using init containers for OGX and Lightspeed Stack configs.
    • Enhanced init container support to collect vector database assets and build final configurations at runtime.
  • Improvements

    • Updated RAG content container image to OGX-compatible alpha version.
    • Replaced static configuration ConfigMaps with dynamically generated configs.
  • Chores

    • Added preflight validation for required binaries in the build system.
    • Enhanced test framework to validate generated configurations against expected outputs.

@openshift-ci

openshift-ci Bot commented May 18, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai

coderabbitai Bot commented May 18, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 76e4a08e-91c8-48bf-b75b-5c9e32e0b21f

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The pull request transitions the OpenStack Lightspeed operator from static ConfigMap-mounted pod configurations to dynamically generated configurations assembled by init containers. New vector database collection and Python-based configuration generation scripts enable flexible metadata injection. The controller pod template, configuration generation logic, and reconciliation are refactored to use generated config volumes instead of pre-defined ConfigMaps. KUTTL tests are updated with pod-based validation assertions and expected configuration files.

Changes

Init Container-Driven Configuration Generation

| Layer | File(s) | Summary |
|---|---|---|
| Vector Database Collection and Build Scripts | `internal/controller/assets/vector_database_collect.sh`, `internal/controller/assets/vector_database_build.py` | Shell script collects vector database artifacts (faiss stores, llama-stack metadata) from mounted RAG images, with OCP fallback versioning. Python script generates merged OGX and Lightspeed Stack configurations by extracting model/vector-store/provider metadata from collected databases and injecting into target YAML templates. |
| Controller Constants and Script Embedding | `internal/controller/constants.go` | Add Vector DB volume, mount path, config file path, and ConfigMap key constants; embed both init container scripts as package-level variables for runtime injection; reorganize LCore constants by removing old filename references and restructuring permission modes. |
| Image Tag Updates and API Defaults | `api/v1beta1/openstacklightspeed_types.go`, `config/manager/manager.yaml`, `hack/env.sh`, `bundle/manifests/openstack-lightspeed-operator.clusterserviceversion.yaml` | Update default RAG container image from os-docs-2025.2 to alpha-ogx-os-docs-2025.2 across API constant, manager env var, hack script, and ClusterServiceVersion manifest; add TODO comments documenting temporary alpha status. |
| Pod Template Refactoring and Init Container Integration | `internal/controller/lcore_deployment.go` | Refactor buildLCorePodTemplateSpec to assemble pod volumes for OGX/Lightspeed Stack configs, Vector DB scripts, and shared data; implement buildInitContainers that detect OCP version and orchestrate two init containers collecting vector DB assets and building final configs; update llama-stack and lightspeed-service-api container mounts and args to reference generated config paths. |
| Configuration Generation and Reconciliation Updates | `internal/controller/llama_stack_config.go`, `internal/controller/lcore_config.go`, `internal/controller/lcore_reconciler.go` | Simplify Llama Stack config by removing vector DB construction and embedding model insertion, restructuring to use registered_resources; remove LCore RAG type and builder; update environment variables to point at generated config paths; add reconcileVectorDBScriptsConfigMap to maintain scripts ConfigMap in Phase 1 reconciliation. |
| Test Infrastructure and Validation Scripts | `test/kuttl/common/expected-configs/validate-config.sh`, `test/kuttl/common/expected-configs/ogx_config.yaml`, `test/kuttl/common/expected-configs/ogx_config-update.yaml`, `test/kuttl/common/expected-configs/lightspeed-stack.yaml`, `test/kuttl/common/expected-configs/lightspeed-stack-update.yaml` | Add validate-config.sh test utility that extracts running pod configurations, normalizes dynamic UUIDs, and diffs against expected YAML files; add expected configuration files for OGX and Lightspeed Stack in initial and update test scenarios. |
| KUTTL Test Assertions and Cleanup | `test/kuttl/common/openstack-lightspeed-instance/*`, `test/kuttl/tests/basic-openstack-lightspeed-configuration/*`, `test/kuttl/tests/update-openstacklightspeed/*` | Remove static ConfigMap assertions for llama-stack-config and lightspeed-stack-config; add new pod-based config validation assertions running validate-config.sh; update common assertions to include vector-db-scripts ConfigMap; disable OCP RAG in update scenario with explanatory comments; consolidate test fixture references. |
| Build Target Validation | `Makefile` | Add preflight checks to kuttl-test target to verify diff and oc command presence in PATH before running KUTTL tests. |
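
The normalize-then-diff idea summarized for validate-config.sh can be sketched as follows. This is an illustration in Python, not the shell script itself; the `<UUID>` placeholder and the helper names `normalize` and `config_diff` are assumptions.

```python
import difflib
import re

# Matches the canonical 8-4-4-4-12 hexadecimal UUID layout.
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def normalize(text, placeholder="<UUID>"):
    """Replace every UUID with a fixed placeholder so runs are comparable."""
    return UUID_RE.sub(placeholder, text)


def config_diff(actual, expected):
    """Return unified-diff lines between normalized configs (empty if equal)."""
    expected_lines = normalize(expected).splitlines()
    actual_lines = normalize(actual).splitlines()
    return list(
        difflib.unified_diff(
            expected_lines,
            actual_lines,
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )
```

An empty result means the generated config matches the reference once dynamic identifiers are ignored; any remaining lines point at a real difference.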

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

The PR involves substantial refactoring across deployment, configuration generation, and reconciliation logic with new init container orchestration, multiple new scripts, and restructured configuration schemas. The heterogeneous changes across different functional areas and the density of interconnected logic (init container setup, config generation, environment wiring, and test updates) create moderate complexity despite some sections being relatively straightforward (image tag updates, test assertions).

Possibly related PRs

  • openstack-lightspeed/operator#94: Introduces the base LCore deployment and reconciliation structure that this PR extends by refactoring to use init container-driven configuration generation and adding vector database integration.

Suggested labels

lgtm

Suggested reviewers

  • umago
  • Akrog

Poem

🐰 A container must birth configs anew,
Init scripts collect the vector DB's clue,
OGX and Lightspeed now dance as one flow,
From RAG to the pod where the services glow!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Configure OGX with RAG' accurately summarizes the main change: the PR implements OGX and Lightspeed Stack configuration with RAG (Retrieval-Augmented Generation) via initContainers that generate configuration files from collected vector database data. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@lpiwowar
Contributor Author

/test all

@lpiwowar
Contributor Author

/retest

@lpiwowar lpiwowar marked this pull request as ready for review May 18, 2026 12:36
@openshift-ci openshift-ci Bot requested review from Akrog and umago May 18, 2026 12:36
@lpiwowar
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 18, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/controller/lcore_deployment.go`:
- Around line 565-568: The environment variable LIGHTSPEED_STACK_CONFIG_PATH is
being set to VectorDBVolumeMountPath +
"/lightspeed_stack_config/lightspeed-stack.yaml" which mismatches the actual
path used elsewhere (VectorDBVolumeLightspeedStackConfigPath ->
"/vector-db-discovered-values/lightspeed-stack.yaml"); update the assignment to
use the same canonical constant VectorDBVolumeLightspeedStackConfigPath (or
change that constant to match if intended) so the pod/container and init/build
flow reference the identical path; locate the env var creation block where Name
== "LIGHTSPEED_STACK_CONFIG_PATH" and replace the concatenated path with
VectorDBVolumeLightspeedStackConfigPath.

In `@internal/controller/lcore_reconciler.go`:
- Around line 186-190: The current CreateOrPatch handler overwrites cm.Data with
only the new OGXConfigCMKey (in the controllerutil.CreateOrPatch closure), which
drops legacy ConfigMap keys during upgrades; instead update cm.Data by ensuring
it is non-nil, preserving any existing keys, and then set or overwrite the
OGXConfigCMKey entry to yamlData so legacy keys remain available; apply the same
fix to the other identical cm.Data reassignment occurrence in the file where
cm.Data is being replaced.

In `@test/kuttl/common/expected-configs/validate-config.sh`:
- Around line 14-15: The script dereferences CONFIG_TYPE="$1" and
EXPECTED_CONFIG="$2" while `set -u` is enabled, causing an unbound-variable exit
if arguments are missing; add an argument-count check at the top (e.g. test if
$# -lt 2) and print a clear usage message and exit with non-zero before any
access to $1/$2 so CONFIG_TYPE and EXPECTED_CONFIG are never expanded when
absent.
- Around line 57-65: The script currently assigns POD_NAME by taking the first
entry from `oc get pods` which is nondeterministic; update the `POD_NAME`
selection to deterministically pick a Running and Ready pod instead of the
arbitrary first item. Modify the `oc get pods` invocation in validate-config.sh
(the POD_NAME assignment) to filter pods by status.phase=Running (use
--field-selector=status.phase=Running) and/or by Ready condition, and if
multiple remain, sort by creationTimestamp (or choose the oldest Ready pod) via
jsonpath/jq to consistently select a single pod; ensure the rest of the script
still uses the POD_NAME variable and keep the existing error check for an empty
result.
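
The selection rule suggested for POD_NAME in the last finding can be sketched as follows, assuming the pod list is obtained as JSON (for example from `oc get pods -o json`). The helper name `pick_ready_pod` is hypothetical; the fields `status.phase`, `status.conditions`, and `metadata.creationTimestamp` follow the Kubernetes Pod API.

```python
def pick_ready_pod(pod_list):
    """Return the name of the oldest Running and Ready pod, or None."""

    def is_ready(pod):
        status = pod.get("status", {})
        if status.get("phase") != "Running":
            return False
        return any(
            cond.get("type") == "Ready" and cond.get("status") == "True"
            for cond in status.get("conditions", [])
        )

    ready = [pod for pod in pod_list.get("items", []) if is_ready(pod)]
    if not ready:
        return None
    # creationTimestamp is an RFC 3339 UTC string, so plain string ordering
    # is chronological; the name breaks ties to keep the result stable.
    ready.sort(
        key=lambda pod: (
            pod["metadata"]["creationTimestamp"],
            pod["metadata"]["name"],
        )
    )
    return ready[0]["metadata"]["name"]
```

Returning None for an empty result keeps the caller's existing "no pod found" error path meaningful.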

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 27831f27-15a5-4f3e-825e-18252889defb

📥 Commits

Reviewing files that changed from the base of the PR and between e49e5a1 and 642dcab.

📒 Files selected for processing (39)
  • Makefile
  • api/v1beta1/openstacklightspeed_types.go
  • bundle/manifests/openstack-lightspeed-operator.clusterserviceversion.yaml
  • config/manager/manager.yaml
  • hack/env.sh
  • internal/controller/assets/vector_database_build.py
  • internal/controller/assets/vector_database_collect.sh
  • internal/controller/constants.go
  • internal/controller/lcore_config.go
  • internal/controller/lcore_deployment.go
  • internal/controller/lcore_reconciler.go
  • internal/controller/llama_stack_config.go
  • test/kuttl/common/expected-configs/lightspeed-stack-update.yaml
  • test/kuttl/common/expected-configs/lightspeed-stack.yaml
  • test/kuttl/common/expected-configs/ogx_config-update.yaml
  • test/kuttl/common/expected-configs/ogx_config.yaml
  • test/kuttl/common/expected-configs/validate-config.sh
  • test/kuttl/common/openstack-lightspeed-instance/assert-lightspeed-stack-config.yaml
  • test/kuttl/common/openstack-lightspeed-instance/assert-llama-stack-config.yaml
  • test/kuttl/common/openstack-lightspeed-instance/assert-openstack-lightspeed-instance.yaml
  • test/kuttl/common/openstack-lightspeed-instance/assert-pod-lightspeed-stack-config.yaml
  • test/kuttl/common/openstack-lightspeed-instance/assert-pod-llama-stack-config.yaml
  • test/kuttl/common/openstack-lightspeed-instance/errors-openstack-lightspeed-instance.yaml
  • test/kuttl/tests/basic-openstack-lightspeed-configuration/03-assert-lightspeed-stack-config.yaml
  • test/kuttl/tests/basic-openstack-lightspeed-configuration/03-assert-openstack-lightspeed-instance.yaml
  • test/kuttl/tests/basic-openstack-lightspeed-configuration/04-assert-lightspeed-stack-config.yaml
  • test/kuttl/tests/basic-openstack-lightspeed-configuration/04-assert-llama-stack-config.yaml
  • test/kuttl/tests/basic-openstack-lightspeed-configuration/05-assert-llama-stack-config.yaml
  • test/kuttl/tests/update-openstacklightspeed/03-assert-lightspeed-stack-config.yaml
  • test/kuttl/tests/update-openstacklightspeed/03-assert-openstack-lightspeed-instance.yaml
  • test/kuttl/tests/update-openstacklightspeed/04-assert-lightspeed-stack-config.yaml
  • test/kuttl/tests/update-openstacklightspeed/04-assert-llama-stack-config.yaml
  • test/kuttl/tests/update-openstacklightspeed/05-assert-llama-stack-config.yaml
  • test/kuttl/tests/update-openstacklightspeed/06-update-openstack-lightspeed-instance.yaml
  • test/kuttl/tests/update-openstacklightspeed/08-assert-lightspeed-stack-config-update.yaml
  • test/kuttl/tests/update-openstacklightspeed/08-assert-openstacklightspeed-update.yaml
  • test/kuttl/tests/update-openstacklightspeed/09-assert-lightspeed-stack-config-update.yaml
  • test/kuttl/tests/update-openstacklightspeed/09-assert-llama-stack-config-update.yaml
  • test/kuttl/tests/update-openstacklightspeed/10-assert-llama-stack-config-update.yaml
💤 Files with no reviewable changes (8)
  • test/kuttl/tests/update-openstacklightspeed/03-assert-lightspeed-stack-config.yaml
  • test/kuttl/common/openstack-lightspeed-instance/assert-llama-stack-config.yaml
  • test/kuttl/tests/basic-openstack-lightspeed-configuration/03-assert-lightspeed-stack-config.yaml
  • test/kuttl/tests/basic-openstack-lightspeed-configuration/04-assert-llama-stack-config.yaml
  • test/kuttl/tests/update-openstacklightspeed/09-assert-llama-stack-config-update.yaml
  • test/kuttl/tests/update-openstacklightspeed/04-assert-llama-stack-config.yaml
  • test/kuttl/tests/update-openstacklightspeed/08-assert-lightspeed-stack-config-update.yaml
  • test/kuttl/common/openstack-lightspeed-instance/assert-lightspeed-stack-config.yaml

Comment thread internal/controller/lcore_deployment.go Outdated
Comment thread internal/controller/lcore_reconciler.go
Comment thread test/kuttl/common/expected-configs/validate-config.sh
Comment thread test/kuttl/common/expected-configs/validate-config.sh Outdated
umago
umago previously approved these changes May 19, 2026
Runs as the second init container (`vector-database-config-build`), after
`vector_database_collect.sh`. It loads operator-provided base configs, walks every
vector DB directory left by the collect step, and writes merged configs back to
the shared volume.
Contributor

Thanks for writing this. When I saw the name of the script I thought we were "building" the dbs... But we are "building" the config

@umago
Contributor

umago commented May 19, 2026

Thanks @lpiwowar, I mostly skimmed through the code but it does look good! Thanks for it

lpiwowar added 2 commits May 19, 2026 06:06
This commit ensures that the deployed OGX and Lightspeed Stack is
configured to run as RAG with an image provided in
the OpenStackLightspeed instance or the default one set in the
operator code.

The main logic is built on top of initContainers and provides easy
extension when we decide to introduce BYOK later. The logic works as
follows.

Before the Lightspeed Stack and OGX main containers are executed, the
following initContainers are run:

1. vector-database-collect: Copies all vector database data
   (faiss_store.db, llama-stack.yaml, embeddings_model) to a shared
   volume across all containers in the pod.

2. vector-store-build: Generates the final ogx_config.yaml and
   lightspeed-stack.yaml files with injected RAG configuration based
   on the data from step 1.

After the init containers complete, both Lightspeed Stack and OGX
consume the config files produced by the init containers.

Later when introducing BYOK support, we should address:

- We should prevent copying the same embedding model multiple times into
  the shared volume when customers provide multiple container images
  with the same embedding model.

- The embedding model configuration logic for OGX needs to be updated to
  support BYOK images. Currently, when an embedding model with the same
  name appears in multiple images, it is only introduced once into the
  final OGX config, which causes errors related to missing embedding
  models for some vector databases.

The default vector database has been changed to:

quay.io/openstack-lightspeed/rag-content-openstack:alpha-ogx-os-docs-2025.2

This image was built MANUALLY and contains ONLY data for the nova
project. The default image should be replaced once the upstream pipeline
is ready to build llamastack-faiss compatible vector databases.

Assisted-By: Claude <noreply@anthropic.com>

This commit enhances the KUTTL tests to validate the generated
ogx_config.yaml and lightspeed-stack.yaml files using the
initContainers approach introduced in commit b2c6719.

The implementation uses a workaround strategy:

A) TestAssert steps extract the generated config files from the
   Lightspeed Stack and OGX containers

B) The diff command compares the extracted config files against
   expected reference files prepared in test/kuttl

This ensures comprehensive validation of the operator's config
generation functionality during testing.

Assisted-By: Claude <noreply@anthropic.com>
@umago
Contributor

umago commented May 19, 2026

/lgtm

@openshift-ci

openshift-ci Bot commented May 19, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lpiwowar, umago

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot Bot merged commit a8abf60 into openstack-lightspeed:lcore-migration May 19, 2026
7 checks passed
2 participants