Conversation

@rhdedgar (Collaborator) commented Jun 17, 2025:

Updates the operator's controller to watch for changes to the ConfigMap specified in the LlamaStackDistribution CR.

Closes: #12

@rhdedgar (Collaborator, Author):

Some additional information:

A minimal example of an ollama run.yaml config is provided for demonstration purposes. Parameters for the example's ollama deployment match those of the current README.md file.

If a ConfigMap is specified in the LlamaStackDistribution CR, then any changes to that ConfigMap will result in a restart of the CR's Pod. The Pod will then mount the run.yaml configuration as a read-only file on startup.

Otherwise, if the LlamaStackDistribution CR has no userConfig section set, the default run.yaml config file is used; this is currently provided by /usr/local/lib/python3.10/site-packages/llama_stack/templates/ollama/run.yaml. ConfigMaps in the CR's namespace are ignored unless specified by the userConfig option.
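
For illustration, a hedged sketch of the mechanics described above: mount the referenced ConfigMap's run.yaml read-only into the container and stamp a hash of the ConfigMap data onto the pod template, so any change to the ConfigMap rolls the Pod. The helper name, mount path, and annotation key below are illustrative, not necessarily what this PR uses.

import (
	"crypto/sha256"
	"encoding/hex"
	"sort"

	corev1 "k8s.io/api/core/v1"
)

// addUserConfigToPod mounts the user-supplied run.yaml read-only and records a
// hash of the ConfigMap data so that any change to the ConfigMap rolls the Pod.
func addUserConfigToPod(podSpec *corev1.PodSpec, podAnnotations map[string]string, cm *corev1.ConfigMap) {
	podSpec.Volumes = append(podSpec.Volumes, corev1.Volume{
		Name: "user-config",
		VolumeSource: corev1.VolumeSource{
			ConfigMap: &corev1.ConfigMapVolumeSource{
				LocalObjectReference: corev1.LocalObjectReference{Name: cm.Name},
			},
		},
	})
	for i := range podSpec.Containers {
		podSpec.Containers[i].VolumeMounts = append(podSpec.Containers[i].VolumeMounts, corev1.VolumeMount{
			Name:      "user-config",
			MountPath: "/etc/llama-stack/run.yaml", // illustrative path
			SubPath:   "run.yaml",
			ReadOnly:  true,
		})
	}

	// Hash the ConfigMap data in a stable key order; a changed hash changes the
	// pod template, which triggers a rollout of the Pod.
	h := sha256.New()
	keys := make([]string, 0, len(cm.Data))
	for k := range cm.Data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte(cm.Data[k]))
	}
	podAnnotations["llamastack.io/user-config-hash"] = hex.EncodeToString(h.Sum(nil)) // illustrative key
}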

@rhdedgar (Collaborator, Author):

I'm also open to changing things like:

  • Field names, if names like UserConfig should more clearly reflect their run.yaml association.
  • Hashing methods, if another approach is preferred for tracking.
  • Updating the README.md file with instructions once this is finalized.

@leseb (Collaborator) left a comment:

Looks solid, thanks!

Comment on lines 212 to 240
Watches(
	&corev1.ConfigMap{},
	handler.EnqueueRequestsFromMapFunc(r.findLlamaStackDistributionsForConfigMap),
	builder.WithPredicates(predicate.ResourceVersionChangedPredicate{}),
).
Collaborator:

I wonder if we can include another predicate to diff the content of the cm on updates using a predicate.Funcs?

Collaborator Author:

That approach sounds good. I've now updated the code to reconcile only on changes to the ConfigMap's Data field, plus creation and deletion events.
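
For reference, a predicate along these lines is what the above describes; a sketch assuming go-cmp for the comparison (the predicate name and exact wiring are illustrative, not necessarily the final code in this PR):

import (
	"github.com/google/go-cmp/cmp"
	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

// configMapDataChangedPredicate passes create and delete events through, but
// only passes update events when the ConfigMap's Data or BinaryData changed.
var configMapDataChangedPredicate = predicate.Funcs{
	CreateFunc: func(e event.CreateEvent) bool { return true },
	DeleteFunc: func(e event.DeleteEvent) bool { return true },
	UpdateFunc: func(e event.UpdateEvent) bool {
		oldCM, okOld := e.ObjectOld.(*corev1.ConfigMap)
		newCM, okNew := e.ObjectNew.(*corev1.ConfigMap)
		if !okOld || !okNew {
			return false
		}
		return !cmp.Equal(oldCM.Data, newCM.Data) ||
			!cmp.Equal(oldCM.BinaryData, newCM.BinaryData)
	},
}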


// findLlamaStackDistributionsForConfigMap maps ConfigMap changes to LlamaStackDistribution reconcile requests.
func (r *LlamaStackDistributionReconciler) findLlamaStackDistributionsForConfigMap(ctx context.Context, configMap client.Object) []reconcile.Request {
	attachedLlamaStacks := &llamav1alpha1.LlamaStackDistributionList{}
Collaborator:

No immediate action needed, can be done in a followup, up to you. Can you use a Field Indexer with a FieldSelector in the List calls? This will drastically improve performance with a large number of ConfigMaps, which is not uncommon on a cluster.

Something like:

r.List(ctx, &matchedDistributions, client.MatchingFields{
	".spec.server.userConfig.configMapName": key,
})

but you need to set up an indexer first in the manager, see: https://book.kubebuilder.io/cronjob-tutorial/controller-implementation#setup
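
And the one-time indexer registration in SetupWithManager would be roughly this (again just a sketch; the index key constant is illustrative):

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	// llamav1alpha1 is this operator's own API package, imported as in the existing controller.
)

const configMapNameIndexKey = ".spec.server.userConfig.configMapName" // illustrative index key

func (r *LlamaStackDistributionReconciler) SetupWithManager(mgr ctrl.Manager) error {
	// Index each LlamaStackDistribution by the ConfigMap name it references, so
	// List calls can filter with a field selector instead of scanning every object.
	if err := mgr.GetFieldIndexer().IndexField(context.Background(),
		&llamav1alpha1.LlamaStackDistribution{}, configMapNameIndexKey,
		func(obj client.Object) []string {
			lsd, ok := obj.(*llamav1alpha1.LlamaStackDistribution)
			if !ok || lsd.Spec.Server.UserConfig == nil || lsd.Spec.Server.UserConfig.ConfigMapName == "" {
				return nil
			}
			return []string{lsd.Spec.Server.UserConfig.ConfigMapName}
		}); err != nil {
		return err
	}

	// ... the existing builder.ControllerManagedBy(mgr) wiring continues here.
	return nil
}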

Collaborator Author:

I started implementing this approach, but hit some errors around failing to list when using the field selector. I'll look into it more.

Collaborator:

Thanks for looking again into this.

Collaborator Author:

Found the cause of the errors I saw: missing selectablefield kubebuilder markers. I've made the necessary updates and will work on getting those changes included in this PR.
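
For context, a sketch of roughly where the marker ends up, on the CRD root type; the struct layout shown is the usual kubebuilder shape, and the exact marker syntax and JSONPath depend on the controller-gen version and the final field names:

// LlamaStackDistribution is the Schema for the llamastackdistributions API.
// +kubebuilder:object:root=true
// +kubebuilder:selectablefield:JSONPath=`.spec.server.userConfig.configMapName`
type LlamaStackDistribution struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   LlamaStackDistributionSpec   `json:"spec,omitempty"`
	Status LlamaStackDistributionStatus `json:"status,omitempty"`
}

Regenerating the CRD manifests then picks up the selectable field, assuming a controller-gen version and cluster that support CRD selectable fields.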

Collaborator Author:

I was able to include these changes in this PR.

@mergify bot commented Jun 18, 2025:

This pull request has merge conflicts that must be resolved before it can be merged. @rhdedgar please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Jun 18, 2025
@rhdedgar rhdedgar force-pushed the watch_run_yaml_configmap branch 3 times, most recently from 0a9654e to 5f40a92 on June 19, 2025 04:19
@rhdedgar (Collaborator, Author):

/hold

@mergify mergify bot removed the needs-rebase label Jun 19, 2025
@rhdedgar (Collaborator, Author):

I'll hold this until more testing can be done after the rebase (merge conflict).



@leseb (Collaborator) commented Jun 19, 2025:

/hold

Added the do-not-merge label instead. We don't have support for those commands.

@VaishnaviHire (Collaborator) left a comment:

Thanks for this change, @rhdedgar!

I think it would also be helpful to extend the README to include this feature - Deployment of Llama Stack Server

func (r *LlamaStackDistributionReconciler) reconcileUserConfigMap(ctx context.Context, instance *llamav1alpha1.LlamaStackDistribution) error {
	logger := log.FromContext(ctx)

	if instance.Spec.Server.UserConfig == nil || instance.Spec.Server.UserConfig.ConfigMapName == "" {
Collaborator:

This check seems redundant with the one in Reconcile()


// getConfigMapHash calculates a hash of the ConfigMap data to detect changes.
func (r *LlamaStackDistributionReconciler) getConfigMapHash(ctx context.Context, instance *llamav1alpha1.LlamaStackDistribution) (string, error) {
	if instance.Spec.Server.UserConfig == nil || instance.Spec.Server.UserConfig.ConfigMapName == "" {
Collaborator:

Same here, this check seems redundant with the one in Reconcile().

Collaborator Author:

Good catch! Upon further review, I've removed some duplicate checks like these and added helper functions for the reusable checks to improve readability. Things like SetupWithManager will look a bit clearer now.
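
For example, a small helper along these lines (name illustrative) lets that guard live in one place:

// hasUserConfig reports whether the CR references a user-supplied run.yaml ConfigMap.
// Centralizing this check avoids repeating the nil/empty guard in Reconcile,
// reconcileUserConfigMap, and getConfigMapHash. (llamav1alpha1 is the operator's API package.)
func hasUserConfig(instance *llamav1alpha1.LlamaStackDistribution) bool {
	return instance.Spec.Server.UserConfig != nil && instance.Spec.Server.UserConfig.ConfigMapName != ""
}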

@rhdedgar rhdedgar marked this pull request as draft June 25, 2025 04:22
@rhdedgar rhdedgar force-pushed the watch_run_yaml_configmap branch from 5f40a92 to 0ef764b on June 25, 2025 04:33
@rhdedgar
Copy link
Collaborator Author

I've updated this branch with my current working state (fieldSelector PoC). I'll get some additional changes made tomorrow.

I'll also address the most recent comments and include code for that feedback in this PR.

@rhdedgar rhdedgar force-pushed the watch_run_yaml_configmap branch from 0ef764b to 20c573e on June 26, 2025 17:55
@rhdedgar rhdedgar marked this pull request as ready for review June 26, 2025 18:01
@rhdedgar rhdedgar force-pushed the watch_run_yaml_configmap branch 2 times, most recently from d7d2d88 to 0c80a32 on June 26, 2025 20:17
@rhdedgar rhdedgar force-pushed the watch_run_yaml_configmap branch from 0c80a32 to fda5b86 on June 27, 2025 00:04
@rhdedgar rhdedgar requested a review from leseb June 27, 2025 01:33
@leseb leseb requested a review from VaishnaviHire June 27, 2025 14:34
@leseb (Collaborator) left a comment:

Solid! Thanks!

}

// Only trigger if Data or BinaryData has changed
dataChanged := !cmp.Equal(oldConfigMap.Data, newConfigMap.Data)
Collaborator:

Can we print the diff?

Collaborator Author:

Can do! I've added diff printing for ConfigMap Data when changes are detected.

Example:

2025-06-27T21:38:09Z	INFO	Referenced ConfigMap change detected	{"configMapName": "llama-stack-config", "configMapNamespace": "new-llama"}
2025-06-27T21:38:09Z	INFO	ConfigMap Data changed	{"configMapName": "llama-stack-config", "configMapNamespace": "new-llama"}
ConfigMap new-llama/llama-stack-config Data diff:
  map[string]string{
  	"run.yaml": (
  		"""
  		... // 3 identical lines
  		apis:
  		- inference
+ 		- scoring
  		providers:
  		  inference:
  		... // 11 identical lines
  		"""
  	),
  }
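
For reference, the diff above is the kind of output go-cmp produces; a sketch of how it might be generated and logged (helper name and message strings are illustrative, not necessarily the exact code in this PR):

import (
	"fmt"

	"github.com/go-logr/logr"
	"github.com/google/go-cmp/cmp"
	corev1 "k8s.io/api/core/v1"
)

// logConfigMapDataDiff logs that a watched ConfigMap's Data changed and prints
// the go-cmp diff between the old and new contents.
func logConfigMapDataDiff(logger logr.Logger, oldCM, newCM *corev1.ConfigMap) {
	if diff := cmp.Diff(oldCM.Data, newCM.Data); diff != "" {
		logger.Info("ConfigMap Data changed",
			"configMapName", newCM.Name, "configMapNamespace", newCM.Namespace)
		fmt.Printf("ConfigMap %s/%s Data diff:\n%s", newCM.Namespace, newCM.Name, diff)
	}
}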

@VaishnaviHire (Collaborator) commented Jun 27, 2025:

I see the following errors even before creating a LlamaStackDistribution CR. I think we should only watch ConfigMaps mentioned in the CRs. Maybe by adding labels when a ConfigMap is referenced in a CR?

ERROR	CRITICAL: Failed to list LlamaStackDistributions for ConfigMap reference check - assuming ConfigMap is referenced to prevent missing reconciliation events	{"configMapName": "trusted-ca", "configMapNamespace": "openshift-image-registry", "error": "field label not supported: spec.server.userConfig.configMapName"}

@rhdedgar (Collaborator, Author):

@VaishnaviHire's recent comment (I don't see it at the moment) helped me find that I had omitted a check to act only on ConfigMaps referenced by existing LlamaStackDistributions.

I had originally tried to keep the log output as low as possible to maintain readability, and as a result I missed some efficiency opportunities that would have been apparent from more verbose logging. I'll lean toward more verbose logging in the future.
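
Concretely, the map function can now confine itself to CRs that actually reference the changed ConfigMap; a sketch, reusing the hypothetical index key from the earlier field-indexer discussion:

import (
	"context"

	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/log"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
	// llamav1alpha1 is this operator's own API package, imported as in the existing controller.
)

// findLlamaStackDistributionsForConfigMap maps ConfigMap changes to reconcile
// requests, but only for LlamaStackDistributions whose userConfig references
// the ConfigMap that changed.
func (r *LlamaStackDistributionReconciler) findLlamaStackDistributionsForConfigMap(ctx context.Context, configMap client.Object) []reconcile.Request {
	logger := log.FromContext(ctx)

	attached := &llamav1alpha1.LlamaStackDistributionList{}
	if err := r.List(ctx, attached,
		client.InNamespace(configMap.GetNamespace()),
		client.MatchingFields{configMapNameIndexKey: configMap.GetName()},
	); err != nil {
		logger.Error(err, "failed to list LlamaStackDistributions for ConfigMap",
			"configMapName", configMap.GetName(), "configMapNamespace", configMap.GetNamespace())
		return nil
	}

	requests := make([]reconcile.Request, 0, len(attached.Items))
	for i := range attached.Items {
		requests = append(requests, reconcile.Request{
			NamespacedName: types.NamespacedName{
				Name:      attached.Items[i].Name,
				Namespace: attached.Items[i].Namespace,
			},
		})
	}
	return requests
}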

@rhdedgar rhdedgar merged commit 1d09c42 into llamastack:main Jul 1, 2025
6 checks passed
VaishnaviHire pushed a commit to VaishnaviHire/llama-stack-k8s-operator that referenced this pull request Jul 16, 2025
Updates the operator's controller to watch for the changes to the
ConfigMap specified in the LlamaStackDistribution CR.

Closes: llamastack#12

---------

Signed-off-by: Doug Edgar <dedgar@redhat.com>
(cherry picked from commit 1d09c42)
VaishnaviHire pushed a commit to VaishnaviHire/llama-stack-k8s-operator that referenced this pull request Oct 31, 2025
Signed-off-by: konflux-internal-p02 <170854209+konflux-internal-p02[bot]@users.noreply.github.com>
Co-authored-by: konflux-internal-p02[bot] <170854209+konflux-internal-p02[bot]@users.noreply.github.com>