fix(controller): drop model label from Deployment selector to make modelRef mutable (closes #301)#385

Merged
Defilan merged 1 commit into main from fix/issue-301-modelref-immutable-selector on May 3, 2026

Defilan (Member) commented May 3, 2026

Closes #301.

Bug

Deployment.Spec.Selector.MatchLabels included inference.llmkube.dev/model: <modelRef-value>. Kubernetes treats Deployment selectors as immutable after creation, so editing spec.modelRef on a running InferenceService failed silently:

  1. The CR change was accepted (kubectl returned success).
  2. Every subsequent reconcile failed at the apiserver with `field is immutable`.
  3. The Pod kept running the old model with no error surfaced to the user (the failures only showed up in controller logs).

This blocked common workflows like swapping a quant level or rolling to a different upstream fork.
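
The failure mode can be illustrated with a minimal sketch. It uses plain maps rather than the real Kubernetes types, and the `isvc` struct, `oldSelector`, and `selectorChanged` are hypothetical names for illustration only; the exact label values are also an assumption (the description only says they are derived from the service name).

```go
package main

import (
	"fmt"
	"reflect"
)

// isvc is a hypothetical, pared-down stand-in for the InferenceService CR.
type isvc struct {
	Name     string
	ModelRef string
}

// oldSelector mirrors the pre-fix behaviour described above: the selector
// carried the model label, so its contents depended on modelRef.
// Label values are assumed to equal the service name for this sketch.
func oldSelector(s isvc) map[string]string {
	return map[string]string{
		"app":                           s.Name,
		"inference.llmkube.dev/service": s.Name,
		"inference.llmkube.dev/model":   s.ModelRef,
	}
}

// selectorChanged reports whether an update would attempt to mutate the
// selector, which is the condition the apiserver rejects for Deployments.
func selectorChanged(before, after map[string]string) bool {
	return !reflect.DeepEqual(before, after)
}

func main() {
	before := isvc{Name: "chat", ModelRef: "llama-q4"}
	after := isvc{Name: "chat", ModelRef: "llama-q8"}
	// Swapping only modelRef still changes the selector under the old scheme.
	fmt.Println(selectorChanged(oldSelector(before), oldSelector(after)))
}
```

With the old label scheme, any edit to modelRef produces a different selector map, so every reconcile after the edit attempts a forbidden selector update.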

Fix

Split the label set:

  • deploymentSelectorLabels (new helper): app + inference.llmkube.dev/service. Both are derived from isvc.Name and never change over the InferenceService's lifetime, so the selector is byte-stable across reconciles and the apiserver `Update` succeeds.
  • Full label map (including the model label) still ships on Deployment.ObjectMeta.Labels and Pod template labels, so `kubectl get pods -l inference.llmkube.dev/model=<name>` keeps working as a filter.

When the user now edits modelRef, the selector stays identical, the controller updates the Pod template (new model label, new init container args, new model path), and Kubernetes does a standard rolling update.
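
A minimal sketch of the label split, assuming plain maps in place of the real Kubernetes types. Only the name `deploymentSelectorLabels` and the label keys come from this PR; its signature here, the `fullLabels` helper, and the label values are assumptions for illustration.

```go
package main

import "fmt"

const (
	labelApp     = "app"
	labelService = "inference.llmkube.dev/service"
	labelModel   = "inference.llmkube.dev/model"
)

// deploymentSelectorLabels returns only the labels derived from the service
// name. The signature and values are assumed; the real helper may differ.
func deploymentSelectorLabels(name string) map[string]string {
	return map[string]string{
		labelApp:     name,
		labelService: name,
	}
}

// fullLabels (hypothetical) returns the selector labels plus the model label.
// This set would go on the Deployment metadata and the Pod template, never
// on the selector itself.
func fullLabels(name, modelRef string) map[string]string {
	labels := deploymentSelectorLabels(name)
	labels[labelModel] = modelRef
	return labels
}

func main() {
	fmt.Println(deploymentSelectorLabels("chat"))
	fmt.Println(fullLabels("chat", "llama-q8"))
}
```

Because the full set is a superset of the selector labels, Pods created from the template still match the selector, while the model label remains available for filtering.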

Test plan

  • Two regression tests in `inferenceservice_deployment_test.go`:
    • selector must NOT contain `inference.llmkube.dev/model`
    • swapping `modelRef` on the same InferenceService produces Deployments whose selectors are deeply equal (this IS the apiserver immutability constraint, asserted at the source)
  • `go test ./internal/controller/... -count=1` passes
  • `golangci-lint run ./internal/controller/...` clean (0 issues)
  • No CRD changes; no chart changes
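
The two regression checks could look roughly like the following. This is a standalone sketch using plain functions and maps, not the actual tests in `inferenceservice_deployment_test.go`; `selectorFor`, `checkSelectorOmitsModel`, and `checkSelectorStable` are hypothetical names, and the label values are assumed.

```go
package main

import (
	"fmt"
	"reflect"
)

const modelLabel = "inference.llmkube.dev/model"

// selectorFor is a hypothetical stand-in for the controller code that builds
// the Deployment selector. modelRef is accepted but deliberately unused.
func selectorFor(name, modelRef string) map[string]string {
	_ = modelRef
	return map[string]string{
		"app":                           name,
		"inference.llmkube.dev/service": name,
	}
}

// checkSelectorOmitsModel fails if the selector carries the model label.
func checkSelectorOmitsModel(sel map[string]string) error {
	if _, ok := sel[modelLabel]; ok {
		return fmt.Errorf("selector must not contain %s", modelLabel)
	}
	return nil
}

// checkSelectorStable fails if two selectors built for the same service
// differ, which is what the apiserver would reject on update.
func checkSelectorStable(before, after map[string]string) error {
	if !reflect.DeepEqual(before, after) {
		return fmt.Errorf("selector changed: %v -> %v", before, after)
	}
	return nil
}

func main() {
	before := selectorFor("chat", "llama-q4")
	after := selectorFor("chat", "llama-q8")
	fmt.Println(checkSelectorOmitsModel(after), checkSelectorStable(before, after))
}
```

In the real suite these would presumably be assertions against the Deployment object the controller constructs, run under the project's test framework.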

…delRef mutable (closes #301)

The Deployment selector.matchLabels included
inference.llmkube.dev/model: <modelRef-value>, which Kubernetes treats
as immutable post-creation. Editing spec.modelRef on a running
InferenceService accepted the CR change but every reconcile failed at
the apiserver with "field is immutable", so the pod kept running the
old model with no error surfaced to the user.

Split labels into two sets:

- deploymentSelectorLabels: just app + inference.llmkube.dev/service.
  These never change over the InferenceService's lifetime, so the
  selector is stable and the apiserver Update succeeds.
- The full label set (including model) still ships on
  Deployment.ObjectMeta.Labels and Pod template labels so kubectl
  filtering with -l inference.llmkube.dev/model=<name> keeps working.

When the user now edits modelRef, the selector is byte-identical, the
controller updates the Pod template (new model label, new init
container, new model path), and Kubernetes does a normal rolling
update.

Two regression tests:

- selector must not contain inference.llmkube.dev/model
- swapping modelRef on the same InferenceService must produce
  Deployments whose selectors are deeply equal (the exact apiserver
  immutability constraint)

Signed-off-by: Christopher Maher <chris@mahercode.io>

codecov Bot commented May 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Defilan merged commit a1de3bf into main May 3, 2026
19 checks passed
Defilan deleted the fix/issue-301-modelref-immutable-selector branch May 3, 2026 05:08
github-actions Bot mentioned this pull request May 3, 2026
Development

Successfully merging this pull request may close these issues.

InferenceService spec.modelRef is effectively immutable due to Deployment selector labels
