Gate proxyrunner StatefulSet apply by MCPServer generation #5024
Merged
Conversation
Force-pushed b77f444 to 5a4707d
Force-pushed 5a4707d to 2f45b9a
Codecov Report ❌ Patch coverage is
Additional details and impacted files

@@ Coverage Diff @@
##             main    #5024      +/-   ##
==========================================
+ Coverage   69.10%   69.13%   +0.03%
==========================================
  Files         556      556
  Lines       73283    73348      +65
==========================================
+ Hits        50641    50712      +71
+ Misses      19627    19618       -9
- Partials     3015     3018       +3
yrobla
reviewed
Apr 23, 2026
Contributor
yrobla
left a comment
Good implementation overall: the monotonic generation gate is well-reasoned, the backward-compat signal is clean, and the tests cover the seven core gate cases. Four findings below (one warning plus nits; no blockers).
yrobla
previously approved these changes
Apr 23, 2026
During a proxy Deployment rolling update the old and new ReplicaSet pods both run DeployWorkload, each calling server-side apply on the backing StatefulSet with the same field manager and Force: true. The stale pod's apply can land after the fresh pod's and clobber the image, leaving the StatefulSet pinned to the old digest even after MCPServer.spec.image has been bumped.

The operator now stamps MCPServer.metadata.generation into the RunConfig as a monotonic version. Proxyrunner threads it through to DeployWorkloadOptions. Before apply, the K8s runtime reads the existing StatefulSet's mcpserver-generation annotation; if it is strictly greater than ours, the apply is skipped. Otherwise our generation is stamped on the pod template and apply proceeds. Zero generation is the backward-compat signal and bypasses both the gate and the stamp, preserving existing behavior for callers that don't pass one.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Juan Antonio Osorio <ozz@stacklok.com>
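The gated-apply behavior this commit describes can be illustrated with a toy simulation. The `statefulSet` type, the `apply` helper, and the annotation handling below are simplified stand-ins for the real runtime code, not excerpts from it:

```go
package main

import (
	"fmt"
	"strconv"
)

// Annotation key as described in the commit message.
const genAnnotation = "toolhive.stacklok.dev/mcpserver-generation"

// statefulSet stands in for the real object: only the fields the race touches.
type statefulSet struct {
	image       string
	annotations map[string]string
}

// apply simulates a gated server-side apply: skip when the live object already
// carries a strictly newer generation, otherwise stamp our generation and write.
// A zero generation bypasses both the gate and the stamp (backward compat).
func apply(live *statefulSet, image string, gen int64) {
	if gen != 0 {
		if raw, ok := live.annotations[genAnnotation]; ok {
			if theirs, err := strconv.ParseInt(raw, 10, 64); err == nil && theirs > gen {
				return // stale writer loses: the desired outcome
			}
		}
		live.annotations[genAnnotation] = strconv.FormatInt(gen, 10)
	}
	live.image = image
}

func main() {
	live := &statefulSet{annotations: map[string]string{}}
	apply(live, "app@sha256:new", 8) // fresh pod's apply lands first
	apply(live, "app@sha256:old", 7) // stale pod's apply lands second and is gated
	fmt.Println(live.image)          // app@sha256:new
}
```

Without the generation check, the second call would win and pin the image to the old digest, which is exactly the race described above.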
Force-pushed 2f45b9a to f9cae74
yrobla
approved these changes
Apr 23, 2026
Summary
During a rolling update of the proxy Deployment, two proxyrunner pods are alive at the same time: the old-RS pod with the old args, and the new-RS pod with the new args. Both call DeployWorkload → server-side apply on the backing StatefulSet as the same field manager with Force: true. The stale writer can land after the fresh writer and clobber the image back to the old digest; the StatefulSet then re-rolls its pod onto the old image and stays there indefinitely. Confirmed in production on v0.24.0; details and timeline in #5023.

The fix:
- The operator stamps MCPServer.metadata.generation into the rendered RunConfig as a monotonic version.
- Proxyrunner threads it through DeployWorkloadOptions and into the K8s runtime.
- Before Apply, the runtime reads the existing StatefulSet's toolhive.stacklok.dev/mcpserver-generation pod-template annotation; if it is strictly greater than ours, the apply is skipped. Otherwise our generation is stamped on the pod template and apply proceeds as today.

Fixes #5023
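The skip decision can be sketched as a pure function over the existing pod-template annotations. `shouldSkipApply` below is a hypothetical simplification: the real gate also performs the `Get` on the StatefulSet and handles the not-found case:

```go
package main

import (
	"fmt"
	"strconv"
)

// Pod-template annotation key from the summary above.
const genAnnotation = "toolhive.stacklok.dev/mcpserver-generation"

// shouldSkipApply reports whether the apply should be skipped: only when the
// existing StatefulSet carries a strictly greater generation than ours.
// Zero ourGen disables the gate (backward compat); a missing or unparseable
// annotation falls through to applying.
func shouldSkipApply(annotations map[string]string, ourGen int64) bool {
	if ourGen == 0 {
		return false
	}
	raw, ok := annotations[genAnnotation]
	if !ok {
		return false
	}
	theirs, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return false // the real implementation warns here and applies anyway
	}
	return theirs > ourGen
}

func main() {
	existing := map[string]string{genAnnotation: "7"}
	fmt.Println(shouldSkipApply(existing, 5)) // true: stale writer is gated
	fmt.Println(shouldSkipApply(existing, 7)) // false: equal generation re-applies
	fmt.Println(shouldSkipApply(existing, 0)) // false: unversioned caller bypasses
}
```

Note the strict inequality: an equal generation still applies, so a restarted pod of the current generation can repair drift in the StatefulSet.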
Type of change
Test plan
- Unit tests pass (task test)
- E2E tests pass (task test-e2e)
- Lint passes (task lint-fix)

New table-driven test TestDeployWorkload_RunConfigMCPServerGenerationGate covers the seven gate cases: absent STS; existing STS with a missing, older, equal, strictly newer, or unparseable annotation; and zero options generation (backward-compat). Round-trip JSON test for MCPServerGeneration with omitempty behavior. Operator-side test that createRunConfigFromMCPServer sets the generation from the CR. Three pre-existing operator determinism tests continue to pass; they were the reason this PR pivoted from a time.Time RenderedAt field to int64 MCPServerGeneration.

API Compatibility
- This PR does not break the v1beta1 API, OR the api-break-allowed label is applied and the migration guidance is described above.

No CRD or REST API changes. The RunConfig JSON schema gains an omitempty integer field; old consumers ignore unknown fields, and new consumers see zero when reading old ConfigMaps.

Does this introduce a user-facing change?
No behavior change in the common case. Users will see a new pod-template annotation toolhive.stacklok.dev/mcpserver-generation on StatefulSets backing MCPServers created by upgraded operators. Users debugging a skipped apply will find a DEBUG log line on the proxyrunner pod: skipping StatefulSet apply; newer MCPServer generation already applied.

Implementation plan
Approved implementation plan
Design
Monotonic version carried end-to-end from the CR to the StatefulSet annotation.
- RunConfig gains MCPServerGeneration int64 with JSON tag mcpserver_generation,omitempty. Zero is the unversioned / backward-compat signal.
- WithMCPServerGeneration(gen int64) builder option.
- runner.WithMCPServerGeneration(m.Generation) in createRunConfigFromMCPServer.
- DeployWorkloadOptions.RunConfigMCPServerGeneration int64 carries it through the runtime seam.
- runner.Run → runtime.Setup → k8s DeployWorkload plumb the value.
- runConfigGeneration(options): safe extract with nil-options handling.
- shouldSkipStatefulSetApply(ctx, ns, name, ourGen): Get the STS, parse the annotation, return true when theirs > ours. Parse errors warn and fall through. Not-found returns false. Zero ourGen returns false (no gate).
- applyStatefulSet(...): stamps the annotation when ourGen > 0, builds the apply-config, performs SSA.
- Constant RunConfigMCPServerGenerationAnnotation = "toolhive.stacklok.dev/mcpserver-generation" alongside serviceFieldManager. A doc comment calls out that the gate only becomes effective once proxyrunner is upgraded.

Invariants

- Nil options extract safely: runConfigGeneration returns (0, nil) and logs at DEBUG.

Pivot during implementation
The first draft used RenderedAt time.Time = time.Now().UTC(). Two problems surfaced during test validation:

- time.Now() made createRunConfigFromMCPServer non-deterministic. The operator's ConfigMap content-checksum would have changed on every reconcile, which would have flipped the proxy Deployment's pod-template annotation and caused the proxy to roll on every reconcile, defeating the entire system. Three pre-existing determinism tests regressed to prove it.
- omitempty does not actually drop a zero time.Time (a Go JSON quirk); backward-compat output would have contained "0001-01-01T00:00:00Z".

Pivoted to int64 MCPServerGeneration sourced from MCPServer.metadata.generation. omitempty works on int64. Generation is monotonic (K8s-enforced) and only bumps on spec changes, so RunConfig rendering is deterministic across no-op reconciles. The three determinism tests pass unchanged.

Scope boundary

This PR fixes the image / pod-spec dimension of the race. It does not cover spec.replicas clobbering (the scaling dimension tracked in #4484). Phase 2, moving StatefulSet ownership into the operator, is the complete architectural fix and would close both. Out of scope here.
- Related: the GenerationChangedPredicate efficiency work ("Add generation-change predicates to controller watches" #4635) uses the same .metadata.generation signal for a different purpose; Phase 1's deterministic Generation-based rendering also reduces the no-op-reconcile churn that #4635 targets.
- Related: "Proxy runner triggers StatefulSet rolling update on every restart, causing crash loop" #4877 (closed; same family, different symptom).
- The field manager is unchanged (toolhive-container-manager) and apply still uses Force: true. The TOCTOU window between Get and Apply is acceptable because the worst case is losing to a strictly newer writer, which is the desired outcome.

Generated with Claude Code