Stage confineSpecUpdateRollout changes in annotation instead of writing directly to .spec #64

@rfranzke

/area control-plane
/kind enhancement
/label teamsize/medium

What is the topic about?:

Problem

With confineSpecUpdateRollout enabled, spec changes to a Shoot are written directly to .spec but only reconciled during the next maintenance window. As a result, .spec reflects the desired future state, not the currently applied state. This confuses users and tooling: looking at the Shoot, you cannot tell whether what you see is actually running or still pending.

Proposed Solution

Instead of writing changes directly to .spec, stage them as a patch in an annotation (e.g., shoot.gardener.cloud/staged-spec-patch). The .spec continues to reflect the currently applied state. During the maintenance window, the staged patch is applied to .spec, and reconciliation proceeds as usual.

Key semantics

  • .spec always reflects what is actually running, so there is no ambiguity
  • Patches accumulate: multiple staged changes before the next maintenance window are merged into one cumulative patch
  • Staged changes are inspectable: users/tooling can diff the staged patch against current .spec to see what will change at the next maintenance window
  • Staged changes can be cancelled: remove the annotation before the maintenance window
  • Urgent changes: remove the staged annotation, make the change directly to .spec, then trigger reconciliation via the operation annotation as usual
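
The accumulation and inspection semantics above can be sketched in Go. This is only an illustration under assumptions: it treats the staged patch as a JSON merge patch (RFC 7386), which is one possible schema and not a decision made in this issue, and the function names (applyMergePatch, accumulate, mergeStaged, preview) are hypothetical. The accumulate helper also ignores one edge case: if a later patch sets an object where an earlier patch staged a null or a scalar, the folded patch is not exactly equivalent to applying both in sequence.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// applyMergePatch applies a JSON merge patch (RFC 7386) to target.
// A null in the patch deletes the key; nested objects merge recursively;
// everything else (scalars, arrays) replaces the target value.
func applyMergePatch(target, patch any) any {
	patchObj, ok := patch.(map[string]any)
	if !ok {
		return patch
	}
	result := map[string]any{}
	if targetObj, ok := target.(map[string]any); ok {
		for k, v := range targetObj {
			result[k] = v
		}
	}
	for k, v := range patchObj {
		if v == nil {
			delete(result, k)
			continue
		}
		result[k] = applyMergePatch(result[k], v)
	}
	return result
}

// accumulate folds a newer staged patch into an older one so that a single
// cumulative patch remains. Nulls are kept because they mean "delete".
func accumulate(older, newer map[string]any) map[string]any {
	out := map[string]any{}
	for k, v := range older {
		out[k] = v
	}
	for k, v := range newer {
		oldObj, oldIsObj := out[k].(map[string]any)
		newObj, newIsObj := v.(map[string]any)
		if oldIsObj && newIsObj {
			out[k] = accumulate(oldObj, newObj)
		} else {
			out[k] = v
		}
	}
	return out
}

// mergeStaged is the JSON-string wrapper around accumulate.
func mergeStaged(olderJSON, newerJSON string) (string, error) {
	var older, newer map[string]any
	if err := json.Unmarshal([]byte(olderJSON), &older); err != nil {
		return "", err
	}
	if err := json.Unmarshal([]byte(newerJSON), &newer); err != nil {
		return "", err
	}
	out, err := json.Marshal(accumulate(older, newer))
	return string(out), err
}

// preview returns the spec that would result from applying the staged patch.
func preview(specJSON, patchJSON string) (string, error) {
	var spec, patch any
	if err := json.Unmarshal([]byte(specJSON), &spec); err != nil {
		return "", err
	}
	if err := json.Unmarshal([]byte(patchJSON), &patch); err != nil {
		return "", err
	}
	out, err := json.Marshal(applyMergePatch(spec, patch))
	return string(out), err
}

func main() {
	staged, err := mergeStaged(`{"kubernetes":{"version":"1.32.0"}}`, `{"hibernation":{"enabled":true}}`)
	if err != nil {
		panic(err)
	}
	fmt.Println("cumulative patch:", staged)
	next, err := preview(`{"hibernation":{"enabled":false},"kubernetes":{"version":"1.31.1"}}`, staged)
	if err != nil {
		panic(err)
	}
	fmt.Println("spec after window:", next)
}
```

Comparing the output of preview with the current .spec gives the diff a user or tool would inspect before the maintenance window; cancelling corresponds to dropping the staged patch entirely.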

Considerations

  • Annotation vs. separate resource: For the hackathon, an annotation is sufficient. However, Kubernetes limits the combined size of all annotations on an object to 256 KiB. For production, a separate resource (e.g., StagedShootSpec) might be cleaner. Evaluate during the PoC whether annotation size becomes a practical concern.
  • Admission/validation: The staged patch must be validated at staging time, not just at apply time. Otherwise users discover validation errors only during the maintenance window. The admission webhook should apply the staged patch to the current .spec in-memory and run the full Shoot validation against the resulting spec. Additionally, the maintenance reconciler should re-validate before applying, in case .spec changed between staging and apply (e.g., a different field was updated directly).
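
A rough Go sketch of the staging-time check described above, under stated assumptions: admitStagedPatch, admitDeps, snapshotApply, and requireVersion are hypothetical names, the real Shoot validation is replaced by a toy rule, and the demo uses the "full spec snapshot" schema so that applying the staged value is trivial. The size check sums key and value lengths across all annotations, which is how the 256 KiB limit is counted.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

const (
	// Example annotation key taken from this issue.
	stagedPatchAnnotation = "shoot.gardener.cloud/staged-spec-patch"
	// Combined size limit for all annotation keys and values on one object.
	totalAnnotationLimit = 256 * 1024
)

// admitDeps holds the pieces a real webhook would supply (hypothetical type).
type admitDeps struct {
	apply    func(spec, patch []byte) ([]byte, error) // patch schema is still to be designed
	validate func(spec []byte) error                  // stand-in for the full Shoot validation
}

// admitStagedPatch rejects a staged patch that is too large, does not apply,
// or yields an invalid spec. The stored spec itself is never modified here.
func admitStagedPatch(spec []byte, annotations map[string]string, d admitDeps) error {
	patch, staged := annotations[stagedPatchAnnotation]
	if !staged {
		return nil
	}
	total := 0
	for k, v := range annotations {
		total += len(k) + len(v)
	}
	if total > totalAnnotationLimit {
		return fmt.Errorf("annotations use %d bytes, limit is %d", total, totalAnnotationLimit)
	}
	merged, err := d.apply(spec, []byte(patch))
	if err != nil {
		return fmt.Errorf("staged patch does not apply: %w", err)
	}
	if err := d.validate(merged); err != nil {
		return fmt.Errorf("staged spec is invalid: %w", err)
	}
	return nil
}

// snapshotApply treats the staged value as a full spec snapshot (demo only).
func snapshotApply(_, patch []byte) ([]byte, error) {
	if !json.Valid(patch) {
		return nil, errors.New("snapshot is not valid JSON")
	}
	return patch, nil
}

// requireVersion is a toy validation rule standing in for Shoot validation.
func requireVersion(spec []byte) error {
	var s struct {
		Kubernetes struct {
			Version string `json:"version"`
		} `json:"kubernetes"`
	}
	if err := json.Unmarshal(spec, &s); err != nil {
		return err
	}
	if s.Kubernetes.Version == "" {
		return errors.New("kubernetes.version must be set")
	}
	return nil
}

var demoDeps = admitDeps{apply: snapshotApply, validate: requireVersion}

func main() {
	current := []byte(`{"kubernetes":{"version":"1.31.1"}}`)
	good := map[string]string{stagedPatchAnnotation: `{"kubernetes":{"version":"1.32.0"}}`}
	bad := map[string]string{stagedPatchAnnotation: `{"kubernetes":{}}`}
	fmt.Println("good:", admitStagedPatch(current, good, demoDeps))
	fmt.Println("bad: ", admitStagedPatch(current, bad, demoDeps))
}
```

Injecting apply and validate keeps the flow independent of whichever patch schema the design task settles on.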

Tasks

  1. Design the staging mechanism: Define the annotation schema (JSON patch, strategic merge patch, or full spec snapshot), accumulation semantics, and interaction with the existing confineSpecUpdateRollout field.
  2. Adapt admission/validation: Implement in-memory patch-and-validate in the admission webhook so staged patches are validated eagerly.
  3. Adapt the maintenance reconciler: Apply the staged patch to .spec during the maintenance window, re-validate before applying, and clear the annotation after successful application.
  4. Adapt API clients/tooling: Ensure kubectl and dashboard workflows correctly stage changes instead of writing to .spec when confineSpecUpdateRollout is enabled.
  5. PoC: Implement the end-to-end flow for a simple spec change (e.g., Kubernetes version update) to validate the approach.
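
As a starting point for task 3, the reconciler-side behavior might look roughly like the following Go sketch. Everything here is an assumption for illustration: the shoot struct is a tiny stand-in for the real Shoot object, promoteStaged is a hypothetical name, and the demo again uses a full spec snapshot as the staged value. The point it illustrates is the ordering: re-validate first, write .spec second, clear the annotation last, and leave both untouched on failure.

```go
package main

import (
	"errors"
	"fmt"
)

// Example annotation key taken from this issue.
const stagedPatchAnnotation = "shoot.gardener.cloud/staged-spec-patch"

// shoot is a deliberately minimal stand-in for the real Shoot object.
type shoot struct {
	Annotations map[string]string
	Spec        string // serialized .spec
}

// promoteStaged applies the staged patch during the maintenance window.
// It reports whether .spec was changed. On any error, .spec and the
// annotation are left as they were so the user can fix or cancel the patch.
func promoteStaged(
	s *shoot,
	inWindow bool,
	apply func(spec, patch string) (string, error),
	validate func(spec string) error,
) (bool, error) {
	patch, staged := s.Annotations[stagedPatchAnnotation]
	if !staged || !inWindow {
		return false, nil
	}
	next, err := apply(s.Spec, patch)
	if err != nil {
		return false, fmt.Errorf("applying staged patch: %w", err)
	}
	// Re-validate: .spec may have changed since the patch was staged.
	if err := validate(next); err != nil {
		return false, fmt.Errorf("re-validating staged spec: %w", err)
	}
	s.Spec = next
	delete(s.Annotations, stagedPatchAnnotation)
	return true, nil
}

// snapshotApply treats the staged value as the complete new spec (demo only).
func snapshotApply(_, patch string) (string, error) { return patch, nil }

// nonEmpty is a toy validation rule standing in for Shoot validation.
func nonEmpty(spec string) error {
	if spec == "" || spec == "{}" {
		return errors.New("spec must not be empty")
	}
	return nil
}

func main() {
	s := &shoot{
		Annotations: map[string]string{stagedPatchAnnotation: `{"kubernetes":{"version":"1.32.0"}}`},
		Spec:        `{"kubernetes":{"version":"1.31.1"}}`,
	}
	changed, err := promoteStaged(s, false, snapshotApply, nonEmpty)
	fmt.Println("outside window:", changed, err, s.Spec)
	changed, err = promoteStaged(s, true, snapshotApply, nonEmpty)
	fmt.Println("inside window: ", changed, err, s.Spec)
}
```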

Metadata

    Labels

    Q2/2026: This topic is relevant for the hackathon in Q2/2026.
    area/control-plane: Control plane related
    kind/enhancement: Enhancement, improvement, extension
    teamsize/medium: A team of 3 people.
