How to categorize this topic?
/area ops-productivity
/kind enhancement
/label teamsize/medium
What is the topic about?:
Enhance the CloudProfile schema to align the MachineType definition with the MachineImage lifecycle proposed in GEP32 and extend it with successor and migrationPolicy.
# Current Date: 2026-04-15
apiVersion: core.gardener.cloud/v1beta1
kind: CloudProfile
spec:
  machineTypes:
  - name: m5.large
    cpu: 2
    gpu: 0
    memory: 8Gi
    lifecycle:
    - classification: preview
      # Implicitly starts if no startTime
    - classification: supported
      startTime: "2025-01-01T00:00:00Z"
    - classification: deprecated
      startTime: "2026-05-01T00:00:00Z"
    - classification: expired
      startTime: "2026-07-01T00:00:00Z"
    successor: m6i.large
    migrationPolicy: ForceUpgrade # Options: ForceUpgrade, Manual, BlockScaling
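To make the lifecycle semantics concrete, here is a minimal Python sketch of how the active classification could be derived from the stages in the example. It assumes that stages are listed in chronological order and that a stage without startTime is in effect from the beginning; the function names are illustrative and not part of any Gardener API.

```python
from datetime import datetime, timezone

# Lifecycle stages copied from the example CloudProfile. A missing
# "startTime" is assumed to mean "in effect from the beginning".
LIFECYCLE = [
    {"classification": "preview"},
    {"classification": "supported", "startTime": "2025-01-01T00:00:00Z"},
    {"classification": "deprecated", "startTime": "2026-05-01T00:00:00Z"},
    {"classification": "expired", "startTime": "2026-07-01T00:00:00Z"},
]


def parse_time(value):
    """Parse a UTC timestamp of the form 2025-01-01T00:00:00Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def current_classification(lifecycle, now):
    """Return the classification of the last stage that has already started.

    Assumes the stages are ordered chronologically (hypothetical helper).
    """
    current = None
    for stage in lifecycle:
        start = stage.get("startTime")
        if start is None or parse_time(start) <= now:
            current = stage["classification"]
    return current
```

With the date from the example (2026-04-15), `current_classification(LIFECYCLE, datetime(2026, 4, 15, tzinfo=timezone.utc))` yields `"supported"`, because the deprecated stage only starts on 2026-05-01.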
The successor field would define the new MachineType while the migrationPolicy defines how to handle it. Possible options would be ForceUpgrade, Manual or BlockScaling.
ForceUpgrade
Once expired.startTime is reached, Gardener patches the Shoot spec to use the successor.
This keeps the infrastructure free of technical debt, but it is the most disruptive option for sensitive StatefulSets.
Manual
This offers the most user control and matches the current behavior: updating clusters to a new MachineType once the old one reaches its EOL is a manual task.
BlockScaling
Existing nodes are allowed to stay, but the MachineControllerManager is blocked from creating new nodes of the expired type. This prevents attempts to provision instances of a MachineType that is no longer available.
For backwards compatibility, Manual or BlockScaling should be the default.
It should also be possible to override this in the Shoot spec if needed, e.g.
metadata:
annotations:
confirmation.gardener.cloud/skip-machine-type-migration: "true"
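The three policies and the override could be combined roughly as in the following Python sketch. It is only an illustration: the action names are made up, the fallback to Manual is one of the two defaults proposed above, and letting the annotation affect only ForceUpgrade is an assumption that the open questions below still need to settle.

```python
from enum import Enum

SKIP_ANNOTATION = "confirmation.gardener.cloud/skip-machine-type-migration"


class MigrationPolicy(Enum):
    FORCE_UPGRADE = "ForceUpgrade"
    MANUAL = "Manual"
    BLOCK_SCALING = "BlockScaling"


def effective_policy(value):
    """Fall back to Manual when no migrationPolicy is set (assumed default)."""
    return MigrationPolicy(value) if value else MigrationPolicy.MANUAL


def action_on_expiry(policy, successor, annotations):
    """Describe what happens to a worker pool whose machine type has expired.

    Returns an (action, machine_type) tuple. All action names are hypothetical.
    """
    if policy is MigrationPolicy.FORCE_UPGRADE:
        if annotations.get(SKIP_ANNOTATION) == "true":
            return ("skip", None)  # the user opted out via the annotation
        return ("patch-shoot-spec", successor)
    if policy is MigrationPolicy.BLOCK_SCALING:
        return ("block-new-nodes", None)  # existing nodes are kept
    return ("none", None)  # Manual: the user migrates on their own
```

For example, `action_on_expiry(effective_policy("ForceUpgrade"), "m6i.large", {})` returns `("patch-shoot-spec", "m6i.large")`, while the same call with the skip annotation set to `"true"` returns `("skip", None)`.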
For unexpected unavailability by the cloud provider, the unavailable classification can be used.
The user should be warned when the current MachineType is deprecated and the successor should be recommended to them, e.g. via a Shoot constraint MachineTypeDeprecated analogous to existing version constraints.
successor should be a single value. A user would always have the choice to pick any other MachineType, but Gardener should have a definitive successor.
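As a rough illustration of the warning, the message of such a MachineTypeDeprecated constraint might be built like this. The function and the message wording are hypothetical, not an existing Gardener API.

```python
def deprecation_warning(machine_type, classification, successor=None):
    """Build the message for a hypothetical MachineTypeDeprecated constraint.

    Returns None when the machine type is not deprecated.
    """
    if classification != "deprecated":
        return None
    message = f"Machine type {machine_type} is deprecated."
    if successor:
        # Only one successor is recommended, matching the single-value field.
        message += f" Recommended successor: {successor}."
    return message
```

Because successor is a single value, the recommendation is unambiguous; users remain free to choose a different MachineType.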
Of course, everything is open to discussion, but here are some open questions:
- What happens to worker pools whose configuration is incompatible with the new MachineType (e.g. zones)?
- Will there be a rolling update, or will only the spec be changed and picked up by the next reconcile (i.e. the node pool is updated in the maintenance window)?
- What about hibernated Shoots?
- What happens if the successor is deprecated or expired as well?
- What happens if the successor chain is circular?
- How to handle NamespacedCloudProfile?
- How does unavailable differ from expired? Will it start without a date?
- Should the override annotation confirmation.gardener.cloud/skip-machine-type-migration apply only to ForceUpgrade (skipping the patch for one maintenance window), or should there be a symmetric mechanism for BlockScaling (e.g. gardener.cloud/operation=force-machine-type-migration to allow scaling despite the block)?
- Who is allowed to set the override annotation? Should this be enforced via an admission plugin, so that regular project members cannot bypass an explicitly configured CloudProfile policy?
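For the two questions about successor chains, one possible answer is to follow the chain until a usable MachineType is found and to reject circular chains, ideally already when the CloudProfile is validated. The Python sketch below shows that idea; the data layout and the function name are assumptions made for illustration only.

```python
def resolve_successor(name, machine_types):
    """Follow successor links until a usable machine type is found.

    machine_types maps a name to a dict with a "classification" and an
    optional "successor" (hypothetical layout). Returns the first successor
    that is neither deprecated nor expired, or None if the chain ends
    without one. Raises ValueError if the chain is circular.
    """
    seen = {name}
    current = machine_types[name].get("successor")
    while current is not None:
        if current in seen:
            raise ValueError(f"circular successor chain at {current}")
        seen.add(current)
        entry = machine_types.get(current)
        if entry is None:
            return None  # successor is not defined in this profile
        if entry["classification"] not in ("deprecated", "expired"):
            return current
        current = entry.get("successor")
    return None
```

Given an expired m5.large pointing to a deprecated m6i.large, which in turn points to a supported third type, the function returns the third type. A chain such as a -> b -> a raises ValueError instead of looping forever.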