Create a Delayed Release Process #3813
Labels
area/upgrades
Related to upgrading Bottlerocket
status/needs-proposal
Needs a more detailed proposal for next steps
type/enhancement
New feature or request
What I'd like:
The ability to delay Bottlerocket updates while still keeping them automated.
Description:
Increasingly, folks are updating nodes automatically. Tools like Karpenter watch the SSM parameters to detect drift and then automatically update the nodes (https://aws.amazon.com/blogs/containers/how-to-upgrade-amazon-eks-worker-nodes-with-karpenter-drift/). Bottlerocket Updater does something similar. For the most recent release, Karpenter started updating nodes before the GitHub release was even announced.
Automatic updating is desirable, as it allows companies to stay up to date with little effort. However, organizations often want changes like this to move through different environments. They may want a change to sit in dev for a few days, then in QA for a few days, before moving to production.
It would be a helpful addition to the Bottlerocket release process if a workflow like that could be supported while still being automated. This would give organizations a period in which they could identify issues with a release and, if needed, delay the automated rollout to production environments.
Potential Implementation:
Instead of having a single SSM parameter for "latest release", there could be parameters for:
This would, I believe, also require changes in Karpenter or Bottlerocket Updater, but it would allow us to configure Dev to be on the latest release, QA to be 3 days behind, and Prod to be 7 days behind.
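To make the idea concrete, here is a minimal sketch of the selection logic a delayed parameter could encode: for each environment, pick the newest release that has been available for at least the configured number of days. The `Release` type, `pick_release` function, version numbers, and dates are all hypothetical and only for illustration; none of this is an existing Bottlerocket, Karpenter, or SSM feature.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Release:
    version: str         # illustrative version string
    published: datetime  # when the release became available


def pick_release(releases, min_age_days, now):
    """Return the newest release published at least min_age_days before now."""
    cutoff = now - timedelta(days=min_age_days)
    eligible = [r for r in releases if r.published <= cutoff]
    # default=None covers the case where no release is old enough yet.
    return max(eligible, key=lambda r: r.published, default=None)


# Hypothetical release history, not real Bottlerocket data.
now = datetime(2024, 3, 10, tzinfo=timezone.utc)
releases = [
    Release("1.0.0", datetime(2024, 2, 1, tzinfo=timezone.utc)),
    Release("1.1.0", datetime(2024, 3, 5, tzinfo=timezone.utc)),
    Release("1.2.0", datetime(2024, 3, 9, tzinfo=timezone.utc)),
]

# Delays from the proposal: Dev on latest, QA 3 days behind, Prod 7 days behind.
delays = {"dev": 0, "qa": 3, "prod": 7}
targets = {env: pick_release(releases, days, now) for env, days in delays.items()}

for env, release in targets.items():
    print(env, release.version if release else "no eligible release")
```

With this sample data, each environment resolves to a different release, which is the behavior the separate parameters would need to expose.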
Potential Issues:
There are plenty of edge cases here. What if a release has a problem that is found after a day? How does the release process handle that? What if there are several releases in quick succession?
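One possible way to reason about these cases, sketched under the assumption that the release process could mark a release as withdrawn: withdrawn releases are skipped, and when several releases land close together, a delayed environment simply moves to the newest one that is old enough. The function name, the `withdrawn` set, and the sample data are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def pick_release_skipping_withdrawn(releases, min_age_days, now, withdrawn=frozenset()):
    """releases maps version -> publish time; returns a version string or None."""
    cutoff = now - timedelta(days=min_age_days)
    eligible = [
        (published, version)
        for version, published in releases.items()
        if published <= cutoff and version not in withdrawn
    ]
    # Newest eligible release wins; intermediate releases are skipped over.
    return max(eligible)[1] if eligible else None


# Hypothetical history with two releases one day apart.
now = datetime(2024, 3, 10, tzinfo=timezone.utc)
releases = {
    "1.0.0": datetime(2024, 2, 1, tzinfo=timezone.utc),
    "1.1.0": datetime(2024, 3, 1, tzinfo=timezone.utc),
    "1.1.1": datetime(2024, 3, 2, tzinfo=timezone.utc),
    "1.2.0": datetime(2024, 3, 9, tzinfo=timezone.utc),
}

prod_normal = pick_release_skipping_withdrawn(releases, 7, now)
prod_after_withdrawal = pick_release_skipping_withdrawn(
    releases, 7, now, withdrawn={"1.1.1"}
)
```

This leaves open who decides that a release is withdrawn and how that decision is published, which is the harder part of the question.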
Any alternatives you've considered:
This could likely be implemented somehow at the Karpenter/Bottlerocket Updater level, and may even be more appropriate there.
Additionally, I believe any organization could also accomplish this by creating their own copies of the AMIs, but I was thinking of something that would be a more generally available process.