Make upgrade wait time between batches configurable #217

Closed
alex-dabija opened this issue Oct 16, 2020 · 7 comments
Labels
area/kaas Mission: Cloud Native Platform - Self-driving Kubernetes as a Service kind/story provider/aws Related to cloud provider Amazon AWS target-release/12.7.0 team/firecracker

Comments

@alex-dabija

alex-dabija commented Oct 16, 2020

User Story

As a customer, I want to configure the wait time between batches of rolling nodes during a tenant cluster upgrade.

Background

The current wait time between batches of rolling nodes is hard-coded to 15 minutes. In some situations, such as nodes with lots of pods or workloads which take a long time to start, this amount of time is not enough to restore the system to a stable state before the next batch starts.

Requirements

  • Must be an optional feature, defined per tenant cluster in the first release (e.g. configured via an annotation on the Cluster CR).
@calvix
Member

calvix commented Oct 28, 2020

PauseTime - annotation aws.giantswarm.io/update-pause-time

The value should be in ISO 8601 duration format (http://en.wikipedia.org/wiki/ISO_8601#Durations), which is the same format AWS CloudFormation expects.

I want to have validation in the admission controller to inform or block users who use a wrong value, but I would also like a safeguard validation in aws-operator in case something fails, rather than ending up with a crashlooping operator. If aws-operator sees an invalid value, it would simply use the default one.
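The safeguard described above could look roughly like the sketch below. It is not the aws-operator code: the function names are made up for illustration, and it assumes the value uses the `PT#H#M#S` subset of ISO 8601 durations and that the default stays at the previously hard-coded 15 minutes.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"time"
)

// defaultPauseTime mirrors the 15-minute wait that was hard-coded before.
const defaultPauseTime = 15 * time.Minute

// pauseTimePattern accepts only the PT#H#M#S subset of ISO 8601 durations
// (an assumption of this sketch, not something stated in the thread).
var pauseTimePattern = regexp.MustCompile(`^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$`)

// parsePauseTime converts a value such as "PT20M" into a time.Duration.
// The name is hypothetical.
func parsePauseTime(value string) (time.Duration, error) {
	m := pauseTimePattern.FindStringSubmatch(value)
	// Reject non-matching input and a bare "PT" with no components.
	if m == nil || (m[1] == "" && m[2] == "" && m[3] == "") {
		return 0, fmt.Errorf("invalid pause time %q", value)
	}
	var total time.Duration
	units := []time.Duration{time.Hour, time.Minute, time.Second}
	for i, unit := range units {
		if m[i+1] == "" {
			continue
		}
		n, err := strconv.Atoi(m[i+1])
		if err != nil {
			return 0, fmt.Errorf("invalid pause time %q: %w", value, err)
		}
		total += time.Duration(n) * unit
	}
	return total, nil
}

// pauseTimeOrDefault falls back to the default instead of returning an
// error, so a bad annotation cannot make the operator crashloop.
func pauseTimeOrDefault(value string) time.Duration {
	d, err := parsePauseTime(value)
	if err != nil {
		return defaultPauseTime
	}
	return d
}

func main() {
	fmt.Println(pauseTimeOrDefault("PT1H30M"))    // 1h30m0s
	fmt.Println(pauseTimeOrDefault("20 minutes")) // 15m0s
}
```

The admission controller would still be the place to reject bad values with a clear message; this fallback only covers the case where an invalid value gets through anyway.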

The annotation would be on either the cluster CR or the machine deployment CR; a machine deployment value would override any cluster value.
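A minimal sketch of that precedence, assuming the annotations of both CRs are available as plain maps. The function name is hypothetical, and writing the 15-minute default as `PT15M` is an assumption of this example.

```go
package main

import "fmt"

// pauseTimeAnnotation is the annotation key proposed in this thread.
const pauseTimeAnnotation = "aws.giantswarm.io/update-pause-time"

// defaultPauseTime expresses the existing 15-minute default as an
// ISO 8601 duration (assumed representation).
const defaultPauseTime = "PT15M"

// resolvePauseTime picks the machine deployment value first, then the
// cluster value, then the default. Reading from a nil map is safe in Go.
func resolvePauseTime(clusterAnnotations, machineDeploymentAnnotations map[string]string) string {
	if v, ok := machineDeploymentAnnotations[pauseTimeAnnotation]; ok && v != "" {
		return v
	}
	if v, ok := clusterAnnotations[pauseTimeAnnotation]; ok && v != "" {
		return v
	}
	return defaultPauseTime
}

func main() {
	cluster := map[string]string{pauseTimeAnnotation: "PT30M"}
	machineDeployment := map[string]string{pauseTimeAnnotation: "PT5M"}

	fmt.Println(resolvePauseTime(cluster, machineDeployment)) // PT5M
	fmt.Println(resolvePauseTime(cluster, nil))               // PT30M
	fmt.Println(resolvePauseTime(nil, nil))                   // PT15M
}
```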

@giantswarm/team-firecracker-engineers @giantswarm/sig-ux please raise any concerns/suggestions

@paurosello

Do we want to use the alpha.... annotation for this feature?

@njuettner
Member

Do we want to use the alpha.... annotation for this feature?

Plus one, we should be good citizens and start using alpha/beta/stable more often, both to indicate that this is a feature which evolves over time and to clarify the expectations for such features.

@alex-dabija
Author

I'm fine if we version the annotations or if we don't, because I do see them more as a temporary mechanism to enable a new feature.

If they are versioned, we would have to make sure that newer versions of the aws-operator consider all previous versions of the annotation when they try to get the required information. This might complicate the implementation a bit.
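To illustrate the extra lookup work, here is a rough sketch of checking several versions of the annotation in order. Only the unprefixed key appears in this thread; the `alpha.` and `beta.` keys and the function name are hypothetical.

```go
package main

import "fmt"

// pauseTimeKeys lists annotation keys from newest to oldest. The
// prefixed keys are hypothetical examples of versioned names.
var pauseTimeKeys = []string{
	"aws.giantswarm.io/update-pause-time",
	"beta.aws.giantswarm.io/update-pause-time",  // hypothetical
	"alpha.aws.giantswarm.io/update-pause-time", // hypothetical
}

// lookupPauseTime returns the value of the first key that is set,
// together with the key it was found under.
func lookupPauseTime(annotations map[string]string) (value, key string, ok bool) {
	for _, k := range pauseTimeKeys {
		if v, found := annotations[k]; found && v != "" {
			return v, k, true
		}
	}
	return "", "", false
}

func main() {
	annotations := map[string]string{
		"alpha.aws.giantswarm.io/update-pause-time": "PT20M",
	}
	if v, k, ok := lookupPauseTime(annotations); ok {
		fmt.Printf("%s (from %s)\n", v, k)
	}
}
```

Each new version of the annotation adds one more entry to the list, which is the kind of extra bookkeeping the operator would have to carry.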

@paurosello

As long as we do the same with all annotations we can manage it easily I think.

I like having the "alpha" in there so customers know this is a new feature and that they should be careful with it.

@calvix
Member

calvix commented Nov 5, 2020

The functionality is merged into aws-operator and the validation into the admission controller; only the docs remain.

@calvix
Member

calvix commented Nov 19, 2020

released as part of giantswarm/releases#512

@calvix calvix closed this as completed Nov 19, 2020
Giant Swarm Roadmap (Deprecated) automation moved this from Ready Soon ( <4 weeks ) to Released Nov 19, 2020

4 participants