Prevent instance group from being managed by rolling-update #7685

Closed
dene14 opened this issue Sep 26, 2019 · 7 comments · Fixed by #9348
Labels
area/rolling-update, lifecycle/stale

Comments

dene14 commented Sep 26, 2019

1. Describe IN DETAIL the feature/behavior/change you would like to see.
In some cases it's necessary to manage an instance group manually (for example, when instance-store volumes are used and a node needs some time to sync data back up after recreation).
I'm not sure whether instanceProtection helps with that; please let me know. If it doesn't:

2. Feel free to provide a design supporting your feature request.
Introduce a skipRollingUpdate flag in the instance group configuration, defaulting to false. It would then be the operator's responsibility to manage instance replacement in the ASG gracefully (see the sketch below).
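
For illustration, a minimal sketch of what the proposed field could look like in an InstanceGroup manifest. Note that skipRollingUpdate is only the flag suggested in this issue, not an existing kops API field, and the cluster/group names are placeholders:

```yaml
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: example.k8s.local
  name: nodes
spec:
  role: Node
  machineType: m5.large
  minSize: 3
  maxSize: 3
  # Hypothetical flag from this proposal: when true, `kops rolling-update cluster`
  # would leave this instance group untouched and the operator would replace
  # instances in the ASG manually.
  skipRollingUpdate: true
```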

zetaab (Member) commented Sep 29, 2019

I do not see the benefit of this feature because it already exists: when rolling updating, you can specify the instance groups to update.

dene14 (Author) commented Sep 29, 2019

This is fine as long as you're the only cluster manager or you have a small number of clusters under your control. Things tend to get forgotten, so a protection that's managed in code would be useful.

johngmyers (Member) commented:

/area rolling-update

fejta-bot commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Feb 9, 2020
johngmyers (Member) commented:

@justinsb Now that surging has landed, I'd like to come to a conclusion on your deferred review comment about the method I chose to disable rolling updates for an instance group.

Currently rolling updates are disabled for an instance group by setting both maxSurge and maxUnavailable to 0. Alternatively we could make that a validation error and introduce a separate boolean option under rollingUpdate.
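
For reference, a rough sketch of that existing mechanism, assuming the spec.rollingUpdate block described above in an InstanceGroup manifest (illustrative only; group names omitted):

```yaml
spec:
  rollingUpdate:
    maxSurge: 0         # never create surge (replacement) instances for this group
    maxUnavailable: 0   # never take instances in this group offline
# With both set to 0, `kops rolling-update cluster` leaves this instance group alone.
```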

/remove-lifecycle stale

k8s-ci-robot removed the lifecycle/stale label on Mar 9, 2020
fejta-bot commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Jun 7, 2020
johngmyers (Member) commented:

With node-local storage and post-replacement sync, the solution is to use a PodDisruptionBudget. One can include an additional Deployment in the scope of the PDB's selector, providing a pod that indicates ready only when the database has adequate replication.
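
A rough sketch of that pattern, assuming the database pods carry a label matching the PDB selector; all names, the image, and the replication check below are hypothetical placeholders (and the PDB API group depends on the cluster version: policy/v1beta1 at the time, policy/v1 on current clusters):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mydb-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      pdb-scope: mydb          # matches the database pods and the gate pod below
---
# Extra Deployment whose only job is to report "ready" once replication has caught up.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mydb-replication-gate
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mydb-replication-gate
  template:
    metadata:
      labels:
        app: mydb-replication-gate
        pdb-scope: mydb        # pulls this pod into the PDB's scope
    spec:
      containers:
        - name: gate
          image: example.com/mydb-replication-check:latest   # hypothetical image
          readinessProbe:
            exec:
              command: ["/bin/check-replication"]            # hypothetical check
            periodSeconds: 30
```

While the gate pod is not ready, the PDB has no disruptions to spare, so node drains (including those performed by `kops rolling-update cluster`) block until replication is adequate.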
