Cordon all outdated nodes before any rolling update action #41
The reason I haven't designed the algorithm to cordon everything before beginning the draining phase is that if you cordon everything and there's a spike in load, scaling will be delayed by at least the time it takes for a new node to spin up, and possibly longer, because the rolling update handler may also be draining a node while the scheduler is trying to schedule pods created by an HPA scale-up. TL;DR: This was done on purpose; cordoning all nodes makes the upgrade faster, but it also increases the risk that the upgrade will no longer be "transparent"/"graceful" and may cause degraded application performance. That said, I'm not completely against implementing it as an optional feature if you believe it to be necessary for your use case(s).
I understand, that totally makes sense.
Sounds good!
Resolved by #42 |
Hey @TwiN, is this getting released any time soon? |
@someone-stole-my-name It was already available through the
Describe the feature request
The current behaviour is to iterate over every outdated node, cordoning and then immediately draining each one. I think the behaviour should instead be to first cordon all outdated nodes before doing anything else, and then proceed as usual.
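To make the difference concrete, here is a minimal sketch of the two orderings. The `Node` struct, node names, and the action log are hypothetical stand-ins; in the real handler these steps would be Kubernetes API calls (cordon = mark unschedulable, drain = evict pods):

```go
package main

import "fmt"

// Node is a hypothetical stand-in for a cluster node.
type Node struct {
	Name     string
	Outdated bool
}

// perNode mirrors the current behaviour: cordon and immediately drain
// each outdated node in turn. Pods evicted from an early node can still
// be rescheduled onto a later, not-yet-cordoned outdated node.
func perNode(nodes []Node) []string {
	var actions []string
	for _, n := range nodes {
		if n.Outdated {
			actions = append(actions, "cordon "+n.Name, "drain "+n.Name)
		}
	}
	return actions
}

// cordonFirst is the requested behaviour: cordon every outdated node up
// front so evicted pods can only land on up-to-date nodes, then drain.
func cordonFirst(nodes []Node) []string {
	var actions []string
	for _, n := range nodes {
		if n.Outdated {
			actions = append(actions, "cordon "+n.Name)
		}
	}
	for _, n := range nodes {
		if n.Outdated {
			actions = append(actions, "drain "+n.Name)
		}
	}
	return actions
}

func main() {
	nodes := []Node{{"a", true}, {"b", true}, {"c", false}}
	fmt.Println(perNode(nodes))     // [cordon a drain a cordon b drain b]
	fmt.Println(cordonFirst(nodes)) // [cordon a cordon b drain a drain b]
}
```

With `perNode`, pods drained from node `a` may land on still-schedulable node `b` and be evicted a second time when `b` is drained; `cordonFirst` rules that out at the cost of temporarily shrinking schedulable capacity.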
Why do you personally want this feature to be implemented?
I wish for this feature to be implemented because the current behaviour often (in my experience) leads to pods being rescheduled onto another outdated instance. This causes a lot of pod restarts during rolling updates, as pods get replaced more than once. It is especially bad for pods with a long terminationGracePeriod or a long startup period; it can happen that a pod doesn't even become ready after one replacement before it gets replaced again.
How long have you been using this project?
~3-4 months
Additional information
I would volunteer to implement this feature, even with backward compatibility if required.