Support restart task for StatefulSet with updateStrategy: OnDelete #840

timothysmith0609 · 2021-11-05T16:39:02Z

Builds on #836 by handling the special case of a StatefulSet with updateStrategy: OnDelete. In this case, we need to find all pods with an owner reference to the targeted StatefulSet; this is done by matching on the UID of the StatefulSet.

More info on owner references can be found here.

peiranliushop

Need to bump the krane version as well.

johnjmartin · 2021-11-05T17:26:40Z

lib/krane/restart_task.rb

+    def delete_statefulset_pods(record)
+      pods = kubeclient.get_pods(namespace: record.metadata.namespace)
+      pods.select! do |pod|
+        pod.metadata.ownerReferences.find { |ref| ref.uid == record.metadata.uid }


Is record.metadata.uid something that identifies individual StatefulSets?

It uniquely identifies the resource throughout the lifetime of the cluster. That is, no two resources in a cluster will ever have the same UID.
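To illustrate the UID matching discussed in this thread, here is a minimal sketch that runs without a cluster. The `OwnerRef`, `Metadata`, and `Resource` structs and the `pods_owned_by` helper are hypothetical stand-ins for the objects a Kubernetes client would return; they are not krane's actual code.

```ruby
# Hypothetical stand-ins for API objects; only the fields used here are modelled.
OwnerRef = Struct.new(:kind, :name, :uid)
Metadata = Struct.new(:name, :uid, :ownerReferences)
Resource = Struct.new(:metadata)

# Returns the pods whose ownerReferences include the given owner's UID.
def pods_owned_by(owner, pods)
  pods.select do |pod|
    # ownerReferences may be absent on pods without an owner, so default to [].
    refs = pod.metadata.ownerReferences || []
    refs.any? { |ref| ref.uid == owner.metadata.uid }
  end
end

statefulset = Resource.new(Metadata.new("web", "uid-sts-1", nil))
pods = [
  Resource.new(Metadata.new("web-0", "uid-pod-a", [OwnerRef.new("StatefulSet", "web", "uid-sts-1")])),
  Resource.new(Metadata.new("web-1", "uid-pod-b", [OwnerRef.new("StatefulSet", "web", "uid-sts-1")])),
  Resource.new(Metadata.new("other-0", "uid-pod-c", [OwnerRef.new("StatefulSet", "other", "uid-sts-2")])),
  Resource.new(Metadata.new("standalone", "uid-pod-d", nil)),
]

owned = pods_owned_by(statefulset, pods)
puts owned.map { |pod| pod.metadata.name }.inspect
```

Matching on UID rather than on the owner's name avoids picking up pods that belong to a different resource that happens to share the name.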

timothysmith0609 · 2021-11-08T14:59:12Z

CI is showing a flaky error that I want to get to the bottom of before merging this. It appears the restart can be treated as successful even when several replicas of the StatefulSet have not yet reached the desired state.

kgalieva · 2022-01-25T16:56:34Z

Hey @timothysmith0609! Any updates on this PR?

timothysmith0609 · 2022-01-26T14:47:22Z

I completely forgot this PR existed. I just rebased on latest master and will see what the state of that CI flakiness is

timothysmith0609 · 2022-01-26T15:26:29Z

This is flaky because StatefulSet#deploy_succeeded? returns true unconditionally. Per c8cea8d, it seems reasonable to check the normal success criteria (those used for the RollingUpdate strategy) even when deploying a StatefulSet with OnDelete. We would still like to know if the underlying resources are healthy and that there are enough of them running, no?
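As a rough illustration of "healthy and enough of them running", the sketch below compares the desired replica count with the ready count reported in the StatefulSet status. The `statefulset_healthy?` helper and the plain-Hash input are assumptions made for this example; the actual check in c8cea8d may use different criteria.

```ruby
# Hypothetical helper: true when every desired replica reports ready.
# `statefulset` is a plain Hash shaped like the Kubernetes API object.
def statefulset_healthy?(statefulset)
  # spec.replicas defaults to 1 when it is not set.
  desired = statefulset.dig("spec", "replicas") || 1
  status = statefulset["status"] || {}
  # readyReplicas may be missing from the status; treat that as zero.
  ready = status["readyReplicas"] || 0
  ready == desired
end

healthy = { "spec" => { "replicas" => 3 }, "status" => { "replicas" => 3, "readyReplicas" => 3 } }
degraded = { "spec" => { "replicas" => 3 }, "status" => { "replicas" => 3, "readyReplicas" => 1 } }

puts statefulset_healthy?(healthy)
puts statefulset_healthy?(degraded)
```

A check of this kind is independent of the update strategy, which is why it can still be applied when the strategy is OnDelete.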

timothysmith0609 · 2022-01-26T16:29:22Z

@peiranliushop are you ok with the change I made in c8cea8d?

johnjmartin · 2022-01-26T16:38:50Z

> We would still like to know if the underlying resources are healthy and that there are enough of them running, no?

Definitely 👍. We expect pods to come up healthy shortly after they're deleted in this particular deploy strategy.

timothysmith0609 · 2022-01-26T19:59:50Z

> We expect pods to come up healthy shortly after they're deleted in this particular deploy strategy

This is a little nuanced, actually. When deploying, OnDelete means the underlying pods will not be deleted (this must be done manually by the user). In contrast, when we want to restart the pods (the feature being implemented here), we need to wait for the underlying pods to restart, even if OnDelete is the rollout strategy.

johnjmartin · 2022-01-31T20:55:59Z

> When deploying, OnDelete means the underlying pods will not be deleted (this must be done manually by the user). In contrast, when we want to restart the pods (the feature being implemented here), we need to wait for the underlying pods to restart, even if OnDelete is the rollout strategy.

Ah yes. In our case, we plan to use the feature by running a production-platform-next apply (which, with OnDelete, will not trigger a restart of our pods) and then following it with a production-platform-next restart. This way we get the effect of restarting all of our StatefulSet pods in one smooth go on each deploy.

timothysmith0609 · 2022-02-01T20:52:00Z

🤦 I forgot the pods would be named the same even after being deleted. I'll fix this tomorrow and get the tests passing before merging.

JamesOwenHall · 2022-02-02T14:18:58Z

> I forgot the pods would be named the same even after being deleted.

Ah true. You can probably just swap name for UID; that ought to fix it.
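The name-versus-UID point can be sketched as follows. StatefulSet pods are recreated under the same name, so a name comparison cannot tell an old pod from its replacement, while a UID comparison can. The `Pod` struct and the `restart_complete?` helper are hypothetical and are not taken from the fix that was merged.

```ruby
require "set"

# Hypothetical pod model with only the fields needed for this check.
Pod = Struct.new(:name, :uid, :ready)

# The restart is treated as complete once the expected number of pods exist,
# all of them are ready, and none carries a UID seen before the restart.
def restart_complete?(old_uids, current_pods, desired_replicas)
  return false unless current_pods.size == desired_replicas
  current_pods.all? { |pod| pod.ready && !old_uids.include?(pod.uid) }
end

before = [Pod.new("web-0", "uid-a", true), Pod.new("web-1", "uid-b", true)]
old_uids = before.map(&:uid).to_set

# web-0 has been recreated (same name, new UID); web-1 has not been replaced yet.
in_progress = [Pod.new("web-0", "uid-c", true), Pod.new("web-1", "uid-b", true)]
# Both pods have been recreated.
done = [Pod.new("web-0", "uid-c", true), Pod.new("web-1", "uid-d", true)]

puts restart_complete?(old_uids, in_progress, 2)
puts restart_complete?(old_uids, done, 2)
```

In the `in_progress` case the pod names are identical to those recorded before the restart, so a name-based comparison could not distinguish the untouched `web-1` from a recreated one.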

timothysmith0609 requested a review from a team as a code owner November 5, 2021 16:39

timothysmith0609 changed the title ~~support restart task for StatefulSet with updateStrategy: OnDelete~~ Support restart task for StatefulSet with updateStrategy: OnDelete Nov 5, 2021

timothysmith0609 requested a review from peiranliushop November 5, 2021 16:40

peiranliushop approved these changes Nov 5, 2021

timothysmith0609 requested review from peiranliushop and johnjmartin November 5, 2021 17:19

johnjmartin reviewed Nov 5, 2021

johnjmartin approved these changes Nov 5, 2021

timothysmith0609 added 3 commits January 26, 2022 09:46

support restart task for StatefulSet with updateStrategy: OnDelete

f668ac4

fix typo

2bba805

more typo

2719707

timothysmith0609 force-pushed the support-ondelete-statefulset-restart branch from abd2e52 to 2719707 Compare January 26, 2022 14:46

timothysmith0609 requested a review from a team as a code owner January 26, 2022 14:46

timothysmith0609 requested review from JamesOwenHall and removed request for a team January 26, 2022 14:46

At least check if things are healthy

c8cea8d

timothysmith0609 added 2 commits February 1, 2022 14:45

CHANGELOG

a09373e

check underlying pods on ss restart

86e500e

JamesOwenHall approved these changes Feb 1, 2022

fix restart statefulset test

d5afb23

timothysmith0609 merged commit 5749728 into master Feb 2, 2022

timothysmith0609 deleted the support-ondelete-statefulset-restart branch February 2, 2022 15:25

JamesHageman mentioned this pull request Feb 2, 2022

version 2.4.1 #872

Merged

stefanmb mentioned this pull request Mar 7, 2022

Resolve errors for StatefulSet restart with updateStrategy: OnDelete #876

Merged
