
Resolve errors for StatefulSet restart with updateStrategy: OnDelete #876

Merged
stefanmb merged 4 commits into master from fix_ownerref_error on Mar 8, 2022

Conversation

@stefanmb (Contributor) commented on Mar 7, 2022

What are you trying to accomplish with this PR?

Resolve some logic errors in StatefulSet restart handling introduced in #840.

  1. If a bare pod is present, the current delete_statefulset_pods will crash. (See below.)
  2. The current success conditions for the restart cannot be met reliably because:
     1. an unresolved bug in Kubernetes prevents currentRevision from matching updateRevision, and
     2. for STS with OnDelete, the currentReplicas field is sometimes nil.

(stacktrace collapsed)

How is this accomplished?

  1. Introduce a new test to trigger the nil error condition.
  2. Guard against the nil dereference.
  3. Remove the readiness condition that is impossible to satisfy for STS with OnDelete.

What could go wrong?
Until the upstream fix lands, we'll still have to deploy with --no-verify-result for STS with OnDelete.

Acknowledgments

Thanks to @epk for helping to debug these issues.

@stefanmb changed the title from "Add new test" to "Resolve errors for StatefulSet restart with updateStrategy: OnDelete" on Mar 8, 2022
@@ -27,11 +30,10 @@ def deploy_succeeded?
"Consider switching to rollingUpdate.")
@success_assumption_warning_shown = true
end
else
success &= desired_replicas == status_data['currentReplicas'].to_i
@stefanmb (Contributor, Author) commented on Mar 8, 2022

This property is not supported for STS with OnDelete:

kubectl rollout status sts/foo                                                                                                                                                                          
error: rollout status is only available for RollingUpdate strategy type

... therefore this condition can never be true in this case and krane will stall.

Edit: It turns out currentReplicas can disappear sometimes, so this check can sometimes cause krane to stall, see below.
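
(For illustration only, a minimal Ruby sketch of the stall mechanism, using made-up status values rather than krane's real objects: when the API omits status.currentReplicas, the hash lookup returns nil, and nil.to_i is 0, so the old equality can never become true.)

  # Illustrative only; assumed values, not real cluster data.
  status_data = { 'readyReplicas' => 2, 'updatedReplicas' => 2 }  # no 'currentReplicas'
  desired_replicas = 2

  # nil.to_i == 0 in Ruby, so this stays false once the field disappears
  # and the restart never reports success.
  desired_replicas == status_data['currentReplicas'].to_i  # => false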

@timothysmith0609 (Contributor) commented:

How was the CI test in restart_task_test passing if this is the case?

@stefanmb (Contributor, Author) commented on Mar 8, 2022

@timothysmith0609 Great question!

I had to look into this a bit. The brief answer is that certain combinations of operations can cause the currentReplicas field to become nil. The simple test as-is does not hit those conditions, so it passes.

I've pushed a version of this test that hangs (in the same way I saw in real life) here: 5c4acd0

You can observe locally that this sequence of events leaves the stateful set in the following state:

  status:
    collisionCount: 0
    currentRevision: stateful-busybox-6b5489d7d6
    observedGeneration: 2
    readyReplicas: 2
    replicas: 2
    updateRevision: stateful-busybox-96676bd54
    updatedReplicas: 2

Notice the absence of currentReplicas.

I did not push this version of the test here because it hangs due to the previously mentioned bug, so the only way to run it is with verify_result: false, which doesn't make for a good test.

If you prefer, I can drop these changes from this PR since IRL I will have to run without verify result anyway for now.

@timothysmith0609 (Contributor) commented on Mar 8, 2022

Thank you for the explanation!

> If you prefer, I can drop these changes from this PR since IRL I will have to run without verify result anyway for now.

May as well keep the changes for now, to avoid repeating the same work later.

@@ -213,7 +213,7 @@ def patch_daemonset_with_restart(record)
  def delete_statefulset_pods(record)
    pods = kubeclient.get_pods(namespace: record.metadata.namespace)
    pods.select! do |pod|
-     pod.metadata.ownerReferences.find { |ref| ref.uid == record.metadata.uid }
+     pod.metadata&.ownerReferences&.find { |ref| ref.uid == record.metadata.uid }
@stefanmb (Contributor, Author) commented:

Fixes the nil errors encountered when bare pods are present in the namespace. At Shopify nearly all app namespaces have such a pod (e.g. a secret seeder).
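
(A sketch of the patched selection, copied from the diff above with explanatory comments added; kubeclient and record come from the surrounding RestartTask code.)

  pods = kubeclient.get_pods(namespace: record.metadata.namespace)
  pods.select! do |pod|
    # Bare (unmanaged) pods carry no ownerReferences, so the lookup returns nil;
    # with &. the block evaluates to nil (falsey) and the pod is filtered out
    # instead of raising NoMethodError.
    pod.metadata&.ownerReferences&.find { |ref| ref.uid == record.metadata.uid }
  end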

@timothysmith0609 (Contributor) commented:

🤦 Thank you for fixing that

@stefanmb added the 🪲 bug label on Mar 8, 2022
@@ -19,6 +19,9 @@ def status
end

def deploy_succeeded?
success = observed_generation == current_generation &&
desired_replicas == status_data['readyReplicas'].to_i &&
status_data['currentRevision'] == status_data['updateRevision']
@stefanmb (Contributor, Author) commented:

This check is currently broken in upstream Kubernetes for OnDelete, but if we remove the check, the method will return immediately, before the pods have actually been restarted successfully.

We could (re)implement some of the controller logic to check individual pods, but I believe running with --no-verify-result until the upstream fix lands is the better option.

Upstream fix: kubernetes/kubernetes#106059
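
(For reference, the retained check from the diff above, annotated with why it cannot pass for OnDelete until the upstream fix; this is a commented copy, not new logic.)

  success = observed_generation == current_generation &&
    desired_replicas == status_data['readyReplicas'].to_i &&
    # With updateStrategy: OnDelete the controller currently never rolls
    # currentRevision forward to updateRevision (kubernetes/kubernetes#106059),
    # so this clause stays false and verification has to be skipped with
    # --no-verify-result for now.
    status_data['currentRevision'] == status_data['updateRevision']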

@@ -36,7 +36,8 @@ def test_restart_by_annotation
end

def test_restart_statefulset_on_delete_restarts_child_pods
-  result = deploy_fixtures("hello-cloud", subset: "stateful_set.yml") do |fixtures|
+  result = deploy_fixtures("hello-cloud", subset: ["configmap-data.yml", "unmanaged-pod-1.yml.erb",
@stefanmb (Contributor, Author) commented:

Deploying a bare pod in the integration test without the fix produces the expected error:

Error:
RestartTaskTest#test_restart_statefulset_on_delete_restarts_child_pods:
NoMethodError: undefined method `find' for nil:NilClass
    /home/runner/work/krane/krane/lib/krane/restart_task.rb:216:in `block in delete_statefulset_pods'
    /opt/hostedtoolcache/Ruby/3.0.3/x64/lib/ruby/3.0.0/delegate.rb:349:in `select!'
    /opt/hostedtoolcache/Ruby/3.0.3/x64/lib/ruby/3.0.0/delegate.rb:349:in `block in delegating_block'
    /home/runner/work/krane/krane/lib/krane/restart_task.rb:215:in `delete_statefulset_pods'
    /home/runner/work/krane/krane/lib/krane/restart_task.rb:239:in `block in restart_statefulsets!'
    /home/runner/work/krane/krane/lib/krane/restart_task.rb:233:in `each'
    /home/runner/work/krane/krane/lib/krane/restart_task.rb:233:in `restart_statefulsets!'
    /home/runner/work/krane/krane/lib/krane/restart_task.rb:74:in `run!'
    /home/runner/work/krane/krane/lib/krane/restart_task.rb:49:in `run'
    /home/runner/work/krane/krane/test/integration/restart_task_test.rb:49:in `test_restart_statefulset_on_delete_restarts_child_pods'
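
(A standalone Ruby repro of the failure mode in the trace above, using OpenStruct stand-ins rather than real Kubernetes objects; the uid is a made-up placeholder.)

  require 'ostruct'

  # A bare pod has no ownerReferences, so metadata.ownerReferences is nil.
  bare_pod = OpenStruct.new(metadata: OpenStruct.new(ownerReferences: nil))
  owner_uid = "1234"  # placeholder

  # Old code path: raises NoMethodError (undefined method `find' for nil:NilClass).
  # bare_pod.metadata.ownerReferences.find { |ref| ref.uid == owner_uid }

  # Patched code path: returns nil, so the bare pod is simply filtered out.
  bare_pod.metadata&.ownerReferences&.find { |ref| ref.uid == owner_uid }  # => nil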

@stefanmb marked this pull request as ready for review March 8, 2022 03:52
@stefanmb requested a review from a team as a code owner March 8, 2022 03:52
@stefanmb requested review from JamesOwenHall and peiranliushop and removed request for a team March 8, 2022 03:52
@timothysmith0609 (Contributor) left a comment

❤️ Thank you for fixing this, just one small question

@stefanmb merged commit c9e81f7 into master on Mar 8, 2022
@stefanmb deleted the fix_ownerref_error branch on March 8, 2022 at 18:31
@stefanmb mentioned this pull request on Mar 8, 2022
Labels: 🪲 bug (Something isn't working)
3 participants