
Don't consider pod preempting a failure #317

Merged
@DazWorrall merged 1 commit into master from improve-preemption-handling on Jul 17, 2018

Conversation

@DazWorrall (Member) commented on Jul 16, 2018:

Similar to #293, pod/node preempting is a recoverable state (the desired pod count will reduce once the node has been reaped), so it isn't a valid condition for fast failure.
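For context, a minimal sketch of the status shape this change guards against, assuming the kubelet reports a reason of "Preempting" on the pod it evicts to admit a critical pod (the exact field values here are assumptions for illustration, not taken from this PR):

```ruby
# Hedged sketch: approximate status a pod reports while the kubelet preempts it.
# The "reason" and "message" values are assumptions for illustration only.
preempted_pod_status = {
  "phase"   => "Failed",
  "reason"  => "Preempting",
  "message" => "Preempted in order to admit critical pod"
}
```

With a status like this, the pod should be treated as recoverable rather than as a fast-failure condition.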

@DazWorrall requested review from KnVerey and dturn on July 16, 2018 at 09:07
```diff
@@ -170,6 +170,21 @@ def test_deploy_failed_is_false_for_evicted
     assert_nil pod.failure_message
   end
 
+  def test_deploy_failed_is_false_for_preempting
+    container_state = pod_spec.merge(
+      "status" => {
```
A reviewer (Contributor) commented:

Are you confident that this is what the status will look like (or at least the reason)? Otherwise this PR won't work.


@DazWorrall (Member, Author) replied:

Thanks both!
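For reference, a hedged sketch of how the truncated test hunk above might continue, mirroring the evicted-pod test visible in its context lines. build_synced_pod is a hypothetical helper standing in for however the suite actually constructs a Pod from a spec hash:

```ruby
# Hedged sketch only: build_synced_pod is a hypothetical helper, and the exact
# status fields are assumptions; the assertions mirror the neighbouring
# test_deploy_failed_is_false_for_evicted test.
def test_deploy_failed_is_false_for_preempting
  container_state = pod_spec.merge(
    "status" => {
      "phase" => "Failed",
      "reason" => "Preempting",
      "message" => "Preempted in order to admit critical pod"
    }
  )
  pod = build_synced_pod(container_state) # hypothetical helper
  refute pod.deploy_failed?
  assert_nil pod.failure_message
end
```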

@DazWorrall merged commit dc5fd47 into master on Jul 17, 2018
@DazWorrall deleted the improve-preemption-handling branch on July 17, 2018 at 08:35
@joekohlsdorf commented:

When I run into this case, it looks like progressDeadlineSeconds is not respected; kubernetes-deploy will watch and wait for the deployment of the affected resources forever. Is this intentional?

@dturn (Contributor) commented on Aug 13, 2018:

> When I run into this case, it looks like progressDeadlineSeconds is not respected; kubernetes-deploy will watch and wait for the deployment of the affected resources forever. Is this intentional?

I suspect that Kubernetes thinks progress is being made (the controller defines progress rather broadly), so the deployment never actually hits progressDeadlineSeconds without making progress. We added the --max-watch-seconds flag so that the deploy will fail after a configurable amount of wall-clock time in situations like this. If you think this is a bug and not something subtle in Kubernetes, please open a bug report, ideally with the output of kubernetes-deploy and the result of kubectl get -o json for the deployment/resource.
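(For reference, the flag is passed on the kubernetes-deploy command line, e.g. `kubernetes-deploy my-namespace my-context --max-watch-seconds=300`; `my-namespace` and `my-context` are placeholders here, and 300 is an arbitrary example timeout.)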
