
Don't consider pod preempting a failure #317

Merged
@DazWorrall merged 1 commit into master from improve-preemption-handling on Jul 17, 2018

Conversation

@DazWorrall (Member) commented on Jul 16, 2018:

Similar to #293, pod/node preempting is a recoverable state (the desired pod count will reduce once the node has been reaped), so it isn't a valid condition for fast failure.
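For context, a minimal sketch of the status shape this change guards against, assuming the kubelet reports a reason of "Preempting" on the pod it evicts to admit a critical pod (the exact field values here are assumptions for illustration, not taken from this PR):

```ruby
# Hedged sketch: approximate status a pod reports while the kubelet preempts it.
# The "reason" and "message" values are assumptions for illustration only.
preempted_pod_status = {
  "phase"   => "Failed",
  "reason"  => "Preempting",
  "message" => "Preempted in order to admit critical pod"
}
```

With a status like this, the pod should be treated as recoverable rather than as a fast-failure condition.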

@DazWorrall requested review from KnVerey and dturn on July 16, 2018 at 09:07
```diff
@@ -170,6 +170,21 @@ def test_deploy_failed_is_false_for_evicted
     assert_nil pod.failure_message
   end
 
+  def test_deploy_failed_is_false_for_preempting
+    container_state = pod_spec.merge(
+      "status" => {
```
A reviewer (Contributor) commented:

Are you confident that this is what the status will look like (or at least the reason)? Otherwise this PR won't work.


@DazWorrall (Member, Author) replied:

Thanks both!
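For reference, a hedged sketch of how the truncated test hunk above might continue, mirroring the evicted-pod test visible in its context lines. build_synced_pod is a hypothetical helper standing in for however the suite actually constructs a Pod from a spec hash:

```ruby
# Hedged sketch only: build_synced_pod is a hypothetical helper, and the exact
# status fields are assumptions; the assertions mirror the neighbouring
# test_deploy_failed_is_false_for_evicted test.
def test_deploy_failed_is_false_for_preempting
  container_state = pod_spec.merge(
    "status" => {
      "phase" => "Failed",
      "reason" => "Preempting",
      "message" => "Preempted in order to admit critical pod"
    }
  )
  pod = build_synced_pod(container_state) # hypothetical helper
  refute pod.deploy_failed?
  assert_nil pod.failure_message
end
```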

@DazWorrall merged commit dc5fd47 into master on Jul 17, 2018
@DazWorrall deleted the improve-preemption-handling branch on July 17, 2018 at 08:35
@joekohlsdorf commented:

When I run into this case, it looks like progressDeadlineSeconds is not respected; kubernetes-deploy will watch and wait for the deployment of the affected resources forever. Is this intentional?

@dturn (Contributor) commented on Aug 13, 2018:

> When I run into this case, it looks like progressDeadlineSeconds is not respected; kubernetes-deploy will watch and wait for the deployment of the affected resources forever. Is this intentional?

I suspect that Kubernetes thinks progress is being made (the controller defines progress rather broadly), so the deployment never actually hits progressDeadlineSeconds without making progress. We added the --max-watch-seconds flag so that the deploy will fail after a configurable amount of wall-clock time in situations like this. If you think this is a bug and not something subtle in Kubernetes, please open a bug report, ideally with the output of kubernetes-deploy and the result of kubectl get -o json for the deployment/resource.
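(For reference, the flag is passed on the kubernetes-deploy command line, e.g. `kubernetes-deploy my-namespace my-context --max-watch-seconds=300`; `my-namespace` and `my-context` are placeholders here, and 300 is an arbitrary example timeout.)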
