Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support timeout override for deployments #414

Merged
merged 7 commits into from
Jan 24, 2019
Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -226,7 +226,7 @@ This is a limitation of the current implementation.
### Customizing behaviour with annotations
- `kubernetes-deploy.shopify.io/timeout-override`: Override the tool's hard timeout for one specific resource. Both full ISO8601 durations and the time portion of ISO8601 durations are valid. Value must be between 1 second and 24 hours.
- _Example values_: 45s / 3m / 1h / PT0.25H
- _Compatibility_: all resource types (Note: `Deployment` timeouts are based on `spec.progressDeadlineSeconds` if present, and that field has a default value as of the `apps/v1beta1` group version. Using this annotation will have no effect on `Deployment`s that time out with "Timeout reason: ProgressDeadlineExceeded".)
- _Compatibility_: all resource types
- `kubernetes-deploy.shopify.io/required-rollout`: Modifies how much of the rollout needs to finish
before the deployment is considered successful.
- _Compatibility_: Deployment
Expand Down
6 changes: 5 additions & 1 deletion lib/kubernetes-deploy/kubernetes_resource/deployment.rb
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,9 @@ def failure_message
end

def timeout_message
reason_msg = if progress_condition.present?
reason_msg = if timeout_override
STANDARD_TIMEOUT_MESSAGE
elsif progress_condition.present?
"Timeout reason: #{progress_condition['reason']}"
else
"Timeout reason: hard deadline for #{type}"
Expand All @@ -78,6 +80,8 @@ def pretty_timeout_type

def deploy_timed_out?
return false if deploy_failed?
return super if timeout_override

# Do not use the hard timeout if progress deadline is set
progress_condition.present? ? deploy_failing_to_progress? : super
end
Expand Down
2 changes: 0 additions & 2 deletions test/fixtures/invalid/bad_probe.yml
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,6 @@ apiVersion: extensions/v1beta1 # keep as long as this group is valid
kind: Deployment
metadata:
name: bad-probe
annotations:
kubernetes-deploy.shopify.io/timeout-override: '25s'
spec:
replicas: 1
# progressDeadlineSeconds: 10 -- do not add: this is used to test the hard deadline
Expand Down
15 changes: 15 additions & 0 deletions test/integration/kubernetes_deploy_test.rb
Original file line number Diff line number Diff line change
Expand Up @@ -446,6 +446,21 @@ def test_deployment_with_progress_times_out_for_short_duration
])
end

def test_deployment_with_timeout_override
result = deploy_fixtures("long-running", subset: ['undying-deployment.yml.erb']) do |fixtures|
deployment = fixtures['undying-deployment.yml.erb']['Deployment'].first
deployment['spec']['progressDeadlineSeconds'] = 5
deployment["metadata"]["annotations"] = {
KubernetesDeploy::KubernetesResource::TIMEOUT_OVERRIDE_ANNOTATION => "10S",
}
container = deployment['spec']['template']['spec']['containers'].first
container['readinessProbe'] = { "exec" => { "command" => ['- ls'] } }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this patch? Doesn't this change mean the pod could actually come up successfully?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe so, since - ls will always fail and the container will remain unready. Is there another way to ensure a timeout that I should be using?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I misread this as being an array of commands one of which is ls. The bad-probe template uses ["test", "0", "-eq", "1"] but it doesn't really matter.

end
assert_deploy_failure(result, :timed_out)

assert_logs_match_all(KubernetesDeploy::KubernetesResource::STANDARD_TIMEOUT_MESSAGE.split("\n"))
end

def test_wait_false_ignores_non_priority_resource_failures
# web depends on configmap so will not succeed deployed alone
assert_deploy_success(deploy_fixtures("hello-cloud", subset: ["web.yml.erb"], wait: false))
Expand Down
16 changes: 16 additions & 0 deletions test/unit/kubernetes-deploy/kubernetes_resource/deployment_test.rb
Original file line number Diff line number Diff line change
Expand Up @@ -307,6 +307,22 @@ def test_deploy_timed_out_based_on_progress_deadline
end
end

def test_deploy_timed_out_based_on_timeout_override
Timecop.freeze do
template = build_deployment_template
template["metadata"]["annotations"][KubernetesDeploy::KubernetesResource::TIMEOUT_OVERRIDE_ANNOTATION] = "10S"
deploy = build_synced_deployment(
template: template,
replica_sets: [build_rs_template(status: { "replica" => 1 })]
)
refute deploy.deploy_timed_out?, "Deploy not started shouldn't have timed out"
deploy.deploy_started_at = Time.now.utc - 3.minutes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't necessarily show that the annotation took precedence over the default and the progress deadline. IMO we should add an expired progress condition, an assertion as to what the default is, and an assertion on each edge of the time that the annotation specifies to prove it is the one causing the timeout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added assertions on each edge and changed pretty_timeout_type to check for timeout override


assert_equal(KubernetesDeploy::KubernetesResource::STANDARD_TIMEOUT_MESSAGE + "\nLatest ReplicaSet: web-1",
deploy.timeout_message.strip)
end
end

def test_deploy_timed_out_based_on_progress_deadline_ignores_statuses_for_older_generations
Timecop.freeze do
deployment_status = {
Expand Down