Refactor/attempts number and logs #749

Merged
ruionweb merged 5 commits into master on Oct 5, 2020

ruionweb
Contributor

@ruionweb ruionweb commented Sep 11, 2020

What are you trying to accomplish with this PR?
This change allows some context deadline errors to be retried, addressing issue
https://github.com/Shopify/shipit/issues/1227
How is this accomplished?
We noticed that many of the context deadline errors came from cmd: get, so we've
updated the number of attempts for the following functions:

  • validate_context_exists_in_kubeconfig (cmd: config, get-context)
  • version_info (cmd: version)
  • validate_context_reachable (cmd: get, namespace)
  • validate_namespace_exists (cmd: get, namespace)
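
For readers unfamiliar with the pattern, here is a minimal sketch of what retrying a command for a fixed number of attempts can look like. It is not krane's implementation: `with_attempts`, `TransientError`, and the `base_delay` parameter are hypothetical names introduced only for illustration, and the block stands in for a kubectl call that may hit a context deadline.

```ruby
# Hypothetical sketch, not krane's actual code.
# TransientError stands in for a retryable failure such as a context deadline error.
class TransientError < StandardError; end

# Runs the block up to `attempts` times and returns the first successful result.
# Re-raises the last TransientError once all attempts are used up.
def with_attempts(attempts:, base_delay: 0.0)
  raise ArgumentError, "attempts must be >= 1" if attempts < 1

  (1..attempts).each do |attempt|
    begin
      return yield(attempt)
    rescue TransientError => e
      # Give up once the final attempt has failed.
      raise if attempt == attempts
      # Log the attempt number so retries are visible in the output.
      warn("attempt #{attempt}/#{attempts} failed: #{e.message}; retrying")
      # Linear backoff; with the default base_delay of 0.0 this does not wait.
      sleep(base_delay * attempt)
    end
  end
end

# Usage: the block fails on the first two attempts and succeeds on the third.
outcome = with_attempts(attempts: 3) do |attempt|
  raise TransientError, "context deadline exceeded" if attempt < 3
  "namespace found"
end
```

Only errors of the retryable class are caught, so any other failure still surfaces immediately instead of being retried.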

Before you deploy

  • I tophatted or tested this change.
  • This PR is safe to rollback.

@ruionweb ruionweb requested a review from a team as a code owner September 11, 2020 14:07
@ruionweb ruionweb requested a review from gigr September 11, 2020 14:09
@timothysmith0609
Contributor

I imagine a lot of the timeouts will occur while syncing resources waiting for convergence. I notice you haven't added retries there. Is that on purpose or an oversight?

@ruionweb
Contributor Author

ruionweb commented Sep 14, 2020

My line of thought was that both the TaskConfigValidator and kubectl version_info get called during the "Initializing deploy" phase of the deploy task:

TaskConfigValidator: https://github.com/Shopify/krane/blob/master/lib/krane/deploy_task.rb#L141

Version_info (via validate_definition): https://github.com/Shopify/krane/blob/master/lib/krane/deploy_task.rb#L143

To my understanding, it is the Resource Watcher that runs sync_resources, and no Resource Watchers are created for the Kubernetes resources until the "Predeploy priority resources" phase, which comes after the calls above.

https://github.com/Shopify/krane/blob/master/lib/krane/deploy_task.rb#L150

TaskConfigValidator is also called in the restart task and the run task during verify_config, which are commands separate from the Krane deploy.

Please let me know what you think! @timothysmith0609 🙂

@ruionweb ruionweb merged commit fc30f27 into master Oct 5, 2020
@timothysmith0609 timothysmith0609 temporarily deployed to rubygems October 6, 2020 17:34 Inactive

3 participants