[ws-manager] log errors as warnings during exponential backoff #11216

Merged
merged 1 commit into from
Jul 8, 2022


kylos101
Contributor

@kylos101 kylos101 commented Jul 7, 2022

Description

Log errors encountered during backoff as warnings. If we still have an error after backoff, log it as an error.

The errors we encounter while doing exponential backoff should be treated as warnings, because they don't necessarily require action. This way, the error at the end (signifying we could not start a workspace) becomes more valuable.

For example, during a traffic shift, these showed up as errors, but they were false alarms (we were waiting for nodes to scale up).
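The logging policy described above can be sketched in Go roughly as follows. This is a minimal illustration, not the ws-manager code: `startWithBackoff`, `logEntry`, and `errDeadline` are hypothetical names, log lines are collected in a slice instead of being sent to a real logger, and the delay simply doubles after each failed attempt.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errDeadline is a hypothetical stand-in for the error seen in the logs.
var errDeadline = errors.New("context deadline exceeded")

// logEntry records one emitted log line so the example can be inspected.
type logEntry struct {
	Level   string
	Message string
	Err     error
}

// startWithBackoff (hypothetical) calls try up to attempts times, doubling the
// delay after each failure. Failures during backoff are recorded as warnings;
// only an error that remains after the last attempt is recorded as an error.
func startWithBackoff(attempts int, initial time.Duration, try func() error) ([]logEntry, error) {
	var logs []logEntry
	var err error
	delay := initial
	for i := 0; i < attempts; i++ {
		if err = try(); err == nil {
			return logs, nil
		}
		// The failure may be transient (for example, nodes still scaling up),
		// so it does not necessarily require action: log a warning.
		logs = append(logs, logEntry{Level: "warning", Message: "was unable to start workspace", Err: err})
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	// If we still have an error afterwards, log it as an error.
	if err != nil {
		logs = append(logs, logEntry{Level: "error", Message: "was unable to start workspace after backoff", Err: err})
	}
	return logs, err
}

func main() {
	logs, err := startWithBackoff(3, time.Millisecond, func() error { return errDeadline })
	for _, l := range logs {
		fmt.Printf("level=%s msg=%q error=%v\n", l.Level, l.Message, l.Err)
	}
	fmt.Println("final error:", err)
}
```

With a `try` function that always fails, the sketch records one warning per attempt and a single error at the end, so alerting on the error level only fires when the workspace really could not be started.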

Related Issue(s)

Fixes # n/a

How to test

  1. Go to the preview environment
  2. Before starting a workspace in the preview environment, run kubectl cordon on the node (this is already done) so that a workspace pod cannot be scheduled
  3. Try starting a workspace
  4. Run kubectl logs deployment/ws-manager and check that warnings appear during backoff, that the error appears at the end (after backoff), and that the pod we log is a string representation of the pod object rather than a byte array.
# errors shown as warnings during backoff
{"@type":"type.googleapis.com/google.devtools.clouderrorreporting.v1beta1.ReportedErrorEvent","error":"context deadline exceeded","instanceId":"28f4512d-0357-4594-8ef8-3988fb50942c","level":"warning","message":"was unable to start workspace","pod":"{\"metadata\":{\"annotations\":...

# the final error, indicating the workspace could not be started even after backoff
{"@type":"type.googleapis.com/google.devtools.clouderrorreporting.v1beta1.ReportedErrorEvent","error":"context deadline exceeded","instanceId":"28f4512d-0357-4594-8ef8-3988fb50942c","level":"error","message":"was unable to start workspace after backoff","pod.Name":"ws-28f4512d-0357-4594-8ef8-3988fb50942c","pod.Namespace":"default",
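The difference between logging the pod as a byte array and as a string can be illustrated with a small Go sketch. It assumes the log fields are serialized with `encoding/json`, which encodes a `[]byte` value as a base64 string; the `pod` struct, `marshalPod`, and `logLine` are hypothetical and only stand in for the real pod object and logger.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pod is a hypothetical, much reduced stand-in for the Kubernetes pod object.
type pod struct {
	Name      string `json:"name"`
	Namespace string `json:"namespace"`
}

// marshalPod returns the JSON encoding of p as raw bytes.
func marshalPod(p pod) []byte {
	raw, err := json.Marshal(p)
	if err != nil {
		return nil
	}
	return raw
}

// logLine (hypothetical) renders a JSON log line whose "pod" field holds
// whatever value is passed in.
func logLine(podField interface{}) string {
	out, err := json.Marshal(map[string]interface{}{
		"level": "warning",
		"pod":   podField,
	})
	if err != nil {
		return ""
	}
	return string(out)
}

func main() {
	raw := marshalPod(pod{Name: "ws-example", Namespace: "default"})

	// Passing the []byte directly: encoding/json emits it base64-encoded,
	// which is not readable in the log output.
	fmt.Println(logLine(raw))

	// Converting to string first: the pod appears as escaped JSON text,
	// similar in shape to the "pod" field in the warning line above.
	fmt.Println(logLine(string(raw)))
}
```

Converting the marshalled bytes to a string before attaching them to the log entry keeps the pod readable when inspecting the output of kubectl logs.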

Release Notes

NONE

Documentation

Werft options:

  • /werft with-preview

@werft-gitpod-dev-com

started the job as gitpod-build-kylos101-backofflogging.1 because the annotations in the pull request description changed
(with .werft/ from main)

If we still have an error afterwards, then log it
@roboquat roboquat added size/S and removed size/XS labels Jul 7, 2022
@kylos101 kylos101 marked this pull request as ready for review July 8, 2022 14:05
@kylos101 kylos101 requested a review from a team July 8, 2022 14:05
@github-actions github-actions bot added the team: workspace Issue belongs to the Workspace team label Jul 8, 2022
@roboquat roboquat merged commit d7bb7b9 into main Jul 8, 2022
@roboquat roboquat deleted the kylos101/backofflogging branch July 8, 2022 16:59
@roboquat roboquat added deployed: workspace Workspace team change is running in production deployed Change is completely running in production labels Jul 12, 2022
Labels
deployed: workspace Workspace team change is running in production deployed Change is completely running in production release-note-none size/S team: workspace Issue belongs to the Workspace team

3 participants