Add more info to the ListAndWatch trace #105819
/assign @caesarxuchao
@@ -319,6 +319,7 @@ func (r *Reflector) ListAndWatch(stopCh <-chan struct{}) error {
        panic(r)
    case <-listCh:
    }
    initTrace.Step("Objects listed", trace.Field{"error", err})
    if err != nil {
        return fmt.Errorf("failed to list %v: %v", r.expectedTypeName, err)
In addition to adding the trace, should we also make sure the caller of ListAndWatch logs the err? The trace is only printed if the function takes more than 10s: https://github.com/kubernetes/kubernetes/blob/8c8ec9071c35a462b90f545fd39aaa92aa80b9aa/staging/src/k8s.io/client-go/tools/cache/reflector.go#L262
Done.
8c8ec90 to 093aa21
/lgtm
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: caesarxuchao, tosi3k
What type of PR is this?
/kind feature
What this PR does / why we need it:
During scalability tests we sometimes run into issues where the kube-apiserver reports that it responded quickly to the list-and-watch client, but the client times out while waiting for the list response, resulting in a trace like this:
This change improves debuggability by recording the list error in the trace, so the actual culprit is visible from the client's point of view.
Which issue(s) this PR fixes:
No issue has been opened for this.
Special notes for your reviewer:
/sig scalability
/assign @mborsz
Does this PR introduce a user-facing change?