-
Notifications
You must be signed in to change notification settings - Fork 38.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added tracing for long running requests. #69756
Conversation
do we need a release note? |
@neolit123 Historically trace PRs such as #44667 have not needed release notes. This change should only be adding logging to help debug long running requests. Is there a particular part of the change you feel we need to warn customers about? |
@cheftako ok, sounds good. |
/test pull-kubernetes-integration |
/assign @wojtek-t |
transformResponseObject(ctx, scope, req, w, status, result) | ||
} | ||
} | ||
|
||
// DeleteCollection returns a function that will handle a collection deletion | ||
func DeleteCollection(r rest.CollectionDeleter, checkBody bool, scope RequestScope, admit admission.Interface) http.HandlerFunc { | ||
return func(w http.ResponseWriter, req *http.Request) { | ||
trace := utiltrace.New("Delete " + req.URL.Path) | ||
defer trace.LogIfLong(1000 * time.Millisecond) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency with all other requests *create/update", maybe 500ms (we use 500ms there)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick question. On the other collection operations we do 500ms but we start them later in the operation. https://github.com/kubernetes/apiserver/blob/master/pkg/endpoints/handlers/get.go#L260 for example skips any watch latency.
I can certainly change to 500ms but I am wondering if that may trigger to often on collection requests which are expected to take longer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - get is special, so let's ignore it.
But for both create:
https://github.com/kubernetes/apiserver/blob/master/pkg/endpoints/handlers/create.go#L47
and update:
https://github.com/kubernetes/apiserver/blob/master/pkg/endpoints/handlers/update.go#L49
we start at exactly the same place and use 500ms.
So I think we should be consistent.
mediaType, _, err := negotiation.NegotiateOutputMediaType(req, scope.Serializer, &scope) | ||
if err != nil { | ||
status := responsewriters.ErrorToAPIStatus(err) | ||
trace.Step("Writing raw JSON response") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this one won't provide much value - negotiation is really fast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Granted but I suspect that some of the time the problem is actually writing the response to the requester. As such I would like a tight bracket around the "responsewriters.WriteRawJSON(int(status.Code), status, w)" call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - writing response is many times dominating factor - I completely agree that.
I completely wasn't clear what I wanted to say - sorry for that.
My point was that - the time between "negotiating output media type" and this step should be negligible.
So given that your point is completely correct, I think what I wanted to say is to:
- leave this one (to see whether we spent time here)
- but remove the "negotiating output media type" step (since different between this one and that should be negligible).
Does it make sense now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
_, info, err := negotiation.NegotiateOutputMediaType(req, metainternalversion.Codecs, &scope) | ||
if err != nil { | ||
scope.err(err, w, req) | ||
return | ||
} | ||
encoder := metainternalversion.Codecs.EncoderForVersion(info.Serializer, metav1beta1.SchemeGroupVersion) | ||
trace.Step(fmt.Sprintf("Serializing response as type %s", info.MediaType)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm 99,9% sure that all the time will spent after this.
So TBH, I would skip the previous steps (as tracing itself has some overhead) and leave just this one...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again trying to get a tight bracket around the "responsewriters.SerializeObject(info.MediaType, encoder, w, req, statusCode, partial)" call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same point here (sorry for being completely unclear on that).
I completely agree that this Step has value.
I just suggest removing those previous two ones (that should be negligible):
- trace.Step("Getting object kind, setting GVK")
- trace.Step("Renegotiating internal version")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -95,12 +102,14 @@ func transformResponseObject(ctx context.Context, scope RequestScope, req *http. | |||
} | |||
|
|||
// renegotiate under the internal version | |||
trace.Step("Renegotiating internal version") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to drop this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is now inline with the previous comments - so please drop this and the one mentioned below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -150,25 +162,30 @@ func transformResponseObject(ctx context.Context, scope RequestScope, req *http. | |||
} | |||
|
|||
// renegotiate under the internal version | |||
trace.Step("Renegotiating internal version") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
responsewriters.SerializeObject(info.MediaType, encoder, w, req, statusCode, table) | ||
return | ||
|
||
default: | ||
// this block should only be hit if scope AllowsConversion is incorrect | ||
trace.Step("Renegotiating internal version") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
/lgtm |
@logicalhan: changing LGTM is restricted to assignees, and only kubernetes/kubernetes repo collaborators may be assigned issues. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for being extremely unclear with what I meant - I hope now it's more clear what i wanted to write.
transformResponseObject(ctx, scope, req, w, status, result) | ||
} | ||
} | ||
|
||
// DeleteCollection returns a function that will handle a collection deletion | ||
func DeleteCollection(r rest.CollectionDeleter, checkBody bool, scope RequestScope, admit admission.Interface) http.HandlerFunc { | ||
return func(w http.ResponseWriter, req *http.Request) { | ||
trace := utiltrace.New("Delete " + req.URL.Path) | ||
defer trace.LogIfLong(1000 * time.Millisecond) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - get is special, so let's ignore it.
But for both create:
https://github.com/kubernetes/apiserver/blob/master/pkg/endpoints/handlers/create.go#L47
and update:
https://github.com/kubernetes/apiserver/blob/master/pkg/endpoints/handlers/update.go#L49
we start at exactly the same place and use 500ms.
So I think we should be consistent.
mediaType, _, err := negotiation.NegotiateOutputMediaType(req, scope.Serializer, &scope) | ||
if err != nil { | ||
status := responsewriters.ErrorToAPIStatus(err) | ||
trace.Step("Writing raw JSON response") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - writing response is many times dominating factor - I completely agree that.
I completely wasn't clear what I wanted to say - sorry for that.
My point was that - the time between "negotiating output media type" and this step should be negligible.
So given that your point is completely correct, I think what I wanted to say is to:
- leave this one (to see whether we spent time here)
- but remove the "negotiating output media type" step (since different between this one and that should be negligible).
Does it make sense now?
_, info, err := negotiation.NegotiateOutputMediaType(req, metainternalversion.Codecs, &scope) | ||
if err != nil { | ||
scope.err(err, w, req) | ||
return | ||
} | ||
encoder := metainternalversion.Codecs.EncoderForVersion(info.Serializer, metav1beta1.SchemeGroupVersion) | ||
trace.Step(fmt.Sprintf("Serializing response as type %s", info.MediaType)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same point here (sorry for being completely unclear on that).
I completely agree that this Step has value.
I just suggest removing those previous two ones (that should be negligible):
- trace.Step("Getting object kind, setting GVK")
- trace.Step("Renegotiating internal version")
@@ -95,12 +102,14 @@ func transformResponseObject(ctx context.Context, scope RequestScope, req *http. | |||
} | |||
|
|||
// renegotiate under the internal version | |||
trace.Step("Renegotiating internal version") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is now inline with the previous comments - so please drop this and the one mentioned below.
Added tracing for use cases where etcd is not the cause of long running requests. Fixed spelling. Factored in Wojtek-t feedback.
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cheftako, wojtek-t The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What this PR does / why we need it:
Adds tracing for use apiserver requests (get etc) where etcd is not the cause of the long running
requests.
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #69748
Special notes for your reviewer:
Release note: