Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

grpc-js: Add more info to deadline exceeded errors #2692

Merged

Conversation

murgatroid99
Copy link
Member

Currently, the details string in a client-side deadline exceeded error is just "Deadline exceeded". This change adds more information to that, to help narrow down the cause of those errors. Specifically, it shows a breakdown of any time taken before the request was actually sent to a specific server, along with the address of the server if it was sent. For example, an error string from a test with all information shown looks like this:

Deadline exceeded after 1.001s,name resolution: 0.000s,metadata filters: 0.002s,previous attempts: 0,time to current attempt start: 0.001s,LB pick: 0.006s,remote_addr=[::1]:45251

This is just a first pass to get some useful information into the error string, and to add the framework to add or change individual pieces of information there as necessary.

This should largely address #2687.

@@ -684,7 +684,7 @@ export class InternalChannel {
host: string,
credentials: CallCredentials,
deadline: Deadline
): Call {
): LoadBalancingCall | RetryingCall {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my knowledge, I often see either of these or ResolvingCall on my grpc traces. Curious what the difference between the 3 are and whether they indicate any specific pattern? RetryingCall is the one that's a bit confusing as we don't have gRPC retries enabled by default. Does this account for the transparent retries?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those classes do have tracers set up, specifically load_balancing_call, retrying_call, and resolving_call. You should see them if you have them enabled and you make any requests. All of those are part of the call stack that makes up the internal implementation of a call object. As you say, RetryingCall is used by default for transparent retries.

this.trace('Deadline: ' + deadlineToString(this.deadline));
const timeout = getRelativeTimeout(this.deadline);
if (timeout !== Infinity) {
this.trace('Deadline will be reached in ' + timeout + 'ms');
const handleDeadline = () => {
this.cancelWithStatus(Status.DEADLINE_EXCEEDED, 'Deadline exceeded');
if (!this.deadlineStartTime) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is okay but we should never reach here right in lieu of line 106?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true but TypeScript doesn't see a guarantee that the value doesn't change before the timer triggers, so it's easier to just have the check for type safety.

@pratik151192
Copy link

Hey just checking on this PR where a check is failing for it to be merged

@murgatroid99
Copy link
Member Author

That check isn't the problem. It's just waiting on a review. That should happen soon.

@murgatroid99 murgatroid99 merged commit cc44d78 into grpc:@grpc/grpc-js@1.10.x Apr 1, 2024
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants