Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ruby] Make GRPC::DeadlineExceeded report properly #33565

Closed
wants to merge 2 commits into from

Conversation

stanhu
Copy link
Contributor

@stanhu stanhu commented Jun 28, 2023

In grpc v1.46.2 and later versions, #29155 caused the Ruby client to return GRPC::Core::CallError, a non-standard error, with a message: grpc_call_start_batch failed with outstanding read or write present (code=8). However, the actual error should have been GRPC::DeadlineExceeded. This occurred because when GRPC::DeadlineExceeded is raised, the @metadata_received flag doesn't get flipped. When receive_and_check_status runs to determine
the error, the RECV_INITIAL_METADATA is erroneously sent again.

To avoid this, flip the flag in an ensure block whenever the RECV_INITIAL_METADATA op is set.

Closes #33283

@stanhu stanhu changed the title ruby: make GRPC::DeadlineExceeded report properly ruby: Make GRPC::DeadlineExceeded report properly Jun 28, 2023
@stanhu stanhu force-pushed the sh-ruby-fix-issue-33283 branch 2 times, most recently from 434ce0f to f5a9ee4 Compare June 28, 2023 07:59
@stanhu stanhu changed the title ruby: Make GRPC::DeadlineExceeded report properly [ruby] Make GRPC::DeadlineExceeded report properly Jun 28, 2023
@stanhu
Copy link
Contributor Author

stanhu commented Jul 4, 2023

@apolcyn What do you think of this fix?

@alto-ruby alto-ruby requested a review from apolcyn July 4, 2023 17:46
@alto-ruby alto-ruby added the release notes: no Indicates if PR should not be in release notes label Jul 4, 2023
ensure
# Ensure we don't attempt to request the initial metadata again
# in case an exception occurs.
@metadata_received = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This becomes the opposite that of the original report that if run_batch throws before sent RECV_INITIAL_METADATA, we are setting metadata_received anyways.

Ideally we need to c-core to propagate the error from completion queue so here we know whether to metadata_received or not.

Given that's a fairly large change in c-core, I incline to accept this to unblock the customer as I mentioned in the other comment in the issue, missing a RECV_INITIAL_METADATA doesn't really cause problems (for now at least).

@apolcyn Do you have any other concerns?

@faustind
Copy link

🙌🏾

Thanks to all who are working to resolve #33283.

With increased traffic between our gRPC services, the frequency of this non-standard error being reported has also increased.

This PR was approved by @alto-ruby approximately 2 months ago. However, it seems there has been no further activity since then.

We look forward to this PR being merged.

@stanhu
Copy link
Contributor Author

stanhu commented Nov 17, 2023

I'm seeing a conflict with #34016, which seems like it squelches the error? I'm not sure that's what we want; we want to see DeadlineExceeded.

@xlgmokha
Copy link

@alto-ruby is there anything that we can do about the concern raised here. I'd love to see this patch released.

@alto-ruby
Copy link
Contributor

@apolcyn Do you have any concerns to tis PR?

@alto-ruby
Copy link
Contributor

@apolcyn
Hi Alex, happy new year!
Could we get your approval of this PR?
I'll wait one more week to see if you have any concerns, if not, I'll go ahead and merge it. Sounds OK?

@HannahShiSFB
Copy link
Collaborator

@sampajano Hi Eryu, could you help me check the "import/copybara Pending" status? Thanks a lot!

@sampajano
Copy link
Contributor

@HannahShiSFB Thanks for checking! I've approved the internal merge request and will find one more reviewer to approve too (2 is required)!

@sampajano
Copy link
Contributor

@alto-ruby @HannahShiSFB There's some test failures. Can you work on resolving them? Thanks! :)

@stanhu
Copy link
Contributor Author

stanhu commented Jan 22, 2024

The test failures look unrelated to this change. I rebased master.

@sampajano
Copy link
Contributor

@stanhu Thanks for checking and rebasing! Running the tests again now! 😃

@stanhu
Copy link
Contributor Author

stanhu commented Jan 23, 2024

Test failures look unrelated again. 🤷

@sampajano
Copy link
Contributor

ok.. that's unfortunate.. can you maybe rebase again? 😅

In grpc v1.46.2 and later versions, grpc#29155 caused the Ruby client to
return `GRPC::Core::CallError`, a non-standard error, with a message:
`grpc_call_start_batch failed with outstanding read or write present
(code=8)`. However, the actual error should have been
`GRPC::DeadlineExceeded`. This occurred because when
`GRPC::DeadlineExceeded` is raised, the `@metadata_received` flag
doesn't get flipped. When `receive_and_check_status` runs to determine
the error, the `RECV_INITIAL_METADATA` is erroneously sent again.

To avoid this, flip the flag in an `ensure` block whenever the
`RECV_INITIAL_METADATA` op is set.

Closes grpc#33283
@stanhu
Copy link
Contributor Author

stanhu commented Jan 23, 2024

@sampajano Rebased again. Can you launch a CI run?

@sampajano
Copy link
Contributor

Thank you! Done! Fingers crossed.. 😄

@stanhu
Copy link
Contributor Author

stanhu commented Jan 24, 2024

So close! Everything passed except for https://source.cloud.google.com/results/invocations/eeb9b8bd-12a5-4887-a504-47b85ac55c06/targets, but this seems like a flaky test.

@sampajano
Copy link
Contributor

Looks like it! Lemme run it again! 😄

@stanhu
Copy link
Contributor Author

stanhu commented Mar 6, 2024

@sampajano Could we get this merged?

@sampajano
Copy link
Contributor

@stanhu I'm sorry for the delays.. But it seems like there's another unrelated test failure.. 😅

I'm checking with the team how to get it merged in spite of the error..

But in the meantime, do you mind rebasing yet again and i'll run the test again..?

Apologize again for the trouble..

@copybara-service copybara-service bot closed this in 489d24c Mar 9, 2024
@sampajano
Copy link
Contributor

@stanhu I manually re-imported the PR and it worked!! Thanks so much for your contribution here again! And sorry for the delays! 😃

@stanhu
Copy link
Contributor Author

stanhu commented Mar 9, 2024

@sampajano Thank you!

@sampajano
Copy link
Contributor

@stanhu you're most welcome! Thank you again for your contribution! 😊

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kokoro:run lang/ruby release notes: no Indicates if PR should not be in release notes
Projects
None yet
8 participants