backup-stream: don't close the server stream when encountered errors #14432
Signed-off-by: hillium <yujuncen@pingcap.com>
96d5198 to 5b630c6
The only changes that affect non-test code are in commit 18867c4.
match send_all.await {
Err(grpcio::Error::RemoteStopped) => {
if let Err(err) = send_all.await {
let can_retry = matches!(&err, grpcio::Error::RpcFailure(rpc_err) if rpc_err.code() == RpcStatusCode::UNAVAILABLE);
Why can this error be retried?
By default, grpc-go would retry it. (Should we also retry RESOURCE_EXHAUSTED?)
https://pkg.go.dev/github.com/grpc-ecosystem/go-grpc-middleware/retry#pkg-variables
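For readers following this thread, here is a minimal, self-contained sketch of the classification being discussed: treating only some status codes as retriable. `StatusCode`, `SendError`, and `can_retry` are hypothetical stand-ins written for illustration; they are not the `grpcio` types, and this is not the code in the PR.

```rust
/// Hypothetical stand-in for a gRPC status code (not the grpcio type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StatusCode {
    Unavailable,
    ResourceExhausted,
    Internal,
}

/// Hypothetical stand-in for the error returned when sending on a stream.
#[derive(Debug)]
pub enum SendError {
    /// The RPC failed with a status code.
    RpcFailure(StatusCode),
    /// The remote side stopped the call.
    RemoteStopped,
}

/// Returns true only when the error is an RPC failure whose status code
/// appears in the caller-supplied list of retriable codes.
pub fn can_retry(err: &SendError, retriable: &[StatusCode]) -> bool {
    matches!(err, SendError::RpcFailure(code) if retriable.contains(code))
}
```

Passing the retriable codes as a slice makes the open question in this thread (whether RESOURCE_EXHAUSTED should be retried too) a matter of which codes the caller lists, rather than a change to the matching logic.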
It also seems the gRPC implementation performs transparent retries internally, so perhaps we can remove this? (That applies to clients.) What do you think, @BusyJay?
I have no idea. I'm just curious why only this error can be retried.
Well, I have removed the retry, because the client will retry soon after the connection is closed.
.await
.report_if_err(format_args!("during removing subscription {}", id))
// The stream is an endless stream -- we don't need to close it.
drop(sub);
If we hit a non-gRPC error here, should we close it manually?
According to @BusyJay, we should call close iff we have finished sending all items. This stream is in fact infinite, so I think we don't need to close it under any conditions (except perhaps when the server is shutting down?).
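The rule stated above (call close only after every item has been sent, and leave the stream open on error) can be sketched as follows. `RecordingSink` and `forward_all` are hypothetical names used only for illustration; this is a synchronous toy model, not the async `grpcio` sink used in the PR.

```rust
/// Hypothetical sink that records what was sent and whether `close` was called.
#[derive(Debug, Default)]
pub struct RecordingSink {
    pub sent: Vec<u64>,
    pub closed: bool,
    /// When set, sending this item fails, to simulate a send error.
    pub fail_on: Option<u64>,
}

impl RecordingSink {
    pub fn send(&mut self, item: u64) -> Result<(), String> {
        if self.fail_on == Some(item) {
            return Err(format!("failed to send item {}", item));
        }
        self.sent.push(item);
        Ok(())
    }

    pub fn close(&mut self) {
        self.closed = true;
    }
}

/// Forwards every item and closes the sink only after all items were sent.
/// On error the function returns early and the sink is left open.
pub fn forward_all(
    sink: &mut RecordingSink,
    items: impl IntoIterator<Item = u64>,
) -> Result<(), String> {
    for item in items {
        sink.send(item)?;
    }
    sink.close();
    Ok(())
}
```

With an endless source of items the loop never finishes normally, so `close` is never reached. In this toy model, that corresponds to the argument above that an infinite stream does not need an explicit close.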
The change looks OK to me.
12ffb05 to ed6373d
ed6373d to 241641f
@BusyJay Hi, could you help /merge this?
/merge
This pull request has been accepted and is ready to merge. Commit hash: 241641f
/test
thread 'test::resolved_follower' panicked at 'not all keys are recorded: it remains ["7480000000000000ff015f728000000000ff0000ec0000000000faffffffffffffff0b", "7480000000000000ff015f728000000000ff0000020000000000fafffffffffffffff5", "7480000000000000ff015f728000000000ff00005a0000000000faffffffffffffff9d"] (total = 128)', components/backup-stream/tests/mod.rs:583:13
We need to improve the stability of integration tests...
/test
In response to a cherrypick label: new pull request created to branch
/run-cherry-picker
close tikv#14426 Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
…ikv#14432) close tikv#14426 Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io> Signed-off-by: hillium <yujuncen@pingcap.com>
What is changed and how it works?
Issue Number: Close #14426, Close #14910
What's Changed:
This PR stops us from closing the gRPC server stream when certain errors are encountered.
Check List
Tests
A tricky unit test has been added. The tests pass, but I'm not sure whether they should be kept in the code.
Release note
This fixes a bug in the master branch.