Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix connectivity state transitions when dialing #1596

Merged
merged 7 commits into from
Oct 23, 2017

Conversation

menghanl
Copy link
Contributor

@menghanl menghanl commented Oct 19, 2017

The connectivity for addrConn goes from Connecting to TransientFailure and back to Connecting when dialing. This causes the first RPC to fail if it's not wait-for-ready. This PR makes addrConn skip the first TransientFailure if it's dialing.

@menghanl menghanl requested a review from dfawley October 19, 2017 22:26
@tamird
Copy link
Contributor

tamird commented Oct 19, 2017

@menghanl thanks for making this change. Do you think it deserves a test?

@menghanl
Copy link
Contributor Author

I removed the FailFast(false) calloption from pickfirst and roundrobin tests.
This will cover the case that first failfast RPC after non-blocking dial doesn't fail.

call_test.go Outdated
@@ -285,7 +285,7 @@ func TestInvokeCancelClosedNonFailFast(t *testing.T) {
req := "hello"
ctx, cancel := context.WithCancel(context.Background())
cancel()
if err := Invoke(ctx, "/foo/bar", &req, &reply, cc, FailFast(false)); err == nil {
if err := Invoke(ctx, "/foo/bar", &req, &reply, cc); err == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test is called TestInvokeCancelClosedNonFailFast - did you mean to make this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Reverted.

clientconn.go Outdated
@@ -904,15 +890,17 @@ func (ac *addrConn) errorf(format string, a ...interface{}) {

// resetTransport recreates a transport to the address for ac. The old
// transport will close itself on error or when the clientconn is closed.
//
// The connectivity state of ac will be set to TransientFailure if it's not
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this accurate? It seems that ac.state will be set to connectivity.Connecting unconditionally and then to connectivity. TransientFailure on every attempt; I can't tell which part of this is sensitive to "if it's not first time doing resetTransport."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The state is set by transportMonitor to TransientFailure before this function is called.
I removed this comment and add a comment in transportMonitor.

@menghanl menghanl merged commit b3ed81a into grpc:master Oct 23, 2017
@menghanl menghanl deleted the dial_connectivity_state branch October 23, 2017 21:06
@tamird
Copy link
Contributor

tamird commented Oct 23, 2017

Can you guys cut 1.7.1 with this?

@menghanl
Copy link
Contributor Author

@tamird Will do it this week.

menghanl added a commit to menghanl/grpc-go that referenced this pull request Oct 25, 2017
menghanl added a commit to menghanl/grpc-go that referenced this pull request Oct 25, 2017
menghanl added a commit to menghanl/grpc-go that referenced this pull request Oct 25, 2017
This commit cherry-picked the connectivity state change from b3ed81a,
excluding the connect() function cleanup.
@menghanl
Copy link
Contributor Author

@tamird Just cut 1.7.1.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants