
clientv3: respect up/down notifications from grpc #5845

Merged: 4 commits from clientv3-ignore-dead-eps into etcd-io:master on Aug 16, 2016

Conversation

@heyitsanthony (Contributor):

Partial patch; will need to revendor grpc once the fix on that end is merged.

Fixes #5842

numGets uint32
// mu protects upEps, downEps, and numGets
Contributor:

Maybe put mu sync.Mutex directly above upEps and numGets?

And what is downEps? I don't see it among the struct fields.
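
For illustration, a minimal sketch of the field ordering being suggested here (the mutex placed directly above the fields it guards); the names mu, upEps, numGets, and eps come from the diff context, while the struct name and everything else are assumptions:

package balancer

import "sync"

// simpleBalancer is a hypothetical, cut-down version of the struct under
// review, shown only to illustrate the mutex-above-guarded-fields convention.
type simpleBalancer struct {
	// eps holds the configured endpoints; set once at construction.
	eps []string

	// mu protects upEps (numGets is updated atomically in Get).
	mu      sync.Mutex
	upEps   map[string]struct{} // endpoints grpc has reported as up
	numGets uint32              // round-robin counter used by Get
}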

Contributor Author:

The comment was outdated; fixed.

@heyitsanthony (Contributor Author):

Fixed up to use the balancer for this functionality, since the WithBlock patch was rejected. PTAL /cc @xiang90

v := atomic.AddUint32(&b.numGets, 1)
ep := b.eps[v%uint32(len(b.eps))]
return grpc.Address{Addr: getHost(ep)}, func() {}, nil
b.mu.Lock()
Contributor:

It seems that we need to do more work here?

See the comments on Get at https://godoc.org/google.golang.org/grpc#Balancer

Also: https://github.com/grpc/grpc-go/blob/master/balancer.go#L272-L364

Contributor Author:

I think most of that complication comes from making it as general as possible. It's safe to assume FailFast is false, so there's no need to implement the suggested convoluted blocking logic.
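
To make that concrete, here is a rough sketch of what Get can look like under that assumption, using the old grpc.Balancer API this PR targets (imports assumed: context, sync/atomic, google.golang.org/grpc): block until some connection has come up, then round-robin over the configured endpoints as in the diff above. It builds on the struct sketched earlier plus an assumed readyc channel closed on the first up notification; it is not the PR's final implementation.

// Get blocks until the balancer has seen at least one up notification, then
// hands grpc a round-robin pick of the configured endpoints. With FailFast
// disabled, grpc is willing to wait here instead of failing the RPC outright.
func (b *simpleBalancer) Get(ctx context.Context, opts grpc.BalancerGetOptions) (grpc.Address, func(), error) {
	select {
	case <-b.readyc: // assumed field: closed once the first connection is up
	case <-ctx.Done():
		return grpc.Address{}, nil, ctx.Err()
	}
	v := atomic.AddUint32(&b.numGets, 1)
	ep := b.eps[v%uint32(len(b.eps))]
	return grpc.Address{Addr: ep}, func() {}, nil
}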

Contributor:

Do we expose the FailFast option to the user now?

Contributor Author:

No.

Contributor:

OK, good.

@heyitsanthony force-pushed the clientv3-ignore-dead-eps branch 5 times, most recently from 8755254 to 8b53cea, on August 2, 2016 04:34
defer b.mu.Unlock()

if b.pinAddr != nil {
	if _, ok := b.upEps[b.pinAddr.Addr]; ok || time.Since(b.pinTime) < b.pinWait {
Contributor:

I am not very clear on this. Why do we need a pinWait? If a pinned address is still up, shouldn't we use it until it fails? The current pinWait is 500ms, so a new RPC coming in after 500ms will now choose a new endpoint?

Contributor Author:

OK, this code is bad; the dial timeout is already handled in the grpc dialer, and there should only be one connecting/up endpoint at a time. I can simplify it.
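
A sketch of that simplification, under the assumption that the balancer just pins the first address grpc reports up and keeps it until the corresponding down callback fires, with no pinWait timer at all. pinAddr is shown as a plain string for brevity (the diff stores an address value), and the remaining names follow the earlier sketch:

// Up is called by grpc when a connection to addr becomes usable; the returned
// function is invoked by grpc when that connection goes down.
func (b *simpleBalancer) Up(addr grpc.Address) func(error) {
	b.mu.Lock()
	defer b.mu.Unlock()

	b.upEps[addr.Addr] = struct{}{}
	if b.pinAddr == "" {
		// nothing pinned yet; stick with this endpoint until it fails
		b.pinAddr = addr.Addr
	}

	return func(err error) {
		b.mu.Lock()
		defer b.mu.Unlock()
		delete(b.upEps, addr.Addr)
		if b.pinAddr == addr.Addr {
			// the pinned endpoint went down; the next Get picks a new one
			b.pinAddr = ""
		}
	}
}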

@heyitsanthony force-pushed the clientv3-ignore-dead-eps branch 3 times, most recently from c78bd3f to 83e7ddb, on August 4, 2016 06:26
@heyitsanthony (Contributor Author):

@xiang90 all fixed, PTAL.

b.mu.Unlock()
// notify client that a connection is up
select {
case b.upc <- struct{}{}:
Contributor:

It seems like we only need to send this once, when we create the client? Should we rename it to ready and wrap it with sync.Once?
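
A minimal sketch of that suggestion, with readyc and readyOnce as illustrative names: close a channel exactly once when the first connection comes up, instead of sending on upc for every up notification.

package balancer

import "sync"

// readyNotifier signals client readiness exactly once by closing a channel.
type readyNotifier struct {
	readyOnce sync.Once
	readyc    chan struct{} // closed after the first connection comes up
}

func newReadyNotifier() *readyNotifier {
	return &readyNotifier{readyc: make(chan struct{})}
}

// connectionUp is called from the balancer's Up callback; only the first call
// has any effect.
func (n *readyNotifier) connectionUp() {
	n.readyOnce.Do(func() { close(n.readyc) })
}

// ready returns a channel that waiters can block on with <-n.ready().
func (n *readyNotifier) ready() <-chan struct{} { return n.readyc }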

Contributor Author:

OK.

@xiang90 (Contributor) commented Aug 4, 2016:

LGTM

@heyitsanthony (Contributor Author):

There may be a subtle bug in this if connectingAddr's host is down but other endpoints are available; I'll see if I can trigger it with a test case.

@heyitsanthony (Contributor Author):

Added a test for the failover case. Now blocked on grpc/grpc-go#810
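
For intuition, the failover scenario looks roughly like this from the client's side; the addresses are placeholders, the second endpoint is assumed unreachable, the import path is the one etcd used at the time, and the PR's actual test is more involved:

package main

import (
	"context"
	"log"
	"time"

	"github.com/coreos/etcd/clientv3"
)

func main() {
	// With the balancer fix, a dead endpoint in the list should no longer
	// prevent the client from connecting through a live one.
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:2379", "127.0.0.1:22379"}, // second assumed down
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()
	if _, err := cli.Get(ctx, "foo"); err != nil {
		log.Fatal(err)
	}
}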

@heyitsanthony force-pushed the clientv3-ignore-dead-eps branch 2 times, most recently from 7de19f7 to 11f2b99, on August 16, 2016 16:49
return func(rpcCtx context.Context, f rpcFunc) {
	for {
		err := f(rpcCtx)
		// ignore grpc conn closing on fail-fast calls; they are transient errors
Contributor:

Is there a dial-failure error? Can connClosing happen after writing the request?

Contributor Author:

There is no explicit dial-failure error; the closest thing is the helpfully unexported errConnClosing (which gets grpc.Errorf()'d into a grpc-formatted error). Transport errors seem to either go through ConnectionErrorf or be prefixed with "transport:". I guess the safest policy is to retry only on isConnClosing(err) and bail out otherwise?
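
A sketch of that policy, shaped like the wrapper excerpted above: keep retrying only while the failure looks like grpc's internal connection-closing error and surface everything else. isConnClosing is an assumed helper that matches the unexported errConnClosing text, and the wrapper returns an error here for clarity even though the excerpt above does not.

package retry

import "context"

// rpcFunc is a single attempt of an RPC against the currently pinned endpoint.
type rpcFunc func(ctx context.Context) error

// newRetryWrapper returns a wrapper that retries f only on "connection is
// closing" failures, which are transient while the balancer repins.
func newRetryWrapper(isConnClosing func(error) bool) func(context.Context, rpcFunc) error {
	return func(rpcCtx context.Context, f rpcFunc) error {
		for {
			err := f(rpcCtx)
			if err == nil {
				return nil
			}
			if !isConnClosing(err) {
				// not the transient conn-closing case; give up and report it
				return err
			}
			// transient failure: stop if the caller's context expired,
			// otherwise try again against the (re)pinned endpoint
			if rpcCtx.Err() != nil {
				return rpcCtx.Err()
			}
		}
	}
}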

@heyitsanthony (Contributor Author):

OK, changed the retry logic to bail if err != closing. CI seems to be happy. PTAL /cc @xiang90

@xiang90 (Contributor) commented Aug 16, 2016:

lgtm

@heyitsanthony merged commit 8d77035 into etcd-io:master on Aug 16, 2016
@heyitsanthony deleted the clientv3-ignore-dead-eps branch on August 16, 2016 20:52
Linked issue: clientv3: won't connect if any endpoint is down (#5842)