
[vtadmin] custom discovery resolver #9977

Merged

Conversation

@ajm188 (Contributor) commented Mar 24, 2022

Description

This PR introduces a custom grpc resolver based on our discovery.Discovery interface for use in both vtctld and vtgate grpc communication in vtadmin.

Motivation and Rationale

There are two* main drawbacks to the current discovery/dialing approach in both vtctldclient and vtsql for vtadmin:

  1. If there are N potential hosts to connect to, a given vtadmin process will pick exactly one and send all traffic for all requests to that host for its lifetime. This both unevenly distributes load and could in theory overwhelm the one host connected.
  2. This one's only sort of a drawback, as we've taken steps in both [vitessdriver|vtadmin] Support Ping in vitessdriver, use in vtadmin to healthcheck connections during Dial #7709 and [vtadmin] Update vtctld dialer to validate connectivity #9915 to detect and mitigate it, but grpc's "design" "philosophy" is that it'll "handle" transient failures and keep trying in the background (forever ...) to reestablish. This means that if a vtctld/vtgate goes away permanently, we'll just spin and spin and surface errors back to the user rather than actually do anything useful.

This PR solves both problems by plugging our discovery into grpc's resolver API to (1) provide multiple hosts for grpc to multiplex the ClientConn over and (2) let grpc request a re-resolve (a re-discovery, in our case) to update the address list whenever it sees connection failures.

Preparatory changes

  1. Discovery now has Discover{Vtctld,VTGate}Addrs with the same semantics as Discover*Addr (in that we run the addr template) but returning N addresses instead of one (a rough sketch of the shape follows this list).
  2. Maybe others I'll remember, or none!
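
For reference, here is a pared-down sketch of the plural discovery methods described in (1). Only the method names come from this PR; the real vtadmin interface has more methods, and the exact signatures shown here are an assumption for illustration.

package example

import "context"

// Discovery is a pared-down sketch of the relevant methods; the signatures
// here are assumed, not copied from the PR.
type Discovery interface {
	// DiscoverVtctldAddr returns a single vtctld address with the addr template applied (existing behavior).
	DiscoverVtctldAddr(ctx context.Context, tags []string) (string, error)
	// DiscoverVtctldAddrs returns every matching vtctld address, again with the addr template applied.
	DiscoverVtctldAddrs(ctx context.Context, tags []string) ([]string, error)
	// DiscoverVTGateAddr and DiscoverVTGateAddrs behave analogously for vtgates.
	DiscoverVTGateAddr(ctx context.Context, tags []string) (string, error)
	DiscoverVTGateAddrs(ctx context.Context, tags []string) ([]string, error)
}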

Overview

When you call grpc.Dial(addr, opts...), grpc will parse addr as follows:

(<scheme>://)?(<authority>/)?<endpoint>

So, for example, the addr dns://some_authority/foo.bar will be parsed into &Target{Scheme: "dns", Authority: "some_authority", Endpoint: "foo.bar"}.

If there is a resolver registered for the given scheme (there is both a global and a local registry; more on this in a bit), then grpc will use that resolver to resolve the target into one or more addresses to connect to. Otherwise (or if the target has no scheme, as in "/some_authority/foo.bar" or just "//foo.bar"), the default scheme (dns) is used. The Dial call then returns a single ClientConn, which has a SubConn for each address returned by the resolver, and grpc can choose (or not, depending on the balancer configuration, connection errors, etc.) to multiplex the RPCs over them. After enough transient failures, or periodically (opaquely, "when grpc damn well feels like it"), it will ask the resolver to ResolveNow again, and the resolver should call UpdateState with the new set of addresses that grpc should use for that ClientConn.
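
To make that lifecycle concrete, here is a minimal sketch of a custom resolver using grpc-go's resolver package. This is not the implementation in this PR; the "example" scheme and the static address list are stand-ins.

package example

import (
	"google.golang.org/grpc/resolver"
)

// staticBuilder builds resolvers for the hypothetical "example" scheme.
type staticBuilder struct {
	addrs []string // the fixed address list this builder hands out
}

func (b *staticBuilder) Scheme() string { return "example" }

func (b *staticBuilder) Build(target resolver.Target, cc resolver.ClientConn, _ resolver.BuildOptions) (resolver.Resolver, error) {
	r := &staticResolver{cc: cc, addrs: b.addrs}
	// Push an initial address list so the ClientConn can create its SubConns.
	r.ResolveNow(resolver.ResolveNowOptions{})
	return r, nil
}

// staticResolver re-sends the same address list whenever grpc asks it to resolve.
type staticResolver struct {
	cc    resolver.ClientConn
	addrs []string
}

func (r *staticResolver) ResolveNow(resolver.ResolveNowOptions) {
	state := resolver.State{}
	for _, a := range r.addrs {
		state.Addresses = append(state.Addresses, resolver.Address{Addr: a})
	}
	// UpdateState hands the (possibly new) address list back to the ClientConn.
	r.cc.UpdateState(state)
}

func (r *staticResolver) Close() {}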

grpc.WithResolvers(...)

As mentioned above, when Dial-ing, grpc has two resolver registries to check -- the global one (via resolver.Register) and a ClientConn-local one. The latter takes precedence over the former, and is the one we're interested in. To add a resolver to the local registry, grpc provides a DialOption called WithResolvers, which works exactly like resolver.Register except that it's local to the ClientConn being dialed.

This allows us to have resolvers for each cluster, with their own configuration options, without risking – or having to reason about – them stepping over each other.
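
As a sketch of how that registration looks (reusing the hypothetical staticBuilder from above; the target, addresses, and plaintext credentials are made up just to keep the sketch self-contained):

package example

import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

// dialWithLocalResolver registers the builder only on this ClientConn,
// rather than in the global registry.
func dialWithLocalResolver() (*grpc.ClientConn, error) {
	b := &staticBuilder{addrs: []string{"localhost:15999", "localhost:16000"}}

	// There is no real address in the target; the "example" scheme routes
	// resolution to b, which supplies the actual addresses.
	return grpc.Dial(
		"example://vtctld/",
		grpc.WithResolvers(b),
		grpc.WithTransportCredentials(insecure.NewCredentials()),
	)
}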

Within VTAdmin

With all that prologue out of the way, let's talk about how this ties together within the context of vtadmin. Each cluster gets two resolvers, one for vtctldclient proxies and one for vtgate proxies. Each of these uses the cluster.Id as the "scheme" (N.B. because we only ever dial with WithResolvers, things are local to the cluster, so we could probably get away with using the same scheme (e.g. "vtadmin://") across all clusters, but as far as I can tell there's no harm in doing things the way I currently have them either), and then either "vtctld" or "vtgate" as the "authority". In order to instruct grpc to use our resolvers, then, we do not dial an actual address, but instead dial either the "address" "{clusterId}://vtctld/" or "{clusterId}://vtgate/".

So, when creating a VtctldClientProxy or VTGateProxy from the cluster CLI options (via the Parse methods), we new up a resolver.Options (our type), which takes a Discovery implementation, and then parses any DiscoveryTags, DiscoveryTimeout, and BalancerPolicy (more on this one below) from the cluster flags. We then use the options to instantiate a resolver.Builder, which will get passed into WithResolvers when either the vtctld or vtgate client makes its call to Dial.

Within the resolver itself, whenever grpc asks us to ResolveNow, we use the discoverFunc (which is either DiscoverVtctldAddrs or DiscoverVTGateAddrs, depending on the "authority" (component)) to look up the appropriate set of addresses, which we send back to the ClientConn via UpdateState. We also log information about the resolution, and update the resolver to remember which address list it most recently sent to the ClientConn, for debug purposes. Information about the resolver state is bubbled up from resolver instance => resolver builder => vtgate or vtctld proxy => cluster debug endpoint(s). For example, running locally the other day, I can show (I've changed some of the field names since then, but you get the idea; a rough sketch of the ResolveNow flow follows this debug output):

➜  vitess git:(andrew/vtadmin-vtctldclient-resolver) ✗ curl -s localhost:14200/debug/clusters | jq '.[][].resolver | select(. != null)'  
{
  "discovery_tags": null,
  "resolve_timeout": 1000000000,
  "resolvers": [
    {
      "addr_list": [
        {
          "Addr": "localhost:15999",
          "ServerName": "",
          "Attributes": null,
          "Type": 0,
          "Metadata": null
        }
      ],
      "cluster": "local",
      "component": "vtctld",
      "created_at": "2022-03-28T13:17:06Z",
      "last_resolved_at": "2022-03-28T13:17:36Z"
    }
  ],
  "scheme": "local"
}
{
  "discovery_tags": [],
  "resolve_timeout": 1000000000,
  "resolvers": [
    {
      "addr_list": [
        {
          "Addr": "localhost:15991",
          "ServerName": "",
          "Attributes": null,
          "Type": 0,
          "Metadata": null
        }
      ],
      "cluster": "local",
      "component": "vtgate",
      "created_at": "2022-03-28T13:17:06Z",
      "last_resolved_at": "2022-03-28T13:17:36Z"
    }
  ],
  "scheme": "local"
}
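
And here is the promised rough sketch of that ResolveNow flow. This is not the PR's actual type; in particular, the discoverFunc signature and the field names are assumptions for illustration.

package example

import (
	"context"
	"time"

	"google.golang.org/grpc/resolver"
)

// discoverFunc stands in for DiscoverVtctldAddrs / DiscoverVTGateAddrs; the
// exact signature is assumed here.
type discoverFunc func(ctx context.Context, tags []string) ([]string, error)

// clusterResolver sketches the per-cluster resolver described above.
type clusterResolver struct {
	cc             resolver.ClientConn
	discover       discoverFunc
	tags           []string
	resolveTimeout time.Duration
	lastAddrs      []resolver.Address // remembered for the /debug endpoint(s)
}

func (r *clusterResolver) ResolveNow(resolver.ResolveNowOptions) {
	ctx, cancel := context.WithTimeout(context.Background(), r.resolveTimeout)
	defer cancel()

	// Re-discover the current set of vtctld or vtgate addresses.
	hosts, err := r.discover(ctx, r.tags)
	if err != nil {
		r.cc.ReportError(err)
		return
	}

	addrs := make([]resolver.Address, 0, len(hosts))
	for _, host := range hosts {
		addrs = append(addrs, resolver.Address{Addr: host})
	}

	// Remember the list for debugging, then hand it to the ClientConn.
	r.lastAddrs = addrs
	r.cc.UpdateState(resolver.State{Addresses: addrs})
}

func (r *clusterResolver) Close() {}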

BalancerPolicy

The default balancer policy when none is specified is called pick_first. This is effectively grpc doing net.Dial(state.Addresses[0]), which still leaves us with the problem of "all the load goes to one host". To get around this without making the resolver responsible for shuffling the host list, we expose new flags (--{vtctld,vtgate}-grpc-balancer-policy) that allow users to specify the empty string (default policy), pick_first explicitly, or round_robin.
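
For reference, one way a balancer policy can be selected in grpc-go is via a default service config on the ClientConn; the sketch below is illustrative only, since this PR surfaces the choice through the cluster flags rather than hard-coding it.

package example

import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/grpc/resolver"
)

// dialRoundRobin dials with the round_robin policy, which spreads RPCs across
// every resolved address instead of pinning everything to the first one.
func dialRoundRobin(target string, b resolver.Builder) (*grpc.ClientConn, error) {
	return grpc.Dial(
		target,
		grpc.WithResolvers(b),
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithDefaultServiceConfig(`{"loadBalancingConfig": [{"round_robin":{}}]}`),
	)
}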

To simulate RR balancing, I added a non-existent vtctld to my static discovery file and enabled RR:

W0330 09:40:30.874143   57707 component.go:41] [core] grpc: addrConn.createTransport failed to connect to {localhost:20991  <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial tcp: lookup localhost: operation was canceled". Reconnecting...
W0330 09:40:30.874757   57707 component.go:41] [core] grpc: addrConn.createTransport failed to connect to {localhost:20991  <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial tcp [::1]:20991: connect: connection refused". Reconnecting...
diff --git a/examples/local/vtadmin/discovery.json b/examples/local/vtadmin/discovery.json
index def7dd50f8..a3e3f8c282 100644
--- a/examples/local/vtadmin/discovery.json
+++ b/examples/local/vtadmin/discovery.json
@@ -5,6 +5,12 @@
                 "fqdn": "localhost:15000",
                 "hostname": "localhost:15999"
             }
+        },
+        {
+            "host":{
+                "fqdn": "localhost:20000",
+                "hostname": "localhost:20991"
+            }
         }
     ],
     "vtgates": [
diff --git a/examples/local/scripts/vtadmin-up.sh b/examples/local/scripts/vtadmin-up.sh
index fae840847b..3b0fe53dcf 100755
--- a/examples/local/scripts/vtadmin-up.sh
+++ b/examples/local/scripts/vtadmin-up.sh
@@ -14,8 +14,9 @@ vtadmin \
   --http-tracing \
   --logtostderr \
   --alsologtostderr \
   --rbac-config="./vtadmin/rbac.yaml" \
-  --cluster "id=local,name=local,discovery=staticfile,discovery-staticfile-path=./vtadmin/discovery.json,tablet-fqdn-tmpl={{ .Tablet.Hostname }}:15{{ .Tablet.Alias.Uid }}" \
+  --cluster "id=local,name=local,discovery=staticfile,discovery-staticfile-path=./vtadmin/discovery.json,vtctld-grpc-balancer-policy=round_robin,tablet-fqdn-tmpl={{ .Tablet.Hostname }}:15{{ .Tablet.Alias.Uid }}" \
   > "${log_dir}/vtadmin-api.out" 2>&1 &
 vtadmin_pid=$!

After removing the RR policy flag (sending us back to pick_first, meaning we would pick the legit vtctld), things went back to normal:

I0330 09:42:00.050492   58484 resolver.go:243] [vtadmin.cluster.resolver]: resolving vtctlds (cluster local)
I0330 09:42:00.050554   58484 resolver.go:306] [vtadmin.cluster.resolver]: found 2 vtctlds (cluster local)
I0330 09:42:00.051179   58484 resolver.go:243] [vtadmin.cluster.resolver]: resolving vtgates (cluster local)
I0330 09:42:00.051216   58484 resolver.go:306] [vtadmin.cluster.resolver]: found 1 vtgates (cluster local)
I0330 09:42:00.051183   58484 resolver.go:243] [vtadmin.cluster.resolver]: resolving vtgates (cluster local)
I0330 09:42:00.053309   58484 resolver.go:306] [vtadmin.cluster.resolver]: found 1 vtgates (cluster local)
I0330 09:42:00.051487   58484 proxy.go:141] Established gRPC connection to vtctld
I0330 09:42:00.058540   58484 vtsql.go:145] Have valid connection to vtgate, reusing it.

Appendix

Some notable changes:

  1. vtctld and vtgate configs no longer take a Discovery directly. It was only used to discover a host to Dial, which the resolver now handles transparently, so they don't need it; instead, the Discovery is pushed down to the resolver builder.
  2. We can't annotate dial spans with the "host" we've dialed, because we don't know (it's all background in grpc/resolver code). As a compromise, we have a reference to the builder, which we implement debug.Debuggable for, so the address list and balancer policy show up in the /debug endpoint(s) for vtadmin and for a given cluster.

Testing

In addition to the unit tests I updated and added, which all pass, this has been tested both against the local example and against development vtadmin deployments.

Related Issue(s)

Checklist

  • Should this PR be backported? no
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

@ajm188 ajm188 added this to In progress in VTAdmin via automation Mar 24, 2022
@ajm188 ajm188 force-pushed the andrew/vtadmin-vtctldclient-resolver branch 3 times, most recently from f5fc354 to a415798 on March 24, 2022 17:50
@ajm188 ajm188 mentioned this pull request Mar 29, 2022
@ajm188 ajm188 force-pushed the andrew/vtadmin-vtctldclient-resolver branch from c8df110 to 1c1ac19 on March 29, 2022 20:34
@ajm188 ajm188 added Type: Internal Cleanup Type: Enhancement Logical improvement (somewhere between a bug and feature) Component: VTAdmin VTadmin interface release notes labels Mar 30, 2022
@ajm188 (Contributor, Author) commented Mar 30, 2022

okay @notfelineit @doeg @setassociative I'm marking this ready for review!! I still need to write Actual Release Notes, but given that things may change pending PR feedback, I'm going to wait on that until things seem to have solidified around the design details (but I will do it in this branch before merging, I promise!)

@ajm188 ajm188 marked this pull request as ready for review March 30, 2022 19:57
@doeg (Contributor) left a comment

This is great, both design and implementation. Thanks for the thorough write-up + commentary.

For posterity, it's worth mentioning we've had this running in our Slack environment for a couple of days, and it's been resilient to both vtctlds and vtgates going away. 💯

@deepthi deepthi requested a review from vmg March 30, 2022 21:55
@vmg (Collaborator) left a comment

The new GRPC Discovery looks good! I'm not sure about the actual vtadmin integration, that part of Vitess is foreign to me. :)

@ajm188 (Contributor, Author) commented Mar 31, 2022

we're currently seeing issues with keepalives closing connections repeatedly. vtadmin is still functional because we just redial, but it muddies the logs and could cause intermittent failures for users.

i've narrowed the problem down to the fact that our default grpcclient dial options include a keepalive.ClientParameters with PermitWithoutStream: true, whereas our default server enforcement policy (for connections to both gates and vtctlds) is to immediately close any connection that sends a ping without an active stream (c.f.)
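
To illustrate the mismatch (the timing values below are illustrative, not the actual defaults):

package example

import (
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/keepalive"
)

// The client side is configured to keep pinging even with no active streams...
var clientKeepalive = grpc.WithKeepaliveParams(keepalive.ClientParameters{
	Time:                10 * time.Second,
	Timeout:             10 * time.Second,
	PermitWithoutStream: true,
})

// ...while a server-side enforcement policy with PermitWithoutStream: false
// treats those pings as abusive and closes the connection.
var serverKeepalive = grpc.KeepaliveEnforcementPolicy(keepalive.EnforcementPolicy{
	MinTime:             10 * time.Second,
	PermitWithoutStream: false,
})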

@ajm188 (Contributor, Author) commented Apr 4, 2022

okayyyyyy!!!!!!!!!!!! rebasing on top of #10024 makes these logs go away. i believe grpc/grpc-go#4579 fixed the underlying cause of our problem.

after @vmg's change is merged, i'll rebase and update the code to use the newer fields, namely:

resolver: add AddressMap and State.BalancerAttributes
resolver: Add URL field to Target to store parsed dial target

@doeg (Contributor) left a comment

For breadcrumbs, we tested this (well, this + #10024) again in Slack's dev environment this morning and it works even better than before. :) The logs are very quiet now.

I realize you don't need another +1 to merge post-rebase, so I'll just leave one now. 🏆 Thanks again. This is really nice.

@ajm188 (Contributor, Author) commented Apr 4, 2022

there's one more change i'll need to make and re-verify, unfortunately (that Target.URL bit), i think, because the field i'm currently using (it's the only one available) was deprecated between grpc 1.37 and 1.45

@ajm188 (Contributor, Author) commented Apr 4, 2022

i've updated the rebase branch with the fix if you want to test in your environment. i can't push it to this branch (yet) because it literally won't compile 😭 😆

Andrew Mason added 4 commits April 4, 2022 12:33
1. Register `VtctldServer`, not `VtctlServer`.
2. Abuse `GetKeyspace` to allow the client to inspect the listen
   address of a given server to use in assertions.
3. Building on (2), stop asserting on proxy.host, which will be going
   away.

Still need to actually add flags for these, but this will make it easier.

Also make some things unexported.

Andrew Mason added 21 commits April 4, 2022 12:33
…th empty list, and tests

…ted)
@ajm188 ajm188 force-pushed the andrew/vtadmin-vtctldclient-resolver branch from d3c6b96 to 3d7ab09 on April 4, 2022 16:36
@ajm188 ajm188 merged commit 94a6b76 into vitessio:main Apr 4, 2022
VTAdmin automation moved this from In progress to Done Apr 4, 2022
@ajm188 ajm188 deleted the andrew/vtadmin-vtctldclient-resolver branch April 4, 2022 20:01