Multi-endpoint feature #135

nimf · 2022-07-01T19:05:09Z

The purpose of GcpMultiEndpointChannel is twofold:

Fall back to an alternative endpoint (host:port) of a gRPC service when the original endpoint is completely unavailable.
Route an RPC call to a specific group of endpoints.

A group of endpoints is called a MultiEndpoint. It is essentially a list of endpoints in which priority is defined by position, with the first endpoint having top priority. A MultiEndpoint tracks the availability of its endpoints. When a MultiEndpoint is picked for an RPC call, it picks the top-priority endpoint that is currently available. See the MultiEndpoint class for more information.
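To illustrate the priority rule described above, here is a minimal, self-contained sketch. It is not the MultiEndpoint class from this PR: the class name `PriorityEndpointsSketch`, its methods, and the choices to start every endpoint as unavailable and to return an empty result when nothing is available are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for a MultiEndpoint: an ordered endpoint list plus availability flags. */
class PriorityEndpointsSketch {
  // Insertion order of the map is the priority order (first entry = top priority).
  private final Map<String, Boolean> availability = new LinkedHashMap<>();

  PriorityEndpointsSketch(List<String> endpointsByPriority) {
    for (String endpoint : endpointsByPriority) {
      availability.put(endpoint, false); // Assumption: endpoints start as unavailable.
    }
  }

  /** Records an availability change; endpoints that were never configured are ignored. */
  void setAvailable(String endpoint, boolean available) {
    availability.computeIfPresent(endpoint, (key, old) -> available);
  }

  /** Returns the highest-priority endpoint that is currently available, if any. */
  Optional<String> pick() {
    for (Map.Entry<String, Boolean> entry : availability.entrySet()) {
      if (entry.getValue()) {
        return Optional.of(entry.getKey());
      }
    }
    return Optional.empty(); // Assumption: the sketch reports "nothing available" explicitly.
  }

  public static void main(String[] args) {
    PriorityEndpointsSketch me = new PriorityEndpointsSketch(
        List.of("service.example.com:443", "service-fallback.example.com:443"));
    me.setAvailable("service-fallback.example.com:443", true);
    System.out.println(me.pick()); // Only the fallback is available at this point.
    me.setAvailable("service.example.com:443", true);
    System.out.println(me.pick()); // The top-priority endpoint wins once it is available.
  }
}
```

The sketch keeps only the ordering and availability bookkeeping; how availability is detected is a separate concern.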

GcpMultiEndpointChannel can have one or more MultiEndpoints, each identified by its name, an arbitrary string provided in the GcpMultiEndpointOptions when configuring MultiEndpoints. This name can be used to route an RPC call to that MultiEndpoint by setting the ME_KEY key value in the RPC's CallOptions.

GcpMultiEndpointChannel receives a list of GcpMultiEndpointOptions for initial configuration. An updated configuration can be provided at any time later using setMultiEndpoints(List). The first item in the GcpMultiEndpointOptions list defines the default MultiEndpoint that will be used when no MultiEndpoint name is provided with an RPC call.

Example configuration:

MultiEndpoint named "default" with endpoints:

service.example.com:443
service-fallback.example.com:443

MultiEndpoint named "read" with endpoints:

ro-service.example.com:443
service-fallback.example.com:443
service.example.com:443

Let's assume we have a service with read and write operations and the following backends:

service.example.com -- the main set of backends supporting all operations
service-fallback.example.com -- read-write replica supporting all operations
ro-service.example.com -- read-only replica supporting only read operations

With the configuration above, GcpMultiEndpointChannel uses the "default" MultiEndpoint unless another one is requested. RPC calls therefore go to the main endpoint by default and, if it is not available, to the read-write replica.

To offload some read calls to the read-only replica, we can specify the "read" MultiEndpoint in the CallOptions. These calls then use the read-only replica endpoint; if it is not available, the read-write replica; and if that is also not available, the main endpoint.
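The routing behavior for this example configuration can be sketched as follows. This is a simplified model, not the GcpMultiEndpointChannel code: `MultiEndpointRoutingSketch` and `resolve` are hypothetical names, availability is passed in as a plain set, and two behaviors are assumptions of the sketch rather than statements about the library: an unknown name falls back to the default MultiEndpoint, and when no endpoint is available the top-priority one is returned.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of name-based routing across several MultiEndpoints. */
class MultiEndpointRoutingSketch {
  private final Map<String, List<String>> multiEndpoints = new LinkedHashMap<>();
  private final String defaultName;

  /** The first entry of the ordered config defines the default MultiEndpoint. */
  MultiEndpointRoutingSketch(LinkedHashMap<String, List<String>> config) {
    if (config.isEmpty()) {
      throw new IllegalArgumentException("At least one MultiEndpoint is required");
    }
    multiEndpoints.putAll(config);
    defaultName = config.keySet().iterator().next();
  }

  /** Builds the "default" and "read" MultiEndpoints from the example configuration. */
  static MultiEndpointRoutingSketch exampleConfig() {
    LinkedHashMap<String, List<String>> config = new LinkedHashMap<>();
    config.put("default",
        List.of("service.example.com:443", "service-fallback.example.com:443"));
    config.put("read",
        List.of("ro-service.example.com:443", "service-fallback.example.com:443",
            "service.example.com:443"));
    return new MultiEndpointRoutingSketch(config);
  }

  /** Resolves the endpoint for a call; requestedName may be null when no name is set. */
  String resolve(String requestedName, Set<String> availableEndpoints) {
    // Assumption: a missing or unknown name routes to the default MultiEndpoint.
    String name = (requestedName != null && multiEndpoints.containsKey(requestedName))
        ? requestedName
        : defaultName;
    List<String> endpoints = multiEndpoints.get(name);
    for (String endpoint : endpoints) {
      if (availableEndpoints.contains(endpoint)) {
        return endpoint; // First available endpoint in priority order.
      }
    }
    return endpoints.get(0); // Assumption: with nothing available, use the top priority.
  }

  public static void main(String[] args) {
    MultiEndpointRoutingSketch router = exampleConfig();
    Set<String> available =
        Set.of("service.example.com:443", "service-fallback.example.com:443");
    System.out.println(router.resolve(null, available));
    System.out.println(router.resolve("read", available));
  }
}
```

In the `main` method the read-only replica is treated as down, so a "read" call resolves to the read-write replica while a default call still resolves to the main endpoint.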

GcpMultiEndpointChannel creates a GcpManagedChannel channel pool for every unique endpoint. For the example above, three channel pools will be created, because the two MultiEndpoints share two of their endpoints.
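The pool count follows from deduplicating endpoints across all MultiEndpoints. A minimal sketch of that arithmetic, using a hypothetical helper `uniqueEndpoints` that is not part of this PR:

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper that counts how many channel pools the example would need. */
class UniqueEndpointsSketch {
  /** Collects every distinct endpoint across all MultiEndpoint lists. */
  static Set<String> uniqueEndpoints(List<List<String>> multiEndpoints) {
    Set<String> unique = new LinkedHashSet<>();
    for (List<String> endpoints : multiEndpoints) {
      unique.addAll(endpoints); // Endpoints shared between MultiEndpoints are counted once.
    }
    return unique;
  }

  public static void main(String[] args) {
    List<String> defaultEndpoints =
        List.of("service.example.com:443", "service-fallback.example.com:443");
    List<String> readEndpoints =
        List.of("ro-service.example.com:443", "service-fallback.example.com:443",
            "service.example.com:443");
    // One channel pool per unique endpoint: prints 3 for the example configuration.
    System.out.println(uniqueEndpoints(List.of(defaultEndpoints, readEndpoints)).size());
  }
}
```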

mohanli-ml

Thanks for the PR! Please take a look at the comments, and let me know if you have any questions.

grpc-gcp/src/main/java/com/google/cloud/grpc/multiendpoint/Endpoint.java

grpc-gcp/src/main/java/com/google/cloud/grpc/multiendpoint/MultiEndpoint.java

grpc-gcp/src/main/java/com/google/cloud/grpc/GcpMultiEndpointChannel.java

grpc-gcp/src/main/java/com/google/cloud/grpc/GcpMultiEndpointOptions.java

wenbozhu

LG overall.

One thing I was looking for is a way to define a pluggable module to monitor the original endpoint and the underlying mechanism, e.g. /gen204 etc. The implementation will be subject to change.

nimf · 2022-07-13T19:29:30Z

@wenbozhu I found we don't need to make /gen204 requests because we set up a minimum number of channels that are always connected. This removes the need for /gen204 because a few HTTP/2 connections are always alive to every endpoint. If all connections to an endpoint break, the endpoint is considered unavailable, and the minimum number of channels/connections keep trying to reconnect until a connection succeeds, at which point the endpoint is treated as available again. Does that make sense?

wenbozhu · 2022-07-13T20:06:23Z

Are those H2 connections gRPC channels? Do we rely on H2 PING to keep the connections alive, and what's the (default) interval?
Given we will maintain a few connections (default?), the interval could be increased (maybe by 1/2 of the total number of connections) if we are concerned about the overhead.

nimf · 2022-07-14T00:56:11Z

Yes, those are gRPC channels. We do rely on HTTP/2 ping keepalive, but if there are no calls it won’t send pings because keepalive_permit_without_calls is expected to be false. So for the few connections on an unused endpoint, no pings will be sent.

wenbozhu · 2022-07-14T03:13:06Z

Let's discuss this offline. Those channels (to the backup endpoint or to a broken primary endpoint) will not have active requests, so some form of keepalive ping is needed to detect the health of the endpoint.

rahul2393 · 2022-10-27T07:20:28Z

@nimf Any ETA for when this will be available in Go (https://github.com/GoogleCloudPlatform/grpc-gcp-go)?

nimf · 2022-10-27T21:18:09Z

@rahul2393 preliminary ETA is somewhere in Q1 2023

nimf added 5 commits June 30, 2022 13:20

Add MultiEndpoint

a2de65d

Allow GcpManagedChannel state change notifications.

489efd9

Remove race conditions when creating new channels.

bc260b8

Add minSize to options toString, reorder options in toString.

4f67579

Add multi-endpoint feature.

28fcaf4

nimf requested a review from mohanli-ml July 1, 2022 19:27

mohanli-ml reviewed Jul 4, 2022

nimf added 3 commits July 6, 2022 10:23

Address PR comments

892c739

Address PR comments

3884c96

Add TODOs for endpoints credentials.

282c0d7

mohanli-ml reviewed Jul 8, 2022

grpc-gcp/src/main/java/com/google/cloud/grpc/GcpMultiEndpointChannel.java (outdated)

nimf added 3 commits July 8, 2022 11:14

Rename maybeFallback to maybeUpdateCurrentEndpoint.

af5be1e

Add authorityFor method

8080ea2

Update TODOs for different channel credentials.

34e77da

mohanli-ml approved these changes Jul 10, 2022

nimf merged commit b7e0ef2 into master Jul 12, 2022

wenbozhu reviewed Jul 13, 2022

nimf deleted the multi-endpoint branch November 8, 2023 00:18
