From 832124ffa83affed864b233b24b66235b30908a1 Mon Sep 17 00:00:00 2001
From: Doug Fawley
Date: Mon, 22 Nov 2021 11:31:14 -0800
Subject: [PATCH] Proposal A49: Dynamic Connection Scaling Based On Stream
 Availability

A feature to automatically create new connections when the limit of streams
on a connection is reached.
---
 A49-dynamic-connection-scaling.md | 211 ++++++++++++++++++++++++++++++
 1 file changed, 211 insertions(+)
 create mode 100644 A49-dynamic-connection-scaling.md

diff --git a/A49-dynamic-connection-scaling.md b/A49-dynamic-connection-scaling.md
new file mode 100644
index 000000000..f7caf242e
--- /dev/null
+++ b/A49-dynamic-connection-scaling.md
@@ -0,0 +1,211 @@

Dynamic Connection Scaling Based On Stream Availability
----
* Author(s): dfawley
* Approvers: roth, ejona
* Implemented in: none
* Last updated: 2021-11-22
* Discussion at: (filled after thread exists)

## Abstract

A feature to automatically create new connections when the limit of streams on
a connection is reached.

## Background

HTTP/2 contains a connection-level setting,
[`SETTINGS_MAX_CONCURRENT_STREAMS`](https://httpwg.org/specs/rfc7540.html#SETTINGS_MAX_CONCURRENT_STREAMS),
to limit the number of streams a peer may initiate. By default, gRPC HTTP/2
servers do not set this (allowing unlimited streams), but it can be lowered to
protect against a single client using too many resources. In an environment
with reverse proxies or virtual IPs, it may be possible to create multiple
connections to the same IP address that lead to different physical servers,
and this can be a feasible way for a client to achieve more throughput (QPS)
without overloading any single server. In addition, TLS and HTTP/2 processing
for a given connection is serial, so even for a single IP leading to a single
server, using multiple connections can achieve more throughput by utilizing
more CPU cores.

## Proposal

### Connection Management

A feature will be added at the subchannel level to provide the ability to
scale up connections to the subchannel's address when the stream limit is
reached. The maximum number of connections per subchannel will be controlled
by a new service config field named `maxConnectionsPerSubchannel`. This
defaults to 1, resulting in no behavior change if it is not explicitly set. In
addition, when the value is 1, implementations should not incur any additional
overhead relative to the current behavior (e.g. no additional synchronization
or memory usage).

Note that in Java and Go, subchannels may contain multiple addresses. Once an
address is connected, any additional connections will be created to that same
address only.

Note also that creating multiple connections to a single address makes a
pick-first channel less likely to later use another address. Normally,
pick-first would start back at the first address in its list when its
connection is lost, but with multiple connections to the same address, it is
less likely that all connections will be lost together, which is the only
condition under which the channel would restart at the beginning of the list.

#### Adding Connections

To determine when to create a new connection, the subchannel will monitor the
`MAX_CONCURRENT_STREAMS` setting and the number of active streams on each
connection. A new connection will be created when all of the following hold (a
sketch of this decision logic follows the examples below):

- One or more RPCs are pending in the subchannel.
- No existing connection in the subchannel has any available streams.
- The number of current connections in the subchannel is fewer than the
  `maxConnectionsPerSubchannel` setting in the channel's service config.
- No other connections are currently being created by the subchannel.

If a connection attempt fails, the standard backoff period must be respected,
and must scale accordingly, before a new connection attempt is made.

Note that a change in any of the above conditions may trigger the creation of
a connection. Examples:

- A new RPC is dispatched and all connections are saturated.
- RPCs were already pending and a connection was lost or a connection attempt
  failed.
- RPCs were already pending and a new connection was just created that did not
  provide enough available streams for all pending RPCs.
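To make the trigger conditions concrete, the following is a minimal Go sketch
of the decision logic. All names (`connection`, `subchannel`,
`shouldCreateConnection`, and their fields) are hypothetical illustrations,
not part of any existing gRPC implementation; the sketch assumes the
subchannel tracks its connections, pending RPCs, and in-flight connection
attempts under a single lock.

```go
package connscaling

// connection tracks the per-connection stream accounting this feature needs.
type connection struct {
	maxConcurrentStreams uint32 // from the server's SETTINGS frame
	activeStreams        uint32 // streams currently open on this connection
}

// available reports whether the connection can accept another stream.
func (c *connection) available() bool {
	return c.activeStreams < c.maxConcurrentStreams
}

// subchannel holds the state consulted by the four trigger conditions.
type subchannel struct {
	conns           []*connection // kept in creation order
	pendingRPCs     int           // RPCs queued waiting for a stream
	attemptInFlight bool          // a connection is currently being created
	maxConns        int           // maxConnectionsPerSubchannel from config
}

// shouldCreateConnection evaluates the conditions above. The caller is
// assumed to hold the subchannel's lock and to re-evaluate whenever any
// input changes: a new RPC, a lost connection, a SETTINGS update, or a
// finished connection attempt.
func (s *subchannel) shouldCreateConnection() bool {
	if s.pendingRPCs == 0 {
		return false // no RPCs are waiting
	}
	for _, c := range s.conns {
		if c.available() {
			return false // an existing connection still has stream quota
		}
	}
	if len(s.conns) >= s.maxConns {
		return false // already at the configured per-subchannel limit
	}
	return !s.attemptInFlight // at most one attempt at a time
}
```

Backoff handling after failed attempts is omitted from the sketch; as stated
above, a failed attempt must wait out the scaled backoff period before another
connection may be started.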
#### Removing Connections

When connections are lost for any reason, they are removed from the
subchannel. If no `READY` connections remain and no connection attempts are in
progress, the subchannel will enter `IDLE`. If no `READY` connections remain
and a connection attempt is in progress, the subchannel will enter
`CONNECTING` instead. In either case, any pending RPCs will result in an error
with status code `UNAVAILABLE`. These RPCs would be eligible for transparent
retries, since no wire traffic was produced for them. Note that connections
are never closed automatically by the client. Servers should use
[`MAX_CONNECTION_IDLE`](https://github.com/grpc/proposal/blob/master/A9-server-side-conn-mgt.md)
to make sure clients do not keep more open connections than necessary.

### Connection Selection

When an RPC is started and a subchannel is picked for that RPC, the subchannel
will find the first connection with an available stream for it, in order of
connection creation.

The subchannel must ensure that no races occur while dispatching RPCs that
would lead to one or more RPCs being queued in a connection despite available
stream quota existing elsewhere. For example, if two RPCs are initiated at the
same time and only one stream is available in a connection, both RPCs must not
choose that same connection, or else one will queue. This race can be avoided
with locks or atomics.

One race that may lead to RPCs being queued in a connection occurs if the
`MAX_CONCURRENT_STREAMS` setting of a connection is lowered by the server
after RPCs are dispatched to the connection. This race could be avoided by
modifying the connection to not queue RPCs but to report the scenario back to
the subchannel instead, or by coordinating the `SETTINGS` frame ACK with the
subchannel. Such changes are out of scope for this design, but may be
considered in the future. Until this is designed and implemented, it is
acceptable to queue RPCs on a connection due to this race, which is expected
to be rare.

If no connection is available for an RPC, the RPC must be queued in the
subchannel until a connection is available for it. This queue must be roughly
fair: RPCs must be dispatched in the order in which they are received into the
queue, acknowledging that timing between threads may lead to concurrent RPCs
being added to the queue in an arbitrary order. See "Potential Future Work"
for a possible adjustment to this queuing strategy. A sketch of this selection
and queuing logic follows.
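The following is a minimal Go sketch of this selection and queuing behavior,
restating the hypothetical types from the previous sketch and adding the lock
and FIFO queue that dispatch requires. As before, these names are
illustrations, not an actual gRPC API.

```go
package connscaling

import "sync"

// Hypothetical types; a real subchannel would also track connection state,
// backoff, and the scaling conditions from "Adding Connections".
type connection struct {
	maxConcurrentStreams uint32 // from the server's SETTINGS frame
	activeStreams        uint32 // streams currently open
}

type pendingRPC struct{} // placeholder for a queued RPC

type subchannel struct {
	mu    sync.Mutex
	conns []*connection // kept in creation order
	queue []*pendingRPC // pending RPCs, dispatched FIFO
}

// dispatch assigns the RPC to the first connection, in creation order, that
// has an available stream. Scanning and reserving the stream under a single
// lock prevents two concurrent RPCs from both claiming the last available
// stream on a connection. If no connection has quota, the RPC is queued
// FIFO in the subchannel, keeping dispatch roughly fair.
func (s *subchannel) dispatch(r *pendingRPC) *connection {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, c := range s.conns {
		if c.activeStreams < c.maxConcurrentStreams {
			c.activeStreams++ // reserve the stream while still locked
			return c
		}
	}
	s.queue = append(s.queue, r) // strict arrival order
	// A nil result tells the caller to re-evaluate the "Adding Connections"
	// conditions, which may start a new connection attempt.
	return nil
}
```

Holding one lock across the scan is the simplest correct approach; an
atomics-based scheme is also permitted by the text above, provided it gives
the same guarantee.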
### Settings

#### Service Config Settings

The top-level service config of gRPC will be modified for this feature as
follows:

```
{
  // ...
  // existing fields, e.g. "methodConfig" and "loadBalancingConfig"
  // ...

  // Settings to control dynamic connection scaling. For more details,
  // please refer to gRFC A49 (this document).
  "connectionScaling": {
    // Maximum connections gRPC will maintain for each subchannel in
    // this channel. When no streams are available for an RPC in a
    // subchannel, gRPC will automatically create new connections up
    // to this limit. If this value changes during the life of a
    // channel, existing subchannels will be updated to reflect
    // the change. No connections will be closed as a result of
    // lowering this value. Down-scaling will only happen as
    // connections are lost naturally.
    //
    // Must be a whole number greater than 1. Values higher than
    // the configured limit (by default, 10) will be clamped to that
    // limit.
    "maxConnectionsPerSubchannel": number
  }
}
```

#### Channel Settings

The `maxConnectionsPerSubchannel` setting from the service config has a
default upper limit of 10. A channel setting should be provided to allow a
client application to raise or lower this limit.

### Temporary environment variable protection

Until sufficient testing is performed, this feature will initially be disabled
by default, and enabled only if `GRPC_EXPERIMENTAL_CONNECTION_SCALING=true`.

### Subchannel Sharing

Subchannels may be shared across channels. The `maxConnectionsPerSubchannel`
setting will be applied to a shared subchannel by taking the highest maximum
across all channels currently sharing it. If channels sharing the subchannel
are closed, the new max is recomputed the same way from the remaining
channels' settings. If the max is lowered below the number of open
connections, the excess connections will not be closed as a result;
down-scaling will only happen as connections are lost naturally. A sketch of
this computation follows.
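A small Go sketch of the shared-subchannel rule, under the same hypothetical
naming as the earlier sketches; the map parameter is a stand-in for however an
implementation tracks the settings of the channels currently sharing the
subchannel.

```go
package connscaling

// effectiveMaxConns computes the connection limit for a shared subchannel:
// the highest maxConnectionsPerSubchannel among the channels sharing it.
// It is recomputed whenever a sharing channel is added, updated, or closed.
func effectiveMaxConns(perChannelMax map[string]int) int {
	limit := 1 // the default when no sharing channel sets the field
	for _, m := range perChannelMax {
		if m > limit {
			limit = m
		}
	}
	// If this result drops below the number of currently open connections
	// (e.g. after a channel sharing the subchannel closes), no connections
	// are closed; the count only shrinks as connections are lost naturally.
	return limit
}
```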
## Rationale

Other approaches were considered as part of the design process. A design
incorporating a more sophisticated -- or pluggable -- connection selection
method was rejected due to the complexity of such a mechanism. A design to
expose `MAX_CONCURRENT_STREAMS` to LB policies was similarly rejected, because
utilizing it properly would require rewriting existing LB policies and would
introduce complexities related to synchronizing `MAX_CONCURRENT_STREAMS`
updates.

### Potential Future Work

- Additional service config settings:
  - `minConnectionsPerSubchannel` to force clients to create a minimum number
    of connections, even if they are not necessary.
  - `minAvailableStreamsPerSubchannel` to allow clients to create new
    connections before the hard `MAX_CONCURRENT_STREAMS` setting is reached.
- A channel arg to limit streams per connection to a value lower than
  `MAX_CONCURRENT_STREAMS`.
- Instead of queuing RPCs in the subchannel, it may be possible to improve
  aggregate performance by failing the RPC, resulting in a transparent retry
  and a re-pick. Without other systems in place, this would lead to a
  busy-loop if the same subchannel is picked repeatedly, so it is not included
  in this design.

### Limitations

This design does not handle connection affinity; there is no way to ensure
related RPCs end up on the same connection without setting
`maxConnectionsPerSubchannel` to 1. For use cases where affinity is important,
multiple channels should be used instead.

This design also does not attempt to maximize the throughput of connections,
which would be a far more complex problem. To maximize throughput effectively,
more information about the nature of RPCs would need to be exposed, e.g. how
much bandwidth they may require and how long they might be expected to last.
The goal of this design is simply to overcome the stream limits on
connections, hence the simple and greedy connection selection mechanism.

## Implementation

TBD