
Use a load-aware balancer #251

Merged
merged 2 commits into master from ver/p2clb on Feb 7, 2018

Conversation

olix0r
Member

@olix0r olix0r commented Feb 1, 2018

Currently, the conduit proxy uses a simplistic Round-Robin load
balancing algorithm. This strategy degrades severely when individual
endpoints exhibit abnormally high latency.

This change improves this situation somewhat by making the load balancer
aware of the number of outstanding requests to each endpoint. When nodes
exhibit high latency, they should tend to have more pending requests
than faster nodes; and the Power-of-Two-Choices node selector can be
used to distribute requests to lesser-loaded instances.

From the Finagle guide:

The algorithm randomly picks two nodes from the set of ready endpoints
and selects the least loaded of the two. By repeatedly using this
strategy, we can expect a manageable upper bound on the maximum load of
any server.

The maximum load variance between any two servers is bound by
`ln(ln(n))`, where `n` is the number of servers in the cluster.
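The selection step described above is simple to sketch. This is an illustrative minimal version, not the proxy's actual implementation; the `Endpoint` type, field names, and tie-breaking rule are assumptions for the sketch (normally the two candidate indices would be drawn at random from the ready set):

```rust
// Sketch of power-of-two-choices (P2C) selection over pending-request counts.
struct Endpoint {
    addr: &'static str,
    pending: usize, // outstanding (in-flight) requests to this endpoint
}

/// Given two candidate indices (normally drawn at random from the ready set),
/// pick the less-loaded endpoint. Ties go to the first candidate.
fn p2c_select(endpoints: &[Endpoint], a: usize, b: usize) -> usize {
    if endpoints[b].pending < endpoints[a].pending { b } else { a }
}

fn main() {
    let endpoints = vec![
        Endpoint { addr: "earth-1", pending: 2 },
        Endpoint { addr: "mars", pending: 40 }, // a slow node accumulates load
        Endpoint { addr: "earth-2", pending: 3 },
    ];
    // Whenever the slow "mars" node is one of the two candidates, it loses.
    let chosen = p2c_select(&endpoints, 1, 2);
    println!("{}", endpoints[chosen].addr); // earth-2
}
```

Because a high-latency endpoint tends to carry more pending requests, it loses most pairwise comparisons and naturally receives less traffic, with no explicit health signal required.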

@olix0r olix0r added review/ready Issue has a reviewable PR area/proxy labels Feb 1, 2018
@dadjeibaah
Contributor

I know that conduit is supposed to be zero config. Does this mean that conduit users are always going to have to stick with the power_of_two_choices load balancing?

@adleong
Member

adleong commented Feb 2, 2018

Great question, @deebo91! Power of two choices with the least-loaded metric (P2C+LL) is a very good general-purpose load balancing algorithm that works pretty well for most kinds of traffic. So I think it's a great default.

In the future, it would be really cool to see Conduit dynamically picking an LB algorithm depending on the nature of the traffic it sees. For example, EWMA+Latency for unary requests, P2C+LL for streaming requests, aperture when the load doesn't match the pool size, etc.

But in general, it should be possible to intelligently make these determinations based on the live data instead of needing users to configure it.
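The EWMA+latency variant mentioned above replaces the pending-request count with a smoothed latency cost. A minimal sketch of that load signal, with an assumed decay weight and illustrative names (not Conduit's actual types):

```rust
// Illustrative exponentially weighted moving average (EWMA) of observed
// latencies, the kind of per-endpoint cost an EWMA+latency balancer compares.
struct Ewma {
    cost: f64,  // smoothed latency estimate, in milliseconds
    alpha: f64, // weight given to each new observation, in (0, 1]
}

impl Ewma {
    fn new(alpha: f64) -> Self {
        Ewma { cost: 0.0, alpha }
    }

    /// Fold a new latency sample (in milliseconds) into the estimate.
    fn observe(&mut self, latency_ms: f64) {
        if self.cost == 0.0 {
            self.cost = latency_ms; // seed with the first sample
        } else {
            self.cost = self.alpha * latency_ms + (1.0 - self.alpha) * self.cost;
        }
    }
}

fn main() {
    let mut fast = Ewma::new(0.5);
    let mut slow = Ewma::new(0.5);
    for _ in 0..4 {
        fast.observe(70.0);   // "earth"-like endpoint
        slow.observe(2070.0); // "mars"-like endpoint
    }
    // A P2C picker comparing these costs would route away from the slow node.
    println!("fast={:.0} slow={:.0}", fast.cost, slow.cost);
}
```

The same P2C candidate-picking step works unchanged; only the cost being compared differs, which is what makes swapping metrics per traffic type plausible.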

@dadjeibaah
Contributor

That sounds like a really good feature! Aligns with the "just works" philosophy. Thanks for clarifying @adleong!

@olix0r
Member Author

olix0r commented Feb 2, 2018

ran a little test:

slowcooker -> lb -> hello -> world

world is made of 9 "earth" pods with no added latency, and a single "mars" pod with 2s of added latency.

:; slow_cooker -concurrency=100 $HELLO
# sending 100 GET req/s with concurrency=100 to http://$HELLO/ ...
#                      good/b/f t   goal%   min [p50 p95 p99  p999]  max bhash change

p2c + pending requests (this branch)

2018-02-02T20:37:47Z    978/0/0 1000  97% 10s  40 [ 76  92 2075 2087 ] 2086      0 
2018-02-02T20:37:57Z    966/0/0 1000  96% 10s  40 [ 71  83 2073 2085 ] 2084      0 
2018-02-02T20:38:07Z    960/0/0 1000  96% 10s  40 [ 70  81 2075 2077 ] 2077      0 
2018-02-02T20:38:17Z    983/0/0 1000  98% 10s  40 [ 71  84 2075 2085 ] 2085      0 
2018-02-02T20:38:27Z    973/0/0 1000  97% 10s  40 [ 69  82 2071 2081 ] 2081      0 
2018-02-02T20:38:37Z    969/0/0 1000  96% 10s  40 [ 71  80 2073 2079 ] 2079      0 
2018-02-02T20:38:47Z    972/0/0 1000  97% 10s  40 [ 70  90 2075 2087 ] 2087      0 

round robin (v0.2.0):

2018-02-02T20:39:37Z    808/100/0 1000  90% 10s  39 [ 72 2069 2301 2477 ] 2476      0 
2018-02-02T20:39:47Z    909/0/0 1000  90% 10s  40 [ 67 2067 2073 2077 ] 2077      0 
2018-02-02T20:39:57Z    909/0/0 1000  90% 10s  40 [ 69 2067 2073 2079 ] 2079      0 
2018-02-02T20:40:07Z    909/0/0 1000  90% 10s  40 [ 67 2069 2079 2101 ] 2100      0 
2018-02-02T20:40:17Z    910/0/0 1000  91% 10s  40 [ 68 2067 2079 2083 ] 2083      0 
2018-02-02T20:40:27Z    907/0/0 1000  90% 10s  40 [ 66 2065 2073 2093 ] 2093      0 
2018-02-02T20:40:37Z    910/0/0 1000  91% 10s  40 [ 69 2069 2075 2077 ] 2077      0 
2018-02-02T20:40:47Z    910/0/0 1000  91% 10s  40 [ 66 2067 2075 2079 ] 2078      0 
2018-02-02T20:40:57Z    910/0/0 1000  91% 10s  40 [ 68 2069 2073 2075 ] 2075      0 

@seanmonstar
Contributor

Sorry for the noob question, but how do I interpret these results?

@klingerf
Member

klingerf commented Feb 2, 2018

@seanmonstar The big differentiator appears to be the improvement in p95 latency, from 2067ms to 84ms, on average. The latency columns in the output are [p50 p95 p99 p999].

Another useful metric to look at is goal%, which represents the percentage of requests actually sent, given a goal of 100rps. The p2c output shows that it came closer to reaching the goal, which means that it had higher overall throughput.
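To make the bracketed columns concrete: each is a percentile of the latency sample for that interval. A small sketch, using a common nearest-rank rule (not necessarily slow_cooker's exact method), shows why a single slow pod out of ten dominates p95 under round-robin:

```rust
// Sketch: reading the [p50 p95 p99 p999] columns as percentiles of a sorted
// latency sample. The nearest-rank rounding here is an assumption.
fn percentile(sorted_ms: &[u64], p: f64) -> u64 {
    let rank = ((p / 100.0) * sorted_ms.len() as f64).ceil() as usize;
    sorted_ms[rank.saturating_sub(1).min(sorted_ms.len() - 1)]
}

fn main() {
    // Ten samples: nine fast "earth" responses and one slow "mars" response,
    // which is what even round-robin spreading across 10 pods produces.
    let mut ms = vec![70, 71, 69, 72, 70, 68, 71, 70, 69, 2070];
    ms.sort();
    println!("p50={} p95={}", percentile(&ms, 50.0), percentile(&ms, 95.0));
}
```

With one in ten requests hitting the 2s pod, p50 stays near 70ms but p95 lands on the slow sample, matching the ~2067ms p95 column in the round-robin run; the P2C branch avoids this by sending the slow pod fewer requests.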

@olix0r
Member Author

olix0r commented Feb 2, 2018

For posterity, here's the base k8s configuration I used to test the load balancing behavior: https://gist.github.com/olix0r/16006b1dd98fd43221820181d36293d9

@olix0r olix0r changed the title [wip] Use a load-aware balancer Use a load-aware balancer Feb 7, 2018
Contributor

@seanmonstar seanmonstar left a comment

Well, the PR is short and sweet. I assume things are better, but don't otherwise have comments...

Although, it occurs to me now that we do have this concept of a weighted addr set, but aren't really using it. Should we be?

@olix0r
Member Author

olix0r commented Feb 7, 2018

@seanmonstar I think the destination service should be extended to support weights; once we've implemented that, we can introduce weighted balancing into the proxy.

@olix0r olix0r merged commit a2d537f into master Feb 7, 2018
@olix0r olix0r deleted the ver/p2clb branch February 7, 2018 17:39
@olix0r olix0r removed the review/ready Issue has a reviewable PR label Feb 7, 2018
khappucino pushed a commit to Nordstrom/linkerd2 that referenced this pull request Mar 5, 2019
olix0r added a commit that referenced this pull request May 16, 2019
commit b27dfb2d21aa8ca5466ea0edce17d27094ace7c1
Author: Takanori Ishibashi <takanori.1112@gmail.com>
Date:   Wed May 15 05:58:42 2019 +0900

    updaes->updates (#250)

    Signed-off-by: Takanori Ishibashi <takanori.1112@gmail.com>

commit 16441c25a9d423a6ab12b689b830d9ae3798fa00
Author: Eliza Weisman <eliza@buoyant.io>
Date:   Tue May 14 14:40:03 2019 -0700

     Pass router::Config directly to router::Layer (#253)

    Currently, router `layer`s are constructed with a single argument, a
    type implementing `Recognize`. Then, the entire router stack is built
    with a `router::Config`. However, in #248, it became necessary to
    provide the config up front when constructing the `router::layer`, as
    the layer is used in a fallback layer. Rather than providing a separate
    type for a preconfigured layer, @olix0r suggested we simply change all
    router layers to accept the `Config` when they're constructed (see
    linkerd/linkerd2-proxy#248 (comment)).

    This branch changes `router::Layer` to accept the config up front. The
    `router::Stack` types `make` function now requires no arguments, and the
    implementation of `Service` for `Stack` can be called with any `T` (as
    the target is now ignored).

    Signed-off-by: Eliza Weisman <eliza@buoyant.io>

commit b70c68d4504a362eac6a7828039a2e5c7fcd308a
Author: Eliza Weisman <eliza@buoyant.io>
Date:   Wed May 15 13:14:04 2019 -0700

    Load balancers fall back to ORIG_DST when no endpoints exist (#248)

    Currently, when no endpoints exist in the load balancer for a
    destination, we fail the request. This is because we expect endpoints to
    be discovered by both destination service queries _and_ DNS lookups, so
    if there are no endpoints for a destination, it is assumed to not exist.

    In #2661, we intend to remove the DNS lookup from the
    proxy and instead fall back to routing requests for which no endpoints
    exist in the destination service to their SO_ORIGINAL_DST IP address.
    This means that the current approach of failing requests when the load
    balancer has no endpoints will no longer work.

    This branch introduces a generic `fallback` layer, which composes a
    primary and secondary service builder into a new layer. The primary
    service can fail requests with an error type that propagates the original
    request, allowing the fallback middleware to call the fallback service
    with the same request. Other errors returned by the primary service are
    still propagated upstream.

    In contrast to the approach used in #240, this fallback middleware is
    generic and not tied directly to a load balancer or a router, and can
    be used for other purposes in the future. It relies on the router cache
    eviction added in #247 to drain the router when it is not being used,
    rather than proactively destroying the router when endpoints are
    available for the lb, and re-creating it when they exist again.

    A new trait, `HasEndpointStatus`, is added in order to allow the
    discovery lookup to communicate the "no endpoints" state to the
    balancer. In addition, we add a new `Update::NoEndpoints` variant to
    `proxy::resolve::Update`, so that when the control plane sends a no
    endpoints update, we switch from the balancer to the no endpoints state
    _immediately_, rather than waiting for all the endpoints to be
    individually removed. When the balancer has no endpoints, it fails all
    requests with a fallback error, so that the fallback middleware can hand
    them off to the fallback service.

    A subsequent PR (#248) will remove the DNS lookups from the discovery
    module.

    Closes #240.

    Signed-off-by: Eliza Weisman <eliza@buoyant.io>

commit 6525b0638ad18e74510f3156269e0613f237e2f5
Author: Zahari Dichev <zaharidichev@gmail.com>
Date:   Wed May 15 23:35:09 2019 +0300

    Allow disabling tap by setting an env var (#252)

    This PR fixes #2811. Now if
    `LINKERD2_PROXY_TAP_DISABLED` is set, the tap is not served at all. The
    approach taken is that  the `ProxyParts` is changed so the
    `control_listener` is now an `Option` that will be None if tap is
    disabled as this control_listener seems to be exclusively used to serve
    the tap. Feel free to suggest a better approach.

    Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>

commit 91f32db2ea6d74470fd689c713ff87dc7586222d
Author: Zahari Dichev <zaharidichev@gmail.com>
Date:   Thu May 16 00:45:23 2019 +0300

    Assert that outbound TLS works before identity is certified (#251)

    This commit introduces TLS capabilities to the support server as well as
    tests to ensure that outbound TLS works even when there is no verified
    certificate for the proxy yet.

    Fixes #2599

    Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>

commit 45aadc6b1b28e6daea0c40e694a86ae518887d85
Author: Sean McArthur <sean@buoyant.io>
Date:   Wed May 15 14:25:39 2019 -0700

    Update h2 to v0.1.19

    Includes a couple HPACK fixes

    Signed-off-by: Sean McArthur <sean@buoyant.io>

commit 3e0e00c6dfbf5a9155b887cfd594f611edfc135f
Author: Oliver Gould <ver@buoyant.io>
Date:   Thu May 16 08:11:06 2019 -0700

    Update mio to 0.6.17 (#257)

    To pick up tokio-rs/mio#939