
Use a load-aware balancer #251

Merged
merged 2 commits into master from ver/p2clb on Feb 7, 2018

Conversation

olix0r
Member

@olix0r olix0r commented Feb 1, 2018

Currently, the conduit proxy uses a simplistic Round-Robin load
balancing algorithm. This strategy degrades severely when individual
endpoints exhibit abnormally high latency.

This change improves this situation somewhat by making the load balancer
aware of the number of outstanding requests to each endpoint. When nodes
exhibit high latency, they should tend to have more pending requests
than faster nodes; and the Power-of-Two-Choices node selector can be
used to distribute requests to lesser-loaded instances.

From the Finagle guide:

The algorithm randomly picks two nodes from the set of ready endpoints
and selects the least loaded of the two. By repeatedly using this
strategy, we can expect a manageable upper bound on the maximum load of
any server.

The maximum load variance between any two servers is bound by
`ln(ln(n))`, where `n` is the number of servers in the cluster.
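The selection step described above is simple to sketch. This is an illustrative minimal version, not the proxy's actual implementation; the `Endpoint` type, field names, and tie-breaking rule are assumptions for the sketch (normally the two candidate indices would be drawn at random from the ready set):

```rust
// Sketch of power-of-two-choices (P2C) selection over pending-request counts.
struct Endpoint {
    addr: &'static str,
    pending: usize, // outstanding (in-flight) requests to this endpoint
}

/// Given two candidate indices (normally drawn at random from the ready set),
/// pick the less-loaded endpoint. Ties go to the first candidate.
fn p2c_select(endpoints: &[Endpoint], a: usize, b: usize) -> usize {
    if endpoints[b].pending < endpoints[a].pending { b } else { a }
}

fn main() {
    let endpoints = vec![
        Endpoint { addr: "earth-1", pending: 2 },
        Endpoint { addr: "mars", pending: 40 }, // a slow node accumulates load
        Endpoint { addr: "earth-2", pending: 3 },
    ];
    // Whenever the slow "mars" node is one of the two candidates, it loses.
    let chosen = p2c_select(&endpoints, 1, 2);
    println!("{}", endpoints[chosen].addr); // earth-2
}
```

Because a high-latency endpoint tends to carry more pending requests, it loses most pairwise comparisons and naturally receives less traffic, with no explicit health signal required.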

@olix0r olix0r added review/ready Issue has a reviewable PR area/proxy labels Feb 1, 2018
@dadjeibaah
Contributor

I know that conduit is supposed to be zero config. Does this mean that conduit users are always going to have to stick with the power_of_two_choices load balancing?

@adleong
Member

adleong commented Feb 2, 2018

Great question, @deebo91! Power of two choices with the least-loaded metric (P2C+LL) is a very good general-purpose load balancing algorithm that works pretty well for most kinds of traffic. So I think it's a great default.

In the future, it would be really cool to see Conduit dynamically picking an LB algorithm depending on the nature of the traffic it sees. For example, EWMA+Latency for unary requests, P2C+LL for streaming requests, aperture when the load doesn't match the pool size, etc.

But in general, it should be possible to intelligently make these determinations based on the live data instead of needing users to configure it.
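The EWMA+latency variant mentioned above replaces the pending-request count with a smoothed latency cost. A minimal sketch of that load signal, with an assumed decay weight and illustrative names (not Conduit's actual types):

```rust
// Illustrative exponentially weighted moving average (EWMA) of observed
// latencies, the kind of per-endpoint cost an EWMA+latency balancer compares.
struct Ewma {
    cost: f64,  // smoothed latency estimate, in milliseconds
    alpha: f64, // weight given to each new observation, in (0, 1]
}

impl Ewma {
    fn new(alpha: f64) -> Self {
        Ewma { cost: 0.0, alpha }
    }

    /// Fold a new latency sample (in milliseconds) into the estimate.
    fn observe(&mut self, latency_ms: f64) {
        if self.cost == 0.0 {
            self.cost = latency_ms; // seed with the first sample
        } else {
            self.cost = self.alpha * latency_ms + (1.0 - self.alpha) * self.cost;
        }
    }
}

fn main() {
    let mut fast = Ewma::new(0.5);
    let mut slow = Ewma::new(0.5);
    for _ in 0..4 {
        fast.observe(70.0);   // "earth"-like endpoint
        slow.observe(2070.0); // "mars"-like endpoint
    }
    // A P2C picker comparing these costs would route away from the slow node.
    println!("fast={:.0} slow={:.0}", fast.cost, slow.cost);
}
```

The same P2C candidate-picking step works unchanged; only the cost being compared differs, which is what makes swapping metrics per traffic type plausible.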

@dadjeibaah
Contributor

That sounds like a really good feature! Aligns with the "just works" philosophy. Thanks for clarifying @adleong!

@olix0r
Member Author

olix0r commented Feb 2, 2018

ran a little test:

slowcooker -> lb -> hello -> world

world is made of 9 "earth" pods with no added latency, and a single "mars" pod with 2s of added latency.

:; slow_cooker -concurrency=100 $HELLO
# sending 100 GET req/s with concurrency=100 to http://$HELLO/ ...
#                      good/b/f t   goal%   min [p50 p95 p99  p999]  max bhash change

p2c + pending requests (this branch)

2018-02-02T20:37:47Z    978/0/0 1000  97% 10s  40 [ 76  92 2075 2087 ] 2086      0 
2018-02-02T20:37:57Z    966/0/0 1000  96% 10s  40 [ 71  83 2073 2085 ] 2084      0 
2018-02-02T20:38:07Z    960/0/0 1000  96% 10s  40 [ 70  81 2075 2077 ] 2077      0 
2018-02-02T20:38:17Z    983/0/0 1000  98% 10s  40 [ 71  84 2075 2085 ] 2085      0 
2018-02-02T20:38:27Z    973/0/0 1000  97% 10s  40 [ 69  82 2071 2081 ] 2081      0 
2018-02-02T20:38:37Z    969/0/0 1000  96% 10s  40 [ 71  80 2073 2079 ] 2079      0 
2018-02-02T20:38:47Z    972/0/0 1000  97% 10s  40 [ 70  90 2075 2087 ] 2087      0 

round robin (v0.2.0):

2018-02-02T20:39:37Z    808/100/0 1000  90% 10s  39 [ 72 2069 2301 2477 ] 2476      0 
2018-02-02T20:39:47Z    909/0/0 1000  90% 10s  40 [ 67 2067 2073 2077 ] 2077      0 
2018-02-02T20:39:57Z    909/0/0 1000  90% 10s  40 [ 69 2067 2073 2079 ] 2079      0 
2018-02-02T20:40:07Z    909/0/0 1000  90% 10s  40 [ 67 2069 2079 2101 ] 2100      0 
2018-02-02T20:40:17Z    910/0/0 1000  91% 10s  40 [ 68 2067 2079 2083 ] 2083      0 
2018-02-02T20:40:27Z    907/0/0 1000  90% 10s  40 [ 66 2065 2073 2093 ] 2093      0 
2018-02-02T20:40:37Z    910/0/0 1000  91% 10s  40 [ 69 2069 2075 2077 ] 2077      0 
2018-02-02T20:40:47Z    910/0/0 1000  91% 10s  40 [ 66 2067 2075 2079 ] 2078      0 
2018-02-02T20:40:57Z    910/0/0 1000  91% 10s  40 [ 68 2069 2073 2075 ] 2075      0 

@seanmonstar
Contributor

Sorry for the noob question, but how do I interpret these results?

@klingerf
Member

klingerf commented Feb 2, 2018

@seanmonstar The big differentiator appears to be the improvement in p95 latency, from 2067ms to 84ms, on average. The latency columns in the output are [p50 p95 p99 p999].

Another useful metric to look at is goal%, which represents the percentage of requests actually sent, given a goal of 100rps. The p2c output shows that it came closer to reaching the goal, which means that it had higher overall throughput.
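To make the bracketed columns concrete: each is a percentile of the latency sample for that interval. A small sketch, using a common nearest-rank rule (not necessarily slow_cooker's exact method), shows why a single slow pod out of ten dominates p95 under round-robin:

```rust
// Sketch: reading the [p50 p95 p99 p999] columns as percentiles of a sorted
// latency sample. The nearest-rank rounding here is an assumption.
fn percentile(sorted_ms: &[u64], p: f64) -> u64 {
    let rank = ((p / 100.0) * sorted_ms.len() as f64).ceil() as usize;
    sorted_ms[rank.saturating_sub(1).min(sorted_ms.len() - 1)]
}

fn main() {
    // Ten samples: nine fast "earth" responses and one slow "mars" response,
    // which is what even round-robin spreading across 10 pods produces.
    let mut ms = vec![70, 71, 69, 72, 70, 68, 71, 70, 69, 2070];
    ms.sort();
    println!("p50={} p95={}", percentile(&ms, 50.0), percentile(&ms, 95.0));
}
```

With one in ten requests hitting the 2s pod, p50 stays near 70ms but p95 lands on the slow sample, matching the ~2067ms p95 column in the round-robin run; the P2C branch avoids this by sending the slow pod fewer requests.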

@olix0r
Member Author

olix0r commented Feb 2, 2018

For posterity, here's the base k8s configuration I used to test the load balancing behavior: https://gist.github.com/olix0r/16006b1dd98fd43221820181d36293d9

@olix0r olix0r changed the title [wip] Use a load-aware balancer Use a load-aware balancer Feb 7, 2018
Contributor

@seanmonstar seanmonstar left a comment

Well, the PR is short and sweet. I assume things are better, but don't otherwise have comments...

Although, it occurs to me now that we do have this concept of a weighted addr set, but aren't really using it. Should we be?

@olix0r
Member Author

olix0r commented Feb 7, 2018

@seanmonstar I think the destination service should be extended to support weights; once we've implemented that, we can introduce weighted balancing into the proxy.

@olix0r olix0r merged commit a2d537f into master Feb 7, 2018
@olix0r olix0r deleted the ver/p2clb branch February 7, 2018 17:39
@olix0r olix0r removed the review/ready Issue has a reviewable PR label Feb 7, 2018
khappucino pushed a commit to Nordstrom/linkerd2 that referenced this pull request Mar 5, 2019
olix0r added a commit that referenced this pull request May 16, 2019
commit b27dfb2d21aa8ca5466ea0edce17d27094ace7c1
Author: Takanori Ishibashi <takanori.1112@gmail.com>
Date:   Wed May 15 05:58:42 2019 +0900

    updaes->updates (#250)

    Signed-off-by: Takanori Ishibashi <takanori.1112@gmail.com>

commit 16441c25a9d423a6ab12b689b830d9ae3798fa00
Author: Eliza Weisman <eliza@buoyant.io>
Date:   Tue May 14 14:40:03 2019 -0700

     Pass router::Config directly to router::Layer (#253)

    Currently, router `layer`s are constructed with a single argument, a
    type implementing `Recognize`. Then, the entire router stack is built
    with a `router::Config`. However, in #248, it became necessary to
    provide the config up front when constructing the `router::layer`, as
    the layer is used in a fallback layer. Rather than providing a separate
    type for a preconfigured layer, @olix0r suggested we simply change all
    router layers to accept the `Config` when they're constructed (see
    linkerd/linkerd2-proxy#248 (comment)).

    This branch changes `router::Layer` to accept the config up front. The
    `router::Stack` types `make` function now requires no arguments, and the
    implementation of `Service` for `Stack` can be called with any `T` (as
    the target is now ignored).

    Signed-off-by: Eliza Weisman <eliza@buoyant.io>

commit b70c68d4504a362eac6a7828039a2e5c7fcd308a
Author: Eliza Weisman <eliza@buoyant.io>
Date:   Wed May 15 13:14:04 2019 -0700

    Load balancers fall back to ORIG_DST when no endpoints exist (#248)

    Currently, when no endpoints exist in the load balancer for a
    destination, we fail the request. This is because we expect endpoints to
    be discovered by both destination service queries _and_ DNS lookups, so
    if there are no endpoints for a destination, it is assumed to not exist.

    In #2661, we intend to remove the DNS lookup from the
    proxy and instead fall back to routing requests for which no endpoints
    exist in the destination service to their SO_ORIGINAL_DST IP address.
    This means that the current approach of failing requests when the load
    balancer has no endpoints will no longer work.

    This branch introduces a generic `fallback` layer, which composes a
    primary and secondary service builder into a new layer. The primary
    service can fail requests with an error type that propagates the original
    request, allowing the fallback middleware to call the fallback service
    with the same request. Other errors returned by the primary service are
    still propagated upstream.

    In contrast to the approach used in #240, this fallback middleware is
    generic and not tied directly to a load balancer or a router, and can
    be used for other purposes in the future. It relies on the router cache
    eviction added in #247 to drain the router when it is not being used,
    rather than proactively destroying the router when endpoints are
    available for the lb, and re-creating it when they exist again.

    A new trait, `HasEndpointStatus`, is added in order to allow the
    discovery lookup to communicate the "no endpoints" state to the
    balancer. In addition, we add a new `Update::NoEndpoints` variant to
    `proxy::resolve::Update`, so that when the control plane sends a no
    endpoints update, we switch from the balancer to the no endpoints state
    _immediately_, rather than waiting for all the endpoints to be
    individually removed. When the balancer has no endpoints, it fails all
    requests with a fallback error, so that the fallback middleware can hand
    them off to the fallback service.

    A subsequent PR (#248) will remove the DNS lookups from the discovery
    module.

    Closes #240.

    Signed-off-by: Eliza Weisman <eliza@buoyant.io>

commit 6525b0638ad18e74510f3156269e0613f237e2f5
Author: Zahari Dichev <zaharidichev@gmail.com>
Date:   Wed May 15 23:35:09 2019 +0300

    Allow disabling tap by setting an env var (#252)

    This PR fixes #2811. Now if
    `LINKERD2_PROXY_TAP_DISABLED` is set, the tap is not served at all. The
    approach taken is that  the `ProxyParts` is changed so the
    `control_listener` is now an `Option` that will be None if tap is
    disabled as this control_listener seems to be exclusively used to serve
    the tap. Feel free to suggest a better approach.

    Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>

commit 91f32db2ea6d74470fd689c713ff87dc7586222d
Author: Zahari Dichev <zaharidichev@gmail.com>
Date:   Thu May 16 00:45:23 2019 +0300

    Assert that outbound TLS works before identity is certified (#251)

    This commit introduces TLS capabilities to the support server as well as
    tests to ensure that outbound TLS works even when there is no verified
    certificate for the proxy yet.

    Fixes #2599

    Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>

commit 45aadc6b1b28e6daea0c40e694a86ae518887d85
Author: Sean McArthur <sean@buoyant.io>
Date:   Wed May 15 14:25:39 2019 -0700

    Update h2 to v0.1.19

    Includes a couple HPACK fixes

    Signed-off-by: Sean McArthur <sean@buoyant.io>

commit 3e0e00c6dfbf5a9155b887cfd594f611edfc135f
Author: Oliver Gould <ver@buoyant.io>
Date:   Thu May 16 08:11:06 2019 -0700

    Update mio to 0.6.17 (#257)

    To pick up tokio-rs/mio#939