-
Notifications
You must be signed in to change notification settings - Fork 265
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Load balancers fall back to ORIG_DST when no endpoints exist #248
Conversation
impl<Req, Rec> Clone for Layer<Req, Rec> | ||
where | ||
Rec: Recognize<Req> + Clone + Send + Sync + 'static, | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous derived Clone
implementation for Layer
(and for Stack
) required that the request type also be Clone
, even though it's a PhantomData
; this had to be fixed since the fallback layer requires cloning the fallback service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. I also found the type Layer<...>
in fallback.rs
to be unclear but it's not a blocker for me.
When we ready Ready
for MakeFallback
still needs to be figured out. I think I prefer the way the way it currently is -- applying backpressure when making the service instead of falling back and having the possibility that the fallback service is not ready.
@olix0r and @kleimkuhler, I've made all the changes you requested. I think we still need to agree on what the correct thing to do when the fallback service is not ready is. |
Currently, router `layer`s are constructed with a single argument, a type implementing `Recognize`. Then, the entire router stack is built with a `router::Config`. However, in #248, it became necessary to provide the config up front when constructing the `router::layer`, as the layer is used in a fallback layer. Rather than providing a separate type for a preconfigured layer, @olix0r suggested we simply change all router layers to accept the `Config` when they're constructed (see #248 (comment)). This branch changes `router::Layer` to accept the config up front. The `router::Stack` types `make` function now requires no arguments, and the implementation of `Service` for `Stack` can be called with any `T` (as the target is now ignored). Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For posterity, here's how I tested this:
- deployed strest with a proxy from this branch
- confirmed traffic to the dst pod
- change the service so that no pods are selected in the service
- confirm that traffic continues to the pod despite the service being emptied
- add a new client (that can't resolve the service name to any ips) and confirm that it cannot send requests
The `tower-balance` load balancer's `poll_ready` function returns `NotReady` when the load balancer has no ready endpoints. However, our fallback strategy added in #248 requires the load balancer to accept a request when it has no endpoints, so that it can return the error indicating that the fallback router should be used. This branch changes the proxy's `http::balance::Service`'s `poll_ready` implementation to return `Ok(Async::Ready(()))` when it is in the no endpoints state, so that the request will fall back. Signed-off-by: Eliza Weisman <eliza@buoyant.io>
commit b27dfb2d21aa8ca5466ea0edce17d27094ace7c1 Author: Takanori Ishibashi <takanori.1112@gmail.com> Date: Wed May 15 05:58:42 2019 +0900 updaes->updates (#250) Signed-off-by: Takanori Ishibashi <takanori.1112@gmail.com> commit 16441c25a9d423a6ab12b689b830d9ae3798fa00 Author: Eliza Weisman <eliza@buoyant.io> Date: Tue May 14 14:40:03 2019 -0700 Pass router::Config directly to router::Layer (#253) Currently, router `layer`s are constructed with a single argument, a type implementing `Recognize`. Then, the entire router stack is built with a `router::Config`. However, in #248, it became necessary to provide the config up front when constructing the `router::layer`, as the layer is used in a fallback layer. Rather than providing a separate type for a preconfigured layer, @olix0r suggested we simply change all router layers to accept the `Config` when they're constructed (see linkerd/linkerd2-proxy#248 (comment)). This branch changes `router::Layer` to accept the config up front. The `router::Stack` types `make` function now requires no arguments, and the implementation of `Service` for `Stack` can be called with any `T` (as the target is now ignored). Signed-off-by: Eliza Weisman <eliza@buoyant.io> commit b70c68d4504a362eac6a7828039a2e5c7fcd308a Author: Eliza Weisman <eliza@buoyant.io> Date: Wed May 15 13:14:04 2019 -0700 Load balancers fall back to ORIG_DST when no endpoints exist (#248) Currently, when no endpoints exist in the load balancer for a destination, we fail the request. This is because we expect endpoints to be discovered by both destination service queries _and_ DNS lookups, so if there are no endpoints for a destination, it is assumed to not exist. In #2661, we intend to remove the DNS lookup from the proxy and instead fall back to routing requests for which no endpoints exist in the destination service to their SO_ORIGINAL_DST IP address. This means that the current approach of failing requests when the load balancer has no endpoints will no longer work. This branch introduces a generic `fallback` layer, which composes a primary and secondary service builder into a new layer. The primary service can fail requests with an error type that propages the original request, allowing the fallback middleware to call the fallback service with the same request. Other errors returned by the primary service are still propagated upstream. In contrast to the approach used in #240, this fallback middleware is generic and not tied directly to a load balancer or a router, and can be used for other purposes in the future. It relies on the router cache eviction added in #247 to drain the router when it is not being used, rather than proactively destroying the router when endpoints are available for the lb, and re-creating it when they exist again. A new trait, `HasEndpointStatus`, is added in order to allow the discovery lookup to communicate the "no endpoints" state to the balancer. In addition, we add a new `Update::NoEndpoints` variant to `proxy::resolve::Update`, so that when the control plane sends a no endpoints update, we switch from the balancer to the no endpoints state _immediately_, rather than waiting for all the endpoints to be individually removed. When the balancer has no endpoints, it fails all requests with a fallback error, so that the fallback middleware A subsequent PR (#248) will remove the DNS lookups from the discovery module. Closes #240. Signed-off-by: Eliza Weisman <eliza@buoyant.io> commit 6525b0638ad18e74510f3156269e0613f237e2f5 Author: Zahari Dichev <zaharidichev@gmail.com> Date: Wed May 15 23:35:09 2019 +0300 Allow disabling tap by setting an env var (#252) This PR fixes #2811. Now if `LINKERD2_PROXY_TAP_DISABLED` is set, the tap is not served at all. The approach taken is that the `ProxyParts` is changed so the `control_listener` is now an `Option` that will be None if tap is disabled as this control_listener seems to be exclusively used to serve the tap. Feel free to suggest a better approach. Signed-off-by: Zahari Dichev <zaharidichev@gmail.com> commit 91f32db2ea6d74470fd689c713ff87dc7586222d Author: Zahari Dichev <zaharidichev@gmail.com> Date: Thu May 16 00:45:23 2019 +0300 Assert that outbound TLS works before identity is certified (#251) This commit introduces TLS capabilities to the support server as well as tests to ensure that outbound TLS works even when there is no verified certificate for the proxy yet. Fixes #2599 Signed-off-by: Zahari Dichev <zaharidichev@gmail.com> commit 45aadc6b1b28e6daea0c40e694a86ae518887d85 Author: Sean McArthur <sean@buoyant.io> Date: Wed May 15 14:25:39 2019 -0700 Update h2 to v0.1.19 Includes a couple HPACK fixes Signed-off-by: Sean McArthur <sean@buoyant.io> commit 3e0e00c6dfbf5a9155b887cfd594f611edfc135f Author: Oliver Gould <ver@buoyant.io> Date: Thu May 16 08:11:06 2019 -0700 Update mio to 0.6.17 (#257) To pick up tokio-rs/mio#939
commit b27dfb2d21aa8ca5466ea0edce17d27094ace7c1 Author: Takanori Ishibashi <takanori.1112@gmail.com> Date: Wed May 15 05:58:42 2019 +0900 updaes->updates (#250) Signed-off-by: Takanori Ishibashi <takanori.1112@gmail.com> commit 16441c25a9d423a6ab12b689b830d9ae3798fa00 Author: Eliza Weisman <eliza@buoyant.io> Date: Tue May 14 14:40:03 2019 -0700 Pass router::Config directly to router::Layer (#253) Currently, router `layer`s are constructed with a single argument, a type implementing `Recognize`. Then, the entire router stack is built with a `router::Config`. However, in #248, it became necessary to provide the config up front when constructing the `router::layer`, as the layer is used in a fallback layer. Rather than providing a separate type for a preconfigured layer, @olix0r suggested we simply change all router layers to accept the `Config` when they're constructed (see linkerd/linkerd2-proxy#248 (comment)). This branch changes `router::Layer` to accept the config up front. The `router::Stack` types `make` function now requires no arguments, and the implementation of `Service` for `Stack` can be called with any `T` (as the target is now ignored). Signed-off-by: Eliza Weisman <eliza@buoyant.io> commit b70c68d4504a362eac6a7828039a2e5c7fcd308a Author: Eliza Weisman <eliza@buoyant.io> Date: Wed May 15 13:14:04 2019 -0700 Load balancers fall back to ORIG_DST when no endpoints exist (#248) Currently, when no endpoints exist in the load balancer for a destination, we fail the request. This is because we expect endpoints to be discovered by both destination service queries _and_ DNS lookups, so if there are no endpoints for a destination, it is assumed to not exist. In #2661, we intend to remove the DNS lookup from the proxy and instead fall back to routing requests for which no endpoints exist in the destination service to their SO_ORIGINAL_DST IP address. This means that the current approach of failing requests when the load balancer has no endpoints will no longer work. This branch introduces a generic `fallback` layer, which composes a primary and secondary service builder into a new layer. The primary service can fail requests with an error type that propages the original request, allowing the fallback middleware to call the fallback service with the same request. Other errors returned by the primary service are still propagated upstream. In contrast to the approach used in #240, this fallback middleware is generic and not tied directly to a load balancer or a router, and can be used for other purposes in the future. It relies on the router cache eviction added in #247 to drain the router when it is not being used, rather than proactively destroying the router when endpoints are available for the lb, and re-creating it when they exist again. A new trait, `HasEndpointStatus`, is added in order to allow the discovery lookup to communicate the "no endpoints" state to the balancer. In addition, we add a new `Update::NoEndpoints` variant to `proxy::resolve::Update`, so that when the control plane sends a no endpoints update, we switch from the balancer to the no endpoints state _immediately_, rather than waiting for all the endpoints to be individually removed. When the balancer has no endpoints, it fails all requests with a fallback error, so that the fallback middleware A subsequent PR (#248) will remove the DNS lookups from the discovery module. Closes #240. Signed-off-by: Eliza Weisman <eliza@buoyant.io> commit 6525b0638ad18e74510f3156269e0613f237e2f5 Author: Zahari Dichev <zaharidichev@gmail.com> Date: Wed May 15 23:35:09 2019 +0300 Allow disabling tap by setting an env var (#252) This PR fixes #2811. Now if `LINKERD2_PROXY_TAP_DISABLED` is set, the tap is not served at all. The approach taken is that the `ProxyParts` is changed so the `control_listener` is now an `Option` that will be None if tap is disabled as this control_listener seems to be exclusively used to serve the tap. Feel free to suggest a better approach. Signed-off-by: Zahari Dichev <zaharidichev@gmail.com> commit 91f32db2ea6d74470fd689c713ff87dc7586222d Author: Zahari Dichev <zaharidichev@gmail.com> Date: Thu May 16 00:45:23 2019 +0300 Assert that outbound TLS works before identity is certified (#251) This commit introduces TLS capabilities to the support server as well as tests to ensure that outbound TLS works even when there is no verified certificate for the proxy yet. Fixes #2599 Signed-off-by: Zahari Dichev <zaharidichev@gmail.com> commit 45aadc6b1b28e6daea0c40e694a86ae518887d85 Author: Sean McArthur <sean@buoyant.io> Date: Wed May 15 14:25:39 2019 -0700 Update h2 to v0.1.19 Includes a couple HPACK fixes Signed-off-by: Sean McArthur <sean@buoyant.io> commit 3e0e00c6dfbf5a9155b887cfd594f611edfc135f Author: Oliver Gould <ver@buoyant.io> Date: Thu May 16 08:11:06 2019 -0700 Update mio to 0.6.17 (#257) To pick up tokio-rs/mio#939
This branch removes the fall-back-to-DNS behaviour from the proxy's destination module. Instead, when the Destination service returns "no endpoints", the proxy will route the request to its original destination, using the fallback behaviour introduced in #248. In addition, the proxy now no longer reconnects to the destination service for queries which receive a response with the Invalid Argument status code. After linkerd/linkerd2#2664, this status is used by the control plane to indicate names which should not query the destination service. I've added a new test for this. Furthermore (and unlike #245), this branch also removes the caching of service discovery lookups so that they can be shared between resolvers for authorities with the same canonical form. When we limited the number of concurrently in flight destination service queries to 100, reusing them was much more important; now it adds a lot of unnecessary complexity. The rewritten `control::destination` module is now much simpler and (hopefully) easier to understand. The `Resolver` holds a destination service `Client`, which is cloned every time a new `Resolution` is requested. Each active `Resolution` has a corresponding background task that drives the destination service query and sends updates through a channel to the `Resolver`. The background task is necessary so that updates are processed as they are recieved, rather than only when the `Resolution` is polled (i.e. when the load balancer is routing a request). In addition, the channel provides the queueing necessary to transform the control plane's update types (which contain multiple endpoint addresses) into multiple `tower-balance` updates for single endpoints, and to invalidate stale endpoints when the control plane connection is reestablished. When a control plane query is not necessary because the requested authority does not match the proxy's destination suffixes, the background task is not spawned and the `Resolution` remains permanently in the "no endpoints" state; and when an `InvalidArgument` error code is recieved, the background task terminates. A majority of the service discovery tests pass with no changes after this change. The following changes were necessary: * The test `outbound_updates_newer_services` was removed, as it tests that the same destination service query is shared between multiple services with the same authority (which is no longer the case). * The mock destination service was changed to send the `Unavailable` status code when it has run out of expected destinations, rather than `InvalidArgument`, as that code is used to indicate that a destination should never be reconnected. * New tests were added for falling back to ORIG_DST when no endpoints exist, and for not reconnecting after an `InvalidArgument` error. Closes #245. Closes linkerd/linkerd2#2661 Closes linkerd/linkerd2#2728 Signed-off-by: Eliza Weisman eliza@buoyant.io
Currently, when no endpoints exist in the load balancer for a
destination, we fail the request. This is because we expect endpoints to
be discovered by both destination service queries and DNS lookups, so
if there are no endpoints for a destination, it is assumed to not exist.
In linkerd/linkerd2#2661, we intend to remove the DNS lookup from the
proxy and instead fall back to routing requests for which no endpoints
exist in the destination service to their SO_ORIGINAL_DST IP address.
This means that the current approach of failing requests when the load
balancer has no endpoints will no longer work.
This branch introduces a generic
fallback
layer, which composes aprimary and secondary service builder into a new layer. The primary
service can fail requests with an error type that propages the original
request, allowing the fallback middleware to call the fallback service
with the same request. Other errors returned by the primary service are
still propagated upstream.
In contrast to the approach used in #240, this fallback middleware is
generic and not tied directly to a load balancer or a router, and can
be used for other purposes in the future. It relies on the router cache
eviction added in #247 to drain the router when it is not being used,
rather than proactively destroying the router when endpoints are
available for the lb, and re-creating it when they exist again.
A new trait,
HasEndpointStatus
, is added in order to allow thediscovery lookup to communicate the "no endpoints" state to the
balancer. In addition, we add a new
Update::NoEndpoints
variant toproxy::resolve::Update
, so that when the control plane sends a noendpoints update, we switch from the balancer to the no endpoints state
immediately, rather than waiting for all the endpoints to be
individually removed. When the balancer has no endpoints, it fails all
requests with a fallback error, so that the fallback middleware
A future commit will remove the DNS lookups from the discovery module.
Closes #240.
Signed-off-by: Eliza Weisman eliza@buoyant.io