
Smarter retry behavior/cross priority retries #3958

Closed
snowp opened this issue Jul 26, 2018 · 18 comments
Labels
design proposal (Needs design doc/proposal before implementation), stale (stalebot believes this issue/PR has not been touched recently)

Comments

@snowp
Contributor

snowp commented Jul 26, 2018

We're interested in having Envoy perform smarter retry behavior, where subsequent requests are sent to different hosts than the original request. More specifically, we'd be interested in having requests that route to priority P attempt to send retries to a priority P' != P even in the case where the first priority is completely healthy, but fall back to the original priority if no healthy alternatives exist. The use case is mainly to maintain compatibility with our existing legacy systems: it's an attempt to minimize the change in behavior for our internal users when switching over to routing through Envoy.

Got a few ideas on the top of my mind that might help fuel a discussion:

Configurable subsets on retry

One somewhat suboptimal option would be to use the subset lb and allow adjusting the metadata match on subsequent retries. I'm imagining something like:

route:
  metadata_match:
    filter_metadata:
      envoy.lb:
        metadata: "primary"
  cluster: some-cluster
  timeout: 1s
  retry_metadata_match:
    - envoy.lb:
        metadata: "secondary"
    - envoy.lb:
        metadata: "tertiary"

which would issue the first request to the primary subset, the first retry to secondary, second retry to tertiary and then loop back again to route to secondary (i.e. the retry list wraps around). With this we could add metadata to the hosts in each priority to allow us to exclude certain priorities (e.g. hosts not in P0 get labeled P0=false).

The problem I can see with this is that there are no guards against the subsets being empty, which would cause an immediate 503 and quickly exhaust the permitted number of retries. Presumably this could be solved by being able to query the load balancer to determine whether a subset is empty before trying to route to it, but I'm not sure how easy that would be to achieve. The other issue here is that we don't have knowledge of what priority was actually hit - we have to statically guess what priority to hit on the subsequent requests. For instance, if we have P0, P1, P2 and everything is healthy, requests with retries could hit P0, P1, P2, while if P0 is unhealthy, we'd hit P1, P1, P2.

Cross-priority failover

Another option I can think of would be more tightly coupled to priorities: simply keep track in the retry state of what priority was last tried, and attempt to select a host while ignoring the previous priority. Again this seems like it needs some way of checking whether there are available hosts when retrying (falling back to hitting the same priority), but it seems like a simpler API and would at least achieve what we're looking for. This approach handles the edge case mentioned in the previous paragraph, because we'd know that we hit P1 on the first request and be able to take that into account when selecting the next priority to route to.
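
As a rough, purely illustrative sketch (these types and helpers are made up for discussion, not existing Envoy APIs), the priority selection for a retry could look something like:

#include <cstdint>
#include <vector>

// Illustrative only, not an existing Envoy API: the retry state remembers
// which priority the previous attempt was routed to.
struct RetryAttemptState {
  int last_priority = -1; // -1 means no previous attempt
};

// Prefer any healthy priority other than the one we just tried; if the
// previous priority is the only healthy one, fall back to it.
// Assumes healthy_priorities is non-empty.
uint32_t choosePriorityForRetry(const RetryAttemptState& state,
                                const std::vector<uint32_t>& healthy_priorities) {
  for (uint32_t p : healthy_priorities) {
    if (static_cast<int>(p) != state.last_priority) {
      return p;
    }
  }
  // No healthy alternative exists, so reuse the previous priority.
  return healthy_priorities.front();
}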

Fully configurable retry routing

A last approach is less fleshed out but would involve being able to register custom retry implementations (like how custom filters can be registered). It could expose an API like

RetryRouteDecision routeRetry(const std::vector<RetryAttempt>& previous_attempts);

This would allow Envoy users to implement an arbitrary retry policy based on the previous attempts, and it could be configurable on the route level, something like:

retry_router:
  name: my_retry_router
  config:
    available_subsets: [ primary, secondary, tertiary]

This way any Envoy user could implement their own bespoke retry behavior. It would still be useful to be able to query the health of subsets/clusters etc. for the aforementioned reasons, but you could presumably add something like an incrementRetryCounter() to the API to let the implementation decide whether the retry counts towards the retry counters.
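
As a purely illustrative sketch of such a plugin (RetryAttempt, RetryRouteDecision and TieredRetryRouter are made-up names, not existing Envoy types), cycling through the configured subsets could look like:

#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Made-up types for illustration only.
struct RetryAttempt {
  std::string subset; // subset the attempt was routed to
  bool success;
};

struct RetryRouteDecision {
  std::string subset;        // subset to use for the next attempt
  bool count_towards_budget; // whether this attempt consumes a retry
};

class TieredRetryRouter {
public:
  explicit TieredRetryRouter(std::vector<std::string> subsets) : subsets_(std::move(subsets)) {}

  // Retries cycle through the non-primary subsets (secondary, tertiary,
  // secondary, ...), mirroring the wrap-around described earlier.
  RetryRouteDecision routeRetry(const std::vector<RetryAttempt>& previous_attempts) const {
    if (previous_attempts.empty() || subsets_.size() == 1) {
      return {subsets_.front(), true};
    }
    const std::size_t index = 1 + (previous_attempts.size() - 1) % (subsets_.size() - 1);
    return {subsets_[index], true};
  }

private:
  std::vector<std::string> subsets_;
};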

Happy to consider other ideas, these were just top of mind.

@snowp
Contributor Author

snowp commented Jul 26, 2018

A variant of the third option would be to use RetryState for this purpose instead of introducing a new type. It seems doable to pass additional data to shouldRetry and provide hooks to mutate the LoadBalancerContext values of the Router, allowing it to change the parameters used to select the next connection.

@mattklein123
Member

Note this is related to #2259. @snowp will read/respond to the full proposal in the next few days.

@mattklein123 added the design proposal (Needs design doc/proposal before implementation) label Jul 26, 2018
@mattklein123
Member

@snowp just catching up here. IMO we should probably do the "fully configurable retry routing" option with some built-in options such as "don't route to the same host," "don't route to the same priority," etc. I think this could be built very cleanly into the existing retry/router implementation and would be very flexible. WDYT? Basically, I would like to solve #2259 while also allowing folks to do more complex stuff if they want.

@snowp
Contributor Author

snowp commented Jul 30, 2018

Yeah that sounds reasonable to me. It would definitely satisfy our needs and seems generic enough that I'm sure others will find it useful.

Happy to work on this. Would it be natural to use the subset lb to implement this behavior? e.g. generate one subset per priority and when a request fails going to P0, we set metadata_match to priority = [P1, P2, ...] (using #3964)?

One thing that would be nice about tying this to the subset lb is that the work for host/priority would naturally extend to arbitrary subsets.

The main concern I see with using the subset lb is performance, especially with the recent subset lb perf concerns. Does maintaining all these extra subsets in order to do smarter retries make sense? Can this be better implemented in a different way?

@mattklein123
Member


Yeah I'm a little concerned about using the subset LB for this, at least in all cases. Or do you mean just for your plugin? The way I originally envisioned the "don't use the same host again for retry" behavior was to just pass the previous tried hosts as part of the LB context, and let the LB figure it out. So, potentially, should this plugin mechanism actually be a plugin that operates from within the LB context and could be somewhat generic depending on the LB actually used?

@snowp
Contributor Author

snowp commented Jul 30, 2018

Delegating the work to the LB makes sense. The subset lb looked attractive because it wouldn't require implementing this in every LB, but I imagine the perf benefits make it worth it. Looking over the code it doesn't seem all that bad to put this into the LB.

Not sure how easy this would be to support generically - it seems like we need special handling depending on what we're filtering on. For hosts we can choose a host set like normal, and if it contains the undesired host we can remove it. For priorities we might scale the remaining priority weights so that they sum to 100 in order to use the existing priority selection logic.
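
Purely as an illustration of the priority case (not existing Envoy code), the rescaling could be something like:

#include <cstddef>
#include <cstdint>
#include <vector>

// Illustration only: drop one priority's load and rescale the remaining
// priorities so the load roughly sums to 100 again (integer rounding is
// ignored here for simplicity).
std::vector<uint32_t> rescaleWithoutPriority(const std::vector<uint32_t>& per_priority_load,
                                             std::size_t excluded_priority) {
  uint32_t remaining = 0;
  for (std::size_t i = 0; i < per_priority_load.size(); ++i) {
    if (i != excluded_priority) {
      remaining += per_priority_load[i];
    }
  }
  if (remaining == 0) {
    // Nothing else carries load: fall back to the original distribution.
    return per_priority_load;
  }
  std::vector<uint32_t> rescaled(per_priority_load.size(), 0);
  for (std::size_t i = 0; i < per_priority_load.size(); ++i) {
    if (i != excluded_priority) {
      rescaled[i] = per_priority_load[i] * 100 / remaining;
    }
  }
  return rescaled;
}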

Maybe there's a generic "ignore hosts matching X" approach that's eluding me?

@mattklein123
Member

TBH I'm not sure of the best solution here without putting some dedicated thought time into it. There is going to be a tradeoff between flexibility and performance. For example, I could imagine a plugin that does something like:

  1. Some method that optionally pre-filters the hosts/priorities to LB on, given the LB context with list of previously tried hosts.
  2. Some method that optionally post-filters the LB result and says something like "try again."

This is not fleshed out, but I wonder if given something like this we could build it into the various LBs pretty easily? For most of the LBs this would be built into the base classes anyway and "just work." Anyway, maybe it's worth doing a bit of experimentation here and coming back with a firmer proposal?

@snowp
Contributor Author

snowp commented Jul 30, 2018

That sounds reasonable. At least for priorities I think it would have to be a pre-filter simply due to how priorities work: if you're trying to avoid P0 but P0 is healthy you'll always hit P0, so trying again won't help.

I'll spend some time playing around with it and report back when I have a better sense of how this would work. Thanks for all the ideas!

@snowp
Contributor Author

snowp commented Jul 31, 2018

After looking at the code for a while it's clear that 1) priorities need to be filtered before we attempt to select a host and that 2) filtering individual hosts before selection is hard because they're buried inside EdfSchedulers. As such I propose that we add the ability to filter out hosts during host selection (allowing filtering both before and after host selection) and wire up the router to specify these filters by inspecting the RetryState.

We add three functions to the LoadBalancingContext that can influence the hosts considered during host selection:

virtual absl::optional<std::function<bool(uint32_t, const Host&)>> preFilter() PURE;
virtual std::function<bool(const Host&)> postFilter() PURE;
virtual int maxSelectionAttempts() const PURE;

The first function allows one to optionally specify a priority filter. The presence of this would mean that we generate a temporary per_priority_load using the hosts remaining after filtering with the predicate. This will allow filtering out entire priorities or affecting priority spillover by excluding certain hosts, which could in turn affect the per-priority health %. Priority selection would then follow as usual, using the updated per_priority_load data. The preFilter function returns an optional because rebuilding the per_priority_load isn't free; one might want to return nothing here in order to use the existing load information.

The second filter will be used after chooseHost has selected a host to determine whether we should reattempt host selection, indicated by returning false. This is done so that we can avoid having to rebuild the schedulers with the filtered hosts, and it also makes this mechanism agnostic to the exact host selection mechanism. No optional is necessary here since a simple [](const Host&) { return true; } should be cheap to evaluate for the selected host.

We'll probably want to handle the case where applying the filter makes it impossible to find a new host. For instance, when routing to a cluster with only one host and a postFilter that excludes that host, we'll be unable to find a new candidate host. In that case we'll probably want to bail out of filtering hosts, falling back to using no filters. To make this behavior configurable we have the maxSelectionAttempts function, which returns the number of times we should attempt to re-select a host.
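
To spell out how postFilter and maxSelectionAttempts would interact, here's a self-contained toy sketch (the Host type and chooseHostOnce helper are stand-ins for the real scheduler, not actual Envoy code):

#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy stand-ins for the real types; this sketches the selection loop only.
struct Host {
  std::string address;
};

// Trivial round-robin stand-in for the real scheduler. Assumes hosts is non-empty.
Host chooseHostOnce(const std::vector<Host>& hosts, std::size_t& rr_index) {
  return hosts[rr_index++ % hosts.size()];
}

// Re-run selection while the post filter rejects the pick, up to the
// configured attempt budget, then fall back to the last selection.
Host chooseHostWithPostFilter(const std::vector<Host>& hosts,
                              const std::function<bool(const Host&)>& post_filter,
                              int max_selection_attempts) {
  std::size_t rr_index = 0;
  Host selected = chooseHostOnce(hosts, rr_index);
  for (int attempt = 1; attempt < max_selection_attempts && !post_filter(selected); ++attempt) {
    selected = chooseHostOnce(hosts, rr_index);
  }
  return selected;
}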

To configure this for retries, we can introduce a RetryHostFilterFactory which accepts the RetryState and returns the appropriate predicates. I imagine it would look something like this:

class RetryHostFilterFactory {
public:
  virtual absl::optional<std::function<bool(uint32_t, const Host&)>> preFilter(const RetryState&) PURE;
  virtual std::function<bool(const Host&)> postFilter(const RetryState&) PURE;
};

The router would be initialized with default filters for the initial request (empty optional for preFilter, always true for postFilter, and only one attempt). In doRetry(), it would call into the RetryHostFilterFactory to retrieve the predicates for this retry attempt and store them in order to return them when the LoadBalancerContext functions are called.
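
As a toy example of the kind of predicates such a factory could hand back (the Host and RetryState definitions below are stand-ins, not the real Envoy classes), a "skip already-attempted hosts" factory might look like:

#include <functional>
#include <set>
#include <string>

// Toy stand-ins for illustration; the real Host/RetryState types live in Envoy.
struct Host {
  std::string address;
};
struct RetryState {
  std::set<std::string> attempted_addresses; // addresses of hosts tried so far
};

// A minimal "skip already-attempted hosts" factory along these lines: no
// priority filter, and a post filter that rejects hosts we've already hit.
struct SkipAttemptedHostFilterFactory {
  std::function<bool(const Host&)> postFilter(const RetryState& retry_state) const {
    return [attempted = retry_state.attempted_addresses](const Host& host) {
      return attempted.count(host.address) == 0;
    };
  }
};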

To make this extensible we'd use the same kind of pattern that listeners/access logs/etc. use, and allow specifying a HostFilter by name. Something along the lines of:

retry_host_filter:
   name: envoy.retry_other_priority
   max_selection_attempts: 3
   config: {}

which would live on the RouteAction.RetryPolicy.

Rough order of implementation:

  • Add filter functions to LoadBalancingContext and implement logic in LoadBalancerBase to respect them. The router will only ever use the no-op defaults.
  • Allow registering arbitrary RetryHostFilterFactory implementations and have the router set the provided values on retry
  • Implement SkipAttemptedPriority filter
  • Implement SkipAttemptedHost filter

Happy to move this to a Google doc if it makes commenting easier.

@mattklein123
Member

@snowp in general I think this sounds great to me. For the retry policy, we might also want to allow specifying it at the virtual host level? Also, I'm thinking that we can include one or more built-in policies to solve the "don't retry same host" open issues.

@envoyproxy/maintainers any thoughts on this? @alyssawilk specifically would love your feedback.

@snowp
Contributor Author

snowp commented Aug 7, 2018

Is this in a good enough state that I can start working on the implementation? Or should I wait for more feedback?

@alyssawilk
Contributor

Ugh, sorry, I was out last week and am swamped by maintainer rotation this week :-(

Chiming in rather belatedly, I totally agree with Matt that an extensible/customization mechanism is the way to go. I suspect other folks will want to preserve legacy behaviors so making it easier to program custom host selection seems ideal.

I guess my only thought is rather than have a callback before and after host selection where you mutate state such as per_priority_load, would it make more sense to make the priority and host selection actually the pluggable bit, where the default action is to call the current code, but if a plug-in is configured that's called instead? Changing the internals before and after feels more fragile than allowing custom code to inspect and make its own decisions. No strong feelings so if you want to go with the Matt-approved design over my hand-waving that's fine. :-)

@mattklein123
Member

@snowp I'm fine for you to move forward. If you want to investigate @alyssawilk's thoughts first that's fine too. I agree with @alyssawilk that not having the before/after hooks would be better, though my concern is that the plugin would then have to do a lot more than what we're proposing it do with the above design.

@snowp
Contributor Author

snowp commented Aug 7, 2018

Seems like having the pre/post filters would be useful even if we allow overriding the entire host selection method: it'd let people move concerns such as "avoid this host/priority" out of the host selection algorithm, making that logic easier to reuse across implementations.

If a bunch of awesome host selection algorithms got upstreamed, I could add "retry other priority" to any of them without having them explicitly support that.

Additionally, I'd hate to have to somehow get ahold of the existing priority selection logic so that I can run through the same code with slight modifications.

I think I'll go with the currently approved design for now but open to pivoting if it ends up being too heavily coupled to the lb internals.

snowp added a commit to snowp/envoy that referenced this issue Aug 9, 2018
As an initial step towards envoyproxy#3958, this implements two mechanisms for
affecting the host selected by the load balancer:

1) prePrioritySelectionFilter, which is used to generate a different
per_priority_load table before determining what priority to route to
2) postHostSelectionFilter, which can be used to reject a selected
host and retry host selection (up to hostSelectionRetryCount times)

Both of these are defined on the LoadBalancerContext, and will
eventually be used by the router to affect the host selection strategy
based on retry state.

Signed-off-by: Snow Pettersen <snowp@squareup.com>
htuch pushed a commit that referenced this issue Aug 20, 2018

As an initial step towards #3958, this implements two mechanisms for
affecting the host selected by the load balancer during chooseHost:

1. prePrioritySelectionFilter, which is used to generate a different per_priority_load table before determining what priority to route to

2. postHostSelectionFilter, which can be used to reject a selected host and retry host selection (up to hostSelectionRetryCount times)

Both of these are defined on LoadBalancerContext, and will eventually be used by the router to affect the host selection strategy based on retry state.

The first mechanism is implemented in the LoadBalancerBase, while the second one is only implemented for EdfLoadBalancerBase and RandomLoadBalancer.

Signed-off-by: Snow Pettersen snowp@squareup.com

Risk Level: Medium, new feature that's not enabled. Default behavior should match existing behavior.
Testing: Unit tests
Docs Changes: N/A (feature not configurable)
Release Notes: N/A

Signed-off-by: Snow Pettersen <snowp@squareup.com>
snowp added a commit to snowp/envoy that referenced this issue Sep 5, 2018
As an initial step towards envoyproxy#3958, this implements two mechanisms for
affecting the host selected by the load balancer:

1) prePrioritySelectionFilter, which is used to generate a different
per_priority_load table before determining what priority to route to
2) postHostSelectionFilter, which can be used to reject a selected
host and retry host selection (up to hostSelectionRetryCount times)

Both of these are defined on the LoadBalancerContext, and will
eventually be used by the router to affect the host selection strategy
based on retry state.

Signed-off-by: Snow Pettersen <snowp@squareup.com>
@stale

stale bot commented Sep 6, 2018

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.

stale bot added the stale (stalebot believes this issue/PR has not been touched recently) label Sep 6, 2018
@cmluciano
Member

not stale, some work done in #4097

stale bot removed the stale (stalebot believes this issue/PR has not been touched recently) label Sep 10, 2018
zuercher pushed a commit that referenced this issue Sep 10, 2018
…points (#4212)

This adds the necessary configuration and interfaces to register
implementations of RetryPriority and RetryHostPredicate, which will
allow configuring smarter host selection during retries.

Part of #3958

Risk Level: low, api changes
Testing: n/a
Doc Changes: inline
Release Notes: n/a

Signed-off-by: Snow Pettersen <snowp@squareup.com>
htuch pushed a commit that referenced this issue Sep 14, 2018
Wires up route configuration to allow specifying what hosts should be
reattempted during retry host selection.
Risk Level: Medium, some changes made to the router. Otherwise new optional feature
Testing: unit and integration test
Docs Changes: n/a
Release Notes: n/a

Part of #3958

Signed-off-by: Snow Pettersen <snowp@squareup.com>
htuch pushed a commit that referenced this issue Sep 18, 2018
Plumbs through the max_host_selection_count parameter from retry policy config -> router
Risk Level: Low
Testing: UT
Docs Changes: n/a
Release Notes: n/a
Part of #3958

Signed-off-by: Snow Pettersen <snowp@squareup.com>
htuch pushed a commit that referenced this issue Sep 23, 2018
This wires up the necessary logic to allow registering a
RetryPriorityFactory that can be used to impact which priority is
selected during host selection for retry attempts.

Signed-off-by: Snow Pettersen snowp@squareup.com

Description:
Risk Level: Low, new optional feature
Testing: Integration test
Docs Changes: n/a
Release Notes: n/a
Part of #3958

Signed-off-by: Snow Pettersen <snowp@squareup.com>
htuch pushed a commit that referenced this issue Sep 23, 2018
Adds a simple RetryHostPredicate that keeps track of hosts that have
already been attempted, triggering a new host to be selected if an
already attempted host is selected.

Signed-off-by: Snow Pettersen snowp@squareup.com

Risk Level: Low
Testing: Unit test
Docs Changes: n/a
Release Notes: n/a
Part of #3958

Signed-off-by: Snow Pettersen <snowp@squareup.com>
@stale

stale bot commented Oct 10, 2018

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.

stale bot added the stale (stalebot believes this issue/PR has not been touched recently) label Oct 10, 2018
@snowp
Contributor Author

snowp commented Oct 10, 2018

Both host selection and retry priority modifications have been implemented, including example implementations.

@snowp closed this as completed Oct 10, 2018