
Smarter retry behavior/cross priority retries #3958

Closed
snowp opened this issue Jul 26, 2018 · 18 comments
Labels
design proposal (Needs design doc/proposal before implementation), stale (stalebot believes this issue/PR has not been touched recently)

Comments

@snowp
Contributor

snowp commented Jul 26, 2018

We're interested in having Envoy perform smarter retry behavior, where subsequent requests are sent to different hosts than the original request. More specifically, we'd be interested in having requests that route to priority P attempt to send retries to a priority P' != P even in the case where the first priority is completely healthy, but fall back to the original priority if no healthy alternatives exist. The use case is mainly to maintain compatibility with our existing legacy systems: it's an attempt to minimize the change in behavior for our internal users when switching over to routing through Envoy.

Got a few ideas on the top of my mind that might help fuel a discussion:

Configurable subsets on retry

One somewhat suboptimal option would be to use the subset lb and allow adjusting the metadata match on subsequent retries. I'm imagining something like:

route:
  metadata_match:
    filter_metadata:
      envoy.lb:
        metadata: "primary"
  cluster: some-cluster
  timeout: 1s
  retry_metadata_match:
    - envoy.lb:
        metadata: "secondary"
    - envoy.lb:
        metadata: "tertiary"

which would issue the first request to the primary subset, the first retry to secondary, second retry to tertiary and then loop back again to route to secondary (i.e. the retry list wraps around). With this we could add metadata to the hosts in each priority to allow us to exclude certain priorities (e.g. hosts not in P0 get labeled P0=false).

The problem I can see with this is that there are no guards against the subsets being empty, which would cause an immediate 503 and quickly exhaust the permitted number of retries. Presumably this could be solved by being able to query the load balancer to determine whether a subset is empty before trying to route to it, but I'm not sure how easy that would be to achieve. The other issue here is that we don't have knowledge of what priority was actually hit - we have to statically guess what priority to hit on the subsequent requests. For instance, if we have P0, P1, P2 and everything is healthy, requests with retries could hit P0, P1, P2, while if P0 is unhealthy, we'd hit P1, P1, P2.

Cross-priority failover

Another option I can think of would be more tightly coupled to priorities: simply keep track in the retry state of what priority was last tried, and attempt to select a host while ignoring the previous priority. Again this seems like it needs some way of checking whether there are available hosts when retrying (falling back to hitting the same priority), but it seems like a simpler API and would at least achieve what we're looking for. This approach handles the edge case mentioned in the previous paragraph, because we'd know that we hit P1 on the first request and be able to take that into account when selecting the next priority to route to.
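
As a rough, purely illustrative sketch (these types and helpers are made up for discussion, not existing Envoy APIs), the priority selection for a retry could look something like:

#include <cstdint>
#include <vector>

// Illustrative only, not an existing Envoy API: the retry state remembers
// which priority the previous attempt was routed to.
struct RetryAttemptState {
  int last_priority = -1; // -1 means no previous attempt
};

// Prefer any healthy priority other than the one we just tried; if the
// previous priority is the only healthy one, fall back to it.
// Assumes healthy_priorities is non-empty.
uint32_t choosePriorityForRetry(const RetryAttemptState& state,
                                const std::vector<uint32_t>& healthy_priorities) {
  for (uint32_t p : healthy_priorities) {
    if (static_cast<int>(p) != state.last_priority) {
      return p;
    }
  }
  // No healthy alternative exists, so reuse the previous priority.
  return healthy_priorities.front();
}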

Fully configurable retry routing

A last approach is less fleshed out but would involve being able to register custom retry implementations (like how custom filters can be registered). It could expose an API like

RetryRouteDecision routeRetry(const std::vector<RetryAttempt>& previous_attempts);

This would allow Envoy users to implement an arbitrary retry policy based on the previous attempts, and it could be configurable on the route level, something like:

retry_router:
  name: my_retry_router
  config:
    available_subsets: [ primary, secondary, tertiary]

This way any Envoy user could implement their own bespoke retry behavior. It would still be useful to be able to query the health of subsets/clusters etc. for the aforementioned reasons, but you could presumably add something like an incrementRetryCounter() to the API to let the implementation decide whether the retry counts towards the retry counters.
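
As a purely illustrative sketch of such a plugin (RetryAttempt, RetryRouteDecision and TieredRetryRouter are made-up names, not existing Envoy types), cycling through the configured subsets could look like:

#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Made-up types for illustration only.
struct RetryAttempt {
  std::string subset; // subset the attempt was routed to
  bool success;
};

struct RetryRouteDecision {
  std::string subset;        // subset to use for the next attempt
  bool count_towards_budget; // whether this attempt consumes a retry
};

class TieredRetryRouter {
public:
  explicit TieredRetryRouter(std::vector<std::string> subsets) : subsets_(std::move(subsets)) {}

  // Retries cycle through the non-primary subsets (secondary, tertiary,
  // secondary, ...), mirroring the wrap-around described earlier.
  RetryRouteDecision routeRetry(const std::vector<RetryAttempt>& previous_attempts) const {
    if (previous_attempts.empty() || subsets_.size() == 1) {
      return {subsets_.front(), true};
    }
    const std::size_t index = 1 + (previous_attempts.size() - 1) % (subsets_.size() - 1);
    return {subsets_[index], true};
  }

private:
  std::vector<std::string> subsets_;
};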

Happy to consider other ideas, these were just top of mind.

@snowp
Contributor Author

snowp commented Jul 26, 2018

A variant of the third option would be to use RetryState for this purpose instead of introducing a new type. It seems doable to pass additional data to shouldRetry and provide hooks to mutate the LoadBalancerContext values of the Router, allowing it to change the parameters used to select the next connection.

@mattklein123
Member

Note this is related to #2259. @snowp will read/respond to the full proposal in the next few days.

@mattklein123 added the design proposal (Needs design doc/proposal before implementation) label Jul 26, 2018
@mattklein123
Member

@snowp just catching up here. IMO we should probably do the "fully configurable retry routing" option with some built-in options such as "don't route to the same host," "don't route to the same priority," etc. I think this could be built very cleanly into the existing retry/router implementation and would be very flexible. WDYT? Basically, I would like to solve #2259 while also allowing folks to do more complex stuff if they want.

@snowp
Contributor Author

snowp commented Jul 30, 2018

Yeah that sounds reasonable to me. It would definitely satisfy our needs and seems generic enough that I'm sure others will find it useful.

Happy to work on this. Would it be natural to use the subset lb to implement this behavior? e.g. generate one subset per priority and when a request fails going to P0, we set metadata_match to priority = [P1, P2, ...] (using #3964)?

One thing that would be nice about tying this to the subset lb is that the work for host/priority would naturally extend to arbitrary subsets.

The main concern I see with using the subset lb is performance, especially with the recent subset lb perf concerns. Does maintaining all these extra subsets in order to do smarter retries make sense? Can this be better implemented in a different way?

@mattklein123
Member


Yeah I'm a little concerned about using the subset LB for this, at least in all cases. Or do you mean just for your plugin? The way I originally envisioned the "don't use the same host again for retry" behavior was to just pass the previous tried hosts as part of the LB context, and let the LB figure it out. So, potentially, should this plugin mechanism actually be a plugin that operates from within the LB context and could be somewhat generic depending on the LB actually used?

@snowp
Contributor Author

snowp commented Jul 30, 2018

Delegating the work to the LB makes sense. The subset lb looked attractive because it wouldn't require implementing this in every LB, but I imagine the perf benefits make it worth it. Looking over the code it doesn't seem all that bad to put this into the LB.

Not sure how easy this would be to support generically - it seems like we need special handling depending on what we're filtering on. For hosts we can choose a host set like normal, and if it contains the undesired host we can remove it. For priorities we might scale the remaining priority weights so that they sum to 100 in order to use the existing priority selection logic.
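
Purely as an illustration of the priority case (not existing Envoy code), the rescaling could be something like:

#include <cstddef>
#include <cstdint>
#include <vector>

// Illustration only: drop one priority's load and rescale the remaining
// priorities so the load roughly sums to 100 again (integer rounding is
// ignored here for simplicity).
std::vector<uint32_t> rescaleWithoutPriority(const std::vector<uint32_t>& per_priority_load,
                                             std::size_t excluded_priority) {
  uint32_t remaining = 0;
  for (std::size_t i = 0; i < per_priority_load.size(); ++i) {
    if (i != excluded_priority) {
      remaining += per_priority_load[i];
    }
  }
  if (remaining == 0) {
    // Nothing else carries load: fall back to the original distribution.
    return per_priority_load;
  }
  std::vector<uint32_t> rescaled(per_priority_load.size(), 0);
  for (std::size_t i = 0; i < per_priority_load.size(); ++i) {
    if (i != excluded_priority) {
      rescaled[i] = per_priority_load[i] * 100 / remaining;
    }
  }
  return rescaled;
}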

Maybe there's a generic "ignore hosts matching X" approach that's eluding me?

@mattklein123
Member

TBH I'm not sure of the best solution here without putting some dedicated thought time into it. There is going to be a tradeoff between flexibility and performance. For example, I could imagine a plugin that does something like:

  1. Some method that optionally pre-filters the hosts/priorities to LB on, given the LB context with list of previously tried hosts.
  2. Some method that optionally post-filters the LB result and says something like "try again."

This is not fleshed out, but I wonder if given something like this we could build it into the various LBs pretty easily? For most of the LBs this would be built into the base classes anyway and "just work." Anyway, maybe it's worth doing a bit of experimentation here and coming back with a firmer proposal?

@snowp
Contributor Author

snowp commented Jul 30, 2018

That sounds reasonable. At least for priorities I think it would have to be a pre-filter simply due to how priorities work: if you're trying to avoid P0 but P0 is healthy you'll always hit P0, so trying again won't help.

I'll spend some time playing around with it and report back when I have a better sense of how this would work. Thanks for all the ideas!

@snowp
Contributor Author

snowp commented Jul 31, 2018

After looking at the code for a while it's clear that 1) priorities need to be filtered before we attempt to select a host and that 2) filtering individual hosts before selection is hard because they're buried inside EdfSchedulers. As such I propose that we add the ability to filter out hosts during host selection (allowing filtering both before and after host selection) and wire up the router to specify these filters by inspecting the RetryState.

We add three functions to the LoadBalancingContext that can influence the hosts considered during host selection:

virtual absl::optional<std::function<bool(uint32_t, const Host&)>> preFilter() PURE;
virtual std::function<bool(const Host&)> postFilter() PURE;
virtual int maxSelectionAttempts() const PURE;

The first function allows one to optionally specify a priority filter. The presence of this would mean that we generate a temporary per_priority_load using the hosts remaining after filtering with the predicate. This will allow filtering out entire priorities or affecting priority spillover by excluding certain hosts, which could in turn affect the per-priority health %. Priority selection would then follow as usual, using the updated per_priority_load data. The preFilter function returns an optional because rebuilding the per_priority_load isn't free; one might want to return nothing here in order to use the existing load information.

The second filter will be used after chooseHost has selected a host to determine whether we should reattempt host selection, indicated by returning false. This is done so that we can avoid having to rebuild the schedulers with the filtered hosts, and it also makes this mechanism agnostic to the exact host selection mechanism. No optional is necessary here since a simple [](const Host&) { return true; } should be cheap to evaluate for the selected host.

We'll probably want to handle the case where applying the filter makes it impossible to find a new host. For instance, when routing to a cluster with only one host and a postFilter that excludes that host, we'll be unable to find a new candidate host. In that case we'll probably want to bail out of filtering hosts, falling back to using no filters. To make this behavior configurable we have the maxSelectionAttempts function, which returns the number of times we should attempt to re-select a host.
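
To spell out how postFilter and maxSelectionAttempts would interact, here's a self-contained toy sketch (the Host type and chooseHostOnce helper are stand-ins for the real scheduler, not actual Envoy code):

#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy stand-ins for the real types; this sketches the selection loop only.
struct Host {
  std::string address;
};

// Trivial round-robin stand-in for the real scheduler. Assumes hosts is non-empty.
Host chooseHostOnce(const std::vector<Host>& hosts, std::size_t& rr_index) {
  return hosts[rr_index++ % hosts.size()];
}

// Re-run selection while the post filter rejects the pick, up to the
// configured attempt budget, then fall back to the last selection.
Host chooseHostWithPostFilter(const std::vector<Host>& hosts,
                              const std::function<bool(const Host&)>& post_filter,
                              int max_selection_attempts) {
  std::size_t rr_index = 0;
  Host selected = chooseHostOnce(hosts, rr_index);
  for (int attempt = 1; attempt < max_selection_attempts && !post_filter(selected); ++attempt) {
    selected = chooseHostOnce(hosts, rr_index);
  }
  return selected;
}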

To configure this for retries, we can introduce a RetryHostFilterFactory which accepts the RetryState and returns the appropriate predicates. I imagine it would look something like this:

class RetryHostFilterFactory {
public:
  virtual absl::optional<std::function<bool(uint32_t, const Host&)>> preFilter(const RetryState&) PURE;
  virtual std::function<bool(const Host&)> postFilter(const RetryState&) PURE;
};

The router would be initialized with default filters for the initial request (empty optional for preFilter, always true for postFilter, and only one attempt). In doRetry(), it would call into the RetryHostFilterFactory to retrieve the predicates for this retry attempt and store them in order to return them when the LoadBalancerContext functions are called.
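
As a toy example of the kind of predicates such a factory could hand back (the Host and RetryState definitions below are stand-ins, not the real Envoy classes), a "skip already-attempted hosts" factory might look like:

#include <functional>
#include <set>
#include <string>

// Toy stand-ins for illustration; the real Host/RetryState types live in Envoy.
struct Host {
  std::string address;
};
struct RetryState {
  std::set<std::string> attempted_addresses; // addresses of hosts tried so far
};

// A minimal "skip already-attempted hosts" factory along these lines: no
// priority filter, and a post filter that rejects hosts we've already hit.
struct SkipAttemptedHostFilterFactory {
  std::function<bool(const Host&)> postFilter(const RetryState& retry_state) const {
    return [attempted = retry_state.attempted_addresses](const Host& host) {
      return attempted.count(host.address) == 0;
    };
  }
};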

To make this extensible we'd use the same kind of pattern that listeners/access logs/etc. use, and allow specifying a HostFilter by name. Something along the lines of:

retry_host_filter:
   name: envoy.retry_other_priority
   max_selection_attempts: 3
   config: {}

which would live on the RouteAction.RetryPolicy.

Rough order of implementation:

  • Add filter functions to LoadBalancingContext and implement logic in LoadBalancerBase to respect them. The router will only ever use the no-op defaults.
  • Allow registering arbitrary RetryHostFilterFactory implementations and have the router set the provided values on retry
  • Implement SkipAttemptedPriority filter
  • Implement SkipAttemptedHost filter

Happy to move this to a Google doc if it makes commenting easier.

@mattklein123
Member

@snowp in general I think this sounds great to me. For the retry policy, we might also want to allow specifying it at the virtual host level? Also, I'm thinking that we can include one or more built-in policies to solve the "don't retry same host" open issues.

@envoyproxy/maintainers any thoughts on this? @alyssawilk specifically would love your feedback.

@snowp
Contributor Author

snowp commented Aug 7, 2018

Is this in a good enough state that I can start working on the implementation? Or should I wait for more feedback?

@alyssawilk
Contributor

Ugh, sorry, I was out last week and am swamped by maintainer rotation this week :-(

Chiming in rather belatedly, I totally agree with Matt that an extensible/customization mechanism is the way to go. I suspect other folks will want to preserve legacy behaviors so making it easier to program custom host selection seems ideal.

I guess my only thought is rather than have a callback before and after host selection where you mutate state such as per_priority_load, would it make more sense to make the priority and host selection actually the pluggable bit, where the default action is to call the current code, but if a plug-in is configured that's called instead? Changing the internals before and after feels more fragile than allowing custom code to inspect and make its own decisions. No strong feelings so if you want to go with the Matt-approved design over my hand-waving that's fine. :-)

@mattklein123
Member

@snowp I'm fine for you to move forward. If you want to investigate @alyssawilk's thoughts first that's fine too. I agree with @alyssawilk that not having the before/after hooks would be better, though my concern is that the plugin would then have to do a lot more than what we're proposing it do with the above design.

@snowp
Contributor Author

snowp commented Aug 7, 2018

Seems like having the pre/post filters would be useful even if we allow overriding the entire host selection method: it'd let people move concerns such as "avoid this host/priority" out of the host selection algorithm, making that logic easier to reuse across implementations.

If a bunch of awesome host selection algorithms got upstreamed, I could add "retry other priority" to any of them without having them explicitly support that.

Additionally, I'd hate to have to somehow get ahold of the existing priority selection logic so that I can run through the same code with slight modifications.

I think I'll go with the currently approved design for now but open to pivoting if it ends up being too heavily coupled to the lb internals.

snowp added a commit to snowp/envoy that referenced this issue Aug 9, 2018
As an initial step towards envoyproxy#3958, this implements two mechanisms for
affecting the host selected by the load balancer:

1) prePrioritySelectionFilter, which is used to generate a different
per_priority_load table before determining what priority to route to
2) postHostSelectionFilter, which can be used to reject a selected
host and retry host selection (up to hostSelectionRetryCount times)

Both of these are defined on the LoadBalancerContext, and will
eventually be used by the router to affect the host selection strategy
based on retry state.

Signed-off-by: Snow Pettersen <snowp@squareup.com>
htuch pushed a commit that referenced this issue Aug 20, 2018

As an initial step towards #3958, this implements two mechanisms for
affecting the host selected by the load balancer during chooseHost:

1. prePrioritySelectionFilter, which is used to generate a different per_priority_load table before determining what priority to route to

2. postHostSelectionFilter, which can be used to reject a selected host and retry host selection (up to hostSelectionRetryCount times)

Both of these are defined on LoadBalancerContext, and will eventually be used by the router to affect the host selection strategy based on retry state.

The first mechanism is implemented in the LoadBalancerBase, while the second one is only implemented for EdfLoadBalancerBase and RandomLoadBalancer.

Signed-off-by: Snow Pettersen snowp@squareup.com

Risk Level: Medium, new feature that's not enabled. Default behavior should match existing behavior.
Testing: Unit tests
Docs Changes: N/A (feature not configurable)
Release Notes: N/A

Signed-off-by: Snow Pettersen <snowp@squareup.com>
snowp added a commit to snowp/envoy that referenced this issue Sep 5, 2018
As an initial step towards envoyproxy#3958, this implements two mechanisms for
affecting the host selected by the load balancer:

1) prePrioritySelectionFilter, which is used to generate a different
per_priority_load table before determining what priority to route to
2) postHostSelectionFilter, which can be used to reject a selected
host and retry host selection (up to hostSelectionRetryCount times)

Both of these are defined on the LoadBalancerContext, and will
eventually be used by the router to affect the host selection strategy
based on retry state.

Signed-off-by: Snow Pettersen <snowp@squareup.com>
@stale

stale bot commented Sep 6, 2018

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.

stale bot added the stale (stalebot believes this issue/PR has not been touched recently) label Sep 6, 2018
@cmluciano
Member

not stale, some work done in #4097

stale bot removed the stale (stalebot believes this issue/PR has not been touched recently) label Sep 10, 2018
zuercher pushed a commit that referenced this issue Sep 10, 2018
…points (#4212)

This adds the necessary configuration and interfaces to register
implementations of RetryPriority and RetryHostPredicate, which will
allow configuring smarter host selection during retries.

Part of #3958

Risk Level: low, api changes
Testing: n/a
Doc Changes: inline
Release Notes: n/a

Signed-off-by: Snow Pettersen <snowp@squareup.com>
htuch pushed a commit that referenced this issue Sep 14, 2018
Wires up route configuration to allow specifying what hosts should be
reattempted during retry host selection.
Risk Level: Medium, some changes made to the router. Otherwise new optional feature
Testing: unit and integration test
Docs Changes: n/a
Release Notes: n/a

Part of #3958

Signed-off-by: Snow Pettersen <snowp@squareup.com>
htuch pushed a commit that referenced this issue Sep 18, 2018
Plumbs through the max_host_selection_count parameter from retry policy config -> router
Risk Level: Low
Testing: UT
Docs Changes: n/a
Release Notes: n/a
Part of #3958

Signed-off-by: Snow Pettersen <snowp@squareup.com>
htuch pushed a commit that referenced this issue Sep 23, 2018
This wires up the necessary logic to allow registering a
RetryPriorityFactory that can be used to impact which priority is
selected during host selection for retry attempts.

Signed-off-by: Snow Pettersen snowp@squareup.com

Description:
Risk Level: Low, new optional feature
Testing: Integration test
Docs Changes: n/a
Release Notes: n/a
Part of #3958

Signed-off-by: Snow Pettersen <snowp@squareup.com>
htuch pushed a commit that referenced this issue Sep 23, 2018
Adds a simple RetryHostPredicate that keeps track of hosts that have
already been attempted, triggering a new host to be selected if an
already attempted host is selected.

Signed-off-by: Snow Pettersen snowp@squareup.com

Risk Level: Low
Testing: Unit test
Docs Changes: n/a
Release Notes: n/a
Part of #3958

Signed-off-by: Snow Pettersen <snowp@squareup.com>
@stale

stale bot commented Oct 10, 2018

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.

stale bot added the stale (stalebot believes this issue/PR has not been touched recently) label Oct 10, 2018
@snowp
Contributor Author

snowp commented Oct 10, 2018

Both host selection and retry priority modifications have been implemented, including example implementations.

@snowp closed this as completed Oct 10, 2018