Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add basic "batteries-included" retry::Policys. #414

Closed
wants to merge 2 commits into from
Closed

Add basic "batteries-included" retry::Policys. #414

wants to merge 2 commits into from

Conversation

hdevalence
Copy link
Contributor

These are lifted out of the test and example code. Two policies are provided:

  • RetryLimit, with a bounded number of retry attempts;
  • RetryError, with unbounded retry attempts.

Both policies require Req: Clone as it's not possible to write a generic
retry policy without being able to clone requests.

The docs should be updated with a little blurb that explains when they should
not be used. Also, the Policy docs had an Attempts example corresponding
to RetryLimit (formerly Limit in the tests code); I removed it because
RetryLimit exists and can be view-sourced.

These are lifted out of the test and example code.  Two policies are provided:

* `RetryLimit`, with a bounded number of retry attempts;
* `RetryError`, with unbounded retry attempts.

Both policies require `Req: Clone` as it's not possible to write a generic
retry policy without being able to clone requests.

The docs should be updated with a little blurb that explains when they should
not be used.  Also, the `Policy` docs had an `Attempts` example corresponding
to `RetryLimit` (formerly `Limit` in the tests code); I removed it because
`RetryLimit` exists and can be view-sourced.
impl<Req: Clone, Res, E> Policy<Req, Res, E> for RetryLimit {
type Future = future::Ready<Self>;
fn retry(&self, _: &Req, result: Result<&Res, &E>) -> Option<Self::Future> {
if result.is_err() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there is a way to traitize this? For example, http retries you may want to retry 500s but not 400s. In this case, if I were to apply this to hyper I would only be able to retry system errors not http errors. What do you think about providing some method to allow users to specify what they want to retry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One possibility would be to construct the RetryLimit with a closure F: FnMut(&E) -> bool; then callers who wanted to retry all errors could construct it as, e.g.,

let policy = RetryLimit::new(2, |_| true);

and callers who want to filter retries can stick whatever logic they want in the closure. (Does that seem like the right bound for the closure?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though since this only returns an error that wouldn't support retrying on http::Responses?

I think we could use some trait and a TraitFn adapter for a closure, what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm not sure what kind of trait you're thinking of. I think I'm missing something: if the user-supplied logic has access to the whole response and error, how would that be different from just implementing Policy::retry directly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is closer to the related vector retrylogic thing I sent on discord, but the idea is that as a user I can just provide what i'd like to retry on instead of having to implement a retry policy myself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there is a way to traitize this? For example, http retries you may want to retry 500s but not 400s.

For example, finagle uses a ResponseClassifier in a similar way.

hdevalence added a commit to ZcashFoundation/zebra that referenced this pull request Feb 11, 2020
This should be removed when tower-rs/tower#414 lands
but is good enough for our purposes for now.
dconnolly pushed a commit to ZcashFoundation/zebra that referenced this pull request Feb 11, 2020
This should be removed when tower-rs/tower#414 lands
but is good enough for our purposes for now.
Copy link
Member

@LucioFranco LucioFranco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see a simple responseclassifer trait that could allow the user to pass in their own way to determine when to retry, otherwise this looks good to merge. Let me know if you have any questions.

@jonhoo jonhoo added A-retry Area: The tower "retry" middleware C-enhancement Category: A PR with an enhancement or a proposed on in an issue. S-waiting-on-author Status: awaiting some action (such as code changes) from the PR or issue author. labels Mar 31, 2020
@hawkw
Copy link
Member

hawkw commented Jan 13, 2021

@hdevalence any interest in wrapping this up, or should we just close the PR?

@hdevalence
Copy link
Contributor Author

I'd be happy to wrap this up, but I don't really know what the "good" design would be -- the current PR is just what worked for my needs. But, if there was clarity about what changes would be good, I'd be happy to make them, and I think it would be convenient if something like this was included in Tower.

@hawkw
Copy link
Member

hawkw commented Feb 2, 2021

I'd be happy to wrap this up, but I don't really know what the "good" design would be -- the current PR is just what worked for my needs. But, if there was clarity about what changes would be good, I'd be happy to make them, and I think it would be convenient if something like this was included in Tower.

I wonder if @LucioFranco has any ideas about this.

@LucioFranco
Copy link
Member

@rcoh maybe you can chime in here a bit?

@rcoh
Copy link
Contributor

rcoh commented Feb 13, 2021

I don't necessarily want to impose our retry needs on others, but as a data point here are things we need:

  • Classifier interface that has access to the full Result<T, E> (and not just E)
  • The ability for the retry service to modify the responses, eg. set metadata about the number of retries required, set a custom error if you ran out of retries, etc. I sketched a possible design for this in Retry Policy POC #546

Our retry policy behavior (around number of retries, backoff length, etc.) is far too complex to be well served by something shared, unfortunately, so I don't think we have much need for batteries included policies.

@hdevalence
Copy link
Contributor Author

Those things are all possible using the existing retry trait, right? And, if they're not, that's an issue with the trait itself, not with any default policies, correct?

My goal with this PR was just to include some basic implementations that are sufficient for simple retry logic — of course, if there are more complex use cases, nothing prevents a user from implementing the trait themselves, but I'm not sure that that should mean that it's a bad idea to include some simpler default behavior.

Copy link
Member

@davidpdrsn davidpdrsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My 2 cents: I think this is worth merging (once the FIXMEs are fixed) as simple retry policies people can use if they want.

I also think some kind of response classifier is useful and we've actually been designing something over in tower-http https://github.com/tower-rs/tower-http/blob/classify-response/tower-http/src/classify.rs. The traits themselves aren't protocol specific so they could in theory live in tower. I think thats a separate discussion however.

@davidpdrsn
Copy link
Member

@hdevalence sorry this has taken so long to land.

My opinion from 2 months still lands. If we fix the small docs things I would be okay with merging this. I think the discussion about whether or not to change Retry in a breaking way is a separate discussion.

Do you wanna drive this home or should I take over?

@hdevalence
Copy link
Contributor Author

@davidpdrsn I think I've lost context on this one, so feel free to push it over the finish line.

@hdevalence hdevalence closed this Oct 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-retry Area: The tower "retry" middleware C-enhancement Category: A PR with an enhancement or a proposed on in an issue. S-waiting-on-author Status: awaiting some action (such as code changes) from the PR or issue author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants