Expose Conflict errors and allow wrapping get/replace with backoff #34

clux · 2019-05-31T08:33:31Z

client-go encourages RetryOnConflict which is a reasonable pattern - code's also not that advanced.

Rust has:

backoff
retry - with tokio version

We could probably make wrappers around that..

AFAIKT it's just a wrapper around a deliberate error match case, where the backoff continues. That should be easy to emulate now that we've exposed api error helpers on our Error type. Backoff seems like a good fit atm.

The text was updated successfully, but these errors were encountered:

clux · 2020-07-28T14:58:06Z

https://docs.rs/backoff/0.2.1/backoff/future/fn.retry.html looks pretty nice now. Want to get this into one of the bigger examples and close this.

clux · 2020-07-28T15:11:21Z

Actually, I guess the key point here is to figure out where to use this. If we just put retry inside the reconciliation loop it wouldn't matter much, because that kind of has backoff built in.

However, as the go side lets you wrap a RetryOnConflict to retry the get + replace until it works. That should be less of an issue with server side apply. Still, a util that wraps Api::replace with something like that might be useful.

nightkr · 2020-07-28T16:17:20Z

IMO the "obvious" place to put this would be to provide it as an error_policy helper. The current controller machinery isn't well-suited for long-running reconcilers, since that would block reconciling for all watched objects, not just that one. The main problem then would be to have a good way to associate the controller::Error back to the ObjectRef, but that would be nice to have regardless.

nightkr · 2020-07-28T17:03:16Z

Maybe consider also adding backoff to watcher, actually..

kazk · 2021-06-07T21:42:32Z

Maybe try RetryLayer? It's per request and Policy looks flexible (return Option<Future<Policy>> based on request/response).

clux · 2021-06-07T22:22:53Z

Oh, that's nice! We could even use the Policy to do error matching and globally set RetryOnConflict equivalents if we modified Api::replace methods to optionally do an Api::get right before to get the latest resource version.

nightkr · 2021-06-07T22:54:18Z

Maybe try RetryLayer? It's per request and Policy looks flexible (return Option<Future> based on request/response).

Isn't the layer stack decided when building the Client, IIRC?

This needs cooperation from the user code, since only the user knows what exact changes were made, and why. At the very least, you'd need to replace the current Api<K>::replace(new: K) with a Api<K>::update(mutate: impl FnMut(K) -> K), which could call mutate on every conflict.

We could even use the Policy to do error matching and globally set RetryOnConflict equivalents if we modified Api::replace methods to optionally do an Api::get right before to get the latest resource version.

If you want to steamroll the conflicts (which is the only thing that we can do with our current API) then you don't need any of this machinery anyway; just set resource_version to None.

In order to actually handle the conflicts, you need to understand the mutation. For example, with Controller you'd let it crash and retry the whole reconciliation. I'm not saying Controller is right for every use case, but I think that's roughly the level where you'd need to hook in conflict resolution in any case.

Of course, there are still transient errors that do warrant transparent retries (network errors, and so on). But it's important that we don't get overzealous.

kazk · 2021-06-08T02:49:39Z

Yeah, sorry, I didn't understand the requirements and the Policy trait correctly. We can't use RetryLayer for this. (RetryLayer can be used for other retries, though. e.g., retrying GET requests on Retry-After response like client-go.)

This needs cooperation from the user code, since only the user knows what exact changes were made, and why.

Yeah, I thought users can provide their Policy when creating a client used by the controller. But that Policy won't have enough information to do what we want, and the trait doesn't allow it either.

clux · 2021-11-21T22:27:28Z

Going to close this. It's not worth doing anything smart for get/replace now that server-side apply is ubiquitous.

The way to do this for individual calls is using backoff::future::retry. Based on the error type you set there, it will either retry (for a configurable amount of time/times), or bubble up the error.

Watcher has a similar backoff_watch system (via #577 ) that also uses the same backoff library.

Note that StatusDetails struct (from the api error) will include a retry_after_seconds that indicates approximate wait time for "alternate actions"

clux mentioned this issue Dec 5, 2019

Use tower-service to build more robust clients #100

Closed

clux self-assigned this Jul 28, 2020

clux changed the title ~~exponential backoff~~ Expose Conflict errors and allow wrapping get/replace with backoff Jul 28, 2020

clux added api Api abstraction related help wanted Not immediately prioritised, please help! labels Jul 28, 2020

clux removed their assignment Jul 28, 2020

nightkr mentioned this issue Aug 30, 2020

Add Backoff-based error policy #314

Open

clux removed the help wanted Not immediately prioritised, please help! label Feb 28, 2021

clux added the client http issues with the client label Jun 7, 2021

clux mentioned this issue Nov 4, 2021

Rate Limiting Strategies #500

Closed

clux closed this as completed Nov 21, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Expose Conflict errors and allow wrapping get/replace with backoff #34

Expose Conflict errors and allow wrapping get/replace with backoff #34

clux commented May 31, 2019 •

edited

clux commented Jul 28, 2020

clux commented Jul 28, 2020

nightkr commented Jul 28, 2020

nightkr commented Jul 28, 2020

kazk commented Jun 7, 2021

clux commented Jun 7, 2021 •

edited

nightkr commented Jun 7, 2021

kazk commented Jun 8, 2021

clux commented Nov 21, 2021 •

edited

Expose Conflict errors and allow wrapping get/replace with backoff #34

Expose Conflict errors and allow wrapping get/replace with backoff #34

Comments

clux commented May 31, 2019 • edited

clux commented Jul 28, 2020

clux commented Jul 28, 2020

nightkr commented Jul 28, 2020

nightkr commented Jul 28, 2020

kazk commented Jun 7, 2021

clux commented Jun 7, 2021 • edited

nightkr commented Jun 7, 2021

kazk commented Jun 8, 2021

clux commented Nov 21, 2021 • edited

clux commented May 31, 2019 •

edited

clux commented Jun 7, 2021 •

edited

clux commented Nov 21, 2021 •

edited