Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flags for customizing retry behaviour #82

Closed
hansmi opened this issue Jun 14, 2020 · 5 comments
Closed

Flags for customizing retry behaviour #82

hansmi opened this issue Jun 14, 2020 · 5 comments

Comments

@hansmi
Copy link
Contributor

hansmi commented Jun 14, 2020

I have a usecase where a retry every 5 seconds is excessive and would like to contribute customizations for the retries. The PushProx client currently hardcodes the timings for connection retries:

PushProx/cmd/client/main.go

Lines 211 to 217 in eeadbe7

func newJitter() decorrelatedJitter {
rand.Seed(time.Now().UnixNano())
return decorrelatedJitter{
min: 50 * time.Millisecond,
cap: 5 * time.Second,
}
}

The github.com/cenkalti/backoff package (documentation) provides an exponential backoff algorithm. I'd use that instead of the custom implementation and provide flags:

  • --retry.initial-wait=<duration>, default 50ms
  • --retry.multiplier=<float>, default 1.5
  • --retry.max-wait=<duration>, default 5s
  • --retry.max-elapsed=<duration>, default 0 (keep retrying forever, otherwise give up and terminate after given duration)

The default behaviour would be comparable to what's currently implemented while permitting customization.

@SuperQ @brian-brazil Would you be happy with such flags?

@SuperQ
Copy link
Contributor

SuperQ commented Jun 14, 2020

Yes, this seems reasonable to me.

@SuperQ
Copy link
Contributor

SuperQ commented Jun 15, 2020

One thing about the backoff options. I'd like to keep the flags to a minimum to start, and set sane defaults for the rest. If you have suggestions for better defaults than the current ones, we can discuss changing them.

For now, I think the only real flag we would need is --retry.max-wait. The rest can stay hard coded.

I prefer to not over-flag things.

@hansmi
Copy link
Contributor Author

hansmi commented Jun 15, 2020

I agree with you on not exposing too many flags. My primary concern are metered connections, i.e. mobile data, in case the server is not reachable. As such I'd strongly prefer to have at least the minimum and maximum wait duration configurable (--retry.initial-wait, --retry.max-wait). The multiplier can be hardcoded, or we could make it a hidden flag (Kingpin supports them). The max-elapsed is not necessary for my usecase and came from looking at the options the backoff package provides.

As for the flag prefix I'm not sure whether --retry. is good. It could be misinterpreted as "retries against the exporters" when it's actually retries for the proxy connection. I guess this can be addressed by mentioning it in the flag description.

@SuperQ
Copy link
Contributor

SuperQ commented Jun 15, 2020

initial-wait and max-wait sound fine to me. We can leave out max-elapsed and see if anyone really needs it.

For the naming convention, I would put this under the --proxy prefix. How about --proxy.retry.initial-wait. I'd also like to deprecate --proxy-url in favor of --proxy.url so that we have consistent naming.

@hansmi
Copy link
Contributor Author

hansmi commented Jun 15, 2020

#83 implements the flags as discussed. Implementing unittests will take a few more changes which I'd like to do separately (experimental code exists already).

ecksun pushed a commit to AiflooAB/PushProx that referenced this issue Sep 3, 2020
This is a quick fix until PR prometheus-community#82 [0] has landed upstream.

[0]: prometheus-community#82
@hansmi hansmi closed this as completed Mar 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants