Make a Pool's retryAfterFailure interval directly configurable #102
Comments
We have created an issue in Pivotal Tracker to manage this. You can view the current status of your issue at: https://www.pivotaltracker.com/story/show/108562358.
After thinking a little more, I can see why you might derive this value from droplet_stale_threshold. https://groups.google.com/a/cloudfoundry.org/d/msg/vcap-dev/yuVYCZkMLG8/LotXj-u1jUUJ This might give you more context into why we have our droplet_stale_threshold set so high.
Hi @youngm. We've discussed enabling an operator to choose between availability and consistency in the event NATS is unavailable. If you could configure gorouter not to prune routes when NATS was unavailable, would this issue be less important to you?
@shalako Yes. If NATS being unavailable didn't cause routes to get pruned, then I would feel better about using a smaller droplet_stale_threshold.
Here's a story for using a manifest property to disable pruning when NATS is unavailable: https://www.pivotaltracker.com/story/show/108659764
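(A minimal sketch of what such a guard could look like inside the registry's prune loop. The names `SuspendPruningIfNatsUnavailable` and `natsAvailable` are illustrative assumptions, not gorouter's actual property or API; the point is only that pruning is skipped while NATS is down, trading consistency for availability.)

```go
package registry

import "time"

// Config holds the operator-facing settings relevant to pruning.
type Config struct {
	DropletStaleThreshold           time.Duration
	SuspendPruningIfNatsUnavailable bool // hypothetical manifest-driven toggle
}

// Registry prunes stale routes on a timer unless pruning is suspended.
type Registry struct {
	config        Config
	natsAvailable func() bool // injected NATS health check (assumed)
}

func (r *Registry) pruneLoop(interval time.Duration) {
	for range time.Tick(interval) {
		// While NATS is down, keep possibly-stale routes rather than
		// dropping them: availability over consistency.
		if r.config.SuspendPruningIfNatsUnavailable && !r.natsAvailable() {
			continue
		}
		r.pruneStaleDroplets()
	}
}

func (r *Registry) pruneStaleDroplets() {
	// remove routes not refreshed within DropletStaleThreshold
}
```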
@shalako Great! You can close this if you'd like. As a side note, I know this isn't fully your area, but I'd like to see Diego/Garden refine the way they choose external ports for apps, to make stale routes misdirecting requests less of an issue. Perhaps they could use a consistent hash of the app GUID, or even a random port. Really, anything would be better than the incremental allocation used by DEA/Warden today.
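(A sketch of the two alternatives suggested above. The package, port range, and function names are hypothetical; a real allocator would also have to retry on collision with a port already in use.)

```go
package ports

import (
	"hash/fnv"
	"math/rand"
)

const (
	portRangeStart = 61000 // assumed host-port range for containers
	portRangeSize  = 4000
)

// PortForGUID maps an app GUID deterministically into the port range,
// so a new container for a different app rarely reuses a port that a
// stale route for the old app still points at.
func PortForGUID(appGUID string) uint16 {
	h := fnv.New32a()
	h.Write([]byte(appGUID)) // fnv's Write never returns an error
	return uint16(portRangeStart + h.Sum32()%portRangeSize)
}

// RandomPort is the simpler alternative: any spread-out distribution
// beats a monotonically incrementing counter for this purpose.
func RandomPort() uint16 {
	return uint16(portRangeStart + rand.Intn(portRangeSize))
}
```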
Hey @youngm, the Garden team also has https://www.pivotaltracker.com/story/show/92085170 in flight to help reduce the likelihood of port reuse as garden-linux creates and destroys containers. I'm sure @goonzoid and @julz would appreciate additional suggestions about how to prevent stale routing from misdirecting requests to containers. Thanks!
Thanks @ematpl for making me aware of that story. I've started a mailing list thread to discuss it.
Currently it is hard coded at 1/4 of droplet_stale_threshold, which doesn't really make any sense to me. (See gorouter/registry/registry.go, line 70 at commit 5b91133.)
We use a large droplet_stale_threshold because we are more afraid of apps not receiving requests when NATS goes down than we are of requests going to the wrong backend. Anyway, we set a large stale threshold of 420 seconds, which causes gorouter to not retry a failed instance for 105 seconds. That is too long. It would be nice if this value were independently configurable.