New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow RibbonLoadBalancedRetryPolicy to participate to the ServerStats connection failure count #1878
Comments
Potentially we will have to look at it. |
I wonder, what is the advantage of using spring retry, over the native ribbon retry? |
There was too much inconsistency between various ways requests were made in Spring Cloud Netflix, so we centralized on Spring Retry across all of them. For a good summary of the history behind this see #1290. When I added the feature I did not try using the AvailabilityFilteringRule (which I assume you are using). Have you tried using this rule without using Spring Retry? |
Without spring retry the
Have you considered putting the retry logic into a custom I have look on how to use |
I need to spend some time looking at how this works, I am not to familiar with it. I will post some updates here hopefully tomorrow. |
Are you testing this by proxying requests through Zuul? |
yes |
Has there been any progress on this issue? We are experiencing unnecessary client side exceptions while attempting to gracefully shutdown an service instance. Since the server stats are only updated after retry exhausts its retry limit it is rare for a circuit breaker to open and common for an unavailable server to be chosen by the LoadBalancer. |
Sorry this fell off my radar. I will try to carve out some time to look at it. |
@ryanjbaxter I took a stab at fixing this issue, see the pull request I posted above. Thanks! |
@ryanjbaxter @spencergibb I know you two are busy, but was hoping to get your eyes on the updated pull request that @pway99 submitted. Some background on this: We are moving our legacy deployment infrastructure to a continuous deployment model. All of our stack is still running in a private data center and we are not yet on something like cloud foundry. We are very close to getting a rolling deployment working, however we are having one issue; when we gracefully shutdown our services, we are still getting a few client-side errors. Those clients are using spring-cloud-sidecar (Dalston.SR3) and Pat reworked his pull request after your feedback. The root cause was that in Zuul/Sidecar, the Feign calls were not updating the load balancing stats when using Spring Retry + a retry policy. This was fixed in RibbonLoadBalancedRetryPolicy, the second part of this fix was in the AbstractRibbonCommand. The code now checks the request.isRetriable() to determine if the Spring code or the Netflix code will be used for retry/stats update. This seems to handle all of the use cases we could think of, but wanted to verify with you. We have confirmed that this is now working correctly in our testing environment. We were able to spin up 4 service nodes and 4 clients, run a load test, and then shutdown three of those service nodes while the test was running. No client errors were reported. Thanks for your time! |
Adding a reference to the new pull request. |
It appears that
org.springframework.cloud.netflix.ribbon.RibbonLoadBalancedRetryPolicy#registerThrowable
does not participate to the
ServerStats
connection failure count. Thus thecom.netflix.loadbalancer.BaseLoadBalancer#chooseServer
cannot circuit breaksServer
with too much failure according to theniws.loadbalancer.default.connectionFailureCountThreshold
property.Does it make sense?
Thanks
The text was updated successfully, but these errors were encountered: