
Allow RibbonLoadBalancedRetryPolicy to participate in the ServerStats connection failure count #1878

Closed
nithril opened this issue Apr 21, 2017 · 12 comments

@nithril

nithril commented Apr 21, 2017

It appears that org.springframework.cloud.netflix.ribbon.RibbonLoadBalancedRetryPolicy#registerThrowable does not participate in the ServerStats connection failure count. As a result, com.netflix.loadbalancer.BaseLoadBalancer#chooseServer cannot circuit-break a server with too many failures according to the niws.loadbalancer.default.connectionFailureCountThreshold property.
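
For reference, a rough sketch of what participating in those stats could look like. This is illustrative only: the class and method names below are made up, and only the Ribbon ServerStats calls are real.

```java
import com.netflix.loadbalancer.AbstractLoadBalancer;
import com.netflix.loadbalancer.Server;
import com.netflix.loadbalancer.ServerStats;

// Illustrative only: these are the counters that the availability-based
// circuit breaker (governed by niws.loadbalancer.default.connectionFailureCountThreshold)
// consults when deciding whether to skip a server.
public final class ServerStatsUpdateSketch {

    public static void recordConnectionFailure(AbstractLoadBalancer loadBalancer, Server server) {
        // LoadBalancerStats keeps one ServerStats instance per known server.
        ServerStats stats = loadBalancer.getLoadBalancerStats().getSingleServerStat(server);
        // Successive connection failures are what trip the per-server circuit breaker.
        stats.incrementSuccessiveConnectionFailureCount();
        // The overall failure count feeds the server's general metrics.
        stats.addToFailureCount();
    }
}
```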

Does it make sense?

Thanks

@ryanjbaxter
Contributor

Potentially, we will have to look at it.

@nithril
Author

nithril commented Apr 24, 2017

I wonder: what is the advantage of using Spring Retry over the native Ribbon retry?

@ryanjbaxter
Contributor

There was too much inconsistency between the various ways requests were made in Spring Cloud Netflix, so we centralized on Spring Retry across all of them. For a good summary of the history behind this, see #1290.

When I added the feature, I did not try using the AvailabilityFilteringRule (which I assume you are using). Have you tried using this rule without Spring Retry?
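
For anyone following along, a per-client rule can be set up along these lines. This is a minimal sketch: the configuration class name is hypothetical, and only the Spring and Ribbon types are real.

```java
import com.netflix.client.config.IClientConfig;
import com.netflix.loadbalancer.AvailabilityFilteringRule;
import com.netflix.loadbalancer.IRule;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Hypothetical per-client Ribbon configuration class.
@Configuration
public class AvailabilityRuleConfig {

    @Bean
    public IRule ribbonRule(IClientConfig config) {
        // Filters out servers that Ribbon considers tripped or overloaded,
        // based on the connection-failure statistics kept in ServerStats.
        return new AvailabilityFilteringRule();
    }
}
```

It would be attached to a client with a @RibbonClient annotation that points at this configuration class; the service name used there is whatever your client is called.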

@nithril
Author

nithril commented Apr 24, 2017

Without Spring Retry, the AvailabilityFilteringRule works as intended. Note that this rule is added by default.

There was too much inconsistency between various ways requests were made in Spring Cloud Netflix

Have you considered putting the retry logic into a custom LoadBalancerCommand?

I have looked at how to use ServerStats in the Spring Cloud retry logic, but it does not seem straightforward, as Ribbon is not aware of the subsequent chooseServer calls.
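
For comparison, Ribbon's native retry path looks roughly like the sketch below, adapted from memory of the Ribbon reactive examples (treat the builder method names as assumptions). The point is that LoadBalancerCommand owns both the server choice and the ServerStats bookkeeping for every attempt, which is what keeps the stats accurate on that path.

```java
import com.netflix.client.DefaultLoadBalancerRetryHandler;
import com.netflix.client.RetryHandler;
import com.netflix.loadbalancer.ILoadBalancer;
import com.netflix.loadbalancer.Server;
import com.netflix.loadbalancer.reactive.LoadBalancerCommand;
import com.netflix.loadbalancer.reactive.ServerOperation;
import rx.Observable;

// Sketch of Ribbon's native retry path: the command chooses the server,
// applies the RetryHandler, and records the outcome of each attempt
// in ServerStats, which the Spring Retry path currently bypasses.
public final class NativeRetrySketch {

    public static String call(ILoadBalancer loadBalancer) {
        // Retry once on the same server, then once on another server,
        // and let connect failures count as circuit-tripping exceptions.
        RetryHandler retryHandler = new DefaultLoadBalancerRetryHandler(1, 1, true);
        return LoadBalancerCommand.<String>builder()
                .withLoadBalancer(loadBalancer)
                .withRetryHandler(retryHandler)
                .build()
                .submit(new ServerOperation<String>() {
                    @Override
                    public Observable<String> call(Server server) {
                        // A real operation would issue the HTTP request here;
                        // this placeholder just returns the chosen host and port.
                        return Observable.just(server.getHostPort());
                    }
                })
                .toBlocking()
                .first();
    }
}
```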

@ryanjbaxter
Contributor

I need to spend some time looking at how this works; I am not too familiar with it. I will hopefully post some updates here tomorrow.

@ryanjbaxter
Contributor

Are you testing this by proxying requests through Zuul?

@nithril
Author

nithril commented Apr 27, 2017

Yes.

@pway99
Contributor

pway99 commented Sep 20, 2017

Has there been any progress on this issue? We are experiencing unnecessary client-side exceptions while attempting to gracefully shut down a service instance. Since the server stats are only updated after the retry logic exhausts its retry limit, it is rare for a circuit breaker to open and common for an unavailable server to be chosen by the LoadBalancer.
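
To make the effect visible, here is a small diagnostic sketch (illustrative only, with a made-up class name) that dumps the counters the availability filter consults. In the scenario above, the successive connection failure count tends to stay at zero during retries, so isCircuitBreakerTripped() never returns true.

```java
import com.netflix.loadbalancer.AbstractLoadBalancer;
import com.netflix.loadbalancer.Server;
import com.netflix.loadbalancer.ServerStats;

// Diagnostic sketch: print the per-server failure counters so you can see
// whether the availability filter would ever skip the instance being shut down.
public final class ServerStatsDiagnostics {

    public static void logCircuitState(AbstractLoadBalancer loadBalancer) {
        for (Server server : loadBalancer.getAllServers()) {
            ServerStats stats = loadBalancer.getLoadBalancerStats().getSingleServerStat(server);
            System.out.printf("%s successiveFailures=%d tripped=%s%n",
                    server.getHostPort(),
                    stats.getSuccessiveConnectionFailureCount(),
                    stats.isCircuitBreakerTripped());
        }
    }
}
```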

@ryanjbaxter
Contributor

Sorry this fell off my radar. I will try to carve out some time to look at it.

@pway99
Contributor

pway99 commented Sep 29, 2017

@ryanjbaxter I took a stab at fixing this issue; see the pull request I posted above. Thanks!

@tkvangorder

@ryanjbaxter @spencergibb I know you two are busy, but I was hoping to get your eyes on the updated pull request that @pway99 submitted.

Some background on this: we are moving our legacy deployment infrastructure to a continuous deployment model. Our whole stack is still running in a private data center, and we are not yet on something like Cloud Foundry. We are very close to getting rolling deployments working, but we are having one issue: when we gracefully shut down our services, we still get a few client-side errors. Those clients are using spring-cloud-sidecar (Dalston.SR3), and Pat reworked his pull request after your feedback.

The root cause was that in Zuul/Sidecar, the Feign calls were not updating the load-balancing stats when using Spring Retry plus a retry policy. This was fixed in RibbonLoadBalancedRetryPolicy; the second part of the fix was in AbstractRibbonCommand. The code now checks request.isRetriable() to determine whether the Spring code or the Netflix code will be used for the retry and stats update. This seems to handle all of the use cases we could think of, but we wanted to verify with you.
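
Roughly, the dispatch described above amounts to the following simplified sketch (stand-in names and a boolean flag instead of the actual request.isRetriable() check; see the pull request for the real change):

```java
import com.netflix.client.AbstractLoadBalancerAwareClient;
import com.netflix.client.ClientRequest;
import com.netflix.client.IResponse;
import com.netflix.client.config.IClientConfig;

// Simplified sketch of the branching described above, not the actual
// AbstractRibbonCommand code: when Spring Retry is in play, let its path
// drive retries and (with the fix) the ServerStats updates; otherwise let
// Ribbon's executeWithLoadBalancer() handle retries and stats itself.
public final class RetryDispatchSketch {

    static <RQ extends ClientRequest, RS extends IResponse> RS execute(
            AbstractLoadBalancerAwareClient<RQ, RS> client,
            RQ request,
            IClientConfig config,
            boolean springRetryEnabled) throws Exception {
        if (springRetryEnabled) {
            // Spring Retry path: retries handled by RetryTemplate, stats updated
            // through RibbonLoadBalancedRetryPolicy#registerThrowable after the fix.
            return client.execute(request, config);
        }
        // Native Ribbon path: LoadBalancerCommand handles retries and ServerStats.
        return client.executeWithLoadBalancer(request, config);
    }
}
```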

We have confirmed that this is now working correctly in our testing environment. We were able to spin up four service nodes and four clients, run a load test, and then shut down three of those service nodes while the test was running. No client errors were reported.

Thanks for your time!

@tkvangorder

Adding a reference to the new pull request.

#2390
