CircuitBreaker per downstream node, with shared named HttpClient #624

shagilt · 2019-03-21T22:37:12Z

Summary: What are you wanting to achieve?
We have multiple instances(10 instances on 10 different nodes) of a stateless service. One of our handlers, will select different endpoint(destination node) each time for load balancing.
We also used shared http client which is created using IHttpClientFactory.

When I wrote the test, I see error from one bad instance of my stateless destination service. This is opening the Circuit and not allowing any request at all.
Ideally, I want to open the circuit for bad instance and allow traffic for other instances. Is this possible with client created using IHttpClientFactory? What is your recommendation to achieve this?

What code or approach do you have so far?

It is always useful to see:

Configuring handlers:
serviceCollections.AddHttpClient(clientName)
.AddRetryPolicy() <-Polly retry policy
.AddHttpMessageHandler()
.AddCircuitBreakerPolicy() <-Polly circuit breaker policy
.AddHttpMessageHandler();

Create code:
var client = clientFactory.CreateClient(clientName);

reisenberger · 2019-03-22T20:21:01Z

@shagilt. You need to define and maintain a separate circuit-breaker instance per downstream system for which you want circuits to break independently. There's more about this in the documentation here and here. Some other documentation also related to circuit-breaker and HttpClientFactory (only in case you haven't already seen them) : here and here.

Here is an answer to a similar question recently. It suggests how you could use PolicyRegistry as a dictionary for circuit-breakers per separate downstream system, with a .GetOrAdd(...) approach and a circuitBreakerFactory method, to ensure you manufacture a single circuit-breaker for each downstream system when it is first needed, and thereafter re-use that instance.

When you say:

One of our handlers, will select different endpoint(destination node) each time for load balancing

is that a DelegatingHandler within the HttpClient, which selects that?

If so, you could:

have your delegating handler record which endpoint (destination node) it selected, on the Polly.Context for the execution, using the techniques demonstrated here
use the AsyncPolicySelector shown here to create a policy which manufactures/selects the correct circuit-breaker from the PolicyRegistry, according to the information in Context on which endpoint was selected
enrol that AsyncPolicySelector<HttpResponseMessage> with HttpClientFactory

Does that give you enough information to put / start putting a solution together?

I may be able to find time to code this up more fully, but I would need to find couple of hours +/- somewhere - no promises exactly when that would be - I may not have that time in the coming 7 days or so (am in seminars all next week). So I'm preferring to get you some information/pointers today (this reply) rather than leave you waiting 7 or more days. Hope that helps. Let me know how you get on.

shagilt · 2019-03-23T00:17:27Z

Thank you @reisenberger . I will go thru the documentation you shared and update this thread sometime next week. Thanks again!

shagilt · 2019-03-28T21:37:16Z

@reisenberger Yes, your proposal worked, and my tests are working as expected now.

What I did:
I created policy per downstream service instances(same service is running in multiple nodes) and stored in PolicyRegistry.

Some of my downstream service instances has different keys.So, I have different key generation for different URL. So, I am planning to introduce a Func<HttpRequestMessage, string> to generate key from request’s URL or properties.
For example:

Url: http://1.1.1.1/App/Sv1?query – Down steam service's key: 1.1.1.1/App/Sv1
Url: http://1.1.1.2//Sv2?query – Down steam service's key: 1.1.1.2/Sv2

As per my understanding, we need different policy for different downstream service instances. Did you see this requirement more often? Are you planning to add this in Polly?

I was thinking of writing new handler(same as PolicyHttpMessageHandler) and add key selection as argument. This way, user can provide their key selection logic and handler can keep track of policy for each down stream service. Do you see any issue with this?

public CircuitBreakerHttpMessageHandler(IAsyncPolicy<HttpResponseMessage> policy, Func<HttpRequestMessage, string> keySelector)
     {
            if (policy == null)
            {
                throw new ArgumentNullException(nameof(policy));
            }

            if (keySelector == null)
            {
                throw new ArgumentNullException(nameof(keySelector));
            }

            _policy = policy;
            _keySelector = keySelector;
        }

Probably this can be added in PolicyHttpMessageHandler itself. Thoughts?

reisenberger · 2019-03-29T22:30:47Z

@shagilt , glad this is working for you so far 👍

Re:

we need different policy for different downstream service instances. Did you see this requirement more often? Are you planning to add this in Polly?

We are not planning to add anything in Polly around this, except ingredients which make the existing approach easier: the PolicySelector approach (or similar) could probably usefully be surfaced more clearly, either in Polly or Polly.Contrib.

There was an early architectural decision whether to put dictionaries-of-policy-usage-circumstance-by-key inside policies (such as could have been placed inside circuit-breaker), or outside (like PolicyRegistry). We consciously chose outside because it offers more flexibility, and separation of concerns. And because multiple policy types (not just circuit-breaker) fall into this pattern. So, I am not sure if this was your question, but there are no plans to introduce a circuit-breaker policy which maintains an internal dictionary of states-by-key; we consciously chose not to.

Re:

I was thinking of writing new handler(same as PolicyHttpMessageHandler) and add key selection as argument. This way, user can provide their key selection logic and handler can keep track of policy for each down stream service. Do you see any issue with this?

Something like this seems reasonable. It does not strictly need to be a new handler. If the intention is for the selection of policy by key from PolicyRegistry via a convenience overload that takes a Func<HttpRequestMessage, string> keySelector (nice), then you could define an extension method, something like:

public static IHttpClientBuilder AddPolicyHandlerFromRegistry(
    this IHttpClientBuilder builder,
    Func<HttpRequestMessage, string> keySelector)
    {
         builder.AddHttpMessageHandler((services) =>
         {
            var registry = services.GetRequiredService<IReadOnlyPolicyRegistry<string>>();
            return new PolicyHttpMessageHandler((request) => registry.Get<IAsyncPolicy<HttpResponseMessage>>(keySelector(request)));
        });
        return builder;
    }

(various defensive checks omitted for brevity - would want adding).

EDIT: This code example only encompasses select-from-registry using the keySelector. See comment below for richer example encompassing a .GetOrAdd(...) concept.

There are also some similar overloads in HttpClientFactory already, which you could use to achieve similar effects.

Hope that helps / let us know how you get on.

reisenberger · 2019-03-30T07:57:43Z

Hi @shagilt . Here is an extension method on IHttpClientBuilder which I think encompasses everything we have talked about so far.

It combines your nice Func<HttpRequestMessage, string> keySelector with a policyFactory which can manufacture a new instance of a policy (eg circuit-breaker) when required, based on the string key and/or info in the HttpRequestMessage.

Internally, it uses a .GetOrAdd(...)-style approach against a PolicyRegistry configured in DI (say by services.AddPolicyRegistry();), so that it can manufacture a new CircuitBreaker instance per each new string key generated by keySelector to represent a downstream instance; but thereafter will reuse the previous circuit-breaker instance for that key, to maintain circuit-breaker state.

Does this cover your needs?

    public static IHttpClientBuilder AddPolicyHandlerFromRegistryWithGetOrAdd(
        this IHttpClientBuilder builder,
        Func<HttpRequestMessage, string> keySelector,
        Func<HttpRequestMessage, string, IAsyncPolicy<HttpResponseMessage>> policyFactory
        )
    {
        if (builder == null)
        {
            throw new ArgumentNullException(nameof(builder));
        }

        if (keySelector == null)
        {
            throw new ArgumentNullException(nameof(keySelector));
        }

        if (policyFactory == null)
        {
            throw new ArgumentNullException(nameof(policyFactory));
        }

        builder.AddHttpMessageHandler((services) =>
        {
            var registry = services.GetRequiredService<IPolicyRegistry<string>>();
            return new PolicyHttpMessageHandler((request) =>
            {
                var key = keySelector(request);
                if (registry.TryGet<IAsyncPolicy<HttpResponseMessage>>(key, out var policy))
                {
                    return policy;
                }

                var newPolicy = policyFactory(request, key);
                registry[key] = newPolicy;
                return newPolicy;
            });
        });

        return builder;
    }

If this is looking good, we can consider adding this either to Polly, Polly-Contrib, or to HttpClientFactory (note: HttpClientFactory is outside my control; that would be a Microsoft decision); or to Polly documentation.

Sidenote: The above code ^ doesn't make Polly.Context the primary input driver for policy-selection or manufacture (which had been the original suggestion), because in the HttpClientFactory case we have an HttpRequestMessage which we can use as a natural input element driving identification of the downstream server instance. However, as originally mentioned, you can use the techniques for passing custom information into requests via Context, to attach more specific information to Polly.Context to drive keySelector, if useful.

shagilt · 2019-03-31T19:45:17Z

Hi, @reisenberger sorry for the delay in my response.

Yes, this new extension will work for me, awesome!

I am attaching my test code for key selection function. You can add this in sample or in test code of Polly as a reference.

public static string GetPolicyKey(HttpRequestMessage httpRequestMessage)
        {
            // DestinationHost is destination node name or IP address
            // DestinationService is downstream service name
            if (httpRequestMessage.Properties.ContainsKey("DestinationHost") && httpRequestMessage.Properties.ContainsKey("DestinationService"))
            {
                object host = null;
                object service = null;
                httpRequestMessage.Properties.TryGetValue("DestinationHost", out host);
                httpRequestMessage.Properties.TryGetValue("DestinationService", out service);
                return $"{host.ToString()}/{service.ToString()}";
            }

            //return empty string key otherwise
            return string.Empty;
        }

Instead of documentation, we can add this as a new extension in Polly or HttpClientFactory. What is the process of getting this in HttpClientFactory?

Other open question we need to document is: developer has to mange the life cycle of these new key based circuit breaker policies. Or how to mange stale policies in PolicyRegistry?

reisenberger · 2019-03-31T21:17:50Z

@shagilt Glad this works for you.

Great to have your enthusiasm to contribute!

You can propose an extension to HttpClientFactory directly with the Microsoft ASP.NET team on this repo. They give contributing guidelines. Ryan Nowak oversees a lot of the development on HttpClientFactory, so you could tag him directly.

If the conclusion is (or if you prefer) for this to live on the Polly side rather than directly in HttpClientFactory, then I propose we create an extension library Polly.Contrib.HttpClientFactoryExtensions (++) in Polly-Contrib. I can add you immediately to the Polly-Contrib organisation and you get full write rights on that repo, so you can add code / tests / documentation as necessary. We have a full blank template for contribs: this already contains a solution set up to build contribs against Polly, against the correct tfms, and to build all that to a nuget package; and there is a ready-made test project also.

(I'm proposing we manage HttpClientFactoryExtensions separate from main Polly as they are conceptually separate; need different dependencies; can then be rev'd at different speeds; and we still have .NET Framework consumers of Polly - not everybody is using .NET Core and HCF.)

(++) = (that link will not work yet; I will create that repo if we go this route)

reisenberger · 2019-03-31T21:20:49Z

Re:

developer has to manage the life cycle of these new key based circuit breaker policies. Or how to manage stale policies in PolicyRegistry?

Can you expand a little on your thinking/questions here?

PolicyRegistry has dictionary-like semantics including a .Remove(string key), so it is certainly possible to manually remove policies known to be no longer needed. That does raise broader questions (outside the scope of the code discussed so far) about how users may be managing the set of downstream endpoints.

reisenberger · 2019-03-31T21:34:36Z

Note: I also didn't necessary propose .AddPolicyHandlerFromRegistryWithGetOrAdd(...) as a final name 🙂 . Thoughts welcome.

shagilt · 2019-04-01T17:21:27Z

If the conclusion is (or if you prefer) for this to live on the Polly side rather than directly in HttpClientFactory, then I propose we create an extension library Polly.Contrib.HttpClientFactoryExtensions (++) in Polly-Contrib.

What do you recommendation here? I think adding this to Polly side make more sense because it is clear that the new extension in Polly not in HttpClientFactory(.NET).

Re:

developer has to manage the life cycle of these new key based circuit breaker policies. Or how to manage stale policies in PolicyRegistry?

Can you expand a little on your thinking/questions here?

Say, a service instance is moved from Node1 to Node2. So, policy created for Node1 is no longer valid. I want to remove Node1 policy if it is not accessed in past 1 day.

I know we can clean stale policies by using .Remove() method from PolicyRegistry. To do this, I need to know Last[Access][Used]Time for each policy. Is this possible with current Polly implementation?

reisenberger · 2019-04-04T07:18:15Z

@shagilt Thanks - hoping to get back to this at the weekend.

reisenberger · 2019-04-15T19:07:24Z

Hi @shagilt . Re:

What do you recommendation here? I think adding this to Polly side make more sense because it is clear that the new extension in Polly not in HttpClientFactory(.NET).

I am good with having some extensions to HttpClientFactory on the Polly side for specialist cases like this; it gives us some flexibility also to rev them at a different speed from the revs of ASP.NET Core. However, I'm pinging collaborators @rynowak and @glennc on the Microsoft team to loop them in to any decision.

@rynowak / @glennc : Do you have any concerns with the Polly team maintaining a package such as Polly.Contrib.HttpClientFactoryExtensions with more specialist extension overloads? Or would you like to see these all on the Microsoft side?

Thus far, we are talking about an extension on IHttpClientBuilder (see above) for managing a circuit-breaker instance per downstream node, where that downstream node is determined from some properties on the HttpRequestMessage. The overload provides for:

Func<HttpRequestMessage, string> keySelector to select a PolicyRegistry key based on HttpRequestMessage; (eg maybe based on part of the URI - up to user)
that key is used in GetOrAdd(...) fashion on the PolicyRegistry, to ensure the existing stateful instance of circuit-breaker for that downstream node is re-used, when already exists in registry;
when doesn't already exist, the configured Func<HttpRequestMessage, string, IAsyncPolicy<HttpResponseMessage>> policyFactory specifies how to manufacture a new circuit-breaker

(circuit-breaker being the prime example - the principle extends to other Polly stateful policies)

@shagilt Pending answers from the MS team, I've created the Polly.Contrib.HttpClientFactoryExtensions repo and invited you to the Polly-Contrib team. It would not be wasted to have developed and tested the concept there even if the MS teams do decide to take the overload to the MS side officially.

/cc @joelhulen for info

shagilt · 2019-04-17T01:41:00Z

@reisenberger Thanks. I will setup the dev environment.

rynowak · 2019-05-01T15:28:43Z

We've merged this PR into aspnet/Extensions. I think this issue can be closed. Thanks @shagilt

reisenberger · 2019-05-01T18:13:44Z

Thanks @rynowak , @shagilt . Closing.

theCorb1nator · 2020-01-03T12:14:58Z

@shagilt @reisenberger do you have any examples of how to implement the scenario using the new extension method provided by @shagilt

This is the exact scenario i need to implement into my solution:

Try "https:/host1.example.com"
If failed X times switch to "https://host2.example.com"

I have tried following the docs and the PR but i am unable to see how to correctly implement. i feel an example would be useful for anyone trying to implement this scenario

reisenberger · 2020-01-03T21:46:52Z

@shagilt Do you have any fully-worked sample on this?

@theCorb1nator I might be able to put a small fully-worked sample together, but it would take a few hours to create, and might not happen for some weeks.

The summary is:

The overloads discussed in this issue focus on allowing you to take a GetOrAdd(..)-style approach to generating a single stateful circuit-breaker per downstream node.
To do:

If failed X times switch to "https://host2.example.com"

... you might typically define a Retry policy outside the CircuitBreaker, with that retry policy specifically handling BrokenCircuitException
using the onRetry delegate of the retry policy to bump the choice of downstream node on one.

Something (pseudo-code, abbreviated) conceptually like:

var nodes = new [] {
    "https://host1.example.com",
    "https://host2.example.com",
    "https://host3.example.com",
};
int nodeCount = nodes.Length;

int currentNode = 0;

services.AddHttpClient(/* define your logical client */)

     /* you can have another retry policy here too, for general transient faults */

     /* add a retry policy which handles BrokenCircuitException and cycles to the next node, when that happens */
    .AddPolicyHandler(Policy.Handle<BrokenCircuitException>().RetryForeverAsync<HttpResponseMessage>(onRetry: result => { currentNode = (currentNode + 1) % nodeCount; }))

     /* get-or-add a circuit-breaker per downstream-node, using the overload in Polly issue 624 */
     .AddPolicyHandler(
        policyFactory: (services, requestMessage, key) => HttpPolicyExtensions.HandleTransientHttpError().CircuitBreakerAsync<HttpResponseMessage>(/* your chosen circuit-breaker config */),
        keySelector: requestMessage => $"CircuitBreakerFor{requestMessage.RequestUri.Authority}");

As I say, this is a pseudo-code/conceptual sketch. Of course, you don't want to hard-code the nodes in StartUp.ConfigureServices(). And you are very likely to want access to the same nodes collection and currentNode index where you place the calls through the policy. The logical thing to do would be to wrap the nodes collection up in a NodeSelector class fulfilling something like:

public interface INodeSelector
{
    string GetCurrentNode();
    void MoveToNextNode();
}

and make a NodeSelector implementation available to the call-site by DI. The call-site can then use INodeSelector.GetCurrentNode() to construct an HttpRequestMessage addressing the current node.

Your configuration of the HttpClient might then change to:

services.AddHttpClient(/* define your logical client */)

     /* you can have another retry policy here too, for general transient faults */

     /* add a retry policy which handles BrokenCircuitException and cycles to the next node, when that happens */
     .AddPolicyHandler(
        policyFactory: (services, requestMessage, key) => Policy.Handle<BrokenCircuitException>().RetryForeverAsync<HttpResponseMessage>(onRetry: _ => services.GetRequiredService<INodeSelector>().MoveToNextNode()),
        keySelector: requestMessage => "NodeBouncer");

     /* get-or-add a circuit-breaker per downstream-node, using the overload in Polly issue 624 */
     .AddPolicyHandler(
        policyFactory: (services, requestMessage, key) => HttpPolicyExtensions.HandleTransientHttpError().CircuitBreakerAsync<HttpResponseMessage>(/* your chosen circuit-breaker config */),
        keySelector: requestMessage => $"CircuitBreakerFor{requestMessage.RequestUri.Authority}");

@theCorb1nator The above is a 45-minute sketch to get you an answer now rather than leave you waiting a few weeks; please extrapolate from it, or query if necessary.

@shagilt If you have a full example - or can see any missing elements or enhancements to the above - please say!

theCorb1nator · 2020-01-06T12:59:02Z

Thanks for this @reisenberger i will keep you updated how i get on

shagilt · 2020-01-06T16:46:43Z

@reisenberger @theCorb1nator Sorry for the delay in my response, I was away for past few days.
I don't have any real sample with this new extension, but I remember adding a UT along with this extension. This UT creates a retry policy for each downstream host(node/IPaddress). You can try the same with circuitbreaker policy.
https://github.com/shagilt/Extensions/blob/PollyWithKeySelectionExt/src/HttpClientFactory/Polly/test/DependencyInjection/PollyHttpClientBuilderExtensionsTest.cs - AddPolicyHandlerFromRegistry_PolicySelectorWithKey_AddsPolicyHandler

shagilt · 2020-01-06T18:01:13Z

@reisenberger I don't see anything missing in your last post; your sample looks good. Thanks!

reisenberger changed the title ~~CircuitBreaker with shared named HttpClient~~ CircuitBreaker per downstream system, with shared named HttpClient Mar 22, 2019

reisenberger added the question label Mar 22, 2019

reisenberger changed the title ~~CircuitBreaker per downstream system, with shared named HttpClient~~ CircuitBreaker per downstream node, with shared named HttpClient Mar 24, 2019

reisenberger mentioned this issue Mar 24, 2019

Feature request: rate-limiting (requests/time) #260

Closed

This was referenced Mar 30, 2019

Switch between servers policy #591

Closed

Is it possible to chain Fallback and CircuitBreaker? [How to construct a Failover policy] #262

Closed

reisenberger added the enhancement label Apr 16, 2019

shagilt mentioned this issue Apr 29, 2019

Adding new Polly extension with policy key selection logic dotnet/extensions#1620

Merged

reisenberger closed this as completed May 1, 2019

reisenberger mentioned this issue May 29, 2019

Add IConcurrentPolicyRegistry interface/implementation, for PolicyRegistry #645

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CircuitBreaker per downstream node, with shared named HttpClient #624

CircuitBreaker per downstream node, with shared named HttpClient #624

shagilt commented Mar 21, 2019

reisenberger commented Mar 22, 2019 •

edited

Loading

shagilt commented Mar 23, 2019

shagilt commented Mar 28, 2019 •

edited

Loading

reisenberger commented Mar 29, 2019 •

edited

Loading

reisenberger commented Mar 30, 2019

shagilt commented Mar 31, 2019 •

edited

Loading

reisenberger commented Mar 31, 2019

reisenberger commented Mar 31, 2019

reisenberger commented Mar 31, 2019

shagilt commented Apr 1, 2019

reisenberger commented Apr 4, 2019

reisenberger commented Apr 15, 2019 •

edited

Loading

shagilt commented Apr 17, 2019

rynowak commented May 1, 2019

reisenberger commented May 1, 2019

theCorb1nator commented Jan 3, 2020

reisenberger commented Jan 3, 2020 •

edited

Loading

theCorb1nator commented Jan 6, 2020

shagilt commented Jan 6, 2020

shagilt commented Jan 6, 2020

CircuitBreaker per downstream node, with shared named HttpClient #624

CircuitBreaker per downstream node, with shared named HttpClient #624

Comments

shagilt commented Mar 21, 2019

reisenberger commented Mar 22, 2019 • edited Loading

shagilt commented Mar 23, 2019

shagilt commented Mar 28, 2019 • edited Loading

reisenberger commented Mar 29, 2019 • edited Loading

reisenberger commented Mar 30, 2019

shagilt commented Mar 31, 2019 • edited Loading

reisenberger commented Mar 31, 2019

reisenberger commented Mar 31, 2019

reisenberger commented Mar 31, 2019

shagilt commented Apr 1, 2019

reisenberger commented Apr 4, 2019

reisenberger commented Apr 15, 2019 • edited Loading

shagilt commented Apr 17, 2019

rynowak commented May 1, 2019

reisenberger commented May 1, 2019

theCorb1nator commented Jan 3, 2020

reisenberger commented Jan 3, 2020 • edited Loading

theCorb1nator commented Jan 6, 2020

shagilt commented Jan 6, 2020

shagilt commented Jan 6, 2020

reisenberger commented Mar 22, 2019 •

edited

Loading

shagilt commented Mar 28, 2019 •

edited

Loading

reisenberger commented Mar 29, 2019 •

edited

Loading

shagilt commented Mar 31, 2019 •

edited

Loading

reisenberger commented Apr 15, 2019 •

edited

Loading

reisenberger commented Jan 3, 2020 •

edited

Loading