Future Polly Roadmap - **community feedback sought** #90

Open
reisenberger opened this Issue Mar 7, 2016 · 59 comments

Comments

Projects
None yet
@reisenberger
Member

reisenberger commented Mar 7, 2016

App-vNext has published a Roadmap for Polly.

We are actively seeking community feedback on the roadmap.

Which aspects of the roadmap would be a priority for you, and which not?

Do you have views on how the features in the roadmap should be implemented?

Are there features you'd like to see, that are not on the roadmap?

What is your view on the longer-term possibility, to grow Polly towards a more general resiliency framework for .NET? (as Hystrix is for Java) – for detail see Roadmap. .

Please comment on the Roadmap under this placeholder issue.

@mfjerome

This comment has been minimized.

Show comment
Hide comment
@mfjerome

mfjerome Mar 8, 2016

About your last point:

Several of our teams working on .NET projects have recommended Polly as being useful in building microservice-based systems. It encourages the fluent expression of transient exception-handling policies and the Circuit Breaker pattern, including policies such as Retry, Retry Forever and Wait and Retry. Similar libraries already exist in other languages (Hystrix for Java for example), and Polly is a welcome addition from the .NET community.

Taken from ThoughWorks Technology Radar. There is clearly a lack of good offers in the .NET ecosystem for general resiliency framework and the spotlight is already on Polly as a Hystrix(Java) alternative so why not jump on the occasion =)

mfjerome commented Mar 8, 2016

About your last point:

Several of our teams working on .NET projects have recommended Polly as being useful in building microservice-based systems. It encourages the fluent expression of transient exception-handling policies and the Circuit Breaker pattern, including policies such as Retry, Retry Forever and Wait and Retry. Similar libraries already exist in other languages (Hystrix for Java for example), and Polly is a welcome addition from the .NET community.

Taken from ThoughWorks Technology Radar. There is clearly a lack of good offers in the .NET ecosystem for general resiliency framework and the spotlight is already on Polly as a Hystrix(Java) alternative so why not jump on the occasion =)

@bunceg

This comment has been minimized.

Show comment
Hide comment
@bunceg

bunceg Mar 19, 2016

On the open issue #6, I appreciate it wasn't the path the team wanted to go at the time, but I really can't see any other clean way to solve the issue below...

Two instances in their own processes, each has an instance of circuit breaker. Instance A trips at 00:00 and lasts for 5 minutes.. Due to random load balancing, most requests just happen to go to instance A. However, by chance, at 04:50 instance B gets some requests and also trips. it's 5 minute trip timer now starts. However, the tripped service is back at 05:00. The instance A breaker is cleared at 05:01 but the instance B breaker is not able to be cleared until 09:50 - 4+ minutes after when it should be able to serve requests.

I take on board all the comments re: negotiation between services and what if connection to the shared state handler goes down etc. but I'm taking the pessimistic view that if a service the breaker is protecting is dead then it's more likely that the service is really dead rather than the alternative that the service is alive but one instance can't seem to talk to it.

Besides, IMO, that can be some kind of "quorum" logic - two instances, two need to vote dead etc. More complex, but this can be part of the state handler to be developed by the poor soul (i.e. me!) who desires this kind of functionality...

bunceg commented Mar 19, 2016

On the open issue #6, I appreciate it wasn't the path the team wanted to go at the time, but I really can't see any other clean way to solve the issue below...

Two instances in their own processes, each has an instance of circuit breaker. Instance A trips at 00:00 and lasts for 5 minutes.. Due to random load balancing, most requests just happen to go to instance A. However, by chance, at 04:50 instance B gets some requests and also trips. it's 5 minute trip timer now starts. However, the tripped service is back at 05:00. The instance A breaker is cleared at 05:01 but the instance B breaker is not able to be cleared until 09:50 - 4+ minutes after when it should be able to serve requests.

I take on board all the comments re: negotiation between services and what if connection to the shared state handler goes down etc. but I'm taking the pessimistic view that if a service the breaker is protecting is dead then it's more likely that the service is really dead rather than the alternative that the service is alive but one instance can't seem to talk to it.

Besides, IMO, that can be some kind of "quorum" logic - two instances, two need to vote dead etc. More complex, but this can be part of the state handler to be developed by the poor soul (i.e. me!) who desires this kind of functionality...

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Mar 20, 2016

Member

Hi @bunceg, many thanks for the input! Noted that you are still looking for the best solution to shared circuit-breaker state across processes with Polly, and/or a feature to inject state into a circuit. Further views on this welcome from across the community...

@bunceg I'll respond to the specific scenario back within #6, since that feels like it might turn into a longer discussion.

Member

reisenberger commented Mar 20, 2016

Hi @bunceg, many thanks for the input! Noted that you are still looking for the best solution to shared circuit-breaker state across processes with Polly, and/or a feature to inject state into a circuit. Further views on this welcome from across the community...

@bunceg I'll respond to the specific scenario back within #6, since that feels like it might turn into a longer discussion.

@reisenberger reisenberger referenced this issue in reisenberger/Polly Apr 13, 2016

Merged

Circuit breaker spike march16 added windows #1

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Apr 19, 2016

Member

Following the live demo today at devinteractions, there was a request to add caching into the policy pipeline. So we've added to the roadmap a proposal drawing on ICacheProvider, to allow any existing .NET cache implementation.

// Possible configuration syntax
var cachingPolicy = Policy
  .CacheProvider(ICacheProvider provider);

// Execution syntax might different slightly from the normal policy .Execute() syntax, eg:
// Possible syntax:
cachingPolicy.ExecuteOrGetExisting<TResult>(Func<TResult> func, string cacheKey);

Other .ExecuteOrGetExisting(...) overloads could also be provided, mimicking ICacheProvider .AddOrGetExisting(...) overloads, for easy comprehension.

Comments welcome on this (or any other feature in the roadmap) under this issue.

Member

reisenberger commented Apr 19, 2016

Following the live demo today at devinteractions, there was a request to add caching into the policy pipeline. So we've added to the roadmap a proposal drawing on ICacheProvider, to allow any existing .NET cache implementation.

// Possible configuration syntax
var cachingPolicy = Policy
  .CacheProvider(ICacheProvider provider);

// Execution syntax might different slightly from the normal policy .Execute() syntax, eg:
// Possible syntax:
cachingPolicy.ExecuteOrGetExisting<TResult>(Func<TResult> func, string cacheKey);

Other .ExecuteOrGetExisting(...) overloads could also be provided, mimicking ICacheProvider .AddOrGetExisting(...) overloads, for easy comprehension.

Comments welcome on this (or any other feature in the roadmap) under this issue.

@pj-moviestarplanet

This comment has been minimized.

Show comment
Hide comment
@pj-moviestarplanet

pj-moviestarplanet Apr 20, 2016

Have you thought about implementing some kind of metrics functionality similar to what Hystrix has. It would be very useful to be able to see Circuit Breaker metrics in for example Grafana.

So the idea would be to configure for example a statsd client and use it to send metrics such as call latency, number of succeful calls, number of failed calls etc.

Have you thought about implementing some kind of metrics functionality similar to what Hystrix has. It would be very useful to be able to see Circuit Breaker metrics in for example Grafana.

So the idea would be to configure for example a statsd client and use it to send metrics such as call latency, number of succeful calls, number of failed calls etc.

@joelhulen

This comment has been minimized.

Show comment
Hide comment
@joelhulen

joelhulen Apr 20, 2016

Member

This is an interesting suggestion. We've spoken about ways to visualize the process flow, but not necessarily telemetry information. I wonder if capturing that data would be best served as a plug-in, and not a part of the core package? My line of thinking about this is that there is usually a trade-off between capturing this data and performance. I believe the ideal way to capture the data would be via an asynchronous tool similar to Application Insights. Minimal impact and something you can add in.

Do you have an example of the metrics Hystrix captures that you would like to see? For instance, what would be a priority at first?

Member

joelhulen commented Apr 20, 2016

This is an interesting suggestion. We've spoken about ways to visualize the process flow, but not necessarily telemetry information. I wonder if capturing that data would be best served as a plug-in, and not a part of the core package? My line of thinking about this is that there is usually a trade-off between capturing this data and performance. I believe the ideal way to capture the data would be via an asynchronous tool similar to Application Insights. Minimal impact and something you can add in.

Do you have an example of the metrics Hystrix captures that you would like to see? For instance, what would be a priority at first?

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Apr 21, 2016

Member

Hey @pj-moviestarplanet Great idea, added stats to the roadmap. Happy for any community feedback on an architecture for this.

One question is phasing: whether to start emitting stats/events from Polly as of now or when the fuller Polly Pipeline is in place. Phased would be great if we can see far enough ahead not to introduce breaking changes later.

Related: The Polly Pipeline proposal differs from the Hystrix approach in one significant way. While Hystrix has a fixed structure in which the various elements are composed (fallback, cache, circuit-breaker etc; hmm, no retry), the Polly Pipeline proposal is currently more flexible, allowing users to compose whichever elements of the (envisaged) Polly resilience armoury they want, in any order.

The intended advantage of this flexibility is simplicity: users select and compose only the elements they want. However, a disadvantage may be if (if) that flexibility makes emitting overall stats for calls more difficult ... it may depend whether Polly's role is to emit stats or raw(er) events (perhaps as a ReactiveExtensions Observable<T>) which something else compiles to/pushes to stats.

Keen for any community feedback on the current Pipeline proposal - strengths, weaknesses? Definitely an area of the roadmap that's still up for grabs how we architect it ...

Member

reisenberger commented Apr 21, 2016

Hey @pj-moviestarplanet Great idea, added stats to the roadmap. Happy for any community feedback on an architecture for this.

One question is phasing: whether to start emitting stats/events from Polly as of now or when the fuller Polly Pipeline is in place. Phased would be great if we can see far enough ahead not to introduce breaking changes later.

Related: The Polly Pipeline proposal differs from the Hystrix approach in one significant way. While Hystrix has a fixed structure in which the various elements are composed (fallback, cache, circuit-breaker etc; hmm, no retry), the Polly Pipeline proposal is currently more flexible, allowing users to compose whichever elements of the (envisaged) Polly resilience armoury they want, in any order.

The intended advantage of this flexibility is simplicity: users select and compose only the elements they want. However, a disadvantage may be if (if) that flexibility makes emitting overall stats for calls more difficult ... it may depend whether Polly's role is to emit stats or raw(er) events (perhaps as a ReactiveExtensions Observable<T>) which something else compiles to/pushes to stats.

Keen for any community feedback on the current Pipeline proposal - strengths, weaknesses? Definitely an area of the roadmap that's still up for grabs how we architect it ...

@joelhulen joelhulen referenced this issue Apr 30, 2016

Closed

Thanks !!! #105

@PaybackMan

This comment has been minimized.

Show comment
Hide comment
@PaybackMan

PaybackMan May 24, 2016

Please support .Net Standard Profile

Please support .Net Standard Profile

@joelhulen

This comment has been minimized.

Show comment
Hide comment
@joelhulen

joelhulen May 24, 2016

Member

@PaybackMan That is a great suggestion. Out of curiosity, what is the current holdup you are facing in using the PCL version of the Polly library, or is this request more out of providing greater flexibility for 3rd party libraries to be less dependent upon a specific .NET version within a given project?

Member

joelhulen commented May 24, 2016

@PaybackMan That is a great suggestion. Out of curiosity, what is the current holdup you are facing in using the PCL version of the Polly library, or is this request more out of providing greater flexibility for 3rd party libraries to be less dependent upon a specific .NET version within a given project?

@KBRETON1

This comment has been minimized.

Show comment
Hide comment
@KBRETON1

KBRETON1 May 25, 2016

Please support async methods in .NET 4.0 by using the portable Microsoft.Bcl.Async nuget.

Please support async methods in .NET 4.0 by using the portable Microsoft.Bcl.Async nuget.

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger May 25, 2016

Member

@KBRETON1 Thanks for the vote for this! (async methods in .NET 4.0 using Microsoft.Bcl.Async). This is already in progress as #103 .

Member

reisenberger commented May 25, 2016

@KBRETON1 Thanks for the vote for this! (async methods in .NET 4.0 using Microsoft.Bcl.Async). This is already in progress as #103 .

@SamuelEnglard

This comment has been minimized.

Show comment
Hide comment
@SamuelEnglard

SamuelEnglard Jun 16, 2016

Really want to use this in a ASP.NET Core application, any idea how I can currently?

Really want to use this in a ASP.NET Core application, any idea how I can currently?

@mattwoberts

This comment has been minimized.

Show comment
Hide comment
@mattwoberts

mattwoberts Jun 25, 2016

@SamuelEnglard 👍

I'm also interested in this. A lot of people are coming to dotnet core having used the snappily named "The Transient Fault Handling Application Block for Windows Azure" (topaz), and since that project seems pretty dead, it would be AWESOME if polly could take it's place, and provide a dotnet core version.

@SamuelEnglard 👍

I'm also interested in this. A lot of people are coming to dotnet core having used the snappily named "The Transient Fault Handling Application Block for Windows Azure" (topaz), and since that project seems pretty dead, it would be AWESOME if polly could take it's place, and provide a dotnet core version.

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jun 25, 2016

Member

@mattwoberts The Polly team view .NET core compatibility as essential and high prio on the roadmap. The current PCL259 codebase is entirely netstandard1.0 compatible, so no major code issues. The recent flux in tooling from Microsoft has been a barrier (we have wanted to avoid tracking a moving target and that impacting other active feature development over last few weeks). We are hoping for some stability from Microsoft in the tooling coming Monday 27 but remains to be seen ...

@SamuelEnglard has made a start at this (thanks @SamuelEnglard !), @mattwoberts see conversation and links at end of #51 . I am hoping to look at this in the coming days/weeks, @mattwoberts if you also have time to evaluate and take further with @SamuelEnglard please do!

(I am just now completing #14, which will unlock the path to many of the hystrix-like features that @SeanFarrow, I and many others are interested in on the long-term Roadmap ...)

Member

reisenberger commented Jun 25, 2016

@mattwoberts The Polly team view .NET core compatibility as essential and high prio on the roadmap. The current PCL259 codebase is entirely netstandard1.0 compatible, so no major code issues. The recent flux in tooling from Microsoft has been a barrier (we have wanted to avoid tracking a moving target and that impacting other active feature development over last few weeks). We are hoping for some stability from Microsoft in the tooling coming Monday 27 but remains to be seen ...

@SamuelEnglard has made a start at this (thanks @SamuelEnglard !), @mattwoberts see conversation and links at end of #51 . I am hoping to look at this in the coming days/weeks, @mattwoberts if you also have time to evaluate and take further with @SamuelEnglard please do!

(I am just now completing #14, which will unlock the path to many of the hystrix-like features that @SeanFarrow, I and many others are interested in on the long-term Roadmap ...)

@SamuelEnglard

This comment has been minimized.

Show comment
Hide comment
@SamuelEnglard

SamuelEnglard Jun 26, 2016

@reisenberger should I make a proper pull request?

@reisenberger should I make a proper pull request?

@adamhathcock

This comment has been minimized.

Show comment
Hide comment
@adamhathcock

adamhathcock Jul 4, 2016

I'm interested in the pipeline idea. Is there more details on a proposal that should be implemented or copied?

I'm interested in the pipeline idea. Is there more details on a proposal that should be implemented or copied?

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jul 4, 2016

Member

Hi @adamhathcock , thanks for your interest! We are hoping to start work on the Pipeline in about a week's time (this week's focus is nuget packages supporting .NET Core). Would be great to hear any input you have on how you see this working, or the context you'd be using it in. I'm also hoping to put out an implementation proposal for comment in the next few days - @SeanFarrow and others also interested in this feature.

For now, see nesting policies in the wiki, for a syntax for nesting calls through multiple policies at present. Similar syntax also works for async.

Member

reisenberger commented Jul 4, 2016

Hi @adamhathcock , thanks for your interest! We are hoping to start work on the Pipeline in about a week's time (this week's focus is nuget packages supporting .NET Core). Would be great to hear any input you have on how you see this working, or the context you'd be using it in. I'm also hoping to put out an implementation proposal for comment in the next few days - @SeanFarrow and others also interested in this feature.

For now, see nesting policies in the wiki, for a syntax for nesting calls through multiple policies at present. Similar syntax also works for async.

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jul 4, 2016

Member

@adamhathcock @SeanFarrow For a brief comment on implementation thoughts so far around the policy pipeline: I see a Pipeline represented as a linked list. However, we probably also want to retain the ability to re-use a policy across more than one call, and therefore in more than one pipeline (for benefits of this see wiki). This probably means a pipeline is a linked list of pointers to policy instances, not a linked list of actual policy instances. Thoughts?

Member

reisenberger commented Jul 4, 2016

@adamhathcock @SeanFarrow For a brief comment on implementation thoughts so far around the policy pipeline: I see a Pipeline represented as a linked list. However, we probably also want to retain the ability to re-use a policy across more than one call, and therefore in more than one pipeline (for benefits of this see wiki). This probably means a pipeline is a linked list of pointers to policy instances, not a linked list of actual policy instances. Thoughts?

@SeanFarrow

This comment has been minimized.

Show comment
Hide comment
@SeanFarrow

SeanFarrow Jul 4, 2016

That makes sense, do we want to give people the ability to add steps at any point in the pipeline? If yes, does a policy have an id?

SeanFarrow commented Jul 4, 2016

That makes sense, do we want to give people the ability to add steps at any point in the pipeline? If yes, does a policy have an id?

@adamhathcock

This comment has been minimized.

Show comment
Hide comment
@adamhathcock

adamhathcock Jul 4, 2016

@reisenberger @SeanFarrow What I have is an implementation that does exactly what you say: a pipeline that is basically a linked list and can be mutated for different calls. I want to use Polly with it and if you're doing a pipeline thing anyway maybe I could just wholesale remove my implementation.

I don't really like the look of the Nested Policies thing. It looks like there could be a better way to hook things together but each policy remain atomic? Just thinking out loud. Maybe there would have to be a special collection PolicyList that is basically a linked list that allows insertion anywhere and can be executed which invokes the policies in order at anytime.

@reisenberger @SeanFarrow What I have is an implementation that does exactly what you say: a pipeline that is basically a linked list and can be mutated for different calls. I want to use Polly with it and if you're doing a pipeline thing anyway maybe I could just wholesale remove my implementation.

I don't really like the look of the Nested Policies thing. It looks like there could be a better way to hook things together but each policy remain atomic? Just thinking out loud. Maybe there would have to be a special collection PolicyList that is basically a linked list that allows insertion anywhere and can be executed which invokes the policies in order at anytime.

@adamhathcock

This comment has been minimized.

Show comment
Hide comment
@adamhathcock

adamhathcock Jul 4, 2016

I haven't looked closely but I would also like the ability to add custom Policies into the pipeline and I don't think that functionality currently exists (appending a custom Policy). I guess that would basically mean just allowing a custom class that uses the same interface. Maybe that's too big a refactor?

If that was on the table, I would envision policies looking a lot like ASP.NET Core middleware. However the next step would be a method parameter instead of a constructor parameter. Just thinking outloud again :)

I haven't looked closely but I would also like the ability to add custom Policies into the pipeline and I don't think that functionality currently exists (appending a custom Policy). I guess that would basically mean just allowing a custom class that uses the same interface. Maybe that's too big a refactor?

If that was on the table, I would envision policies looking a lot like ASP.NET Core middleware. However the next step would be a method parameter instead of a constructor parameter. Just thinking outloud again :)

@SeanFarrow

This comment has been minimized.

Show comment
Hide comment
@SeanFarrow

SeanFarrow Jul 4, 2016

What custom policies do you envisage adding?

SeanFarrow commented Jul 4, 2016

What custom policies do you envisage adding?

@adamhathcock

This comment has been minimized.

Show comment
Hide comment
@adamhathcock

adamhathcock Jul 4, 2016

Well, just anything. Right now, one of things that my pipeline does is handle the custom OAuth token and refreshes it if necessary. I guess I would think I could add something like that in.

I do note that a lot of the exception handling and execute methods do take custom lambdas so you could do it there. I just thought formalizing it into an official custom policy might be better.

Well, just anything. Right now, one of things that my pipeline does is handle the custom OAuth token and refreshes it if necessary. I guess I would think I could add something like that in.

I do note that a lot of the exception handling and execute methods do take custom lambdas so you could do it there. I just thought formalizing it into an official custom policy might be better.

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jul 4, 2016

Member

All thinking out loud welcome! (lots of interesting ideas guys)

Rather than respond piece-by-piece, I'm working up (and will post shortly) a full proposal statement for each of the envisaged wider resilience features.

I think it will help to consider them as a group: a lot of these proposed policies and considerations interlock (eg the need for an id/key on a policy, and/or on an individual execution, also comes into play for using a cache, for emitting events/stats, for configuring policies from a PolicyRegistry, etc...)

Member

reisenberger commented Jul 4, 2016

All thinking out loud welcome! (lots of interesting ideas guys)

Rather than respond piece-by-piece, I'm working up (and will post shortly) a full proposal statement for each of the envisaged wider resilience features.

I think it will help to consider them as a group: a lot of these proposed policies and considerations interlock (eg the need for an id/key on a policy, and/or on an individual execution, also comes into play for using a cache, for emitting events/stats, for configuring policies from a PolicyRegistry, etc...)

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jul 4, 2016

Member

@adamhathcock , re:

I don't really like the look of the Nested Policies thing. It looks like there could be a better way to
hook things together but each policy remain atomic? [...] Maybe there would have to be ...

Yes, the nested policies thing in the wiki is (for clarity) not a proposal. It's just some standard C# syntax that I put out as doco to guide people with chaining the existing policies, when App-vNext took on Polly six months ago. What you're suggesting is exactly the essence of the Pipeline proposal to provide something nicer 😄 (great to hear; all thinking out loud welcome! 😉 )

Member

reisenberger commented Jul 4, 2016

@adamhathcock , re:

I don't really like the look of the Nested Policies thing. It looks like there could be a better way to
hook things together but each policy remain atomic? [...] Maybe there would have to be ...

Yes, the nested policies thing in the wiki is (for clarity) not a proposal. It's just some standard C# syntax that I put out as doco to guide people with chaining the existing policies, when App-vNext took on Polly six months ago. What you're suggesting is exactly the essence of the Pipeline proposal to provide something nicer 😄 (great to hear; all thinking out loud welcome! 😉 )

@adamhathcock

This comment has been minimized.

Show comment
Hide comment
@adamhathcock

adamhathcock Jul 5, 2016

I'm interested to see what you've got @reisenberger and where I can contribute :)

I thinking have a base pipeline and having the current and proposed wider resilience features as links in that chain as well as allowing custom ones fitted in will make Polly much more like Hystrix.

I like the look of a PolicyRegistry from the linked Brighter project. Also, having a task scheduler is part of the bulkhead pattern as well. All things I'm interested in.

My only worry is expanding the scope for this project too wide :)

I'm interested to see what you've got @reisenberger and where I can contribute :)

I thinking have a base pipeline and having the current and proposed wider resilience features as links in that chain as well as allowing custom ones fitted in will make Polly much more like Hystrix.

I like the look of a PolicyRegistry from the linked Brighter project. Also, having a task scheduler is part of the bulkhead pattern as well. All things I'm interested in.

My only worry is expanding the scope for this project too wide :)

@SeanFarrow

This comment has been minimized.

Show comment
Hide comment
@SeanFarrow

SeanFarrow Jul 5, 2016

Maybe we should integrate with other systems, Quartz.net for example.

SeanFarrow commented Jul 5, 2016

Maybe we should integrate with other systems, Quartz.net for example.

@adamhathcock

This comment has been minimized.

Show comment
Hide comment
@adamhathcock

adamhathcock Jul 5, 2016

Polly should try to integrate rather than compete though I'm hoping that the use-cases envisioned don't require something "enterprisey" like Quartz.NET or even Paramore/Brighter (http://iancooper.github.io/Paramore/QuickStart.html). The Brighter command processing I guess is what I kind of envision for Polly but still have a easy syntax for quick task handling.

Scaling up with things like task persistence (which seems to be the realm of Quartz and Paramore) is out of scope, imo.

Polly should try to integrate rather than compete though I'm hoping that the use-cases envisioned don't require something "enterprisey" like Quartz.NET or even Paramore/Brighter (http://iancooper.github.io/Paramore/QuickStart.html). The Brighter command processing I guess is what I kind of envision for Polly but still have a easy syntax for quick task handling.

Scaling up with things like task persistence (which seems to be the realm of Quartz and Paramore) is out of scope, imo.

@SeanFarrow

This comment has been minimized.

Show comment
Hide comment
@SeanFarrow

SeanFarrow Jul 5, 2016

That makes sense.

SeanFarrow commented Jul 5, 2016

That makes sense.

@Amerdrix

This comment has been minimized.

Show comment
Hide comment
@Amerdrix

Amerdrix Jul 6, 2016

Jumping in, pipelining is highest on the list for me. In fact, looking for the how to achieve this is what lead me to find the road map.

Amerdrix commented Jul 6, 2016

Jumping in, pipelining is highest on the list for me. In fact, looking for the how to achieve this is what lead me to find the road map.

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jul 6, 2016

Member

@Amerdrix See comment from a couple of days ago if you haven't already. Pipeline is shortly up on the development agenda, and there's a pointer to existing C# syntax for chaining policies in that comment.

Stay tuned: we'll be posting a fuller proposal for the pipeline feature for comment, in the next few days.

Member

reisenberger commented Jul 6, 2016

@Amerdrix See comment from a couple of days ago if you haven't already. Pipeline is shortly up on the development agenda, and there's a pointer to existing C# syntax for chaining policies in that comment.

Stay tuned: we'll be posting a fuller proposal for the pipeline feature for comment, in the next few days.

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jul 7, 2016

Member

Hi @adamhathcock @SeanFarrow There are already a lot of players in the task/job scheduling space in .NET. All essentially have a job triggered as a lambda delegate, or a method to be fulfilled from an interface. Polly can inflect the action of any lambda / wrap how any method is invoked, so in principle it shouldn't be difficult to integrate with any job scheduler.

If one wanted to do tighter integration (allow people to declaratively specify, eg in the manner Brighter does, the Polly policies they want to modify the operation of their scheduled task), I would see that as a separate project (eg Quartz.Polly) (although some of those schedulers, like Hangfire and Quartz, already have retry built in).

Member

reisenberger commented Jul 7, 2016

Hi @adamhathcock @SeanFarrow There are already a lot of players in the task/job scheduling space in .NET. All essentially have a job triggered as a lambda delegate, or a method to be fulfilled from an interface. Polly can inflect the action of any lambda / wrap how any method is invoked, so in principle it shouldn't be difficult to integrate with any job scheduler.

If one wanted to do tighter integration (allow people to declaratively specify, eg in the manner Brighter does, the Polly policies they want to modify the operation of their scheduled task), I would see that as a separate project (eg Quartz.Polly) (although some of those schedulers, like Hangfire and Quartz, already have retry built in).

@adamhathcock

This comment has been minimized.

Show comment
Hide comment
@adamhathcock

adamhathcock Jul 7, 2016

I'm thinking Polly is "lighter" than a real job. It facilitates and encapsulates common logic needed by lots of code paths. Let me know if I'm way off base.

I'm thinking Polly is "lighter" than a real job. It facilitates and encapsulates common logic needed by lots of code paths. Let me know if I'm way off base.

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jul 7, 2016

Member

Hi @adamhathcock

The essence of Polly is that a policy inflects how an action is actioned. This can even be seen right in the codebase in the base Policy class as Action<Action>. (I like to show this as it's a great way of understanding what Polly does!) The inner Action is the delegate you pass Polly to execute; the outer Action is what the configured policy will do with it.

In that sense, Polly doesn't own any jobs at all. A policy inflects the operation of delegates/lambdas that are passed to/through it. So yes, as you say: Polly encapsulates common (resilience) logic that may be applied to delegates anywhere.

(Brief further comments on the code line I pulled out: The Context you see in there too is an execution context that travels with each invocation. And the async version is similar with, you guessed it, some Funcs. 😉 )

Member

reisenberger commented Jul 7, 2016

Hi @adamhathcock

The essence of Polly is that a policy inflects how an action is actioned. This can even be seen right in the codebase in the base Policy class as Action<Action>. (I like to show this as it's a great way of understanding what Polly does!) The inner Action is the delegate you pass Polly to execute; the outer Action is what the configured policy will do with it.

In that sense, Polly doesn't own any jobs at all. A policy inflects the operation of delegates/lambdas that are passed to/through it. So yes, as you say: Polly encapsulates common (resilience) logic that may be applied to delegates anywhere.

(Brief further comments on the code line I pulled out: The Context you see in there too is an execution context that travels with each invocation. And the async version is similar with, you guessed it, some Funcs. 😉 )

@adamhathcock

This comment has been minimized.

Show comment
Hide comment
@adamhathcock

adamhathcock Jul 7, 2016

Thanks for the explanation. Just trying to solidify what you (and me) see as the use case for Polly.

Thanks for the explanation. Just trying to solidify what you (and me) see as the use case for Polly.

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jul 7, 2016

Member

@adamhathcock No problem. Debating the roadmap and boundaries - above all hearing people's use cases and integration points - sharpens things up for everybody.

Doco and applicability for open source projects is really important too: we've tried hard to make the readme cover all the syntax, and the wiki the concepts, but I'd like to get some other usage patterns (like the oauth stuff, and an MVC pipeline example) out into an expanded policy patterns section. Are people really aware of what's there in the wiki? (maybe we don't highlight it enough)

If anyone thinks there are things that would be useful to see more of in the doco, feel free to shout.

Member

reisenberger commented Jul 7, 2016

@adamhathcock No problem. Debating the roadmap and boundaries - above all hearing people's use cases and integration points - sharpens things up for everybody.

Doco and applicability for open source projects is really important too: we've tried hard to make the readme cover all the syntax, and the wiki the concepts, but I'd like to get some other usage patterns (like the oauth stuff, and an MVC pipeline example) out into an expanded policy patterns section. Are people really aware of what's there in the wiki? (maybe we don't highlight it enough)

If anyone thinks there are things that would be useful to see more of in the doco, feel free to shout.

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jul 7, 2016

Member

Interfaces in Polly and custom policies could potentially be a powerful development: People could develop their own custom policies; even publish them as packages; the best could be taken into the main Polly project.

The question: is now the right time to add interfaces? ... At mo, the planned functionality means we'd probably (necessarily) be adding to the base class/interface relatively frequently (breaking the interface if there was one). I'm wary of publishing an interface now and breaking it repeatedly over the coming months ... (thoughts anyone?)

Member

reisenberger commented Jul 7, 2016

Interfaces in Polly and custom policies could potentially be a powerful development: People could develop their own custom policies; even publish them as packages; the best could be taken into the main Polly project.

The question: is now the right time to add interfaces? ... At mo, the planned functionality means we'd probably (necessarily) be adding to the base class/interface relatively frequently (breaking the interface if there was one). I'm wary of publishing an interface now and breaking it repeatedly over the coming months ... (thoughts anyone?)

@SeanFarrow

This comment has been minimized.

Show comment
Hide comment
@SeanFarrow

SeanFarrow Jul 7, 2016

Could we not make the changes and then work out how we extract an interface at a later date?

SeanFarrow commented Jul 7, 2016

Could we not make the changes and then work out how we extract an interface at a later date?

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jul 7, 2016

Member

Looking at things from another angle too @adamhathcock, it could be just as advantageous for you (thinking of your pipeline) to put your own interface around Polly. The Polly Execute() overloads themselves, given the fluent syntax, have to try to do all things for all people. For instance, there are numerous overloads for ExecuteAsync() controlling continueOnCapturedContext, CancellationToken, involving Polly’s Context or not, then TResult versus void-returning etc ... You might decide for your own purposes you always/never want to capture async context, always insist on a CancellationToken etc, and might define a much tighter interface. It's only a few minutes’ work then to write a lightweight wrapper for a Polly policy which delegates downwards to the chosen overload/s and your own custom operations of course plug into that interface. ...

Apologies for this detail, but thought it worth calling out the numerous overloads, as it is a potential issue for a Polly interface. The fluent configuration is one of Polly's original greatest strengths, but it's stretching a bit more as we add more functionality - on the radar and feedback welcome.

Member

reisenberger commented Jul 7, 2016

Looking at things from another angle too @adamhathcock, it could be just as advantageous for you (thinking of your pipeline) to put your own interface around Polly. The Polly Execute() overloads themselves, given the fluent syntax, have to try to do all things for all people. For instance, there are numerous overloads for ExecuteAsync() controlling continueOnCapturedContext, CancellationToken, involving Polly’s Context or not, then TResult versus void-returning etc ... You might decide for your own purposes you always/never want to capture async context, always insist on a CancellationToken etc, and might define a much tighter interface. It's only a few minutes’ work then to write a lightweight wrapper for a Polly policy which delegates downwards to the chosen overload/s and your own custom operations of course plug into that interface. ...

Apologies for this detail, but thought it worth calling out the numerous overloads, as it is a potential issue for a Polly interface. The fluent configuration is one of Polly's original greatest strengths, but it's stretching a bit more as we add more functionality - on the radar and feedback welcome.

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jul 7, 2016

Member

@SeanFarrow,

Could we not make the changes and then work out how we extract an interface at a later date?

Yes (potentially). I'm also thinking about whether some interface segregation could stage things ... this will also become clearer as the new functionality is planned and executed.

Worth pointing out though that it was a huge benefit that the founder of Polly hadn't started with an interface at 1.0; we would have been breaking them repeatedly as the library grew. To judge ongoing.

Also imperative to decide whether to tighten the number of overloads (see foot of roadmap) before publishing an interface.

@SeanFarrow See for example also Hystrix holding off interfaces in places to give themselves room to grow. Big potential interface benefits for reasons previously stated (and usual design reasons), but care also in still-evolving library?

Member

reisenberger commented Jul 7, 2016

@SeanFarrow,

Could we not make the changes and then work out how we extract an interface at a later date?

Yes (potentially). I'm also thinking about whether some interface segregation could stage things ... this will also become clearer as the new functionality is planned and executed.

Worth pointing out though that it was a huge benefit that the founder of Polly hadn't started with an interface at 1.0; we would have been breaking them repeatedly as the library grew. To judge ongoing.

Also imperative to decide whether to tighten the number of overloads (see foot of roadmap) before publishing an interface.

@SeanFarrow See for example also Hystrix holding off interfaces in places to give themselves room to grow. Big potential interface benefits for reasons previously stated (and usual design reasons), but care also in still-evolving library?

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jul 8, 2016

Member

@PaybackMan You asked for .Net Standard Support for Polly. Just alerting you that we're in beta on this! There are beta packages posted (targeting .NETStandard versions 1.0 for PCL; and 1.6 for .NET Core) in the threads #132 and #133 respectively . Please feel free to try them out! (and we would love feedback). (We are letting interested parties pre-trial them before posting them to nuget, though they should go out there fairly soon...)

Member

reisenberger commented Jul 8, 2016

@PaybackMan You asked for .Net Standard Support for Polly. Just alerting you that we're in beta on this! There are beta packages posted (targeting .NETStandard versions 1.0 for PCL; and 1.6 for .NET Core) in the threads #132 and #133 respectively . Please feel free to try them out! (and we would love feedback). (We are letting interested parties pre-trial them before posting them to nuget, though they should go out there fairly soon...)

@mfjerome

This comment has been minimized.

Show comment
Hide comment
@mfjerome

mfjerome Jul 8, 2016

Hi @reisenberger. I am very happy with the recently updated roadmap. The reason I initially proposed the "Fallback" improvement and pushed for Hystrix-like features in March was because we are currently assembling our project templates/frameworks to build microservices inside large distributed, scalable apps for our finance business and we are using Polly in it.

Priority for us would be to add Timeout, Bulkheads and Fallbacks to policies, and chaining them in a pipeline. We are currently experimenting some implementation ideas proposed in this tool: https://github.com/hudl/Mjolnir .

We might have a programmer or 2 available during the summer if you come up with specific up-for-grabs. Cheers.

mfjerome commented Jul 8, 2016

Hi @reisenberger. I am very happy with the recently updated roadmap. The reason I initially proposed the "Fallback" improvement and pushed for Hystrix-like features in March was because we are currently assembling our project templates/frameworks to build microservices inside large distributed, scalable apps for our finance business and we are using Polly in it.

Priority for us would be to add Timeout, Bulkheads and Fallbacks to policies, and chaining them in a pipeline. We are currently experimenting some implementation ideas proposed in this tool: https://github.com/hudl/Mjolnir .

We might have a programmer or 2 available during the summer if you come up with specific up-for-grabs. Cheers.

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jul 11, 2016

Member

@mfjerome That’s great to hear! And perfect timing – after a big refactor (#130), we’re just about to embark on those Hystrix-like wider resilience policies. Please stay tuned as we’ll be putting up deeper docs on those features in the coming days/24hrs(?) and it would be great to have your feedback!

Extra dev power could well be useful @mfjerome, definitely opportunities. Are you able to contact me briefly off-github at software[at]reisenberger.net, we can talk a little more freely about how you/your devs could contribute? (and timings - I have some away time coming up). All contributors will be credited!

Timeout, Bulkheads and Fallbacks [... for use in ...] project templates/frameworks to build microservices

A good fit. I have experience of these in a microservices environment, part of my desire to apply to Polly.

Thank you for your support and involvement!

Member

reisenberger commented Jul 11, 2016

@mfjerome That’s great to hear! And perfect timing – after a big refactor (#130), we’re just about to embark on those Hystrix-like wider resilience policies. Please stay tuned as we’ll be putting up deeper docs on those features in the coming days/24hrs(?) and it would be great to have your feedback!

Extra dev power could well be useful @mfjerome, definitely opportunities. Are you able to contact me briefly off-github at software[at]reisenberger.net, we can talk a little more freely about how you/your devs could contribute? (and timings - I have some away time coming up). All contributors will be credited!

Timeout, Bulkheads and Fallbacks [... for use in ...] project templates/frameworks to build microservices

A good fit. I have experience of these in a microservices environment, part of my desire to apply to Polly.

Thank you for your support and involvement!

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jul 11, 2016

Member

I thought it might be useful to set down some thoughts I had about differences in approach/implementation (on resilience policies) between Hystrix and Polly.

NB In each case, neither of the approaches is necessarily better/worse (I can often see advantages from one perspective, disadvantages from another), but identifying them helps stimulate discussion and think about product direction for Polly. Hystrix-derivatives in .NET also tend to have Hystrix characteristics.

(The comparison is intended to call out differences in approach/emphasis, not be a worklist, so the observations assume the intended features on the Polly roadmap will be in place.)

Hystrix Polly
has a central command-type class which brings together all the resilience features. has a fixed set of elements all of which the dev has potentially to understand and configure before using a command (slightly heavier-weight) has separate components - users can choose to use only the Policys they need
the command class is derived from to manage an execution to be put through Hystrix. Code to be executed through the command is placed in an overridden run() method or similar. a Policy is not coupled to the code it can run. A Policy can be applied to any delegate. Policies can thread-safely be reused across multiple delegates (for benefits see wiki). Policies can be passed to entirely different parts of the codebase (allows decoupling/dependency injection/easier stubbing by DI, for testing).
uses the fixed set of strategies-in-a-command always in the same order allows users to compose policies to apply to a call in any order
uses each strategy (circuit-breaker; fallback etc) once in each command allows configuring PolicyWraps which use individual strategies (eg retry; fallback) multiple times. Combined with filtering, allows for responding to different exceptions differently, eg different retry strategies by exception (retry only once a XException, but on YException 5 times); or eg different fallback strategies/stub values for different outcomes.
has configuration by setting properties on the command instance has fluent configuration during instantiation of a policy
has the ability to recalibrate command parameters after initialisation does not (yet) have the ability to recalibrate policy parameters after initialisation (policies are immutable and pretty much 'black boxes') UPDATE: January 2018, now offered via PolicyRegistry
has central system-wide defaults for configuration parameters has no/few hidden defaults (policies only do what you explicitly configure them to do)
has no retry policies has retry policies
is geared more explicitly around result-returning calls, the execution of remote requests caters natively for both result-returning and void-returning actions
has exception filtering only by exception type; 'greedy' exception-catching with opt out can filter by deeper exception properties; specify explicitly the exceptions to handle (opt in), but can greedily catch-all if desired
is not (check?) able to interpret return values as faults to handle is able to interpret return values as faults to handle

(And maybe other dimensions - if you think there are others to consider, or if a comparison seems inaccurate / could be better stated, please comment!)

(And: There is clearly a raft of things Hystrix-in-Java does beyond fault-handling, in terms of stats/metrics, dashboard for visibility etc, beyond these resilience aspects.)


To date I have tried to grow Polly (since AppvNext took stewardship) in the spirit of the original, that is: fluent configuration; lightweight; minimal impact on your code (to use) after you have configured a policy.

Any thoughts arising from the above? Things we should/shouldn't do in Polly? Should/shouldn't do like Hystrix?

Member

reisenberger commented Jul 11, 2016

I thought it might be useful to set down some thoughts I had about differences in approach/implementation (on resilience policies) between Hystrix and Polly.

NB In each case, neither of the approaches is necessarily better/worse (I can often see advantages from one perspective, disadvantages from another), but identifying them helps stimulate discussion and think about product direction for Polly. Hystrix-derivatives in .NET also tend to have Hystrix characteristics.

(The comparison is intended to call out differences in approach/emphasis, not be a worklist, so the observations assume the intended features on the Polly roadmap will be in place.)

Hystrix Polly
has a central command-type class which brings together all the resilience features. has a fixed set of elements all of which the dev has potentially to understand and configure before using a command (slightly heavier-weight) has separate components - users can choose to use only the Policys they need
the command class is derived from to manage an execution to be put through Hystrix. Code to be executed through the command is placed in an overridden run() method or similar. a Policy is not coupled to the code it can run. A Policy can be applied to any delegate. Policies can thread-safely be reused across multiple delegates (for benefits see wiki). Policies can be passed to entirely different parts of the codebase (allows decoupling/dependency injection/easier stubbing by DI, for testing).
uses the fixed set of strategies-in-a-command always in the same order allows users to compose policies to apply to a call in any order
uses each strategy (circuit-breaker; fallback etc) once in each command allows configuring PolicyWraps which use individual strategies (eg retry; fallback) multiple times. Combined with filtering, allows for responding to different exceptions differently, eg different retry strategies by exception (retry only once a XException, but on YException 5 times); or eg different fallback strategies/stub values for different outcomes.
has configuration by setting properties on the command instance has fluent configuration during instantiation of a policy
has the ability to recalibrate command parameters after initialisation does not (yet) have the ability to recalibrate policy parameters after initialisation (policies are immutable and pretty much 'black boxes') UPDATE: January 2018, now offered via PolicyRegistry
has central system-wide defaults for configuration parameters has no/few hidden defaults (policies only do what you explicitly configure them to do)
has no retry policies has retry policies
is geared more explicitly around result-returning calls, the execution of remote requests caters natively for both result-returning and void-returning actions
has exception filtering only by exception type; 'greedy' exception-catching with opt out can filter by deeper exception properties; specify explicitly the exceptions to handle (opt in), but can greedily catch-all if desired
is not (check?) able to interpret return values as faults to handle is able to interpret return values as faults to handle

(And maybe other dimensions - if you think there are others to consider, or if a comparison seems inaccurate / could be better stated, please comment!)

(And: There is clearly a raft of things Hystrix-in-Java does beyond fault-handling, in terms of stats/metrics, dashboard for visibility etc, beyond these resilience aspects.)


To date I have tried to grow Polly (since AppvNext took stewardship) in the spirit of the original, that is: fluent configuration; lightweight; minimal impact on your code (to use) after you have configured a policy.

Any thoughts arising from the above? Things we should/shouldn't do in Polly? Should/shouldn't do like Hystrix?

@mfjerome

This comment has been minimized.

Show comment
Hide comment
@mfjerome

mfjerome Jul 11, 2016

a Policy is not coupled to the code it can run.

This second element in your table is definitely a key difference that should stay going forward!

a Policy is not coupled to the code it can run.

This second element in your table is definitely a key difference that should stay going forward!

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jul 11, 2016

Member

@adamhathcock / @Amerdrix / @SeanFarrow / @mfjerome / @joelhulen / anyone interested: I have now posted some more detailed proposals for the possible operation/implementation of the planned, hystrix-like resilience features:

#80 Fallback
#136 Cache
#137 Timeout
#138 Throttle
#139 (Keys, to use with cache, potential later PolicyRegistry and events/stats emission)
#140 Policy Wrap (was: Pipeline)

I appreciate this is a lot of material to read: thanks in advance for anyone's time/interest/involvement. Most are essentially more detailed statements of an implementation already outlined in the Roadmap. The biggest change is in #140 Policy wrap (may add more comments/qs around this shortly).

Please do add comments / qs / thoughts under the relevant issue! (Or equally if you think a proposal sounds good-to-go as it is.)

Member

reisenberger commented Jul 11, 2016

@adamhathcock / @Amerdrix / @SeanFarrow / @mfjerome / @joelhulen / anyone interested: I have now posted some more detailed proposals for the possible operation/implementation of the planned, hystrix-like resilience features:

#80 Fallback
#136 Cache
#137 Timeout
#138 Throttle
#139 (Keys, to use with cache, potential later PolicyRegistry and events/stats emission)
#140 Policy Wrap (was: Pipeline)

I appreciate this is a lot of material to read: thanks in advance for anyone's time/interest/involvement. Most are essentially more detailed statements of an implementation already outlined in the Roadmap. The biggest change is in #140 Policy wrap (may add more comments/qs around this shortly).

Please do add comments / qs / thoughts under the relevant issue! (Or equally if you think a proposal sounds good-to-go as it is.)

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jul 11, 2016

Member

To answer the v good question @SeanFarrow raised under specific new proposed policies: All new policies would cater for all existing Execute syntaxes and sync/async variants unless otherwise stated. The syntax examples given were just examples (too many overloads to state all).

In general:

  • ExecuteAndCapture() should be supported throughout
  • sync and async should be supported throughout

... and if there are exceptions the Scope section (or detailed notes) in each policy proposal should call them out. For example:

  • Cache makes no sense for void returning 😉
  • Timeout requires new co-operative-cancellation overloads with Cancellation for sync .Execute()
  • ExecuteAndCapture() probably wants special treatment in a policy wrap: ... one would want more-inner policies in the wrap still to throw, and only the outermost call to ...Capture?)

Thanks for the great q. I'll go over them again and look for any missed cases. Anyone shout if you think any thinking steps missed on this!

Member

reisenberger commented Jul 11, 2016

To answer the v good question @SeanFarrow raised under specific new proposed policies: All new policies would cater for all existing Execute syntaxes and sync/async variants unless otherwise stated. The syntax examples given were just examples (too many overloads to state all).

In general:

  • ExecuteAndCapture() should be supported throughout
  • sync and async should be supported throughout

... and if there are exceptions the Scope section (or detailed notes) in each policy proposal should call them out. For example:

  • Cache makes no sense for void returning 😉
  • Timeout requires new co-operative-cancellation overloads with Cancellation for sync .Execute()
  • ExecuteAndCapture() probably wants special treatment in a policy wrap: ... one would want more-inner policies in the wrap still to throw, and only the outermost call to ...Capture?)

Thanks for the great q. I'll go over them again and look for any missed cases. Anyone shout if you think any thinking steps missed on this!

@gsogol

This comment has been minimized.

Show comment
Hide comment
@gsogol

gsogol Jul 18, 2016

How about reusing and integrating with hystrix dashboard via turbine (https://github.com/Netflix/Turbine/wiki)? This is useful because the hystrix community is huge. No point in recreating dashboards if integration exists via server sent events.

gsogol commented Jul 18, 2016

How about reusing and integrating with hystrix dashboard via turbine (https://github.com/Netflix/Turbine/wiki)? This is useful because the hystrix community is huge. No point in recreating dashboards if integration exists via server sent events.

@drewburlingame

This comment has been minimized.

Show comment
Hide comment
@drewburlingame

drewburlingame Jul 18, 2016

... and if there are exceptions the Scope section (or detailed notes) in each policy proposal should call them out. For example:

Cache makes no sense for void returning 😉

Ideally, we'd prefer a pass through vs throwing an exception.

The null pattern would be helpful in cases where all the methods for a service could use the same PolicyWrap. If some of the methods were void and the Cache policy threw an exception for void methods, then we'd need to create two PolicyWraps. The null pattern would allow us to use just one PolicyWrap.

If there are cases where we'd want to throw an exception, then hopefully it's a config option we can setup for the policy. i.e. ignoreVoidMethods=true;

... and if there are exceptions the Scope section (or detailed notes) in each policy proposal should call them out. For example:

Cache makes no sense for void returning 😉

Ideally, we'd prefer a pass through vs throwing an exception.

The null pattern would be helpful in cases where all the methods for a service could use the same PolicyWrap. If some of the methods were void and the Cache policy threw an exception for void methods, then we'd need to create two PolicyWraps. The null pattern would allow us to use just one PolicyWrap.

If there are cases where we'd want to throw an exception, then hopefully it's a config option we can setup for the policy. i.e. ignoreVoidMethods=true;

@lakario

This comment has been minimized.

Show comment
Hide comment
@lakario

lakario Jan 9, 2017

Contributor

Have you thought about implementing some kind of metrics functionality similar to what Hystrix has. It would be very useful to be able to see Circuit Breaker metrics in for example Grafana.

My current project would benefit greatly from being able to export more detailed state from circuit breaker and other policies. Specific information like total number of calls, time stamps for each call (start and end), as well as aggregates by CircuitState.

Contributor

lakario commented Jan 9, 2017

Have you thought about implementing some kind of metrics functionality similar to what Hystrix has. It would be very useful to be able to see Circuit Breaker metrics in for example Grafana.

My current project would benefit greatly from being able to export more detailed state from circuit breaker and other policies. Specific information like total number of calls, time stamps for each call (start and end), as well as aggregates by CircuitState.

@tomkerkhove

This comment has been minimized.

Show comment
Hide comment
@tomkerkhove

tomkerkhove Jan 9, 2017

+1 for the metrics

+1 for the metrics

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jan 15, 2017

Member

COMMUNITY FEEDBACK SOUGHT

Following release of Polly v5.0, the potential Polly roadmap has been updated with a few new items and clarifications.

There is clear community interest in metrics; work on this is beginning, including discussion in our slack channel. Full proposals will be posted on GitHub as they cohere.

Beyond metrics, what is your view of priorities among the other items?

What would you like to see, that is not on the roadmap?

Member

reisenberger commented Jan 15, 2017

COMMUNITY FEEDBACK SOUGHT

Following release of Polly v5.0, the potential Polly roadmap has been updated with a few new items and clarifications.

There is clear community interest in metrics; work on this is beginning, including discussion in our slack channel. Full proposals will be posted on GitHub as they cohere.

Beyond metrics, what is your view of priorities among the other items?

What would you like to see, that is not on the roadmap?

@SamuelEnglard

This comment has been minimized.

Show comment
Hide comment
@SamuelEnglard

SamuelEnglard Jan 16, 2017

Personally failover and dynamic configurations would be my next top items. I'd have dynamic configuration first since that can be used to create failovers

Personally failover and dynamic configurations would be my next top items. I'd have dynamic configuration first since that can be used to create failovers

@railarmenien

This comment has been minimized.

Show comment
Hide comment
@railarmenien

railarmenien Apr 2, 2017

CircuitBreaker and AdvancedCircuitBreaker have a locking nature due to the use of a Monitor in the TimedLock implementation. A implementation using the Interlocked class may avoid to lock the circuit state. What do you think about it ? Would it be relevant ?

CircuitBreaker and AdvancedCircuitBreaker have a locking nature due to the use of a Monitor in the TimedLock implementation. A implementation using the Interlocked class may avoid to lock the circuit state. What do you think about it ? Would it be relevant ?

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Apr 8, 2017

Member

@railarmenien re:

CircuitBreaker and AdvancedCircuitBreaker have a locking nature due to the use of a Monitor in the
TimedLock implementation. A implementation using the Interlocked class may avoid to lock the circuit
state. What do you think about it ? Would it be relevant ?

Thanks @railarmenien for this great point! The circuit-breaker design App-vNext inherited when taking over the project, as you say, uses locking. I started investigating moving the design to Interlocked in January when we resolved #216 with lock-free techniques. My broad-brush assessment was:

(a) Metrics calculation and general operation in Closed state: is the key case to optimise. It makes sense to absolutely minimise any latency through contention for concurrent, multi-threaded requests on a healthy system. As most metrics updates consist of increments and assignments/queue-manipulation (rolling the metrics buckets over), these can likely be moved to Interlocked and/or thread-safe collections. Same for HalfOpen state if #239 is pursued.

(b) Transitions of circuit-state (where we have locks guarding modifications to several variables): seems not/less worth optimising away the locks. If your circuit is breaking, the problems you are likely suffering (eg network latency; faulting downstream system; breaking the circuit for presumably several seconds) are all orders of magnitude greater time-wise than the differences between locking techniques. No value to increase code complexity/risks here?

(c) Read locks: Some locks effectively act only as read locks. Some I think could be reasoned away as unnecessary. Others (eg if read-then-modify) cannot.

There's also extensive discussion here on pros/cons including from the eminent Reed Copsey. The broad conclusion is 'measure don't assume' (so we should do that). My instinct, reviewing the code, is that we should pursue (a), and would make gains, but we ought to take an evidence-based approach and get some benchmarking around this (as @ankitbko has suggested elsewhere) .

What do you / does anyone think?

I can take this forward but it requires a dedicated chunk of time. I'd be very happy to hear further comment/contributions from anyone in the community.

Member

reisenberger commented Apr 8, 2017

@railarmenien re:

CircuitBreaker and AdvancedCircuitBreaker have a locking nature due to the use of a Monitor in the
TimedLock implementation. A implementation using the Interlocked class may avoid to lock the circuit
state. What do you think about it ? Would it be relevant ?

Thanks @railarmenien for this great point! The circuit-breaker design App-vNext inherited when taking over the project, as you say, uses locking. I started investigating moving the design to Interlocked in January when we resolved #216 with lock-free techniques. My broad-brush assessment was:

(a) Metrics calculation and general operation in Closed state: is the key case to optimise. It makes sense to absolutely minimise any latency through contention for concurrent, multi-threaded requests on a healthy system. As most metrics updates consist of increments and assignments/queue-manipulation (rolling the metrics buckets over), these can likely be moved to Interlocked and/or thread-safe collections. Same for HalfOpen state if #239 is pursued.

(b) Transitions of circuit-state (where we have locks guarding modifications to several variables): seems not/less worth optimising away the locks. If your circuit is breaking, the problems you are likely suffering (eg network latency; faulting downstream system; breaking the circuit for presumably several seconds) are all orders of magnitude greater time-wise than the differences between locking techniques. No value to increase code complexity/risks here?

(c) Read locks: Some locks effectively act only as read locks. Some I think could be reasoned away as unnecessary. Others (eg if read-then-modify) cannot.

There's also extensive discussion here on pros/cons including from the eminent Reed Copsey. The broad conclusion is 'measure don't assume' (so we should do that). My instinct, reviewing the code, is that we should pursue (a), and would make gains, but we ought to take an evidence-based approach and get some benchmarking around this (as @ankitbko has suggested elsewhere) .

What do you / does anyone think?

I can take this forward but it requires a dedicated chunk of time. I'd be very happy to hear further comment/contributions from anyone in the community.

@ankitbko

This comment has been minimized.

Show comment
Hide comment
@ankitbko

ankitbko Apr 8, 2017

Contributor

@reisenberger
There may be another approach but I am not sure about this. Hystrix seems to do it little differently (need to check in detail). It extensively uses Reactive for both circuit breaking and metrics implementation. What I saw was that events are written into a thread-local Rx stream. From there it is then aggregated into command-specific stream. The entire circuit breaker logic(bucketing and aggregating) is done through Rx streams (by using window). Since Rx streams are thread safe, this may allow us to circumvent manual locking.

Benchmarking first should allow us to compare the results and also check if introducing addional complexity makes sense or not.

PS: I am not sure about this. I was just glancing over Hystrix code quite some time ago. I am not also very familiar with Rx.

Contributor

ankitbko commented Apr 8, 2017

@reisenberger
There may be another approach but I am not sure about this. Hystrix seems to do it little differently (need to check in detail). It extensively uses Reactive for both circuit breaking and metrics implementation. What I saw was that events are written into a thread-local Rx stream. From there it is then aggregated into command-specific stream. The entire circuit breaker logic(bucketing and aggregating) is done through Rx streams (by using window). Since Rx streams are thread safe, this may allow us to circumvent manual locking.

Benchmarking first should allow us to compare the results and also check if introducing addional complexity makes sense or not.

PS: I am not sure about this. I was just glancing over Hystrix code quite some time ago. I am not also very familiar with Rx.

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger Jun 11, 2017

Member

Added the idea of a RateLimitPolicy (serial throttle) to the Roadmap.

Member

reisenberger commented Jun 11, 2017

Added the idea of a RateLimitPolicy (serial throttle) to the Roadmap.

@MihaMarkic

This comment has been minimized.

Show comment
Hide comment
@MihaMarkic

MihaMarkic Mar 22, 2018

Would be nice to have a .NET Standard 2.0 target in NuGet package as well. The reason is that NuGet would import a ton of referenced packages otherwise.

Would be nice to have a .NET Standard 2.0 target in NuGet package as well. The reason is that NuGet would import a ton of referenced packages otherwise.

@reisenberger

This comment has been minimized.

Show comment
Hide comment
@reisenberger

reisenberger May 6, 2018

Member

@MihaMarkic @devapalanisamy Polly v6.0.0, now published, includes a .NET Standard 2.0 target.

Member

reisenberger commented May 6, 2018

@MihaMarkic @devapalanisamy Polly v6.0.0, now published, includes a .NET Standard 2.0 target.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment