Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DISCUSS] Declaring event handlers globally or per policy #154

Closed
jhalterman opened this issue Dec 22, 2018 · 11 comments
Closed

[DISCUSS] Declaring event handlers globally or per policy #154

jhalterman opened this issue Dec 22, 2018 · 11 comments

Comments

@jhalterman
Copy link
Member

jhalterman commented Dec 22, 2018

Failsafe currently allows most event handlers to be declared globally. Ex:

RetryPolicy retryPolicy = new RetryPolicy();
CircuitBreaker circuitBreaker = new CircuitBreaker();
Failsafe
  .with(retryPolicy)
  .with(circuitBreaker)
  .onRetry(f -> refreshContext())
  .get(this::connect);

Rather than per policy:

RetryPolicy retryPolicy = new RetryPolicy()
  .onRetry(f -> refreshContext());
CircuitBreaker circuitBreaker = new CircuitBreaker();
Failsafe
  .with(retryPolicy)
  .with(circuitBreaker)
  .get(this::connect);

One obvious problem with the current approach is that some events don't apply to all policies (ex: onRetry does nothing for a CircuitBreaker). So why the current approach? It was mostly based on the idea that policy configuration (RetryPolicy) should be separate from execution logic (Failsafe...) which presumably includes event handling. A common RetryPolicy might be reused for lots of different types of executions, but the thinking was that if you start to embed event handling logic into it then maybe it wouldn't be.

Option 1

The first option is to leave things as they are based on the idea that separating event handling from policy config is a good thing and that it will give us some policy reuse.

Option 2

Another option is to move the policy specific event handling to the policies. So the retry related event handling would be moved to RetryPolicy. This would leave just the completion related events at the global level. Ex:

RetryPolicy retryPolicy = new RetryPolicy()
  .onRetry(f -> refreshContext());
CircuitBreaker circuitBreaker = new CircuitBreaker();
Failsafe
  .with(retryPolicy)
  .with(circuitBreaker)
  .onSuccess(cxn -> log.info("Connected to {}", cxn))
  .get(this::connect);

Option 3

Another option is to move all event handling to the policy level, including (some) completion events:

RetryPolicy retryPolicy = new RetryPolicy()
  .onRetry(f -> refreshContext())
  .onSuccess(cxn -> log.info("Connected to {}", cxn))
CircuitBreaker circuitBreaker = new CircuitBreaker()
  .onFailure(f ->log.warn("The CircuitBreaker logged a failure"))

Failsafe
  .with(retryPolicy)
  .with(circuitBreaker)
  .get(this::connect);

In practice, each policy makes its own decision about whether an execution was a success or failure, and different Exceptions/results may be failures for different policies (ex: a CircuitBreaker might handle TimeoutExceptions whereas a RetryPolicy might handle ConnectExceptions). In that sense, having a global onSuccess event handler might not make sense, since it can only be called if all of the policies were successful and loses some detail.

That said, the onComplete event handler may still need to be global, since it provides a completion callback for async executions, similar to CompletableFuture.whenComplete, and while success/failure is a policy-specific decision, completion is a global event.

Thoughts?

@jhalterman
Copy link
Member Author

Shamelessly tagging a few people who dropped by previously. Please weigh in on this topic if interested :)

/cc @ben-manes @Tembrel @MALPI @axelfontaine @whiskeysierra @chhsiao90

@whiskeysierra
Copy link

Option 1 would require to redesign the current approach of circuit breaker callbacks which are bound to a circuit breaker, rather than per execution. Right?

Option 2 and 3 require that those policies are either created whenever Failsafe is invoked (non-shared) and once when a shared policy is created. Creating them whenever needed kind of defeats the idea of having something that I can easily pass like a RetryPolicy object. Having a singleton policy that I can share including event handlers defeats the purpose of having event dedicated handlers per execution.

Someone who wants a) a policy given/shared and b) it's own event handlers won't work with option 2/3.

I'd propose the following:

  1. Make policies immutable (not sure if/how that would work with CircuitBreaker), so they can be shared freely without the risk of someone modifying it afterwards
  2. Extend the scope of policies from basic parameters (max number of retries, delays, jitter, etc.) to include default event handlers/callbacks.
  3. Allow users to either override or add event handlers on each execution/invocation.

@jhalterman
Copy link
Member Author

Option 1 would require to redesign the current approach of circuit breaker callbacks which are bound to a circuit breaker, rather than per execution. Right?

Option 1 would just be to leave things as they are today.

Having a singleton policy that I can share including event handlers defeats the purpose of having event dedicated handlers per execution.

Do you think something like onRetry should not be shared?

@whiskeysierra
Copy link

Option 1 would just be to leave things as they are today.

But today the location of event handlers is already inconsistent. Retry event handlers are per execution, circuit breaker callbacks are per circuit breaker.

@whiskeysierra
Copy link

Do you think something like onRetry should not be shared?

I could imagine use cases for both. Counting retries across the board for metrics and monitoring would be a common use case that would be easier with a shared event handler on a per-policy basis. On the other hand, one might want to log or persist the retry counter for one particular execution but not for all of them, something that would be done way easier within a per-execution event handler.

@jhalterman
Copy link
Member Author

jhalterman commented Dec 24, 2018

I could imagine use cases for both. Counting retries across the board for metrics and monitoring would be a common use case that would be easier with a shared event handler on a per-policy basis.

Another way we could solve this would be to have a base RetryPolicy that logs events system-wide and make a copy for more specific things.

@whiskeysierra How about the onSuccess and onFailure events? Success/failure is really specific to a policy and doesn't make as much sense globally since one policy might succeed and one might fail.

@jhalterman
Copy link
Member Author

jhalterman commented Dec 27, 2018

Thinking more about this... for Failsafe 2.0, it might make sense to move basically all event handling out of the Failsafe... flow and into separate policies. My thinking is:

  • onComplete is not necessary for sync
  • onComplete for async could be accomplished by returning something like a CompletableFuture
  • onSuccess/onFailure are more closely aligned with Policies, where the notion of success/failure are defined, rather than globally.
  • onRetry/onFailedAttempt/etc are specific to RetryPolicy

Thoughts? /cc @Tembrel (since I see you're active atm :) )

@MALPI
Copy link
Contributor

MALPI commented Dec 28, 2018

From my experience it totally makes sense to have globally assigned listeners and maybe even provide some default ones. For example we use by default LoggingListener in each of our projects which do write logs for onFailure and the other cases. Some of them are written in debug only (for example onSuccess, some of them in warn or info for example onFailure.

Having this assignment on each usage of Failsafe creates boilerplate. Additionally for policies, we mostly re-use the RetryPolicy per project but have to assign it each time as well. Having means to assign certain policies and listeners globally would make life easier.

Also it would make life for maintainers of extensions easier, for example there could be provided a library for metrics exposure of Failsafe by using Micrometer.

Last but not least, I'd like to thank you for your work, but it would be great if you could share your ideas and thoughts for the project in a structured way so that others can contribute. I would be more than happy to support the project by contributing to it. Currently it's rather hard as it's not visible to me what your plans and thoughts are.

@jhalterman
Copy link
Member Author

jhalterman commented Dec 28, 2018

From my experience it totally makes sense to have globally assigned listeners and maybe even provide some default ones. For example we use by default LoggingListener in each of our projects which do write logs for onFailure and the other cases. Some of them are written in debug only (for example onSuccess, some of them in warn or info for example onFailure

Could you, alternatively, use a base RetryPolicy that has your default logging for onSuccess and onFailure? Or are you logging those things for CircuitBreaker executions as well?

The problem I'm trying to resolve is if we have global onSuccess or onFailure listeners, going forward we'll need to decide how to deal with an execution that succeeds for one policy but might fail for another (since Failsafe 2 will offer composable policies). We can either base a global event handler off the last policy to be executed, or we could only consider it a success if all policies were successful.

Having means to assign certain policies and listeners globally would make life easier.

Indeed - I'd like to make it more clear in the API that Failsafe instances can be saved and re-used for multiple executions (see "Saving Failsafe Instances" in #160). I'm not sure if this is what you were thinking?

Also it would make life for maintainers of extensions easier, for example there could be provided a library for metrics exposure of Failsafe by using Micrometer.

Any thoughts on specific metrics or events we'd like to collect metrics for? Metrics has come up in #140, but feel free to open a dedicated issue for this.

Last but not least, I'd like to thank you for your work, but it would be great if you could share your ideas and thoughts for the project in a structured way so that others can contribute. I would be more than happy to support the project by contributing to it. Currently it's rather hard as it's not visible to me what your plans and thoughts are.

Agree. I opened #159 to serve as a starting point for discussing future ideas and roadmap items. Specific issues, such as this one, are included in the things I'd like to get done for 2.0. In general, I'd like to focus on providing extension points in Failsafe so 3rd party implementors can add additional integrations and even Policy implementations.

@jhalterman
Copy link
Member Author

Implemented in master.

@jhalterman
Copy link
Member Author

jhalterman commented Jan 16, 2019

The implementation:

onComplete will still be a top level event listener that is always called. onSuccess and onFailure can be registered on an individual Policy, or at the top level. Ex:

RetryPolicy<Object> retryPolicy = new RetryPolicy<>()
  .onSuccess(e -> log.info("retry policy succeeded"))
  .onFailure(e -> log.warn("retry policy failed"));

CircuitBreaker<Object> circuitBreaker = new CircuitBreaker<>()
  .onSuccess(e -> log.info("circuit breaker succeeded"))
  .onFailure(e -> log.warn("circuit breaker failed"));

Failsafe
  .with(retryPolicy, circuitBreaker)
  .onComplete(e -> log.info("done"))
  .onSuccess(e -> log.info("all policies were successful"))
  .onFailure(e -> log.warn("at least one policy failed"))
  .get(this::connect);

At the Policy level, they will be called each time a Policy evaluates an execution result. If retries are involved, this could be multiple times.

At the top (FailsafeExecutor) level, onSuccess will only be called if all policies were successful, else onFailure will be called.

Other event handlers that are specific to particular Policy implementations will live inside those classes, ex: RetryPolicy.onRetry or CircuitBreaker.onOpen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants