Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recovery strategies when handler fails to process, #17 #95

Merged
merged 4 commits into from
May 8, 2020

Conversation

patriknw
Copy link
Member

@patriknw patriknw commented May 7, 2020

  • moved recovery strategies and implementation to core
  • used in CassandraProjection (and SlickProjection)
  • new unit test of the HandlerRecoveryImpl
  • also found and fixed a few bugs
    • problem with at-least-once after 1 in Slick
    • problem with at-most-once in Cassandra, async boundary may cause more than one
      offset to be saved before processing

References #17

Draft because on top of other PR, but ready for review.

case (offset, envelope) =>
offsetStore
.saveOffset(projectionId, offset)
.flatMap(_ => applyUserRecovery(envelope, () => handler.process(envelope)))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was wrong because the async boundary may cause more than one offset to be saved before processing


// TODO: add a LogSource for projection when we have a name and key
val logger = Logging(systemProvider.classicSystem, this.getClass)

val futDone = Future.successful(Done)

def applyUserRecovery(envelope: Envelope)(futureCallback: () => Future[Done]) = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was moved to core (and improved)

}
}

def aroundUserHandler(env: Envelope)(handlerFunc: Envelope => slick.dbio.DBIO[Done]) = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this. I think the purpose was to deal with thrown exceptions, but that is now handled in same way as if the handler returned Future.failed. There were no tests for this and I don't think it was completely correct anyway.

.transactionally

applyUserRecovery(env) { () =>
databaseConfig.db.run(txDBIO).map(_ => Done)
}
}

def processEnvelopeAndStoreOffsetInSeparateTransactions(env: Envelope): Future[Done] = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed this, and using future composition instead, see the AtLeastOnce(1, _) case below. Strange if offsetStore fails, then we would retry and run the handler transaction again.

Copy link
Member

@octonato octonato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

A few change request here and there though

def applyUserRecovery[Offset, Envelope](
handler: HandlerRecovery[Envelope],
envelope: Envelope,
sourceProvider: SourceProvider[Offset, Envelope],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here we can pass just the extracted offset

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good

@@ -17,6 +29,6 @@ import akka.Done
* guarantees between the invocations are handled automatically, i.e. no volatile or
* other concurrency primitives are needed for managing the state.
*/
trait Handler[Envelope] {
abstract class Handler[Envelope] extends HandlerRecovery[Envelope] {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ApiMayChange like in Scala?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I'll also add a ticket so that we take a look at all of it before "final"

retryAndSkip(retries, delay.asScala)

/**
* INTERNAL API: placed here instead of the `internal` package because of sealed trait
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

retied

case RetryAndSkip(retries, delay) =>
logger.error(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning since we are skipping

"fail" in {
val handler = new FailHandler(HandlerRecoveryStrategy.fail, failOnOffset = 3)
val result =
HandlerRecoveryImpl.applyUserRecovery(handler, env3, sourceProvider, logger, () => handler.process(env3))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we pass only the offset, we don't even need to hack a SourceProvider here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

The projection impl will have access to the Envelope and SourceProvider already so it can extract the offset then.


class ConcatHandlerFail4(recoveryStrategy: HandlerRecoveryStrategy = HandlerRecoveryStrategy.fail)
extends SlickHandler[Envelope] {
private val _attempts = new AtomicInteger()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍, I didn't think about making this atomic on first impl. Good catch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reason is that it is accessed from the outside of the handler, from the test

@patriknw
Copy link
Member Author

patriknw commented May 7, 2020

changed so that retries are not used for atMostOnce because it would be more than once

Copy link
Member

@seglo seglo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice 👍


// retries - 1 because retry() is based on attempts
// first attempt is performed immediately and therefore we must first delay
val retied = after(delay, scheduler)(retry(tryFutureCallback, retries - 1, delay))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo

futDone

case RetryAndFail(retries, delay) =>
logger.error(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a warning since we're going to retry?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I think so

"fail" in {
val handler = new FailHandler(HandlerRecoveryStrategy.fail, failOnOffset = 3)
val result =
HandlerRecoveryImpl.applyUserRecovery(handler, env3, sourceProvider, logger, () => handler.process(env3))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

The projection impl will have access to the Envelope and SourceProvider already so it can extract the offset then.

@patriknw patriknw force-pushed the wip-43-eventhandler-trait-patriknw branch from 4707a78 to d74431d Compare May 7, 2020 17:37
* moved recovery strategies and implementation to core
* used in CassandraProjection (and SlickProjection)
* new unit test of the HandlerRecoveryImpl
* also found and fixed a few bugs
  * problem with at-least-once after 1 in Slick
  * problem with at-most-once in Cassandra, async boundary may cause more than one
    offset to be saved before processing
@patriknw patriknw force-pushed the wip-17-fail-strat-patriknw branch from a6c58b1 to 3b38393 Compare May 7, 2020 18:24
@patriknw patriknw changed the base branch from wip-43-eventhandler-trait-patriknw to master May 7, 2020 18:25
@patriknw patriknw marked this pull request as ready for review May 7, 2020 18:25
@patriknw
Copy link
Member Author

patriknw commented May 7, 2020

incorporated feedback

@patriknw patriknw closed this May 7, 2020
@patriknw patriknw reopened this May 7, 2020
@patriknw
Copy link
Member Author

patriknw commented May 8, 2020

some test failure, I'll look into it

@patriknw patriknw merged commit 42d4bf0 into master May 8, 2020
@patriknw patriknw deleted the wip-17-fail-strat-patriknw branch May 8, 2020 09:37
johanandren pushed a commit that referenced this pull request Mar 23, 2023
* simple search and replace refactoring so not much too review
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants