
Possible split-brain of ShardCoordinators during oldest node restart #28093

Closed

retriku opened this issue Oct 30, 2019 · 6 comments

@retriku commented Oct 30, 2019

Hello,

We encountered the following issue when the oldest cluster node was restarted. We use akka-cluster, and after the oldest node was restarted it looks like we had some kind of split-brain condition. A couple of PersistentShardCoordinators could not recover, failing with the following error:

2019-10-23 06:03:33,976 ERROR a.c.s.PersistentShardCoordinator Exception in receiveRecover when replaying event type [akka.cluster.sharding.ShardCoordinator$Internal$ShardHomeAllocated] with sequence number [2104] for persistenceId [/sharding/SessionsCoordinator].
java.lang.IllegalArgumentException: requirement failed: Shard [98] already allocated...

Our guess at the scenario is as follows:

  1. Writes to Cassandra were slow (for whatever reason), so it took the PersistentShardCoordinator longer to persist its state.
  2. While the persist was still in progress, the ShardCoordinator migrated to another node and managed to recover there (I think this is possible because PoisonPill is used as the termination message in ClusterSingletonManager).
  3. The new coordinator persisted the same message with a different sequence number than the one from step 1, leading to a corrupted journal.

Code snippet from ClusterShardingGuardian for creating the shard region:

val shardRegion = context.child(encName).getOrElse {
  if (context.child(cName).isEmpty) {
    val coordinatorProps =
      if (settings.stateStoreMode == ClusterShardingSettings.StateStoreModePersistence)
        ShardCoordinator.props(typeName, settings, allocationStrategy)
      else
        ShardCoordinator.props(typeName, settings, allocationStrategy, rep, majorityMinCap)
    val singletonProps =
      BackoffOpts
        .onStop(
          childProps = coordinatorProps,
          childName = "coordinator",
          minBackoff = coordinatorFailureBackoff,
          maxBackoff = coordinatorFailureBackoff * 5,
          randomFactor = 0.2)
        .props
        .withDeploy(Deploy.local)
    val singletonSettings = settings.coordinatorSingletonSettings.withSingletonName("singleton").withRole(role)
    context.actorOf(
      ClusterSingletonManager
        // note: PoisonPill as the termination message is the suspected problem
        .props(singletonProps, terminationMessage = PoisonPill, singletonSettings)
        .withDispatcher(context.props.dispatcher),
      name = cName)
  }

  context.actorOf(
    ShardRegion
      .props(
        typeName = typeName,
        entityProps = entityProps,
        settings = settings,
        coordinatorPath = cPath,
        extractEntityId = extractEntityId,
        extractShardId = extractShardId,
        handOffStopMessage = handOffStopMessage,
        replicator = rep,
        majorityMinCap)
      .withDispatcher(context.props.dispatcher),
    name = encName)
}

Also, the Akka documentation states that PoisonPill should not be used when shutting down persistent actors: https://doc.akka.io/docs/akka/current/persistence.html#safely-shutting-down-persistent-actors.
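
For illustration, a minimal sketch (not from the issue) of the pattern that documentation page recommends: a coordinator-like persistent actor that is stopped with a dedicated message instead of PoisonPill. The names (SafeCoordinator, GracefulStop, Allocate/Allocated) are made up for the example.

import akka.actor.Props
import akka.persistence.PersistentActor

object SafeCoordinator {
  case object GracefulStop                   // dedicated stop message, goes through the mailbox
  final case class Allocate(shard: String)   // hypothetical command
  final case class Allocated(shard: String)  // hypothetical event

  def props: Props = Props(new SafeCoordinator)
}

class SafeCoordinator extends PersistentActor {
  import SafeCoordinator._

  override def persistenceId: String = "/sharding/ExampleCoordinator"

  private var allocated = Set.empty[String]

  override def receiveRecover: Receive = {
    case Allocated(shard) => allocated += shard
  }

  override def receiveCommand: Receive = {
    case Allocate(shard) =>
      // While this persist is outstanding, ordinary incoming commands are stashed internally.
      persist(Allocated(shard)) { evt =>
        allocated += evt.shard
        sender() ! evt
      }

    // PoisonPill is handled automatically and is not stashed, so it can stop the actor
    // before the persist above has been confirmed. A plain message like GracefulStop is
    // stashed like any other command and only processed after pending persists complete.
    case GracefulStop =>
      context.stop(self)
  }
}

The eventual fix in Akka goes in the same direction: the coordinator singleton is stopped with a dedicated ShardCoordinator termination message instead of PoisonPill (see the commits referenced below).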

I wonder if this could be the reason? Any suggestions on how to validate this in a test?

Our setup:
jvm: 11.0.4
scala: 2.12.10
akka: 2.5.25
akka-persistence-cassandra: 0.62
rememberEntities: true

@patriknw (Member) commented Oct 30, 2019

Was this during a graceful leaving (CoordinatedShutdown) or a crash followed by unreachability and downing?

I think your analysis is a possible explanation.
I agree that we should use another message here than PoisonPill.
Would you like to fix it, or shall I?

@retriku (Author) commented Oct 30, 2019

Cluster leaving was graceful. From Cassandra:

persistence_id                | sequence_nr | toTimestamp(timestamp)          | ser_manifest
/sharding/SessionsCoordinator |        2103 | 2019-10-23 05:58:23.651000+0000 |           AG
/sharding/SessionsCoordinator |        2103 | 2019-10-23 05:58:23.905000+0000 |           AD

From cluster sharding logs:

docker5-pr node1 2019-10-23 05:58:23,737 INFO a.c.s.ClusterSingletonManager Singleton manager stopping singleton actor [akka://***/system/sharding/SessionsCoordinator/singleton]
docker6-pr node2 2019-10-23 05:58:23,749 INFO a.c.s.ClusterSingletonManager Singleton manager starting singleton actor [akka://***/system/sharding/SessionsCoordinator/singleton]

I guess I can try to fix this myself.

@patriknw (Member) commented Oct 30, 2019

> Cluster leaving was graceful.

Yes, then it can be pretty quick.

> From Cassandra

There is also a writer_uuid column, which is probably different for the two 2103 events.
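
For reference, a rough sketch (not from the thread) of how one could check that programmatically with the DataStax Java driver 3.x that akka-persistence-cassandra 0.x builds on. The keyspace/table names assume the plugin defaults (akka.messages), and partition_nr = 0 assumes the default partition size, so sequence numbers this low land in the first partition.

import com.datastax.driver.core.Cluster
import scala.collection.JavaConverters._

object CheckWriterUuid extends App {
  val cluster = Cluster.builder().addContactPoint("127.0.0.1").build()
  val session = cluster.connect()

  // Fetch all journal rows for the coordinator's persistence id.
  val rows = session.execute(
    """SELECT sequence_nr, writer_uuid
      |FROM akka.messages
      |WHERE persistence_id = '/sharding/SessionsCoordinator' AND partition_nr = 0""".stripMargin)
    .all().asScala

  // Report any sequence number that was written by more than one writer_uuid,
  // i.e. by more than one incarnation of the coordinator.
  rows.groupBy(_.getLong("sequence_nr")).foreach { case (seqNr, rs) =>
    val writers = rs.map(_.getString("writer_uuid")).toSet
    if (writers.size > 1)
      println(s"sequence_nr $seqNr written by ${writers.size} writers: ${writers.mkString(", ")}")
  }

  cluster.close()
}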

> I guess I can try to fix this myself.

Thanks, let me know if you have any questions.

@retriku (Author) commented Nov 6, 2019

I created a small app based on akka/akka-sample-cqrs-scala that can reproduce the issue.
One needs to run Cassandra and two nodes as described in the Readme file, then restart the nodes starting with the oldest node, and check the logs for something like

11:38:41.473 ERROR [a.c.s.PersistentShardCoordinator] [ClusterSystem-akka.actor.default-dispatcher-11] - Exception in receiveRecover when replaying event type [akka.cluster.sharding.ShardCoordinator$Internal$ShardHomeDeallocated] with sequence number [106] for persistenceId [/sharding/switch-processorCoordinator].

The source of the app can be found at https://github.com/retriku/akka-sample-cqrs-scala

johanandren added a commit that referenced this issue Nov 25, 2019: "…with a dedicated ShardCoordinator termination message"

@johanandren modified the milestones: 2.6.1, 2.5.27 Nov 25, 2019

johanandren added a commit to johanandren/akka that referenced this issue Nov 25, 2019: "…28093 …with a dedicated ShardCoordinator termination message"

raboof added a commit that referenced this issue Nov 26, 2019: "…#28244) …with a dedicated ShardCoordinator termination message"
@qux42 commented Nov 29, 2019

Hello,
does this only affect ClusterSharding when using persistence mode? We aren't finished debugging our problem, but we have encountered something that might be a split-brain of ShardCoordinators too. We are using ddata mode, though, so we are wondering if it could have the same cause. We are using akka 2.5.26.

@johanandren closed this Dec 6, 2019
@patriknw (Member) commented Dec 10, 2019

The fix was also applied to ddata mode. I think the issue could be possible with ddata as well, but it's unlikely.

By the way, 2.5.27 with this fix has been released.
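
Not from the thread, but for anyone who wants to pick up the fix, a minimal sbt sketch; the usual recommendation is to keep all Akka modules on the same version:

// build.sbt: use a release that contains the fix (2.5.27, or 2.6.1 on the 2.6 line)
val akkaVersion = "2.5.27"

libraryDependencies ++= Seq(
  "com.typesafe.akka" %% "akka-cluster-sharding" % akkaVersion,
  "com.typesafe.akka" %% "akka-persistence"      % akkaVersion
)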

navaro1 added a commit to navaro1/akka that referenced this issue Dec 17, 2019