Fail replica shards locally upon failures #5847

Closed
wants to merge 2 commits into from

Commits on Apr 17, 2014

  1. Fail replica shards locally upon failures

    When a replication operation (index/delete/update) fails to execute properly, we fail the replica and allow the master to allocate a new copy of it. At the moment, the node hosting the primary shard is responsible for notifying the master of a failed replica. However, if the replica shard is still initializing (`POST_RECOVERY` state), there is a race condition between the failed shard message and the shard moving into the `STARTED` state. If the latter happens first, the master will fail to resolve the failed shard message.
    
    This PR builds on elastic#5800 and fails the engine of the replica shard if a replication operation fails. This protects us against the above, as the shard will reject the `STARTED` command from the master. It also makes us more resilient to other race conditions in this area.
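The behaviour described above can be illustrated with a toy state machine. This is only a sketch under stated assumptions, not the code in this PR: `ReplicaShard`, `apply_replication_op`, `fail_locally`, and `on_master_started` are hypothetical names, and only the `POST_RECOVERY` and `STARTED` state names come from the commit message. The point it shows is that once the replica has failed itself locally, a late `STARTED` command from the master is rejected rather than applied.

```python
from enum import Enum


class ShardState(Enum):
    POST_RECOVERY = "POST_RECOVERY"
    STARTED = "STARTED"
    FAILED = "FAILED"  # hypothetical name for a locally failed shard


class ReplicaShard:
    """Toy model of a replica shard; not Elasticsearch's actual classes."""

    def __init__(self):
        self.state = ShardState.POST_RECOVERY
        self.failure_reason = None

    def apply_replication_op(self, op):
        """Run a replicated operation; fail the shard locally if it raises."""
        try:
            op()
        except Exception as exc:
            self.fail_locally(repr(exc))
            return False
        return True

    def fail_locally(self, reason):
        # Stands in for failing the engine on the replica node itself.
        self.state = ShardState.FAILED
        self.failure_reason = reason

    def on_master_started(self):
        """Handle the master's STARTED command; reject it unless still recovering."""
        if self.state is not ShardState.POST_RECOVERY:
            return False
        self.state = ShardState.STARTED
        return True


def broken_index_op():
    # Simulated failure of a replicated index operation.
    raise IOError("simulated replica write failure")


replica = ReplicaShard()
applied = replica.apply_replication_op(broken_index_op)
accepted = replica.on_master_started()  # arrives after the local failure
print(applied, accepted, replica.state.name)
```

In this model the order of the two messages no longer matters for correctness: the local failure is recorded on the replica first, so the later `STARTED` command cannot move a broken shard into service.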
    bleskes committed Apr 17, 2014
    2875dd4

Commits on Apr 18, 2014

  1. removed duplicate logging message and made sure the information that it contained is passed on

    bleskes committed Apr 18, 2014
    40ff440