[Segment Replication] Prioritize replica shard movement during shard relocation #8265

Closed
Poojita-Raj opened this issue Jun 26, 2023 · 3 comments · Fixed by #8875

Labels: distributed framework, enhancement, Indexing:Replication, v2.10.0

Comments

@Poojita-Raj
Contributor

Poojita-Raj commented Jun 26, 2023

Is your feature request related to a problem? Please describe.
When a node or set of nodes is excluded, its shards are moved away in no particular order. When segment replication is enabled for a cluster, this can leave the cluster in a mixed-version state in which replicas sit on lower-version nodes, cannot read the segments sent by higher-version primaries, and fail.

To avoid this, we could prioritize replica shard movement.

Describe the solution you'd like
Prioritize replica shard movement (when segment replication is enabled) to avoid entering a mixed-version state.

Reference: #1445

Parent issue: #3881

@Poojita-Raj
Contributor Author

There is an existing cluster level setting for prioritizing primary shard movement.

public static final Setting<Boolean> SHARD_MOVE_PRIMARY_FIRST_SETTING = Setting.boolSetting(
        "cluster.routing.allocation.move.primary_first",
        false,
        Property.Dynamic,
        Property.NodeScope
    );

To introduce replica-first movement of shards, these are the approaches we can take:

  1. Introduce a new setting that accepts one of three shard movement strategies: primary_first, replica_first and no_preference.
public static final Setting<ShardMovementStrategy> SHARD_MOVEMENT_SETTING = new Setting<>(
        "cluster.routing.allocation.move.strategy",
        ShardMovementStrategy.NO_PREFERENCE,
        Property.Dynamic,
        Property.NodeScope
    );

public enum ShardMovementStrategy {
    PRIMARY_FIRST,
    REPLICA_FIRST,
    NO_PREFERENCE;
}

With the previous setting we have 2 options:

  2. Instead of trying to combine the different options into one setting, we can have two separate boolean cluster settings, primary_first and replica_first. This leads to convoluted logic around which setting should take priority over the other and why. It can be explored further if we cannot make any solution from (1) work.
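
For the single-setting approach, the setting value arrives as a string (for example "replica_first") and has to be mapped onto the enum. A minimal sketch of that mapping follows. The parse helper and the lower-case toString are assumptions made for illustration and are not taken from the OpenSearch code base.

```java
import java.util.Locale;

// Hypothetical sketch: the enum proposed above plus an assumed parse helper.
enum ShardMovementStrategy {
    PRIMARY_FIRST,
    REPLICA_FIRST,
    NO_PREFERENCE;

    // Maps a setting value such as "replica_first" (case-insensitive) to a strategy.
    static ShardMovementStrategy parse(String value) {
        try {
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Illegal shard movement strategy [" + value + "]", e);
        }
    }

    // Renders the strategy in the lower-case form used for setting values.
    @Override
    public String toString() {
        return name().toLowerCase(Locale.ROOT);
    }
}
```

Rejecting unknown values at parse time means a typo in the setting fails fast instead of silently falling back to no_preference.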

Regardless of approach:

  1. NO_PREFERENCE, which moves shards in no particular order, will be the default strategy if no setting is enabled.
  2. The replica_first setting is being introduced to allow shard movement in a mixed-version cluster when segment replication is enabled. However, it is completely optional and will not be set by default on clusters with segment replication enabled. Customers should be able to choose the strategy they want.
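
To make the three strategies concrete, here is a rough sketch of how a list of shards to be moved could be ordered under each one. The Shard class, the Strategy enum and the order method are hypothetical stand-ins, not the actual allocator code, and the real balancer may order shards differently.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not the OpenSearch allocator.
final class ShardMovementOrdering {
    enum Strategy { PRIMARY_FIRST, REPLICA_FIRST, NO_PREFERENCE }

    // Minimal stand-in for a shard routing entry.
    static final class Shard {
        final String id;
        final boolean primary;

        Shard(String id, boolean primary) {
            this.id = id;
            this.primary = primary;
        }

        boolean isPrimary() {
            return primary;
        }
    }

    // Returns a copy of the input ordered according to the strategy.
    // List.sort is stable, so shards of the same kind keep their relative order.
    static List<Shard> order(List<Shard> shards, Strategy strategy) {
        List<Shard> ordered = new ArrayList<>(shards);
        // Boolean.FALSE sorts before Boolean.TRUE, so this puts replicas first.
        Comparator<Shard> replicasFirst = Comparator.comparing(Shard::isPrimary);
        if (strategy == Strategy.REPLICA_FIRST) {
            ordered.sort(replicasFirst);
        } else if (strategy == Strategy.PRIMARY_FIRST) {
            ordered.sort(replicasFirst.reversed());
        }
        // NO_PREFERENCE leaves the incoming order untouched.
        return ordered;
    }
}
```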

@Poojita-Raj
Contributor Author

After further investigation, it is not possible to keep SHARD_MOVE_PRIMARY_FIRST_SETTING as the fallback setting for SHARD_MOVEMENT_STRATEGY_SETTING after deprecating it, because the types of the two settings do not match. A fallback setting can only be used when both settings have the same type; since we are moving from a boolean setting to an enum (ShardMovementStrategy), this is not possible.

We are still deprecating SHARD_MOVE_PRIMARY_FIRST_SETTING. The expected behavior of the two settings is:

If SHARD_MOVEMENT_STRATEGY_SETTING is changed from its default to either PRIMARY_FIRST or REPLICA_FIRST, that strategy is applied whether or not SHARD_MOVE_PRIMARY_FIRST_SETTING is enabled.

If SHARD_MOVEMENT_STRATEGY_SETTING is still at its default of NO_PREFERENCE and SHARD_MOVE_PRIMARY_FIRST_SETTING is enabled, we move the primary shards first. This ensures that users still relying on the deprecated setting see no change in behavior.
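
The precedence between the two settings can be sketched as a small pure function. The class and method names here are hypothetical, and the sketch only encodes the two rules stated above; it is not the code from the eventual change.

```java
// Hypothetical sketch of how the two settings could be combined.
final class EffectiveShardMovementStrategy {
    enum ShardMovementStrategy { PRIMARY_FIRST, REPLICA_FIRST, NO_PREFERENCE }

    // strategy: value of the new enum setting.
    // movePrimaryFirst: value of the deprecated boolean setting.
    static ShardMovementStrategy resolve(ShardMovementStrategy strategy, boolean movePrimaryFirst) {
        // An explicitly chosen strategy always wins over the deprecated flag.
        if (strategy != ShardMovementStrategy.NO_PREFERENCE) {
            return strategy;
        }
        // Otherwise the deprecated flag keeps its old meaning.
        return movePrimaryFirst ? ShardMovementStrategy.PRIMARY_FIRST : ShardMovementStrategy.NO_PREFERENCE;
    }
}
```

For example, resolve(REPLICA_FIRST, true) yields REPLICA_FIRST, while resolve(NO_PREFERENCE, true) yields PRIMARY_FIRST.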

@Bukhtawar Bukhtawar added the Indexing:Replication Issues and PRs related to core replication framework eg segrep label Jul 27, 2023
@Poojita-Raj
Contributor Author

Additions:

  • Add a check in the NodeVersionAllocationDecider to disallow allocation of a primary shard onto a higher-version node in a mixed-version cluster if segment replication is enabled.
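
A rough sketch of what such a check might look like, outside of the real decider. It assumes one possible reading of "higher version": the target node is newer than at least one node that holds a replica of the same shard. Versions are simplified to integers, and the class and method names are hypothetical; none of this is the NodeVersionAllocationDecider API.

```java
import java.util.Collection;

// Hypothetical sketch; not the actual NodeVersionAllocationDecider logic.
final class SegRepPrimaryVersionCheck {
    // Returns false when placing the primary on the target node would leave
    // a replica on an older node that may be unable to read the primary's segments.
    static boolean canAllocatePrimary(
        int targetNodeVersion,
        Collection<Integer> replicaNodeVersions,
        boolean segmentReplicationEnabled
    ) {
        if (!segmentReplicationEnabled) {
            return true; // the restriction only matters for segment replication
        }
        for (int replicaVersion : replicaNodeVersions) {
            if (replicaVersion < targetNodeVersion) {
                return false; // a replica would lag behind the primary's version
            }
        }
        return true;
    }
}
```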
