
[Segment Replication] Update peer recovery logic for segment replication #5344

Merged
dreamer-89 merged 10 commits into opensearch-project:main on Jan 6, 2023

Conversation

dreamer-89
Member

@dreamer-89 dreamer-89 commented Nov 22, 2022

Signed-off-by: Suraj Singh surajrider@gmail.com

Description

Update the peer recovery logic to work with segment replication. The existing primary relocation flow is broken because of segment file conflicts. This happens because the target (new primary) is started with an InternalEngine and writes its own version of the segment files, which later conflict with the file copies on the replicas (copied from the older primary) once this target is promoted to primary.

This change patches the recovery flow. First, the primary relocation target is recovered with an NRTReplicationEngine, which prevents the target from generating its own segment files. After the phase 1 file copy, the older primary is blocked from publishing segment-copy events to replicas (so auto refreshes no longer push segment files to them). After phase 2, before finalizeRecovery and the handoff, the target (new primary) is forced to sync segment files from the older primary (to catch up with the other replicas), and its engine is then switched to InternalEngine.
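
A rough sketch of this sequence is below. It is illustrative only: the TargetShard and PrimaryShard interfaces and every method on them are hypothetical stand-ins (not the classes or APIs introduced by this PR), written just to make the ordering explicit.

    // Hypothetical interfaces; not the PR's real APIs.
    interface TargetShard {
        void openNrtReplicationEngine();    // read-only engine, so the target generates no segments
        void receiveSegmentFiles();         // phase 1: segment file copy from the old primary
        void replayTranslog();              // phase 2: translog replay
        void forceSegmentReplicationSync(); // catch up with replicas before handoff
        void switchToInternalEngine();      // become a writable primary
    }

    interface PrimaryShard {
        void blockCheckpointPublication();  // stop pushing refresh checkpoints to replicas
        void handOffPrimaryContext();       // finalizeRecovery and relocation handoff
    }

    final class RelocationFlowSketch {
        static void relocate(PrimaryShard oldPrimary, TargetShard newPrimary) {
            newPrimary.openNrtReplicationEngine();    // target cannot write its own segment files
            newPrimary.receiveSegmentFiles();         // phase 1
            oldPrimary.blockCheckpointPublication();  // after phase 1
            newPrimary.replayTranslog();              // phase 2
            newPrimary.forceSegmentReplicationSync(); // before finalizeRecovery and handoff
            newPrimary.switchToInternalEngine();
            oldPrimary.handOffPrimaryContext();
        }
    }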

Issues Resolved

#5242

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dreamer-89 dreamer-89 requested review from a team and reta as code owners November 22, 2022 21:16
@dreamer-89 dreamer-89 changed the title [Segment Replicaiton] Update peer recovery logic for segment replication [Segment Replication] Update peer recovery logic for segment replication Nov 22, 2022
@dreamer-89 dreamer-89 marked this pull request as draft November 22, 2022 21:16

@dreamer-89 dreamer-89 marked this pull request as ready for review November 22, 2022 21:58

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@dreamer-89
Member Author

dreamer-89 commented Nov 23, 2022

Gradle Check (Jenkins) Run Completed with:

Looks like a legitimate test failure. Looking into it. Fixed. Working on adding unit tests around recovery inside IndexShardTests.

  2> REPRODUCE WITH: ./gradlew ':server:test' --tests "org.opensearch.index.shard.IndexShardTests.testPrimaryHandOffUpdatesLocalCheckpoint" -Dtests.seed=EF62886CB90D105 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=lv-LV -Dtests.timezone=Atlantic/St_Helena -Druntime.java=19
  2> java.lang.AssertionError: RecoveryFailedException[[index][0]: Recovery failed from {dsEyIzXaub}{dsEyIzXaub}{Lyp3hhGUQ6KLW2cur9rZLQ}{0.0.0.0}{0.0.0.0:72}{dimrs} into {VrZdpzmpJk}{VrZdpzmpJk}{yNPgntL_ROidrq8BPYEYrw}{0.0.0.0}{0.0.0.0:71}{dimrs} ([index][0]: Recovery failed from {VrZdpzmpJk}{VrZdpzmpJk}{yNPgntL_ROidrq8BPYEYrw}{0.0.0.0}{0.0.0.0:71}{dimrs} into {dsEyIzXaub}{dsEyIzXaub}{Lyp3hhGUQ6KLW2cur9rZLQ}{0.0.0.0}{0.0.0.0:72}{dimrs})]; nested: RecoveryFailedException[[index][0]: Recovery failed from {VrZdpzmpJk}{VrZdpzmpJk}{yNPgntL_ROidrq8BPYEYrw}{0.0.0.0}{0.0.0.0:71}{dimrs} into {dsEyIzXaub}{dsEyIzXaub}{Lyp3hhGUQ6KLW2cur9rZLQ}{0.0.0.0}{0.0.0.0:72}{dimrs}]; nested: RecoveryEngineException[Phase[1] prepare target for translog failed]; nested: IllegalArgumentException[Shard can only be wired as a read only replica with Segment Replication enabled];
        at __randomizedtesting.SeedInfo.seed([EF62886CB90D105:92C2488BA82DB989]:0)
        at org.opensearch.index.shard.IndexShardTestCase$2.onFailure(IndexShardTestCase.java:187)
        at org.opensearch.indices.recovery.RecoveryTarget.notifyListener(RecoveryTarget.java:141)
        at org.opensearch.indices.replication.common.ReplicationTarget.fail(ReplicationTarget.java:175)
        at org.opensearch.index.shard.IndexShardTestCase.recoverUnstartedReplica(IndexShardTestCase.java:899)
        at org.opensearch.index.shard.IndexShardTestCase.recoverReplica(IndexShardTestCase.java:824)
        at org.opensearch.index.shard.IndexShardTestCase.recoverReplica(IndexShardTestCase.java:806)
        at org.opensearch.index.shard.IndexShardTests.testPrimaryHandOffUpdatesLocalCheckpoint(IndexShardTests.java:2431)

CHANGELOG.md Outdated
if (didRefresh
    && shard.state() != IndexShardState.CLOSED
    && shard.getReplicationTracker().isPrimaryMode()
    && shard.isBlockInternalCheckPointRefresh() == false) {
Member

@ashking94 ashking94 Dec 7, 2022

Why is primary mode not sufficient for publishing?

Let's assume the condition is correct; how do we ensure that the publisher stops in the case where publishing has already started and this condition then becomes false? Primary term validation is present only for TransportReplicationActions. Is it possible that a particular target node could be receiving segment files from both the old primary and the new primary, given that the afterRefresh method is async? If this understanding is incorrect, please point to the code that handles this.

Member Author

Thanks @ashking94 for raising this point. Today, SegRepTargetService uses events (the current shard being promoted to primary, and shard close) to cancel ongoing replication. When it receives a new checkpoint, SegRepTargetService also cancels ongoing replication that is running against a lower primary term (the older primary). However, I have observed that during primary relocation the primary term bump does not happen, so this case needs separate handling.
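
For illustration, a minimal sketch of that term-based cancellation, using simplified stand-in types rather than the actual SegRepTargetService code:

    // Simplified stand-ins; not the actual SegRepTargetService implementation.
    record Checkpoint(long primaryTerm, long segmentInfosVersion) {}

    final class OngoingReplication {
        final Checkpoint source;
        OngoingReplication(Checkpoint source) { this.source = source; }
        void cancel(String reason) { /* stop file copy, release resources */ }
    }

    final class TargetServiceSketch {
        private OngoingReplication ongoing;

        // Invoked when a replica receives a published checkpoint.
        void onNewCheckpoint(Checkpoint received) {
            if (ongoing != null && received.primaryTerm() > ongoing.source.primaryTerm()) {
                // The in-flight copy is running against a stale (lower-term) primary: cancel it.
                ongoing.cancel("checkpoint received from a higher primary term");
                ongoing = null;
            }
            // During primary relocation the term is not bumped, so this check alone does not
            // cancel copies from the old primary -- hence the extra handling discussed here.
        }
    }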

Collaborator

That's correct; a primary term bump does not happen during primary peer recoveries.

Member Author

I think it can be handled in the clusterChanged event inside SegmentReplicationSourceService. I will start with a test first. Added one integration test here, along with other minor cosmetic changes.

Member Author

Discussed this offline with @mch2, will take this up in a separate task. Created #5706 to track this separately.

Signed-off-by: Suraj Singh <surajrider@gmail.com>
Signed-off-by: Suraj Singh <surajrider@gmail.com>
…tService

Signed-off-by: Suraj Singh <surajrider@gmail.com>
Member

@mch2 mch2 left a comment

LGTM

Signed-off-by: Suraj Singh <surajrider@gmail.com>

Signed-off-by: Suraj Singh <surajrider@gmail.com>

@dreamer-89
Member Author

dreamer-89 commented Jan 5, 2023

@Bukhtawar @ashking94 : Please review the latest changes.

@dreamer-89 dreamer-89 merged commit d56965c into opensearch-project:main Jan 6, 2023
@dreamer-89
Member Author

@Bukhtawar : Please feel free to leave any comment in case you have any concerns.

@ashking94
Member

@dreamer-89 are you planning to backport this PR to 2.x?

@dreamer-89
Member Author

@dreamer-89 are you planning to backport this PR to 2.x?

Yes @ashking94. I am planning to backport this along with the fix. Right now, I am testing the fix on an actual cluster with the setup you mentioned in #5848.

@andrross
Member

@dreamer-89 are you planning to backport this PR to 2.x?

I am planning to backport this along with the fix. Right now, I am testing the fix on actual cluster with setup you mentioned in #5848

It is probably best to stay on top of the backports (even if in this case it means backporting a muted integration test). When you don't backport a change that touches the same files as a subsequent change that does need to be backported, the cherry-pick will fail and things get harder for everyone. #5944 is a backport of a change that touched the new integration test introduced in this PR. I had to resolve a conflict on the cherry-pick to exclude the new file, and it also means that when you do backport, you'll need to manually apply the subsequent changes to that integration test that happened on main.

@dreamer-89 dreamer-89 added the backport 2.x label Jan 20, 2023
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-5344-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 d56965c59636b18450a48378668f6e2f0f0529e9
# Push it to GitHub
git push --set-upstream origin backport/backport-5344-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-5344-to-2.x.

@dreamer-89
Member Author

Thank you @andrross for the comment, and apologies for the inconvenience. The backport was initially delayed to keep the change out of the 2.5.0 release. I am backporting this change now.

dreamer-89 added a commit to dreamer-89/OpenSearch that referenced this pull request Jan 20, 2023
…ion (opensearch-project#5344)

* [Segment Replication] Update peer recovery logic for segment replication

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Add integration test for indexing during relocation

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Address review comments

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Spotless check apply fixes & one failing unit test

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Delay force segment replication till relocation handoff

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Spotless fix

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Unit test fix

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Address review comment, move force segrep sync handler to SegRepTargetService

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Resolve conflicts and enabled tests

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Spotless fix

Signed-off-by: Suraj Singh <surajrider@gmail.com>

Signed-off-by: Suraj Singh <surajrider@gmail.com>
kotwanikunal pushed a commit that referenced this pull request Jan 20, 2023
…ion (#5344) (#5959)

kotwanikunal pushed a commit that referenced this pull request Jan 25, 2023
…ion (#5344) (#5959)

mch2 pushed a commit to mch2/OpenSearch that referenced this pull request Mar 4, 2023
…ion (opensearch-project#5344)
