
[Segment Replication] Update peer recovery logic for segment replication #5344

Merged
dreamer-89 merged 10 commits into opensearch-project:main on Jan 6, 2023

Conversation

dreamer-89
Member

@dreamer-89 dreamer-89 commented Nov 22, 2022

Signed-off-by: Suraj Singh surajrider@gmail.com

Description

Update the peer recovery logic to work with segment replication. The existing primary relocation flow is broken because of segment file conflicts. This happens because the target (new primary) is started with an InternalEngine and writes its own version of the segment files, which later conflict with the file copies on the replicas (copied from the older primary) once this target is promoted to primary.

This change patches the recovery flow. First, the primary relocation target is recovered with an NRTReplicationEngine, which prevents the target from generating its own segment files. After the phase 1 file copy, the older primary is blocked from publishing segment-copy events to replicas (so auto refreshes no longer push segment files to them). After phase 2, before finalizeRecovery and the handoff, the target (new primary) is forced to sync segment files from the older primary (to catch up with the other replicas), and its engine is then switched to InternalEngine.
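
A rough sketch of this sequence is below. It is illustrative only: the TargetShard and PrimaryShard interfaces and every method on them are hypothetical stand-ins (not the classes or APIs introduced by this PR), written just to make the ordering explicit.

    // Hypothetical interfaces; not the PR's real APIs.
    interface TargetShard {
        void openNrtReplicationEngine();    // read-only engine, so the target generates no segments
        void receiveSegmentFiles();         // phase 1: segment file copy from the old primary
        void replayTranslog();              // phase 2: translog replay
        void forceSegmentReplicationSync(); // catch up with replicas before handoff
        void switchToInternalEngine();      // become a writable primary
    }

    interface PrimaryShard {
        void blockCheckpointPublication();  // stop pushing refresh checkpoints to replicas
        void handOffPrimaryContext();       // finalizeRecovery and relocation handoff
    }

    final class RelocationFlowSketch {
        static void relocate(PrimaryShard oldPrimary, TargetShard newPrimary) {
            newPrimary.openNrtReplicationEngine();    // target cannot write its own segment files
            newPrimary.receiveSegmentFiles();         // phase 1
            oldPrimary.blockCheckpointPublication();  // after phase 1
            newPrimary.replayTranslog();              // phase 2
            newPrimary.forceSegmentReplicationSync(); // before finalizeRecovery and handoff
            newPrimary.switchToInternalEngine();
            oldPrimary.handOffPrimaryContext();
        }
    }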

Issues Resolved

#5242

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dreamer-89 dreamer-89 requested review from a team and reta as code owners November 22, 2022 21:16
@dreamer-89 dreamer-89 changed the title [Segment Replicaiton] Update peer recovery logic for segment replication [Segment Replication] Update peer recovery logic for segment replication Nov 22, 2022
@dreamer-89 dreamer-89 marked this pull request as draft November 22, 2022 21:16

@dreamer-89 dreamer-89 marked this pull request as ready for review November 22, 2022 21:58

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@dreamer-89
Member Author

dreamer-89 commented Nov 23, 2022

Gradle Check (Jenkins) Run Completed with:

Looks like a legitimate test failure. Looking into it. Fixed. Working on adding unit tests around recovery inside IndexShardTests.

  2> REPRODUCE WITH: ./gradlew ':server:test' --tests "org.opensearch.index.shard.IndexShardTests.testPrimaryHandOffUpdatesLocalCheckpoint" -Dtests.seed=EF62886CB90D105 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=lv-LV -Dtests.timezone=Atlantic/St_Helena -Druntime.java=19
  2> java.lang.AssertionError: RecoveryFailedException[[index][0]: Recovery failed from {dsEyIzXaub}{dsEyIzXaub}{Lyp3hhGUQ6KLW2cur9rZLQ}{0.0.0.0}{0.0.0.0:72}{dimrs} into {VrZdpzmpJk}{VrZdpzmpJk}{yNPgntL_ROidrq8BPYEYrw}{0.0.0.0}{0.0.0.0:71}{dimrs} ([index][0]: Recovery failed from {VrZdpzmpJk}{VrZdpzmpJk}{yNPgntL_ROidrq8BPYEYrw}{0.0.0.0}{0.0.0.0:71}{dimrs} into {dsEyIzXaub}{dsEyIzXaub}{Lyp3hhGUQ6KLW2cur9rZLQ}{0.0.0.0}{0.0.0.0:72}{dimrs})]; nested: RecoveryFailedException[[index][0]: Recovery failed from {VrZdpzmpJk}{VrZdpzmpJk}{yNPgntL_ROidrq8BPYEYrw}{0.0.0.0}{0.0.0.0:71}{dimrs} into {dsEyIzXaub}{dsEyIzXaub}{Lyp3hhGUQ6KLW2cur9rZLQ}{0.0.0.0}{0.0.0.0:72}{dimrs}]; nested: RecoveryEngineException[Phase[1] prepare target for translog failed]; nested: IllegalArgumentException[Shard can only be wired as a read only replica with Segment Replication enabled];
        at __randomizedtesting.SeedInfo.seed([EF62886CB90D105:92C2488BA82DB989]:0)
        at org.opensearch.index.shard.IndexShardTestCase$2.onFailure(IndexShardTestCase.java:187)
        at org.opensearch.indices.recovery.RecoveryTarget.notifyListener(RecoveryTarget.java:141)
        at org.opensearch.indices.replication.common.ReplicationTarget.fail(ReplicationTarget.java:175)
        at org.opensearch.index.shard.IndexShardTestCase.recoverUnstartedReplica(IndexShardTestCase.java:899)
        at org.opensearch.index.shard.IndexShardTestCase.recoverReplica(IndexShardTestCase.java:824)
        at org.opensearch.index.shard.IndexShardTestCase.recoverReplica(IndexShardTestCase.java:806)
        at org.opensearch.index.shard.IndexShardTests.testPrimaryHandOffUpdatesLocalCheckpoint(IndexShardTests.java:2431)

CHANGELOG.md Outdated
if (didRefresh
    && shard.state() != IndexShardState.CLOSED
    && shard.getReplicationTracker().isPrimaryMode()
    && shard.isBlockInternalCheckPointRefresh() == false) {
Member

@ashking94 ashking94 Dec 7, 2022

Why is primary mode not sufficient for publishing?

Let's assume the condition is correct; how do we ensure that the publisher stops in the case where publishing has already started and this condition then becomes false? Primary term validation is present only for TransportReplicationActions. Is it possible that a particular target node could be receiving segment files from both the old primary and the new primary, given that the afterRefresh method is async? If this understanding is incorrect, please point to the code that handles this.

Member Author

Thanks @ashking94 for raising this point. Today, SegRepTargetService uses events (the current shard being promoted to primary, and shard close) to cancel ongoing replication. When it receives a new checkpoint, SegRepTargetService also cancels ongoing replication that is running against a lower primary term (the older primary). However, I have observed that during primary relocation the primary term bump does not happen, so this case needs separate handling.
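
For illustration, a minimal sketch of that term-based cancellation, using simplified stand-in types rather than the actual SegRepTargetService code:

    // Simplified stand-ins; not the actual SegRepTargetService implementation.
    record Checkpoint(long primaryTerm, long segmentInfosVersion) {}

    final class OngoingReplication {
        final Checkpoint source;
        OngoingReplication(Checkpoint source) { this.source = source; }
        void cancel(String reason) { /* stop file copy, release resources */ }
    }

    final class TargetServiceSketch {
        private OngoingReplication ongoing;

        // Invoked when a replica receives a published checkpoint.
        void onNewCheckpoint(Checkpoint received) {
            if (ongoing != null && received.primaryTerm() > ongoing.source.primaryTerm()) {
                // The in-flight copy is running against a stale (lower-term) primary: cancel it.
                ongoing.cancel("checkpoint received from a higher primary term");
                ongoing = null;
            }
            // During primary relocation the term is not bumped, so this check alone does not
            // cancel copies from the old primary -- hence the extra handling discussed here.
        }
    }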

Collaborator

That's correct; a primary term bump does not happen during primary peer recoveries.

Member Author

I think it can be handled in the clusterChanged event inside SegmentReplicationSourceService. I will start with a test first. Added one integration test here, along with other minor cosmetic changes.

Member Author

Discussed this offline with @mch2, will take this up in a separate task. Created #5706 to track this separately.

Signed-off-by: Suraj Singh <surajrider@gmail.com>
Signed-off-by: Suraj Singh <surajrider@gmail.com>
…tService

Signed-off-by: Suraj Singh <surajrider@gmail.com>
Member

@mch2 mch2 left a comment

LGTM

Signed-off-by: Suraj Singh <surajrider@gmail.com>

Signed-off-by: Suraj Singh <surajrider@gmail.com>

@dreamer-89
Member Author

dreamer-89 commented Jan 5, 2023

@Bukhtawar @ashking94 : Please review the latest changes.

@dreamer-89 dreamer-89 merged commit d56965c into opensearch-project:main Jan 6, 2023
@dreamer-89
Member Author

@Bukhtawar : Please feel free to leave any comment in case you have any concerns.

@ashking94
Member

@dreamer-89 are you planning to backport this PR to 2.x?

@dreamer-89
Member Author

@dreamer-89 are you planning to backport this PR to 2.x?

Yes @ashking94. I am planning to backport this along with the fix. Right now, I am testing the fix on an actual cluster with the setup you mentioned in #5848.

@andrross
Member

@dreamer-89 are you planning to backport this PR to 2.x?

I am planning to backport this along with the fix. Right now, I am testing the fix on actual cluster with setup you mentioned in #5848

It is probably best to stay on top of the backports (even if in this case it means backporting a muted integration test). When you don't backport a change that touches the same files as a subsequent change that does need to be backported, the cherry-pick will fail and things get harder for everyone. #5944 is a backport of a change that touched the new integration test introduced in this PR. I had to resolve a conflict on the cherry-pick to exclude the new file, and it also means that when you do backport, you'll need to manually apply the subsequent changes to that integration test that happened on main.

@dreamer-89 dreamer-89 added the backport 2.x label Jan 20, 2023
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-5344-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 d56965c59636b18450a48378668f6e2f0f0529e9
# Push it to GitHub
git push --set-upstream origin backport/backport-5344-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-5344-to-2.x.

@dreamer-89
Member Author

Thank you @andrross for the comment, and apologies for the inconvenience. The backport was initially delayed to keep the change out of the 2.5.0 release. I am backporting this change now.

dreamer-89 added a commit to dreamer-89/OpenSearch that referenced this pull request Jan 20, 2023
…ion (opensearch-project#5344)

* [Segment Replication] Update peer recovery logic for segment replication

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Add integration test for indexing during relocation

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Address review comments

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Spotless check apply fixes & one failing unit test

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Delay force segment replication till relocation handoff

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Spotless fix

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Unit test fix

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Address review comment, move force segrep sync handler to SegRepTargetService

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Resolve conflicts and enabled tests

Signed-off-by: Suraj Singh <surajrider@gmail.com>

* Spotless fix

Signed-off-by: Suraj Singh <surajrider@gmail.com>

Signed-off-by: Suraj Singh <surajrider@gmail.com>
kotwanikunal pushed a commit that referenced this pull request Jan 20, 2023
…ion (#5344) (#5959)

kotwanikunal pushed a commit that referenced this pull request Jan 25, 2023
…ion (#5344) (#5959)

mch2 pushed a commit to mch2/OpenSearch that referenced this pull request Mar 4, 2023
…ion (opensearch-project#5344)
