Sync translog to remote on primary activate #10839

sachinpkale (Member) commented Oct 23, 2023

Description

  • Currently, translog sync to the remote store is not guaranteed when a shard is activated in primary mode.
  • This causes failures in flows like the following:
    • Index some data with 0 replicas
    • Wait for a snapshot to be created
    • Delete the index
    • Restore from the snapshot and wait for the index to turn green
    • Terminate the instance
    • Run _remote/restore on the index. This fails: shards are not assigned, and recovery fails with the exception java.nio.file.NoSuchFileException: <index path>/4/translog/translog.ckp
  • This change makes sure the translog is always synced to the remote store on primary activation.
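
The failure and the fix described above can be illustrated with a toy model. This is a minimal sketch in Python, not OpenSearch code: `RemoteStore`, `Shard`, `activate_primary`, `recover_from_remote`, and `run_flow` are hypothetical names introduced only for illustration, and the file name `translog-1.tlog` is an example. It assumes that recovery from the remote store needs the translog checkpoint file (`translog.ckp`) to be present remotely.

```python
class RemoteStore:
    """Hypothetical stand-in for the remote translog store: a set of file names."""

    def __init__(self):
        self.files = set()

    def upload(self, names):
        self.files.update(names)


class Shard:
    """Toy shard that holds translog files locally until they are synced."""

    def __init__(self, remote):
        self.remote = remote
        # Files that exist only locally, e.g. right after a snapshot restore.
        self.local_translog = {"translog.ckp", "translog-1.tlog"}
        self.primary_mode = False

    def sync_translog(self):
        # Push every local translog file to the remote store.
        self.remote.upload(self.local_translog)

    def activate_primary(self, sync_on_activate):
        self.primary_mode = True
        if sync_on_activate:
            # Behaviour the description asks for: always sync on activation.
            self.sync_translog()


def recover_from_remote(remote):
    """Simulate a remote restore after the node and its local files are gone."""
    if "translog.ckp" not in remote.files:
        # Plays the role of the NoSuchFileException in the description.
        raise FileNotFoundError("translog/translog.ckp")
    return sorted(remote.files)


def run_flow(sync_on_activate):
    remote = RemoteStore()
    shard = Shard(remote)
    shard.activate_primary(sync_on_activate)
    # "Terminate the instance": local state is lost, only `remote` survives.
    del shard
    return recover_from_remote(remote)
```

Under these assumptions, `run_flow(False)` raises because nothing was uploaded before the instance went away, while `run_flow(True)` recovers the uploaded files.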

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions bot (Contributor) commented Oct 23, 2023

Compatibility status:

Checks if related components are compatible with change 27245ff

Incompatible components

Incompatible components: [https://github.com/opensearch-project/cross-cluster-replication.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git]

Signed-off-by: Sachin Kale <kalsac@amazon.com>

codecov bot commented Oct 23, 2023

Codecov Report

Merging #10839 (27245ff) into main (51626d0) will decrease coverage by 0.06%.
Report is 14 commits behind head on main.
The diff coverage is 90.15%.

@@             Coverage Diff              @@
##               main   #10839      +/-   ##
============================================
- Coverage     71.31%   71.25%   -0.06%     
+ Complexity    58671    58667       -4     
============================================
  Files          4860     4869       +9     
  Lines        276335   276451     +116     
  Branches      40198    40198              
============================================
- Hits         197068   196988      -80     
- Misses        62803    63015     +212     
+ Partials      16464    16448      -16     
Files Coverage Δ
...upport/replication/TransportReplicationAction.java 77.10% <100.00%> (-3.58%) ⬇️
...ava/org/opensearch/cluster/node/DiscoveryNode.java 91.62% <100.00%> (+0.17%) ⬆️
...a/org/opensearch/common/network/NetworkModule.java 92.20% <100.00%> (+0.20%) ⬆️
...rg/opensearch/common/settings/ClusterSettings.java 92.85% <ø> (ø)
.../java/org/opensearch/gateway/GatewayMetaState.java 69.25% <100.00%> (+0.73%) ⬆️
...earch/index/remote/RemoteStorePressureService.java 93.22% <ø> (-6.78%) ⬇️
server/src/main/java/org/opensearch/node/Node.java 85.31% <100.00%> (+0.09%) ⬆️
...ting/admissioncontrol/AdmissionControlService.java 100.00% <100.00%> (ø)
...issioncontrol/controllers/AdmissionController.java 100.00% <100.00%> (ø)
...g/admissioncontrol/enums/AdmissionControlMode.java 100.00% <100.00%> (ø)
... and 14 more

... and 451 files with indirect coverage changes

ashking94 (Member) left a comment

LGTM

Signed-off-by: Sachin Kale <kalsac@amazon.com>
@github-actions (Contributor)

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.search.SearchWeightedRoutingIT.testMultiGetWithNetworkDisruption_FailOpenEnabled

@sachinpkale sachinpkale merged commit 6f36752 into opensearch-project:main Oct 24, 2023
18 checks passed
@sachinpkale sachinpkale added the backport 2.x Backport to 2.x branch label Oct 24, 2023
@opensearch-trigger-bot (Contributor)

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-10839-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 6f36752d9e84e95ce2280347cc26b0c9138b2d57
# Push it to GitHub
git push --set-upstream origin backport/backport-10839-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-10839-to-2.x.

sachinpkale added a commit to sachinpkale/OpenSearch that referenced this pull request Oct 24, 2023
  Signed-off-by: Sachin Kale <kalsac@amazon.com>
  Co-authored-by: Sachin Kale <kalsac@amazon.com>
sachinpkale added a commit to sachinpkale/OpenSearch that referenced this pull request Oct 25, 2023
  Signed-off-by: Sachin Kale <kalsac@amazon.com>
  Co-authored-by: Sachin Kale <kalsac@amazon.com>
gbbafna pushed a commit that referenced this pull request Oct 25, 2023
  Signed-off-by: Sachin Kale <kalsac@amazon.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
  Signed-off-by: Sachin Kale <kalsac@amazon.com>
  Co-authored-by: Sachin Kale <kalsac@amazon.com>
  Signed-off-by: Shivansh Arora <hishiv@amazon.com>