
[BUG] Same translog metadata file uploaded from old primary during a race condition #11322

Closed
ashking94 opened this issue Nov 24, 2023 · 2 comments

@ashking94 (Member)

Describe the bug
Across different nodes, the combination of primary term and translog generation must be unique in the translog metadata file name.
There is a bug where the old primary can still upload a translog metadata file with the same primary term and generation as the file generated by the new primary as part of the relocation handoff. This happens when an internal or background flush is triggered around the same time as the relocation handoff, just before primary mode becomes false on the old primary. In the cases where we found the issue, the internal flush was triggered because there had been no writes on the shard in the last 5 minutes, and the relocation happened around the same time as that internal flush (the IndexShard#flushOnIdle path shown below):

public void flushOnIdle(long inactiveTimeNS) {
    Engine engineOrNull = getEngineOrNull();
    if (engineOrNull != null && System.nanoTime() - engineOrNull.getLastWriteNanos() >= inactiveTimeNS) {
        boolean wasActive = active.getAndSet(false);
        if (wasActive) {
            logger.debug("flushing shard on inactive");
            threadPool.executor(ThreadPool.Names.FLUSH).execute(new AbstractRunnable() {
                @Override
                public void onFailure(Exception e) {
                    if (state != IndexShardState.CLOSED) {
                        logger.warn("failed to flush shard on inactive", e);
                    }
                }

                @Override
                protected void doRun() {
                    flush(new FlushRequest().waitIfOngoing(false).force(false));
                    periodicFlushMetric.inc();
                }
            });
        }
    }
}
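
For context on why the collision matters: the idle flush above drives a remote translog upload on the old primary, while the new primary's handoff flush produces a metadata file for the same primary term and generation. Below is a minimal sketch of the naming collision, using a hypothetical metadataFileName helper keyed only on the (primaryTerm, generation) pair; the actual remote-store file name carries additional components.

```java
// Hypothetical sketch, not the actual OpenSearch naming code: the real translog
// metadata file name has more fields, but the uniqueness requirement rests on each
// (primaryTerm, generation) pair being produced by exactly one node.
final class TranslogMetadataNaming {
    static String metadataFileName(long primaryTerm, long generation) {
        return "metadata__" + primaryTerm + "__" + generation;
    }

    public static void main(String[] args) {
        // In the race described above, both nodes compute the same name, e.g. for
        // primary term 5 and generation 42:
        String fromOldPrimaryIdleFlush = metadataFileName(5, 42);    // old primary, idle flush
        String fromNewPrimaryHandoffFlush = metadataFileName(5, 42); // new primary, handoff flush
        assert fromOldPrimaryIdleFlush.equals(fromNewPrimaryHandoffFlush);
    }
}
```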

To Reproduce
This is very difficult to reproduce and shows up at very high scale. However, we can still attempt to reproduce it by creating multiple indices and triggering the relocation around the 5-minute mark of no writes on the shard, as sketched below.
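
A hedged reproduction sketch follows. ClusterClient and its methods (createRemoteStoreIndex, indexDoc, relocatePrimary) are hypothetical stand-ins for whatever harness is used (an integration test or a script against the REST API); the only detail taken from the report is the 5-minute idle-flush window.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical harness interface; not a real OpenSearch API.
interface ClusterClient {
    void createRemoteStoreIndex(String index);
    void indexDoc(String index, String jsonDoc);
    void relocatePrimary(String index, String targetNode);
}

final class IdleFlushRelocationRepro {
    static void run(ClusterClient client) throws InterruptedException {
        // Create several remote-store indices and write once to each.
        for (int i = 0; i < 50; i++) {
            String index = "test-" + i;
            client.createRemoteStoreIndex(index);
            client.indexDoc(index, "{\"field\":\"value\"}");
        }
        // Stop writing and wait until the 5-minute idle-flush threshold, then trigger
        // relocations so the handoff races with flushOnIdle on the old primaries.
        TimeUnit.MINUTES.sleep(5);
        for (int i = 0; i < 50; i++) {
            client.relocatePrimary("test-" + i, "other-node");
        }
    }
}
```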

Expected behavior
The old primary must not upload translog metadata once control reaches the handoff stage.
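
A minimal sketch of the kind of guard this implies, assuming a hypothetical uploader with a primaryMode flag flipped during the relocation handoff; this is illustrative only, not the actual OpenSearch change.

```java
// Illustrative sketch with made-up types; not the actual OpenSearch classes.
// The expected behavior is that once primary mode is handed off, the old primary
// skips any further remote translog metadata uploads.
interface RemoteTranslogStore {
    void upload(String fileName, byte[] payload);
}

final class TranslogMetadataUploader {
    private final RemoteTranslogStore store;
    // Assumption: flipped to false on the old primary as part of the relocation handoff.
    private volatile boolean primaryMode = true;

    TranslogMetadataUploader(RemoteTranslogStore store) {
        this.store = store;
    }

    void onRelocationHandoff() {
        primaryMode = false;
    }

    /** Returns true if uploaded, false if skipped because primary mode was already handed off. */
    boolean maybeUpload(long primaryTerm, long generation, byte[] payload) {
        if (primaryMode == false) {
            return false; // the new primary now owns this (primaryTerm, generation) pair
        }
        store.upload("metadata__" + primaryTerm + "__" + generation, payload);
        return true;
    }
}
```

In the real code, the check and the upload would also need to be ordered against the handoff itself, since the race described in this issue sits exactly in the window just before primary mode becomes false.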


@ashking94 added the bug and untriaged labels on Nov 24, 2023
@ashking94 self-assigned this on Nov 24, 2023
@ashking94 added the Storage:Durability, Storage:Remote, and v2.12.0 labels and removed the untriaged label on Nov 24, 2023
@kiranprakash154 (Contributor)

Hi, are we on track for this to be released in 2.12?

@ashking94 (Member, Author)

This has been solved; the PR is referenced.
