Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid losing ops in file-based recovery #22945

Merged
merged 4 commits into from Feb 3, 2017

Conversation

Projects
None yet
3 participants
@jasontedor
Copy link
Member

commented Feb 2, 2017

When a primary is relocated from an old node to a new node, it can have ops in its translog that do not have a sequence number assigned. When a file-based recovery is started, this can lead to skipping these ops when replaying the translog due to a bug in the recovery logic. This commit addresses this bug and adds a test in the BWC tests.

Relates #22484

Avoid losing ops in file-based recovery
When a primary is relocated from an old node to a new node, it can have
ops in its translog that do not have a sequence number assigned. When a
file-based recovery is started, this can lead to skipping these ops when
replaying the translog due to a bug in the recovery logic. This commit
addresses this bug and adds a test in the BWC tests.
@bleskes
Copy link
Member

left a comment

This LGTM. It would be great to have tests added in RecoverySourceHandlerTests .

assertOK(response);
final InputStream content = response.getEntity().getContent();
final int actualCount =
Integer.parseInt(XContentHelper.convertToMap(JsonXContent.jsonXContent, content, false).get("count").toString());

This comment has been minimized.

Copy link
@bleskes

bleskes Feb 2, 2017

Member

Any chance to use something like ObjectPath.evaluate(shard, "seq_no.local_checkpoint")?

This comment has been minimized.

Copy link
@jasontedor

jasontedor Feb 2, 2017

Author Member

I pushed 4430caa.

assertSeqNoOnShards(nodes, checkGlobalCheckpoints, 0, newNodeClient);

logger.info("allowing shards on all nodes");
updateIndexSetting(index, Settings.builder().putNull("index.routing.allocation.include._name"));
ensureGreen();

This comment has been minimized.

Copy link
@bleskes

bleskes Feb 2, 2017

Member

shall we add assert counts here too?

This comment has been minimized.

Copy link
@jasontedor

jasontedor Feb 2, 2017

Author Member

I pushed 0112f90.

@jasontedor

This comment has been minimized.

Copy link
Member Author

commented Feb 2, 2017

@bleskes I pushed a unit tests in bb8884a.

@jasontedor jasontedor force-pushed the jasontedor:file-based-lost-ops branch to bb8884a Feb 2, 2017

jasontedor added some commits Feb 2, 2017

@jasontedor

This comment has been minimized.

Copy link
Member Author

commented Feb 2, 2017

Thanks @bleskes. I've pushed commits in response to your comments.

@bleskes

bleskes approved these changes Feb 3, 2017

Copy link
Member

left a comment

awesome

@bleskes

This comment has been minimized.

Copy link
Member

commented Feb 3, 2017

test this please

1 similar comment
@bleskes

This comment has been minimized.

Copy link
Member

commented Feb 3, 2017

test this please

@jasontedor jasontedor merged commit 6e99402 into elastic:master Feb 3, 2017

1 of 2 checks passed

elasticsearch-ci Build finished.
Details
CLA Commit author is a member of Elasticsearch
Details

@jasontedor jasontedor deleted the jasontedor:file-based-lost-ops branch Feb 3, 2017

@jasontedor

This comment has been minimized.

Copy link
Member Author

commented Feb 3, 2017

Thanks @bleskes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.