
Flush immediately after a remote recovery finishes (unless there are ongoing ones) #9439

Closed · wants to merge 3 commits

Conversation

@bleskes (Contributor) commented on Jan 27, 2015

To properly replicate operations, we currently stop flushing during recovery so we can replay the translog once file copying is done. Once recovery completes, the translog is flushed by a background thread that, by default, kicks in every 5s. In case of a recovery failure and a quick re-assignment of a new shard copy, we may fail to flush before starting the new recovery, forcing it to deal with a potentially even longer translog. This commit makes sure we flush immediately when the ongoing recovery count drops to 0.

I also added a simple recovery benchmark.
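A rough sketch of the mechanism described above, with illustrative names rather than the actual IndexShard/RecoveryTarget code:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; names and structure are placeholders,
// not the actual Elasticsearch shard/recovery classes.
class OngoingRecoveriesSketch {
    private final AtomicInteger onGoingRecoveries = new AtomicInteger();

    void recoveryStarted() {
        // While this counter is > 0, regular flushing is disabled so the
        // translog can be replayed onto the recovering copy.
        onGoingRecoveries.incrementAndGet();
    }

    void recoveryFinished() {
        int left = onGoingRecoveries.decrementAndGet();
        assert left >= 0 : "ongoingRecoveries must be >= 0 but was: " + left;
        if (left == 0) {
            // The last ongoing recovery just completed: flush right away
            // instead of waiting for the periodic (default 5s) background
            // flush, so a quickly re-assigned shard copy does not have to
            // replay an even longer translog.
            flushTranslog();
        }
    }

    private void flushTranslog() {
        // Placeholder: in the real code this commits the translog
        // (an Engine.Flush of type COMMIT_TRANSLOG).
    }
}
```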

@bleskes added the v1.4.3, v1.5.0, v2.0.0-beta1, :Distributed/Recovery, and >enhancement labels on Jan 27, 2015
```java
assert onGoingRecoveries.get() >= 0 : "ongoingRecoveries must be >= 0 but was: " + onGoingRecoveries.get();
if (left == 0) {
```
Contributor:
I really don't understand the concurrency here, but I just wanted to verify that there is no risk that another recovery kicks off right here, after a first recovery just finished and dropped the count to 0, and then we clear the xlog out from under that second recovery, losing documents?

Contributor Author:

Contributor:

OK, I see, looks good. Thanks @bleskes

@mikemccand (Contributor):
LGTM, but I left one concurrency question...

@imotov (Contributor) commented on Jan 27, 2015:

LGTM

```java
assert onGoingRecoveries.get() >= 0 : "ongoingRecoveries must be >= 0 but was: " + onGoingRecoveries.get();
if (left == 0) {
    try {
        flush(new Engine.Flush().type(Flush.Type.COMMIT_TRANSLOG));
```
Member:
For the 1.x and master future-port, how are you going to implement the flush? Are you going to set force and waitIfOngoing?

Contributor Author:
I think both can be left at their defaults: force=false, waitIfOngoing=false. If the flush ends up being skipped, it means either a flush is already ongoing or there is nothing to do; in both cases there is no need for an extra flush to reduce the translog.

Member:
sounds good to me
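To make the two flags in that exchange concrete, here is a rough sketch of the decision they control under the interpretation given in the reply above; this is illustrative pseudologic, not the actual Engine flush implementation:

```java
// Illustrative only: approximates the semantics of force / waitIfOngoing
// as discussed above, not the real Engine code.
final class FlushDecisionSketch {
    static boolean shouldFlushNow(boolean force, boolean waitIfOngoing,
                                  boolean flushAlreadyRunning, boolean translogHasOps) {
        if (flushAlreadyRunning) {
            // waitIfOngoing = false: skip instead of blocking. The flush that
            // is already running will shrink the translog, so an extra one
            // right after recovery buys nothing.
            return waitIfOngoing;
        }
        // force = false: only flush when there is actually something to commit.
        return force || translogHasOps;
    }

    public static void main(String[] args) {
        // Defaults (force=false, waitIfOngoing=false): no-op when a flush is
        // already in flight or the translog is empty.
        System.out.println(shouldFlushNow(false, false, true, true));   // false
        System.out.println(shouldFlushNow(false, false, false, false)); // false
        System.out.println(shouldFlushNow(false, false, false, true));  // true
    }
}
```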

@dakrone (Member) commented on Jan 27, 2015:

LGTM, left one comment

bleskes added a commit that referenced this pull request on Jan 28, 2015: "Recovery: flush immediately after a remote recovery finishes (unless there are ongoing ones)" (closes #9439).

bleskes added a second commit with the same message that referenced this pull request on Jan 28, 2015 (closes #9439).
@bleskes bleskes closed this in 22a576d Jan 28, 2015
@bleskes bleskes deleted the recovery_benchmark branch January 28, 2015 08:15
bleskes added a commit to bleskes/elasticsearch that referenced this pull request Feb 19, 2015
When we clean up after recoveries, we currently close the recovery counter first, followed by the different snapshots. Since the recovery counter may issue a flush (introduced in elastic#9439), the snapshot references prevent the flush from cleaning up the current translog file. This commit changes the order of the close calls.

Note: this is not a problem on master, as we moved the translog delete logic, making it kick in when needed once the ref counter goes to 0.
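A rough sketch of the reordering that commit message describes, with placeholder names rather than the actual RecoveryStatus clean-up code:

```java
import java.io.Closeable;
import java.io.IOException;

// Illustrative sketch of the clean-up order fix; all names are placeholders.
final class RecoveryCleanupSketch {
    static void cleanUpAfterRecovery(Closeable translogSnapshot,
                                     Closeable indexSnapshot,
                                     Closeable onGoingRecoveriesCounter) throws IOException {
        // Release the snapshot references first, so nothing keeps the current
        // translog file pinned...
        translogSnapshot.close();
        indexSnapshot.close();
        // ...and only then close the recovery counter. Its close may trigger the
        // immediate flush introduced in #9439; with the snapshots already
        // released, that flush can actually clean up the old translog file.
        onGoingRecoveriesCounter.close();
    }
}
```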
bleskes added two further commits to bleskes/elasticsearch that referenced this pull request on Feb 27, 2015, both with the same clean-up-order message (closes elastic#9760).
@clintongormley clintongormley changed the title Recovery: flush immediately after a remote recovery finishes (unless there are ongoing ones) Flush immediately after a remote recovery finishes (unless there are ongoing ones) Jun 7, 2015
mute pushed a commit to mute/elasticsearch that referenced this pull request on Jul 29, 2015: "Recovery: flush immediately after a remote recovery finishes (unless there are ongoing ones)" (closes elastic#9439).

mute pushed a second commit to mute/elasticsearch that referenced this pull request on Jul 29, 2015 with the clean-up-order fix (closes elastic#9760).
Labels
:Distributed/Recovery (Anything around constructing a new shard, either from a local or a remote source.) · >enhancement · resiliency · v1.4.3 · v1.5.0 · v2.0.0-beta1
4 participants