Flush immediately after a remote recovery finishes (unless there are ongoing ones) #9439

Closed
wants to merge 3 commits into
from

Projects

None yet

4 participants

@bleskes
Member
bleskes commented Jan 27, 2015

To properly replicate, we currently stop flushing during recovery so we can repay the translog once copying files are done. Once recovery is done, the translog will be flushed by a background thread that, by default, kicks in every 5s. In case of a recovery failure and a quick re-assignment of a new shard copy, we may fail to flush before starting a new recovery, causing it to deal with potentially even longer translog. This commit makes sure we flush immediately when the ongoing recovery count goes to 0.

I also added a simple recovery benchmark.

bleskes added some commits Jan 26, 2015
@bleskes bleskes Recovery: flush immediately after a remote recovery finishes (unless …
…there are ongoing ones)

To properly replicate, we currently stop flushing during recovery so we can repay the translog once copying files are done. Once recovery is done, the translog will be flushed by a background thread that, by default, kicks in every 5s. In case of a recovery failure and a quick re-assignment of a new shard copy, we may fail to flush before starting a new recovery, causing it to deal with potentially even longer translog. This commit makes sure we flush immediately when the ongoing recovery count goes to 0.

  I also added a simple recovery benchmark.
38ed2e7
@bleskes bleskes minor tweak 9c846b6
@mikemccand mikemccand commented on the diff Jan 27, 2015
...asticsearch/index/engine/internal/InternalEngine.java
assert onGoingRecoveries.get() >= 0 : "ongoingRecoveries must be >= 0 but was: " + onGoingRecoveries.get();
+ if (left == 0) {
@mikemccand
mikemccand Jan 27, 2015 Contributor

I really don't understand the concurrency here, but I just wanted to verify that there is no risk that another recovery kicks off right here, after a first recovery just finished and dropped the count to 0, and then we clear the xlog out from under that second recovery, losing documents?

@mikemccand
mikemccand Jan 27, 2015 Contributor

OK, I see, looks good. Thanks @bleskes

@mikemccand
Contributor

LGTM, but I left one concurrency question...

@imotov
Member
imotov commented Jan 27, 2015

LGTM

@dakrone dakrone commented on the diff Jan 27, 2015
...asticsearch/index/engine/internal/InternalEngine.java
assert onGoingRecoveries.get() >= 0 : "ongoingRecoveries must be >= 0 but was: " + onGoingRecoveries.get();
+ if (left == 0) {
+ try {
+ flush(new Engine.Flush().type(Flush.Type.COMMIT_TRANSLOG));
@dakrone
dakrone Jan 27, 2015 Member

For the 1.x and master future-port, how are you going to implement the flush? Are you going to set force and waitIfOngoing?

@bleskes
bleskes Jan 27, 2015 Member

I think both can be left at their defaults? force=false waitIfOngoing=false. It means there is already an ongoing flush or there is nothing to do. In both cases, no need for an extra flush to reduce the translog.

@dakrone
dakrone Jan 27, 2015 Member

sounds good to me

@dakrone
Member
dakrone commented Jan 27, 2015

LGTM, left one comment

@bleskes bleskes added a commit that referenced this pull request Jan 28, 2015
@bleskes bleskes Recovery: flush immediately after a remote recovery finishes (unless …
…there are ongoing ones)

To properly replicate, we currently stop flushing during recovery so we can repay the translog once copying files are done. Once recovery is done, the translog will be flushed by a background thread that, by default, kicks in every 5s. In case of a recovery failure and a quick re-assignment of a new shard copy, we may fail to flush before starting a new recovery, causing it to deal with potentially even longer translog. This commit makes sure we flush immediately when the ongoing recovery count goes to 0.

I also added a simple recovery benchmark.

Closes #9439
30b48cd
@bleskes bleskes added a commit that referenced this pull request Jan 28, 2015
@bleskes bleskes Recovery: flush immediately after a remote recovery finishes (unless …
…there are ongoing ones)

To properly replicate, we currently stop flushing during recovery so we can repay the translog once copying files are done. Once recovery is done, the translog will be flushed by a background thread that, by default, kicks in every 5s. In case of a recovery failure and a quick re-assignment of a new shard copy, we may fail to flush before starting a new recovery, causing it to deal with potentially even longer translog. This commit makes sure we flush immediately when the ongoing recovery count goes to 0.

I also added a simple recovery benchmark.

Closes #9439
701fbc2
@bleskes bleskes added a commit that closed this pull request Jan 28, 2015
@bleskes bleskes Recovery: flush immediately after a remote recovery finishes (unless …
…there are ongoing ones)

To properly replicate, we currently stop flushing during recovery so we can repay the translog once copying files are done. Once recovery is done, the translog will be flushed by a background thread that, by default, kicks in every 5s. In case of a recovery failure and a quick re-assignment of a new shard copy, we may fail to flush before starting a new recovery, causing it to deal with potentially even longer translog. This commit makes sure we flush immediately when the ongoing recovery count goes to 0.

I also added a simple recovery benchmark.

Closes #9439
22a576d
@bleskes bleskes closed this in 22a576d Jan 28, 2015
@bleskes bleskes deleted the bleskes:recovery_benchmark branch Jan 28, 2015
@bleskes bleskes added the resiliency label Feb 2, 2015
@bleskes bleskes added a commit to bleskes/elasticsearch that referenced this pull request Feb 19, 2015
@bleskes bleskes Engine: close snapshots before recovery counter
When we clean up after recoveries, we currently close the recovery counter first, followed up by the different snapshots. Since the recovery counter may issue a flush (introduced in #9439) , the snapshot references prevent the flush from cleaning up the current translog file. This commit changes the order of the close command.

Note: this is not a problem on master, as we moved the translog delete logic, making it kick in if needed when the ref counter goes to 0.
5311b56
@bleskes bleskes added a commit to bleskes/elasticsearch that referenced this pull request Feb 27, 2015
@bleskes bleskes Engine: close snapshots before recovery counter
When we clean up after recoveries, we currently close the recovery counter first, followed up by the different snapshots. Since the recovery counter may issue a flush (introduced in #9439) , the snapshot references prevent the flush from cleaning up the current translog file. This commit changes the order of the close command.

Note: this is not a problem on master, as we moved the translog delete logic, making it kick in if needed when the ref counter goes to 0.

Closes #9760
f27b08b
@bleskes bleskes added a commit to bleskes/elasticsearch that referenced this pull request Feb 27, 2015
@bleskes bleskes Engine: close snapshots before recovery counter
When we clean up after recoveries, we currently close the recovery counter first, followed up by the different snapshots. Since the recovery counter may issue a flush (introduced in #9439) , the snapshot references prevent the flush from cleaning up the current translog file. This commit changes the order of the close command.

Note: this is not a problem on master, as we moved the translog delete logic, making it kick in if needed when the ref counter goes to 0.

Closes #9760
897919d
@clintongormley clintongormley changed the title from Recovery: flush immediately after a remote recovery finishes (unless there are ongoing ones) to Flush immediately after a remote recovery finishes (unless there are ongoing ones) Jun 7, 2015
@mute mute pushed a commit to mute/elasticsearch that referenced this pull request Jul 29, 2015
@bleskes bleskes Recovery: flush immediately after a remote recovery finishes (unless …
…there are ongoing ones)

To properly replicate, we currently stop flushing during recovery so we can repay the translog once copying files are done. Once recovery is done, the translog will be flushed by a background thread that, by default, kicks in every 5s. In case of a recovery failure and a quick re-assignment of a new shard copy, we may fail to flush before starting a new recovery, causing it to deal with potentially even longer translog. This commit makes sure we flush immediately when the ongoing recovery count goes to 0.

I also added a simple recovery benchmark.

Closes #9439
7a80059
@mute mute pushed a commit to mute/elasticsearch that referenced this pull request Jul 29, 2015
@bleskes bleskes Engine: close snapshots before recovery counter
When we clean up after recoveries, we currently close the recovery counter first, followed up by the different snapshots. Since the recovery counter may issue a flush (introduced in #9439) , the snapshot references prevent the flush from cleaning up the current translog file. This commit changes the order of the close command.

Note: this is not a problem on master, as we moved the translog delete logic, making it kick in if needed when the ref counter goes to 0.

Closes #9760
a71231e
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment