Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Engine: close snapshots before recovery counter #9760

Closed
wants to merge 2 commits into from

Conversation

@bleskes
Copy link
Member

commented Feb 19, 2015

When we clean up after recoveries, we currently close the recovery counter first, followed up by the different snapshots. Since the recovery counter may issue a flush (introduced in #9439) , the snapshot references prevent the flush from deleting the current translog file once it's no longer needed.

Note: this is not a problem on master, as we moved the translog delete logic, making it kick in if needed
when the ref counter goes to 0.

relates to #9226 (comment)

Engine: close snapshots before recovery counter
When we clean up after recoveries, we currently close the recovery counter first, followed up by the different snapshots. Since the recovery counter may issue a flush (introduced in #9439) , the snapshot references prevent the flush from cleaning up the current translog file. This commit changes the order of the close command.

Note: this is not a problem on master, as we moved the translog delete logic, making it kick in if needed when the ref counter goes to 0.

@bleskes bleskes added v1.5.0 v1.4.4 >bug v1.4.5 and removed v1.4.4 labels Feb 19, 2015

@kimchy

This comment has been minimized.

Copy link
Member

commented Feb 19, 2015

can we do next to the code the reason for the importance of close ordering?

@bleskes

This comment has been minimized.

Copy link
Member Author

commented Feb 19, 2015

@kimchy comments added.

phase2Snapshot, phase3Snapshot); // hmm why can't we use try-with here?
// close the snapshots first to release the reference to the translog file, so a flush post recovery can delete it
Releasables.close(success, phase1Snapshot, phase2Snapshot, phase3Snapshot,
onGoingRecoveries, writeLock); // hmm why can't we use try-with here?

This comment has been minimized.

Copy link
@s1monw

s1monw Feb 24, 2015

Contributor

we can do try-with but we need to have 2 try blocks. since the write lock needs to be released last. but in a try / with blokc it's released before the finally block is executed

This comment has been minimized.

Copy link
@s1monw

s1monw Feb 24, 2015

Contributor

it in-fact might make sense to have the write lock in an outer try catch to ensure it's released last?

This comment has been minimized.

Copy link
@bleskes

bleskes Feb 24, 2015

Author Member

The problem with try with resources clause is that we release the phase1 and phase2 snapshots which are defined earlier in the scope. I can make the write lock use a try-with but this requires another catch to convert the exception the close method throws into a RecoveryEngineException. I'm not sure it's worth it.

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Feb 24, 2015

left a minor comment

@bleskes

This comment has been minimized.

Copy link
Member Author

commented Feb 27, 2015

@s1monw ping?

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Feb 27, 2015

ok fair enough LGTM then

bleskes added a commit to bleskes/elasticsearch that referenced this pull request Feb 27, 2015
Engine: close snapshots before recovery counter
When we clean up after recoveries, we currently close the recovery counter first, followed up by the different snapshots. Since the recovery counter may issue a flush (introduced in elastic#9439) , the snapshot references prevent the flush from cleaning up the current translog file. This commit changes the order of the close command.

Note: this is not a problem on master, as we moved the translog delete logic, making it kick in if needed when the ref counter goes to 0.

Closes elastic#9760
bleskes added a commit to bleskes/elasticsearch that referenced this pull request Feb 27, 2015
Engine: close snapshots before recovery counter
When we clean up after recoveries, we currently close the recovery counter first, followed up by the different snapshots. Since the recovery counter may issue a flush (introduced in elastic#9439) , the snapshot references prevent the flush from cleaning up the current translog file. This commit changes the order of the close command.

Note: this is not a problem on master, as we moved the translog delete logic, making it kick in if needed when the ref counter goes to 0.

Closes elastic#9760

@bleskes bleskes closed this in 0f1c779 Feb 27, 2015

@bleskes bleskes deleted the bleskes:leftover_translog branch Feb 27, 2015

@bleskes bleskes removed the v2.0.0-beta1 label Feb 27, 2015

mute pushed a commit to mute/elasticsearch that referenced this pull request Jul 29, 2015
Engine: close snapshots before recovery counter
When we clean up after recoveries, we currently close the recovery counter first, followed up by the different snapshots. Since the recovery counter may issue a flush (introduced in elastic#9439) , the snapshot references prevent the flush from cleaning up the current translog file. This commit changes the order of the close command.

Note: this is not a problem on master, as we moved the translog delete logic, making it kick in if needed when the ref counter goes to 0.

Closes elastic#9760
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.