New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add translog checkpoints to prevent translog corruption #11143
Conversation
Today we are almost intentionally corrupt the translog if we loose a node due to powerloss or similary disasters. In the translog reading code we simply read until we hit an EOF exception ignoring the rest of the translog file once hit. There is no information stored how many records we are expecting or what the last written offset was. This commit restructures the translog to add checkpoints that are written with every sync operation recording the number of synced operations as well as the last synced offset. These checkpoints are also used to identify the actual transaction log file to open instead of relying on directory traversal. This change adds a significant amount of additional checks and pickyness to the translog code. For instance is the translog now associated with a specific engine via a UUID that is written to each translog file as part of it's header. If an engine opens a translog file it was not associated with the operation will fail. Closes to #10933 Relates to #11011
Channels.writeToChannel(buffer, channel); | ||
} | ||
|
||
public void write(DataOutput out) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be private? Callers use the static Checkpoint.write method?
- Renamed TranslogSnapshot to MultiSnapshot - moved legacy logic for trucation into LegacyTranslogReaderBase - made several methods private and pkg private where applicable - renamed arguments for consistency
@mikemccand I addressed all your comments, thanks!! |
this looks great @s1monw! |
…tedTranslogException
@@ -203,8 +200,8 @@ public String getVersionMapSizeSetting() { | |||
} | |||
|
|||
/** if true the engine will start even if the translog id in the commit point can not be found */ | |||
public boolean getIgnoreUnknownTranslog() { | |||
return ignoreUnknownTranslog; | |||
public boolean forceNewTranlog() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename to forceNewTranslog? (missing 's')
public final String translogUUID; | ||
public final long translogFileGeneration; | ||
|
||
public TranslogGeneration(String translogUUID, long translogFileGeneration) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a boolean success
here and use closeWhileHandlingExc if an exc happens, so we don't suppress any original exc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here is where?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woops, sorry, wrong place ... I was referring to the try/finally in Translog.commit.
@mikemccand pushed a new commit |
final Translog translog = new Translog(translogConfig); | ||
if (generation == null) { | ||
if (createNew) { | ||
throw new IllegalStateException("no tranlog generation present in commit data but tranlog is expected to exists"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tranlog -> translog, and exists -> exist
… code Clarify when a new tranlog should be created by passing the same create flag to the IndexWriter as well as to the Translog creation
@@ -702,15 +702,14 @@ private void flush(boolean commitTranslog, boolean force, boolean waitIfOngoing) | |||
if (commitTranslog) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pre-existing, and we should fix this later, but: do we really need this commitTranslog boolean anymore? Can't we just always commit the translog, now that recovery is able to pull its own snapshot and is decoupled from flushing?
- rename one method - move "generation == null" check under existing "if (createNew == false)" - fix typo/whitespace - add a TODO
LGTM |
Conflicts: src/main/java/org/elasticsearch/action/termvectors/TermVectorsFields.java
i will push this in the on monday unless anybody objects |
Conflicts: src/main/java/org/elasticsearch/index/engine/InternalEngine.java
Today we are almost intentionally corrupt the translog if we loose
a node due to powerloss or similary disasters. In the translog reading
code we simply read until we hit an EOF exception ignoring the rest of the
translog file once hit. There is no information stored how many records
we are expecting or what the last written offset was.
This commit restructures the translog to add checkpoints that are written
with every sync operation recording the number of synced operations as well
as the last synced offset. These checkpoints are also used to identify the actual
transaction log file to open instead of relying on directory traversal.
This change adds a significant amount of additional checks and pickyness to the translog
code. For instance is the translog now associated with a specific engine via a UUID that is
written to each translog file as part of it's header. If an engine opens a translog file it
was not associated with the operation will fail.
Closes to #10933
Relates to #11011