New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add translog checksums #7232
Add translog checksums #7232
Conversation
@@ -269,7 +268,9 @@ public void recover(boolean indexShouldExists, RecoveryState recoveryState) thro | |||
throw new IndexShardGatewayRecoveryException(shardId, "failed to recover shard", e); | |||
} finally { | |||
try { | |||
fs.close(); | |||
if (stream != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe use IOUtils.close()
handles null
values
I left a bunch of comments. I personally think we should add the checksum calculation on the TranslogStream level somehow instead of doing it on each operation. we should have an StreamOutput impl that calculates the checksum as well as a StreamInput that makes it entirely transparent. I am also worried about the potential collision of the header so I think we should make use of the index version to detect if we can use the new version of the translog |
* A CRC32 checksum class that has helpers for adding strings and numeric types | ||
*/ | ||
public class ChecksumHelper { | ||
private final Checksum checksum = new CRC32(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
each invocation of CRC32 is pretty slow: if we do it byte-by-byte its a native call, and we cant use vectorized bulk processing. can you try:
import org.apache.lucene.store.BufferedChecksum;
...
private final Checksum checksum = new BufferedChecksum(new CRC32());
This should improve performance significantly.
|
||
@Override | ||
public int writeHeader(FileChannel channel) throws IOException { | ||
OutputStreamDataOutput out = new OutputStreamDataOutput(Channels.newOutputStream(channel)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we put a comment here that we don't close on purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes good idea, always good to make the intent known, I'll do that.
f8b3a0d
to
ae626c9
Compare
* Similar to Lucene's BufferedChecksumIndexOutput, however this wraps a | ||
* {@link StreamOutput} so anything written will update the checksum | ||
*/ | ||
public class BufferedChecksumStreamOutput extends StreamOutput { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it make sense to make these classes final?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it definitely does, I'll change that.
* the transient or current translog does not match, returns null | ||
*/ | ||
private FsTranslogFile translogForLocation(Location location) { | ||
if (this.trans != null && trans.id() == location.translogId) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can remove the this
qualifier for consistency?
I left some more comments, I think it's close though... I would really really like to see a test that does corrupt a translog on a node and then checks that we handle it correctly... I am not even sure what the correct behavior is but IMO we should fail the shard? |
} | ||
|
||
@Override | ||
public int read() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the contract of this method? Usually (e.g. java InputStream) its that it returns an integer value between 0 and 255. So I think this is missing & 0xFF ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, added this, thanks!
@s1monw added an integration test for this. |
@dakrone this test looks awesome. Can we maybe add that it tells you if it corrupted a replica and if so we make sure that the replica is allocated on another node? if you wanna do that afterwards I am ok with pushing this as it is... |
0ca533d
to
eaf3921
Compare
Switches TranslogStreams to check a header in the file to determine the translog format, delegating to the version-specific stream. Version 1 of the translog format writes a header using Lucene's CodecUtil at the beginning of the file and appends a checksum for each translog operation written. Also refactors much of the translog operations, such as merging .hasNext() and .next() in FsChannelSnapshot Relates to elastic#6554
Switches TranslogStreams to check a header in the file to determine the
translog format, delegating to the version-specific stream.
Version 1 of the translog format writes a
0xffff_ffff_0000_0001
headerat the beginning of the file and appends a checksum for each translog
operation written.
Also refactors much of the translog operations, such as merging
.hasNext() and .next() in FsChannelSnapshot
Relates to #6554