Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add checksumming and versions to the Translog's Checkpoint files #19797

Merged
merged 6 commits into from Aug 4, 2016

Conversation

Projects
None yet
4 participants
@bleskes
Copy link
Member

commented Aug 4, 2016

This prepares the infrastructure to be able to extend the checkpoint file to store more information.

bleskes added some commits Aug 3, 2016

@bleskes

This comment has been minimized.

Copy link
Member Author

commented Aug 4, 2016

@s1monw @mikemccand can you guys take a look?

PS. I would love to get rid of the DirectOutputStreamIndexOutput I added there, but I didn't want multiple in memory buffers - it just irked me the wrong way. It will be wonderful to hear if there is a better way in the big jungle of OutputStreams and DataOutputs

return new Checkpoint(new InputStreamDataInput(in));
try (Directory dir = new SimpleFSDirectory(path.getParent())) {
try (final IndexInput indexInput = dir.openInput(path.getFileName().toString(), IOContext.DEFAULT)) {
if (indexInput.length() == 20) {

This comment has been minimized.

Copy link
@mikemccand

mikemccand Aug 4, 2016

Contributor

Can we pull 20 out to a static constant like OLD_CHECKSUM_FILE_LENGTH?

// We checksum the entire file before we even go and parse it. If it's corrupted we barf right here.
CodecUtil.checksumEntireFile(indexInput);
final int fileVersion = CodecUtil.checkHeader(indexInput, CHECKPOINT_CODEC, INITIAL_VERSION, INITIAL_VERSION);
assert fileVersion == INITIAL_VERSION : "trust no one! " + fileVersion;

This comment has been minimized.

Copy link
@mikemccand

mikemccand Aug 4, 2016

Contributor

Can we make this a real check instead of assert?

This comment has been minimized.

Copy link
@mikemccand

mikemccand Aug 4, 2016

Contributor

Oh nevermind, CodecUtil.checkHeader is already doing the real check. You really do not trust anyone ;)

This comment has been minimized.

Copy link
@bleskes

bleskes Aug 4, 2016

Author Member

hehe. yeah. Left over. will remove.

try (FileChannel channel = factory.open(checkpointFile, options)) {
checkpoint.write(channel);
Channels.writeToChannel(byteOutputStream.toByteArray(), channel);
// no need to force metadata, file size stays the same

This comment has been minimized.

Copy link
@mikemccand

mikemccand Aug 4, 2016

Contributor

Can you improve the comment by also saying that "... and we did the full fsync when we first created the file, so the directory entry doesn't change as well"?

out.writeLong(offset);
out.writeInt(numOps);
out.writeLong(generation);
static Checkpoint readNonChecksummed(DataInput in) throws IOException {

This comment has been minimized.

Copy link
@mikemccand

mikemccand Aug 4, 2016

Contributor

Maybe just a single read method instead of readChecksummed and readNonChecksummed? They do the same thing? In the future, we'll have to also pass in the version to read so it can read any version-specific things ...

This comment has been minimized.

Copy link
@bleskes

bleskes Aug 4, 2016

Author Member

I was going for a model where the versioning switch is not in these methods and have one method per "version". I think it's easier to clean up when code becomes old and is not needed any more.

This comment has been minimized.

Copy link
@mikemccand

mikemccand Aug 4, 2016

Contributor

OK that makes sense.

@@ -109,4 +158,50 @@ public int hashCode() {
result = 31 * result + Long.hashCode(generation);
return result;
}

/** A copy of {@link OutputStreamIndexOutput} without the intermediate buffering */
static class DirectOutputStreamIndexOutput extends IndexOutput {

This comment has been minimized.

Copy link
@mikemccand

mikemccand Aug 4, 2016

Contributor

Why not just use OutputStreamIndexOutput? Is its buffering somehow problematic?

This comment has been minimized.

Copy link
@bleskes

bleskes Aug 4, 2016

Author Member

Because it bugged me that we do buffering when we write to an in-memory buffer anyway. I'm not sure why that class does buffering when you can give it a buffering stream, if you want buffering? Alternatively we should allow to set the buffer size to 0 , in which cases the CRC will not be wrapped.

};
final String resourceDesc = "checkpoint(path=\"" + checkpointFile + "\", gen=" + checkpoint + ")";
try (final DirectOutputStreamIndexOutput indexOutput =
new DirectOutputStreamIndexOutput(resourceDesc, checkpointFile.toString(), byteOutputStream)) {

This comment has been minimized.

Copy link
@mikemccand

mikemccand Aug 4, 2016

Contributor

You could maybe use ByteArrayDataOutput (writes directly to a fixed byte[], but since we know how many bytes we will write here that should be OK) instead? Or OutputStreamDataOutput?

This comment has been minimized.

Copy link
@bleskes

bleskes Aug 4, 2016

Author Member

Oh the jungle :) ByteArrayDataOutput doesn't implement OutputStream which (Direct)OutputStreamIndexOutput require as a parameter. Same goes for OutputStreamDataOutput.

This comment has been minimized.

Copy link
@s1monw

s1monw Aug 4, 2016

Contributor

maybe its simpler to do this:

try (FileChannel channel = factory.open(checkpointFile, options)) {
  IndexOutput indexOutput = new OutputStreamIndexOutput (resourceDesc, checkpointFile.toString(), java.nio.channels.Channels.newOutputStream(channel), BUFFER_SIZE);

 // ...

}

This comment has been minimized.

Copy link
@bleskes

bleskes Aug 4, 2016

Author Member

I tried that too :(

Sadly IndexOutput doesn't allow you to flush it's internal buffer, so you have to close it , at which point your channel is closed, so you can't fsync it...

This comment has been minimized.

Copy link
@s1monw

s1monw Aug 4, 2016

Contributor

hmm the close method here looks like this:

@Override
  public void close() throws IOException {
    try (final OutputStream o = os) {
      // We want to make sure that os.flush() was running before close:
      // BufferedOutputStream may ignore IOExceptions while flushing on close().
      // We keep this also in Java 8, although it claims to be fixed there,
      // because there are more bugs around this! See:
      // # https://bugs.openjdk.java.net/browse/JDK-7015589
      // # https://bugs.openjdk.java.net/browse/JDK-8054565
      if (!flushedOnClose) {
        flushedOnClose = true; // set this BEFORE calling flush!
        o.flush();
      }
    }
  }

so I am not sure why it close the channel

This comment has been minimized.

Copy link
@bleskes

bleskes Aug 4, 2016

Author Member

because the os in a try with resources clause - so it's closed by that.

This comment has been minimized.

Copy link
@s1monw

s1monw Aug 4, 2016

Contributor

oh yeah I missed that :/

This comment has been minimized.

Copy link
@s1monw

s1monw Aug 4, 2016

Contributor

ok I think ultimately we should just have a non-closeing FilterOutputStream to prevent this?

This comment has been minimized.

Copy link
@bleskes

bleskes Aug 4, 2016

Author Member

sadly non closing is not enough. you have to use an intermediate in memory buffer in order to make sure you write to the underlying channel in one go. If not, the crc calculation which flushes will write the first chunk and then the crc will be written as a second chunks. I had that too and guess - tests signaled it was wrong (good!)

@mikemccand

This comment has been minimized.

Copy link
Contributor

commented Aug 4, 2016

Thanks @bleskes, I left some comments.

@bleskes

This comment has been minimized.

Copy link
Member Author

commented Aug 4, 2016

@mikemccand @s1monw I pushed another commit addressing feedback. I switched back to vanilla OutputStreamDataOutput, which means double buffers but it seems you guys are more comfortable with that trade off (which I agree with if there is no feature coming to allow to disable the buffering in OutputStreamDataOutput)

final ByteArrayDataOutput out = new ByteArrayDataOutput(buffer);
write(out);
Channels.writeToChannel(buffer, channel);
static Checkpoint readChecksummed(DataInput in) throws IOException {

This comment has been minimized.

Copy link
@mikemccand

mikemccand Aug 4, 2016

Contributor

Maybe rename to readChecksummedV1?

This comment has been minimized.

Copy link
@bleskes

bleskes Aug 4, 2016

Author Member

will do.

@mikemccand

This comment has been minimized.

Copy link
Contributor

commented Aug 4, 2016

LGTM, I just left a minor renaming suggestion.

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Aug 4, 2016

LGTM 2

@bleskes bleskes merged commit 7010082 into elastic:master Aug 4, 2016

2 checks passed

CLA Commit author is a member of Elasticsearch
Details
elasticsearch-ci Build finished.
Details

@bleskes bleskes deleted the bleskes:translog_versioned_checkpoint branch Aug 4, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.