Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up translog interface #7564

Closed
wants to merge 1 commit into from

Conversation

Projects
None yet
4 participants
@dakrone
Copy link
Member

dakrone commented Sep 3, 2014

This has two major parts:

  • Simplifying the translog interface

Get rid of readSource() entirely, it sucked, Operations should be able
to provide the source themselves.

No more TranslogStream headers, you are now required to pass an
StreamInput or StreamOutput for all operations, which means no extra
state is needed and no need to construct new versions when detecting the
version.

The interface now exists of 2 main methods, and one helper:

// read
Translog.Operation read(StreamInput in)
// write
void write(StreamOutput out, Translog.Operation op)
// when reading a file, need to be able to read and consume the header
void seekPastHeader(StreamInput in)

  • Read and write translog op sizes in TranslogStreams

Previously we handled these integers outside of the translog stream
itself, which was very unclean because other code had to know about
reading the size, or about writing the correct header sometimes.

There is some additional code in LocalIndexShardGateway to handle the
legacy case for older translogs, because we need to read and discard the
size in order to maintain the compatibility for the streaming
operations (they did not read or write the size for 1.3.x and earlier).

Additionally, we need to handle a case where the header is truncated
when recovering from disk, so added code for that.


I'm hoping this makes working with the translog easier and less error-prone. On top of this (in a separate PR) we can add the size-reading safety to prevent rare OOMEs on corrupted translogs.

@dakrone dakrone added review labels Sep 3, 2014

@jpountz

View changes

src/main/java/org/elasticsearch/index/translog/ChecksummedTranslogStream.java Outdated
throw e;
} catch (IOException e) {
throw new TranslogCorruptedException("translog header corrupted", e);
}

This comment has been minimized.

Copy link
@jpountz

jpountz Sep 3, 2014

Contributor

Is it true that any I/O exception would mean that the translog is corrupt?

This comment has been minimized.

Copy link
@dakrone

dakrone Sep 4, 2014

Author Member

If reading the header of the translog causes an IOException, I think it's safer to abort with a corruption exception (the cause will be in the stacktrace too) than to try and continue. Additionally the corrupt index exception that Lucene's CodecUtil throws is an IOException, so I elected to wrap it in our own exception.

@s1monw

View changes

src/main/java/org/elasticsearch/index/translog/TranslogStream.java Outdated
*/
public void close() throws IOException;
public void seekPastHeader(StreamInput in) throws IOException;

This comment has been minimized.

Copy link
@s1monw

s1monw Sep 5, 2014

Contributor

instead of having this seekPastHeader can we have an StreamInput openInput(File) that encapsulates this entirely?

This comment has been minimized.

Copy link
@dakrone

dakrone Sep 8, 2014

Author Member

I agree, I think that's much nicer.

@s1monw

This comment has been minimized.

Copy link
Contributor

s1monw commented Sep 5, 2014

I like it left one comment

@s1monw s1monw removed the review label Sep 5, 2014

@dakrone

This comment has been minimized.

Copy link
Member Author

dakrone commented Sep 8, 2014

@s1monw so I'm kind of concerned about one of the changes I made here, which is that we now serialize the size of the operation when sending translog operations between nodes. If it were just a size it wouldn't be a big deal, but to get the size we create a new byte array for every operation, which seems much slower.

What do you think of keeping read and write and adding readWithoutSize and writeWithoutSize for the Transport case? Is that too messy? We could avoid the byte array entirely if we don't need the size.

@dakrone dakrone added the blocker label Sep 8, 2014

@dakrone

This comment has been minimized.

Copy link
Member Author

dakrone commented Sep 8, 2014

Also labeling this as a blocker for 1.4 because delaying it until 1.5 means adding another translog format since this changes it (see my previous comment).

@dakrone dakrone added the review label Sep 8, 2014

@dakrone

This comment has been minimized.

Copy link
Member Author

dakrone commented Sep 8, 2014

@s1monw added the NoopStreamOutput that we talked about and addressed your comment.

@s1monw

View changes

src/main/java/org/elasticsearch/common/io/stream/NoopStreamOutput.java Outdated
* A non-threadsafe StreamOutput that doesn't actually write the bytes to any
* stream, it only keeps track of how many bytes have been written
*/
public class NoopStreamOutput extends StreamOutput {

This comment has been minimized.

Copy link
@s1monw

s1monw Sep 8, 2014

Contributor

I think this can be final?

@s1monw

View changes

src/main/java/org/elasticsearch/index/gateway/local/LocalIndexShardGateway.java Outdated
TranslogStream stream = TranslogStreams.translogStreamFor(recoveringTranslogFile);
try {
in = stream.openInput(recoveringTranslogFile);
} catch (EOFException e) {

This comment has been minimized.

Copy link
@s1monw

s1monw Sep 8, 2014

Contributor

should we make this stream private here? I mean we might wanna have a dedicated exception for this

This comment has been minimized.

Copy link
@dakrone

dakrone Sep 8, 2014

Author Member

I don't understand what you mean? Make the stream private?

This comment has been minimized.

Copy link
@s1monw

s1monw Sep 8, 2014

Contributor

well if we hit EOF in stream.openInput... IMO the handling of this should be private to the stream and we should throw a dedicated exception?

This comment has been minimized.

Copy link
@dakrone

dakrone Sep 8, 2014

Author Member

Okay, I can add EndOfTranslogException if that would work for you? It's not a corrupted translog though, so it'll just be be a subclass of EOFException.

Translog.Operation operation;
try {
operation = stream.read();
if (stream instanceof LegacyTranslogStream) {

This comment has been minimized.

Copy link
@s1monw

s1monw Sep 8, 2014

Contributor

really shouldn't we just skip this in the LegacyTranslogStream

This comment has been minimized.

Copy link
@dakrone

dakrone Sep 8, 2014

Author Member

We can't because then we will be breaking compatibility with the streaming translog operations from earlier node versions.

This comment has been minimized.

Copy link
@s1monw

s1monw Sep 8, 2014

Contributor

I don't undestand, if we are in LegacyTranslogStream we can just skip the first int for all ops instead of doing the instanceof check here no? That way it only happens here as well?

This comment has been minimized.

Copy link
@dakrone

dakrone Sep 8, 2014

Author Member

If I add the in.readInt() into LegacyTranslogStream.read(StreamInput) it means receiving translog operations from 1.3.x nodes will break, because they don't send op sizes in RecoveryTranslogOperationsRequest

@s1monw

View changes

src/main/java/org/elasticsearch/index/gateway/local/LocalIndexShardGateway.java Outdated
@@ -269,7 +282,7 @@ public void recover(boolean indexShouldExists, RecoveryState recoveryState) thro
throw new IndexShardGatewayRecoveryException(shardId, "failed to recover shard", e);
} finally {
try {
IOUtils.close(stream);
IOUtils.close(in);

This comment has been minimized.

Copy link
@s1monw

s1monw Sep 8, 2014

Contributor

you can just replace with IOUtils#closeWhileHandlingExceptions

this.in.close();
public StreamInput openInput(File translogFile) throws FileNotFoundException {
// nothing to do, legacy translogs have no header
return new InputStreamStreamInput(new FileInputStream(translogFile));

This comment has been minimized.

Copy link
@s1monw

s1monw Sep 8, 2014

Contributor

are we actually useing a buffered stream here? I mean buffered streams seem to help perf wise reading 1024bytes chunks etc?

This comment has been minimized.

Copy link
@dakrone

dakrone Sep 8, 2014

Author Member

No, this doesn't use a buffered stream. I think this PR is not a good place to add a BufferedStreamInput, that's an enhancement we can do in a separate PR.

This comment has been minimized.

Copy link
@s1monw

s1monw Sep 8, 2014

Contributor

+1

@s1monw

View changes

src/main/java/org/elasticsearch/index/translog/TranslogStreams.java Outdated
/** V1, header, with per-op checksums */
public static TranslogStream V1 = new ChecksummedTranslogStream(null, false);
public static TranslogStream V1 = new ChecksummedTranslogStream();

This comment has been minimized.

Copy link
@s1monw

s1monw Sep 8, 2014

Contributor

hey can we rename this V1 to ChecksummedTranslogStream

@s1monw

View changes

src/test/java/org/elasticsearch/index/translog/TranslogVersionTests.java Outdated
@@ -125,9 +128,10 @@ public void testCorruptedTranslogs() throws Exception {
File translogFile = getResource("/org/elasticsearch/index/translog/translog-v1-corrupted-body.binary");
assertThat("test file should exist", translogFile.exists(), equalTo(true));
TranslogStream stream = TranslogStreams.translogStreamFor(translogFile);
StreamInput in = stream.openInput(translogFile);

This comment has been minimized.

Copy link
@s1monw

s1monw Sep 8, 2014

Contributor

should this be a try / with block?

@s1monw

This comment has been minimized.

Copy link
Contributor

s1monw commented Sep 8, 2014

I left some comments but mainly cosmetic

@dakrone

This comment has been minimized.

Copy link
Member Author

dakrone commented Sep 8, 2014

Updated with your feedback, except for the "stream private" comment because I don't understand what you mean.

@clintongormley clintongormley changed the title Clean up translog interface Resiliency: Clean up translog interface Sep 8, 2014

@s1monw

This comment has been minimized.

Copy link
Contributor

s1monw commented Sep 9, 2014

LGTM

@dakrone dakrone force-pushed the dakrone:translog-clean-up branch 2 times, most recently to c390caa Sep 9, 2014

Simplify translog interface
get rid of readSource() entirely, it sucked, Operations should be able
to provide the source themselves.

No more TranslogStream headers, you are now required to pass an
StreamInput or StreamOutput for all operations, which means no extra
state is needed and no need to construct new versions when detecting the
version.

Read and write translog op sizes in TranslogStreams

Previously we handled these integers outside of the translog stream
itself, which was very unclean because other code had to know about
reading the size, or about writing the correct header sometimes.

There is some additional code in LocalIndexShardGateway to handle the
legacy case for older translogs, because we need to read and discard the
size in order to maintain the compatibility for the streaming
operations (they did not read or write the size for 1.3.x and earlier).

Additionally, we need to handle a case where the header is truncated
when recovering from disk

Use a NoopStreamOutput instead of byte arrays

Instead of writing translog operations to a temporary byte array and
then writing that byte array to the stream, we now write the operation
twice, once to a No-op stream to get the size, then again to the real
size.

This trades a little more CPU usage for less memory usage.
@dakrone

This comment has been minimized.

Copy link
Member Author

dakrone commented Sep 9, 2014

Merged to 1.x and master.

@dakrone dakrone closed this Sep 9, 2014

@dakrone dakrone deleted the dakrone:translog-clean-up branch Sep 9, 2014

@jpountz jpountz removed the review label Oct 21, 2014

@clintongormley clintongormley changed the title Resiliency: Clean up translog interface Clean up translog interface Jun 7, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.