
KAFKA-8570: Grow buffer to hold down converted records if it was insufficiently sized #6974

Merged
merged 2 commits into apache:trunk from downconversion-bug on Jun 21, 2019

Conversation

dhruvilshah3 (Contributor)

When the log contains out-of-order message formats (for example, a v2 message followed by a v1 message) and consists of compressed batches typically greater than 1kB in size, it is possible for down-conversion to fail. With compressed batches, we estimate the size of the down-converted batches using:

    // Estimate the size of a compressed batch after down-conversion:
    // half the original size, clamped to the range [1kB, 64kB].
    private static int estimateCompressedSizeInBytes(int size, CompressionType compressionType) {
        return compressionType == CompressionType.NONE ? size : Math.min(Math.max(size / 2, 1024), 1 << 16);
    }

This almost always underestimates the size of the down-converted records if the batch is between 1kB and 64kB in size. In general, this means we may underestimate the total size required for compressed batches.
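To make the gap concrete, here is a quick standalone check of the heuristic (a hypothetical restatement of the method above, with the uncompressed branch dropped):

```java
public class EstimateDemo {
    // Same arithmetic as estimateCompressedSizeInBytes for compressed batches:
    // half the input size, clamped to the range [1024, 65536] bytes.
    static int estimate(int size) {
        return Math.min(Math.max(size / 2, 1024), 1 << 16);
    }

    public static void main(String[] args) {
        System.out.println(estimate(10 * 1024));  // 5120  -> a 10kB batch gets a 5kB budget
        System.out.println(estimate(1500));       // 1024  -> clamped to the 1kB floor
        System.out.println(estimate(256 * 1024)); // 65536 -> clamped to the 64kB ceiling
    }
}
```

Unless down-conversion shrinks a batch to half its original size, which is not guaranteed, the initial allocation comes up short and the buffer must grow as records are written.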

Because of an implicit assumption in the code that messages with a lower message format appear before any with a higher message format, we do not grow the buffer we copy the down-converted records into when we see a batch whose format is <= the target message format. This assumption becomes incorrect when the log contains out-of-order message formats, for example because of leaders flapping while the message format is being upgraded.
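A minimal sketch of the growth step such a fix needs, assuming a write-mode ByteBuffer (illustrative only; the PR itself relies on Kafka's Utils.ensureCapacity, and the doubling policy below is not the project's actual choice):

```java
import java.nio.ByteBuffer;

final class BufferGrowthSketch {
    // Grow the destination buffer so it can hold extraBytes beyond what has
    // already been written, regardless of where the batch appears in the log.
    static ByteBuffer growIfNeeded(ByteBuffer buffer, int extraBytes) {
        int required = buffer.position() + extraBytes; // bytes written so far + incoming batch
        if (buffer.capacity() >= required)
            return buffer;
        ByteBuffer larger = ByteBuffer.allocate(Math.max(required, 2 * buffer.capacity()));
        buffer.flip();      // expose the bytes written so far for reading
        larger.put(buffer); // copy them into the larger buffer
        return larger;
    }
}
```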

@hachikuji (Contributor) left a comment

Thanks @dhruvilshah3. Left one comment.

    for (RecordBatchAndRecords recordBatchAndRecords : recordBatchAndRecordsList) {
        temporaryMemoryBytes += recordBatchAndRecords.batch.sizeInBytes();
        if (recordBatchAndRecords.batch.magic() <= toMagic) {
            buffer = Utils.ensureCapacity(buffer, buffer.limit() + recordBatchAndRecords.batch.sizeInBytes());
@hachikuji (Contributor)

Should this be buffer.position() instead of buffer.limit()?

@dhruvilshah3 (Contributor, Author)

doh, yes. thanks!
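The distinction matters because a ByteBuffer that is being written to typically has limit() equal to capacity(), while position() counts the bytes written so far. A small demo of the two bounds:

```java
import java.nio.ByteBuffer;

public class BufferBoundsDemo {
    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        buffer.put(new byte[100]); // write 100 bytes

        System.out.println(buffer.position()); // 100  -> bytes actually written
        System.out.println(buffer.limit());    // 1024 -> still at capacity in write mode

        // Sizing against limit() would request capacity + n for every batch and
        // reallocate needlessly; position() + n is the true space requirement.
    }
}
```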

@hachikuji (Contributor) left a comment

LGTM. Thanks for the fix!

@hachikuji (Contributor)

retest this please

@hachikuji (Contributor)

I filed https://issues.apache.org/jira/browse/KAFKA-8577 for the failing test case. I have seen it fail in other PRs, so I think it is not related.

hachikuji merged commit 5f8b289 into apache:trunk Jun 21, 2019
hachikuji pushed a commit that referenced this pull request Jun 21, 2019
…fficiently sized (#6974)

Reviewers: Jason Gustafson <jason@confluent.io>
dhruvilshah3 deleted the downconversion-bug branch July 3, 2019 16:41
hachikuji pushed a commit that referenced this pull request Jul 12, 2019
…fficiently sized (#7071)

Backport #6974 to 1.1

Reviewers: Jason Gustafson <jason@confluent.io>