KAFKA-5456: Ensure producer handles old format large compressed messages #3356
Conversation
@@ -709,9 +709,8 @@ public boolean hasRoomFor(long timestamp, ByteBuffer key, ByteBuffer value) {
        }

        // Be conservative and not take compression of the new record into consideration.
        return numRecords == 0 ?
                bufferStream.remaining() >= recordSize :
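For readers without the file open: the ternary in the hunk above is truncated. A hedged reconstruction of the full removed check follows; the second branch is recalled from the surrounding method, not shown in this diff, so treat it as an assumption:

    // Hedged reconstruction; the else branch of the ternary is assumed from
    // the rest of hasRoomFor, since the hunk above cuts off mid-expression.
    // Be conservative and not take compression of the new record into consideration.
    return numRecords == 0 ?
            bufferStream.remaining() >= recordSize :
            this.writeLimit >= estimatedBytesWritten() + recordSize;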
Seems like the intent in RecordAccumulator is to ensure the producer can always write a message even if it exceeds the batch size, so I just removed this check.
Maybe it should be at the top after isFull(), though, since we don't need the other computations in that case.
I think it's reasonable to remove this check since we can't guarantee it's always correct. We should probably add a comment to the following code making it clear that it's not really an upper bound.
int size = Math.max(this.batchSize, AbstractRecords.sizeInBytesUpperBound(maxUsableMagic, key, value, headers));
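As an aside, here is a self-contained plain-Java sketch of what that sizing logic does. estimateRecordSize is a hypothetical stand-in for AbstractRecords.sizeInBytesUpperBound, and the overhead constant is an illustrative assumption, not Kafka source:

    // A minimal sketch of the sizing logic above; the helper name and the
    // constant are illustrative assumptions.
    static int chooseBufferSize(int batchSize, int keyLen, int valueLen) {
        int estimate = estimateRecordSize(keyLen, valueLen);
        // The buffer is at least batchSize, growing to fit one large record...
        return Math.max(batchSize, estimate);
    }

    static int estimateRecordSize(int keyLen, int valueLen) {
        final int RECORD_OVERHEAD = 22; // assumed per-record framing, for illustration
        // ...but for old-format compressed messages this estimate omits the outer
        // wrapper record and the codec header, so it is not a true upper bound.
        return RECORD_OVERHEAD + keyLen + valueLen;
    }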
We could perhaps fix part of it by taking into account the outer record header for a compressed message batch with message format < V2. It may be worth doing this to avoid confusion.
However, that still doesn't take into account the compression header added by Gzip (10 bytes), LZ4 (7 bytes), or Snappy (16 bytes). That's probably OK, though, since compression will hopefully shrink the key and value enough for the record to fit. If it doesn't (which may well happen), we will re-allocate the underlying buffer, which is not the end of the world.
Yes, it was because of the compression-specific headers that I didn't bother trying to account for the compressed message set overhead.
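To make the overheads in this thread concrete, a small sketch tallying what the estimate misses for an old-format compressed message set. The codec header sizes are the ones quoted above; LOG_OVERHEAD and the V1 wrapper size are assumptions based on the V0/V1 on-disk layout:

    // Sketch of the extra bytes an old-format (magic < 2) compressed message set
    // carries on top of the inner record. Codec header sizes come from the review
    // comments above; the wrapper constants are assumptions.
    enum Codec { GZIP, LZ4, SNAPPY }

    static int compressedWrapperOverheadV1(Codec codec) {
        final int LOG_OVERHEAD = 12;      // wrapper offset (8 bytes) + size (4 bytes)
        final int WRAPPER_RECORD_V1 = 22; // crc, magic, attributes, timestamp, key/value lengths
        final int codecHeader;
        switch (codec) {
            case GZIP:   codecHeader = 10; break;
            case LZ4:    codecHeader = 7;  break;
            case SNAPPY: codecHeader = 16; break;
            default:     codecHeader = 0;  break;
        }
        return LOG_OVERHEAD + WRAPPER_RECORD_V1 + codecHeader;
    }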
@@ -699,6 +699,10 @@ public boolean hasRoomFor(long timestamp, ByteBuffer key, ByteBuffer value, Head
        if (isFull())
            return false;

        // We always allow at least one record to be appended (the ByteBufferOutputStream will grow as needed)
        if (numRecords == 0)
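Putting the two hunks together, the patched check plausibly reads like this. A hedged reconstruction, with recordSizeUpperBound standing in for the version-specific size computation; not the verbatim merged method:

    // Hedged reconstruction assembled from the hunks in this PR; helper names
    // are assumptions from context, not verified against the merged source.
    public boolean hasRoomFor(long timestamp, ByteBuffer key, ByteBuffer value, Header[] headers) {
        if (isFull())
            return false;

        // We always allow at least one record to be appended
        // (the ByteBufferOutputStream will grow as needed)
        if (numRecords == 0)
            return true;

        // Conservative per-record estimate; compression of the new record is not considered.
        int recordSize = recordSizeUpperBound(timestamp, key, value, headers);
        return this.writeLimit >= estimatedBytesWritten() + recordSize;
    }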
I think you may need to update the method javadoc.
Thanks for the updates, LGTM if the tests pass.
Thanks for taking this over from me. I spent a bit of time understanding the semantics of the old format and how the regression happened. It was quite educational!
More specifically, fix the case where a compressed V0 or V1 message is larger than the producer batch size.

Author: Jason Gustafson <jason@confluent.io>
Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>

Closes #3356 from hachikuji/KAFKA-5456

(cherry picked from commit f49697a)
Signed-off-by: Ismael Juma <ismael@juma.me.uk>