KAFKA-5340: Batch splitting should preserve magic and transactional flag #3162
Conversation
@becketqin Maybe you can take a look at this?
@@ -238,7 +238,7 @@ public RecordsInfo info() {
         }
     }

-    public void setProducerState(long producerId, short producerEpoch, int baseSequence) {
+    public void setProducerState(long producerId, short producerEpoch, int baseSequence, boolean isTransactional) {
To clarify, the reason I had to change this is that we need to close the record builder in order to split it. At that point, we don't have a producerId yet, so if `isTransactional` is set to true, then `MemoryRecordsBuilder.close()` will raise an exception. To get around that, this change ensures that we always have a producerId when we set `isTransactional`.
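A minimal sketch of the invariant under discussion (hypothetical names, not the actual Kafka classes): if the transactional flag can only be set together with the producer state, `close()` can never see a transactional builder without a producerId.

```java
// Hypothetical sketch, not the real MemoryRecordsBuilder: the transactional
// flag travels with the producer id and epoch instead of being fixed at
// construction time, so the close-time check below cannot fire spuriously.
public class BuilderSketch {
    static final long NO_PRODUCER_ID = -1L;

    long producerId = NO_PRODUCER_ID;
    short producerEpoch = -1;
    boolean isTransactional = false;

    // Mirrors the new signature in this PR: isTransactional is passed at the
    // same time as the producer id and epoch.
    void setProducerState(long producerId, short producerEpoch,
                          int baseSequence, boolean isTransactional) {
        this.producerId = producerId;
        this.producerEpoch = producerEpoch;
        this.isTransactional = isTransactional;
    }

    // Stand-in for the sanity check close() performs: a transactional builder
    // without a producer id would raise an exception.
    boolean canClose() {
        return !(isTransactional && producerId == NO_PRODUCER_ID);
    }
}
```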
Is it worth adding a comment about this?
Yes, I think so.
I think this is a good general improvement, to set all this data right at the very end.
Thunk thunk = thunkIter.next();
if (batch == null) {
    batch = createBatchOffAccumulatorForRecord(record, splitBatchSize);
while (recordBatchIter.hasNext()) {
Not sure I understand this logic: we expect `memoryRecords.batches()` to only have one batch, but here we are expecting it to have many?
If the message format is v0 or v1, then we could have multiple batches (each record is a batch of size 1). Obviously the batch splitting is intended for batches with multiple records, but it felt a little awkward and unnecessary to restrict this function to only magic >= 2 or compression != NONE.
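As a rough illustration of the point above (an assumed helper, not Kafka code): with v0/v1 and no compression, each record iterates as its own batch, so `batches()` can legitimately yield many entries.

```java
// Illustrative sketch, not Kafka code: how many batches a set of records
// iterates as, depending on magic version and compression.
public class BatchCountSketch {
    static final byte MAGIC_VALUE_V2 = 2;

    static int batchCount(int numRecords, byte magic, boolean compressed) {
        // v2 always wraps records into a single batch, and compression wraps
        // older formats too; v0/v1 uncompressed records each stand alone.
        if (magic >= MAGIC_VALUE_V2 || compressed)
            return 1;
        return numRecords;
    }
}
```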
…ion before sending records
@guozhangwang Comments addressed. Please take another look.
… splitting batches
RecordBatch recordBatch = recordBatchIter.next();
if (recordBatch.magic() < MAGIC_VALUE_V2 && !recordBatch.isCompressed())
Should we add the check in `Sender.completeBatch()` as well, to not call split in this case? Otherwise, if the producer was sending uncompressed messages and one of the messages in a batch is too large, it seems the producer will not fire the callback with the correct exception.

This would probably be a rare case, because a big message will typically get sent in a dedicated batch if compression is none. But it is theoretically possible if the user configured the producer batch size to be larger than max.message.size.
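A hedged sketch of the guard being proposed (the names here are assumptions, not the actual Sender code): only attempt a split when the batch format supports it and there is more than one record to redistribute.

```java
// Hypothetical sketch of the proposed check in Sender.completeBatch():
// splitting only helps when the batch can actually be decomposed.
public class SplitGuardSketch {
    static final byte MAGIC_VALUE_V2 = 2;

    static boolean canSplit(byte magic, boolean compressed, int recordCount) {
        // v0/v1 uncompressed "batches" hold a single record by construction,
        // and a single-record batch cannot be made any smaller by splitting.
        return (magic >= MAGIC_VALUE_V2 || compressed) && recordCount > 1;
    }
}
```

When `canSplit` is false, the producer would fall back to failing the batch with the appropriate exception, as it did prior to KIP-126.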
Yes, I think that makes sense. I will update the patch.
@becketqin Actually, in my original patch, I modified this code to handle all cases. Perhaps it would be a little simpler to revert to that behavior?
OK, I think I'm just going to add the check in `Sender.completeBatch()` and follow the behavior prior to KIP-126 if it is the old message format without compression.
@@ -218,7 +226,7 @@ private ProducerBatch createBatchOffAccumulatorForRecord(Record record, int batc
             record.key(), record.value(), record.headers()), batchSize);
         ByteBuffer buffer = ByteBuffer.allocate(initialSize);
         MemoryRecordsBuilder builder = MemoryRecords.builder(buffer, magic(), recordsBuilder.compressionType(),
-                TimestampType.CREATE_TIME, 0L, recordsBuilder.isTransactional());
+                TimestampType.CREATE_TIME, 0L);
Just curious: it seems that the transactional flag here was the same as the parent batch's before the change. Wouldn't that preserve the transactional flag?
Yes, I attempted to fix this previously, but the logic was incorrect. The builder requires that the producerId and the producer epoch are both set if `isTransactional` is set to true. I felt like this was a useful sanity check, so I changed the logic to pass `isTransactional` at the same time that we set the producer id and epoch. Does that make sense?
@@ -232,6 +232,15 @@ public RuntimeException lastError() {
         return lastError;
     }

+    public synchronized boolean ensurePartitionAdded(TopicPartition tp) {
+        if (isInTransaction() && !partitionsInTransaction.contains(tp)) {
+            transitionToFatalError(new IllegalStateException("Attempted to dequeue a record batch to send " +
Would this ever happen, assuming we do not have bugs? If yes, then we should probably throw IllegalStateException directly to indicate a bug?
It should not happen normally, but if there is a bug in the code, then the result is basically a corrupted topic, so I felt it is worth having the check. Transitioning to the fatal error state ensures that the user will see the error and that no further progress can be made. If we just threw the exception, the Sender thread would simply die and the user might never see the error.
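A minimal sketch of the trade-off described above (hypothetical names, not the actual TransactionManager): the fatal-error state is latched and re-raised on the next user-facing call, whereas an exception thrown on the background Sender thread would be lost with it.

```java
// Hypothetical sketch: latching the error lets user-facing calls surface it
// later, instead of the error disappearing when the Sender thread dies.
public class FatalStateSketch {
    private RuntimeException lastError;

    void transitionToFatalError(RuntimeException e) {
        this.lastError = e; // latched: all further operations will fail
    }

    // Called on user-facing paths (e.g. send or commit): the user sees the
    // original error, and no further progress can be made.
    void maybeFailWithError() {
        if (lastError != null)
            throw lastError;
    }
}
```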
Makes sense. I was following the ordinary rule of using an RTE for any unexpected bug, but this is definitely better operationally.
     if (transactionManager != null)
         isTransactional = transactionManager.isInTransaction();
-    return MemoryRecords.builder(buffer, maxUsableMagic, compression, TimestampType.CREATE_TIME, 0L, isTransactional);
+    return MemoryRecords.builder(buffer, maxUsableMagic, compression, TimestampType.CREATE_TIME, 0L);
It seems the `builder()` function with `isTransactional` at line 377 in `MemoryRecords` is not externally used any more; could we remove it then?
     int maxRetries = 1;
-    String topic = "testSplitBatchAndSend";
+    String topic = tp.topic();
Nice cleanup!
LGTM.
Left a minor comment, but otherwise this looks good to me. Thanks!
RecordBatch recordBatch = recordBatchIter.next();
if (recordBatch.magic() < MAGIC_VALUE_V2 && !recordBatch.isCompressed())
    throw new IllegalArgumentException("Batch splitting cannot be used with non-compressed messages " +
This wording could be improved: "Batch splitting cannot be used with non-compressed messages, NOR with message format versions v0 and v1"
The rewording is not quite right. Batch splitting can be used for v0 and v1 if compression is enabled.
Pushed a fix to address @becketqin's comments. If there are no further comments, I will merge later this evening.

Thanks for the patch. LGTM.
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Apurva Mehta <apurva@confluent.io>, Jiangjie Qin <becket.qin@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
Closes #3162 from hachikuji/KAFKA-5340
(cherry picked from commit e4a6b50)
Signed-off-by: Jason Gustafson <jason@confluent.io>