KAFKA-8325: Remove batch from in-flight requests when handling MESSAG… #7176

Merged
merged 4 commits into from Aug 22, 2019


@bob-barrett
Contributor

commented Aug 7, 2019

…E_TOO_LARGE error

This patch fixes a bug in the handling of MESSAGE_TOO_LARGE errors. When a batch is rejected as too large, the producer splits it, re-adds the smaller batches to the accumulator, and deallocates the original batch. However, the original batch was not removed from the list of in-flight batches, so when it eventually expired from that list, the producer tried to deallocate it a second time, causing an error. This patch removes the batch from the list of in-flight batches when the error is handled.
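
The failure mode described above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `InFlightSketch`, `Batch`, `Pool`, and the `removeOnSplit` flag are hypothetical names invented for illustration and are not the Kafka producer classes; the pool is assumed to reject a second deallocation of the same batch, and the actual splitting and re-enqueueing is left out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of the bug; these are not the real Kafka classes.
class InFlightSketch {
    // Stand-in for a producer batch that holds a pooled buffer.
    static final class Batch {
        final String name;
        Batch(String name) { this.name = name; }
    }

    // Stand-in for a buffer pool; freeing the same batch twice is treated as an error.
    static final class Pool {
        private final Set<Batch> freed = new HashSet<>();
        void deallocate(Batch b) {
            if (!freed.add(b))
                throw new IllegalStateException("batch " + b.name + " deallocated twice");
        }
    }

    final Pool pool = new Pool();
    final List<Batch> inFlight = new ArrayList<>();
    final boolean removeOnSplit; // true models the behavior after the fix

    InFlightSketch(boolean removeOnSplit) { this.removeOnSplit = removeOnSplit; }

    void send(Batch b) { inFlight.add(b); }

    // Models handling of a MESSAGE_TOO_LARGE response for a splittable batch.
    void onMessageTooLarge(Batch b) {
        // Splitting and re-enqueueing the smaller batches is omitted here.
        if (removeOnSplit)
            inFlight.remove(b); // the fix: stop tracking the batch before freeing it
        pool.deallocate(b);
    }

    // Models expiry of every batch still listed as in flight.
    void expireInFlight() {
        for (Batch b : new ArrayList<>(inFlight)) {
            inFlight.remove(b);
            pool.deallocate(b);
        }
    }

    public static void main(String[] args) {
        for (boolean fix : new boolean[] {false, true}) {
            InFlightSketch sender = new InFlightSketch(fix);
            Batch big = new Batch("big");
            sender.send(big);
            sender.onMessageTooLarge(big);
            try {
                sender.expireInFlight();
                System.out.println("removeOnSplit=" + fix + ": expiry completed");
            } catch (IllegalStateException e) {
                System.out.println("removeOnSplit=" + fix + ": " + e.getMessage());
            }
        }
    }
}
```

With `removeOnSplit` set to `false`, the batch is still listed as in flight when expiry runs, so the model's pool sees a second deallocation; with `true`, expiry finds nothing left to free.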


@ijuma ijuma requested a review from mumrah Aug 8, 2019

@lukestephenson

left a comment

Thanks. LGTM

@@ -625,6 +625,7 @@ private void completeBatch(ProducerBatch batch, ProduceResponse.PartitionRespons
                 if (transactionManager != null)
                     transactionManager.removeInFlightBatch(batch);
                 this.accumulator.splitAndReenqueue(batch);
+                maybeRemoveFromInflightBatches(batch);
                 this.accumulator.deallocate(batch);

@ijuma

ijuma Aug 10, 2019

Contributor

Would it make sense to have a method that deallocates and removes from in flight batches?

@bob-barrett

bob-barrett Aug 15, 2019

Author Contributor

Yeah, I think that's reasonable. Added.
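
For readers following the thread, the suggestion amounts to a single helper that untracks and frees a batch together, so a call site cannot do one without the other. A minimal sketch, assuming a per-partition map of in-flight batches; `SenderSketch`, `track`, `maybeRemoveFromInFlight`, and `removeAndDeallocate` are hypothetical names for illustration, not necessarily the names used in the patch:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; class, field, and method names are illustrative only.
class SenderSketch {
    static final class Batch {
        final String partition;
        Batch(String partition) { this.partition = partition; }
    }

    // In-flight batches grouped by partition name.
    final Map<String, List<Batch>> inFlight = new HashMap<>();
    // Counts deallocations, standing in for returning buffers to a pool.
    int deallocations = 0;

    void track(Batch b) {
        inFlight.computeIfAbsent(b.partition, k -> new ArrayList<>()).add(b);
    }

    // Removes the batch from the in-flight map if it is present; otherwise a no-op.
    void maybeRemoveFromInFlight(Batch b) {
        List<Batch> batches = inFlight.get(b.partition);
        if (batches != null) {
            batches.remove(b);
            if (batches.isEmpty())
                inFlight.remove(b.partition);
        }
    }

    void deallocate(Batch b) { deallocations++; }

    // Single entry point: callers cannot deallocate without also untracking.
    void removeAndDeallocate(Batch b) {
        maybeRemoveFromInFlight(b);
        deallocate(b);
    }
}
```

Routing every deallocation through one method like this means the in-flight list and the buffer pool are updated in the same place, which is the invariant the original bug violated.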

@ijuma
ijuma approved these changes Aug 15, 2019
Contributor

left a comment

LGTM, just a couple of nits.

client.prepareResponse(new AddPartitionsToTxnResponse(0, Collections.singletonMap(tp0, Errors.NONE)));
sender.runOnce();

// create a producer batch with more than one record so it is eligible to split

@ijuma

ijuma Aug 15, 2019

Contributor

Nit: eligible for splitting?

@bob-barrett

bob-barrett Aug 22, 2019

Author Contributor

Fixed

accumulator.append(tp0, time.milliseconds(), "key2".getBytes(), "value2".getBytes(), null, null,
MAX_BLOCK_TIMEOUT, false).future;

sender.runOnce(); // send request

@ijuma

ijuma Aug 15, 2019

Contributor

Should this comment be before the method for consistency with other comments?

@bob-barrett

bob-barrett Aug 22, 2019

Author Contributor

Fixed

@bob-barrett

Contributor Author

commented Aug 16, 2019

retest this please

@hachikuji
Contributor

left a comment

LGTM

@hachikuji hachikuji merged commit e4215c1 into apache:trunk Aug 22, 2019

2 of 3 checks passed

JDK 11 and Scala 2.13 FAILURE 11621 tests run, 77 skipped, 1 failed.
JDK 11 and Scala 2.12 SUCCESS 11826 tests run, 77 skipped, 0 failed.
JDK 8 and Scala 2.11 SUCCESS 11826 tests run, 77 skipped, 0 failed.
hachikuji added a commit that referenced this pull request Aug 22, 2019
KAFKA-8325; Remove batch from in-flight requests on MESSAGE_TOO_LARGE…
… errors (#7176)

This patch fixes a bug in the handling of MESSAGE_TOO_LARGE errors. The large batch is split, the smaller batches are re-added to the accumulator, and the batch is deallocated, but it was not removed from the list of in-flight batches. When the batch was eventually expired from the in-flight batches, the producer would try to deallocate it a second time, causing an error. This patch changes the behavior to correctly remove the batch from the list of in-flight requests.

Reviewers: Luke Stephenson, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
hzxa21 added a commit to hzxa21/kafka that referenced this pull request Aug 22, 2019
[LI-CHERRY-PICK] [e4215c1] KAFKA-8325, KAFKA-8202; Remove batch from …
…in-flight requests on MESSAGE_TOO_LARGE errors (apache#7176)

TICKET = KAFKA-8052, KAFKA-8202
LI_DESCRIPTION =
This patch fixes memory leaks in producer when batch split happens.

EXIT_CRITERIA = HASH [e4215c1]
ORIGINAL_DESCRIPTION =

This patch fixes a bug in the handling of MESSAGE_TOO_LARGE errors. The large batch is split, the smaller batches are re-added to the accumulator, and the batch is deallocated, but it was not removed from the list of in-flight batches. When the batch was eventually expired from the in-flight batches, the producer would try to deallocate it a second time, causing an error. This patch changes the behavior to correctly remove the batch from the list of in-flight requests.

Reviewers: Luke Stephenson, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
xiowu0 added a commit to linkedin/kafka that referenced this pull request Aug 27, 2019
[LI-CHERRY-PICK] [db8cb96] KAFKA-8325; Remove batch from in-flight re…
…quests on MESSAGE_TOO_LARGE errors (apache#7176)

TICKET = KAFKA-8325
LI_DESCRIPTION =

EXIT_CRITERIA = HASH [db8cb96]
ORIGINAL_DESCRIPTION =

This patch fixes a bug in the handling of MESSAGE_TOO_LARGE errors. The large batch is split, the smaller batches are re-added to the accumulator, and the batch is deallocated, but it was not removed from the list of in-flight batches. When the batch was eventually expired from the in-flight batches, the producer would try to deallocate it a second time, causing an error. This patch changes the behavior to correctly remove the batch from the list of in-flight requests.

Reviewers: Luke Stephenson, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
(cherry picked from commit db8cb96)