Skip to content

MINOR: refactor FetchResponse#toMessage to avoid creating a lot of un…#9818

Merged
chia7712 merged 2 commits into
apache:trunkfrom
chia7712:MINOR-9818
Jan 5, 2021
Merged

MINOR: refactor FetchResponse#toMessage to avoid creating a lot of un…#9818
chia7712 merged 2 commits into
apache:trunkfrom
chia7712:MINOR-9818

Conversation

@chia7712
Copy link
Copy Markdown
Member

@chia7712 chia7712 commented Jan 3, 2021

Instead of creating intermediate collections to batch data, we directly check the last topic in auto-generated data to collect partitions having same topic.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@chia7712
Copy link
Copy Markdown
Member Author

chia7712 commented Jan 3, 2021

@ijuma This PR is from #9469 and it brings the better refactor :)

@ijuma
Copy link
Copy Markdown
Member

ijuma commented Jan 3, 2021

Thanks for the PR. Have you checked that we get the same batching this way? The behavior won't be the same unless the iterator order is correct.

@chia7712
Copy link
Copy Markdown
Member Author

chia7712 commented Jan 4, 2021

@ijuma Thanks for your quick response!

Have you checked that we get the same batching this way?

There is a test to verify the batch order (https://github.com/apache/kafka/blob/trunk/core/src/test/scala/unit/kafka/server/FetchRequestTest.scala#L83).

The behavior won't be the same unless the iterator order is correct.

There are two rules of batching partition data.

  1. the order must be kept. the tp order from input (iterator) must be equal to tp order in FetchResponseData.
  2. batch the partitions only if we can keep the their order in FetchResponseData. For example: we should group t0-p0, t0-p1 and t1-p2 to two FetchableTopicResponses since the t0-p0 and t0-p1 can be merged. By contrast, t0-p0, t1-p1 and t0-p2 ought to generate three FetchableTopicResponses.

It seems to me the reason of generating the intermediate collections before is that reusing the data in Struct/Schema is not convenient.

Copy link
Copy Markdown
Member

@ijuma ijuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On closer inspection, I now remember that batchByTopic already relies on the ordering, so the new implementation is equivalent. LGTM overall. A nit below that should be easy to resolve.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could have a variable called previousTopic to make this code more readable.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice suggestion. will address it later.

@chia7712 chia7712 merged commit c9afd2d into apache:trunk Jan 5, 2021
@chia7712 chia7712 deleted the MINOR-9818 branch March 25, 2024 15:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants