KAFKA-19477: Sticky Assignor JMH Benchmark #20118

Merged
lucasbru merged 2 commits into apache:trunk from jmh_benchmarks_base on Jul 9, 2025

Member

@lucasbru lucasbru commented Jul 7, 2025

The current assignor used in KIP-1071 is verbatim the assignor used on
the client-side. The assignor performance was not a big concern on the
client-side, and it seems some additional performance overhead has crept
in during the adaptation to the broker-side interfaces, so we expect it
to be too slow for groups of non-trivial size.

We base the parameters for these benchmarks on those used for share groups:

  • Up to 1000 members
  • Up to 100 topics
  • Up to 100 partitions per topic

Note, however, that the parameters influencing the Streams assignment
are different from, and more complicated than, those for regular consumer
groups and share groups. The assignment logic is independent of the number
of topics, but depends on the number of subtopologies. A subtopology may
read from multiple topics. We simplify this relationship by assuming one
topic per subtopology. Members may be part of the same process or
separate processes. We introduce a parameter membersPerProcess and
benchmark two extreme configurations (1 and 50).
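
One way to picture the membersPerProcess parameter: members are grouped into processes of a fixed size, so a value of 1 gives every member its own process and a value of 50 packs fifty members into each process. The following is a minimal sketch under that assumption. The class, the helper method, and the ID formats are hypothetical and are not taken from the benchmark code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MembersPerProcessExample {

    // Maps each (hypothetical) member ID to a process ID, filling one
    // process completely before starting the next one.
    public static Map<String, String> assignProcessIds(int memberCount, int membersPerProcess) {
        if (membersPerProcess < 1) {
            throw new IllegalArgumentException("membersPerProcess must be at least 1");
        }
        Map<String, String> processByMember = new LinkedHashMap<>();
        for (int i = 0; i < memberCount; i++) {
            // Integer division groups consecutive members into the same process.
            processByMember.put("member-" + i, "process-" + (i / membersPerProcess));
        }
        return processByMember;
    }

    public static void main(String[] args) {
        // 1000 members with 50 members per process share 20 processes.
        long processes = assignProcessIds(1000, 50).values().stream().distinct().count();
        System.out.println(processes); // prints 20
    }
}
```

With membersPerProcess set to 1, the same helper yields as many processes as members, which corresponds to the other extreme of the two configurations.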

We define 50% of the subtopologies to be stateful. Stateful
subtopologies get standby replicas assigned, if enabled. For example, if
we have 100 subtopologies with 100 partitions, we get 10,000 active
tasks and 5,000 standby tasks. 
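
The task counts in this example follow from simple arithmetic. The sketch below assumes one standby replica per stateful task; the description does not state the replica count, but 5,000 standby tasks is consistent with one. The class and method names are illustrative and not part of the PR.

```java
public class TaskCountExample {

    // One active task per partition of every subtopology.
    public static long activeTasks(int subtopologies, int partitionsPerSubtopology) {
        return (long) subtopologies * partitionsPerSubtopology;
    }

    // Only stateful subtopologies receive standby tasks,
    // one per partition and per configured standby replica.
    public static long standbyTasks(int subtopologies, int partitionsPerSubtopology,
                                    double statefulFraction, int standbyReplicas) {
        long statefulSubtopologies = Math.round(subtopologies * statefulFraction);
        return statefulSubtopologies * partitionsPerSubtopology * standbyReplicas;
    }

    public static void main(String[] args) {
        System.out.println(activeTasks(100, 100));          // prints 10000
        System.out.println(standbyTasks(100, 100, 0.5, 1)); // prints 5000
    }
}
```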

Reviewers: Bill Bejeck <bbejeck@apache.org>

@lucasbru lucasbru requested a review from Copilot July 7, 2025 08:38
@lucasbru lucasbru requested review from mjsax and bbejeck July 7, 2025 08:38
@lucasbru
Member Author

lucasbru commented Jul 7, 2025

PTAL @aliehsaeedii

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds a new JMH benchmark to measure the performance of the StreamsStickyAssignor under various group sizes, partition counts, subtopology counts, standby replica settings, and assignment types (full vs. incremental), along with utility methods to generate the necessary group specs and topology configurations.

  • Introduce StreamsStickyAssignorBenchmark to parameterize and run JMH benchmarks for the Streams assignor.
  • Implement simulateIncrementalRebalance to benchmark both full and incremental rebalances.
  • Add StreamsAssignorBenchmarkUtils for creating synthetic StreamsGroupMember maps and subtopology configurations.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| jmh-benchmarks/src/main/java/org/apache/kafka/jmh/assignor/StreamsStickyAssignorBenchmark.java | Add JMH benchmark class for the sticky assignor |
| jmh-benchmarks/src/main/java/org/apache/kafka/jmh/assignor/StreamsAssignorBenchmarkUtils.java | Add helper utilities for members and subtopologies |

Comments suppressed due to low confidence (2)

jmh-benchmarks/src/main/java/org/apache/kafka/jmh/assignor/StreamsStickyAssignorBenchmark.java:135

  • [nitpick] The variable name updatedMemberSpec represents multiple specs; consider renaming it to updatedMemberSpecs for clarity.
        Map<String, AssignmentMemberSpec> updatedMemberSpec = new HashMap<>();

jmh-benchmarks/src/main/java/org/apache/kafka/jmh/assignor/StreamsStickyAssignorBenchmark.java:60

  • [nitpick] Adding a brief class-level Javadoc summarizing the benchmark’s purpose and parameter meanings would improve readability and maintainability.
public class StreamsStickyAssignorBenchmark {

Member

@bbejeck bbejeck left a comment

Thanks for the PR @lucasbru, overall looks good with one comment.

    @Threads(1)
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    public void doAssignment() {
        taskAssignor.assign(groupSpec, topologyDescriber);
Member

This should either return the calculated GroupAssignment object or use a JMH Blackhole to consume it, so the JIT compiler treats the result as used and does not eliminate the call as dead code. It probably makes more sense to just update the return type and add return taskAssignor.assign(xxx, yyy).

Member Author

Good point! I get a compiler warning when just returning the value, so I added a Blackhole.
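
The two options discussed in this thread can be sketched as follows. This is an illustrative JMH class, not the code merged in the PR: computeAssignment() is a hypothetical stand-in for taskAssignor.assign(groupSpec, topologyDescriber), and the class name is made up. It needs jmh-core on the classpath and is meant to be run through the JMH harness.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Threads;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
public class ResultConsumptionSketch {

    // Hypothetical stand-in for the assignor call under test.
    private Map<Integer, Integer> computeAssignment() {
        Map<Integer, Integer> assignment = new HashMap<>();
        for (int task = 0; task < 10_000; task++) {
            assignment.put(task, task % 1_000);
        }
        return assignment;
    }

    // Option 1: return the result. JMH consumes returned values implicitly.
    @Benchmark
    @Threads(1)
    public Map<Integer, Integer> returnResult() {
        return computeAssignment();
    }

    // Option 2: pass the result to a Blackhole that JMH injects as a parameter.
    @Benchmark
    @Threads(1)
    public void consumeWithBlackhole(Blackhole blackhole) {
        blackhole.consume(computeAssignment());
    }
}
```

Both variants keep the computed value observable to the JIT compiler; the Blackhole form leaves the method's return type as void.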

@lucasbru
Member Author

lucasbru commented Jul 8, 2025

Thanks for the review @bbejeck, updated.

I tested the change on a subset of the parameters and the results did not change. So I am assuming no (significant) effect on the benchmark outcomes.

Member

@bbejeck bbejeck left a comment

Thanks for the updates @lucasbru, LGTM

@lucasbru lucasbru force-pushed the jmh_benchmarks_base branch from 19f1638 to 69d1249 Compare July 9, 2025 09:39
@lucasbru lucasbru merged commit dabde76 into apache:trunk Jul 9, 2025
20 checks passed