KAFKA-19477: Sticky Assignor JMH Benchmark #20118
PTAL @aliehsaeedii
Pull Request Overview
This PR adds a new JMH benchmark to measure the performance of the StreamsStickyAssignor under various group sizes, partition counts, subtopology counts, standby replica settings, and assignment types (full vs. incremental), along with utility methods to generate the necessary group specs and topology configurations.
- Introduce `StreamsStickyAssignorBenchmark` to parameterize and run JMH benchmarks for the Streams assignor.
- Implement `simulateIncrementalRebalance` to benchmark both full and incremental rebalances.
- Add `StreamsAssignorBenchmarkUtils` for creating synthetic `StreamsGroupMember` maps and subtopology configurations.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
| --- | --- |
| jmh-benchmarks/src/main/java/org/apache/kafka/jmh/assignor/StreamsStickyAssignorBenchmark.java | Add JMH benchmark class for the sticky assignor |
| jmh-benchmarks/src/main/java/org/apache/kafka/jmh/assignor/StreamsAssignorBenchmarkUtils.java | Add helper utilities for members and subtopologies |
Comments suppressed due to low confidence (2)
jmh-benchmarks/src/main/java/org/apache/kafka/jmh/assignor/StreamsStickyAssignorBenchmark.java:135
- [nitpick] The variable name `updatedMemberSpec` represents multiple specs; consider renaming it to `updatedMemberSpecs` for clarity.
Map<String, AssignmentMemberSpec> updatedMemberSpec = new HashMap<>();
jmh-benchmarks/src/main/java/org/apache/kafka/jmh/assignor/StreamsStickyAssignorBenchmark.java:60
- [nitpick] Adding a brief class-level Javadoc summarizing the benchmark’s purpose and parameter meanings would improve readability and maintainability.
public class StreamsStickyAssignorBenchmark {
jmh-benchmarks/src/main/java/org/apache/kafka/jmh/assignor/StreamsStickyAssignorBenchmark.java
Thanks for the PR @lucasbru, overall looks good with one comment.
@Threads(1)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
public void doAssignment() {
    taskAssignor.assign(groupSpec, topologyDescriber);
This should either return the calculated `GroupAssignment` object or use a JMH `Blackhole` to consume it, so that the JIT compiler treats the result as used and does not eliminate the call as dead code. It probably makes more sense to just update the return type and add `return taskAssignor.assign(xxx, yyy)`.
Good point! I get a compiler warning when just returning the value, so I added a `Blackhole`.
Thanks for the review @bbejeck, updated. I tested the change on a subset of the parameters and the results did not change, so I am assuming it has no (significant) effect on the benchmark outcomes.
Thanks for the updates @lucasbru, LGTM
19f1638 to 69d1249
The current assignor used in KIP-1071 is verbatim the assignor used on
the client-side. The assignor performance was not a big concern on the
client-side, and it seems some additional performance overhead has crept
in during the adaptation to the broker-side interfaces, so we expect it
to be too slow for groups of non-trivial size.
We base the parameters for these benchmarks on those used for share groups:
partitions per topic
Note, however, that the parameters influencing the Streams assignment
are different and more complicated compared to regular consumer groups /
share consumer groups. The assignment logic is independent of the number
of topics, but depends on the number of subtopologies. A subtopology may
read from multiple topics. We simplify this relationship by assuming one
topic per subtopology. Members may be part of the same process or
separate processes. We introduce a parameter membersPerProcess to cover
the two extreme configurations (1, 50).
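
As a rough illustration of what membersPerProcess controls, the sketch below groups synthetic members into processes. It is not the code from `StreamsAssignorBenchmarkUtils`; the class name, the method `groupMembers`, and the `member-N` / `process-N` naming scheme are hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not the PR's utility code.
class ProcessGroupingSketch {

    // Assigns member i to process (i / membersPerProcess), so that
    // consecutive members share a process until it is "full".
    static Map<String, List<String>> groupMembers(int memberCount, int membersPerProcess) {
        Map<String, List<String>> membersByProcess = new LinkedHashMap<>();
        for (int i = 0; i < memberCount; i++) {
            String processId = "process-" + (i / membersPerProcess);
            membersByProcess
                .computeIfAbsent(processId, k -> new ArrayList<>())
                .add("member-" + i);
        }
        return membersByProcess;
    }

    public static void main(String[] args) {
        // membersPerProcess = 1: every member runs in its own process.
        System.out.println(groupMembers(100, 1).size() + " processes");
        // membersPerProcess = 50: 50 members share each process.
        System.out.println(groupMembers(100, 50).size() + " processes");
    }
}
```

With 100 members, the two extremes yield 100 single-member processes and 2 processes of 50 members each.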
We define 50% of the subtopologies to be stateful. Stateful
subtopologies get standby replicas assigned, if enabled. For example, if
we have 100 subtopologies with 100 partitions, we get 10,000 active
tasks and 5,000 standby tasks.
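
The arithmetic behind this example can be sketched as follows. The class and method names are hypothetical, and the sketch assumes one topic per subtopology (as stated above), exactly half of the subtopologies being stateful, and a single standby replica per stateful task, which is what the 5,000 figure implies.

```java
// Hypothetical sketch of the task-count arithmetic; not part of the PR.
class TaskCountSketch {

    // One active task per (subtopology, partition) pair.
    static long activeTasks(int subtopologies, int partitionsPerTopic) {
        return (long) subtopologies * partitionsPerTopic;
    }

    // Only stateful subtopologies (assumed to be 50%) get standby tasks,
    // one per partition per configured standby replica.
    static long standbyTasks(int subtopologies, int partitionsPerTopic, int standbyReplicas) {
        long statefulSubtopologies = subtopologies / 2;
        return statefulSubtopologies * partitionsPerTopic * standbyReplicas;
    }

    public static void main(String[] args) {
        System.out.println("active=" + activeTasks(100, 100));
        System.out.println("standby=" + standbyTasks(100, 100, 1));
    }
}
```

With standby replicas disabled (`standbyReplicas = 0`), the standby count drops to zero while the active count is unchanged.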
Reviewers: Bill Bejeck <bbejeck@apache.org>