KAFKA-5226: Fixes issue where adding topics matching a regex subscribed stream may not be detected by all followers until onJoinComplete returns. #3157
This PR addresses the situation where adding topics matching a regex-subscribed stream may not be detected by all followers in
ping @mjsax @dguy @guozhangwang for review
Refer to this link for build results (access rights to CI server needed): |
The test failure is |
retest this please |
Good find! About the fix itself, in
I think a better way to fix it is to modify Also if we do that in |
try {
    streamsLeader.start();
    Thread.sleep(1000);
I'd suggest avoiding Thread.sleep() as much as possible in unit tests, as it introduces timing-dependent flakiness. If there is a condition that we need to wait on, consider using TestUtils.waitForCondition.
Also, this test does not seem to always exercise the race condition that can cause the issue. I'd suggest making it really "modular" by adding a test for StreamThread in which we do not need to start the thread: just mock its embedded StreamPartitionAssignor#onAssignment() so that it is given some assigned partition whose topic is not yet known, and then call its onPartitionAssigned callback to check whether the topology builder has updated its topology.
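To illustrate the suggestion above, here is a minimal sketch of waiting on a condition instead of sleeping for a fixed time. The helper below is a hypothetical stand-in written for this example; its name mirrors TestUtils.waitForCondition, but its signature, timeout handling, and 10 ms poll interval are assumptions, not Kafka's actual test utility.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical polling helper; NOT Kafka's TestUtils. Names and the
// poll interval are assumptions made for illustration only.
final class PollingWait {

    // Polls the condition until it holds or the timeout elapses.
    static void waitForCondition(BooleanSupplier condition, long maxWaitMs, String message) {
        long deadline = System.currentTimeMillis() + maxWaitMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                throw new AssertionError(message);
            }
            try {
                Thread.sleep(10L); // short poll interval, not a fixed long sleep
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new AssertionError(message);
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean started = new AtomicBoolean(false);
        Thread worker = new Thread(() -> started.set(true));
        worker.start();
        // Returns as soon as the flag flips, however long that takes (up to 5 s).
        waitForCondition(started::get, 5_000L, "worker did not start in time");
        worker.join();
        System.out.println("condition met");
    }
}
```

The test then proceeds as soon as the condition holds and fails with a clear message if it never does, rather than depending on a sleep being long enough.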
I'll remove the first two, but is it OK to leave in the one on line 352? I want to space out the creation of topics.
I'm actually thinking more about making the unit test really "unit" by mocking stream partition assignor to always return unknown topic partitions; the current test case seems to be more "integration test"?
Yes it is, I see your point now about the test, I'll make the changes.
@guozhangwang thanks for the review. I'll move the fix to the From what I can see we'll need to leave the to call to |
makes sense. |
@guozhangwang updates per comments.
Just some nits. Overall LGTM.
private void updateSubscribedTopics(Assignment assignment) {
    if (streamThread != null && streamThread.builder.sourceTopicPattern() != null) {
How could streamThread be null -- null would indicate a bug and we might want to fail hard on that instead of masking it with this check?
you're right it can't, I put that in for unit tests, I'll remove and update the tests
}

private void updateSubscribedTopics(Assignment assignment) {
nit: add final; also in L616
Ack
for (TopicPartition topicPartition : assignment.partitions()) {
    assignedTopics.add(topicPartition.topic());
}
if (!assignedTopics.isEmpty() && !streamThread.builder.subscriptionUpdates().getUpdates().containsAll(assignedTopics)) {
Would containsAll not return true if assignedTopics is empty? (i.e., can we remove the first check?)
Good catch, containsAll returns true if assignedTopics is empty, so I'm removing the first check.
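The point about containsAll can be checked in isolation. The sketch below uses plain java.util collections; the class name, method name, and topic names are made up for illustration and are not taken from the PR.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Illustrative only: class, method, and topic names are assumptions.
final class ContainsAllEmpty {

    // True when at least one assigned topic is missing from the known set.
    // No separate isEmpty() guard is needed: containsAll of an empty
    // collection is true, so this returns false for an empty assignment.
    static boolean hasUnknownTopics(Set<String> known, Set<String> assigned) {
        return !known.containsAll(assigned);
    }

    public static void main(String[] args) {
        Set<String> known = new HashSet<>(Arrays.asList("topic-1", "topic-2"));

        // Empty assignment: nothing is unknown.
        System.out.println(hasUnknownTopics(known, Collections.<String>emptySet())); // false
        // Assignment of an already known topic.
        System.out.println(hasUnknownTopics(known, Collections.singleton("topic-1"))); // false
        // Assignment containing a topic not seen before.
        System.out.println(hasUnknownTopics(known, Collections.singleton("topic-3"))); // true
    }
}
```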
@mjsax removed null check for cc @guozhangwang
Thanks for the update! LGTM overall, just a few minor comments.
}

private void updateSubscribedTopics(final Assignment assignment) {
nit: could we change this to maybeUpdateSubscribedTopics(Set<String> topics) with the check on streamThread.builder.sourceTopicPattern() != null within the function so that both subscription() and onAssignment could consolidate on this function?
Yes. This will cause a rebuild of the topology in those cases where all topics are already known. But thinking about it some more, IMHO it's worth the tradeoff for simplifying the logic.
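As a rough sketch of the consolidation discussed above: the pattern check lives inside one method that both call sites can share. Everything below is a hypothetical stand-in, not the Kafka Streams classes; in particular, the fields, the rebuild counter, and the choice to accumulate topics with addAll are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical stand-in for the assignor; not the actual PR code.
final class SubscriptionSketch {
    private final Pattern sourceTopicPattern; // null when no regex source is used
    private final Set<String> subscribedTopics = new HashSet<>();
    private int rebuildCount = 0;             // stands in for a topology rebuild

    SubscriptionSketch(Pattern sourceTopicPattern) {
        this.sourceTopicPattern = sourceTopicPattern;
    }

    // Single entry point intended for both subscription() and onAssignment().
    // The pattern check is inside, so callers need no guard of their own.
    void maybeUpdateSubscribedTopics(Set<String> topics) {
        if (sourceTopicPattern == null) {
            return; // no regex subscription, nothing to update
        }
        // Assumption: topics accumulate. Note the tradeoff raised in the
        // thread: this "rebuilds" even when every topic is already known.
        subscribedTopics.addAll(topics);
        rebuildCount++;
    }

    Set<String> subscribedTopics() {
        return Collections.unmodifiableSet(subscribedTopics);
    }

    int rebuildCount() {
        return rebuildCount;
    }

    public static void main(String[] args) {
        SubscriptionSketch sketch = new SubscriptionSketch(Pattern.compile("topic-\\d+"));
        sketch.maybeUpdateSubscribedTopics(Collections.singleton("topic-1"));
        sketch.maybeUpdateSubscribedTopics(Collections.singleton("topic-1"));
        System.out.println(sketch.subscribedTopics() + " rebuilds=" + sketch.rebuildCount());
    }
}
```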
@@ -104,6 +106,12 @@
private final MockClientSupplier mockClientSupplier = new MockClientSupplier();
private final TopologyBuilder builder = new TopologyBuilder();
private final StreamsConfig config = new StreamsConfig(configProps());
private final StreamThread mockStreamThread = new StreamThread(builder, config,
Why modify this test class to let partitionAssignor set this mock stream thread?
There are some tests using onAssign which were getting NPEs. Since in practice the embedded streamThread should never be null, I opted to update the test.
Thanks.
assignment = new PartitionAssignor.Assignment(topicPartitions, info.encode());
partitionAssignor.onAssignment(assignment);

assertTrue(nodeToSourceTopics.get("source").size() == 3);
Is the third step (i.e. from 2 topics to 3 topics) intended for additional test coverage?
Yes I wanted to exercise the functionality more than once.
@guozhangwang thanks for the additional pass, updates per comments. |
retest this please |
previous failure was unrelated to this PR |
LGTM (besides the reflection comment, but this should not block the PR).
final Field nodeToSourceTopicsField =
    topologyBuilder.getClass().getDeclaredField("nodeToSourceTopics");
nodeToSourceTopicsField.setAccessible(true);
Arg. Reflection. It seems our code is not well structured if we need this. Can you file a Jira to do some follow-up cleanup (or, if this does not make it into 0.11, we might do the cleanup first)?
Ack
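For context on the reflection concern, here is a minimal sketch of how a test reads a private field via java.lang.reflect. The Registry class and its counts field are hypothetical and exist only for this example; only the getDeclaredField / setAccessible / get calls mirror what the test in the PR does.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: Registry and its "counts" field are hypothetical.
final class ReflectionPeek {

    // A class that keeps its state private and exposes no accessor.
    static final class Registry {
        private final Map<String, Integer> counts = new HashMap<>();

        void record(String key) {
            counts.merge(key, 1, Integer::sum);
        }
    }

    // Reads the private field by name. This breaks silently at compile time
    // if the field is renamed, which is part of why reviewers dislike it.
    @SuppressWarnings("unchecked")
    static Map<String, Integer> readCounts(Registry registry) {
        try {
            Field field = Registry.class.getDeclaredField("counts");
            field.setAccessible(true);
            return (Map<String, Integer>) field.get(registry);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("field lookup failed", e);
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.record("source");
        registry.record("source");
        System.out.println(readCounts(registry));
    }
}
```

A package-private accessor on the class under test would let the test drop the reflection entirely, which is presumably the kind of follow-up cleanup being asked for.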
@guozhangwang @mjsax updated per comments and discussion on slack. Specifically, now we'll only rebuild the topology under two conditions
LGTM! Merged to trunk and 0.11.0. |
subscribed stream may not be detected by all followers until onJoinComplete returns.
Author: Bill Bejeck <bill@confluent.io>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes #3157 from bbejeck/KAFKA-5226_null_pointer_source_node_deserialize
(cherry picked from commit 6360e04)
Signed-off-by: Guozhang Wang <wangguoz@gmail.com>