KAFKA-16565: IncrementalAssignmentConsumerEventHandler throws error when attempting to remove a partition that isn't assigned #15737
Checking that the TopicPartition is in assignment before attempting to remove it. Also added some logging and refactoring.
    self.assignment.remove(tp)
    revoked.append(tp)
else:
    logger.warn("Could not remove topic partition %s from assignment as it was not previously assigned to %s" % (tp, node.account.hostname))
Do we understand why this situation is happening? Is it maybe related to the mismatched-assignment failure we've seen elsewhere in the tests? My point is just to make sure we're not hiding the real failure with this change. I wouldn't expect the consumer to ever receive a partition to revoke if it was not previously assigned, right?
You're right @lianetm, this fix could result in sweeping the problem under the rug, so to speak. I'll change the logic so that this case still results in an error, but with more information so we can debug.
@lianetm, I changed the logging to an assert that provides useful information for troubleshooting:
    tp = _create_partition_from_dict(topic_partition)
    assert tp in self.assignment, \
        "Topic partition %s cannot be revoked from %s as it was not previously assigned to that consumer" % \
        (tp, node.account.hostname)
    self.assignment.remove(tp)
Thanks! Better, I believe. Do we have a Jira to investigate the failure leading to this? It's concerning (and even more so if this is happening only with the new protocol?).
@lianetm, I will file a Jira on this in the next day or two. Thanks!
Filed KAFKA-16623, FYI.
@lucasbru, can you review this change to the consumer system test harness? Thanks!
Do I understand it correctly that there is no functional change here, just logging?
assert tp in self.assignment, \
    "Topic partition %s cannot be revoked from %s as it was not previously assigned to that consumer" % \
    (tp, node.account.hostname)
@lucasbru—this is the main functional change: ensure that an attempt to remove a partition from the local state verifies that it was previously assigned.
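For readers following along, here is a minimal, self-contained sketch of the check described above. It is not the harness code itself: the `TopicPartition` namedtuple, the `revoke_partitions` function, and the dict shape of each partition entry are hypothetical stand-ins, and it assumes the assignment is held in a plain Python list.

```python
from collections import namedtuple

# Hypothetical stand-in for the harness's topic-partition type.
TopicPartition = namedtuple("TopicPartition", ["topic", "partition"])


def revoke_partitions(assignment, partitions, hostname):
    """Remove each partition from `assignment`, failing loudly on unknown ones.

    `assignment` is assumed to be a list of TopicPartition and `partitions`
    a list of dicts shaped like {"topic": ..., "partition": ...}.
    Returns the list of partitions that were revoked.
    """
    revoked = []
    for entry in partitions:
        tp = TopicPartition(entry["topic"], entry["partition"])
        # Without this check, list.remove() would raise a bare ValueError
        # that names neither the partition nor the consumer host.
        assert tp in assignment, \
            "Topic partition %s cannot be revoked from %s as it was not previously assigned to that consumer" % \
            (tp, hostname)
        assignment.remove(tp)
        revoked.append(tp)
    return revoked


# Example: revoking an assigned partition succeeds and updates the list in place.
current = [TopicPartition("test_topic", 0), TopicPartition("test_topic", 1)]
removed = revoke_partitions(current, [{"topic": "test_topic", "partition": 1}], "worker1")
```

The revocation still fails when the partition was never assigned, so the underlying problem is not hidden; the only difference is that the failure message now carries the partition and host needed to debug it.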
LGTM, thanks