Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-9885; Evict last members of a group when the maximum allowed is reached #8525

Merged
merged 2 commits into from
Apr 27, 2020

Conversation

dajac
Copy link
Contributor

@dajac dajac commented Apr 21, 2020

This PR updates the algorithm which limits the number of members within a group (group.max.size) to fix the following two issues:

  1. As described in KAFKA-9885, we found out that multiple members of a group can be evicted if the leader of the consumer offset partition changes before the group is persisted. This happens because the current evection logic always evict the first member rejoining the group.
  2. We also found out that dynamic members, when required to have a known member id, are not always limited. The caveat is that the current logic only considers unknown members and uses the group size, which does not include the so called pending members, to accept or reject a member. In this case, when they rejoins, they are not unknown member anymore and thus could bypass the limit. See testDynamicMembersJoinGroupWithMaxSizeAndRequiredKnownMember for the whole scenario.

This PR changes the logic to address the above two issues and extends the tests coverage to cover all the member types.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@dajac dajac changed the title [WIP] KAFKA-9885; Evict last members of a group when the maximum allowed is reached KAFKA-9885; Evict last members of a group when the maximum allowed is reached Apr 24, 2020
@dajac dajac marked this pull request as ready for review April 24, 2020 12:56
@dajac
Copy link
Contributor Author

dajac commented Apr 24, 2020

cc @hachikuji @abbccdda

@dajac
Copy link
Contributor Author

dajac commented Apr 24, 2020

Related to #8437.

@hachikuji
Copy link
Contributor

ok to test

Copy link
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I can fix the typo before merging.

@@ -124,6 +124,35 @@ class GroupCoordinator(val brokerId: Int,
info("Shutdown complete.")
}

/**
* Verify if the group has space to accept the joining member. The various
* criteria are explained bellow.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: bellow

@hachikuji hachikuji merged commit c5d13dc into apache:trunk Apr 27, 2020
ijuma added a commit to ijuma/kafka that referenced this pull request Apr 29, 2020
…t-for-generated-requests

* apache-github/trunk: (366 commits)
  MINOR: Improve producer test BufferPoolTest#testCloseNotifyWaiters. (apache#7982)
  MINOR: document how to escape json parameters to ducktape tests (apache#8546)
  KAFKA-9885; Evict last members of a group when the maximum allowed is reached (apache#8525)
  KAFKA-9866: Avoid election for topics where preferred leader is not in ISR (apache#8524)
  KAFKA-9839; Broker should accept control requests with newer broker epoch (apache#8509)
  KAKFA-9612: Add an option to kafka-configs.sh to add configs from a prop file (KIP-574)
  MINOR: Partition is under reassignment when adding and removing (apache#8364)
  MINOR: reduce allocations in log start and recovery checkpoints (apache#8467)
  MINOR: Remove unused foreign-key join class (apache#8547)
  HOTFIX: Fix broker bounce system tests (apache#8532)
  KAFKA-9704: Fix the issue z/OS won't let us resize file when mmap. (apache#8224)
  KAFKA-8639: Replace AddPartitionsToTxn with Automated Protocol  (apache#8326)
  MINOR: equals() should compare all fields for generated classes (apache#8539)
  KAFKA-9844; Fix race condition which allows more than maximum number of members(apache#8454)
  KAFKA-9823: Remember the sent generation for the coordinator request (apache#8445)
  KAFKA-9883: Add better error message when REST API forwards a request and leader is not known (apache#8536)
  KAFKA-9907: Switch default build to Scala 2.13 (apache#8537)
  MINOR: Some html fixes in Streams DSL documentation (apache#8503)
  MINOR: Enable fatal warnings with scala 2.13 (apache#8429)
  KAFKA-9852: Change the max duration that calls to the buffer pool can block from 2000ms to 10ms to reduce overall test runtime (apache#8464)
  ...
@dajac dajac deleted the KAFKA-9885 branch October 6, 2020 20:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants