K7657 handling thread dead state change #6091

Merged

guozhangwang

While looking into KAFKA-7657, I found a few loopholes in this logic:

1. We keep a map of thread-name to thread-state and a global-thread state at the KafkaStreams instance level, in addition to the instance state itself. stateLock is used when accessing the instance state. However, in the thread state change callback, the callers of setState access both the thread-states and the instance state at the same time without holding a lock, which is unsafe when multiple stream threads run concurrently. The fix is to (a) introduce a threadStatesLock in addition to the stateLock, which must always be acquired (to modify the thread-states map) before the stateLock (to modify the instance-level state), and (b) defer the check of the instance-level state into the setState call.

2. When transitioning to state.RUNNING, we check whether all threads are in either the RUNNING or the DEAD state. Some threads may have died during the rebalance period, but we should still proceed to RUNNING if the rest of the threads are transitioning to RUNNING.

3. Added a unit test for 2) above. Also simplified another test as a nit change.
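To make the lock ordering and the RUNNING-or-DEAD check concrete, here is a minimal sketch. It is not the actual KafkaStreams code: only the names stateLock, threadStatesLock, and setState come from the description above, while the class `InstanceStateSketch`, its enums, and the methods `onThreadStateChange`, `allThreadsRunningOrDead`, and `state` are hypothetical and heavily simplified.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the scheme described above.
// NOT the real KafkaStreams implementation.
class InstanceStateSketch {
    enum ThreadState { CREATED, RUNNING, DEAD }
    enum InstanceState { REBALANCING, RUNNING }

    // Lock order: threadStatesLock is always taken before stateLock.
    private final Object threadStatesLock = new Object();
    private final Object stateLock = new Object();

    private final Map<String, ThreadState> threadStates = new HashMap<>();
    private InstanceState state = InstanceState.REBALANCING;

    // Callback a stream thread would invoke when its own state changes.
    void onThreadStateChange(String threadName, ThreadState newState) {
        synchronized (threadStatesLock) {
            threadStates.put(threadName, newState);
            // Move the instance to RUNNING only if every thread is RUNNING or DEAD.
            if (newState == ThreadState.RUNNING && allThreadsRunningOrDead()) {
                setState(InstanceState.RUNNING);
            }
        }
    }

    // Must be called while holding threadStatesLock.
    private boolean allThreadsRunningOrDead() {
        for (ThreadState s : threadStates.values()) {
            if (s != ThreadState.RUNNING && s != ThreadState.DEAD) {
                return false;
            }
        }
        return true;
    }

    // The check of the current instance state happens here, under stateLock,
    // rather than in the caller.
    boolean setState(InstanceState newState) {
        synchronized (stateLock) {
            if (state == newState) {
                return false; // no transition needed
            }
            state = newState;
            return true;
        }
    }

    InstanceState state() {
        synchronized (stateLock) {
            return state;
        }
    }
}
```

In this sketch, a thread that died during the rebalance does not block the instance: once the remaining threads report RUNNING, the instance moves to RUNNING as well. Because every path takes threadStatesLock before stateLock, two callbacks cannot deadlock on the pair of locks.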

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@guozhangwang guozhangwang merged commit 8e4799b into apache:trunk Jan 5, 2019
bbejeck pushed a commit that referenced this pull request Mar 19, 2019
…ead (#6468)

In a previous commit, #6091, we fixed a couple of edge cases, so we no longer need to remove the state listener (before that, we removed the state listener intentionally to avoid some race conditions, which are now gone).

Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bbejeck@gmail.com>
pengxiaolong pushed a commit to pengxiaolong/kafka that referenced this pull request Jun 14, 2019
Reviewers: John Roesler <vvcephei@users.noreply.github.com>, Matthias J. Sax <mjsax@apache.org>
@guozhangwang guozhangwang deleted the K7657-handling-thread-dead-state-change branch April 25, 2020 00:06