Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZOOKEEPER-4537: Race between SyncThread and CommitProcessor thread #1877

Closed
wants to merge 1 commit into from

Conversation

jithin23
Copy link
Contributor

@jithin23 jithin23 commented May 16, 2022

Zookeeper server can get stuck when it has just one client and the only way it recovers is due to a socket timeout or another client commit request.
Sync thread reads commitIsWaiting outside of a sync block and acts on this information in a sync block later. The actual status of commitIsWaiting can change between the time where commitIsWaiting is read and acted upon because commit thread updates it outside a sync block.
Fix here is to ensure that we read and process commitIsWaiting inside a sync block.

https://issues.apache.org/jira/browse/ZOOKEEPER-4537

@eolivelli
Copy link
Contributor

Good catch.
Do you have some stack traces dump to demonstrate the problem?

Copy link
Contributor

@symat symat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @jithin23 for your PR! Looks good. I retriggered the CI (most probably we just had some flaky tests)

@eolivelli thanks for checking! We had some discussions and a bit more background on the Jira ticket.

@symat
Copy link
Contributor

symat commented May 17, 2022

I think this race bug was introduced by ZOOKEEPER-2024 in 3.6.0, it would be important to cherrypick this to branch-3.6, branch-3.7, and branch-3.8 too.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@eolivelli
Copy link
Contributor

eolivelli commented May 17, 2022

@symat can you please merge to all the active branches (apart from 3.5) ?

@symat
Copy link
Contributor

symat commented May 17, 2022

sure, I'll merge it later today. thanks for the review!

@asfgit asfgit closed this in f770467 May 17, 2022
asfgit pushed a commit that referenced this pull request May 17, 2022
Zookeeper server can get stuck when it has just one client and the only way it recovers is due to a socket timeout or another client commit request.
Sync thread reads commitIsWaiting outside of a sync block and acts on this information in a sync block later. The actual status of commitIsWaiting can change between the time where commitIsWaiting is read and acted upon because commit thread updates it outside a sync block.
Fix here is to ensure that we read and process commitIsWaiting inside a sync block.

https://issues.apache.org/jira/browse/ZOOKEEPER-4537

Author: jithin23 <jithin.girish@gmail.com>

Reviewers: Enrico Olivelli <eolivelli@apache.org>, Mate Szalay-Beko <symat@apache.org>

Closes #1877 from jithin23/ZOOKEEPER-4537

(cherry picked from commit f770467)
Signed-off-by: Mate Szalay-Beko <mszalay@cloudera.com>
asfgit pushed a commit that referenced this pull request May 17, 2022
Zookeeper server can get stuck when it has just one client and the only way it recovers is due to a socket timeout or another client commit request.
Sync thread reads commitIsWaiting outside of a sync block and acts on this information in a sync block later. The actual status of commitIsWaiting can change between the time where commitIsWaiting is read and acted upon because commit thread updates it outside a sync block.
Fix here is to ensure that we read and process commitIsWaiting inside a sync block.

https://issues.apache.org/jira/browse/ZOOKEEPER-4537

Author: jithin23 <jithin.girish@gmail.com>

Reviewers: Enrico Olivelli <eolivelli@apache.org>, Mate Szalay-Beko <symat@apache.org>

Closes #1877 from jithin23/ZOOKEEPER-4537

(cherry picked from commit f770467)
Signed-off-by: Mate Szalay-Beko <mszalay@cloudera.com>
asfgit pushed a commit that referenced this pull request May 17, 2022
Zookeeper server can get stuck when it has just one client and the only way it recovers is due to a socket timeout or another client commit request.
Sync thread reads commitIsWaiting outside of a sync block and acts on this information in a sync block later. The actual status of commitIsWaiting can change between the time where commitIsWaiting is read and acted upon because commit thread updates it outside a sync block.
Fix here is to ensure that we read and process commitIsWaiting inside a sync block.

https://issues.apache.org/jira/browse/ZOOKEEPER-4537

Author: jithin23 <jithin.girish@gmail.com>

Reviewers: Enrico Olivelli <eolivelli@apache.org>, Mate Szalay-Beko <symat@apache.org>

Closes #1877 from jithin23/ZOOKEEPER-4537

(cherry picked from commit f770467)
Signed-off-by: Mate Szalay-Beko <mszalay@cloudera.com>
@symat
Copy link
Contributor

symat commented May 17, 2022

merge is done, thank you @jithin23 for the contribution!

anuragmadnawat1 pushed a commit to anuragmadnawat1/zookeeper that referenced this pull request Nov 2, 2022
Zookeeper server can get stuck when it has just one client and the only way it recovers is due to a socket timeout or another client commit request.
Sync thread reads commitIsWaiting outside of a sync block and acts on this information in a sync block later. The actual status of commitIsWaiting can change between the time where commitIsWaiting is read and acted upon because commit thread updates it outside a sync block.
Fix here is to ensure that we read and process commitIsWaiting inside a sync block.

https://issues.apache.org/jira/browse/ZOOKEEPER-4537

Author: jithin23 <jithin.girish@gmail.com>

Reviewers: Enrico Olivelli <eolivelli@apache.org>, Mate Szalay-Beko <symat@apache.org>

Closes apache#1877 from jithin23/ZOOKEEPER-4537
anuragmadnawat1 added a commit to anuragmadnawat1/zookeeper that referenced this pull request Nov 2, 2022
Zookeeper server can get stuck when it has just one client and the only way it recovers is due to a socket timeout or another client commit request.
Sync thread reads commitIsWaiting outside of a sync block and acts on this information in a sync block later. The actual status of commitIsWaiting can change between the time where commitIsWaiting is read and acted upon because commit thread updates it outside a sync block.
Fix here is to ensure that we read and process commitIsWaiting inside a sync block.

https://issues.apache.org/jira/browse/ZOOKEEPER-4537

Author: jithin23 <jithin.girish@gmail.com>

Reviewers: Enrico Olivelli <eolivelli@apache.org>, Mate Szalay-Beko <symat@apache.org>

Closes apache#1877 from jithin23/ZOOKEEPER-4537

Co-authored-by: jithin23 <jithin.girish@gmail.com>
anurag-harness pushed a commit to anurag-harness/zookeeper that referenced this pull request Jan 13, 2023
Zookeeper server can get stuck when it has just one client and the only way it recovers is due to a socket timeout or another client commit request.
Sync thread reads commitIsWaiting outside of a sync block and acts on this information in a sync block later. The actual status of commitIsWaiting can change between the time where commitIsWaiting is read and acted upon because commit thread updates it outside a sync block.
Fix here is to ensure that we read and process commitIsWaiting inside a sync block.

https://issues.apache.org/jira/browse/ZOOKEEPER-4537

Author: jithin23 <jithin.girish@gmail.com>

Reviewers: Enrico Olivelli <eolivelli@apache.org>, Mate Szalay-Beko <symat@apache.org>

Closes apache#1877 from jithin23/ZOOKEEPER-4537
anurag-harness added a commit to anurag-harness/zookeeper that referenced this pull request Jan 13, 2023
Zookeeper server can get stuck when it has just one client and the only way it recovers is due to a socket timeout or another client commit request.
Sync thread reads commitIsWaiting outside of a sync block and acts on this information in a sync block later. The actual status of commitIsWaiting can change between the time where commitIsWaiting is read and acted upon because commit thread updates it outside a sync block.
Fix here is to ensure that we read and process commitIsWaiting inside a sync block.

https://issues.apache.org/jira/browse/ZOOKEEPER-4537

Author: jithin23 <jithin.girish@gmail.com>

Reviewers: Enrico Olivelli <eolivelli@apache.org>, Mate Szalay-Beko <symat@apache.org>

Closes apache#1877 from jithin23/ZOOKEEPER-4537

Co-authored-by: jithin23 <jithin.girish@gmail.com>
desaikomal pushed a commit to linkedin/zookeeper that referenced this pull request Jun 17, 2023
Zookeeper server can get stuck when it has just one client and the only way it recovers is due to a socket timeout or another client commit request.
Sync thread reads commitIsWaiting outside of a sync block and acts on this information in a sync block later. The actual status of commitIsWaiting can change between the time where commitIsWaiting is read and acted upon because commit thread updates it outside a sync block.
Fix here is to ensure that we read and process commitIsWaiting inside a sync block.

https://issues.apache.org/jira/browse/ZOOKEEPER-4537

Author: jithin23 <jithin.girish@gmail.com>

Reviewers: Enrico Olivelli <eolivelli@apache.org>, Mate Szalay-Beko <symat@apache.org>

Closes apache#1877 from jithin23/ZOOKEEPER-4537

(cherry picked from commit f770467)
Signed-off-by: Mate Szalay-Beko <mszalay@cloudera.com>
desaikomal pushed a commit to linkedin/zookeeper that referenced this pull request Jun 27, 2023
Zookeeper server can get stuck when it has just one client and the only way it recovers is due to a socket timeout or another client commit request.
Sync thread reads commitIsWaiting outside of a sync block and acts on this information in a sync block later. The actual status of commitIsWaiting can change between the time where commitIsWaiting is read and acted upon because commit thread updates it outside a sync block.
Fix here is to ensure that we read and process commitIsWaiting inside a sync block.

https://issues.apache.org/jira/browse/ZOOKEEPER-4537

Author: jithin23 <jithin.girish@gmail.com>

Reviewers: Enrico Olivelli <eolivelli@apache.org>, Mate Szalay-Beko <symat@apache.org>

Closes apache#1877 from jithin23/ZOOKEEPER-4537

(cherry picked from commit f770467)
Signed-off-by: Mate Szalay-Beko <mszalay@cloudera.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants