Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CURATOR-495] Fixes race in many Curator recipes which could block ZK's event thread #297

Merged
merged 3 commits into from
Dec 17, 2018

Conversation

Randgalt
Copy link
Member

Fixes race in many Curator recipes whereby a pattern was used that called "notifyAll" in a synchronized block in response to a ZooKeeper watcher callback. This created a race and possible deadlock if the recipe instance was already in a synchronized block. This would result in the ZK event thread getting blocked which would prevent ZK connections from getting repaired. This change adds a new executor (available from CuratorFramework) that can be used to do the sync/notify so that ZK's event thread is not blocked.

Fixes race in many Curator recipes whereby a pattern was used that called "notifyAll" in a synchronized block in response to a ZooKeeper watcher callback. This created a race and possible deadlock if the recipe instance was already in a synchronized block. This would result in the ZK event thread getting blocked which would prevent ZK connections from getting repaired. This change adds a new executor (available from CuratorFramework) that can be used to do the sync/notify so that ZK's event thread is not blocked.
* @param monitorHolder object to sync on and notify
* @since 4.1.0
*/
default void postSafeNotify(Object monitorHolder)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about the future...
Would it be better to make it return a CompletableFuture?
This way the caller will have a chance to know when eventually the operation has been executed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea - I'll add that

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 (non binding)

Awesome!

@cammckenzie
Copy link
Contributor

I have just run the tests and everything passed, which is a bit of a novelty! Still failed at the end with that surefire issue. I think that this is good to merge though. Nice work!

@Randgalt
Copy link
Member Author

Randgalt commented Dec 17, 2018

For the record: here’s the Surefire bug report: https://issues.apache.org/jira/browse/SUREFIRE-1612

@asfgit asfgit merged commit 2e802ef into master Dec 17, 2018
@eolivelli
Copy link
Contributor

Did you try to upgrade testng and surefire to latest and greatest version?

@Randgalt
Copy link
Member Author

Did you try to upgrade testng and surefire to latest and greatest version

Surefire was updated as part of getting the latest Apache POM. I'll play around with latest TestNG

@Randgalt
Copy link
Member Author

FYI - I just tried it with the latest TestNg and it still fails. So, I re-opened my issue there: testng-team/testng#1976

gabrywu added a commit to gabrywu/dolphinscheduler that referenced this pull request Mar 5, 2020
Technoboy- added a commit to apache/dolphinscheduler that referenced this pull request Mar 5, 2020
* upgrade curator version
issue: #2020
curator issue: apache/curator#297

* add DefaultEnsembleProvider override

* add some unit test

* add License
@tisonkun tisonkun deleted the CURATOR-495 branch March 13, 2023 07:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants