Fix multithread CScheduler and reenable test #8016

Merged 2 commits on May 10, 2016


paveljanik
Contributor

This fixes a deadlock in CScheduler that occurs when several serviceQueue threads are waiting for a new task to be added to the taskQueue, but schedule notifies only one of them, which processes the task and exits. The other threads then get stuck waiting for a new task that never arrives.

Hard to reproduce, see #6540 for the discussion.

Fixes #6540.
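
The wake-up hand-off described above can be sketched roughly as follows. This is a minimal illustration, not the actual CScheduler code: `MiniScheduler`, `stopWhenEmpty`, `runDemo` and all member names are hypothetical, it uses the standard library's threading primitives, it has no timed tasks, and it does not handle tasks that throw. The part that matters is the final `notify_one()` in `serviceQueue()`: a worker that leaves the loop wakes one more waiter, so the remaining workers re-check the stop condition instead of sleeping forever.

```cpp
#include <atomic>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical, simplified scheduler (no timed tasks); not the CScheduler code.
class MiniScheduler {
public:
    using Task = std::function<void()>;

    // Queue a task and wake at most ONE waiting worker.
    void schedule(Task f) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            taskQueue_.push_back(std::move(f));
        }
        newTask_.notify_one();
    }

    // Ask workers to exit once the queue is empty and no task is running.
    void stopWhenEmpty() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            drain_ = true;
        }
        newTask_.notify_all();
    }

    // Worker loop; run it on one or more threads.
    void serviceQueue() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (!shouldStop()) {
            if (taskQueue_.empty()) {
                newTask_.wait(lock); // releases the mutex while blocked
                continue;
            }
            Task f = std::move(taskQueue_.front());
            taskQueue_.pop_front();
            ++running_;
            lock.unlock(); // let f() call schedule() without self-deadlock
            f();
            lock.lock();
            --running_;
        }
        // The stop condition may have become true only because this worker
        // finished the last task. Nobody else signals that, so hand the
        // wake-up on; each woken worker exits and wakes the next one.
        newTask_.notify_one();
    }

private:
    // Caller must hold mutex_.
    bool shouldStop() const {
        return drain_ && taskQueue_.empty() && running_ == 0;
    }

    std::mutex mutex_;
    std::condition_variable newTask_;
    std::deque<Task> taskQueue_;
    int running_ = 0;
    bool drain_ = false;
};

// Demo: every task bumps a counter and schedules one follow-up task.
int runDemo(int workers, int tasks) {
    MiniScheduler s;
    std::atomic<int> counter{0};
    for (int i = 0; i < tasks; ++i) {
        s.schedule([&s, &counter] {
            ++counter;
            s.schedule([&counter] { ++counter; });
        });
    }
    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i) {
        pool.emplace_back([&s] { s.serviceQueue(); });
    }
    s.stopWhenEmpty();
    for (std::thread& t : pool) t.join();
    return counter.load();
}
```

In this sketch, removing the last `notify_one()` can leave workers blocked in `wait()` after the final task completes, because finishing a task changes the stop condition without signalling anyone.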

maflcko added the Tests label on May 6, 2016
@fanquake
Member

fanquake commented May 6, 2016

Will also close #8005

@maflcko
Member

maflcko commented May 6, 2016

Tested ACK

28655b8:

$ for i in {1..10000}; do timeout 1 src/test/test_bitcoin --run_test=scheduler_tests &> /dev/null ;echo $? >> /tmp/res ;done;cat /tmp/res|sort|uniq -c
   9630 0
    370 124

70af981:

$ rm /tmp/res; for i in {1..10000}; do timeout 1 src/test/test_bitcoin --run_test=scheduler_tests &> /dev/null ;echo $? >> /tmp/res ;done;cat /tmp/res|sort|uniq -c
  10000 0

Travis issue unrelated.

@laanwj
Member

laanwj commented May 6, 2016

Thanks for fixing this!

@laanwj
Member

laanwj commented May 6, 2016

For some reason Travis is failing on this due to python zmq problems. I don't understand why, as there is no Python (or zmq) related change here. Will try to clear Travis' caches.

@paveljanik
Contributor Author

This is because my branch is based on the pre-python3 merge. Master is python3, thus there is no zmq module for python2...

paveljanik force-pushed the 20160506_multithread_CScheduler branch from 70af981 to 166e4b0 on May 6, 2016 18:45
@paveljanik
Contributor Author

A rebase should solve this, I think. Maybe a Travis kick can help now.

@paveljanik
Contributor Author

This helped. Hmm, shouldn't we temporarily add python-zmq back to Travis to work around this?

Or should we modify the test script that requires Python's zmq module to do nothing when the module is not installed (this could also simplify the test setup for everyone!)? @MarcoFalke what do you think?

@maflcko
Member

maflcko commented May 7, 2016

@paveljanik python-zmq is already present in .travis.yml, you can't add it more than once ;)

I changed it (#7851) to fail instead, so errors are detected (and not silently ignored). The travis failure was an error due to broken cache. I think @laanwj cleared the cache and the travis issue is now fixed.

@maflcko
Member

maflcko commented May 7, 2016

ut re-ACK 166e4b0

@paveljanik
Contributor Author

@MarcoFalke there is python3-zmq, not python-zmq...

@theuni
Member

theuni commented May 7, 2016

Does it need to signal in the catch/rethrow case as well?

@paveljanik
Contributor Author

@theuni I think the rethrow case is there only for when f throws an exception. That is not the case in our multithreaded test. But in general, yes, I think the same problem could occur when f throws.
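
One possible way to cover both the normal and the exceptional exit, sketched here as an assumption and not as what this pull request does, is to move the bookkeeping and the notification into a scope guard so they run on every path out of the worker. All names below (`QueueState`, `ReverseLock`, `WorkerGuard`, `serviceOne`) are hypothetical.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <stdexcept>

// Hypothetical shared state for a pool of queue workers.
struct QueueState {
    std::mutex mutex;
    std::condition_variable newTask;
    int activeWorkers = 0; // guarded by mutex
};

// Unlocks in the constructor and relocks in the destructor, so the mutex
// is held again before any outer guard runs, even during unwinding.
class ReverseLock {
public:
    explicit ReverseLock(std::unique_lock<std::mutex>& lock) : lock_(lock) {
        lock_.unlock();
    }
    ~ReverseLock() { lock_.lock(); }
    ReverseLock(const ReverseLock&) = delete;
    ReverseLock& operator=(const ReverseLock&) = delete;

private:
    std::unique_lock<std::mutex>& lock_;
};

// Counts the worker in; on ANY exit counts it out and wakes one waiter.
// Must be created and destroyed while the mutex is held.
class WorkerGuard {
public:
    explicit WorkerGuard(QueueState& s) : s_(s) { ++s_.activeWorkers; }
    ~WorkerGuard() {
        --s_.activeWorkers;
        s_.newTask.notify_one();
    }
    WorkerGuard(const WorkerGuard&) = delete;
    WorkerGuard& operator=(const WorkerGuard&) = delete;

private:
    QueueState& s_;
};

// Runs one task the way a worker would; exceptions from f propagate.
void serviceOne(QueueState& s, const std::function<void()>& f) {
    std::unique_lock<std::mutex> lock(s.mutex);
    WorkerGuard guard(s);
    {
        ReverseLock unlocked(lock); // do not hold the mutex while running f
        f();                        // may throw
    } // mutex re-acquired here, also during stack unwinding
} // guard runs next (mutex held), then lock releases the mutex
```

Destructors run in reverse order of construction, so when `f` throws the mutex is re-acquired first, the guard then decrements the counter and notifies with the mutex held, and the exception continues to the caller.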

@laanwj
Member

laanwj commented May 10, 2016

@paveljanik Are you going to fix that here?

@paveljanik
Contributor Author

No. I'm thinking about writing another test that exercises the exception case, but not here and not now, sorry.

@laanwj
Member

laanwj commented May 10, 2016

No need to be sorry, thanks for your fix! I just needed clarity about which state this pull is in.

utACK 166e4b0

laanwj merged commit 166e4b0 into bitcoin:master on May 10, 2016
laanwj added a commit that referenced this pull request May 10, 2016
166e4b0 Notify other serviceQueue thread we are finished to prevent deadlocks. (Pavel Janík)
db18ab2 Reenable multithread scheduler test. (Pavel Janík)
maflcko pushed a commit to maflcko/bitcoin-core that referenced this pull request Jun 9, 2016
LarryRuane pushed a commit to LarryRuane/zcash that referenced this pull request Feb 20, 2021
zkbot added a commit to zcash/zcash that referenced this pull request Apr 1, 2021
Bitcoin 0.13 locking PRs

These are locking changes from upstream (bitcoin core) release 0.13, oldest to newest (when they were merged to the master branch).
- bitcoin/bitcoin#7846
- bitcoin/bitcoin#7913
- bitcoin/bitcoin#8016
  - second commit only; first commit, test changes, are already done
- bitcoin/bitcoin#7942

This PR does not include:
 - bitcoin/bitcoin#8244 bitcoin/bitcoin@27f8126
   -  zcash requires locking `cs_main` in this instance (`getrawmempool()` calls `mempoolToJSON()`, which calls `chainActive.Height()`).
bitcoin locked this as resolved and limited conversation to collaborators on Sep 8, 2021
Successfully merging this pull request may close these issues.

scheduler_tests (currently disabled) occasionally deadlocks
5 participants