Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-112202: Ensure that condition.notify() succeeds even when racing with Task.cancel() #112201

Merged
merged 8 commits into from Feb 3, 2024

Conversation

kristjanvalur
Copy link
Contributor

@kristjanvalur kristjanvalur commented Nov 17, 2023

If Condition.notify() raises an exception that could have occurred after the task was selected to be woken up, wake up another task instead, if available.

This ensures that the typical case, of waking up one task to handle something, will succeed even if the candidate task
is cancelled at the same time or otherwise wakes up with an error.

This may result in a spurious wakeup of a waiting task, which is what the Condition Variable protocol specifies, precisely because waking up tasks conservatively (rather wake up too many than too few) is the safest and simplest to implement, and missing wakeups are serious (and cause deadlocks), while extra wakeups are easily handled by application code.

IMHO trying to guarantee that we never perform any extra wakeups would complicate the code too much and be too hard to verify, particularly since that is not a guarantee that Condition Variables are supposed to provide.

The PR includes

  • Tests
  • Fix to asyncio.Condition where raising an exception out of condition.wait after having released the lock will result in a second task being awoken, if available
  • Documentation changes explaining the Condition Variable protocol and how if favors too many wakeups over to few wakeups (i.e. how "spurious wakeups" may occur)

📚 Documentation preview 📚: https://cpython-previews--112201.org.readthedocs.build/

@kristjanvalur kristjanvalur marked this pull request as draft November 17, 2023 13:43
@kristjanvalur kristjanvalur changed the title Fix race condition between asyncio.Task.cancel() (and other errors) and asyncio.Condition.notify() GH-112202: Ensure that condition.notify() succeeds even when racing with Task.cancel() Nov 17, 2023
@kristjanvalur kristjanvalur changed the title GH-112202: Ensure that condition.notify() succeeds even when racing with Task.cancel() gh-112202: Ensure that condition.notify() succeeds even when racing with Task.cancel() Nov 17, 2023
@kristjanvalur kristjanvalur marked this pull request as ready for review November 17, 2023 14:03
@gvanrossum
Copy link
Member

I'm reviewing this, but (as usual with locking changes) it is hard work (thanks for helping out here!). Plus I need clarity about the relationship with gh-111694.

@gvanrossum gvanrossum marked this pull request as draft November 20, 2023 22:01
@gvanrossum
Copy link
Member

Turned it into a draft until gh-111694 is merged.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've put off reviewing this PR until gh-111694 lands, but I had already started some doc nits, so here they are. Also note I changed this to Draft mode -- please undraft it once the other PR is merged (into main, and here).

Doc/library/asyncio-sync.rst Outdated Show resolved Hide resolved
Doc/library/asyncio-sync.rst Outdated Show resolved Hide resolved
Doc/library/asyncio-sync.rst Outdated Show resolved Hide resolved

def _notify(self, n):
idx = 0
for fut in self._waiters:
if idx >= n:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the docstring for notify_all() below mentions threads (twice). That should probably be changed to tasks. (Or coroutines, like above? Though IMO that should also be tasks -- in practice all coroutines are wrapped by tasks, and tasks are the unit of control that users are encouraged to think in terms of.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think the mention of "coroutines" is a relic from the very old days.
The locks.py discusses coroutines in a lot of the docstrings where "tasks" are more appropriate. scheduling works on Task objects, not coroutines. I can change it wholesale, do I use "Task" or "task" when doing so?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use "task" -- it doesn't really matter whether they are technically instances of asyncio.Task, and in fact IIRC even loops may override create_task() to return instances of some other class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I take it you approve of me going over the other inline docs/comments and making that correction, I'll do that in a separate commit.

Comment on lines 302 to 324
This method wakes up at most n of the coroutines waiting for the
This method wakes up n of the coroutines waiting for the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as in the docs above.

@kristjanvalur kristjanvalur force-pushed the kristjan/condition_notify branch 2 times, most recently from 0939710 to d7be0d0 Compare January 12, 2024 09:38
@kristjanvalur kristjanvalur marked this pull request as ready for review January 12, 2024 09:39
@kristjanvalur
Copy link
Contributor Author

I've rebased and re-opened this pr, will address the comments presently.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, we're close to merging this now.

I wonder if we should use "awakened" consistently instead of "awoken"?

PS. I will admit to not carefully reading through the tests, I trust you have got this right.

Doc/library/asyncio-sync.rst Outdated Show resolved Hide resolved
Doc/library/asyncio-sync.rst Outdated Show resolved Hide resolved
Lib/asyncio/locks.py Outdated Show resolved Hide resolved
Lib/asyncio/locks.py Outdated Show resolved Hide resolved
Lib/asyncio/locks.py Outdated Show resolved Hide resolved
Lib/asyncio/locks.py Outdated Show resolved Hide resolved

def _notify(self, n):
idx = 0
for fut in self._waiters:
if idx >= n:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use "task" -- it doesn't really matter whether they are technically instances of asyncio.Task, and in fact IIRC even loops may override create_task() to return instances of some other class.

kristjanvalur and others added 2 commits January 16, 2024 19:17
Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>
Lib/asyncio/locks.py Outdated Show resolved Hide resolved
Lib/asyncio/locks.py Outdated Show resolved Hide resolved
@kristjanvalur
Copy link
Contributor Author

Thanks, we're close to merging this now.

I wonder if we should use "awakened" consistently instead of "awoken"?

I think you are right, I fixed it.

PS. I will admit to not carefully reading through the tests, I trust you have got this right.

I certainly hope so :). I made sure they failed without the fix, it's easy to test.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two doc nits, then I'll merge.

Lib/asyncio/locks.py Outdated Show resolved Hide resolved
Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>
@gvanrossum gvanrossum merged commit 6b53d5f into python:main Feb 3, 2024
33 checks passed
@kristjanvalur
Copy link
Contributor Author

Thank you for your help, Guido!

aisk pushed a commit to aisk/cpython that referenced this pull request Feb 11, 2024
…cing with Task.cancel() (python#112201)

Also did a general cleanup of asyncio locks.py comments and docstrings.
fsc-eriker pushed a commit to fsc-eriker/cpython that referenced this pull request Feb 14, 2024
…cing with Task.cancel() (python#112201)

Also did a general cleanup of asyncio locks.py comments and docstrings.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants