Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Counter-intuitive behavior of Server.close() / wait_closed() #79033

Closed
aymericaugustin mannequin opened this issue Sep 30, 2018 · 19 comments
Closed

Counter-intuitive behavior of Server.close() / wait_closed() #79033

aymericaugustin mannequin opened this issue Sep 30, 2018 · 19 comments
Labels
release-blocker topic-asyncio type-bug An unexpected behavior, bug, or error

Comments

@aymericaugustin
Copy link
Mannequin

aymericaugustin mannequin commented Sep 30, 2018

BPO 34852
Nosy @asvetlov, @cjerdonek, @1st1

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2018-09-30.14:16:34.429>
labels = ['type-bug', '3.7', 'expert-asyncio']
title = 'Counter-intuitive behavior of Server.close() / wait_closed()'
updated_at = <Date 2021-05-30.16:58:41.511>
user = 'https://bugs.python.org/aymericaugustin'

bugs.python.org fields:

activity = <Date 2021-05-30.16:58:41.511>
actor = 'aymeric.augustin'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['asyncio']
creation = <Date 2018-09-30.14:16:34.429>
creator = 'aymeric.augustin'
dependencies = []
files = []
hgrepos = []
issue_num = 34852
keywords = []
message_count = 3.0
messages = ['326725', '326732', '394772']
nosy_count = 4.0
nosy_names = ['asvetlov', 'chris.jerdonek', 'aymeric.augustin', 'yselivanov']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue34852'
versions = ['Python 3.7']

Linked PRs

@aymericaugustin
Copy link
Mannequin Author

aymericaugustin mannequin commented Sep 30, 2018

**Summary**

  1. Is it correct for Server.wait_closed() (as implemented in asyncio) to be a no-op after Server.close()?
  2. How can I tell that all incoming connections have been received by connection_made() after Server.close()?

**Details**

After calling Server.close(), _sockets is None, which makes Server.wait_closed() a no-op: it returns immediately without doing anything (as mentioned in https://bugs.python.org/issue33727).

I'm not sure why the docs suggest to call wait_closed() after close() if it's a no-op. My best guess is: "this design supports third-party event loops that requires an asynchronous API for closing servers, but the built-in event loops don't need that". Does someone know?

I wrote a very simple server that merely accepts connections. I ran experiments where I saturate the server with incoming client connections and close it. I checked what happens around close() (and wait_closed() -- but as it doesn't do anything after close() I'll just say close() from now on.)

The current implementation appears to work as documented, assuming an rather low level interpretation of the docs of Server.close().

Stop serving: close listening sockets and set the sockets attribute to None.

Correct -- I'm not seeing any accept calls in BaseSelectorEventLoop._accept_connection after close().

The sockets that represent existing incoming client connections are left open.

Correct -- if "existing incoming client connections" is interpreted as "client connections that have gone through accept".

The server is closed asynchronously, use the wait_closed() coroutine to wait until the server is closed.

I'm seeing calls to connection_made() after close() because BaseSelectorEventLoop._accept_connection2 triggers connection_made() asynchronously with call_soon().

This is surprising for someone approaching asyncio from the public API rather than the internal implementation. connection_made() is the first contact with new connections. The concept of "an existing incoming client connection for which connection_made() wasn't called yet" is unexpected.

This has practical consequences.

Consider a server that keeps track of established connections via connection_made and connection_lost. If this server calls Server.close(), awaits Server.wait_closed(), makes a list of established connections and terminates them, there's no guarantee that all connections will be closed. Indeed, new connections may appear and call connection_made() after close() and wait_closed() returned!

wait_closed() seems ineffective for this use case.

@aymericaugustin aymericaugustin mannequin added 3.7 (EOL) end of life topic-asyncio type-bug An unexpected behavior, bug, or error labels Sep 30, 2018
@aymericaugustin
Copy link
Mannequin Author

aymericaugustin mannequin commented Sep 30, 2018

For now I will use the following hack:

    server.close()
    await server.wait_closed()
# Workaround for wait_closed() not quite doing
# what I want.
await asyncio.sleep(0)

# I believe that all accepted connections have reached
# connection_made() at this point.

@aymericaugustin
Copy link
Mannequin Author

aymericaugustin mannequin commented May 30, 2021

Would it make sense to add await asyncio.sleep(0) in Server.wait_closed() to ensure that all connections reach connection_made() before wait_closed() returns?

This would be fragile but it would be an improvement over the current behavior, wouldn't it?

@kumaraditya303
Copy link
Contributor

IMO it should wait for all connections to be closed and the current behavior seems like an oversight.

@gvanrossum
Copy link
Member

Yeah, there seem to be an unfinished thought here. Some of the logic I've discovered:

  • The only waiter ever added to self._waiters is wait_closed() itself
  • _attach() / _detach() serve as a kind of reference count, counting transports
  • The only calls are in _SelectorTransport and _ProactorBasePipeTransport
  • _detach is called when the connection is lost, and wakes up the wait_closed() call when the count reaches zero

It does seem likely that wait_closed() just shouldn't check for self._sockets and then all will be well?

Somebody should give this a try.

CC: @aaugustin @1st1

@gvanrossum
Copy link
Member

gvanrossum commented Oct 24, 2022

I think I got it (GH-79033). I don't think we should backport this, even though it's fixing a bug -- code that got the close sequence wrong but managed to get away with it might suddenly hang.

I need to think about a test for the correct behavior.

gvanrossum added a commit that referenced this issue Nov 24, 2022
It was a no-op when used as recommended (after close()).

I had to debug one test (test__sock_sendfile_native_failure) --
the cleanup sequence for the test fixture was botched.

Hopefully that's not a portend of problems in user code --
this has never worked so people may well be doing this wrong. :-(

Co-authored-by: kumar aditya
aaugustin added a commit to python-websockets/websockets that referenced this issue May 18, 2023
@1st1 1st1 reopened this Oct 8, 2023
@1st1
Copy link
Member

1st1 commented Oct 8, 2023

The committed #98582 broke wait_closed().

class Server:
    # ...

    async def wait_closed(self):
        if self._waiters is None or self._active_count == 0:
            return
        waiter = self._loop.create_future()
        self._waiters.append(waiter)
        await waiter

It now can incorrectly assume that the Server is closed and return prematurely, while Server.close() is still working.

  • self._waiters is a list of futures that wait until the server is closed
  • _active_count is used to track how many open client connection to the Server are there

Before #98582, the condition was self._sockets is None or self._waiters is None, which was interpreted as "if we don't have anymore server-side sockets for this Server object OR we've already closed it -- return".

Now the self._waiters is None or self._active_count == 0 condition reads as "if we've already closed Server OR there are no active client connections then return". So in the case where we still have an open server socket wait_closed() now happily returns, which it shouldn't.

@1st1
Copy link
Member

1st1 commented Oct 8, 2023

This appears to be a 3.12 regression and IMO the commit should be just reverted. I don't understand the motivation for the change from the commit message of 5d09d11 (gh-98582). The added test is synthetic, calling directly the underlying private API, something that we shouldn't rely too heavily on in general, aiming for higher-level functional tests instead.

cc @gvanrossum @kumaraditya303

@gvanrossum
Copy link
Member

The test that works in 3.11 but would fail in 3.12 does the following:

loop.call_soon(server.close)
await server.wait_closed()

We do have to fix that in 3.12.1.

@Dreamsorcerer
Copy link
Contributor

Dreamsorcerer commented Oct 24, 2023

Before #98582, the condition was self._sockets is None or self._waiters is None, which was interpreted as "if we don't have anymore server-side sockets for this Server object OR we've already closed it -- return".

Based on our aiohttp code, I was originally under the impression that wait_closed() simply waited for the sockets to stop listening, rather than the connections to close (see #111260).

However, I think the referenced change looks correct, because the very first thing that close() does, even before doing anything to the sockets, is self._sockets = None. Therefore, I don't see how the old if statement can be interpreted in any way other than "if close() has already been called". It doesn't seem to have anything to do with the open sockets and is a no-op if close() was previously called.

Having looked at all this, I think aiohttp shouldn't be calling wait_closed() at all.

@Dreamsorcerer
Copy link
Contributor

The test that works in 3.11 but would fail in 3.12 does the following:

loop.call_soon(server.close)
await server.wait_closed()

We do have to fix that in 3.12.1.

In 3.11, wouldn't that still be an issue, depending on timing? From my reading of the code, wait_closed() would create a waiter, then as soon as the server wakes up, it will resolve that waiter. I think it's irrelevant whether the close() method is called later or not, right?

@gvanrossum
Copy link
Member

gvanrossum commented Oct 25, 2023

@Dreamsorcerer

In 3.11, wouldn't that still be an issue, depending on timing? From my reading of the code, wait_closed() would create a waiter, then as soon as the server wakes up, it will resolve that waiter. I think it's irrelevant whether the close() method is called later or not, right?

I don't see the timing issue you refer to. The waiter is immediately (without intervening await, so the server can't run) added to self._waiters. It will only be woken up by _wakeup(), called by _detach() and close() once self._active_count reaches zero. That doesn't feel like "as soon as the server wakes up" to me.

@Dreamsorcerer
Copy link
Contributor

Right, I think some of those things could be named better. But, doesn't that still happen when connections drop to 0 normally then? i.e. If we just don't call .close(), then wait_closed() would still return if the connections are dropped, but the server is still running?

@gvanrossum
Copy link
Member

gvanrossum commented Oct 25, 2023

_detach() which is called whenever a client connection is dropped, only calls _wakeup() when _sockets is None, which is only the case once close() is called.

Maybe what you're after is the same race condition that @1st1 reported a few comments back: if wait_closed() is called before close() (e.g. immediately after loop.call_soon(server.close)) it will now return immediately (assuming there isn't already another waiter) instead waiting for the last connection to wind down.

I wonder if the fix is to change the condition at the top of wait_closed() to

        if self._sockets is None and self._active_count == 0:
            return

EDIT: Not quite. But now you've piqued my interest and I'll figure out the true condition.

@gvanrossum
Copy link
Member

gvanrossum commented Oct 25, 2023

I think I have a fix. The check at the top of wait_closed() should actually be

        if self._waiters is None:
            return

This makes it block until both conditions are true, as was the intended semantics (that we never quite got right): the server is closed and there are no active connections left. This new, simpler, check works because _wakeup() is the only place that sets _waiters to None, and it is called exactly when both conditions have become true (either in _detach() or in close()).

PR forthcoming.

@Dreamsorcerer
Copy link
Contributor

Sounds good. Another option may have been to have an asyncio.Event to signal when the server is closed, and then it could simply await on that.

@gvanrossum
Copy link
Member

Yeah, but internally an Event is implemented similarly, and this code was already there… Can you approve the PR?

willingc added a commit that referenced this issue Oct 28, 2023
* Try to fix asyncio.Server.wait_closed() again

I identified the condition that `wait_closed()` is intended
to wait for: the server is closed *and* there are no more
active connections.

When this condition first becomes true, `_wakeup()` is called
(either from `close()` or from `_detach()`) and it sets `_waiters`
to `None`. So we just check for `self._waiters is None`; if it's
not `None`, we know we have to wait, and do so.

A problem was that the new test introduced in 3.12 explicitly
tested that `wait_closed()` returns immediately when the server
is *not* closed but there are currently no active connections.
This was a mistake (probably a misunderstanding of the intended
semantics). I've fixed the test, and added a separate test that
checks exactly for this scenario.

I also fixed an oddity where in `_wakeup()` the result of the
waiter was set to the waiter itself. This result is not used
anywhere and I changed this to `None`, to avoid a GC cycle.

* Update Lib/asyncio/base_events.py

---------

Co-authored-by: Carol Willing <carolcode@willingconsulting.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Oct 28, 2023
…GH-111336)

* Try to fix asyncio.Server.wait_closed() again

I identified the condition that `wait_closed()` is intended
to wait for: the server is closed *and* there are no more
active connections.

When this condition first becomes true, `_wakeup()` is called
(either from `close()` or from `_detach()`) and it sets `_waiters`
to `None`. So we just check for `self._waiters is None`; if it's
not `None`, we know we have to wait, and do so.

A problem was that the new test introduced in 3.12 explicitly
tested that `wait_closed()` returns immediately when the server
is *not* closed but there are currently no active connections.
This was a mistake (probably a misunderstanding of the intended
semantics). I've fixed the test, and added a separate test that
checks exactly for this scenario.

I also fixed an oddity where in `_wakeup()` the result of the
waiter was set to the waiter itself. This result is not used
anywhere and I changed this to `None`, to avoid a GC cycle.

* Update Lib/asyncio/base_events.py

---------

(cherry picked from commit 2655369)

Co-authored-by: Guido van Rossum <guido@python.org>
Co-authored-by: Carol Willing <carolcode@willingconsulting.com>
willingc added a commit that referenced this issue Oct 28, 2023
…1336) (#111424)

gh-79033: Try to fix asyncio.Server.wait_closed() again (GH-111336)

* Try to fix asyncio.Server.wait_closed() again

I identified the condition that `wait_closed()` is intended
to wait for: the server is closed *and* there are no more
active connections.

When this condition first becomes true, `_wakeup()` is called
(either from `close()` or from `_detach()`) and it sets `_waiters`
to `None`. So we just check for `self._waiters is None`; if it's
not `None`, we know we have to wait, and do so.

A problem was that the new test introduced in 3.12 explicitly
tested that `wait_closed()` returns immediately when the server
is *not* closed but there are currently no active connections.
This was a mistake (probably a misunderstanding of the intended
semantics). I've fixed the test, and added a separate test that
checks exactly for this scenario.

I also fixed an oddity where in `_wakeup()` the result of the
waiter was set to the waiter itself. This result is not used
anywhere and I changed this to `None`, to avoid a GC cycle.

* Update Lib/asyncio/base_events.py

---------

(cherry picked from commit 2655369)

Co-authored-by: Guido van Rossum <guido@python.org>
Co-authored-by: Carol Willing <carolcode@willingconsulting.com>
iritkatriel pushed a commit to iritkatriel/cpython that referenced this issue Oct 29, 2023
…GH-111336)

* Try to fix asyncio.Server.wait_closed() again

I identified the condition that `wait_closed()` is intended
to wait for: the server is closed *and* there are no more
active connections.

When this condition first becomes true, `_wakeup()` is called
(either from `close()` or from `_detach()`) and it sets `_waiters`
to `None`. So we just check for `self._waiters is None`; if it's
not `None`, we know we have to wait, and do so.

A problem was that the new test introduced in 3.12 explicitly
tested that `wait_closed()` returns immediately when the server
is *not* closed but there are currently no active connections.
This was a mistake (probably a misunderstanding of the intended
semantics). I've fixed the test, and added a separate test that
checks exactly for this scenario.

I also fixed an oddity where in `_wakeup()` the result of the
waiter was set to the waiter itself. This result is not used
anywhere and I changed this to `None`, to avoid a GC cycle.

* Update Lib/asyncio/base_events.py

---------

Co-authored-by: Carol Willing <carolcode@willingconsulting.com>
jgosmann added a commit to jgosmann/dmarc-metrics-exporter that referenced this issue Jan 7, 2024
* Send responses to all ongoing commands.
* Then close the writer so that the server can be shutdown.
  This is required so `wait_closed` can finished which has been fixed
  in Python 3.12 to actually block, see
  python/cpython#79033
jgosmann added a commit to jgosmann/dmarc-metrics-exporter that referenced this issue Jan 7, 2024
This is required so that the server can shutdown, after `wait_closed`
has been fixed in Python 3.12 to actually block.

See python/cpython#79033
@encukou
Copy link
Member

encukou commented Feb 6, 2024

The latest PR is merged. Is there more to do in this issue?

@gvanrossum
Copy link
Member

I think it's done. Closing.

aisk pushed a commit to aisk/cpython that referenced this issue Feb 11, 2024
…GH-111336)

* Try to fix asyncio.Server.wait_closed() again

I identified the condition that `wait_closed()` is intended
to wait for: the server is closed *and* there are no more
active connections.

When this condition first becomes true, `_wakeup()` is called
(either from `close()` or from `_detach()`) and it sets `_waiters`
to `None`. So we just check for `self._waiters is None`; if it's
not `None`, we know we have to wait, and do so.

A problem was that the new test introduced in 3.12 explicitly
tested that `wait_closed()` returns immediately when the server
is *not* closed but there are currently no active connections.
This was a mistake (probably a misunderstanding of the intended
semantics). I've fixed the test, and added a separate test that
checks exactly for this scenario.

I also fixed an oddity where in `_wakeup()` the result of the
waiter was set to the waiter itself. This result is not used
anywhere and I changed this to `None`, to avoid a GC cycle.

* Update Lib/asyncio/base_events.py

---------

Co-authored-by: Carol Willing <carolcode@willingconsulting.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-blocker topic-asyncio type-bug An unexpected behavior, bug, or error
Projects
Status: Done
Development

No branches or pull requests

5 participants