gh-109534: switch from sock_call to sock_call_ex in sock_send #114311

Closed

@geraldog geraldog commented Jan 19, 2024

While writing web crawlers with https://github.com/sonic182/aiosonic, I came across the issues best described in #109534.

I had the great idea of explicitly deleting the event loop my crawler was running on, and hey - I got a traceback of the leaks, repeated ad nauseam:

OSError: [Errno 9] Bad file descriptor

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/asyncio/sslproto.py", line 692, in _process_write_backlog
    self._transport.write(chunk)
  File "/usr/lib/python3.10/asyncio/selector_events.py", line 935, in write
    self._fatal_error(exc, 'Fatal write error on socket transport')
  File "/usr/lib/python3.10/asyncio/selector_events.py", line 729, in _fatal_error
    self._force_close(exc)
  File "/usr/lib/python3.10/asyncio/selector_events.py", line 741, in _force_close
    self._loop.call_soon(self._call_connection_lost, exc)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 753, in call_soon
    self._check_closed()
  File "/usr/lib/python3.10/asyncio/base_events.py", line 515, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
Fatal error on SSL transport
protocol: <asyncio.sslproto.SSLProtocol object at 0x7fb576dadab0>
transport: <_SelectorSocketTransport closing fd=329>
Traceback (most recent call last):
  File "/usr/lib/python3.10/asyncio/selector_events.py", line 929, in write
    n = self._sock.send(data)

Note that this traceback only appears with Python <= 3.10. Python >= 3.11 leaks forever, whether or not the loop is deleted.
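
For reference, the `OSError: [Errno 9] Bad file descriptor` at the top of the traceback is the error `socket.send()` reports when the underlying descriptor has already been closed. A minimal sketch of that condition, assuming a POSIX system (the helper name `send_after_close` is made up for illustration and is not part of asyncio or aiosonic):

```python
import errno
import socket


def send_after_close():
    """Return the errno raised by send() on an already-closed socket.

    Hypothetical helper for illustration only; it does not reproduce the
    leak itself, just the low-level error seen in the traceback.
    """
    left, right = socket.socketpair()
    right.close()
    left.close()  # the descriptor is now invalid, as with a closed transport
    try:
        left.send(b"ping")
    except OSError as exc:
        return exc.errno
    return None  # send() unexpectedly succeeded


if __name__ == "__main__":
    code = send_after_close()
    print(code, errno.errorcode.get(code))
```

On Linux this is expected to report `EBADF`; other platforms may surface a different error code for the same condition.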

That got me thinking: there must be something special about the socket.send() function that keeps the exception from being raised. So I went looking in socketmodule.c, and lo and behold, there are multiple places in that file where we use sock_call(), a thin wrapper around sock_call_ex().

The problem is that wrapper does not set an err pointer to pass to sock_call_ex(). As a result, the exception may never be raised.

As I said, there are multiple instances, so this one-liner won't solve #109534 by itself, but I believe it's a step forward. With this fix my crawler runs much more stably, particularly when I have to abort() the underlying SSL transport.


bedevere-app bot commented Jan 19, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@geraldog

Closed in favor of #114367

@geraldog geraldog closed this Jan 21, 2024