Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

asyncio crashes when tearing down the proactor event loop #83413

Closed
mikeshardmind mannequin opened this issue Jan 6, 2020 · 16 comments
Closed

asyncio crashes when tearing down the proactor event loop #83413

mikeshardmind mannequin opened this issue Jan 6, 2020 · 16 comments
Labels
3.10 only security fixes 3.11 only security fixes 3.12 bugs and security fixes topic-asyncio type-bug An unexpected behavior, bug, or error

Comments

@mikeshardmind
Copy link
Mannequin

mikeshardmind mannequin commented Jan 6, 2020

BPO 39232
Nosy @asvetlov, @cjrh, @1st1, @zooba, @cmeyer, @mikeshardmind, @jack1142, @pepoluan
Files
  • example.py: toy code demonstrating the difference.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2020-01-06.18:29:20.714>
    labels = ['3.10', '3.8', '3.9', 'type-crash', 'expert-asyncio']
    title = 'asyncio crashes when tearing down the proactor event loop'
    updated_at = <Date 2021-08-23.12:27:41.034>
    user = 'https://github.com/mikeshardmind'

    bugs.python.org fields:

    activity = <Date 2021-08-23.12:27:41.034>
    actor = 'machine.gw'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['asyncio']
    creation = <Date 2020-01-06.18:29:20.714>
    creator = 'mikeshardmind'
    dependencies = []
    files = ['48829']
    hgrepos = []
    issue_num = 39232
    keywords = []
    message_count = 7.0
    messages = ['359448', '359515', '362174', '367932', '392261', '396339', '400132']
    nosy_count = 11.0
    nosy_names = ['asvetlov', 'cjrh', 'yselivanov', 'steve.dower', 'cmeyer', 'mikeshardmind', 'jack1142', 'rmawatson', 'pepoluan', 'lawsonjl.ornl', 'machine.gw']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue39232'
    versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

    @mikeshardmind
    Copy link
    Mannequin Author

    mikeshardmind mannequin commented Jan 6, 2020

    When using asyncio.run for an asynchronous application utilizing ssl, on windows using the proactor event loop the application crashes when the loop is closed, completely skipping a finally block in the process.

    This appears to be due to a __del__ method on transports used.

    Manual handling of the event loop close while including a brief sleep appears to work as intended.

    Both versions work fine with the selector event loop on linux.

    This appears to be a somewhat known issue already, as it's been reported to aiohttp, however both the traceback, and the differing behavior seem to indicate this is an issue with the proactor event loop.

    (On linux this still emits a resource warning without the sleep)

    While I don't mind handling the loop cleanup, it seems like this case should also emit a resource warning rather than crashing.

    If it's decided in which way this should be handled, I'm willing to contribute to or help test whatever direction the resolution for this should go.

    Traceback included below, toy version of the problem attached as code.

    Exception ignored in: <function _ProactorBasePipeTransport.__del__ at 0x0000026463039820>
    Traceback (most recent call last):
      File "C:\Users\Michael\AppData\Local\Programs\Python\Python38\lib\asyncio\proactor_events.py", line 116, in __del__
        self.close()
      File "C:\Users\Michael\AppData\Local\Programs\Python\Python38\lib\asyncio\proactor_events.py", line 108, in close
        self._loop.call_soon(self._call_connection_lost, None)
      File "C:\Users\Michael\AppData\Local\Programs\Python\Python38\lib\asyncio\base_events.py", line 715, in call_soon
        self._check_closed()
      File "C:\Users\Michael\AppData\Local\Programs\Python\Python38\lib\asyncio\base_events.py", line 508, in _check_closed       
        raise RuntimeError('Event loop is closed')
    RuntimeError: Event loop is closed

    @mikeshardmind mikeshardmind mannequin added 3.8 only security fixes topic-asyncio type-crash A hard crash of the interpreter, possibly with a core dump labels Jan 6, 2020
    @mikeshardmind
    Copy link
    Mannequin Author

    mikeshardmind mannequin commented Jan 7, 2020

    I don't know if it would be feasible to add this to asyncio, but having a way to mark a resource as needing to be deterministically cleaned up at loop close could probably solve this as well as the underlying reasons why the transports are leaning on __del__ behavior which is an implementation detail (differing on, for example, pypy) and probably improve the overall usefulness of asyncio.run as well.

    An addition like that probably requires more discussion than fixing this crash though.

    @mikeshardmind
    Copy link
    Mannequin Author

    mikeshardmind mannequin commented Feb 18, 2020

    Linking out to a real-world example where this still manages to happen after running the event loop for an entire 2 seconds waiting for transports to close themselves after finishing everything else:

    Cog-Creators/Red-DiscordBot#3560

    As well as what we're currently looking at for a temporary solution for this at this point:

    Cog-Creators/Red-DiscordBot#3566

    I looked into what would need to change to handle this in CPython, but am not confident in my ability to make such a PR after doing so, at least not without more discussion about it.

    The best solution I considered involves making the only public way to make transports be tied to an event loop which hasn't been closed yet, and ensuring the event loop keeps a reference to each of these so that it can deterministically close them at loop finalization. Searching GitHub alone found that this would break way too many things.

    If this can't be fully fixed, a solution which at least ensures this can't cause an uncatchable exception would be appreciated.

    @rmawatson
    Copy link
    Mannequin

    rmawatson mannequin commented May 2, 2020

    This is also happening in other libraries besides aiohttp. Specifically xhttp on windows. There is an open issue at encode/httpx#914

    @mikeshardmind mikeshardmind mannequin added 3.9 only security fixes 3.10 only security fixes labels Apr 13, 2021
    @zooba
    Copy link
    Member

    zooba commented Apr 28, 2021

    I'm also seeing (and annoyed by) this.

    Do we need to do anything more than suppress the error? If we're just failing to notify the event loop that we're closing because the loop is gone, does the notification even matter?

    @lawsonjlornl
    Copy link
    Mannequin

    lawsonjlornl mannequin commented Jun 22, 2021

    We are running into this all the time, ever since the Proactor became the default on Windows in 3.8.

    Usually it comes up when the program terminates due to an unhandled exception during a highly concurrent operation. The resulting cascade of RuntimeErrors often obscures the real reason for failure, which makes debugging more painful than it should be. But sometimes this cascade of RuntimeErrors will occur even when the program otherwise executes successfully. So it can be difficult to know if the program actually blew up or if it's just benign event loop vomit.

    Converting this particular error to a warning would be great, but eliminating the call to the event loop in __del__ would be even better. I understand that's easier said than done, or else ProactorBasePipeTransport wouldn't be leaning on __del__ in the first place.

    @machinegw
    Copy link
    Mannequin

    machinegw mannequin commented Aug 23, 2021

    I added some code to wait for all tasks completion before exit:
    currentTask = asyncio.current_task()
    for t in asyncio.all_tasks():
    if currentTask != t:
    await t
    and still got the exception

    Then I think it created additional thread somewhere and created an event_loop inside it. To dig it out I sub-classed threading.Thread. I also wait for all tasks in all threads before exiting:

    loops: list[asyncio.AbstractEventLoop] = []

    class MyThread(threading.Thread):
        def start(self):
            global loops
            loops.append(asyncio.get_event_loop())
            super().start()
    
    async def func1():
        async with aiohttp.ClientSession() as session:
            async with session.get('https://www.okex.com/api/v5/public/time') as resp:
                print(resp.status)
                print(await resp.json())
    
    threading.Thread = MyThread
    import aiohttp
    
    async def main():
        global loops
        loops.append(asyncio.get_running_loop())
        print(sys.argv)
        task = asyncio.create_task(func1())
        await task
        print('done.')
    
        currentTask = asyncio.current_task()
        for loop in loops:
            for t in asyncio.all_tasks(loop):
                if currentTask != t:
                    await t
    
        print('done2.')
        #await asyncio.sleep(1)
    #if __file__ == '__main__':
    asyncio.run(main())
    

    Then I found out the thread is created inside _loop.getaddrinfo: (files are from python 3.9.6)
    File aiohttp\resolver.py, line 31, in ThreadedResolver.resolve
    FILE asyncio\base_events.py, line 856, in BaseEventLoop(ProactorEventLoop).getaddrinfo

    And it is strange that another thread is created when program exit:
    FILE asyncio\base_events.py, line 563, in BaseEventLoop(ProactorEventLoop).shutdown_default_executor

    But sad it seems vscode cannot break a __del__ call. If I break somewhere else first then it would not crash:(

    @zooba
    Copy link
    Member

    zooba commented May 16, 2022

    eliminating the call to the event loop in del would be even better. I understand that's easier said than done, or else ProactorBasePipeTransport wouldn't be leaning on del in the first place.

    This seems like a great idea, but I have no idea how feasible it is.

    Do we have any active experts in asyncio who can make a judgement call on this? Suppressing the error in __del__ seems like the safer option, but it also looks like it's only called from __del__, so maybe that's not the right idea.

    @arhadthedev
    Copy link
    Member

    @zooba I have a PR fixing this (gh-92842).

    @zooba zooba added 3.11 only security fixes 3.12 bugs and security fixes and removed 3.9 only security fixes 3.8 only security fixes labels May 17, 2022
    @zooba
    Copy link
    Member

    zooba commented May 17, 2022

    @arhadthedev Thanks, that definitely would seem to explain it.

    We may be a long time past when @vstinner added those destructors but maybe he can remember whether proactor and selector loops having different self.close() vs self._sock.close() calls was intentional? Or maybe it was moved from somewhere else?

    @vstinner
    Copy link
    Member

    maybe he can remember whether proactor and selector loops having different self.close() vs self._sock.close() calls was intentional?

    I don't recall the rationale, if there was a rationale :-) The code has a long history coming from the tulip project. It even supported Python 3.3 and older.

    When you're in __del__(), the world is burning, maybe it's not a good idea to still schedule future callbacks by calling self.close() which calls self._loop.call_soon(self._call_connection_lost, None) whereas self is "being destroyed" (in fact, the GC is more complex than it, an object can "resurrect"). That's likely why I didn't call self.close() but self._sock.close() (in the selector implementation): the end of the self.close() -> self._call_connection_lost() callback chain.

    Doing the same in proactor_events.py makes sense. For me, the only important part is to log the ResourceWarning to make sure that people don't get into this code path but close explicitly their asyncio transports. Relying on the GC to close transports, for making asynchronous code reliable is... a bad idea. In asyncio, close() is asynchronous and transports have a is_closing() method!

    @zooba
    Copy link
    Member

    zooba commented May 17, 2022

    When you're in __del__(), the world is burning, maybe it's not a good idea to still schedule future callbacks by calling self.close()

    Okay, if we both agree on that part, let's take the fix for 3.11 and see whether the world breaks.

    I think it should go to 3.10, but without anyone claiming to fully understand it, I'd rather get some real world experience in 3.11 beta before backporting to a stable release.

    @mikeshardmind
    Copy link

    mikeshardmind commented May 21, 2022

    I can confirm the linked PR (gh-92842) fixes the issue as I originally described.

    I still think that relying on __del__ here is not necessarily a good thing, but in the absence of expert opinion in the discussion at the time about addressing that, I didn't bother to look much further into it and just ensured all transports in my code were closed before leaving scope (side-stepping the issue in my own code, rather than fixing it as a whole.)

    @arhadthedev
    Copy link
    Member

    This issue may be closed because the bug is fixed:

    @AlexWaygood AlexWaygood added type-bug An unexpected behavior, bug, or error and removed type-crash A hard crash of the interpreter, possibly with a core dump labels Jul 10, 2022
    @spyoungtech
    Copy link

    I'm running into a similar (but principly the same issue) when using asyncio.subprocess with pipes that are open when the script exits. If this belongs in a separate issue, let me know.

    As a simple example using a IsolatedAsyncioTestCase and pytest

    import asyncio
    from asyncio.subprocess import create_subprocess_exec
    from unittest import IsolatedAsyncioTestCase
    import subprocess
    
    class Test83413(IsolatedAsyncioTestCase):
    
        async def asyncSetUp(self) -> None:
            self.p = await create_subprocess_exec('notepad', stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            # This will cause an error because we're not cleaning up the open pipes
    
        async def test_simple(self):
            await asyncio.sleep(1)
            assert True is False

    Behaviors observed:

    Exception in 3.11.0b5 running `pytest`
    Exception ignored in: <function BaseSubprocessTransport.__del__ at 0x000002BE556D0720>
    Traceback (most recent call last):
      File "C:\Program Files\Python311b5\Lib\asyncio\base_subprocess.py", line 125, in __del__
        _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Program Files\Python311b5\Lib\asyncio\base_subprocess.py", line 70, in __repr__
        info.append(f'stdin={stdin.pipe}')
                    ^^^^^^^^^^^^^^^^^^^^^
      File "C:\Program Files\Python311b5\Lib\asyncio\proactor_events.py", line 79, in __repr__
        info.append(f'fd={self._sock.fileno()}')
                          ^^^^^^^^^^^^^^^^^^^
      File "C:\Program Files\Python311b5\Lib\asyncio\windows_utils.py", line 102, in fileno
        raise ValueError("I/O operation on closed pipe")
    ValueError: I/O operation on closed pipe
    In 3.9.0 running `pytest`
    Exception ignored in: <function BaseSubprocessTransport.__del__ at 0x0000022937622280>
    Traceback (most recent call last):
      File "C:\Python39\lib\asyncio\base_subprocess.py", line 126, in __del__
        self.close()
      File "C:\Python39\lib\asyncio\base_subprocess.py", line 104, in close
        proto.pipe.close()
      File "C:\Python39\lib\asyncio\proactor_events.py", line 108, in close
        self._loop.call_soon(self._call_connection_lost, None)
      File "C:\Python39\lib\asyncio\base_events.py", line 746, in call_soon
        self._check_closed()
      File "C:\Python39\lib\asyncio\base_events.py", line 510, in _check_closed
        raise RuntimeError('Event loop is closed')
    RuntimeError: Event loop is closed
    Exception ignored in: <function _ProactorBasePipeTransport.__del__ at 0x0000022937627A60>
    Traceback (most recent call last):
      File "C:\Python39\lib\asyncio\proactor_events.py", line 115, in __del__
        _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
      File "C:\Python39\lib\asyncio\proactor_events.py", line 79, in __repr__
        info.append(f'fd={self._sock.fileno()}')
      File "C:\Python39\lib\asyncio\windows_utils.py", line 102, in fileno
        raise ValueError("I/O operation on closed pipe")
    ValueError: I/O operation on closed pipe
    Exception ignored in: <function _ProactorBasePipeTransport.__del__ at 0x0000022937627A60>
    Traceback (most recent call last):
      File "C:\Python39\lib\asyncio\proactor_events.py", line 115, in __del__
        _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
      File "C:\Python39\lib\asyncio\proactor_events.py", line 79, in __repr__
        info.append(f'fd={self._sock.fileno()}')
      File "C:\Python39\lib\asyncio\windows_utils.py", line 102, in fileno
        raise ValueError("I/O operation on closed pipe")
    ValueError: I/O operation on closed pipe
    Exception ignored in: <function _ProactorBasePipeTransport.__del__ at 0x0000022937627A60>
    Traceback (most recent call last):
      File "C:\Python39\lib\asyncio\proactor_events.py", line 115, in __del__
        _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
      File "C:\Python39\lib\asyncio\proactor_events.py", line 79, in __repr__
        info.append(f'fd={self._sock.fileno()}')
      File "C:\Python39\lib\asyncio\windows_utils.py", line 102, in fileno
        raise ValueError("I/O operation on closed pipe")
    ValueError: I/O operation on closed pipe

    Expected (my preferred) behavior is the simple warning that the process is being killed (the same as when no stdin/stdout/stderr arguments are provided):

    Close running child process: kill <_WindowsSubprocessTransport closed pid=23660 running>
    

    Reasonable (or more knowledgable) people could disagree on whether that's expected or not, but it would make reading my test outputs much easier :)

    @arhadthedev
    Copy link
    Member

    If this belongs in a separate issue, let me know.

    Did you run the test on Linux or macOS?

    I cannot reproduce it on Windows 10. If you used non-Windows platform that we can be sure that this is caused not by proactor_events.py (covered by this issue) but by something else (so a new PR would be suitable).

    Just for the record, here's my output for a clean build of 0771d71 (labeled as v3.11.0b5):

    C:\Users\oleg\Documents\dev\notmine\cpython>python test83413.py
    Running Release|x64 interpreter...
    
    C:\Users\oleg\Documents\dev\notmine\cpython>python --version
    Running Release|x64 interpreter...
    Python 3.11.0b5
    

    wookayin added a commit to wookayin/pynvim that referenced this issue Oct 15, 2023
    python < 3.10.6 has a bug on proactor-based pipe transport: when a
    `_ProactorBasePipeTransport` instance is garbage-collected (usually
    after the associated event loop is closed), it will raise an error.
    
    For older versions of python, we apply the same monkey-patchin to
    suppress warnings (actually these are errors).
    
    Ref: python/cpython#83413 and python/cpython#92841
    
    CI output (pytest warnings):
    
    ```
    ======================= 71 passed, 66 warnings in 6.55s =======================
     Exception ignored in: <function _ProactorBasePipeTransport.__del__ at 0x0000021A9534F700>
      Traceback (most recent call last):
        File "C:\hostedtoolcache\windows\Python\3.8.10\x64\lib\asyncio\proactor_events.py", line 115, in __del__
          _warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
        File "C:\hostedtoolcache\windows\Python\3.8.10\x64\lib\asyncio\proactor_events.py", line 79, in __repr__
          info.append(f'fd={self._sock.fileno()}')
        File "C:\hostedtoolcache\windows\Python\3.8.10\x64\lib\asyncio\windows_utils.py", line 102, in fileno
          raise ValueError("I/O operation on closed pipe")
      ValueError: I/O operation on closed pipe
      Exception ignored in: <function _ProactorBasePipeTransport.__del__ at 0x0000021A9534F700>
      Traceback (most recent call last):
        File "C:\hostedtoolcache\windows\Python\3.8.10\x64\lib\asyncio\proactor_events.py", line 115, in __del__
        File "C:\hostedtoolcache\windows\Python\3.8.10\x64\lib\asyncio\proactor_events.py", line 79, in __repr__
        File "C:\hostedtoolcache\windows\Python\3.8.10\x64\lib\asyncio\windows_utils.py", line 102, in fileno
      ValueError: I/O operation on closed pipe
    ```
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes 3.11 only security fixes 3.12 bugs and security fixes topic-asyncio type-bug An unexpected behavior, bug, or error
    Projects
    Status: Done
    Development

    No branches or pull requests

    6 participants