use AI_NUMERICHOST | AI_NUMERICSERV to skip getaddrinfo thread in asyncio #90980

Open
graingert mannequin opened this issue Feb 22, 2022 · 10 comments
Labels
topic-asyncio type-feature A feature request or enhancement

Comments

@graingert
Mannequin

graingert mannequin commented Feb 22, 2022

BPO 46824
Nosy @asvetlov, @bdarnell, @1st1, @graingert
PRs
  • gh-90980: skip getaddrinfo thread if host is already resolved, using socket.AI_NUMERIC... #31497
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2022-02-22.10:15:28.937>
    labels = ['type-feature', 'expert-asyncio']
    title = 'use AI_NUMERICHOST | AI_NUMERICSERV to skip getaddrinfo thread in asyncio'
    updated_at = <Date 2022-03-18.16:45:34.609>
    user = 'https://github.com/graingert'

    bugs.python.org fields:

    activity = <Date 2022-03-18.16:45:34.609>
    actor = 'Ben.Darnell'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['asyncio']
    creation = <Date 2022-02-22.10:15:28.937>
    creator = 'graingert'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 46824
    keywords = ['patch']
    message_count = 5.0
    messages = ['413699', '413701', '413702', '415510', '415511']
    nosy_count = 4.0
    nosy_names = ['asvetlov', 'Ben.Darnell', 'yselivanov', 'graingert']
    pr_nums = ['31497']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue46824'
    versions = []

    @graingert
    Mannequin Author

    graingert mannequin commented Feb 22, 2022

    Now that the getaddrinfo lock has been removed on all platforms, the numeric-only host resolution in asyncio could be moved back into BaseEventLoop.getaddrinfo.

    @graingert graingert mannequin added topic-asyncio type-feature A feature request or enhancement labels Feb 22, 2022
    @asvetlov
    Contributor

    Could you provide more context for the proposed change?

    @graingert
    Mannequin Author

    graingert mannequin commented Feb 22, 2022

    Hello, the context is a bit roundabout: it came up on a Tornado issue where I was attempting to port the asyncio optimization to Tornado: tornadoweb/tornado#3113 (comment)

    I think it would be better to use the AI_NUMERICHOST | AI_NUMERICSERV optimization from trio everywhere instead.
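
For readers unfamiliar with the flags, here is a minimal sketch of the idea. It is not the code from the PR or from trio, and `resolve_numeric` is a hypothetical helper name. With both flags set, `socket.getaddrinfo` only parses numeric literals and is expected to fail quickly with `gaierror` rather than consult DNS when the host is a name.

```python
import socket

# Both flags together ask getaddrinfo to accept only numeric host and
# service values, so no name lookup should be attempted.
NUMERIC_FLAGS = socket.AI_NUMERICHOST | socket.AI_NUMERICSERV


def resolve_numeric(host, port, family=0, type=0, proto=0, flags=0):
    """Hypothetical helper: return addrinfo for numeric literals, else None."""
    try:
        return socket.getaddrinfo(
            host, port, family, type, proto, flags | NUMERIC_FLAGS
        )
    except socket.gaierror:
        # Not a numeric literal (or not valid for the requested family).
        return None


if __name__ == "__main__":
    print(resolve_numeric("1.2.3.4", 80, type=socket.SOCK_STREAM))
    print(resolve_numeric("example.invalid", 80))
```

A caller would treat `None` as "needs a real lookup" and fall back to the normal resolver.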

    @bdarnell
    Mannequin

    bdarnell mannequin commented Mar 18, 2022

    To summarize the justification, this patch does two things: it moves an optimization from create_connection to getaddrinfo, which makes it apply to more callers (including Tornado), and it makes the code simpler and less redundant (net reduction of 47 non-test lines in the patch).

    As far as we can tell, the reason it wasn't done this way in the first place is that at the time getaddrinfo held a global lock on some platforms, but this is no longer true. If there's still some locking in or around getaddrinfo on some platforms (or some libc implementations), this patch would be a bad idea. Is there a good way to test for that? I suppose we could set up a deliberately-slow DNS server and try to call getaddrinfo with AI_NUMERICHOST while another thread is blocked talking to that server, but that seems like a lot of test infrastructure to build out.
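
The first point, applying the fast path inside getaddrinfo itself rather than in create_connection, could be sketched roughly as follows. This is an illustration under stated assumptions, not the patch: `getaddrinfo_fast` is a hypothetical wrapper, and it assumes the numeric-only call never blocks, which is exactly the open question about locking raised above.

```python
import asyncio
import socket

NUMERIC_FLAGS = socket.AI_NUMERICHOST | socket.AI_NUMERICSERV


async def getaddrinfo_fast(host, port, *, family=0, type=0, proto=0, flags=0):
    """Hypothetical wrapper: try a numeric-only lookup inline, else use a thread."""
    loop = asyncio.get_running_loop()
    try:
        # Assumed to be non-blocking: only parses numeric literals.
        return socket.getaddrinfo(
            host, port, family, type, proto, flags | NUMERIC_FLAGS
        )
    except socket.gaierror:
        pass
    # Not numeric: fall back to the loop's executor-based resolver.
    return await loop.getaddrinfo(
        host, port, family=family, type=type, proto=proto, flags=flags
    )


if __name__ == "__main__":
    print(asyncio.run(getaddrinfo_fast("127.0.0.1", 8080, type=socket.SOCK_STREAM)))
```

Because the fast path lives in the resolver rather than in one connection helper, any caller that goes through it benefits, which is the argument made for Tornado.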

    @bdarnell
    Mannequin

    bdarnell mannequin commented Mar 18, 2022

    On macOS in 2015, getaddrinfo was found to be much slower than inet_pton. Unless that has changed, this patch would be a performance regression on that platform. Data and benchmark script: https://groups.google.com/g/python-tulip/c/-SFI8kkQEj4/m/m1-oCMSABgAJ
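
To check this on a given platform, a rough micro-benchmark along these lines may help. It is an independent sketch, not the script linked above; the function names and the iteration count are arbitrary choices, and it only covers an IPv4 literal.

```python
import socket
import timeit

NUMERIC_FLAGS = socket.AI_NUMERICHOST | socket.AI_NUMERICSERV


def via_inet_pton(host="1.2.3.4"):
    # Pure address parsing, no resolver involved.
    socket.inet_pton(socket.AF_INET, host)


def via_getaddrinfo(host="1.2.3.4", port=1):
    # Numeric-only getaddrinfo call, as proposed in this issue.
    socket.getaddrinfo(
        host, port, socket.AF_INET, socket.SOCK_STREAM,
        socket.IPPROTO_TCP, NUMERIC_FLAGS,
    )


def compare(number=10_000):
    """Return total seconds taken by each approach for `number` calls."""
    return {
        "inet_pton": timeit.timeit(via_inet_pton, number=number),
        "getaddrinfo": timeit.timeit(via_getaddrinfo, number=number),
    }


if __name__ == "__main__":
    for name, secs in compare().items():
        print(f"{name:>12}: {secs:.4f}s")
```

The absolute numbers will vary by machine and libc; the ratio between the two rows is the useful part.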

    @graingert
    Contributor

    Running the benchmark script https://gist.github.com/graingert/5097c6be997ab2a201b48b858524e163 on my Linux machine, I get:

    before:

    /home/graingert/projects/cpython-main/../demo/demo.py:28: DeprecationWarning: There is no current event loop
      loop = get_event_loop()
          host       port     family       type      proto    secs
       1.2.3.4          1          2          1          6     9.4
       1.2.3.4          1          0          1          6     9.7
       1.2.3.4          1          0          2         17    10.9
       1.2.3.4          1          0          1          0    14.5
       1.2.3.4          1          0          2          0    15.9
       1.2.3.4          1          0          0          0    15.8
           ::3          1         10          1          6    13.0
           ::3          1          0          1          6    13.3
    Traceback (most recent call last):
      File "/home/graingert/projects/cpython-main/../demo/demo.py", line 58, in <module>
        lookup(*info)
        ^^^^^^^^^^^^^
      File "/home/graingert/projects/cpython-main/../demo/demo.py", line 32, in lookup
        return loop.run_until_complete(loop.getaddrinfo(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/graingert/projects/cpython-main/Lib/asyncio/base_events.py", line 650, in run_until_complete
        return future.result()
               ^^^^^^^^^^^^^^^
      File "/home/graingert/projects/cpython-main/Lib/asyncio/base_events.py", line 864, in getaddrinfo
        return await self.run_in_executor(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/graingert/projects/cpython-main/Lib/concurrent/futures/thread.py", line 58, in run
        result = self.fn(*self.args, **self.kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/graingert/projects/cpython-main/Lib/socket.py", line 961, in getaddrinfo
        for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    socket.gaierror: [Errno -2] Name or service not known
    

    after:

    /home/graingert/projects/cpython/../demo/demo.py:28: DeprecationWarning: There is no current event loop
      loop = get_event_loop()
          host       port     family       type      proto    secs
       1.2.3.4          1          2          1          6     2.4
       1.2.3.4          1          0          1          6     2.4
       1.2.3.4          1          0          2         17     2.4
       1.2.3.4          1          0          1          0     2.4
       1.2.3.4          1          0          2          0     2.3
       1.2.3.4          1          0          0          0     2.7
           ::3          1         10          1          6     2.3
           ::3          1          0          1          6     2.3
    Traceback (most recent call last):
      File "/home/graingert/projects/cpython/../demo/demo.py", line 58, in <module>
        lookup(*info)
        ^^^^^^^^^^^^^
      File "/home/graingert/projects/cpython/../demo/demo.py", line 32, in lookup
        return loop.run_until_complete(loop.getaddrinfo(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/graingert/projects/cpython/Lib/asyncio/base_events.py", line 591, in run_until_complete
        return future.result()
               ^^^^^^^^^^^^^^^
      File "/home/graingert/projects/cpython/Lib/asyncio/base_events.py", line 814, in getaddrinfo
        return await self.run_in_executor(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/graingert/projects/cpython/Lib/concurrent/futures/thread.py", line 58, in run
        result = self.fn(*self.args, **self.kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/graingert/projects/cpython/Lib/socket.py", line 962, in getaddrinfo
        for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    socket.gaierror: [Errno -2] Name or service not known
    

    @graingert
    Contributor

    On macOS:

    before https://github.com/graingert/cpython/runs/6528334512?check_suite_focus=true#step:7:1

    Run ./python.exe demo.py
    /Users/runner/work/cpython/cpython/demo.py:28: DeprecationWarning: There is no current event loop
      loop = get_event_loop()
          host       port     family       type      proto    secs
       1.2.3.4          1          2          1          6    54.6
       1.2.3.4          1          0          1          6    55.0
       1.2.3.4          1          0          2         17    52.8
       1.2.3.4          1          0          1          0    52.0
       1.2.3.4          1          0          2          0    51.2
       1.2.3.4          1          0          0          0    54.7
           ::3          1         30          1          6    52.6
           ::3          1          0          1          6    52.6
       ::3%lo0          1         30          1          6    55.5
    

    after https://github.com/graingert/cpython/runs/6528182360?check_suite_focus=true#step:7:11

    Run ./python.exe demo.py
    /Users/runner/work/cpython/cpython/demo.py:28: DeprecationWarning: There is no current event loop
      loop = get_event_loop()
          host       port     family       type      proto    secs
       1.2.3.4          1          2          1          6    21.3
       1.2.3.4          1          0          1          6    21.1
       1.2.3.4          1          0          2         17    21.1
       1.2.3.4          1          0          1          0    21.1
       1.2.3.4          1          0          2          0    21.0
       1.2.3.4          1          0          0          0    22.1
           ::3          1         30          1          6    20.9
           ::3          1          0          1          6    20.9
       ::3%lo0          1         30          1          6    23.8
    

    @graingert
    Contributor

    graingert commented May 20, 2022

    Benchmarking _ipaddr_info (graingert@dee834c) seems to show that it is about the same speed as getaddrinfo with AI_NUMERICHOST:

    https://github.com/graingert/cpython/runs/6528895181?check_suite_focus=true

    Run ./python.exe demo.py
    /Users/runner/work/cpython/cpython/demo.py:28: DeprecationWarning: There is no current event loop
      loop = get_event_loop()
          host       port     family       type      proto    secs
       1.2.3.4          1          2          1          6    19.2
       1.2.3.4          1          0          1          6    19.1
       1.2.3.4          1          0          2         17    19.2
       1.2.3.4          1          0          1          0    19.0
       1.2.3.4          1          0          2          0    18.9
       1.2.3.4          1          0          0          0    18.1
           ::3          1         30          1          6    19.0
           ::3          1          0          1          6    20.2
       ::3%lo0          1         30          1          6    19.1
    

    @bdarnell

    @ezio-melotti ezio-melotti moved this to Todo in asyncio Jul 17, 2022
    @ezio-melotti ezio-melotti moved this from Todo to In Progress in asyncio Jul 17, 2022
    @ezio-melotti ezio-melotti moved this from In Progress to Todo in asyncio Jul 17, 2022
    @willingc
    Contributor

    Next action: Review the existing PR and determine next steps.
