Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loop.create_server with port=0 uses different ports for ipv4 & ipv6 #89856

Closed
jcrist mannequin opened this issue Nov 2, 2021 · 15 comments
Closed

loop.create_server with port=0 uses different ports for ipv4 & ipv6 #89856

jcrist mannequin opened this issue Nov 2, 2021 · 15 comments
Labels
3.9 only security fixes 3.10 only security fixes 3.11 only security fixes docs Documentation in the Doc dir topic-asyncio type-bug An unexpected behavior, bug, or error

Comments

@jcrist
Copy link
Mannequin

jcrist mannequin commented Nov 2, 2021

BPO 45693
Nosy @ericvsmith, @asvetlov, @1st1, @miss-islington, @jcrist
PRs
  • bpo-45693: Document port parameter to loop.create_server #29760
  • [3.10] bpo-45693: Document port parameter to loop.create_server (GH-29760) #29762
  • [3.9] bpo-45693: Document port parameter to loop.create_server (GH-29760) #29763
  • [3.8] bpo-45693: Document port parameter to loop.create_server (GH-29760) #29764
  • Files
  • test.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-11-24.20:40:55.106>
    created_at = <Date 2021-11-02.18:32:35.033>
    labels = ['type-bug', '3.9', '3.10', '3.11', 'expert-asyncio', 'docs']
    title = '`loop.create_server` with port=0 uses different ports for ipv4 & ipv6'
    updated_at = <Date 2021-11-24.20:41:12.362>
    user = 'https://github.com/jcrist'

    bugs.python.org fields:

    activity = <Date 2021-11-24.20:41:12.362>
    actor = 'eric.smith'
    assignee = 'docs@python'
    closed = True
    closed_date = <Date 2021-11-24.20:40:55.106>
    closer = 'eric.smith'
    components = ['Documentation', 'asyncio']
    creation = <Date 2021-11-02.18:32:35.033>
    creator = 'jcristharif'
    dependencies = []
    files = ['50421']
    hgrepos = []
    issue_num = 45693
    keywords = ['patch']
    message_count = 15.0
    messages = ['405530', '405540', '405544', '405545', '405547', '405591', '405612', '405613', '405647', '405651', '406951', '406954', '406958', '406959', '406960']
    nosy_count = 6.0
    nosy_names = ['eric.smith', 'asvetlov', 'docs@python', 'yselivanov', 'miss-islington', 'jcristharif']
    pr_nums = ['29760', '29762', '29763', '29764']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue45693'
    versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

    @jcrist
    Copy link
    Mannequin Author

    jcrist mannequin commented Nov 2, 2021

    To create a new server with loop.create_server that listens on all interfaces and a random port, I'd expect passing in host="", port=0 to work (per the documentation). However, as written this results in 2 different ports being used - one for ipv4 and one for ipv6. Instead I'd expect a single random port be determined once, and reused for all other interfaces.

    Running the example test code (attached) results in:

    $ python test.py
    listening on 0.0.0.0:38023
    listening on :::40899
    Traceback (most recent call last):
      File "/home/jcristharif/Code/distributed/test.py", line 36, in <module>
        asyncio.run(main())
      File "/home/jcristharif/miniconda3/envs/dask/lib/python3.9/asyncio/runners.py", line 44, in run
        return loop.run_until_complete(main)
      File "/home/jcristharif/miniconda3/envs/dask/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
        return future.result()
      File "/home/jcristharif/Code/distributed/test.py", line 30, in main
        assert len(ports) == 1, "Only 1 port expected!"
    AssertionError: Only 1 port expected!
    

    This behavior can be worked around by manually handling port=0 outside of asyncio, but as it stands naive use can result in accidentally listening on multiple ports.

    @jcrist jcrist mannequin added 3.9 only security fixes topic-asyncio type-bug An unexpected behavior, bug, or error 3.8 only security fixes 3.10 only security fixes labels Nov 2, 2021
    @ericvsmith
    Copy link
    Member

    Is there an OS interface to ensure the same port on both stacks? I don't know of one (although of course one might exist), in which case I don't know how you'd ensure they're both the same.

    @jcrist
    Copy link
    Mannequin Author

    jcrist mannequin commented Nov 2, 2021

    Is there an OS interface to ensure the same port on both stacks?

    I don't think this is needed? Right now the code processes as:

    • Expand host + port + family + flags into a list of one or more tuples of socket options (
      fs = [self._create_server_getaddrinfo(host, port, family=family,
      flags=flags)
      for host in hosts]
      infos = await tasks.gather(*fs)
      infos = set(itertools.chain.from_iterable(infos))
      )
    • Iterate through this list, creating a new socket for each option tuple, and bind to the corresponding host + port (
      af, socktype, proto, canonname, sa = res
      try:
      sock = socket.socket(af, socktype, proto)
      except socket.error:
      # Assume it's a bad family/type/protocol combination.
      if self._debug:
      logger.warning('create_server() failed to create '
      'socket.socket(%r, %r, %r)',
      af, socktype, proto, exc_info=True)
      continue
      sockets.append(sock)
      if reuse_port:
      _set_reuseport(sock)
      # Disable IPv4/IPv6 dual stack support (enabled by
      # default on Linux) which makes a single socket
      # listen on both address families.
      if (_HAS_IPv6 and
      af == socket.AF_INET6 and
      hasattr(socket, 'IPPROTO_IPV6')):
      sock.setsockopt(socket.IPPROTO_IPV6,
      socket.IPV6_V6ONLY,
      True)
      try:
      sock.bind(sa)
      )

    In this case, each call to socket.bind gets a 0 port, thus binding to a new random open port number each time.

    What I'm asking for is that if the user passes in port=0, then the port is extracted in the first call to socket.bind when looping and used for all subsequent socket.bind calls in the loop. This way we only ever choose a single random open port rather than 1 for each interface. FWIW, this is also what tornado does when port=0 is provided.

    @ericvsmith
    Copy link
    Member

    What do you do if a port is bound for IPv4, but is in use for IPv6?

    @jcrist
    Copy link
    Mannequin Author

    jcrist mannequin commented Nov 2, 2021

    Hmmm, I'd find that situation a bit surprising, but I suppose it could happen. Looks like tornado just errors, and that seems to work fine for them in practice (https://github.com/tornadoweb/tornado/blob/790715ae0f0a30b9ee830bfee75bb7fa4c4ec2f6/tornado/netutil.py#L153-L182). Binding IPv4 first might help reduce the chance of a collision, since I suspect there are more IPv4-only applications than IPv6-only.

    @asvetlov
    Copy link
    Contributor

    asvetlov commented Nov 3, 2021

    Tornado solution sounds weak; it can fail the server start if free ports are available.

    I concur with Eric, I'm not aware of an OS API call that binds both IPv4 and IPv6 to the same random port.

    @jcrist
    Copy link
    Mannequin Author

    jcrist mannequin commented Nov 3, 2021

    I'm not aware of an OS API call that binds both IPv4 and IPv6 to the same random port.

    Sure, but loop.create_server is already higher-level than a single OS API call.

    By default create_server will already bind multiple sockets if host="", host=None, or if host is a list. I'm arguing that the current behavior with port=0 in these situations is unexpected. Other libraries (like tornado) have come to the same conclusion, and have implemented logic to handle this that seems to work well in practice (though can fail, as you've pointed out).

    Is there a use case where the current behavior (binding to multiple random ports) is desired?

    @jcrist
    Copy link
    Mannequin Author

    jcrist mannequin commented Nov 3, 2021

    If you decline that a change is needed here, at the very least the current behavior of port=0 should be documented. I'd be happy to push up a fix if so.

    @asvetlov
    Copy link
    Contributor

    asvetlov commented Nov 3, 2021

    PR for documentation fix is appreciated.

    Random fails to bind the same port if free ports are available is kind of regression.

    Is tornado the only example or you are aware of other libraries with such behavior?

    @jcrist
    Copy link
    Mannequin Author

    jcrist mannequin commented Nov 3, 2021

    Is tornado the only example or you are aware of other libraries with such behavior?

    A quick survey of other language network stacks didn't turn anything up, But I also didn't find any implementations (other than asyncio & tornado) that bind multiple sockets with a single api call (as create_server does).

    I think part of the issue here is that dual IPV6 & IPV4 support is intentionally disabled in asyncio (and tornado), so two sockets are needed (one to support each interface). Other TCP implementations (e.g. both go and rust) don't disable this, so one listener == one socket. This makes comparing API designs across stacks harder - with e.g. Go it's straightforward to listen on a random port on IPV4 & IPV6 with a single TCPListener, since both can be handled by a single socket. Since this is disabled (by default) in asyncio we end up using 2 sockets and run into the issue described above.

    Also note that this issue will trigger for any address that resolves to multiple interfaces (not just host=""). For example, on osx localhost will resolve to ::1 and 127.0.0.1 by default, meaning that the following fairly straightforward asyncio code has a bug in it:

    # Start a server on localhost with a random port
    server = await loop.create_server(
        EchoServerProtocol,
        host="localhost",
        port=0
    )
    
    # Retrieve and log the port
    port = server.sockets[0].getsockname()[1]
    print(f"listening at tcp://localhost:{port}")

    As written, this looks correct enough, but on systems where localhost resolves to multiple interfaces this will accidentally listen on multiple ports (instead of one). This can be fixed with some additional logic external to asyncio, but it makes for a much less straightforward asyncio example.

    @jcrist
    Copy link
    Mannequin Author

    jcrist mannequin commented Nov 24, 2021

    Apologies for the delay here. I've pushed a documentation patch at #29760.

    @miss-islington
    Copy link
    Contributor

    New changeset d71c7bc by Jim Crist-Harif in branch 'main':
    bpo-45693: Document port parameter to loop.create_server (GH-29760)
    d71c7bc

    @ericvsmith
    Copy link
    Member

    New changeset 8cabcde by Miss Islington (bot) in branch '3.10':
    bpo-45693: Document port parameter to loop.create_server (GH-29760) (GH-29762)
    8cabcde

    @ericvsmith
    Copy link
    Member

    New changeset 151c9bf by Miss Islington (bot) in branch '3.9':
    bpo-45693: Document port parameter to loop.create_server (GH-29760) (GH-29763)
    151c9bf

    @ericvsmith
    Copy link
    Member

    Thanks for the PR, @jcristharif.

    @ericvsmith ericvsmith added docs Documentation in the Doc dir 3.11 only security fixes and removed 3.8 only security fixes labels Nov 24, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes 3.10 only security fixes 3.11 only security fixes docs Documentation in the Doc dir topic-asyncio type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants