-
-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
asyncio.base_events.create_connection doesn't handle scoped IPv6 addresses #79726
Comments
loop.create_connection doesn't handle ipv6 RFC4007 addresses right since 3.7 TEST CASE # 3.6 handles everything fine # 3.7 and later fails Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib/python3.7/asyncio/base_events.py", line 576, in run_until_complete
return future.result()
File "/usr/lib/python3.7/asyncio/base_events.py", line 951, in create_connection
raise exceptions[0]
File "/usr/lib/python3.7/asyncio/base_events.py", line 938, in create_connection
await self.sock_connect(sock, address)
File "/usr/lib/python3.7/asyncio/selector_events.py", line 475, in sock_connect
return await fut
File "/usr/lib/python3.7/asyncio/selector_events.py", line 480, in _sock_connect
sock.connect(address)
OSError: [Errno 22] Invalid argument CAUSE Upon asyncio.base_events.create_connection _ensure_resolved is called twice, first time here: _ensure_resolved calls getaddrinfo, but in 3.7 implementation changed: % python3.6 -c 'import socket;print(socket.getaddrinfo("fe80::1%lo",12345)[0][4])' % python3.7 -c 'import socket;print(socket.getaddrinfo("fe80::1%lo",12345)[0][4])' _ensure_connect only considers host and port parts of the address tuple: In case of 3.7 first call to _ensure_resolved returns In case of 3.6 both first and second call to _ensure_resolved return |
While the 3.7+ getaddrinfo isn't the best human representation of an IPv6 address, I believe it does make the most sense to keep it that way. The issue stems from the refactoring of the underlying socketmodule.c handling of IPv4/IPv6 addresses with dedicated make_ipv4_addr and make_ipv6_addr functions which returns proper tuples of: The actual issue is that _ensure_resolved naively assumes IPv4 and truncates the address to its first 2 members I'd suggest passing the remaining elements of address in a packed *args or an optional flowinfo=0, scopeid=0 to _ipaddr_info since fundamentally, that's the method stripping valuable information. |
I think the root cause of this bug is a bit of confusion. The "customer-facing" asyncio API, create_connection(), takes two arguments: host and port. The lower-level API that actually deal with connecting sockets, socket.connect() and loop.sock_connect(), takes one argument: the address tuple. These are *not the same thing*, despite an IPv4 address tuple having two elements (host, port), and must not be mixed. _ensure_resolved() is the function responsible for turning host + port into an address tuple, and it does the right thing, turning host="fe80::1%lo",port=12345 into ('fe80::1', 12345, 0, 1) correctly. The mistake is taking the address tuple and passing it through _ensure_resolved() again, since that's not the correct input type for it: the only correct input type is host + port. So I think the appropriate fix for this bug is to make sure _ensure_resolved is only called once. In particular, BaseSelectorEventLoop.sock_connect() cpython/Lib/asyncio/selector_events.py Line 458 in 3bc0eba
|
Also I believe it's a good idea to change the arguments of _ensure_resolved() from (address, *, ...) to (host, port, *, ...), and go through all its usages, making sure we're not mixing host + port with address tuples everywhere in asyncio. |
Hi!, I was reading the PR. Just a little comment. I am not sure about have a difference for IPv4 and IPv6, in the sense of use a tuple for IPv4 and separate parameters for IPv6 Regards |
Hi Emmanuel, Are you referring to my PR 11403? I don't see where IPv6 uses separate parameters. |
I just noticed that in the socket module, an AF_INET address tuple is allowed to have an unresolved host name. Quote: A pair (host, port) is used for the AF_INET address family, where host is a string representing either a hostname in Internet domain notation like 'daring.cwi.nl' or an IPv4 address like '100.50.200.5', and port is an integer. https://docs.python.org/3/library/socket.html#socket-families Passing a tuple of (hostname, port) to socket.connect() successfully connects the socket (tested on Windows). Since the C struct sockaddr_in does not support hostnames, socket.connect obviously does resolution at some point, but its implementation is in C, and I haven't looked into it. BaseSelectorEventLoop.sock_connect() calls socket.connect() directly, therefore it also supports passing in a tuple of (hostname, port). I just tested ProactorEventLoop.sock_connect() on 3.7.1 on Windows, and it does not support hostnames, raising OSError: [WinError 10022] An invalid argument was supplied. I personally believe it's not a good idea to allow hostnames in address tuples and in sock.connect(). However, the socket module tries pretty hard to basically accept any (host, port) tuple as address tuples, whether host is an IPv4 address, IPv6 address or host name, so that's probably not going to change. |
Oh wait, there's also this in asyncio docs for loop.sock_connect: Changed in version 3.5.2: address no longer needs to be resolved. sock_connect will try to check if the address is already resolved by calling socket.inet_pton(). If not, loop.getaddrinfo() will be used to resolve the address. https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.sock_connect So this is where the current bug comes from! My PR 11403 basically undid this change. My proposal, as is probably obvious, is to undo this change and insist on passing only resolved address tuples to loop.sock_connect(). My argument is that this feature never worked properly:
Users should use create_connection() or open_connection() if they want to avoid the complexities of address resolution. If they are reaching for low_level APIs like loop.sock_connect(), they should also handle loop.getaddrinfo() themselves. |
from below: In case of 3.7 first call to _ensure_resolved returns In case of 3.6 both first and second call to _ensure_resolved return I'll have to locate the PR I made to resolve the test issue AIX was having - but it seems the address format ::1%lo is not supported everywhere. FYI: I do not believe the PR was backported into 3.6. ** Found it:
and Since in the first call - a scope of 1 is being returned - the initial "open" seems to be working as expected. Some "help" to be sure I do exactly the same tests. **** Reading through the bpo text, my change was only to skip the test because and, from memory, the information being looked for is the bit after the '%' - aka scope. On the one hand - the test is working - the information being returned does not match: ====================================================================== Traceback (most recent call last):
File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/mock.py", line 1226, in patched
return func(*args, **keywargs)
File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/test_asyncio/test_base_events.py", line 1316, in test_create_connection_ipv6_scope
sock.connect.assert_called_with(('fe80::1', 80, 0, 1))
File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/mock.py", line 838, in assert_called_with
raise AssertionError(_error_message()) from cause
AssertionError: expected call not found.
Expected: connect(('fe80::1', 80, 0, 1))
Actual: connect(('fe80::1', 80, 0, 0)) What is not clear from the test is that what "expected" says, is not the same as the first address in the code: coro = self.loop.create_connection(asyncio.Protocol, 'fe80::1%1', 80)
t, p = self.loop.run_until_complete(coro)
try:
sock.connect.assert_called_with(('fe80::1', 80, 0, 1))
_, kwargs = m_socket.socket.call_args
self.assertEqual(kwargs['family'], m_socket.AF_INET6)
self.assertEqual(kwargs['type'], m_socket.SOCK_STREAM)
finally:
t.close()
test_utils.run_briefly(self.loop) # allow transport to close 'fe80::1%1' <> 'fe80::1' - and maybe, on AIX - the initial connection failed. (or maybe it has to have succeeded, or the failure message would look different). I am not 'experienced' with IPv6 and scope. |
On 22/05/2019 10:43, Michael Felt wrote:
From what I have just read (again) - scope seems to be a way to indicate Further, getsockname() (and getpeername()) seem to be more for after a And, as this has been added - what breaks in Python when "scopeid" is I am thinking, if adding a scopeid is a way to assign an IPv6 address to If I understand why this is needed I may be able to come up with a way Regards, Michael |
AFAIK the reason why scope id is required for IPv6 is that every IPv6 Michael Felt <report@bugs.python.org>于2019年5月22日 周三18:08写道:
|
With regards to the failing test, it looks like the test basically boils down to testing whether loop.getaddrinfo('fe80::1%1', 80, type=socket.SOCK_STREAM) returns (<socket.AF_INET6>, <socket.SOCK_STREAM>, *, *, ('fe80::1', 80, 0, 1)). This feels like a dangerous assumption to make, since it's tied to the operating system's behavior. Maybe AIX's getaddrinfo() in fact does not resolve scoped addresses correctly; maybe it only resolves scope ids correctly for real addresses that actually exist on the network; Maybe AIX assigns scope ids differently and do not use small integers; etc. |
In hindsight, maybe the message could have been better, BUT - is it relevant? commit 413118e
FYI: Re: the example below - I would have thought the scopeid would be showing on en1, not en2 - and I am also wondering, if the scopeid is "%1" AIX ignores it. (also, I masked my global ipv6 address). michael@x071:[/home/michael]netstat -ni So, I looked at another server with two interfaces - here only one has a IPv6 address root@x064:[/home/root]netstat -in And, after I activate IPv6 on the second interface - I see a scopeid-like representation: Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll I would have to guess at this point, but to simplify, it seems that AIX resolves addresses differently (rather than say 'not correctly') and maybe requires specific conditions. If relevant - I can provide the output from Debian on POWER. But it seems AIX is only using a "ADDRv6%scopeid" when there at least two interfaces defined. +++++++++ root@x066:[/data/prj/python/python3-3.8]./python -m test test_asyncio
Run tests sequentially
0:00:00 [1/1] test_asyncio
/data/prj/python/git/python3-3.8/Lib/test/support/__init__.py:1627: RuntimeWarning: coroutine 'AsyncMockMixin._mock_call' was never awaited
gc.collect()
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Future exception was never retrieved
future: <Future finished exception=BrokenPipeError(32, 'Broken pipe')>
Traceback (most recent call last):
File "/data/prj/python/git/python3-3.8/Lib/asyncio/subprocess.py", line 162, in _feed_stdin
await self.stdin.drain()
File "/data/prj/python/git/python3-3.8/Lib/asyncio/streams.py", line 443, in drain
await self._protocol._drain_helper()
File "/data/prj/python/git/python3-3.8/Lib/asyncio/streams.py", line 200, in _drain_helper
await waiter
File "/data/prj/python/git/python3-3.8/Lib/asyncio/unix_events.py", line 661, in _write_ready
n = os.write(self._fileno, self._buffer)
BrokenPipeError: [Errno 32] Broken pipe
Future exception was never retrieved
future: <Future finished exception=BrokenPipeError(32, 'Broken pipe')>
Traceback (most recent call last):
File "/data/prj/python/git/python3-3.8/Lib/asyncio/subprocess.py", line 162, in _feed_stdin
await self.stdin.drain()
File "/data/prj/python/git/python3-3.8/Lib/asyncio/streams.py", line 443, in drain
await self._protocol._drain_helper()
File "/data/prj/python/git/python3-3.8/Lib/asyncio/streams.py", line 200, in _drain_helper
await waiter
File "/data/prj/python/git/python3-3.8/Lib/asyncio/unix_events.py", line 661, in _write_ready
n = os.write(self._fileno, self._buffer)
BrokenPipeError: [Errno 32] Broken pipe
test test_asyncio failed -- Traceback (most recent call last):
File "/data/prj/python/git/python3-3.8/Lib/unittest/mock.py", line 1226, in patched
return func(*args, **keywargs)
File "/data/prj/python/git/python3-3.8/Lib/test/test_asyncio/test_base_events.py", line 1316, in test_create_connection_ipv6_scope
sock.connect.assert_called_with(('fe80::1', 80, 0, 1))
File "/data/prj/python/git/python3-3.8/Lib/unittest/mock.py", line 838, in assert_called_with
raise AssertionError(_error_message()) from cause
AssertionError: expected call not found.
Expected: connect(('fe80::1', 80, 0, 1))
Actual: connect(('fe80::1', 80, 0, 0)) FYI: I have IPv6 interfaces defined on this server (x066) - but only one. And I tried changing fe80::1%1 to fe80::1%2, etc, but the end result is similar: AssertionError: expected call not found. Hope this helps! |
I don't have an AIX lying around to test so would you mind just running the test on The IPv6 Scoped Address Architecture RFC clearly indicates that Also, it would be worthwhile to ensure that the patches mentioned by IBM https://www-01.ibm.com/support/docview.wss?uid=isg1IV52116 are applied on the machine running the test. |
On 24/05/2019 19:59, Erwan Le Pape wrote:
root@x067:[/home/root]python3 -c 'import socket; I have not yet checked if the patches mentioned are installed. This is a system I am testing PyInstaller, and the python3 version is 3.6.8. OS-Level is: 7100-03-05-1524, but it was built on a different version of +++++++ This is the system I have the buildbot on: buildbot@x064:[/home/buildbot/aixtools-master]./python
buildbot@x064:[/home/buildbot/aixtools-master]./python -c 'import +++++ re the patches mentioned. a) not applicable for the systems above - both are AIX 7.1. b) the AIX 6.1 TL7 I build with says: root@x066:[/]instfix -i | grep IV52116 |
On 24/05/2019 19:59, Erwan Le Pape wrote:
p.s. I used an actual address: buildbot@x064:[/home/buildbot/aixtools-master]netstat -ni |
Thanks for testing that. It's good that you used an actual address because that eliminates the possibility that AIX doesn't handle addresses it doesn't really know about. On the other hand, even when properly specified to a real scoped IPv6 address, I'm looking at cpython/master for the socketmodule implementation: cpython/Modules/socketmodule.c Line 6400 in 6dbbe74
getaddrinfo https://github.com/python/cpython/blob/master/Modules/socketmodule.c#L1294 is makesockaddr which actually creates the 4-tuple returned as the last element of the getaddrinfo tuples.The fourth element (ie. the scope ID) is clearly a->sin6_scope_id which should contain the scope ID.
At this stage, I don't know if this is a bug from the socketmodule which I doubt or if the AIX If you're still okay to proxy tests for AIX, I'll try and come up with either a simple C snippet to see what's in the returned structure or ctype the AIX |
No problem with trying out your tests. Sent from my iPhone
|
On 25/05/2019 00:19, Erwan Le Pape wrote:
I also doubt a bug in the socketmodule - my assumption is that AIX may ++ If we "accept" or "conclude" that AIX's getaddrinfo() routine is not bpo-34490 Fix test_asyncio for AIX - do not call transport.get_extra_info('sockname') ++ Further, I have a start on "send/receive" stubs in C and am trying root@x066:[/]ifconfig -a Sadly, I have a lot to learn re: IPv6 - and I expect how "host-based
|
Assuming similar configuration to the one in msg343430, a simple native getaddrinfo test to check whether any scope ID is returned.
I've explicitly tested against numeric and named interfaces to ensure that this isn't linked to AIX only handling named interfaces for scopes instead of numeric ones (although given your Since I've had to look for AIX programming documentation just to be sure, I also stumbled upon this AIX bug https://www-01.ibm.com/support/docview.wss?uid=isg1IV53671 (which is referenced by the one I mentioned previously but I missed that). It seem to apply up to 7100-03 so you should be immune nonetheless. I also noticed that it mentions SSH not working so I went and checked the OpenSSH sources to see how they handle AIX. If this is truly an "idiosyncrasy" in AIX, I'm not sure there is a better way to handle it than skipping it since it's not really a Python bug if the underlying |
On 30/05/2019 10:27, Erwan Le Pape wrote:
The 'expanded' program ... main(): int main() {
/* local addresses */
test("fe80::221:5eff:fea3:c746%0");
test("fe80::221:5eff:fea3:c746%en0");
test("fe80::f8d1:8cff:fe32:8305%2");
test("fe80::f8d1:8cff:fe32:8305%en1");
/* remote addresses */
test("fe80::f8d1:81ff:fe81:ac05%2");
test("fe80::f8d1:81ff:fe81:ac05%en1"); return 0; The conclusion seems to be that the scopeid returned is always 0 - when This seems to be:
#define EAI_NONAME 8 /* hostname nor servname not provided,
or not known */ So, %enX is not recognized - only a numerical scope. +++ Details +++ On the first server - added two addresses - they are local to platform. root@x066:[/data/prj/aixtools/tests/tcpip/socket]cc -g ex03.c -o ex03 On a second server (all addresses are 'remote now') root@x064:[/data/prj/aixtools/tests/tcpip/socket]./ex03 root@x064:[/data/prj/aixtools/tests/tcpip/socket]oslevel -s And a server with the bug - i.e., not fixed: *** In closing *** Regards, |
Guys, thank you for investigation. If you have access to AIX box it would be much easier for you. I can only look at Python buildbot statuses. |
On 30/05/2019 23:11, Andrew Svetlov wrote:
|
Good day guys, |
I did not ask back in June - but could this also be backported to 3.7. I am trying very hard to have all tests also passing on 3.7. as @asvetlov is ok with a skipped test for AIX - see https://bugs.python.org/issue35545#msg344003 I can make the backport, if needed. |
test_asyncio.test_create_connection_ipv6_scope
on AIX #14011test_asyncio.test_create_connection_ipv6_scope
on AIX (GH-14011) #14012Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: