socket connect doesn't work on Solaris #1252

Closed
wiggin15 opened this Issue Jul 11, 2018 · 5 comments

@wiggin15
Contributor

wiggin15 commented Jul 11, 2018

  • gevent version: 1.3.4
  • Python version: 2.7.8
  • Operating System: Solaris (versions 10 and 11, on architectures x64 and SPARC)

Description:

We just upgraded our systems from an ancient version of gevent (1.1.2) to the latest, 1.3.4. We encountered a bug on our Solaris servers. Running socket.connect raises the following exception:
socket.gaierror: [Errno 9] service name not available for the specified socket type
for any port that is not defined in /etc/services (i.e. this works for ports like 25 but not for random ports like 8888).
Code to reproduce and traceback:

>>> from gevent.socket import socket
>>> socket().connect(("127.0.0.1", 8888))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/python/lib64/python2.7/site-packages/gevent-1.3.4-py2.7-solaris-2.11-sun4v.64bit.egg/gevent/_socket2.py", line 232, in connect
    r = getaddrinfo(address[0], address[1], sock.family)
  File "/root/python/lib64/python2.7/site-packages/gevent-1.3.4-py2.7-solaris-2.11-sun4v.64bit.egg/gevent/_socketcommon.py", line 196, in getaddrinfo
    return get_hub().resolver.getaddrinfo(host, port, family, socktype, proto, flags)
  File "/root/python/lib64/python2.7/site-packages/gevent-1.3.4-py2.7-solaris-2.11-sun4v.64bit.egg/gevent/resolver/thread.py", line 65, in getaddrinfo
    return self.pool.apply(_socket.getaddrinfo, args, kwargs)
  File "/root/python/lib64/python2.7/site-packages/gevent-1.3.4-py2.7-solaris-2.11-sun4v.64bit.egg/gevent/pool.py", line 159, in apply
    return self.spawn(func, *args, **kwds).get()
  File "src/gevent/event.py", line 381, in gevent._event.AsyncResult.get
  File "src/gevent/event.py", line 409, in gevent._event.AsyncResult.get
  File "src/gevent/event.py", line 399, in gevent._event.AsyncResult.get
  File "src/gevent/event.py", line 379, in gevent._event.AsyncResult._raise_exception
  File "/root/python/lib64/python2.7/site-packages/gevent-1.3.4-py2.7-solaris-2.11-sun4v.64bit.egg/gevent/threadpool.py", line 281, in _worker
    value = func(*args, **kwargs)
socket.gaierror: [Errno 9] service name not available for the specified socket type

It is worth noting that this worked with gevent 1.1.2, and that the regular socket.connect doesn't have this problem.

I traced this back to #944/#949, where the getaddrinfo call was changed to no longer pass sock.type and sock.proto. Solaris does not support omitting them when the port is not a known service:

>>> import socket
>>> socket.getaddrinfo("127.0.0.1", 25, socket.SOCK_STREAM)  # recognized service, works
[(2, 2, 6, '', ('127.0.0.1', 25))]
>>> socket.getaddrinfo("127.0.0.1", 8888, socket.SOCK_STREAM)  # unrecognized service, won't work
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
socket.gaierror: [Errno 9] service name not available for the specified socket type
>>> socket.getaddrinfo("127.0.0.1", 8888, socket.SOCK_STREAM, 0, socket.IPPROTO_TCP)   # with family, socktype and proto - works
[(2, 2, 6, '', ('127.0.0.1', 8888))]

I think #949 needs to be reconsidered in order to support connections on Solaris. Perhaps check for the attribute that caused the original problem (SOCK_CLOEXEC) and behave differently in this case?
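
As a rough illustration of that suggestion, here is a minimal sketch that strips flag bits from a socket type before it is handed to getaddrinfo. It assumes the original problem came from flags such as SOCK_CLOEXEC being OR-ed into sock.type, which this thread does not confirm. The helper name `base_socktype` and the `_TYPE_FLAGS` constant are hypothetical and are not part of gevent.

```python
import socket

# Flag bits that some platforms allow to be OR-ed into a socket type.
# Neither constant exists on every platform, so fall back to 0 when missing.
# (Assumption: these are the bits that caused the original problem.)
_TYPE_FLAGS = getattr(socket, "SOCK_CLOEXEC", 0) | getattr(socket, "SOCK_NONBLOCK", 0)


def base_socktype(socktype):
    """Return the socket type with any CLOEXEC/NONBLOCK flag bits cleared."""
    return int(socktype) & ~_TYPE_FLAGS
```

With something like this, the call could in principle become `getaddrinfo(host, port, sock.family, base_socktype(sock.type), sock.proto)`, though whether that is the right fix for gevent is an open question.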

@jamadden

Member

jamadden commented Jul 11, 2018

In the C code for the stdlib socket, I can't find any evidence that socket.connect passes the type and protocol when it calls getaddrinfo, but it's fairly convoluted. Does the corresponding call to the stdlib socket.connect raise a gaierror, or does it work? (e.g., python -c 'from socket import socket; socket().connect(("127.0.0.1", 8888))')

Have you tried an alternate resolver (cares or dnspython)?

@jamadden jamadden added the python2 label Jul 11, 2018

@wiggin15

Contributor

wiggin15 commented Jul 11, 2018

The regular socket.connect doesn't have this problem, so the code you provided works.
It looks like the getaddrinfo call in CPython's code (https://github.com/python/cpython/blob/master/Modules/socketmodule.c#L1133) does not pass service (port) so maybe that's why there is no problem there. CPython does pass socktype when there is a port: https://github.com/python/cpython/blob/master/Lib/socket.py#L707

@jamadden

Member

jamadden commented Jul 11, 2018

> does not pass service (port) so maybe that's why there is no problem there

I suspect that's the root of the difference. Does this patch work for you?

@@ -229,8 +229,21 @@ class socket(object):
             return self._sock.connect(address)
         sock = self._sock
         if isinstance(address, tuple):
-            r = getaddrinfo(address[0], address[1], sock.family)
+            # address is (host, port) (ipv4) or (host, port, flowinfo, scopeid) (ipv6).
+            # Note that this doesn't work with exotic address formats like AF_TIPC
+            # on Linux.
+
+            # We don't pass the port to getaddrinfo because the C
+            # socket module doesn't either (on some systems its
+            # illegal to do that without also passing socket type and
+            # protocol). Instead we join the port back at the end.
+            host, port = address[:2]
+            r = getaddrinfo(host, None, sock.family)
             address = r[0][-1]
+            if len(address) == 2:
+                address = (address[0], port)
+            else:
+                address = (address[0], port, address[2], address[3])
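
The idea in the patch above can be sketched as a standalone function using only the stdlib socket module. This is an illustration of the approach, not gevent's actual code: the function name `resolve_without_port` is hypothetical, and it takes the address family as a parameter instead of reading it from a socket object.

```python
import socket


def resolve_without_port(address, family=socket.AF_INET):
    """Resolve only the host part of an address tuple, then rejoin the port.

    `address` is (host, port) for IPv4 or (host, port, flowinfo, scopeid)
    for IPv6. The port is never passed to getaddrinfo, so no service-name
    lookup takes place.
    """
    host, port = address[:2]
    # None as the service argument skips the service lookup entirely.
    results = socket.getaddrinfo(host, None, family)
    sockaddr = results[0][-1]
    if len(sockaddr) == 2:
        return (sockaddr[0], port)
    # IPv6: keep the resolved flowinfo and scopeid.
    return (sockaddr[0], port, sockaddr[2], sockaddr[3])
```

For example, `resolve_without_port(("127.0.0.1", 8888))` resolves the numeric host and hands back the caller's port unchanged, which is the behavior the patch relies on.
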
@jamadden

Member

jamadden commented Jul 11, 2018

A more complete version of what I hope is a fix is in #1256

@wiggin15

Contributor

wiggin15 commented Jul 11, 2018

Thanks @jamadden. Branch issue1252 seems to do the trick on Solaris. Hopefully it does not break anything else.
