Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforce two-tuple input for socket.connect() #69

Merged
merged 1 commit into from
Sep 4, 2015
Merged

Enforce two-tuple input for socket.connect() #69

merged 1 commit into from
Sep 4, 2015

Conversation

kevinconway
Copy link
Contributor

The results from socket.getaddrinfo() stored in sockaddr differ from IPv4 and IPv6. The IPv4 values are a two-tuple of (HOST, PORT). However, the IPv6 values are a four-tuple of (HOST, PORT, FLOW_INFO, SCOPE_ID). This behaviour is documented on https://docs.python.org/2/library/socket.html#socket.getaddrinfo. Unfortunately, the error thrown is not in the socket.error family which causes a stack trace rather than skipping the IPv6 connection.

The additional values returned for the IPv6 address are invalid for calls to connect which expects only a two-tuple matching the IPv4 output. There is an additional function in the socket modules, create_connection, which accounts for this issue. However, this function would not produce the correct socket implementation should one other than the standard lib be given.

Forcing the tuple length to two will cause the call to connect to raise an error in the socket.error family for IPv6 connections which, in turn, will cause the existing code to skip it and fall back on the IPv4 connection.

@awestendorf
Copy link
Member

@rocco66 Can you comment on this? It was your IPv6 support in #66 that Kevin is trying to fix. Without an IPv6 host to test against, and having only confirmed that your patch did not break IPv4, I'm inclined to revert your PR.

@kevinconway
Copy link
Contributor Author

The only way to have the connect function work with an IPv6 address is to initialize the instance with socket.AF_INET6 as the first parameter. This could be accomplished by passing in klass=partial(socket.socket, socket.AF_INET6). I'm not clear on where that would take place if I wanted to set it since it must be passed in during the connect call of the Transport. The examples only show use of the string keyword shortcuts for using transports.

If it's possible to bind the SocketTransport to only the standard lib socket we could also make use of socket.create_connection.

@kevinconway
Copy link
Contributor Author

I was thinking about this some more. If we assume that any value given as klass is an API compatible socket implementation we should be able to automatically switch between IPv4 and IPv6 connections as available in the environment. For example, here is a potential alternate implementation:

    def connect(self, (host, port), klass=socket.socket):
        '''
        Connect assuming a host and port tuple.
        '''
        self._host = "%s:%s" % (host, port)
        infos = socket.getaddrinfo(host, port, 0, 0, socket.IPPROTO_TCP)
        if not infos:
            raise socket.error("getaddrinfo returns an empty list")
        # Sort addresses such that the new IPv6 address is first.
        for _family, _socktype, _proto, _canonname, sockaddr in sorted(infos, key=lambda info: len(info[-1])):

            try:
                # Check if the IP is a valid IPv6 address. Fallback to IPv4 if not.
                socket.inet_pton(socket.AF_INET6, sockaddr[0])
                self._sock = klass(socket.AF_INET6)
            except socket.error:
                 self._sock = klass()

            self._sock.setblocking(True)
            self._sock.settimeout(self.connection._connect_timeout)
            if self.connection._sock_opts:
                for k, v in self.connection._sock_opts.iteritems():
                    family, type = k
                    self._sock.setsockopt(family, type, v)
            self._sock.connect(sockaddr[:2])

            # After connecting, switch to full-blocking mode.
            self._sock.settimeout(None)

This will cause connections to default to IPv6 if there is a valid IPv6 address resolved from the given hostname/IP address. Otherwise it will fall back to an IPv4 connection if found. If no connections are found the existing behaviour of raising an exception is preserved.

Let me know if this implementation is preferable and I can change the patch.

@kevinconway
Copy link
Contributor Author

Hey folks. I added a patch to the pull request that actually allows for IPv6. The first patch simply fixed IPv4 connections when an IPv6 address is present. This patch prioritizes IPv6 if it exists with a fallback to IPv4.

I'm looking at adding some tests for the new behaviour. I noticed the tests used a value of ('host', 5309) for connections. Do you have special entries in your hosts file to account for this? The tests fail when I run them because that host and IP cannot be resolved. I can modify the values to ('localhost', 5309), but that also requires that something listen on that port. Any suggestions?

@awestendorf
Copy link
Member

The tests are using Chai for mocking. I like to use sensible but not-normal values in expectations and mocks so it's easier to see how parameters influence the test. The ('host', 5309) tuple is matched here

I don't have an IPv6 host to test bu I can look over your changes and confirm that they still operate for IPv4. What I see looks good so far.

@kevinconway
Copy link
Contributor Author

@awestendorf Are the tests for the project passing as is? It looks like #66 caused a regression in the socket_transport tests by adding a call to the socket module's socket.getaddrinfo() function. This function fails if the hostname and port do not resolve to an actual TCP listener. The test values of ('host', 5309) won't resolve unless you happen to have rabbitmq running and 'host' resolves to some address. It looks like we'll need to either monkey patch the call to socket.getaddrinfo() or add getaddrinfo as an additional arg similar to klass for socket. Any preference?

Either way, putting socket.getaddrinfo() into our control for testing will allow us to write tests to simulate IPv6 vs IPv4 availability in unit tests.

Current test output from master:

======================================================================
ERROR: test_connect_with_klass_arg (tests.unit.transports.socket_transport_test.SocketTransportTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/vagrant/venv/lib/python2.7/site-packages/chai/chai.py", line 63, in wrapper
    func(self, *args, **kwargs)
  File "/home/vagrant/patch/haigha/tests/unit/transports/socket_transport_test.py", line 71, in test_connect_with_klass_arg
    self.transport.connect(('host', 5309), klass=klass)
  File "/home/vagrant/patch/haigha/haigha/transports/socket_transport.py", line 32, in connect
    infos = socket.getaddrinfo(host, port, 0, 0, socket.IPPROTO_TCP)
gaierror: [Errno -2] Name or service not known

======================================================================
ERROR: test_connect_with_no_klass_arg (tests.unit.transports.socket_transport_test.SocketTransportTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/vagrant/venv/lib/python2.7/site-packages/chai/chai.py", line 63, in wrapper
    func(self, *args, **kwargs)
  File "/home/vagrant/patch/haigha/tests/unit/transports/socket_transport_test.py", line 49, in test_connect_with_no_klass_arg
    self.transport.connect(('host', 5309))
  File "/home/vagrant/patch/haigha/haigha/transports/socket_transport.py", line 32, in connect
    infos = socket.getaddrinfo(host, port, 0, 0, socket.IPPROTO_TCP)
gaierror: [Errno -2] Name or service not known

----------------------------------------------------------------------

@awestendorf
Copy link
Member

Nope, looks like the tests were broken in master. I fixed them and will be looking into Travis for CI.

@kevinconway
Copy link
Contributor Author

I added tests to demonstrate the behaviour when there is only an IPv6 address available as well as when both are available and priority is given to the IPv6 address.

@vitaly-krugl
Copy link

Guys, just wanted to give you my kudos for the disciplined approach to unit tests in haigha. This is sorely lacking in some of the competing packages.

continue
else:
raise exc
self._sock.connect(sockaddr[:2])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinconway Reading the docs for getaddrinfo, I see this:

family (a (address, port) 2-tuple for AF_INET, a (address, port, flow info, scope id) 4-tuple for AF_INET6), and is meant to be passed to the socket.connect() method.

That sounds to me like this change is incorrect and sockaddr should be passed as getaddrinfo returned it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flowinfo and scopeid portions can be omitted in any of the socket functions. It's allowed for backwards compat reasons, but there is a caution in the docs that it may cause problems later on. Next chance I get I'll pluck the slice out and see how it affects the feature.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _sock.connect(...) needs to be done in the loop until we connect or exhaust all returned addresses. The reasoning is that a hostname may resolve to multiple addresses that may be associated with multiple network interfaces on the server, but there is no guarantee that the AMQP broker will be bound to and listening on all those interfaces.

@vitaly-krugl
Copy link

Is any of this related to issue #76?

TypeError: getsockaddrarg() takes exactly 2 arguments (4 given)

Thanks

@kevinconway
Copy link
Contributor Author

Yes, I believe this patch addresses that issue.

self._sock = klass()
infos = socket.getaddrinfo(host, port, 0, 0, socket.IPPROTO_TCP)
if not infos:
raise socket.error("getaddrinfo returns an empty list")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid raising socket.error exceptions. Those belong to the builtin socket module, and raising them in haigha adds to confusion when debugging. socket.getaddrinfo() is supposed to raise socket.gaierror per https://docs.python.org/2/library/socket.html.

For example, if it fails to resolve a hostname, you get this exception:

>>> import socket
>>> socket.getaddrinfo("goglezzzz.com", 5672)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

So, we shouldn't need to worry about getting an empty list, but if we must assert, then a simple assert infos, "getaddrinfo returned an empty list" would be better.

@vitaly-krugl
Copy link

@kevinconway @awestendorf: I coded up this example of how connectivity and error reporting (raising of exceptions) should work well: https://github.com/vitaly-krugl/inetpy/blob/master/inetpy/connect.py. See connect_tcp and the underlying connect_from_addr_infos for details. This code is based on the example in https://docs.python.org/2/library/socket.html.

This addresses the ordering, error-reporting, exception raising, and connectivity issues that I raised in my code review feedback.

You are welcome to reuse my implementation in Haigha.

Best,
Vitaly

@vitaly-krugl
Copy link

@kevinconway, I created the PR #80 to demonstrate what I had in mind. It's fully functional, compliant with ordering rules, attempts connection with every compatible address until it succeeds or exhausts them all, and doesn't fake socket.error exceptions.

Please feel free to pull it in directly.

Best,
Vitaly

@kevinconway
Copy link
Contributor Author

@vitaly-krugl @awestendorf

Ready for another round of reviewing. I believe I've incorporated all of the feedback from @vitaly-krugl. The code no longer reorders the address info, now uses the results of getaddrinfo to construct the socket rather than guessing the protocol, and raises an internal haigha exception rather than a socket error on failure.

Give the patch another look when you both get a chance and let me know what you think.

@@ -29,26 +34,33 @@ def connect(self, (host, port), klass=socket.socket):
Connect assuming a host and port tuple.
'''

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to document exceptions that this method may reasonably raise, so callers know what to expect

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinconway, please document the exceptions that this method may reasonably raise, so callers know what to expect. This will probably be something like this:

    def connect(self, (host, port), klass=socket.socket):
        '''
        Connect assuming a host and port tuple.

        :raises socket.gaierror: error resolving address
        :raises socket.error: error configuring or connecting socket
        '''

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm 100% for documenting code like this. I just need @awestendorf to sign off on a doc format. The sphinx format you've suggested is fairly ubiquitous, but there are also other standards, like napolean, that achieve the same but look different. I see sphinx format used in the message potion of the code base, but it's so minimal I want to make sure that's actually a standard chosen by the project owner.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a preference and never attempted to correctly document all the methods in this project. I see sphinx more often than not and that's the only reason I'd suggest it.

exc = socket.error("getaddrinfo returns an empty list")
for _family, _socktype, _proto, _canonname, sockaddr in infos:

for info in socket.getaddrinfo(host, port, 0, 0, socket.IPPROTO_TCP):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getaddrinfo can raise socket.gaierror. Since it appears that the new code wishes to raise ConnectionError when the connect method fails, for consistency, it should raise ConnectionError in this case, too. For example:

try:
    socket.getaddrinfo(host, port, 0, 0, socket.IPPROTO_TCP)
except socket.gaierror as ex:
    self.connection.logger.exception('getaddrinfo failed')
    raise ConnectionError('getaddrinfo failed: %r' % (ex,))


for info in infos:
    family, socktype, proto, _, sockaddr = info
    . . . 

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure that we need to raise a custom error though, the stdlib exceptions do a fine job here and are easily handled by the caller. I think in all cases, the connect is directly called by the user through Connection.connect, and if that is documented to raise socket exceptions then I think we should stick with that.

@vitaly-krugl
Copy link

@kevinconway, thanks for the update. Please see my code review feedback.

@kevinconway
Copy link
Contributor Author

@vitaly-krugl: I appreciate the time you're putting in to your reviews. I've only read over it briefly, but it looks like good feedback. I'll read more deeply when I get a chance and roll out an update.

@awestendorf
Copy link
Member

@vitaly-krugl Seconded, thank you for the very thorough review. I agree with all your notes and added some comments where appropriate. Thank you @kevinconway for continuing to iterate on this, the final result will be well worth it. Next up, TLS 😄

@kevinconway
Copy link
Contributor Author

@vitaly-krugl @awestendorf
Hey team. Just rolled out an update based on the feedback you both left. When you get an opportunity give the patch another review. Once we're at a stable point I'll squash the commits and push again before the merge.

@vitaly-krugl
Copy link

Thanks @kevinconway, I provided feedback on several areas, but would you mind fixing that up and squashing all your changes into a single commit before further code review? This would help the reviewer detect problems that might be introduced during squashing.

Also, please run pylint and address findings related to your changes, if any. Thx!

self._sock = klass(family, socktype, proto)
self._sock.settimeout(self.connection._connect_timeout)
if self.connection._sock_opts:
for key, value in self.connection._sock_opts.iteritems():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be neater and still fit in 80 columns:

                for (level, optname), value in self.connection._sock_opts.iteritems():
                    self._sock.setsockopt(level, optname, value)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I misread https://www.python.org/dev/peps/pep-3113/ and was considering this syntax deprecated. However, it only applies to the use of tuples as function parameters. I'll adjust this.

Adjust the connect logic to cycle through all available addresses
discovered from socket.getaddrinfo() in their natural order.

Add unit tests for IPv6 connections.

Cleanup legacy connection code.

Adjust variable names to match documentation for setsockopt
call and remove redundant setblocking call.

Add sphinx docstring to connect method
@vitaly-krugl
Copy link

Hi @kevinconway, I am off the grid for the next few days. I will review the changes next week, unless @awestendorf beats me to it. Best.

@vitaly-krugl vitaly-krugl mentioned this pull request Jul 7, 2015
self._sock = klass(family, socktype, proto)
self._sock.settimeout(self.connection._connect_timeout)
if self.connection._sock_opts:
_sock_opts = self.connection._sock_opts

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The local variable _sock_opts should not begin with an underscore; should be just sock_opts

@vitaly-krugl
Copy link

Thanks @kevinconway, I completed the code review and we're almost there. Please address my new feedback above regarding:

  • _sock_opts
  • any_order (x 2)
  • "Creation and configuration of a socket apparently can also fail"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants