Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Server only listens on IPv4 #61

Closed
WhyNotHugo opened this issue Jun 26, 2021 · 16 comments
Closed

Server only listens on IPv4 #61

WhyNotHugo opened this issue Jun 26, 2021 · 16 comments

Comments

@WhyNotHugo
Copy link
Contributor

It seems that the server only listens on IPv4, but httpserver.host returns localhost.

This will resolve to ::1 on a dual-stack system, so connecting to it will fail. httpserver.host should either return an IPv4 address, or the server should listen on both.

On most OSs, listening to just IPv6 implicitly listens on IPv4, so it might just be a matter of listening on the latter.

@csernazs
Copy link
Owner

Hi!

I've received your issue and will provide a fix very soon. I'm sorry for the delay, was on holiday for a couple of days and didn't have time to work on this.
Probably I can come up with a fix on this today or tomorrow latest.

@WhyNotHugo
Copy link
Contributor Author

No worries, we all have lives.

I think a simple fix might be to just change the default host to 127.0.0.1. Listening on IPv6 would be cleanest though.

@csernazs
Copy link
Owner

I wanted to understand this issue so I looked at werkzeug's code. It has a naive algorithm to detect the address family:
https://github.com/pallets/werkzeug/blob/main/src/werkzeug/serving.py#L606

So if the host specified does not contain : (colon) then it is assumed to be IPv4 only (with the special case for unix domain sockets), no address family lookup is done there by getaddrinfo...

As you suggested there are two possible fixes:

  1. I could specify the IPv6 loopback address as the default but that would break systems running with IPv6 disabled and I don't want to break any code. Adding auto-detection is a seems to be very hard to do properly on both Linux and Windows.
  2. Specifying 127.0.0.1 seems to be ok...
  3. Looking up the bound address in the server and returning that IP address in url_for, seems to be ok...

But the real issue with changing the default hostname is the following: if someone uses TLS the certificate may contain the hostname and if it is specified as "localhost", it won't work with "127.0.0.1" as the certificate contains the hostname. If it was made properly and contains 127.0.0.1 and localhost it will work but if only "localhost" is there, it will fail... For example test_ssl.py fails in pytest-httpserver's test suite.
The main issue here is that the same hostname is used in url_for which is used for the server to bind to...

Currently I have no good idea to fix this issue, as changing the default host may break TLS. I could release 2.0.0 but 1.0.0 has been released recently and releasing potentially breaking releases is not a good sign. I'm also thinking about writing a howto chapter about it or updating the documentation but that's also a half solution. Let me think on this for a while - or if you have ideas, please share it.

Last but not least, I couldn't reproduce it in github actions, even if I added test test running on windows.

For the record, I've just realized that this issue have been reported in PR #52 by @tenuki, but I wasn't able to understand it that time.

@WhyNotHugo
Copy link
Contributor Author

WhyNotHugo commented Jun 28, 2021 via email

@csernazs
Copy link
Owner

With a quick search I found this bug from 2015: pallets/werkzeug#33
Interestingly, it is the opposite as our case. :)

I think werkzeug should require an IP address, and not a hostname for bind (if it makes decisions on the hostname string, then it is an issue) or call getaddrinfo() but that's a more complicated story.

One idea which came up since my last comment is the following:

  1. server is bound to "localhost", so TLS will be intact, no default hostname will be changed
  2. first call to url_for obtains the hostname specified for the server. If it is "localhost", it tries to connect it with a simple connect() call
    1. if it is successful, then everything can go, it creates urls with "localhost" text. It assumes that name resolution will happen in the client software in the same way as the standard socket API's gethostbyname() (if not, we can't do anything)
    2. if it cannot connect to localhost, it can try with IP addresses "127.0.0.1" or "::1". If any of them successful, it will use that for URL. One "minor" issue is that the client being tested needs to support IPV6 if it returns "::1" which is not always the case. Port numbers should be specified differently in the case of IPv6 for example. In this case, it will break TLS but TLS would not work with "localhost" due to the previous step failed.

Alternatively, there would be just a check implemented in the url_for function and report the issue in a human friendly way, and do not do any heuristics. But I'm not sure if this should be responsibility of url_for.

@csernazs
Copy link
Owner

I think I've found the solution 💡

In pytest-httpserver, I'll do a gethostbyname() for the provided hostname and pass the resulted IP address to werkzeug.

So, if gethostbyname() call...

  • ...fails: it would fail with the implicit gethostbyname() call at bind() also ➡️ it won't break anything which is working now
  • ...results IPv4 address (eg. 127.0.0.1) ➡️ werkzeug will bind to IPv4 (AF_INET)
  • ...results IPv6 address (eg. ::1) ➡️ werkzeug will bind to IPv6 (AF_INET6) as it will find the colon (:) in the host parameter

As the http client will also resolve the name with the same mechanism, it will use the same IP address the server bound so the connection will be successful.

If there's no objection about it, I would proceed with this.

@WhyNotHugo
Copy link
Contributor Author

Huh, that's actually a really smart and simple approach. I like it.

@csernazs
Copy link
Owner

There's one little issue with my "solution": gethostbyname() is for IPv4 only. I forgot that...
There's getaddrinfo() which does name resolution for both IPv4 and IPv6 but it returns both addresses, and I cannot decide which one the client will connect to, eg. which one the server should bind. According to its documentation a client wanting to connect to a remote host should try the addresses one by one, it seems that in this issue this is not happening as it would find the server on 127.0.0.1.

Could you share a bit more information about the issue? What is the environment and which HTTP client are you using? Having a reproduction on my side would be very helpful.

@WhyNotHugo
Copy link
Contributor Author

If you run this unit test with httpserver.host instead of 127.0.0.1 it fails:

https://github.com/pimutils/vdirsyncer/blob/1a1f6f078882a8b7d1bd768a4a6544da8eec6cb1/tests/system/utils/test_main.py#L51-L69

I'm using aiohttp for the request. I think just using aiohttp with nothing special should reproduce the failure. My setup is dualstack.

@csernazs
Copy link
Owner

Thanks, I have a reproduction! Looking at it.
Interestingly a very minimal setup worked for me (create server with socket api for AF_INET, and use aiohttp with "localhost"), but your case triggers the issue.

@csernazs
Copy link
Owner

Interestingly, only the second request made in line 69 (https://github.com/pimutils/vdirsyncer/blob/1a1f6f078882a8b7d1bd768a4a6544da8eec6cb1/tests/system/utils/test_main.py#L69 ) fails with the connection error, the first passes.
If I remove that 2nd request by commenting that line and the pytest.raises line from the code, the test passes and the first request completes.

In aiohttp's resolver, it resolves both addresses (ipv4 and ipv6) for localhost:
https://github.com/aio-libs/aiohttp/blob/d55728da7486e0295a11bf2378a2eceb8fe39c64/aiohttp/resolver.py#L27
And my guess is that it tries with the other address when there's some error, and as the second request tests an invalid http request (with an invalid TLS, if I'm not mistaken) then it passes on the ipv6 address.

But I'm still looking.

@csernazs
Copy link
Owner

I think I have a better understanding of this issue.

In your test, you are testing with a bogus TLS setup. In the test a ServerFingerprintMismatch exception is expected, which is a ServerConnectionError, defined in aiohttps's client_exceptions.py. This is important. :)

There's a retry logic in aiohttp which iterates on the resolved addresses (for localhost it contains 127.0.0.1 and ::1). If it cannot connect and catches ServerConnectionError, it tries with the next one (https://github.com/aio-libs/aiohttp/blob/d55728da7486e0295a11bf2378a2eceb8fe39c64/aiohttp/connector.py#L992). When all hosts have been tried and none of them where successful, the last exception is raised.

I think that in the second request of your test case aiohttp fails with 127.0.0.1, it raises ServerFingerprintMismatch, which is a ServerConnectionError, so it moves on to ::1 address. This also fails but not with ServerFingerprintMismatch, so it makes the test case failed. This is also reflected in the traceback where the last_exc is re-raised, which is the exception received for the last connection attempt.

By fixing the address to 127.0.0.1 as you did, the list of IP addresses is only one so the last_exc exception is the one the test is expecting.

I think that it is an edge case and it does not means that pytest-httpserver does not work on dual-stack systems. It means that if you want to test for a TLS error (or some other error which raises an exception derived from ServerConnectionError) that fails with a completely different error due to the re-trying implemented in aiohttp.

What can be done in pytest-httpserver:

  • listening on both ipv4 and ipv6 by setting IPV6_V6ONLY socket option to 0 (there's a python API for it in newer python interpreters but I think I can backport it). This doesn't work on my linux for some reason I couldn't find out.
  • listening on both ipv4 and ipv6 by two different servers: it adds complexity as handlers needs to be registered/cleaned up twice, server stop time doubles, two threads needs to be started instead of one
  • changing the default host to 127.0.0.1, it may cause issues with TLS as I described previously
  • changing the default host to 127.0.0.1 only when TLS is not enabled: this would not solve your issue
  • adding a documentation entry on this issue: this can be done easily

And probably more. I'll think about it.

@WhyNotHugo
Copy link
Contributor Author

Ah, thanks for the research here, that was quite a tricky one.

I think listening in dualstack is always a good choice if it doesn't introduce complexity or other issues. But it sounds like in my use case, it's best to specify 127.0.0.1 (and probably leave a comment to this explanation!).

csernazs added a commit that referenced this issue Jul 6, 2021
There was a strange issue with dual-stack, so the documentation is updated.

Fixes #61
@csernazs
Copy link
Owner

csernazs commented Jul 6, 2021

Could you please look at the text I added to howto.rst? If there's some error or I missed something, I'm happy to fix it.

@WhyNotHugo
Copy link
Contributor Author

WhyNotHugo commented Jul 17, 2021

I think that's clear enough.

It's a very curious caveat though, and I think pretty much all systems nowadays are dualstack (even if they have public IPv6 connectivity). But given that I can't think of a way to completely get rid of it, that sounds good to me.

@csernazs
Copy link
Owner

Thanks!

I want to note that pytest-httpserver does work on dual stack system (mine is dual stack, I presume that the github CI is also dual-stack), as long as the client implements retrying logic which falls back to 127.0.0.1 (requests does, aiohttp does).
This issue appears when you want to test for errors as in such case it is not possible to distinguish between the retry caused errors and the "error I want to test", which is in your code is a TLS error - this is what I documented.

As discussed above, changing the default host to 127.0.0.1 is likely to cause errors in clients which use TLS connections whose certificate is issued for the "localhost" and does not issued for "127.0.0.1".
I think I tried various ways to solve this issue, but unfortunately none of them seems to be feasible.

csernazs added a commit that referenced this issue Jul 22, 2021
There was a strange issue with dual-stack, so the documentation is updated.

Fixes #61
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants