nixos/networking: use one line per IP in /etc/hosts #119236
base: master
IIRC I didn't like the current behaviour either and I'm open to changing it but that obviously requires careful consideration as it could cause many regressions (#76542 already caused more regressions than expected :o).
Oh, that's bad :o
Yeah unfortunately these things are pretty difficult as the code and documentation around this is pretty outdated and it's easy to miss some consequences/regressions.
From hosts(5) (emphasis mine):

> For each host a *single* line should be present with the following
> information:

Prior to this change, my hosts file looked like this:

    127.0.0.1 localhost
    ::1 localhost
    127.0.0.2 atuin.qyliss.net atuin
    ::1 atuin.qyliss.net atuin

After this change, it looks like this:

    127.0.0.1 localhost
    ::1 localhost atuin.qyliss.net atuin
    127.0.0.2 atuin.qyliss.net atuin

Having multiple lines for the same IP breaks glibc's gethostbyaddr.
The easiest way to demonstrate this is with Python, but a simplified C
program is provided at the end of this message too.

    $ python3 -c 'import socket; print(socket.gethostbyaddr("::1"))'
    ('localhost', [], ['::1'])

With this fix applied:

    $ python3 -c 'import socket; print(socket.gethostbyaddr("::1"))'
    ('localhost', ['atuin.qyliss.net', 'atuin'], ['::1'])

As a higher level example, socket.getfqdn() will return 'localhost'
without this change, and 'atuin.qyliss.net' with it. This was
responsible for my Mailman instance sending mail with @localhost in
the Message-Id.

C program:

    #include <err.h>
    #include <netdb.h>
    #include <sysexits.h>
    #include <stdio.h>

    int main(void)
    {
        struct in6_addr addr = { 0 };
        addr.s6_addr[sizeof addr.s6_addr - 1] = 1; // ::1

        struct hostent *host = gethostbyaddr(&addr, sizeof addr, AF_INET6);
        if (!host)
            err(EX_OSERR, "gethostbyaddr: %s", hstrerror(h_errno));

        printf("name: %s\n", host->h_name);

        size_t n;
        for (n = 0; host->h_aliases[n]; n++);

        printf("aliases (%zu):", n);
        for (size_t i = 0; i < n; i++)
            printf(" %s", host->h_aliases[i]);
        printf("\n");
    }
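For readers who want to see the "one line per IP" idea in isolation, here is a minimal Python sketch of the grouping step. It is only an illustration under my own assumptions: the actual change lives in the NixOS module (written in Nix), and `merge_hosts`, `render_hosts` and `BEFORE` are hypothetical names invented for this example.

```python
def merge_hosts(entries):
    """Group (address, names) pairs into one entry per address.

    The first-seen order of addresses and names is preserved, and
    duplicate names for the same address are dropped.
    """
    merged = {}
    for address, names in entries:
        bucket = merged.setdefault(address, [])
        for name in names:
            if name not in bucket:
                bucket.append(name)
    return merged


def render_hosts(merged):
    """Render the merged mapping as hosts-file text, one line per address."""
    return "\n".join(
        f"{address} {' '.join(names)}" for address, names in merged.items()
    )


# The "before" hosts file from the commit message, as (address, names) pairs.
BEFORE = [
    ("127.0.0.1", ["localhost"]),
    ("::1", ["localhost"]),
    ("127.0.0.2", ["atuin.qyliss.net", "atuin"]),
    ("::1", ["atuin.qyliss.net", "atuin"]),
]

print(render_hosts(merge_hosts(BEFORE)))
# 127.0.0.1 localhost
# ::1 localhost atuin.qyliss.net atuin
# 127.0.0.2 atuin.qyliss.net atuin
```

Because names are appended in first-seen order, `localhost` stays the canonical (first) name for `::1`, which matches the "after" file quoted above.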
> Having multiple lines for the same IP breaks glibc's gethostbyaddr.
```console
$ python3 -c 'import socket; print(socket.gethostbyaddr("::1"))'
('localhost', [], ['::1'])
```
That is actually expected, see #76542:
Maybe I'm missing something in #76542, but it seems to me that it's
about making sure that ::1's canonical hostname is always localhost. I
do not propose changing this. I just want the FQDN and hostname to show
up as aliases. Are you saying it's intentional that they don't?
> After this change, hostname -f will return localhost.
That should be fixable, e.g. if `/etc/hosts` looks like this (the canonical hostname / FQDN has to come first):
```
127.0.0.1 atuin.qyliss.net atuin localhost
::1 localhost atuin.qyliss.net atuin localhost
```
Wouldn't that reintroduce the problems #76542 was trying to fix? The
other two distros I've looked at (Debian and Void) also make localhost
the canonical name for 127.0.0.1 FWIW.
> It's important we make this change so that /etc/hosts is actually
> valid, but we need to make hostname -f work as well.
AFAIK `/etc/hosts` is already valid (just a bit strange), though IIRC
I later discovered one hostname function that returned multiple
matches due to it (maybe `localhost` and the FQDN for
`127.0.0.1`/`::1`? - not sure anymore).
I read the man page I quoted earlier as saying that it's invalid, but
there's definitely some ambiguity.
Yes, that's correct.
Oh, sorry, I didn't read the second example carefully enough (
I'm reading "should" as a recommendation (while invalid = parsing fails, etc.) - or how RFCs use it:
But I get what you mean, I'm just using/weighting (in)valid differently.
> Are you saying it's intentional that they don't?
Unfortunately yes. IIRC it's not possible to add the aliases in the
second case without breaking either `hostname -f` or `::1` not
resolving back to `localhost`. The only solution that I see
potentially working is `::1 localhost atuin.qyliss.net` - i.e. with
only the FQDN, not the hostname (`atuin`) - because IIRC `hostname -f`
will use `gethostname()` to get `atuin` and then determine the FQDN
using `gethostbyname("atuin")` (i.e. `gethostbyname(gethostname())`).
Well, one solution would be to switch to a hostname implementation that
works properly, like the one in inetutils. I haven't done a lot of
research into what the other differences are. Do you know anything
about that or have any opinions on it?
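The trade-off discussed above can be played through with a toy model of a hosts-file lookup. This is a deliberately simplified sketch and not glibc's implementation: it assumes the lookup scans the file top to bottom and stops at the first matching line, and it ignores address families and any `host.conf` options. All function and constant names are hypothetical.

```python
def parse_hosts(text):
    """Parse hosts-file text into a list of (address, [names]) entries."""
    entries = []
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 2:
            entries.append((fields[0], fields[1:]))
    return entries


def lookup_by_name(entries, name):
    """Return the canonical (first) name of the first entry listing `name`."""
    for _address, names in entries:
        if name in names:
            return names[0]
    return None


def lookup_by_addr(entries, address):
    """Return (canonical name, aliases) of the first entry for `address`."""
    for entry_address, names in entries:
        if entry_address == address:
            return names[0], names[1:]
    return None


# Hosts file as produced by this PR.
ONE_LINE_PER_IP = """\
127.0.0.1 localhost
::1 localhost atuin.qyliss.net atuin
127.0.0.2 atuin.qyliss.net atuin
"""

# Variant suggested above: only the FQDN is added as an alias for ::1.
FQDN_ONLY_ALIAS = """\
127.0.0.1 localhost
::1 localhost atuin.qyliss.net
127.0.0.2 atuin.qyliss.net atuin
"""

# Modelling `hostname -f` as a by-name lookup of the short hostname:
print(lookup_by_name(parse_hosts(ONE_LINE_PER_IP), "atuin"))  # localhost
print(lookup_by_name(parse_hosts(FQDN_ONLY_ALIAS), "atuin"))  # atuin.qyliss.net
# The reverse lookup for ::1 still has localhost as the canonical name:
print(lookup_by_addr(parse_hosts(FQDN_ONLY_ALIAS), "::1"))  # ('localhost', ['atuin.qyliss.net'])
```

Under these assumptions, the FQDN-only variant keeps `localhost` canonical for `::1` while the short hostname still resolves to the FQDN. Whether real resolvers behave this way would need to be checked on an actual system.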
On my system [0]: I just realized that we still haven't updated to the new 2.0 release. The previous one (1.9.4) is from the end of 2011. (cc @matthewbauer)
The current state of things is very headache-inducing. I have a couple of VPSs, both under the same domain:
Here's what Python thinks is going on:

    >>> import socket
    >>> socket.gethostname()
    'host1'
    >>> socket.getfqdn()
    'localhost'
    >>> socket.getfqdn('host1')
    'localhost'
    >>> socket.getfqdn('host1.example.com')
    'localhost'
    >>> socket.getfqdn('example.com')
    'host1'
    >>> socket.getfqdn('host2.example.com')
    'host2.example.com'
    >>> socket.getfqdn('nixos.org')
    'nixos.org'

This violates several assumptions, namely:

Here are the results when run on the second machine:

    >>> import socket
    >>> socket.gethostname()
    'host2'
    >>> socket.getfqdn('host1')
    'host1.example.com'
    >>> socket.getfqdn('host1.example.com')
    'host1.example.com'
    >>> socket.getfqdn('example.com')
    'host1.example.com'
    >>> socket.getfqdn('host2.example.com')
    'localhost'

And here's my laptop's perspective:

    >>> import socket
    >>> socket.gethostname()
    'laptop'
    >>> socket.getfqdn('host1.example.com')
    'host1.example.com'
    >>> socket.getfqdn('example.com')
    'host1.example.com'
    >>> socket.getfqdn('host2.example.com')
    'host2.example.com'

All three of these should be giving the same results.
I'm not quite sure what's the state of this PR, considering its draft status and #119236 (comment). @alyssais is there a need to switch to another hostname implementation?
FWIW, I was able to get this working correctly by removing

    {
      networking.hostFiles = mkForce [];
    }

Is there a good reason why we're even filling

A quirk I ran into with this setup is that PTR queries for the machine's public IP address return only the hostname, but I think this might be a bug with
Florian Klink ***@***.***> writes:
> I'm not quite sure what's the state of this PR, considering its draft
> status and
> #119236 (comment).
> @alyssais is there a need to switch to another hostname implementation?
Yes, the hostname implementation that we're using does the wrong thing,
and will report an incorrect value for --fqdn if we fix the hosts file
like this PR does.
I haven't pursued this further, because I haven't had the energy to push
through a change like "switch our default hostname implementation".
nss-myhostname only applies on systems with nscd enabled, and there are valid reasons not to have it enabled. In that case, Also, there are some Go binaries that don't use nss at all, but parse
Seems with resolved enabled,
Motivation for this change
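From hosts(5) (emphasis mine):

> For each host a *single* line should be present with the following
> information:

Prior to this change, my hosts file looked like this:

    127.0.0.1 localhost
    ::1 localhost
    127.0.0.2 atuin.qyliss.net atuin
    ::1 atuin.qyliss.net atuin

After this change, it looks like this:

    127.0.0.1 localhost
    ::1 localhost atuin.qyliss.net atuin
    127.0.0.2 atuin.qyliss.net atuin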
Having multiple lines for the same IP breaks glibc's gethostbyaddr.
The easiest way to demonstrate this is with Python, but a simplified C
program is provided at the end of this message too.
$ python3 -c 'import socket; print(socket.gethostbyaddr("::1"))'
('localhost', [], ['::1'])
With this fix applied:
$ python3 -c 'import socket; print(socket.gethostbyaddr("::1"))'
('localhost', ['atuin.qyliss.net', 'atuin'], ['::1'])
As a higher level example, socket.getfqdn() will return 'localhost'
without this change, and 'atuin.qyliss.net' with it. This was
responsible for my Mailman instance sending mail with @localhost in
the Message-Id.
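To make the `socket.getfqdn()` effect concrete: as far as I recall, CPython's implementation calls `gethostbyaddr()` and then returns the first name containing a dot among the canonical name and its aliases, falling back to the canonical name. The sketch below paraphrases only that selection step; `pick_fqdn` is a hypothetical helper name, not part of the `socket` module.

```python
def pick_fqdn(hostname, aliases):
    """Choose a name from a gethostbyaddr()-style result.

    Returns the first candidate containing a dot, checking the canonical
    name before the aliases; falls back to the canonical name.
    """
    for candidate in [hostname, *aliases]:
        if "." in candidate:
            return candidate
    return hostname


# Result shape without this change: no aliases for ::1.
print(pick_fqdn("localhost", []))  # localhost

# Result shape with this change: FQDN and hostname show up as aliases.
print(pick_fqdn("localhost", ["atuin.qyliss.net", "atuin"]))  # atuin.qyliss.net
```

With an empty alias list there is no dotted name to pick, so the canonical name `localhost` is returned. That is consistent with the `@localhost` Message-Id described above.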
But! This exposes a problem. After this change, hostname -f will
return localhost. Worse, this won't be caught by the "hostname" NixOS
test, because that installs inetutils, which comes with its own hostname
implementation that will override the default one.
So I'm not sure what to do here. It's important we make this change so
that /etc/hosts is actually valid, but we need to make hostname -f work
as well.
Our options as I see it are:
Either way, we should make sure the hostname test actually uses the
hostname implementation that NixOS uses by default.
Maybe there's something else? cc @primeos @flokli, who fixed hostname -f
before.
Also CCing the people who helped me debug this: @puckipedia
@leahneukirchen.