Requiring hostname to be a single domain label is fairly heavy handed for some networks #94011

Closed
grahamc opened this issue Jul 27, 2020 · 38 comments

Comments

@grahamc
Member

grahamc commented Jul 27, 2020

PR #76542 changed the hostname to be a single domain label in 993baa5 and enforces this through validation. This has broken large corporate users whose networks by convention use the FQDN as their hostname, and who also have decades of history and infrastructure built around this.

I think this is a case where "one size fits all" doesn't work so well, and I'm not sure this particular point is something we want to risk breaking / losing users over.

The PR references man 5 hostname, which says:

The hostname may be a free-form string up to 64 characters in length; however, it is recommended that it consists only of 7-bit ASCII lower-case characters and no spaces or dots, and limits itself to the format allowed for DNS domain name labels, even though this is not a strict requirement.

This point about not being a strict requirement, I think, should not be made into a strict requirement at our level.

@grahamc
Member Author

grahamc commented Jul 27, 2020

cc @primeos, @flokli, @zimbatm, @vcunat

@andir
Member

andir commented Jul 27, 2020

Just allow users to set the hostname as an FQDN.

I think hostname vs fqdn is a relic that should not be used most of the time, especially in combination with search names (which usually provide for nice privacy leaks…). I for my part just pretend there is only the hostname and that it is always an FQDN. There is no domain name a server belongs to. It might be the wrong thing to do on paper, but in reality I do not care what a server thinks its name is (except for things like SMTP handshakes where another party wants an in-band confirmation).

@grahamc
Member Author

grahamc commented Jul 27, 2020

I've reverted the relevant commit in #94022 -- take a look?

@primeos
Member

primeos commented Jul 27, 2020

This has broken large corporate users whose networks by convention use the FQDN as their hostname, and who also have decades of history and infrastructure built around this.

This is of course a problem that we'd like to avoid (as with any breaking changes) but tbh I don't really understand that argument. Couldn't they just easily revert the relevant commit in their fork?

This point about not being a strict requirement, I think, should not be made into a strict requirement at our level.

Anyway, that's certainly a valid argument. And in #76542 it was only ever made into a strict requirement since NixOS also provides networking.domain which makes our case a bit different. But since only networking.hostName affects the kernel's node name I'm ok if we wouldn't want to enforce it for that reason (and to allow additional characters).

But we also need to consider that allowing dots in networking.hostName makes some NixOS implementations and checks more difficult and can lead to non-obvious configuration issues that can be hard to find (I think there was a comment about Postfix in the PR but I couldn't find it anymore).

@primeos
Member

primeos commented Jul 27, 2020

This has broken large corporate users whose networks by convention use the FQDN as their hostname, and who also have decades of history and infrastructure built around this.

@grahamc just to better understand this (if you have time): What's the main problem here? Is this about the Linux kernel hostname or networking.hostName (and in that case, why is reverting the commit in a fork or updating the code not an option)?

@primeos
Member

primeos commented Sep 5, 2020

@grahamc @arianvp and anyone else who wants relaxed hostname checks: I get that you are busy (as we all are) but we really need quicker and more active responses if we want to resolve this discussion before the 20.09 release.

IIRC we still don't know any technical problems apart from the comments that this might be inconvenient for existing users.
Before the final release we should also look at #94022 (comment) (NixOps), check/finalize the release-notes, and determine if we want a read-only fqdn option.

@primeos primeos added this to To Do in 20.09 Blockers via automation Sep 5, 2020
@grahamc
Member Author

grahamc commented Sep 5, 2020 via email

@primeos
Member

primeos commented Sep 5, 2020

@grahamc congratulations then ;) I'm happy for you :)

Regarding this issue: From the comments here and especially in #94022 it seems to me like the current "trend" is to keep the strict checks and not change anything (IIRC), though we haven't really reached any consensus yet. So my idea would be to simply leave this issue open for further comments and see if we get any feedback/complaints regarding this during the beta release cycle.

@martinetd
Member

martinetd commented Sep 6, 2020

Just to add a data point as a nobody user: I'm also one of these weird users who use the FQDN as their hostname, and got "surprised" when installing a new system as 20.09pre to test; upgrades will also all require adjustments.
I don't have a lot that depends on that, but it's more than just adjusting hostnames even for my small setup, so I can relate to whatever org stumbled into this with whatever history they have.
For example, I have attrsets with hostnames and bags of data for wireguard autosetup and things like that which will need amending. It could quickly get messy at larger scales.

OTOH, I understand "full hostnames" can cause problems, and the error is clear enough, but if I want to shoot myself in the foot I don't see what's wrong with that? :)

Well, either way 20.09 is out soon -- I'll wait this long to decide if I want to update my scripts or not :D
Keep up the good work everyone and congratulations @grahamc!

EDIT: after reading the comments in #94022 I can understand it's difficult; places with explicit checks in nixpkgs are annoying for everyone. Well, what will happen will happen, but a transition step with a warning, as suggested there, would probably be appreciated by a few people.

@jonringer
Contributor

jonringer commented Sep 25, 2020

Just as a reminder, the 20.09 release is scheduled to happen this Monday, the 28th.

If this is still relevant to blocking the release, then there should be some forward movement.

A blocker meeting has yet to be scheduled. But if you consider this item to still warrant blocking the entirety of the nixos-20.09 release, then please post on the Feature freeze discussion issue. A template for proposing an item can be found at #95475 (comment)

@0x4A6F
Member

0x4A6F commented Sep 30, 2020

man 7 hostname states:

Each element of the hostname must be from 1 to 63 characters long and the entire hostname, including the dots, can be at most 253 characters long. Valid characters for hostnames are ASCII(7) letters from a to z, the digits from 0 to 9, and the hyphen (-). A hostname may not start with a hyphen.

And references some rather old RFCs:

RFC1123:

   2.1  Host Names and Numbers

      The syntax of a legal Internet host name was specified in RFC-952
      [DNS:4].  One aspect of host name syntax is hereby changed: the
      restriction on the first character is relaxed to allow either a
      letter or a digit.  Host software MUST support this more liberal
      syntax.

      Host software MUST handle host names of up to 63 characters and
      SHOULD handle host names of up to 255 characters.

The current implementation violates this:

"^$|^[[:alpha:]]([[:alnum:]_-]{0,61}[[:alnum:]])?$";

Are there reasons for this implementation?
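
To make the effect of that pattern concrete, here is a minimal sketch that evaluates it with `builtins.match` against a few sample names. The sample names and the `accepts` helper are made up for illustration; only the pattern itself comes from the comment above.

```nix
let
  # Pattern copied verbatim from the comment above.
  pattern = "^$|^[[:alpha:]]([[:alnum:]_-]{0,61}[[:alnum:]])?$";

  # Hypothetical helper: true if the name matches the pattern.
  accepts = name: builtins.match pattern name != null;
in
{
  singleLabel  = accepts "web1";             # true
  fqdn         = accepts "web1.example.com"; # false: dots are rejected
  leadingDigit = accepts "1web";             # false: first character must be a letter
  underscore   = accepts "web_1";            # true: underscores are accepted
}
```

Saved to a file, this can be evaluated with `nix-instantiate --eval --strict` on that file. The rejected leading digit is the RFC 1123 point raised above; the rejected dots are what the rest of this thread is about.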

@vcunat
Member

vcunat commented Sep 30, 2020

I'd rather restrict this particular thread just to the question whether it should/can contain dots. What exact characters to allow... doesn't seem to be a real problem right now.

@primeos
Member

primeos commented Oct 1, 2020

The current implementation violates this:

Yes, this is known and the main reason why this issue exists.
Though man 5 hostname (from systemd) is a better reference, as Linux only supports up to 64 characters for the entire hostname (including the terminating newline).

Are there reasons for this implementation?

The main discussion was in #76542 (but also #94022 and this issue).

I'd rather restrict this particular thread just to the question whether it should/can contain dots.

Yeah, I completely agree. The Linux kernel network node hostname can contain dots and this issue is about whether we want to allow this using networking.hostName or not. The reason why it currently isn't allowed is because we have networking.domain for this (and because it isn't recommended to use a FQDN, etc.).

Personally I feel like a grace period with a warning might've been a safer choice, but this also comes with its own downsides.

Anyway, basically this issue lacks feedback (e.g. from beta testers) on why this is a real problem (i.e. not "I used a FQDN for networking.hostName and now this doesn't work anymore / sucks"; instead we're interested in why the combination of networking.hostName and networking.domain doesn't work as a replacement [e.g. sysctl kernel.hostname should still be overridable via boot.kernel.sysctl."kernel.hostname"]).

@0x4A6F
Member

0x4A6F commented Oct 1, 2020

Sorry, the length of hostname is limited to 64, but that is not my point.
This implementation introduces too strict type requirements, if dots are disallowed.

Specifically the limitation to alphabetical characters at the start, which must be relaxed as stated in RFC 1123.
RFC 1123 updates RFC 952 and was published as Internet Standard exactly 31 years ago.
Limiting the start of the hostname to an alphabetic character is stated in man 5 hosts, but that page is utterly outdated and not a reference on this topic (no meaningful changes since at least 2004-11-03, only referencing RFC 952).

@grahamc
Member Author

grahamc commented Oct 2, 2020 via email

@primeos
Member

primeos commented Oct 2, 2020

This implementation introduces too strict type requirements, if dots are disallowed.

Again, this is known and not ideal but it was accepted as a compromise for its advantages. AFAIK it would be way more relevant to know why this is a problem and which effect of networking.hostName (which is only an abstraction) causes this problem (and if this breaks anything that it shouldn't).

But I also want to point out that I'm only trying to moderate this issue (though I'll try to reduce my participation here as we don't seem to make much progress / reach any consensus). Also I'm basically fine with any outcome (but a bit biased as #94022 was IIRC mostly rejected). Would it maybe help to do another vote here (e.g. keep the strict requirement, only make it a warning, or relax the requirement)?

My inability to provide details is mostly due to time (baby) and client confidentiality.

Yeah, that's unfortunate (but obviously not your fault).

There are Perl libraries with bug reports a decade old having to do with not handling the correct approach properly.

Not sure what this means. Do they need the FQDN and cannot get it if the Linux kernel hostname doesn't contain the domain (in which case boot.kernel.sysctl."kernel.hostname" might be a good workaround)?

@flokli
Contributor

flokli commented Oct 2, 2020

@grahamc do you think setting boot.kernel.sysctl."kernel.hostname", or setting a transient hostname via hostnamectl, would be sufficient to work around this?

I'd assume this mostly breaks "enterprise tooling" outside the NixOS ecosystem reading the hostname directly, not from the module system.

@grahamc
Member Author

grahamc commented Oct 2, 2020

This is a great question, let me get that tested.

@worldofpeace worldofpeace added this to To do in 20.09 Blockers Oct 5, 2020
@grahamc
Member Author

grahamc commented Oct 6, 2020

Okay, I've confirmed this works and fixes the concerns from the Kerberos / perl side:

  boot.kernel.sysctl."kernel.hostname" = "${config.networking.hostName}.${config.networking.domain}";

I wonder if this snippet should either be in the release notes, or a networking.hostnameIncludesDomain option?
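
For context, the snippet above might sit in a NixOS configuration roughly like this. This is only a sketch: the host name `web1` and the domain `example.com` are placeholders, and a later comment in this thread points out that the sysctl sets only a transient hostname, which DHCP may override.

```nix
{ config, ... }:
{
  # Placeholder values; substitute your own label and domain.
  networking.hostName = "web1";
  networking.domain = "example.com";

  # Workaround from the comment above: make the kernel node name the FQDN
  # while networking.hostName stays a single label.
  boot.kernel.sysctl."kernel.hostname" =
    "${config.networking.hostName}.${config.networking.domain}";
}
```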

@flokli
Contributor

flokli commented Oct 6, 2020 via email

@arianvp arianvp moved this from To do to In progress in 20.09 Blockers Oct 6, 2020
@jonringer
Contributor

jonringer commented Oct 9, 2020

Seems like there's three action items:

If this seems acceptable, then I think we can remove this as a blocker

primeos added a commit to primeos/nixpkgs that referenced this issue Oct 10, 2020
Since NixOS#76542 this workaround is required to use a FQDN as hostname. See
NixOS#94011 and NixOS#94022 for the related discussion. Due to some
potential/unresolved issues (legacy software, backward compatibility,
etc.) we're documenting this workaround [0].

[0]: NixOS#94011 (comment)
@primeos
Member

primeos commented Oct 10, 2020

@jonringer I just drafted #100151. Could you take a look?

@primeos
Member

primeos commented Oct 10, 2020

Maybe #100155 will also be helpful for some (but it's only indirectly related to this PR in that it helps to obtain the FQDN via a read-only NixOS option).

jonringer pushed a commit that referenced this issue Oct 10, 2020
jonringer pushed a commit to jonringer/nixpkgs that referenced this issue Oct 10, 2020

(cherry picked from commit 4a600af)
@flokli
Contributor

flokli commented Oct 11, 2020

I feel like the initial issue has been addressed sufficiently, there's workarounds that were found, documented and added to the release notes.

There's some ongoing discussion on #100155, but that's about adding a new convenience option, which is only loosely related to this issue, and certainly not blocking 20.09.

Let's close this one.

@flokli flokli closed this as completed Oct 11, 2020
20.09 Blockers automation moved this from In progress to Done Oct 11, 2020
@grahamc
Member Author

grahamc commented Nov 8, 2020

Another case where this has bitten me is provisioning machines in Packet where we only get "hostname" from the Packet API, and customers can only specify "hostname". However, the API-provided hostname will often include dots without intending to actually specify the domain. This is particularly true in the case of a default name. This means I can't do any "best" thing and have to manipulate the user input and potentially set the hostname to something they did not ask for.

@arianvp
Member

arianvp commented Nov 8, 2020

I already brought up the Packet issue before (I don't know where though; maybe it was during the go/no-go meeting). Because Packet was behind our release process anyway, we decided not to make that a blocker, if I recall correctly. (Though of course that's a bit of a chicken-and-egg problem, given you are the one maintaining those images and I suppose this issue is blocking you from creating newer ones :P )

@grahamc
Member Author

grahamc commented Nov 8, 2020

Thanks. I sorted it by just replacing .s with -s, but since Packet's validation may be more or less strict than our validation, it is essentially unsafe for me to use the user-provided hostname in the system configuration.
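
The dot-to-dash replacement described above could look something like the following in Nix. This is a sketch only: the `sanitize` helper and the sample name are hypothetical, and it does nothing about the other differences between Packet's validation and the NixOS pattern.

```nix
let
  # Hypothetical helper: replace every "." with "-" so that the result
  # no longer contains dots.
  sanitize = name: builtins.replaceStrings [ "." ] [ "-" ] name;
in
sanitize "my.default.name" # evaluates to "my-default-name"
```

As the comment notes, the result may still fail validation (for example with a leading digit), and it is no longer the name the user asked for.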

@flokli
Contributor

flokli commented Dec 1, 2020

Also note NixOS/systemd properly picks up hostnames (including dots in the hostname) if networking.hostName is set to an empty string.

This should be accomplishable by setting systemd.hostname= in the kernel cmdline or by receiving a hostname from DHCP (if networkd is enabled, due to UseHostname= defaulting to true).

This also should work with packet nodes.
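
A rough, untested sketch of the kernel command line variant described above. The FQDN `web1.example.com` is a placeholder; `boot.kernelParams` is the NixOS option for extra kernel command line parameters.

```nix
{
  # Leave the static hostname unset so that NixOS does not validate it
  # and systemd can pick the name up from elsewhere.
  networking.hostName = "";

  # Placeholder FQDN passed to systemd via the kernel command line.
  boot.kernelParams = [ "systemd.hostname=web1.example.com" ];
}
```

With the DHCP variant, only the first line would be needed, and the hostname would come from the DHCP server instead.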

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/hostname-is-not-of-type-string-matching-the-pattern/17666/1

@arianvp
Member

arianvp commented Dec 6, 2023

Note that flakes are still broken out of the box on most hosting providers due to this: most hosting providers push a transient hostname that is an FQDN, and then your first nixos-rebuild switch --flake will just break.

I completely disagree with the take that . in hostnames is bad. I think the opposite is true. NIS domain names are bad and should not be used; e.g. macOS doesn't even support setdomainname() anymore.

I would really prefer this Regex to be removed and allow dots back in networking.hostName. It's a hill to die on that is extremely annoying for day to day users.

@arianvp
Member

arianvp commented Dec 6, 2023

Also note that boot.kernel.sysctl."kernel.hostname" is not a workaround at all, as it means you'll have a transient hostname, which will be overridden immediately by DHCP once the network is up.

NixOS should allow setting a static hostname with FQDN.

@arianvp
Member

arianvp commented Dec 6, 2023

One more datapoint: given we're a systemd-based distro and hostnamed is responsible for handling transient (and static) hostnames for users using Networkd, DHCPCD, or NetworkManager, systemd has the following to say, and I think we should use it as our authoritative source:

The static and transient hostnames must each be either a single DNS label (a string composed of 7-bit ASCII lower-case characters and no spaces or dots, limited to the format allowed for DNS domain name labels), or a sequence of such labels separated by single dots that forms a valid DNS FQDN.

@arianvp
Member

arianvp commented Dec 6, 2023

Finally. The suggestion of just setting both hostName and domain and relying on networking.fqdn doesn't work either:

networking.domain sets the NIS domain through setdomainname() and the NIS domain is transient only. So it can change any time due to DHCP.

So if you have a DHCP server that pushes a NIS Domain name; it will change underneath your feet and your networking.fqdn will not be the same anymore as your real fqdn leading to really confusing bugs.

The docs of networking.domain are also a bit misleading in this regard, as DHCP will override the domainname regardless of whether the option is set:

    The domain.  It can be left empty if it is auto-detected through DHCP.

For example on EC2 you can have the scenario:

networking.hostName = "hello";
networking.domain = "my-domain.com";

Then fqdn evals to hello.my-domain.com

You'd expect hostname -f to return hello.my-domain.com but it actually returns hello.my.configured.domain.in.dhcp.option-set.vpc (If one configures EC2's DHCP server to broadcast the domain name over DHCP)
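
To illustrate why this mismatch matters, here is a sketch of a module that consumes the evaluated value. The file name `fqdn-example` is made up; `networking.fqdn` is the read-only option discussed earlier in this thread.

```nix
{ config, ... }:
{
  networking.hostName = "hello";
  networking.domain = "my-domain.com";

  # The value is fixed at evaluation time: this file will contain
  # "hello.my-domain.com" no matter what domain DHCP pushes at runtime.
  environment.etc."fqdn-example".text = config.networking.fqdn;
}
```

Any service configured from `config.networking.fqdn` ends up in the same position as this file, while `hostname -f` on the running machine may report something else.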

@zimbatm
Member

zimbatm commented Dec 6, 2023

Good points, we should follow systemd's lead here.

@mossholderm

Another point... Kerberos really wants the hostname to be an FQDN. It is baked into the entire authentication model, down to the system level. If you want to support enterprises, you'll need to allow Kerberos to function correctly.

@zimbatm
Member

zimbatm commented Jan 9, 2024

Summoning @flokli

@flokli
Contributor

flokli commented Jan 9, 2024

I agree with @arianvp 's assessment and the references provided. Unfortunately there's a lot of stuff in the nixos module system using the fqdn, and I'm not sure it'd all work, so a PR changing this would need to trace these usages.

@zimbatm
Member

zimbatm commented Jan 15, 2024

Does any volunteer want to drive this?
