Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/networking: Add a read-only option for the FQDN #100155

Merged
merged 3 commits into from Jan 25, 2021

Conversation

@primeos
Copy link
Member

@primeos primeos commented Oct 10, 2020

This is a convenience option that can be used to quickly obtain the
configured FQDN.

Motivation for this change

See #76542 (comment). It might also help for things like #94011 (comment).
We might want to mention this in the release notes a well but we should wait for #100151 then.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.
@primeos
Copy link
Member Author

@primeos primeos commented Oct 10, 2020

@GrahamcOfBorg test hostname.explicitDomain hostname.noExplicitDomain

4

@worldofpeace
Copy link
Contributor

@worldofpeace worldofpeace commented Oct 10, 2020

Ooh, I like this. cc @jonringer I think we should for sure include this

@worldofpeace worldofpeace added this to In progress in 20.09 Blockers via automation Oct 10, 2020
@jonringer
Copy link
Contributor

@jonringer jonringer commented Oct 11, 2020

although this is convenient, I'm not sure this is great behavior. Mainly from it being nullable.

If you really needed an FQDN to be defined, wouldn't you rather want an error thrown at evalModules time?

@jonringer
Copy link
Contributor

@jonringer jonringer commented Oct 11, 2020

I think we should be careful about what we expose, and I would like to hear back from some others: cc @Infinisil @grahamc

@Infinisil
Copy link
Member

@Infinisil Infinisil commented Oct 11, 2020

Looks fine to me. Code that uses this can still check if it's null and throw an appropriate error then. Or it can not check for null and get a type error instead when it's used.

@jonringer
Copy link
Contributor

@jonringer jonringer commented Oct 11, 2020

from a user's perspective, I believe I would prefer:

assert failed: please provide a valid value for networking.hostname and networking.domain

over

cannot coerce null value to string

and I'm not even sure if a null value would produce an error, may just be coerced to ""

@jonringer
Copy link
Contributor

@jonringer jonringer commented Oct 11, 2020

again, this is largely my take on nullability or subtle coercing. I think it's less surprising to get an assertion error plainly saying what assumptions are incorrect, instead of getting some coerce-tion error at runtime that's 1 to n steps removed from the actual issue.

EDIT: to @primeos 's credit, he did make the naming explicit that it's null. But often times the way that these errors arise in production are much less desirable. In which some chain of assumptions were made.

@primeos
Copy link
Member Author

@primeos primeos commented Oct 11, 2020

although this is convenient, I'm not sure this is great behavior. Mainly from it being nullable.

Yeah, that was also my main concern. It's clearly not ideal as we don't have any nice null safety like e.g. Kotlin. So AFAIK any of the following can happen:

  • The user checks for null (would be the most safe but this isn't enforced and requires more code when using it)
  • The user gets the cannot coerce null value to string error and analyzing the trace might be a PITA
  • It's used as value or default value for another option that might be of the same type. This is probably the most dangerous case as the FQDN could be omitted without an error (though at least if the FQDN is really required the option shouldn't be nullable).

It's also important to note that networking.fqdn will be null with the default configuration due to networking.domain.

If you really needed an FQDN to be defined, wouldn't you rather want an error thrown at evalModules time?

I guess this depends on the usage. E.g. some modules might optionally take an FQDN and either skip the configuration directive or provide a default value in case of null (but again this could also result in unexpected consequences). Personally I'm not really sure what's better.

@flokli
Copy link
Contributor

@flokli flokli commented Oct 11, 2020

If the FQDN attribute is meant as a convenience option, requiring to do some checking for not being null kinda defeats its purpose. Also, joining hostName and domain manually isn't a lot of typing ;-)

On the other hand, silently skipping the domain and only returning networking.hostName is also dangerous - Just by reading its option name I'd imply it wouldn't silently fallback to that if the domain isn't set.

Can we do some exploration in the current NixOS modules code?

If we introduce such an option, it should fail loudly, and maybe with a meaningful and traceable error message.

Things like options.services.dnscrypt-wrapper.providerName could make use of such an option.

@Infinisil
Copy link
Member

@Infinisil Infinisil commented Oct 11, 2020

You could just replace the null with a throw "better error"

@jonringer
Copy link
Contributor

@jonringer jonringer commented Oct 11, 2020

I would rather write:

if config.networking.hasFqdn then config.networking.fqdn else config.networking.hostName

as it reads more naturally than

if config.networking.fqdn == null then config.networking.hostName else config.networking.fqdn

in which the conditions in which the fqdn is null aren't readily apparent.

To be completely honest, I haven't had the need to use a fqdn as part of a module yet. So I'm not sure what path is the "most ergonomic".

@aneeshusa
Copy link
Contributor

@aneeshusa aneeshusa commented Oct 11, 2020

I defined a convenience fqdn attr like this PR in my system fork when I pulled in the original change so +1 on this change overall! I agree with @flokli's comment that since this is for convenience, I'd prefer use fdqn directly without checking in my own setup (where I know I always have a domain setup) - see my use cases below.

I also like @Infinisil's throw "better error" approach.

Since networking.hostName cannot be null, for modules which need to have different behavior if an fqdn does not exist, my 2c would be to prefer if config.networking.domain == null then <fallback behavior> else config.networking.fqdn - which is the same condition they would be checking today.

My main use case in my system configs is to to set up locations in a nginx vhost for the machine's own fqdn. I have many services that are reversed proxied to from various paths - e.g. prometheus exporters under /prometheus, caldav server on /caldav, git web server on /git. Another one is to let prometheus know what its external web URL is, which could also apply to other web applications. These all are generally "give me an easy reference to a DNS name that will resolve to the machine" where I prefer a hard error on lack of domain over having to check for null since a raw host name can't be resolved via DNS.

@flokli
Copy link
Contributor

@flokli flokli commented Oct 11, 2020

I'd also prefer reading config.networking.fqdn throwing a nice error, and code like if config.networking.domain == null then <fallback behavior> else config.networking.fqdn in modules/configuration that can implement a fallback.

@jonringer
Copy link
Contributor

@jonringer jonringer commented Oct 12, 2020

Since this isn't "fixing" a particular usecase, but rather an enhancement. I would like to remove this from 20.09 blockers. cc @worldofpeace

@worldofpeace worldofpeace removed this from In progress in 20.09 Blockers Oct 12, 2020
@worldofpeace
Copy link
Contributor

@worldofpeace worldofpeace commented Oct 12, 2020

Sure.

@worldofpeace worldofpeace added this to the 20.09 milestone Oct 12, 2020
@worldofpeace
Copy link
Contributor

@worldofpeace worldofpeace commented Oct 12, 2020

Cool on backporting, though.

@primeos primeos force-pushed the nixos-add-fqdn-option branch from 9f07f38 to 594343d Oct 12, 2020
@primeos
Copy link
Member Author

@primeos primeos commented Oct 12, 2020

Ok, I switched from null to throw.

Just as a heads-up/disclaimer: if config.networking.domain == null then <fallback behavior> else config.networking.fqdn won't always be enough as the hostname can be set to an empty string in which case it will be obtained dynamically via DHCP.

Example error message:

[root@friday:/var/nixpkgs]# nixos-rebuild dry-build
building the system configuration...
error: The FQDN is required but cannot be determined. Please make sure that
both networking.hostName and networking.domain are set properly.

(use '--show-trace' to show detailed location information)

Also: Apparently the defaultText gets escaped: Default: "\${networking.hostName}.\${networking.domain}" (literalExample might help)

This is a convenience option that can be used to quickly obtain the
configured FQDN.
@primeos primeos force-pushed the nixos-add-fqdn-option branch from 594343d to 971f0b4 Oct 12, 2020
Copy link
Contributor

@jonringer jonringer left a comment

LGTM

flokli
flokli approved these changes Oct 12, 2020
Copy link
Contributor

@flokli flokli left a comment

SGTM - however, we shouldn't backport this to 20.09 IMHO.

Let it sink in unstable, let's see what parts of the module system make use of it, so we have the time to do necessary changes on it - if we see the need for it.

@jonringer
Copy link
Contributor

@jonringer jonringer commented Oct 12, 2020

good point @flokli maybe we should test pilot this in unstable.

@FRidh FRidh removed this from the 20.09 milestone Dec 20, 2020
@FRidh FRidh added this to the 21.03 milestone Dec 20, 2020
cole-h
cole-h approved these changes Jan 14, 2021
Copy link
Member

@cole-h cole-h left a comment

Diff LGTM.

If we want to test pilot this in unstable, I feel we should do so sooner rather than later (e.g. the closer we get to releasing 21.05, the less time we have for people to run with this option and find issues).

(Read: this should probably be merged, but I'm not confident enough to do so despite the approvals.)

@flokli
Copy link
Contributor

@flokli flokli commented Jan 15, 2021

I gave this a try locally by cherry-picking this into my local checkout.

I then manually edited nixos/modules/services/networking/smokeping.nix and modified the default of services.smokeping.hostName to be config.networking.fqdn (cause I assume that'd be a location where people would use this option)

I then did a nixos-rebuild switch -I nixpkgs=/path/to/checkout, but got the evaluation failure:

while evaluating the attribute 'default' at /home/flokli/dev/nixos/nixpkgs-outbox/nixos/modules/tasks/network-interfaces.nix:404:7:
The FQDN is required but cannot be determined. Please make sure that
both networking.hostName and networking.domain are set properly.

However, my system configuration sets both networking.hostName, as well as networking.domain. I assume the default of networking.fqdn (which is at network-interfaces.nix:404) is evaluated to early, before hostName and domain is set…

@primeos
Copy link
Member Author

@primeos primeos commented Jan 15, 2021

I then manually edited nixos/modules/services/networking/smokeping.nix and modified the default of services.smokeping.hostName to be config.networking.fqdn (cause I assume that'd be a location where people would use this option)

Interesting... I haven't confirmed this yet but after thinking a bit about this it might/should be due to the lack of using defaultText (assuming the configuration.nix man page (etc.) is built without considering networking.* - not sure if that's the case though)?
Update: Even with defaultText the evaluation for nix-build nixos/release.nix -A manual.x86_64-linux still fails... :o
Update2: Nvm, the default value of services.smokeping.hostName is used for other defaults in the smokeping module (that's why defaultText isn't enough).

@primeos
Copy link
Member Author

@primeos primeos commented Jan 15, 2021

@flokli ok, so the SmokePing example was a pretty good choice as it shows some issues this could cause. In that case defaultText wasn't enough because another option lacks it as well (which I consider a bug/oversight, but such chains can be pretty difficult to track down):

      ownerEmail = mkOption {
        type = types.str;
        default = "no-reply@${cfg.hostName}";
        example = "no-reply@yourdomain.com";
        description = "Email contact for owner";
      };

(this option would need something like defaultText = "no-reply@\${hostName}"; as the module uses in other places)

The fact that this results in an evaluation error is a good thing IMO as this allows us to spot such issues via CI (and even before when testing the changes). But the obvious downside is that the error (message) is pretty difficult to understand when modifying/writing a module (vs. when using the module as a user).

@flokli
Copy link
Contributor

@flokli flokli commented Jan 17, 2021

Given this is mostly intended to be used as tooling in other NixOS modules/configuration we definitely need more documentation on how it can be used. I still don't fully grasp how I'd have to update the smokeping module ;-)

@primeos
Copy link
Member Author

@primeos primeos commented Jan 17, 2021

@flokli I've added a PoC example for SmokePing that should work (at least it passes nix-build nixos/release.nix -A manualHTML.x86_64-linux).

Given this is mostly intended to be used as tooling in other NixOS modules/configuration we definitely need more documentation on how it can be used.

Not sure what to do here as this issue doesn't only affect the new networking.fqdn option. IIRC setting the default of a NixOS option to another NixOS option is discouraged (and IIRC this used to rebuild the various manuals if the user changes the other NixOS option, which would be very undesirable; but then again this wouldn't make much sense with the current behaviour where it failed due to the default of networking.domain suggesting that changing it wouldn't matter when building the manuals).

Has anyone else any ideas or feedback regarding this?

@flokli
Copy link
Contributor

@flokli flokli commented Jan 22, 2021

Let's give this some testing in unstable.

@flokli
Copy link
Contributor

@flokli flokli commented Jan 22, 2021

Hold on, editorconfig is not happy:

Run echo "$PR_DIFF" | xargs ./bin/editorconfig-checker -disable-indent-size
nixos/modules/services/networking/smokeping.nix:
	244: Wrong indent style found (tabs instead of spaces)
	245: Wrong indent style found (tabs instead of spaces)
	246: Wrong indent style found (tabs instead of spaces)
	247: Wrong indent style found (tabs instead of spaces)
	248: Wrong indent style found (tabs instead of spaces)
	249: Wrong indent style found (tabs instead of spaces)
	250: Wrong indent style found (tabs instead of spaces)
	251: Wrong indent style found (tabs instead of spaces)
	252: Wrong indent style found (tabs instead of spaces)
	253: Wrong indent style found (tabs instead of spaces)
	254: Wrong indent style found (tabs instead of spaces)
	255: Wrong indent style found (tabs instead of spaces)

@primeos
Copy link
Member Author

@primeos primeos commented Jan 23, 2021

@ofborg test smokeping

Edit: Fixed the test.

@primeos primeos force-pushed the nixos-add-fqdn-option branch from c4cadcf to 237c20a Jan 23, 2021
Copy link
Contributor

@jonringer jonringer left a comment

diff LGTM

cc people from #94022 @grahamc @flokli @aanderse @worldofpeace

@aanderse
Copy link
Member

@aanderse aanderse commented Jan 24, 2021

It's early enough in the release cycle that this will have a good amount of time for testing still. My gut feeling is we should be careful to not use this new option in modules where the value can't be overridden... but I don't have anything hard to back that up.

I propose merge very soon to maximize testing time, if no one has any problems.

flokli
flokli approved these changes Jan 25, 2021
@flokli flokli merged commit b2f3bd4 into NixOS:master Jan 25, 2021
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

9 participants