RFC: resolved: add option to toggle DNS search on hostdomain #22868

Closed

LorbusChris

Since c1a0727, resolved has been adding a search . line to resolv.conf
whenever the domain list is empty, in order to ensure that the FQDN of
the host does not imply a DNS search domain.

In some cases, however, it may be desirable to let the host's FQDN
imply a search domain. In such a case the search . line should not be
written to resolv.conf.

This commit adds a SearchHostDomain= option to resolved's
configuration that can be used to toggle this behavior. Defaults to no.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1874419
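
The proposed behavior can be sketched roughly as follows. This is an illustration in Python, not the patch itself (resolved is written in C); the function name and its parameters are hypothetical, and only the `search .` fallback and the `SearchHostDomain=` toggle come from the description above.

```python
def render_search_line(search_domains, search_host_domain=False):
    """Return the 'search' line for a generated resolv.conf, or None.

    Hypothetical helper: search_domains is the list of configured search
    domains, search_host_domain mirrors the proposed SearchHostDomain= option.
    """
    if search_domains:
        # Explicitly configured domains are always written out.
        return "search " + " ".join(search_domains)
    if search_host_domain:
        # Proposed SearchHostDomain=yes: write no search line at all, so the
        # libc resolver is free to derive a domain from the hostname.
        return None
    # Default (SearchHostDomain=no): pin the search list to the root domain.
    return "search ."
```

With an empty domain list, the default yields `search .`, while the proposed `yes` setting yields no line at all.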

Beware: This is an untested Friday afternoon PR. Only RFC on this approach at this time.

@LorbusChris LorbusChris force-pushed the resolved-search-host-domain branch 2 times, most recently from 84fc834 to 406cbee Compare March 25, 2022 16:05
@LorbusChris LorbusChris changed the title RFC: resolve: add option to toggle DNS search on hostdomain RFC: resolved: add option to toggle DNS search on hostdomain Mar 25, 2022
@poettering
Member

What's the usecase? If you manually reconfigure resolved like this, why not just add the search domain instead? Not grokking this?

@LorbusChris
Author

The specific need is to get rid of this ugly hack in OKD (OpenShift on Fedora CoreOS): https://github.com/openshift/okd-machine-os/blob/5686ca6a95b95437e70c79d9c72cbcdadd736f3e/overlay.d/99okd/usr/lib/systemd/system/fix-resolv-conf-search.service

There was a whole saga surrounding this, i.e. making DNS work in OKD with resolved and NM in tandem, but I'm not too familiar with the details.

AFAICT something adds the correct search domain to /run/systemd/resolve/stub-resolv.conf, but only if search . is not already present there.
I also wonder whether it might suffice to let NM manage /etc/resolv.conf alone in this case (e.g. by symlinking /etc/resolv.conf to /run/NetworkManager/resolv.conf instead - it has been said that won't work but I haven't tested that myself).

CC'ing some folks who were involved and/or might be able to explain this issue better than me: @fortinj66 @vrutkovs @dustymabe

<varlistentry>
<term><varname>SearchHostDomain=</varname></term>
<listitem><para>Takes a boolean argument. If <literal>no</literal> (the default), <command>systemd-resolved</command> will add
the line <emphasis>search .</emphasis> to <filename>/etc/systemd/resolved.conf</filename>. If <literal>yes</literal>, a search

Suggested change
the line <emphasis>search .</emphasis> to <filename>/etc/systemd/resolved.conf</filename>. If <literal>yes</literal>, a search
the line <emphasis>search .</emphasis> to <filename>/run/systemd/resolve/stub-resolv.conf</filename>. If <literal>yes</literal>, a search

@LorbusChris
Author

This PR description has a bit more reasoning in it: openshift/okd-machine-os#158

Maybe this means my above assertion is wrong and the correct search domain isn't added to resolv.conf after all in either case - it just breaks for us when search . is present, because we need the search domain to be implicitly derived from the hostname.

@poettering
Member

Why doesn't OpenShift install the search domain it wants explicitly? Relying on the obscure logic of glibc to derive one search domain off the hostname, if it is set to an FQDN and no search domain is otherwise configured, is just super duper weird. It falls apart as soon as people set manual search domains, as the logic is then disabled in glibc anyway.

Seriously, fix OpenShift on this. If you want a search domain, configure a search domain like everyone else.

@LorbusChris
Author

Ok, I think that's fair. What would be the right way to set the search domain with resolved programmatically? Do I just append a line with Domains=<mydomain> to /etc/systemd/resolved.conf?
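
For the static global case only, one possible shape of this is a drop-in file rather than an edit to the main config. The sketch below is an assumption, not a recommendation from this thread: the helper names and the drop-in file name are hypothetical, and it presumes a `[Resolve]` section with a space-separated `Domains=` list placed under `/etc/systemd/resolved.conf.d/`. It does not cover DNS configuration that arrives via DHCP or NetworkManager.

```python
from pathlib import Path


def render_resolved_dropin(domains):
    """Render a resolved.conf drop-in that sets global search domains."""
    if not domains:
        raise ValueError("at least one search domain is required")
    return "[Resolve]\nDomains=" + " ".join(domains) + "\n"


def write_dropin(root, name, domains):
    """Write the drop-in below <root>/etc/systemd/resolved.conf.d/.

    root is a parameter so the sketch can be tried against a scratch
    directory instead of the live system. The name is hypothetical.
    """
    dropin_dir = Path(root) / "etc/systemd/resolved.conf.d"
    dropin_dir.mkdir(parents=True, exist_ok=True)
    path = dropin_dir / name
    path.write_text(render_resolved_dropin(domains))
    return path
```

On a real system, systemd-resolved would presumably need to be restarted before such a drop-in takes effect.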

@poettering
Member

that really depends on where the DNS config comes from in general. DHCP? static global config? per-network networkd .network files? programmatically via NM?

@LorbusChris
Author

LorbusChris commented Mar 29, 2022

It could be from DHCP or static global, via NM.

@LorbusChris
Author

LorbusChris commented Mar 30, 2022

Adding some more context provided by @fortinj66 over at openshift/okd-machine-os#328 (comment):

> @LorbusChris as far as I know the only case we have issues with search . is when nodes are being provisioned with UPI and kernel arguments with static IPs as there is no kernel argument to pass a DNS search parameter. (Although I suppose if DHCP didn't have it configured it could be an issue also.)
>
> This isn't an OpenShift issue, it's a Fedora/FCOS issue and can be reproduced outside of OpenShift pretty easily.

UPI stands for User Provisioned Infrastructure (i.e. bring your own machines). I guess we can focus on this case for now. If there's a preferred way to set the search domain in such a case, please let me know.

Edit: I'll leave this PR open for now for this discussion. Please feel free to close if that's not wanted.

@poettering
Member

> UPI stands for User Provisioned Infrastructure (i.e. bring your own machines). I guess we can focus on this case for now. If there's a preferred way to set the search domain in such a case, please let me know.

Sorry, I can't parse this. Note I have no clue about OpenShift or "fcos"?

Anyway, there's nothing left to fix here? Instead, maybe post an issue with "fcos" (whatever that is) or OpenShift?

@LorbusChris
Author

LorbusChris commented Apr 6, 2022

My bad, I'm confused by this myself. FCOS stands for Fedora CoreOS (the rpm-ostree based distro).
I've had a few more conversations and this is what seems to be going on:

The machine is provisioned with a static IP and hostname, which are both passed in through the ip= kernel arg. On RHEL CoreOS - which does not use systemd-resolved - the DNS search domain is then automatically derived from the hostname, which enables the machines in the cluster to discover each other. On FCOS however - which uses systemd-resolved in tandem with NetworkManager - the existence of the search . line in resolv.conf prevents this automatic derivation, and the machines are then unable to discover each other. Removing the line fixes this.
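
The interaction described here can be modelled in a few lines. This is a simplified sketch of the libc resolver's behavior as discussed in this thread, not glibc's actual code: the function name is hypothetical, and it ignores other resolver inputs such as the `domain` directive and environment overrides.

```python
def effective_search_list(hostname, resolv_conf_text):
    """Approximate which search list a stub resolver would end up with.

    Simplified model: an explicit 'search' line always wins; only when
    none is present is the domain part of an FQDN hostname used.
    """
    search = None
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "search":
            search = fields[1:]  # a later search line replaces earlier ones
    if search is not None:
        return search
    # No explicit search line: everything after the first dot of the
    # hostname becomes the implicit search domain.
    _, dot, domain = hostname.partition(".")
    return [domain] if dot and domain else []
```

Under this model a host named `node1.cluster.example.com` gets `cluster.example.com` as its search domain when resolv.conf has no search line, and only the root domain once `search .` is present.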

The question now is how to either
A) explicitly configure the DNS search domain at provisioning time, or
B) allow for the automatic derivation to happen on FCOS without resorting to hacking around it with https://github.com/openshift/okd-machine-os/blob/8c7935a25405c9239e310973082bd61dfeded2f3/overlay.d/99okd/usr/lib/systemd/system/fix-resolv-conf-search.service

@fortinj66

Note that this is the code which results in the current behavior. Prior to this change "it just worked"

#17201

@poettering
Member

@LorbusChris it sounds strange to me to rely on that behaviour.

We could probably add a feature in resolved to pick up DNS config from the kernel cmdline. Given that networkd already picks up IP config from there, it would only be natural to pick it up from there too.

or in other words, you then could add systemd.dns-search-domains=foobar.com or so on the kernel cmdline.

Would that work?
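
To make the suggestion concrete, here is a rough sketch of how such an option could be read from a kernel command line string. Everything in it is an assumption: `systemd.dns-search-domains=` is only a proposed name in this thread, the comma separator is a guess, and the function name is hypothetical. It is written in Python for illustration and says nothing about how resolved would implement it.

```python
import shlex

# Hypothetical option name, taken from the proposal in this thread.
OPTION = "systemd.dns-search-domains"


def search_domains_from_cmdline(cmdline):
    """Extract search domains from a kernel command line string.

    Assumes a comma-separated value; if the option appears more than
    once, the last occurrence wins.
    """
    domains = []
    for word in shlex.split(cmdline):
        key, sep, value = word.partition("=")
        if sep and key == OPTION:
            domains = [d for d in value.split(",") if d]
    return domains
```

A provisioning tool could then pass the search domain alongside the existing `ip=` argument instead of relying on derivation from the hostname.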

@fortinj66

As long as that would prevent the search . from being added this would work.

The issue really is the added search . line. It causes problems with OpenShift pods and internal OpenShift DNS search, such that pods cannot resolve addresses in the cluster space.

@LorbusChris
Author

LorbusChris commented Apr 7, 2022

@poettering a systemd.dns-search-domains= kernel cmdline arg sounds great

@poettering
Member

can you file an RFE issue requesting that please?

@poettering
Member

> As long as that would prevent the search . from being added this would work.

Hmm? That's really not the issue here. What I am suggesting is that OpenShift should set the DNS search list explicitly, instead of relying on the weird automatic logic of glibc to derive a search domain from the host's hostname if it is set to an FQDN and no search domains are explicitly configured.

Hence, if we had a new kernel cmdline option you'd just add the search domain there directly, and not have to rely on that. And everything would be dandy, since you don't rely on automatisms that will break once someone configures a manual search domain somewhere.

@LorbusChris
Author

> can you file an RFE issue requesting that please?

will do.

@LorbusChris
Author

Filed #24103
