systemd-resolved sometimes does not produce IPv4 addresses with Cloudflare DNS #17745
Comments
Can you turn on debug logging in resolved please, and reproduce the issue and paste the generated log output somewhere? For enabling debug logging do "systemctl edit systemd-resolved", then type:
then issue |
I tested it for about half an hour, and I cannot reproduce this now. resolved always returns IPv4 addresses. However, it sometimes doesn't return IPv6 addresses now. I'm attaching the log from one such case. Lack of IPv6 addresses is not a problem for me, but maybe this log will be helpful still. I will keep using Cloudflare DNS for now, in case the problem reproduces. |
I managed to reproduce the problem again, here are the logs:
In order to trigger the issue I had to browse different web sites for about 15 minutes before trying to resolve www.youtube.com. |
I think I've also observed this problem with Google DNS (8.8.8.8, 8.8.4.4) at least once: ru.archive.ubuntu.com resolved to an IPv6 address, which also resulted in a "Network is unreachable" error. Normally, this host name resolves to an IPv4 address. I can also see PS: In this case, Google DNS was configured as DoT. With Cloudflare DNS the issue reproduces the same way in both basic DNS and DoT modes, so DoT is not a factor. |
I should add that this started happening after upgrade from Kubuntu 20.04 (systemd 245.4) to 20.10 (systemd 246.6). |
Is this still reproducible? All logs you attached actually show successful resolutions, i.e. nothing in those logs suggests any look-up failed |
Currently, I have Google DNS configured, and with it the issue reproduces rarely (but still does occasionally). Nothing has changed on my end, no systemd package updates.
How can the resolution fail in the calling application then? |
@poettering we have a case looking very similar - flatcar/Flatcar#374 |
@samm-git not a useful bug report for us, doesn't even mention what resolved version that is... |
@poettering i reported it here - #19118. Please let me know if anything else needed. |
OK, I think this is just use-after-free.
Looking at the code, in bus_method_resolve_hostname_complete() we hit the check Despite changes in the local transport protocol, the scenario seems the same. The big question now is: is this reproducible with a) |
@keszybz thank you for looking into that. I will try to get master running to validate. I think replacing just systemd-resolved should be enough; there is no need to change everything. |
It's fine to run systemd-resolved from the build directory. |
@keszybz I found that master fails to build with some Python-related errors. The problem with Flatcar is that it is an immutable OS, so it's not that easy to hack on things. Maybe I will try to reproduce this on a more traditional distro (Fedora, Debian?) so it would be easier to hack on things and play with builds. |
@samm-git https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/66SUDZOEKPUEMX4UGI3CNKUNP6HMPJTC/ has a list of links to builds for Fedora in various versions. |
I still see this problem with the current Fedora 35 / systemd 249 (v249.9-1.fc35). Same diagnostics in the log and same observed behavior, occasional failures in resolution from apps point of view.
This request to pics.paypal.com actually failed from the browser point of view. |
I have this issue permanently with systemd 250.3 on Arch Linux. Interestingly, it even occurs when I have my router set as the DNS server, as it runs dnsmasq querying 1.1.1.1. Why does this issue then only occur on machines with resolved? |
Not sure, but already fixed by #22132? Could you test the current git HEAD? |
The problem was reproducible without DoT too. |
@yuwata i was not using dot, so unlikely |
I can reproduce it in current Arch Linux (
Valgrind would report this then, wouldn't it? In #19118 (comment) I reproduced the bug while running systemd-resolved in valgrind and get no error messages from valgrind. Is there anything I can do to hunt down this bug? I can semi-reliably reproduce it (takes a few tries each time). |
For anyone wanting to work on this, here is a tip to narrow it down: even when the DNS queries are proxied through dnsmasq, the bug still appears. So dnsmasq relays the problematic answers. |
Still happening with Fedora 39 WS: Host in question is repo.maven.apache.org |
Sometimes (often immediately after boot), systemd-resolved does not produce IPv4 addresses for www.youtube.com, only IPv6 addresses. This happens when systemd-resolved is configured to use Cloudflare DNS (1.1.1.1, 1.0.0.1) in the config file and no link-specific DNS is set. As a result, applications (e.g. ping, traceroute) cannot connect with error "Network is unreachable" because IPv6 is not operational on this machine.
The problem is not stable, as after some delay systemd-resolved may start returning IPv4 addresses along with IPv6. But at other points it may stop returning IPv4 addresses again. I did not notice this problem with hosts other than www.youtube.com, although obviously I cannot know if other problematic hosts exist.
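A minimal Python sketch of how the symptom could be checked from an application's point of view. It assumes the system resolver path (glibc NSS or the local stub listener) ends up at systemd-resolved. The helper names `split_by_family` and `check_host` are hypothetical and only for illustration; `socket.getaddrinfo` is the standard library call.

```python
import socket

def split_by_family(addrinfo):
    """Group getaddrinfo()-style tuples into unique IPv4 and IPv6 address lists."""
    v4, v6 = [], []
    for family, _type, _proto, _canon, sockaddr in addrinfo:
        addr = sockaddr[0]
        if family == socket.AF_INET and addr not in v4:
            v4.append(addr)
        elif family == socket.AF_INET6 and addr not in v6:
            v6.append(addr)
    return v4, v6

def check_host(name, resolver=socket.getaddrinfo):
    """Resolve `name` and report whether only IPv6 addresses came back.

    `resolver` is injectable so the logic can be exercised without a network.
    """
    infos = resolver(name, None, type=socket.SOCK_STREAM)
    v4, v6 = split_by_family(infos)
    return {
        "host": name,
        "ipv4": v4,
        "ipv6": v6,
        # The situation described in this report: IPv6 answers but no IPv4 ones.
        "ipv4_missing": bool(v6) and not v4,
    }
```

Calling `check_host("www.youtube.com")` on an affected machine would be expected to return `ipv4_missing` set to `True` while the problem is occurring, which matches the point at which IPv4-only applications fail with "Network is unreachable".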
Here is a console log of one such occurrence:
The following is the resolve output when IPv4 addresses are returned:
resolvectl status output:
There are no errors in the journal, except:
Although the above messages don't seem to correlate with the problem occurrence.
I do not see this behavior e.g. with Google DNS. Although it doesn't return as many addresses, it always produces IPv4 and IPv6.
systemd version the issue has been seen with
246
Used distribution
Kubuntu 20.10
Linux kernel version used (uname -a)
5.8.0-29-lowlatency
CPU architecture issue was seen on
x86-64
Expected behaviour you didn't see
systemd-resolved should produce IPv4 addresses.
Unexpected behaviour you saw
systemd-resolved sometimes does not produce IPv4 addresses.
Steps to reproduce the problem
Set systemd-resolved config:
In NetworkManager, configure the network connection to not have DNS servers. E.g. set configuration mode to Automatic (only address) and leave DNS address empty.
Reboot.
After booting, immediately issue ping www.youtube.com and resolvectl query www.youtube.com. Usually, this will reproduce the problem, but it may not work every time. After a dozen seconds or so the issue may seem to resolve (i.e. IPv4 addresses appear), but the addresses may disappear again at a later time. I cannot tell when the addresses disappear or what causes that.
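Since it is unclear when the IPv4 addresses disappear, one way to narrow it down is to poll the resolver and log every change. The following is a rough Python sketch, not part of the original report. The helpers `has_ipv4` and `watch` are hypothetical names, and the sketch assumes `socket.getaddrinfo` is answered by systemd-resolved on the affected machine.

```python
import socket
import time
from datetime import datetime

def has_ipv4(name, resolver=socket.getaddrinfo):
    """Return True/False for IPv4 presence, or None if resolution failed."""
    try:
        infos = resolver(name, None)
    except socket.gaierror:
        return None
    return any(info[0] == socket.AF_INET for info in infos)

def watch(name, checks, interval=5.0, resolver=socket.getaddrinfo,
          sleep=time.sleep, log=print):
    """Poll `name` `checks` times and log each change in IPv4 availability.

    Returns the list of observed states, one entry per transition.
    """
    transitions = []
    previous = object()  # sentinel that never equals a real state
    for i in range(checks):
        state = has_ipv4(name, resolver)
        if state != previous:
            stamp = datetime.now().isoformat(timespec="seconds")
            log(f"{stamp} {name}: ipv4_present={state}")
            transitions.append(state)
            previous = state
        if i + 1 < checks:
            sleep(interval)
    return transitions
```

Running something like `watch("www.youtube.com", checks=720, interval=5.0)` right after boot would produce timestamps that can be matched against the systemd-resolved debug log in the journal.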