Android: DNS lookup failure for custom control servers on 1.31.40-t2aade349f-g033f7d87b43 #5698

Closed
PeterCxy opened this issue Sep 20, 2022 · 9 comments · Fixed by #6024

@PeterCxy
Contributor

What is the issue?

Using the F-Droid build version 1.31.40-t2aade349f-g033f7d87b43 of the Android client, I am unable to connect to my custom Headscale control server. The log shows DNS lookup failures, and because the control server is not an allowed domain in the DNS fallback servers, the client is unable to resolve the IP address in any way, and is stuck in the loading state. The previous F-Droid build (1.29.194-t70f9fc8c7-gd0812b9476b) and the Play Store build (1.30.2) do not have the same issue.

A sample of the logs:

09-20 09:14:00.237 18452 18483 I com.tailscale.ipn: 13.7M/114.5M Engine created.
09-20 09:14:00.246 18452 18481 I com.tailscale.ipn: 13.9M/117.6M Start
09-20 09:14:00.396 18452 18481 I com.tailscale.ipn: 15.7M/124.9M using backend prefs for "ipn-android": Prefs{ra=true dns=true want=true exit=9 lan=true url="https://headscale.typeblog.net" host="<redacted>" Persist{lm=, o=, n=[Wjs3H] u="<redacted>"}}
09-20 09:14:00.617 18452 18469 I com.tailscale.ipn: 18.9M/142.9M Backend: logs: be:5b5b7bc8b16ef1cef056e0a0c2354bea30e171a60f6c43e1fdf70cf52e01526f fe:
09-20 09:14:00.619 18452 18482 I com.tailscale.ipn: 18.9M/142.9M control: client.Login(false, 0)
09-20 09:14:00.620 18452 18471 I com.tailscale.ipn: 18.9M/154.4M health("overall"): error: not in map poll
09-20 09:14:00.621 18452 18471 I com.tailscale.ipn: 18.9M/154.4M control: doLogin(regen=false, hasUrl=false)
09-20 09:14:00.626 18452 18483 I com.tailscale.ipn: 18.9M/154.6M SetPrefs: Prefs{ra=true dns=true want=true exit=9 lan=true url="https://headscale.typeblog.net" host="Pdx206_kddi" Persist{lm=, o=, n=[Wjs3H] u="petercxy"}}
09-20 09:14:00.629 18452 18471 I com.tailscale.ipn: trying bootstrapDNS("derp4d.tailscale.com", "134.122.94.167") for "headscale.typeblog.net" ...
09-20 09:14:01.034 18452 18481 I com.tailscale.ipn: trying bootstrapDNS("derp12c.tailscale.com", "2001:19f0:5c01:2cb:5400:3ff:fe8d:cb60") for "headscale.typeblog.net" ...
09-20 09:14:01.036 18452 18481 I com.tailscale.ipn: bootstrapDNS("derp12c.tailscale.com", "2001:19f0:5c01:2cb:5400:3ff:fe8d:cb60") for "headscale.typeblog.net" error: Get "https://derp12c.tailscale.com/bootstrap-dns?q=headscale.typeblog.net": dial tcp [2001:19f0:5c01:2cb:5400:3ff:fe8d:cb60]:443: connect: network is unreachable
09-20 09:14:01.036 18452 18481 I com.tailscale.ipn: trying bootstrapDNS("derp6.tailscale.com", "68.183.90.120") for "headscale.typeblog.net" ...
09-20 09:14:02.126 18452 18488 I com.tailscale.ipn: trying bootstrapDNS("derp9b.tailscale.com", "2001:19f0:6401:eb5:5400:3ff:fe8d:6d9b") for "headscale.typeblog.net" ...
09-20 09:14:02.128 18452 18488 I com.tailscale.ipn: bootstrapDNS("derp9b.tailscale.com", "2001:19f0:6401:eb5:5400:3ff:fe8d:6d9b") for "headscale.typeblog.net" error: Get "https://derp9b.tailscale.com/bootstrap-dns?q=headscale.typeblog.net": dial tcp [2001:19f0:6401:eb5:5400:3ff:fe8d:6d9b]:443: connect: network is unreachable
09-20 09:14:02.129 18452 18488 I com.tailscale.ipn: trying bootstrapDNS("derp11.tailscale.com", "18.230.97.74") for "headscale.typeblog.net" ...
09-20 09:14:02.740 18452 18483 I com.tailscale.ipn: trying bootstrapDNS("derp2f.tailscale.com", "2607:f740:0:3f::f4") for "headscale.typeblog.net" ...
09-20 09:14:02.743 18452 18483 I com.tailscale.ipn: bootstrapDNS("derp2f.tailscale.com", "2607:f740:0:3f::f4") for "headscale.typeblog.net" error: Get "https://derp2f.tailscale.com/bootstrap-dns?q=headscale.typeblog.net": dial tcp [2607:f740:0:3f::f4]:443: connect: network is unreachable
09-20 09:14:02.745 18452 18483 I com.tailscale.ipn: 19.4M/155.6M Received error: fetch control key: Get "https://headscale.typeblog.net/key?v=42": dial tcp: lookup invalid IP: No address associated with hostname
09-20 09:14:02.757 18452 18481 I com.tailscale.ipn: 19.4M/155.6M control: doLogin(regen=false, hasUrl=false)
09-20 09:14:02.760 18452 18481 I com.tailscale.ipn: trying bootstrapDNS("derp8c.tailscale.com", "206.189.16.32") for "headscale.typeblog.net" ...
09-20 09:14:03.147 18452 18488 I com.tailscale.ipn: trying bootstrapDNS("derp12b.tailscale.com", "2001:19f0:5c01:48a:5400:3ff:fe8d:cb5f") for "headscale.typeblog.net" ...
09-20 09:14:03.150 18452 18488 I com.tailscale.ipn: bootstrapDNS("derp12b.tailscale.com", "2001:19f0:5c01:48a:5400:3ff:fe8d:cb5f") for "headscale.typeblog.net" error: Get "https://derp12b.tailscale.com/bootstrap-dns?q=headscale.typeblog.net": dial tcp [2001:19f0:5c01:48a:5400:3ff:fe8d:cb5f]:443: connect: network is unreachable
09-20 09:14:03.151 18452 18488 I com.tailscale.ipn: trying bootstrapDNS("derp12b.tailscale.com", "45.63.71.144") for "headscale.typeblog.net" ...
09-20 09:14:03.283 18452 18481 I com.tailscale.ipn: trying bootstrapDNS("derp9c.tailscale.com", "2001:19f0:6401:fe7:5400:3ff:fe8d:6d9c") for "headscale.typeblog.net" ...
09-20 09:14:03.286 18452 18481 I com.tailscale.ipn: bootstrapDNS("derp9c.tailscale.com", "2001:19f0:6401:fe7:5400:3ff:fe8d:6d9c") for "headscale.typeblog.net" error: Get "https://derp9c.tailscale.com/bootstrap-dns?q=headscale.typeblog.net": dial tcp [2001:19f0:6401:fe7:5400:3ff:fe8d:6d9c]:443: connect: network is unreachable
09-20 09:14:03.287 18452 18481 I com.tailscale.ipn: trying bootstrapDNS("derp9b.tailscale.com", "144.202.67.195") for "headscale.typeblog.net" ...
09-20 09:14:03.458 18452 18483 I com.tailscale.ipn: trying bootstrapDNS("derp5.tailscale.com", "2001:19f0:5801:10b7:5400:2ff:feaa:284c") for "headscale.typeblog.net" ...
09-20 09:14:03.461 18452 18483 I com.tailscale.ipn: bootstrapDNS("derp5.tailscale.com", "2001:19f0:5801:10b7:5400:2ff:feaa:284c") for "headscale.typeblog.net" error: Get "https://derp5.tailscale.com/bootstrap-dns?q=headscale.typeblog.net": dial tcp [2001:19f0:5801:10b7:5400:2ff:feaa:284c]:443: connect: network is unreachable
09-20 09:14:03.464 18452 18483 I com.tailscale.ipn: 19.9M/155.6M Received error: fetch control key: Get "https://headscale.typeblog.net/key?v=42": dial tcp: lookup invalid IP: No address associated with hostname

The connection failures to IPv6 fallback DNS can be safely ignored (I think). And here is a sample of the logs when it works (on older / Play Store builds):

09-20 09:40:26.671  4196  4241 I com.tailscale.ipn: 14.1M/100.2M Start
09-20 09:40:26.801  4196  4241 I com.tailscale.ipn: 15.9M/107.7M using backend prefs for "ipn-android": Prefs{ra=true dns=true want=false url="https://headscale.typeblog.net" host="Pdx206_kddi" Persist=nil}
09-20 09:40:26.907  4196  4241 I com.tailscale.ipn: 17.0M/118.4M Backend: logs: be:85902109a272bca64521a3f034f859a55a59b04cf9e62fdeff3a054b938bb0fb fe:
09-20 09:40:26.908  4196  4229 I com.tailscale.ipn: 17.0M/118.7M Switching ipn state NoState -> NeedsLogin (WantRunning=false, nm=false)
09-20 09:40:26.910  4196  4229 I com.tailscale.ipn: 17.0M/119.1M blockEngineUpdates(true)
09-20 09:40:26.911  4196  4229 I com.tailscale.ipn: 17.0M/118.7M health("overall"): error: state=NeedsLogin, wantRunning=false
09-20 09:40:26.911  4196  4229 I com.tailscale.ipn: 17.0M/119.1M SetPrefs: Prefs{ra=true dns=true want=false url="https://headscale.typeblog.net" host="Pdx206_kddi" Persist=nil}
09-20 09:40:26.912  4196  4241 I com.tailscale.ipn: 17.0M/120.2M wgengine: Reconfig: configuring userspace WireGuard config (with 0/0 peers)
09-20 09:40:26.913  4196  4241 I com.tailscale.ipn: 17.0M/120.2M wgengine: Reconfig: configuring router
09-20 09:40:26.933  4196  4241 I com.tailscale.ipn: 17.3M/136.6M wgengine: Reconfig: configuring DNS
09-20 09:40:26.934  4196  4241 I com.tailscale.ipn: 17.3M/136.6M dns: Set: {DefaultResolvers:[] Routes:{} SearchDomains:[] Hosts:0}
09-20 09:40:26.936  4196  4241 I com.tailscale.ipn: 17.3M/137.1M dns: Resolvercfg: {Routes:{} Hosts:0 LocalDomains:[]}
09-20 09:40:26.936  4196  4241 I com.tailscale.ipn: 17.3M/137.1M dns: OScfg: {Hosts:[] Nameservers:[] SearchDomains:[] MatchDomains:[]}
09-20 09:40:28.418  4196  4235 I com.tailscale.ipn: 19.9M/141.5M StartLoginInteractive: url=false
09-20 09:40:28.419  4196  4235 I com.tailscale.ipn: 19.9M/141.5M control: client.Login(false, 2)
09-20 09:40:28.506  4196  4235 I com.tailscale.ipn: 22.1M/141.9M control: LoginInteractive -> regen=true
09-20 09:40:28.507  4196  4235 I com.tailscale.ipn: 22.1M/141.9M control: doLogin(regen=true, hasUrl=false)
09-20 09:40:28.984  4196  4241 I com.tailscale.ipn: 18.3M/144.2M control: control server key from https://headscale.typeblog.net: ts2021=[rLE7X], legacy=[e/ZgC]
09-20 09:40:28.985  4196  4241 I com.tailscale.ipn: 18.3M/144.2M control: Generating a new nodekey.
09-20 09:40:28.987  4196  4241 I com.tailscale.ipn: 18.4M/144.2M control: RegisterReq: onode= node=[c75GI] fup=false
09-20 09:40:30.025  4196  4241 I com.tailscale.ipn: 18.6M/144.4M control: RegisterReq: got response; nodeKeyExpired=false, machineAuthorized=false; authURL=true
09-20 09:40:30.026  4196  4241 I com.tailscale.ipn: 18.6M/144.4M control: AuthURL is <redacted>
09-20 09:40:30.027  4196  4241 I com.tailscale.ipn: 18.6M/144.4M Received auth URL: https://headscale.ty...

Note that in this case, the fallback resolvers do not seem to be used whatsoever. In both cases, the DefaultResolvers array is empty.
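For reference, each fallback attempt in the failing log is an HTTPS GET against a DERP host, with the name to resolve passed in the `q` parameter. The sketch below only reproduces that URL shape as it appears in the log lines; the function name `bootstrapDNSURL` is hypothetical, and nothing here is taken from the Tailscale source.

```go
package main

import (
	"fmt"
	"net/url"
)

// bootstrapDNSURL builds a URL of the shape seen in the logs above:
// https://<derp-host>/bootstrap-dns?q=<name>.
// The function name is hypothetical; only the URL shape comes from the logs.
func bootstrapDNSURL(derpHost, name string) string {
	u := url.URL{
		Scheme:   "https",
		Host:     derpHost,
		Path:     "/bootstrap-dns",
		RawQuery: url.Values{"q": {name}}.Encode(),
	}
	return u.String()
}

func main() {
	fmt.Println(bootstrapDNSURL("derp12c.tailscale.com", "headscale.typeblog.net"))
}
```

The log also shows the client dialing a literal IP for each DERP host (`dial tcp [...]:443`), which is why the IPv6 attempts fail with "network is unreachable" on an IPv4-only network.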

Steps to reproduce

  1. Install Tailscale version 1.31.40-t2aade349f-g033f7d87b43 from F-Droid
  2. Try to connect to a custom control server
  3. Observe DNS resolution failure in adb logs
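
To make step 3 easier to check, a saved log can be summarized by counting the three line patterns quoted in this report. This is a minimal sketch using plain substring matching; the type and function names are hypothetical, and the patterns are copied from the log excerpts above rather than from any documented log format.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// logSummary counts the log patterns quoted in this report.
type logSummary struct {
	BootstrapAttempts int // lines containing "trying bootstrapDNS("
	BootstrapErrors   int // other bootstrapDNS lines containing " error: "
	ControlKeyErrors  int // lines containing "fetch control key"
}

// summarize classifies each line by plain substring matching.
func summarize(lines []string) logSummary {
	var s logSummary
	for _, l := range lines {
		switch {
		case strings.Contains(l, "trying bootstrapDNS("):
			s.BootstrapAttempts++
		case strings.Contains(l, "bootstrapDNS(") && strings.Contains(l, " error: "):
			s.BootstrapErrors++
		case strings.Contains(l, "fetch control key"):
			s.ControlKeyErrors++
		}
	}
	return s
}

func main() {
	// Reads a saved log from stdin, for example the output of `adb logcat -d`.
	var lines []string
	sc := bufio.NewScanner(os.Stdin)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", summarize(lines))
}
```

A failing run like the one above shows repeated control key errors alongside the bootstrap attempts; the working log shows neither.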

Are there any recent changes that introduced the issue?

No response

OS

Android

OS version

AOSP 13

Tailscale version

1.31.40-t2aade349f-g033f7d87b43

Bug report

No response

@DentonGentry
Contributor

When DNS lookup is failing, do you see logcat messages like this:

2022-10-01 01:00:00.409416873 +0000 UTC: getDnsConfigFromLinkProperties:
2022-10-01 01:00:00.409887291 +0000 UTC: getDnsServersFromSystemProperties:
2022-10-01 01:00:00.410162315 +0000 UTC: getDnsServersFromNetworkInfo:

Does it find any DNS servers?

@eriol

eriol commented Oct 19, 2022

@DentonGentry thanks for the hint!

I have a similar setup (with Headscale as the control node) and I started experiencing the issue described by @PeterCxy after updating to the latest Tailscale Android version from the Play Store: 1.32.0-tfc688fe02-g13fc35a8bd5.

Looking at logcat logs while trying to log in from android I did not see messages like these:

2022-10-01 01:00:00.409416873 +0000 UTC: getDnsConfigFromLinkProperties:
2022-10-01 01:00:00.409887291 +0000 UTC: getDnsServersFromSystemProperties:
2022-10-01 01:00:00.410162315 +0000 UTC: getDnsServersFromNetworkInfo:

only messages like the ones reported by @PeterCxy.

Downgrading to F-Droid build 1.29.194-t70f9fc8c7-gd0812b9476b solved the issue for me.

Hope this helps.

@PeterCxy
Contributor Author

I can also confirm that there are no logcat messages like the ones given above. The issue seems to be present in builds after 1.31 when using self-hosted Headscale servers (with the official control plane, the client falls back to the DERP servers for name resolution and everything works).

@foxtrot

foxtrot commented Oct 22, 2022

I'm running into this issue on Android 13 with the Play Store app too.

I believe I'm also experiencing this on macOS via the App Store app and the https://pkgs.tailscale.com bundles. Versions before 1.32.0 work, but 1.32.0 and later fail to connect to the custom control plane. On Mac, I see:

IPNExtension: Received error: fetch control key: Get "https://my.headscale:443/key?v=46": dial tcp: lookup invalid IP: no such host
IPNExtension    trying bootstrapDNS("derp3.tailscale.com", "2400:6180:0:d1::67d:8001") for "my.headscale" ...
IPNExtension    bootstrapDNS("derp3.tailscale.com", "2400:6180:0:d1::67d:8001") for "my.headscale" error: Get "https://derp3.tailscale.com/bootstrap-dns?q=my.headscale": dial tcp [2400:6180:0:d1::67d:8001]:443: connect: no route to host
(Repeats infinitely, with lots of different DERP servers and IPs)

The domain I'm using for the control server does in fact resolve and works with the earlier releases as mentioned.

@PeterCxy
Contributor Author

PeterCxy commented Oct 22, 2022

Update: It seems that this error only happens when IPv6 is not available, at least on my devices. My home Wi-Fi has IPv6, so it works fine there, but when I am on mobile data or on a Wi-Fi AP without IPv6, it fails with the error log given in the original bug report. I'm not sure whether this is related to all the connection failures in bootstrapDNS, because I assume those should fail either way, with or without an IPv6 connection (self-hosted Headscale servers are not whitelisted by DERP nodes).

EDIT: However, in the IPv6 environment, the bootstrapDNS logs simply do not show up, but I still do not have getDnsConfigFromLinkProperties and friends. I still have

10-22 14:29:16.471 18879 18902 I com.tailscale.ipn.debug: 20.8M/135.4M dns: Set: {DefaultResolvers:[] Routes:{} SearchDomains:[] Hosts:0}
10-22 14:29:16.472 18879 18902 I com.tailscale.ipn.debug: 20.8M/135.4M dns: Resolvercfg: {Routes:{} Hosts:0 LocalDomains:[]}
10-22 14:29:16.475 18879 18902 I com.tailscale.ipn.debug: 20.8M/135.4M dns: OScfg: {Nameservers:[] SearchDomains:[] MatchDomains:[] Hosts:[]}

when it is working with an IPv6 address and DNS.

@PeterCxy
Contributor Author

PeterCxy commented Oct 22, 2022

I believe that those getDnsConfigFromLinkProperties messages should NOT be expected to be present, because we have not even logged in yet and the compileConfig() function returns right here before even calling GetBaseConfig. But even if this early return is removed and GetBaseConfig is called, the issue is still not resolved, even though GetBaseConfig is able to find some system DNS servers.

@foxtrot

foxtrot commented Oct 22, 2022

> Update: It seems that this error only happens when IPv6 is not available, at least on my devices. My home Wi-Fi has IPv6, so it works fine with home Wi-Fi, but when outside on data or on a Wi-Fi AP without IPv6, it fails with the error log given in the original bug report.

Confirming this behaviour here too. My home ISP doesn't have IPv6 enabled yet, and the client fails there. If I switch to my phone's 5G connection (which does have IPv6), the client works as expected.

PeterCxy added a commit to PeterCxy/tailscale that referenced this issue Oct 22, 2022
On Android, the system resolver can return IPv4 addresses as IPv6-mapped
addresses (i.e. `::ffff:a.b.c.d`). After the switch to `net/netip`
(19008a3), this case is no longer handled and a response like this will
be seen as failure to resolve any IPv4 addresses.

Handle this case by simply calling `Unmap()` on the returned IP when it
is a 4-in-6 address. Fixes tailscale#5698.
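
The behavior change described in this commit message can be reproduced with the standard library alone. The sketch below compares how the older `net.IP` type and the newer `netip.Addr` type classify an IPv4-mapped IPv6 address. The helper names are hypothetical, `192.0.2.1` is a documentation address rather than one from this issue, and whether the pre-`net/netip` Tailscale code relied on `To4` is an assumption, not something the thread confirms.

```go
package main

import (
	"fmt"
	"net"
	"net/netip"
)

// isIPv4Legacy parses s with the older net.IP type and reports whether
// To4 yields an IPv4 address. To4 also accepts the IPv4-mapped IPv6 form.
func isIPv4Legacy(s string) bool {
	ip := net.ParseIP(s)
	return ip != nil && ip.To4() != nil
}

// isIPv4Netip parses s with net/netip and reports whether the address is
// a plain IPv4 address. Is4 is false for IPv4-mapped IPv6 addresses.
func isIPv4Netip(s string) bool {
	a, err := netip.ParseAddr(s)
	return err == nil && a.Is4()
}

func main() {
	const mapped = "::ffff:192.0.2.1" // documentation address, not from the issue
	fmt.Println(isIPv4Legacy(mapped)) // true
	fmt.Println(isIPv4Netip(mapped))  // false
}
```

Code that selects IPv4 results with `Is4()` would therefore skip a mapped address, which matches the "failure to resolve any IPv4 addresses" described above.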
@PeterCxy
Contributor Author

Following the IPv6 clue, it turns out that this is another case of unexpected system resolver behavior (returning IPv4 mapped in IPv6, i.e. ::ffff:a.b.c.d). This should be fixed with PR #6024.
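
A minimal sketch of the normalization idea, assuming the resolver results are available as a slice of `netip.Addr`: calling `Unmap()` on every address turns `::ffff:a.b.c.d` into plain `a.b.c.d` and leaves other addresses unchanged. The helper names and sample addresses are hypothetical, and this is not the code from PR #6024.

```go
package main

import (
	"fmt"
	"net/netip"
)

// mustParseAll parses each string into a netip.Addr and panics on bad input.
// It exists only to keep the example short.
func mustParseAll(ss ...string) []netip.Addr {
	out := make([]netip.Addr, 0, len(ss))
	for _, s := range ss {
		out = append(out, netip.MustParseAddr(s))
	}
	return out
}

// unmapAll returns a copy of addrs in which every IPv4-mapped IPv6 address
// (::ffff:a.b.c.d) is replaced by its plain IPv4 form. Unmap returns all
// other addresses unchanged.
func unmapAll(addrs []netip.Addr) []netip.Addr {
	out := make([]netip.Addr, 0, len(addrs))
	for _, a := range addrs {
		out = append(out, a.Unmap())
	}
	return out
}

// countIPv4 counts the addresses that netip reports as plain IPv4.
func countIPv4(addrs []netip.Addr) int {
	n := 0
	for _, a := range addrs {
		if a.Is4() {
			n++
		}
	}
	return n
}

func main() {
	// Hypothetical resolver output: one mapped IPv4 address, one native IPv6 address.
	resolved := mustParseAll("::ffff:192.0.2.1", "2001:db8::1")
	fmt.Println(countIPv4(resolved))           // 0
	fmt.Println(countIPv4(unmapAll(resolved))) // 1
}
```

On a network without IPv6, the unmapped IPv4 address is the only usable result, which is consistent with the failures reported above on IPv4-only connections.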

PeterCxy added a commit to PeterCxy/tailscale that referenced this issue Oct 23, 2022
On Android, the system resolver can return IPv4 addresses as IPv6-mapped
addresses (i.e. `::ffff:a.b.c.d`). After the switch to `net/netip`
(19008a3), this case is no longer handled and a response like this will
be seen as failure to resolve any IPv4 addresses.

Handle this case by simply calling `Unmap()` on the returned IPs. Fixes tailscale#5698.

Signed-off-by: Peter Cai <peter@typeblog.net>
bradfitz pushed a commit that referenced this issue Oct 23, 2022
On Android, the system resolver can return IPv4 addresses as IPv6-mapped
addresses (i.e. `::ffff:a.b.c.d`). After the switch to `net/netip`
(19008a3), this case is no longer handled and a response like this will
be seen as failure to resolve any IPv4 addresses.

Handle this case by simply calling `Unmap()` on the returned IPs. Fixes #5698.

Signed-off-by: Peter Cai <peter@typeblog.net>
DentonGentry pushed a commit that referenced this issue Oct 26, 2022
On Android, the system resolver can return IPv4 addresses as IPv6-mapped
addresses (i.e. `::ffff:a.b.c.d`). After the switch to `net/netip`
(19008a3), this case is no longer handled and a response like this will
be seen as failure to resolve any IPv4 addresses.

Handle this case by simply calling `Unmap()` on the returned IPs. Fixes #5698.

Signed-off-by: Peter Cai <peter@typeblog.net>
(cherry picked from commit 4597ec1)
@DentonGentry
Contributor

Android app 1.32.2 has been released in the Play Store containing this fix.
