DNS resolution not working after turning exit node #3842

Closed
cepera-ang opened this issue Jan 30, 2022 · 17 comments
Labels
bug (Bug), dns, exit-node (Exit node related), L2 Few (Likelihood), P2 Aggravating (Priority level), T5 Usability (Issue type)

Comments

@cepera-ang

What is the issue?

After updating from Tailscale 1.18 to Tailscale 1.20.2, I can no longer use the exit node functionality. I have an Ubuntu cloud machine as the exit node (named vpn) and a Windows machine. After enabling the exit node on Windows, all DNS requests go to 100.100.100.100 and die with a timeout. The same requests on the older version work flawlessly.

For example: Windows, 1.20.2, exit node off:

λ nslookup github.com
Server:  one.one.one.one
Address:  1.1.1.1

Non-authoritative answer:
Name:    github.com
Address:  140.82.121.3

Windows, 1.20.2, exit node on:

λ nslookup github.com
Server:  UnKnown
Address:  100.100.100.100

DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 2 seconds.
DNS request timed out.
    timeout was 2 seconds.

Linux, 1.18.2, exit node on or off (same result):

user@user-pi:~$ nslookup github.com 100.100.100.100
Server:         100.100.100.100
Address:        100.100.100.100#53

Non-authoritative answer:
Name:   github.com
Address: 140.82.121.4

Linux, 1.20.2, exit node on:

user@user-pi:~$ tailscale version
1.20.2
  tailscale commit: 312750ddd288cf4073cfaef56a45102b9c1e8421
  other commit: 2c164d9c7443e2f3014fa54ea45e946b35152680
  go version: go1.17.6-tse44d304e54
user@user-pi:~$ nslookup github.com 100.100.100.100
;; connection timed out; no servers could be reached

Anyway, it seems 100.100.100.100 isn't working anywhere on 1.20.2 for me.

I see some changes related to DNS and exit nodes in the release notes. Is there some configuration I have to do to get this working again?

Steps to reproduce

No response

Are there any recent changes that introduced the issue?

Updated Tailscale everywhere to the latest version.

OS

Linux, Windows

OS version

Ubuntu 20.04.3 LTS (GNU/Linux 5.11.0-1027-oracle aarch64), Microsoft Windows [Version 10.0.19044.1466]

Tailscale version

1.20.2

Bug report

BUG-c2f835af9713719097081eaf7976601903d023065d119901ad8e2e1799922664-20220130093427Z-bd88e452804a0817

@Murgeye commented Jan 30, 2022

I had the same problem, and it turned out to be a firewall issue. After setting

ufw allow in on tailscale0 to any port 64707

on the exit node, DNS works again.
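
Note that the peerapi port differs per node (64707 here), so use whatever port shows up in your own tailscaled logs. As a quick reachability check from a client machine (a sketch, assuming netcat is installed; substitute your exit node's Tailscale IP and its peerapi port):

# Replace the IP and port with the ones seen in your own logs.
nc -vz 100.127.227.19 38798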

@cepera-ang
Author

Well, I have no firewall running on the exit node at all. There is also nothing in the exit node's logs indicating any kind of connection attempt.

OK, I looked at the logs on both sides, and this is what appears on the client side (Tailscale on Ubuntu on a Raspberry Pi) while trying to make a lookup:

Jan 30 20:54:59 user-pi tailscaled[884]: dns: error: Post "http://100.127.227.19:38798/dns-query": dial tcp 100.127.227.19:38798: connect: no route to host
Jan 30 20:54:59 user-pi tailscaled[884]: open-conn-track: timeout opening (TCP 100.102.69.110:46312 => 100.127.227.19:38798) to node [Vdhdo]; online=yes, lastRecv=5s
Jan 30 20:55:04 user-pi tailscaled[884]: dns: error: Post "http://100.127.227.19:38798/dns-query": dial tcp 100.127.227.19:38798: connect: no route to host
Jan 30 20:55:04 user-pi tailscaled[884]: open-conn-track: timeout opening (TCP 100.102.69.110:46314 => 100.127.227.19:38798) to node [Vdhdo]; online=yes, lastRecv=10s
Jan 30 20:55:04 user-pi tailscaled[884]: dns: error: Post "http://100.127.227.19:38798/dns-query": dial tcp 100.127.227.19:38798: connect: no route to host
Jan 30 20:55:04 user-pi tailscaled[884]: open-conn-track: timeout opening (TCP 100.102.69.110:46316 => 100.127.227.19:38798) to node [Vdhdo]; online=yes, lastRecv=10s
Jan 30 20:55:09 user-pi tailscaled[884]: open-conn-track: timeout opening (TCP 100.102.69.110:46318 => 100.127.227.19:38798) to node [Vdhdo]; online=yes, lastRecv=5s
Jan 30 20:55:09 user-pi tailscaled[884]: dns: error: Post "http://100.127.227.19:38798/dns-query": dial tcp 100.127.227.19:38798: connect: no route to host
Jan 30 20:55:10 user-pi tailscaled[884]: Accept: TCP{100.102.69.110:46320 > 100.127.227.19:38798} 60 ok out
Jan 30 20:55:15 user-pi tailscaled[884]: open-conn-track: timeout opening (TCP 100.102.69.110:46320 => 100.127.227.19:38798) to node [Vdhdo]; online=yes, lastRecv=10s
Jan 30 20:55:15 user-pi tailscaled[884]: dns: error: Post "http://100.127.227.19:38798/dns-query": dial tcp 100.127.227.19:38798: connect: no route to host
Jan 30 20:55:20 user-pi tailscaled[884]: dns: error: Post "http://100.127.227.19:38798/dns-query": dial tcp 100.127.227.19:38798: connect: no route to host
Jan 30 20:55:20 user-pi tailscaled[884]: open-conn-track: timeout opening (TCP 100.102.69.110:46322 => 100.127.227.19:38798) to node [Vdhdo]; online=yes, lastRecv=5s
Jan 30 20:55:20 user-pi tailscaled[884]: Accept: TCP{100.102.69.110:46324 > 100.127.227.19:38798} 60 ok out
Jan 30 20:55:25 user-pi tailscaled[884]: dns: error: Post "http://100.127.227.19:38798/dns-query": dial tcp 100.127.227.19:38798: connect: no route to host
Jan 30 20:55:25 user-pi tailscaled[884]: open-conn-track: timeout opening (TCP 100.102.69.110:46324 => 100.127.227.19:38798) to node [Vdhdo]; online=yes, lastRecv=10s
Jan 30 20:55:30 user-pi tailscaled[884]: portmapper: saw UPnP type WANIPConnection1 at http://192.168.88.1:2828/gateway.xml; MikroTik Router (MikroTik)
Jan 30 20:55:30 user-pi tailscaled[884]: open-conn-track: timeout opening (TCP 100.102.69.110:46328 => 100.127.227.19:38798) to node [Vdhdo]; online=yes, lastRecv=5s
Jan 30 20:55:30 user-pi tailscaled[884]: dns: error: Post "http://100.127.227.19:38798/dns-query": dial tcp 100.127.227.19:38798: connect: no route to host
Jan 30 20:55:31 user-pi tailscaled[884]: Accept: TCP{100.102.69.110:46330 > 100.127.227.19:38798} 60 ok out
Jan 30 20:55:36 user-pi tailscaled[884]: dns: error: Post "http://100.127.227.19:38798/dns-query": dial tcp 100.127.227.19:38798: connect: no route to host
Jan 30 20:55:36 user-pi tailscaled[884]: open-conn-track: timeout opening (TCP 100.102.69.110:46330 => 100.127.227.19:38798) to node [Vdhdo]; online=yes, lastRecv=10s
Jan 30 20:55:41 user-pi tailscaled[884]: open-conn-track: timeout opening (TCP 100.102.69.110:46332 => 100.127.227.19:38798) to node [Vdhdo]; online=yes, lastRecv=5s
Jan 30 20:55:41 user-pi tailscaled[884]: dns: error: Post "http://100.127.227.19:38798/dns-query": dial tcp 100.127.227.19:38798: connect: no route to host
Jan 30 20:55:41 user-pi tailscaled[884]: Accept: TCP{100.102.69.110:46334 > 100.127.227.19:38798} 60 ok out
Jan 30 20:55:46 user-pi tailscaled[884]: open-conn-track: timeout opening (TCP 100.102.69.110:46334 => 100.127.227.19:38798) to node [Vdhdo]; online=yes, lastRecv=10s
Jan 30 20:55:46 user-pi tailscaled[884]: dns: error: Post "http://100.127.227.19:38798/dns-query": dial tcp 100.127.227.19:38798: connect: no route to host
Jan 30 20:55:51 user-pi tailscaled[884]: dns: error: Post "http://100.127.227.19:38798/dns-query": dial tcp 100.127.227.19:38798: connect: no route to host
Jan 30 20:55:51 user-pi tailscaled[884]: open-conn-track: timeout opening (TCP 100.102.69.110:46336 => 100.127.227.19:38798) to node [Vdhdo]; online=yes, lastRecv=5s
Jan 30 20:55:52 user-pi tailscaled[884]: Accept: TCP{100.102.69.110:46338 > 100.127.227.19:38798} 60 ok out
Jan 30 20:55:52 user-pi tailscaled[884]: portmapper: saw UPnP type WANIPConnection1 at http://192.168.88.1:2828/gateway.xml; MikroTik Router (MikroTik)

Strangely, pinging that IP address works just fine:

user@user-pi:~$ ping 100.127.227.19
PING 100.127.227.19 (100.127.227.19) 56(84) bytes of data.
64 bytes from 100.127.227.19: icmp_seq=1 ttl=64 time=97.5 ms
64 bytes from 100.127.227.19: icmp_seq=2 ttl=64 time=97.6 ms
^C
--- 100.127.227.19 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 97.545/97.578/97.611/0.033 ms

Trace also works:

user@user-pi:~$ tracepath 100.127.227.19
 1?: [LOCALHOST]                      pmtu 1280
 1:  vpn.***.beta.tailscale.net          98.382ms !H
 1:  vpn.***.beta.tailscale.net          98.104ms !H
     Resume: pmtu 1280

However, telnet shows:

user@user-pi:~$ telnet  100.127.227.19  38798
Trying 100.127.227.19...
telnet: Unable to connect to remote host: No route to host

Honestly, I've never encountered anything like this: "no route to host" for telnet while ping works just fine. What?
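
(What later turned out to be happening here, see below, is the exit node's host firewall rejecting the peerapi TCP connection with ICMP "admin prohibited" errors, which the kernel surfaces as "no route to host" even though plain ICMP echo is allowed. A way to watch for those rejections from the client side, sketched on the assumption that tcpdump is available and the Tailscale interface is named tailscale0:)

# Watch for ICMP errors coming back from the exit node while retrying the lookup.
sudo tcpdump -ni tailscale0 icmp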

@cepera-ang (Author)

There is another interesting piece of information in the logs:

Jan 30 21:19:23 user-pi tailscaled[884]: dns: error: Post "http://100.127.227.19:38798/dns-query": dial tcp 100.127.227.19:38798: connect: no route to host

Jan 30 21:19:23 user-pi tailscaled[884]: open-conn-track: timeout opening (TCP 100.102.69.110:46430 => 100.127.227.19:38798); target node [Vdhdo] in netmap but unknown to wireguard

@bradfitz added the dns and exit-node (Exit node related) labels Jan 30, 2022
@bradfitz (Member)

"in netmap but unknown to wireguard" is definitely weird. Also in the logs:

2022-01-30 17:33:42.0712399 +0800 +0800: IPv4 packet with disallowed source address from [Vdhdo]
2022-01-30 17:33:53.0831781 +0800 +0800: IPv4 packet with disallowed source address from [Vdhdo]
2022-01-30 17:34:16.3743425 +0800 +0800: IPv4 packet with disallowed source address from [Vdhdo]

Hopefully I'll find time to investigate soon.

bradfitz added a commit that referenced this issue Jan 31, 2022
We're finding a bunch of host operating systems/firewalls interact poorly
with peerapi. We either get ICMP errors from the host or users need to run
commands to allow the peerapi port:

#3842 (comment)

... even though the peerapi should be an internal implementation detail.

Rather than fight the host OS & firewalls, this change handles the
server side of peerapi entirely in netstack (except on iOS), so it
never makes its way to the host OS where it might be messed with. Two
main downsides are:

1) netstack isn't as fast, but we don't really need speed for peerapi.
   And actually, with fewer trips to/from the kernel, we might
   actually make up for some of the netstack performance loss by
   staying in userspace.

2) tcpdump / Wireshark etc packet captures will no longer see the peerapi
   traffic. Oh well. Crawshaw's been wanting to add packet capture server
   support to tailscaled, so we'll probably do that sooner now.

A future change might also then use peerapi for the client-side
(except on iOS).

Updates #3842 (probably fixes, as well as many exit node issues I bet)

Change-Id: Ibc25edbb895dc083d1f07bd3cab614134705aa39
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
@bradfitz (Member)

@JayWStapleton was investigating https://forum.tailscale.com/t/exit-node-on-oracle-oci/1662/7 and was able to reproduce the "connect: no route to host" error.

It turned out to be ICMP errors from the exit node:

16:01:26.301186 IP 100.79.194.93 > 100.81.38.34: ICMP host 100.79.194.93 unreachable - admin prohibited, length 68
16:01:43.916036 IP 100.79.194.93 > 100.81.38.34: ICMP host 100.79.194.93 unreachable - admin prohibited, length 68
16:01:44.411897 IP 100.79.194.93 > 100.81.38.34: ICMP host 100.79.194.93 unreachable - admin prohibited, length 68

We can just handle peerapi entirely in netstack, though: I sent #3851.
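
(For anyone hitting this before the fix lands: Oracle Cloud's stock Ubuntu images typically ship default iptables rules ending in a REJECT entry with reject-with icmp-host-prohibited, which produces exactly these ICMP errors even when ufw isn't running. A quick way to check on the exit node, assuming iptables is in use:)

# List the INPUT chain; look for a REJECT rule using icmp-host-prohibited.
sudo iptables -L INPUT -n --line-numbers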

bradfitz added a commit that referenced this issue Jan 31, 2022
(same commit message as above)
@cepera-ang (Author)

Great, I use Ubuntu on OCI too.

bradfitz added a commit that referenced this issue Jan 31, 2022
(same commit message as above)
@bradfitz (Member)

Okay, the fix is in the latest unstable build, in version 1.21.43 or later.

Can somebody try it out? @cepera-ang?

We're not sure whether we'll backport it to the 1.20.x branch yet; first we want to see how many people's problems it fixes.
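
(One way to try the unstable build on an Ubuntu/Debian exit node is to point apt at the unstable track; this sketch assumes the unstable channel at pkgs.tailscale.com mirrors the stable layout and that the repo was added at the standard path, so adjust to your setup:)

# Assumption: tailscale apt source lives at the standard path; switch it to the unstable track.
sudo sed -i 's|pkgs.tailscale.com/stable|pkgs.tailscale.com/unstable|' /etc/apt/sources.list.d/tailscale.list
sudo apt-get update && sudo apt-get install tailscale
tailscale version   # should report 1.21.43 or later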

@cepera-ang (Author)

Yep, quick testing shows that it's working now (I updated only the exit node).

@mrzv commented Jan 31, 2022

Same here. Updating to that unstable build on the exit node fixed the problem.

@bradfitz (Member)

@mrzv, which OS was your exit node before?

And @Murgeye, once you update to that build, you won't need your ufw firewall updates, as we no longer even give the host operating system a chance to see this traffic, so it can't be blocked.
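
(For reference, the rule added earlier can be removed once the exit node is on the fixed build; ufw supports delete-by-rule syntax, so something like the following should undo it, adjusting the port to whatever you allowed:)

sudo ufw delete allow in on tailscale0 to any port 64707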

@mrzv commented Jan 31, 2022

Linux (Arch Linux, to be precise)

@Murgeye commented Feb 2, 2022

Can confirm that the ufw rules are unnecessary in the current unstable build.

bradfitz added a commit that referenced this issue Feb 4, 2022
(same commit message as above; cherry picked from commit bd90781 + edits, and part of commit f3c0023)
bradfitz added a commit that referenced this issue Feb 4, 2022
(same commit message as above; cherry picked from commit bd90781 + edits, and part of commit f3c0023)
@DentonGentry added the L2 Few (Likelihood), P2 Aggravating (Priority level), and T5 Usability (Issue type) labels and removed the needs-triage label Feb 6, 2022
bradfitz added a commit that referenced this issue Feb 7, 2022
(same commit message as above; cherry picked from commit bd90781 + edits, and part of commit f3c0023)
@cepera-ang (Author)

I have another connectivity issue. I installed Pi-hole on my vpn server, and I'm unable to connect to its web interface or to its TCP DNS resolver. Interestingly, I can connect to SSH and to tailscaled's simple web server. I tried from both Windows and Linux machines, same story.

user@user-pi:~$ curl vpn
curl: (7) Failed to connect to vpn port 80: No route to host
user@user-pi:~$ curl vpn:38798
<html>
<meta name="viewport" content="width=device-width, initial-scale=1">
<body>
<h1>Hello, Sergey Mushinskiy (100.102.69.110)</h1>
This is my Tailscale device. Your device is pi.
<p>You are the owner of this node.
user@user-pi:~$

Logs on client BUG-eafc8fc920530480e584258f1a87593e4f3a50ef100996de93f63d2df5d6a318-20220213143122Z-e9516f2deb31fbe1:

Feb 13 22:27:48 user-pi tailscaled[2556]: open-conn-track: timeout opening (TCP 100.102.69.110:35120 => 100.127.227.19:80) to node [Vdhdo]; online=yes, lastRecv=5s

Log on the server BUG-497beb7817e8ae7ece3867e0ee1b898b86d2248109879b958ca2faeb2d1d1d82-20220213143146Z-a23a7c7bfafcd8ea

Feb 13 14:27:43 vpn tailscaled[1114]: Accept: TCP{100.102.69.110:35120 > 100.127.227.19:80} 60 tcp ok

@DentonGentry (Contributor)

@cepera-ang I moved that last comment into a new issue, tailscale/tailscale-www#975

@DentonGentry (Contributor)

Fixed in 1.22

@kim0 commented Sep 11, 2022

It seems I'm seeing this return. I'm on v1.30.0, macOS.

@DentonGentry (Contributor)

Please open a new bug with details; it is unlikely you are experiencing the same root cause.
