wgengine/router: enable ip forwarding on gokrazy #11408

Merged (1 commit) on Apr 5, 2024


joneskoo
Contributor

On gokrazy only, set sysctls to enable IP forwarding so that subnet routing and advertised exit nodes work.

Implements #11405

@joneskoo
Contributor Author

joneskoo commented Mar 13, 2024

The idea is to enable forwarding when we're advertising routes or an exit node.

I tested that:

  • it does enable sysctls on gokrazy when I have --advertise-routes=....
  • it does not enable sysctls on gokrazy when --advertise-routes is not set.

It's supposed to be a no-op on everything except gokrazy, and on gokrazy too when no routes are being advertised.

The previous sysctl state isn't persisted, but persisting it would be rather pointless on gokrazy: these are appliances that are generally rebooted on every change, keeping only /perm.

It was a bit hard to find the right place to inject this in the codebase. Hopefully this is good enough, or at least sufficient for Tailscale staff to take it over.

@joneskoo
Contributor Author

joneskoo commented Mar 13, 2024

I know this change leaves a bit of duplication, with sysctls now being set three different ways across the Tailscale repo. Hopefully what I did is the most generic of them, so you can later consider refactoring the rest to use a shared writeSysctl function and moving it to an appropriate place in your codebase. I didn't want to decide where it should go or refactor the existing usages, so I kept this change minimal.

I'd imagine you'll later want to follow this up by having all of the sysctl accesses try the file first and, if that fails, fall back to the sysctl command, for both reads and writes. That isn't relevant for gokrazy, though.

wgengine/router/router_linux.go (review comment, outdated, resolved)
wgengine/router/router_linux.go (review comment, outdated, resolved)
@bradfitz
Member

bradfitz commented Apr 1, 2024

LGTM. Couple small comments above.

Only on Gokrazy, set sysctls to enable IP forwarding so subnet routing
and advertised exit node works.

Fixes tailscale#11405

Signed-off-by: Joonas Kuorilehto <joneskoo@derbian.fi>
@bradfitz bradfitz merged commit fe0cfec into tailscale:main Apr 5, 2024
46 checks passed
@damdo

damdo commented Apr 11, 2024

Is this expected to also fix #10918 ?

@joneskoo
Contributor Author

> Is this expected to also fix #10918 ?

I don't know what to expect, but based on the description there, I think that configuration should work.

Can you test the version in main, e.g. with a replace directive in go.mod for each of the tailscale modules:

replace tailscale.com => /path/to/tailscale.com/tailscale

@irbekrm
Contributor

irbekrm commented Apr 11, 2024

> Can you test the version in main

I think this will also be released as part of the Tailscale 1.64 release later today, which could make testing easier.

@bradfitz
Member

@damdo, can you try it now, with 1.64.0?

@damdo

damdo commented Apr 12, 2024

@bradfitz Yes, tailscale{,d} 1.64.0 (thanks to this PR, I guess) fixes #10918 (exit node issue) for me.

Proof:

  1. On the gokrazy tailscale exit-node:

    gokrazy config

    "PackageConfig": {
        "tailscale.com/cmd/tailscaled": {
            "CommandLineFlags": [
            ]
        },
        "tailscale.com/cmd/tailscale": {
            "CommandLineFlags": [
                "up",
                "--advertise-routes=192.168.0.0/16",
                "--advertise-exit-node"
            ]
        }
    },
    

    versioning and status

    /tmp/breakglass2213421165 # tailscaled --version
    1.64.0-ERR-BuildInfo
      go version: go1.22.2
    
    /tmp/breakglass2213421165 # tailscale --version
    1.64.0-ERR-BuildInfo
      go version: go1.22.2
      
    /tmp/breakglass2213421165 # ps -a | grep tail
       99 0         0:02 /user/tailscaled            <--- running tailscaled with the default options
      185 0         0:00 grep tail
    
  2. On client using that exit node

    $ podman run --rm -it --privileged --entrypoint=/bin/sh -e TS_AUTHKEY=$TS_AUTHKEY -v /dev/net/tun:/dev/net/tun ghcr.io/tailscale/tailscale:latest
    $ tailscaled
    
     $ podman exec -it tailscale /bin/sh
    
     # first ip and ping without connecting to tailscale/exit-node
     
     $ wget -q -O- http://ipecho.net/plain
     112.x.x.246
     
     $ ping google.com
     PING google.com (142.250.74.206) 56(84) bytes of data.
     64 bytes from fra24s02-in-f14.1e100.net (142.250.74.206): icmp_seq=1 ttl=115 time=5.43 ms
     64 bytes from fra24s02-in-f14.1e100.net (142.250.74.206): icmp_seq=2 ttl=115 time=5.36 ms
     64 bytes from fra24s02-in-f14.1e100.net (142.250.74.206): icmp_seq=3 ttl=115 time=5.37 ms
     
     # then through the above tailscale exit-node
     $ tailscale up --accept-dns=true --exit-node=100.x.x.x --auth-key=tskey-auth-x.x
     
     $ wget -q -O- http://ipecho.net/plain
     97.x.x.48
     
     $ ping google.com
     PING google.com (216.58.204.142) 56(84) bytes of data.
     64 bytes from par21s05-in-f142.1e100.net (216.58.204.142): icmp_seq=1 ttl=116 time=28.3 ms
     64 bytes from par21s05-in-f142.1e100.net (216.58.204.142): icmp_seq=2 ttl=116 time=27.1 ms
    

@damdo

damdo commented Apr 12, 2024

@bradfitz @joneskoo
But now I get a warning in the logs (see screenshot).
[Screenshot 2024-04-12 at 10 03 15]

@irbekrm
Contributor

irbekrm commented Apr 12, 2024

@damdo what is the output of sysctl -a|grep net.ipv4.ip_forward on the node that's meant to advertise the subnet routes? Also, could you attach a debug log? Do you see any warnings from here in the node logs?
Also, do you just see a warning, or does the subnet router also not work?

@damdo

damdo commented Apr 12, 2024

@irbekrm the node meant to advertise the subnet routes and act as exit node works as intended.
The sysctls are also set correctly.

/tmp/breakglass2213421165 # sysctl -a|grep net.ipv4.ip_forward
net.ipv4.ip_forward = 1
net.ipv4.ip_forward_update_priority = 1
net.ipv4.ip_forward_use_pmtu = 0
/tmp/breakglass2213421165 # sysctl -a|grep net.ipv6.conf.all.forwarding
net.ipv6.conf.all.forwarding = 1

Also if I ssh and down/up tailscale I don't get the "IP forwarding disabled, ..." warning.

/tmp/breakglass2213421165 # /user/tailscale down
/tmp/breakglass2213421165 # /user/tailscale status
Tailscale is stopped.
/tmp/breakglass2213421165 # /user/tailscale up --advertise-routes=192.168.0.0/16 --advertise-exit-node
Warning: UDP GRO forwarding is suboptimally configured on eth0, UDP forwarding throughput capability will increase with a configuration change.
See https://tailscale.com/s/ethtool-config-udp-gro

My suspicion is that the check-ip-forwarding logic is executed early when the enableIPForwarding logic for gokrazy hasn't been applied yet. That or there is a race.

Things eventually work though.

Another separate issue is the other warning "couldn't check system's UDP GRO forwarding configuration". Not sure if they are related. For the record this doesn't go away after a tailscale down/up, contrary to the other warning.

@irbekrm
Contributor

irbekrm commented Apr 12, 2024

> My suspicion is that the check-ip-forwarding logic is executed early when the enableIPForwarding logic for gokrazy hasn't been applied yet. That or there is a race.

I had a look at the code - it looks like in your use case the IP forwarding check runs as part of tailscale up whilst the logic to enable forwarding runs when the wgengine router is first set when tailscaled starts.
I would have thought that tailscale up would not be able to succeed before tailscaled has got to that point, but maybe I'm wrong.
Do you perhaps have the full log from the start, so we can see how far the router has got before that warning?

@joneskoo
Contributor Author

For what it's worth, I only enable subnet router and no exit node and have no issues on Raspberry Pi 3B+.

Be sure to check the usual things like accepting the advertised subnets, too.

@joneskoo
Contributor Author

The same warning is there but it doesn't stop the subnet router from working.

[image]
joneskoo@oslo:~$ ip route get 192.168.8.1
192.168.8.1 dev tailscale0 table 52 src 100.112.192.104 uid 1000 
    cache 
joneskoo@oslo:~$ ping 192.168.8.1
PING 192.168.8.1 (192.168.8.1) 56(84) bytes of data.
64 bytes from 192.168.8.1: icmp_seq=1 ttl=63 time=229 ms
/ # sysctl -a|grep ip_forward
net.ipv4.ip_forward = 1
net.ipv4.ip_forward_update_priority = 1
net.ipv4.ip_forward_use_pmtu = 0
sysctl: error reading key 'net.ipv6.conf.all.stable_secret': I/O error
sysctl: error reading key 'net.ipv6.conf.default.stable_secret': I/O error
sysctl: error reading key 'net.ipv6.conf.eth0.stable_secret': I/O error
sysctl: error reading key 'net.ipv6.conf.lo.stable_secret': I/O error
sysctl: error reading key 'net.ipv6.conf.sit0.stable_secret': I/O error
sysctl: error reading key 'net.ipv6.conf.wlan0.stable_secret': I/O error

@damdo

damdo commented Apr 15, 2024

> > My suspicion is that the check-ip-forwarding logic is executed early when the enableIPForwarding logic for gokrazy hasn't been applied yet. That or there is a race.
>
> I had a look at the code - it looks like in your use case the IP forwarding check runs as part of tailscale up whilst the logic to enable forwarding runs when the wgengine router is first set when tailscaled starts.
> I would have thought that tailscale up would not be able to succeed before tailscaled has got to that point, but maybe I'm wrong.
> Do you perhaps have the full log from the start, so we can see how far the router has got before that warning?

@irbekrm here are the logs you asked for: https://gist.github.com/damdo/2003f4e196d6af7ebe9c8fc9ffaab7ab

@irbekrm
Contributor

irbekrm commented Apr 15, 2024

Thank you for the logs. I cannot pinpoint exactly where tailscaled is when tailscale up hits the warning, but I also think that trying to sync the two would probably not be feasible.
As @joneskoo said, this is not an actual issue, just confusing logs, so maybe the easiest fix would be to just log that IP forwarding is now enabled here-ish.
