Linux broken after sleep #1812

Closed
bradfitz opened this issue Apr 28, 2021 · 7 comments

Comments

@bradfitz
Member

bradfitz commented Apr 28, 2021

Reported in both 1.6.0 and HEAD at 2840afa. Logs from HEAD below:

Before the sleep:
BUG-877c6f6675eab123ce0c856d79ecac6e6271d0e5dfd1d7b90acecec2d87658ef-20210428022721Z-b35f60aa36470974

After the sleep:
BUG-877c6f6675eab123ce0c856d79ecac6e6271d0e5dfd1d7b90acecec2d87658ef-20210428022901Z-8bb77466dd102463

We at least notice the time jump now:

2021-04-27 19:27:21.190347052 -0700 PDT: user bugreport: BUG-877c6f6675eab123ce0c856d79ecac6e6271d0e5dfd1d7b90acecec2d87658ef-20210428022721Z-b35f60aa36470974
...
2021-04-27 19:27:26.891760817 -0700 PDT: netmap diff: (none)
(Brad: the sleep, no log lines omitted here)
2021-04-27 19:28:00.294813806 -0700 PDT: monitor: time jumped (probably wake from sleep); synthesizing major change event
2021-04-27 19:28:00.295019883 -0700 PDT: LinkChange: major, rebinding. New state: interfaces.State{defaultRoute= ifs={xxxxxxx v4=true v6global=true}
2021-04-27 19:28:00.295428973 -0700 PDT: magicsock: link change rebound port from 41641 to 41641
2021-04-27 19:28:00.295461815 -0700 PDT: magicsock: closing connection to derp-2 (rebind), age 44s
2021-04-27 19:28:00.29548898 -0700 PDT: magicsock: 0 active derp conns
2021-04-27 19:28:00.295561317 -0700 PDT: health("overall"): error: not connected to home DERP region 2
2021-04-27 19:28:00.295670984 -0700 PDT: magicsock: starting endpoint update (link-change-major)
2021-04-27 19:28:00.295918742 -0700 PDT: magicsock: adding connection to derp-2 for home-keep-alive
2021-04-27 19:28:00.295963663 -0700 PDT: magicsock: 1 active derp conns: derp-2=cr0s,wr0s
2021-04-27 19:28:00.296006196 -0700 PDT: derphttp.Client.Recv: connecting to derp-2 (sfo)
2021-04-27 19:28:00.296026019 -0700 PDT: peer keys: [WX3km]
2021-04-27 19:28:00.296036925 -0700 PDT: v1.7.0-2840afab peers: 144256/59264
2021-04-27 19:28:00.296206875 -0700 PDT: netcheck: probePortMapServices: failed to look up gateway address
2021-04-27 19:28:00.298806323 -0700 PDT: Accept: TCP{100.102.230.23:50414 > 100.75.69.73:3000} 52 ok out
2021-04-27 19:28:00.29889969 -0700 PDT: magicsock: disco: send, starting discovery for [WX3km] (d:c44e2846e7b28308)
2021-04-27 19:28:00.299496791 -0700 PDT: magicsock: disco: d:717ee1c2ce37284f->d:c44e2846e7b28308 ([WX3km], derp-2) sent call-me-maybe
2021-04-27 19:28:00.299561199 -0700 PDT: magicsock: disco: failed to send *disco.Ping to 10.46.0.6:41641: write udp4 0.0.0.0:41641->10.46.0.6:41641: sendto: network is unreachable
2021-04-27 19:28:00.299587345 -0700 PDT: magicsock: disco: failed to send *disco.Ping to 178.128.12.126:41641: write udp4 0.0.0.0:41641->178.128.12.126:41641: sendto: network is unreachable
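
The "time jumped" line in the log above reflects a common way to notice a suspend/resume: wake up on a fixed interval and check whether the wall clock advanced much further than that interval. Below is a minimal Python sketch of that idea. The function name, the 15 s interval, and the 10 s tolerance are illustrative assumptions, and this is not Tailscale's monitor code.

```python
from datetime import datetime, timedelta


def time_jumped(prev: datetime, now: datetime,
                interval: timedelta, tolerance: timedelta) -> bool:
    """Return True if wall time advanced by more than interval + tolerance.

    Hypothetical helper: a periodic checker calls this once per `interval`
    with the previous and current wall-clock readings. A gap much larger
    than the interval suggests the machine was suspended in between.
    """
    return (now - prev) > (interval + tolerance)


# Timestamps of the last log line before the gap and the first one after
# it, taken from the excerpt above (fractional seconds dropped).
before_sleep = datetime(2021, 4, 27, 19, 27, 26)
after_wake = datetime(2021, 4, 27, 19, 28, 0)

# Assumed settings, chosen only for illustration.
interval = timedelta(seconds=15)
tolerance = timedelta(seconds=10)

# The gap between the two log lines exceeds interval + tolerance.
jump = time_jumped(before_sleep, after_wake, interval, tolerance)

# A wakeup that arrives on schedule does not count as a jump.
on_time = time_jumped(before_sleep, before_sleep + interval,
                      interval, tolerance)
```

A real checker would also need to read the wall clock rather than a monotonic clock, since a monotonic clock may not advance while the machine is suspended.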

This is not a 1.8.0 regression from 1.6.0, so it's not a blocker for release, but a fix would be nice to have in 1.8.x soonish if not in 1.8.0 itself.

/cc @danderson @josharian (who were just in this code so might have things fresh in their heads)

@josharian
Contributor

I've been unable to reproduce this on an Ubuntu VM using a variety of "sleeps" (Linux sleep, suspending the VM, SIGSTOP). Pawing through the code yields no hints. I'm pausing work on this for now.

@apenwarr
Member

apenwarr commented May 4, 2021

Looking at the logs above, is the problem indicator the "sendto: network is unreachable"?

Another "network is unreachable" related problem: #1726

@josharian
Contributor

> Looking at the logs above, is the problem indicator the "sendto: network is unreachable"?

I believe so, yes.

> Another "network is unreachable" related problem: #1726

Interesting. Alas my ISP doesn't support IPv6, so I'll have to go spin up a VM to investigate...
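
For anyone comparing logs from other machines against this indicator, here is a small Python sketch that pulls out the destinations of sends that failed with "network is unreachable". The regular expression is an assumption derived only from the two failing lines in the log excerpt above, so other log formats may need a different pattern.

```python
import re

# Matches the disco send failures seen in the log excerpt and captures the
# destination address. The pattern is an assumption based on those lines.
UNREACHABLE = re.compile(
    r"failed to send \*disco\.Ping to (\S+): .*network is unreachable"
)


def unreachable_destinations(log_lines):
    """Return destination addresses of sends that failed as unreachable."""
    found = []
    for line in log_lines:
        match = UNREACHABLE.search(line)
        if match:
            found.append(match.group(1))
    return found


# One failing line and one unrelated line, both copied from the excerpt
# (timestamps omitted).
sample = [
    "magicsock: disco: failed to send *disco.Ping to 10.46.0.6:41641: "
    "write udp4 0.0.0.0:41641->10.46.0.6:41641: sendto: network is unreachable",
    "magicsock: 1 active derp conns: derp-2=cr0s,wr0s",
]
failed = unreachable_destinations(sample)
```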

@DentonGentry added the L3 Some users Likelihood label May 27, 2021

@DentonGentry
Contributor

DentonGentry commented Jun 8, 2021

Another report came in. I added a link from Front, so it should appear in the "Front conversations" link.

This one happens on two laptops, and not on other systems running the same Ubuntu version. The code path in question may be related to ACPI or other sleep/wake operations on a laptop, which may be why it couldn't be reproduced in a VM.

The report says: "Running 'ip route show table 52' shows the expected routes, but 'ip -6 route show table 52' shows an empty table (i.e. not a 'table doesn't exist' error). If I manually copy the routes from a working machine then IPv6 works fine, but it all breaks again when Tailscale is restarted."
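
The check described in that report can be scripted. The Python sketch below wraps the two `ip` commands quoted above and distinguishes "IPv4 routes present, IPv6 table empty" from the other cases. The helper names are hypothetical, the `-4` flag is used only to make the address family explicit, and the code is a diagnostic aid under those assumptions rather than anything Tailscale ships.

```python
import subprocess


def table_has_routes(ip_output: str) -> bool:
    """Return True if `ip ... route show table N` output lists any route."""
    return any(line.strip() for line in ip_output.splitlines())


def show_table(family_flag: str, table: str = "52") -> str:
    """Run `ip <family_flag> route show table <table>` and return stdout.

    family_flag is "-4" or "-6". Raises CalledProcessError if `ip` exits
    with a non-zero status instead of printing a (possibly empty) table.
    """
    result = subprocess.run(
        ["ip", family_flag, "route", "show", "table", table],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def describe(v4_output: str, v6_output: str) -> str:
    """Summarize which address families have routes in the table."""
    v4 = table_has_routes(v4_output)
    v6 = table_has_routes(v6_output)
    if v4 and not v6:
        return "IPv4 routes present, IPv6 table empty"
    if v4 and v6:
        return "both families have routes"
    return "IPv4 table empty"
```

On an affected machine, `describe(show_table("-4"), show_table("-6"))` would be expected to report the first case, matching the symptom in the report.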

@apenwarr
Member

apenwarr commented Jun 8, 2021 via email

@DentonGentry
Contributor

Ah. Yes. The “systemd wipes out our ip rule settings” problem is: #1591

@bradfitz
Member Author

Yes, probably a dup of #1591.
