TCP flows stop transmitting after a while when RTT is very small #6113
Comments
Thanks for the detailed report. We will take a look.
@nybidari can you take a look?
RTT in netstack is not calculated during the connection establishment phase; it is only measured once we receive an ACK for a data packet. But an RTT of zero should not cause sndCwnd to be zero, and it is odd that a sndCwnd of zero on one flow causes other flows to stop sending. Also, the state of the connection in TCPInfo still shows as Open even after sndCwnd reaches zero and transmission stops; I would expect the state to be SACK or RTO recovery. I was not able to recreate the issue. Were you able to repro?
I managed to create a flaky reproducer: https://gist.github.com/rip-create-your-account/ade2eacc4d07f284636a3202f771c861. It creates multiple flows that send data over the loopback link. Roughly 2 out of 3 runs it does not exit cleanly because 1-3 flows stop transmitting. The number of parallel flows seems to have the biggest impact on triggering the bug. Also, I found that the program needs to run for at least a few seconds to give the bug enough time to occur. When the reproducer triggers the bug it keeps printing the same progress lines over and over, because these flows stopped making progress.
This reproducer does not seem to trigger this bug, but a complex integration test of mine consistently does. For now I choose to believe that it's a bug in my application code that is triggered by the bug here. Anyway, on closer inspection I think the issue is caused by the RTT of 0ms.
@rip-create-your-account I believe I'm currently running into this... I know it's been a few years and the code has changed, but I'm curious if you managed to solve this.
We were not able to repro the bug. Are you also running into the same issue where the RTT is calculated as 0ms? Can you let us know the steps to re-create the issue? Also, the output of tcpip.TCPInfoOption would help us see the current state of the TCP connection.
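For reference, a minimal sketch of how such a probe might look; it assumes you kept a handle to the `tcpip.Endpoint` used to build the `gonet.TCPConn` (gonet itself does not expose it), and the field names follow the current `tcpip.TCPInfoOption` struct, which may differ across gVisor versions:

```go
package tcpdebug

import (
	"fmt"
	"log"

	"gvisor.dev/gvisor/pkg/tcpip"
)

// dumpTCPInfo prints the fields relevant to this issue for one endpoint.
// The caller must keep the tcpip.Endpoint it passed to gonet when the
// connection was created.
func dumpTCPInfo(ep tcpip.Endpoint) {
	var info tcpip.TCPInfoOption
	if err := ep.GetSockOpt(&info); err != nil {
		log.Printf("GetSockOpt(TCPInfoOption): %v", err)
		return
	}
	fmt.Printf("state=%v ccState=%v rtt=%v rto=%v sndCwnd=%d\n",
		info.State, info.CcState, info.RTT, info.RTO, info.SndCwnd)
}
```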
I'm unfamiliar with gVisor, so I'm somewhat down a rabbit hole. To see the tip of this, check out the PR here: tailscale/tailscale#8106. It is able to reproduce the issue consistently. Don't check out my branch, just check out
@nybidari I wish I had a tighter repro loop to offer, but that's the best I have so far.
@nybidari if I lower the ... I'm happy to hop on Discord if it'd help. I'm still poking around in the code as well!
@nybidari this also only happens with TCP SACK enabled, just like reported in this issue.
Thanks for the info. Let me try to repro with the test. Will get back if I need more details.
@nybidari this only happens when using
Hmm, I take that back. I can get it to occur without
I think I managed to reproduce it with wireguard-go, so it's just gVisor plus a small wrapper to create a TUN device. I used the basic TCP server example: examples_test.go. The unstable network behavior observed in Tailscale is mimicked with:

```go
if mathrand.Intn(100) > 98 {
	return 0, os.ErrDeadlineExceeded
}
```

You can run it with the following command, and simply observe the congestion:
After the test panics, transmission has usually stopped, and you can see the goroutine dump; I can observe many goroutines just waiting. Side question: does wireguard-go improperly configure the TUN device, is it a matter of RTO fine-tuning, or is it indeed a bug? We're looking to ensure continuous transmission; this was spotted while investigating issues with SCP/SSH transfers.
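The fault injection above generalizes to any read path that feeds packets into the stack. Here is a minimal, hypothetical wrapper in the same spirit; the type and its placement are illustrative, not wireguard-go's actual API:

```go
package faultinject

import (
	"io"
	mathrand "math/rand"
	"os"
)

// flakyReader mimics the unstable network behavior described above by
// failing roughly 1% of reads with a deadline error.
type flakyReader struct {
	r io.Reader
}

func (f flakyReader) Read(p []byte) (int, error) {
	if mathrand.Intn(100) > 98 { // ~1% of calls, as in the test snippet
		return 0, os.ErrDeadlineExceeded
	}
	return f.r.Read(p)
}
```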
I was able to repro the bug with your test. Will debug the issue more and try to see what is going wrong here.
@nybidari, any info you'd be willing to share on timeline or priority from the gVisor team? I'm unsure whether to adjust our network implementation to avoid SACK or wait for a fix.
I will look into the issue this week; I do not have a fix yet. If it is a blocking issue, then SACK can be disabled for now and re-enabled after the issue is fixed.
Appreciate it, thank you!
RTT value should not be zero; set the minimum RTT value to 1ms. This does not happen often and was identified while investigating http://gvisor.dev/issues/6113. Updates #6113. PiperOrigin-RevId: 536885961
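The fix is essentially a clamp. A minimal illustrative sketch, not the actual gVisor source (which lives in the TCP sender):

```go
package rtt

import "time"

// minRTT matches the 1ms floor described in the commit message: TCP
// timestamps tick at millisecond granularity, so a measured RTT of zero
// really means "less than one tick".
const minRTT = time.Millisecond

// clampRTT floors a measured RTT before it reaches RTO and
// congestion-control calculations.
func clampRTT(measured time.Duration) time.Duration {
	if measured < minRTT {
		return minRTT
	}
	return measured
}
```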
Adding this here so we don't forget it: there's a suspicion that when we RTO, we might be sending the wrong packet. That packet gets sent over and over again, halting TCP progress. It could be a SACK bug, but we're not sure.
@kevinGC thanks for the update!
@kevinGC to close the loop here, I've been investigating the stalls @kylecarbs, @mtojek, and I are seeing, and my conclusion is that they are unrelated to the original issue on this thread; cf. my comment on our repo for details. One thing I think the gVisor team might be able to help with is the limited buffer for out-of-order packets, but I've raised that as a separate issue: #9153
The original issue here, where the RTT was zero in some cases, is fixed with the commit above. Closing this bug.
For a large number of TCP flows generating lots of traffic, transmission unexpectedly stops after a short while. I'm using the gonet package to establish many TCP connections and transmit bytes over them, communicating with the host Linux stack over a veth link. I have configured the tcpip.Stack with `tcpip.TCPSACKEnabled(true)` and `tcpip.CongestionControlOption("cubic")`.
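A minimal sketch of that setup, assuming current gvisor.dev/gvisor import paths (older releases use slightly different stack.Options fields); link endpoint and route setup are elided:

```go
package netstack

import (
	"log"

	"gvisor.dev/gvisor/pkg/tcpip"
	"gvisor.dev/gvisor/pkg/tcpip/network/ipv4"
	"gvisor.dev/gvisor/pkg/tcpip/stack"
	"gvisor.dev/gvisor/pkg/tcpip/transport/tcp"
)

// newStack builds a stack configured as described in the report: TCP
// with SACK enabled and cubic congestion control.
func newStack() *stack.Stack {
	s := stack.New(stack.Options{
		NetworkProtocols:   []stack.NetworkProtocolFactory{ipv4.NewProtocol},
		TransportProtocols: []stack.TransportProtocolFactory{tcp.NewProtocol},
	})

	sack := tcpip.TCPSACKEnabled(true)
	if err := s.SetTransportProtocolOption(tcp.ProtocolNumber, &sack); err != nil {
		log.Fatalf("enabling SACK: %v", err)
	}
	cc := tcpip.CongestionControlOption("cubic")
	if err := s.SetTransportProtocolOption(tcp.ProtocolNumber, &cc); err != nil {
		log.Fatalf("setting cubic: %v", err)
	}
	return s
}
```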
Periodically probing the state of the flows with `tcpip.TCPInfoOption` revealed that for some of the flows the RTT is 0 after transmission stops. Such flows also have `SndCwnd: 0`, which explains the not-sending part. Weirdly, a single flow having an RTT of 0 is enough to cause the other flows to stop transmitting too. Before transmission stops, the flows look like this:
Tracking down the cause, I found that the RTT of 0 is measured at handleRcvdSegment. It seems that the measured RTT is small enough that `s.ep.timestamp() - rcvdSeg.parsedOptions.TSEcr == 0`, which means that `updateRTO` is called with an RTT of 0. This seems to confuse Cubic and others. The comment there mentions a clock granularity of a millisecond for timestamps. After modifying the `elapsed` value to change from 0 to 1ms, I have not been able to reproduce the problem. Is this correct?
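To make the granularity point concrete, here is a standalone toy example (not gVisor code) showing how a sub-millisecond RTT collapses to 0 under a 1ms timestamp clock, plus the clamp described above:

```go
package main

import (
	"fmt"
	"time"
)

// tsValMS mimics a TCP timestamp clock with 1ms granularity, like the
// one the handleRcvdSegment comment describes.
func tsValMS(t time.Time) uint32 { return uint32(t.UnixMilli()) }

func main() {
	sent := time.UnixMilli(1_000)
	acked := sent.Add(300 * time.Microsecond) // sub-millisecond loopback RTT

	// Both timestamps land in the same 1ms tick, so the measured RTT is 0.
	elapsed := time.Duration(tsValMS(acked)-tsValMS(sent)) * time.Millisecond
	fmt.Println(elapsed) // 0s

	// The reporter's workaround: clamp to the clock granularity before
	// feeding the value to updateRTO / congestion control.
	if elapsed == 0 {
		elapsed = time.Millisecond
	}
	fmt.Println(elapsed) // 1ms
}
```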