Ping timeout during index exchange #1751

Closed
alex2108 opened this Issue May 2, 2015 · 11 comments

Projects

None yet

5 participants

@alex2108
Member
alex2108 commented May 2, 2015

the index exchange seems to work fine, at least the gui of the laptop shows "Up to date" for a while but then the raspberry disconnects.

Before 0.11 I never saw this and as soon as the connection is established (which worked in some rare cases and which works almost always to another device that syncs less data, except both devices try to connect, then both connections constantly break after index exchange) it stays connected so I assume it's not a real network issue.

part of the log on the raspberry pi:

[NRL2J] 2015/05/02 17:10:59.590206 connections.go:95: INFO: Established secure connection to 7TQSWGR-... at 192.168.1.71:22000-79.231.115.135:52951
[NRL2J] 2015/05/02 17:11:00.624042 model.go:588: INFO: Device 7TQSWGR-... client is "syncthing v0.11.1"
[NRL2J] 2015/05/02 17:11:00.628217 model.go:593: INFO: Device 7TQSWGR-... name is "laptop"
[NRL2J] 2015/05/02 17:18:23.772963 connections.go:134: INFO: Connected to already connected device (7TQSWGR-...)
[NRL2J] 2015/05/02 17:18:28.088101 model.go:674: INFO: Connection to 7TQSWGR-... closed: EOF
[NRL2J] 2015/05/02 17:20:59.056688 connections.go:95: INFO: Established secure connection to 7TQSWGR-... at 192.168.1.71:22000-79.231.115.135:53080
[NRL2J] 2015/05/02 17:20:59.334155 model.go:588: INFO: Device 7TQSWGR-... client is "syncthing v0.11.1"
[NRL2J] 2015/05/02 17:20:59.338092 model.go:593: INFO: Device 7TQSWGR-... name is "laptop"
[NRL2J] 2015/05/02 17:28:42.376262 model.go:674: INFO: Connection to 7TQSWGR-... closed: EOF

part of the log from the laptop:

[7TQSW] 17:19:30 INFO: Established secure connection to NRL2JRW-... at 192.168.2.112:53080-78.43.167.253:22000
[7TQSW] 17:20:59 INFO: Device NRL2JRW-... client is "syncthing v0.11.1"
[7TQSW] 17:20:59 INFO: Device NRL2JRW-... name is "raspberrypi"
[7TQSW] 17:28:00 INFO: Connection to NRL2JRW-... closed: ping timeout
[7TQSW] 17:28:48 INFO: Established secure connection to NRL2JRW-... at 192.168.2.112:53174-78.43.167.253:22000
[7TQSW] 17:30:58 INFO: Device NRL2JRW-... client is "syncthing v0.11.1"
[7TQSW] 17:30:58 INFO: Device NRL2JRW-... name is "raspberrypi"
[7TQSW] 17:31:18 INFO: Connection to NRL2JRW-... closed: ping timeout
[7TQSW] 17:31:57 INFO: Established secure connection to NRL2JRW-... at 192.168.2.112:53233-78.43.167.253:22000
[7TQSW] 17:31:57 INFO: Device NRL2JRW-... client is "syncthing v0.11.1"
[7TQSW] 17:31:57 INFO: Device NRL2JRW-... name is "raspberrypi"

Connected to already connected device sometimes happens a bit more often than in the log I copied here until the raspberry notices the old connection is gone.

I also tried STTRACE=net but this does not reveal more.

Another thing that I noticed: even after I shut down syncthing on all devices except the raspberry pi there is still maximum CPU usage from syncthing for around one minute.

Any other STTRACE that could reveal what is going on here? For me it looks like some connection related stuff does not get enough CPU time and I get timeouts due to that.

@calmh
Member
calmh commented May 3, 2015

[7TQSW] 17:31:18 INFO: Connection to NRL2JRW-... closed: ping timeout

Seems like the raspberry is too slow to respond to pings. Not seen that before...

after I shut down syncthing on all devices except the raspberry pi there is still maximum CPU usage from syncthing for around one minute

Is there a large folder shared from that raspberry? It sounds like it may be "locked up" processing the index. Is it running out of memory and swapping, perhaps?

@alex2108
Member
alex2108 commented May 3, 2015

Is there a large folder shared from that raspberry? It sounds like it may be "locked up" processing the index.

Total amout is around 35GB in 50K files devided in several folders, so the index is quite big (a bit more than 10MB to transfer with compression, almost 20MB without)

Is it running out of memory and swapping, perhaps?

The highest memory usage I saw was at around 150MB and I think there was no swapping, it should have around 50MB more but I can observe that again later.
edit: observed it again, there is still enough memory.

@arneko
arneko commented May 3, 2015

I guess I am having the same issue on a much faster arm device (http://archlinuxarm.org/platforms/armv7/allwinner/cubietruck) although this has a few large folders shared (about 500 GB in total) And has about half its resources committed to other tasks.

It looks like I hardly get the index exchanged though.

[ESTGX] 13:42:53 INFO: Connection to D3ZXPXE-QTF24CJ-ANJ52B2-LEQSQ77-KFD34XO-YHHCJC5-3WKJUQH-SE4LBAR closed: ping timeout

I have transferred a mere 2 Files (Photos) over the past 24 hours. I assume it hardly gets its index exchanged. Never had this issue before.
Is there anything you need to track this down?

@alex2108
Member
alex2108 commented May 3, 2015

For me it always exchanges the whole index but it disconnects directly after that.

I had one strange thing: I changed many files (updating my local copy of the syncthing code to make the pull request this morning 😆) and after this the connection stayed, could be random, but it also could be that trying to pull all those files kept the connection open in some way. I will see if I can reproduce that.

@AudriusButkevicius
Member

I've seen that happen when rate limiting is set very low, causing timeouts.

@alex2108
Member
alex2108 commented May 3, 2015

changing pingTimeout to 600s (and pingIndleTime to 1200s) in https://github.com/syncthing/protocol/blob/master/protocol.go#L146 seems to work a bit better for me, I only get timeouts when multiple devices try to connect at the same time, one device after another works.

The internet connection my raspberry pi has is pretty slow (~40 kB/s), so that is also some kind of rate limiting. Tomorrow I will be in the same LAN again so I will see if it's better or if it's just the CPU.

@calmh calmh added the pr-enhancement label May 4, 2015
@calmh calmh changed the title from Connection closed after index exchange on raspberry pi to Ping timeout during index exchange May 4, 2015
@calmh
Member
calmh commented May 4, 2015

So, what happens here is that there is a lot of index to send. When either the connection itself or the receiving CPU is slow enough, a lot of index data will be buffered in TCP buffers. When the index-sending side sends a ping request, it can get stuck in this queue until the default ping timeout (30 seconds) triggers. Then the connection is closed and the process repeats.

@arneko
arneko commented May 13, 2015

Well it sounds like once #438 is implemented, this will become less of an issue, since hopefully even with an occasional ping-timeout we will have some progress on the index exchange....

@jangrewe

Thanks, currently upgrading all devices to 0.11.11 to test this out!

@arneko
arneko commented Jun 28, 2015

Thanks! There are no settings in the GUI (yet?), so one will have to add the key in the config manually, if I interpret the diff correctly. Thanks a lot though! This was starting to get on my nerves! ;-)

@jangrewe

Finally it's syncing, thanks! I set the timeouts to 600 and 1200, maybe that's a bit high but then again i don't care as long as it works... 😃

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment