
Slow transfer over LAN #5037

Closed · piedar opened this issue May 26, 2018 · 21 comments
Labels: topic/perf Performance

@piedar commented May 26, 2018

It takes over 1 minute to transfer 21 MB between two machines on a local network.

For most of this test, the receiver's ipfs daemon uses about 150% CPU, suggesting the old dual-core hardware is the bottleneck. However, ipfs add hashes new 30 MB files in under 10 seconds, so I don't yet understand why the network transfer adds so much overhead.

Workaround: run the slow node with ipfs daemon --routing=none.

Sender

# ipfs version 0.4.15
ipfs daemon &
ipfs add vlc-2.2.8.tar.xz

Receiver

# ipfs version 0.4.15
ipfs daemon &
ipfs repo gc
time ipfs cat QmaSSwsS2nAjExnxrqwKtmK5rLLhmqpju1HCsnPSigtHmV > /dev/null

After a few seconds, the transfer starts but it stutters at 1.38 MB, 3.75 MB, 7.50 MB, etc.

real    1m17.238s
user    0m0.652s
sys     0m1.255s

Without ipfs repo gc it takes about 5 seconds. As iperf shows, the network connection is fine.

[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   112 MBytes  93.8 Mbits/sec
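For reference, a minimal sketch of such an iperf check (assuming iperf2; the peer's LAN address is a placeholder):

iperf -s                # on one machine
iperf -c 192.168.0.x    # on the other; prints a bandwidth line like the one above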

X-Post discuss.ipfs.io

@bonedaddy (Contributor) commented May 27, 2018

Are you sure you're requesting it over the LAN and not the internet? There's a chance your machine could be fetching it from the internet. Also, why are you running the garbage collection before attempting to download the file?

@piedar (Author) commented May 27, 2018

It could be fetching off the internet, but I would still expect it to be fast since ipfs swarm peers finds the LAN node. I'm only running ipfs repo gc for the benefit of the performance test.

My hunch right now is it's CPU-bound due to older hardware, with ipfs daemon at about 150%. Could the hash verification be the bottleneck? But ipfs add is faster and it calculates hashes too...

[edit: Report of 5000% CPU usage might be just a display bug with htop over ssh.]

@bonedaddy (Contributor)

Ah yeah, if swarm can find the peer then it should be connecting to the peer.
Holy crap! That is insane CPU usage. Given the symptoms you mention, stuttering at specific points, along with that intense CPU utilization, I think your hunch is right.

@whyrusleeping (Member)

@piedar hrm... could you see how long an ipfs pin add takes to do the same transfer? It uses a different codepath under the hood, a more optimized graph traversal.
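For example, mirroring the ipfs cat test above with the same CID (a sketch):

ipfs repo gc
time ipfs pin add QmaSSwsS2nAjExnxrqwKtmK5rLLhmqpju1HCsnPSigtHmV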

Another thing to try that would help us debug: run the daemon fetching the file with --routing=none and ensure the two nodes get connected before starting the transfer. If this noticeably improves the performance, then the DHT is likely interfering (your node sends out notifications that you are now hosting this new content as you receive it, and we've been seeing issues around that lately).
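A sketch of that setup (the sender's multiaddr is a placeholder; substitute your LAN IP and peer ID):

ipfs daemon --routing=none &
ipfs swarm connect /ip4/192.168.0.2/tcp/4001/ipfs/<sender-peer-id>
ipfs swarm peers    # confirm the sender is listed before timing the fetch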

@whyrusleeping added the topic/perf Performance label May 31, 2018
@piedar (Author) commented Jun 4, 2018

Yes @whyrusleeping, it's certainly faster without the routing!

ipfs daemon --routing=none &
ipfs repo gc
time ipfs cat QmaSSwsS2nAjExnxrqwKtmK5rLLhmqpju1HCsnPSigtHmV > /dev/null
real    0m8.123s
user    0m0.366s
sys     0m0.509s

And --routing=dhtclient is in the middle, clocking in around 30 seconds.
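That run is the same procedure with the routing flag swapped (a sketch, not the verbatim transcript):

ipfs daemon --routing=dhtclient &
ipfs repo gc
time ipfs cat QmaSSwsS2nAjExnxrqwKtmK5rLLhmqpju1HCsnPSigtHmV > /dev/null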

@whyrusleeping (Member)

That's very interesting... It would appear, then, that getting PR #4333 merged should help transfer speeds overall.

@whyrusleeping (Member)

(well, that PR and its followups)

@piedar (Author) commented Jun 9, 2018

Could the DHT announce be adjusted to run in the background, only when there is no active transfer in progress? Though maybe that's too much complexity if the root cause can be solved by speeding up the operation in general.

Anyway, I'll run this test every couple versions and report back if the results change significantly.

@whyrusleeping (Member)

@piedar running the DHT announce in the background is pretty much what we want to do. The main sticking point, and the reason it hasn't happened yet, is that technically that's what's happening right now. The reason it's slowing things down is that backpressure from the DHT provide process slows down anything that sends hashes to it. Since bitswap fetches each block of a graph independently, it sends one provide call per hash (which can be millions of calls). The change we need to make is to make the DHT providing process a bit smarter, so we can tell it 'here are the objects/pins we care about, make sure the world knows' and it can enumerate hashes on demand (entirely separate from the process of us receiving them).
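For a sense of scale (assuming the default 256 KiB chunk size): the 21 MB test file above splits into roughly 84 blocks, so fetching it triggers on the order of 84 provide calls, and a 10 GB pin like the one reported later in this thread is on the order of 40,000.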

@whyrusleeping (Member)

> Anyway, I'll run this test every couple versions and report back if the results change significantly.

:) Please do! This is so helpful to us

@dilshat commented Jun 19, 2018

Yes, this would be a tremendous improvement. Please include this fix in an upcoming release.

@etursunbaev
I'm facing this issue too. It seems it does not download from local peers.

@MirceaKitsune
I can confirm this issue as of today with go-ipfs 0.4.15 under Linux openSUSE x64.

The setup: I have two computers connected to the same home router, mine in one room and my mother's in the hallway. The daemon on mine contains a group of large video files (for DTube) which are pinned, I'd estimate 10 GB in total (the size of my ~/.ipfs directory). I ran the bash script I created to pin this list of videos on my mother's computer, with the daemon on mine also running, since I thought that would cause the files to be served more quickly.

The result: Despite being directly connected by a 10 MB/s or 100 MB/s cable, the pinning process hasn't finished after over 6 hours. Judging by my network traffic monitor, my computer appeared to only serve content periodically: for roughly 5 seconds I'd see it sending data at over 1 MB/s... after that the transfer rate would drop to roughly 300 KB/s or less and stay there. I know the two daemons were exchanging data over LAN because one of them was posting generic errors, all about IPs of the form 192.168.0.1 (the local IPs assigned to our machines by the router).

I immediately found that surprising but thought I must be missing something else. I asked on the IRC channel and someone pointed me to this bug. I figured sharing this experience might help.

@Stebalien (Member)

So, make sure you're not confusing megabits and megabytes. Those cables are probably 10 Mbps and 100 Mbps, 8x slower than 10 MB/s and 100 MB/s.

For comparison, I'd try connecting the two machines with netcat and piping data directly over that connection to measure the actual transfer speed.
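Something along these lines (a sketch; the port and the receiver's LAN address are placeholders, and older netcats want -l -p instead of -l):

nc -l 12345 > /dev/null                                  # on the receiving machine
dd if=/dev/zero bs=1M count=100 | nc 192.168.0.x 12345   # on the sender; dd prints throughput when done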

However, that still looks wrong.

  1. Is either machine using a hard disk (not an SSD)?
  2. When you run the test, how does IPFS's CPU usage look? Is it pegged at 100%?
  3. Try running ipfs bitswap wantlist on the machine downloading and report the results.
  4. Try running ipfs bitswap stat ...
  5. Post the error messages you're seeing concerning 192.168.0.1.

@MirceaKitsune
I was referring to megabytes, of course. Both machines have classic hard drives, no SSD yet.

Resource usage: On my mother's old computer (slow single-core CPU), the IPFS process kept using roughly 40% CPU. Memory-wise it was over 350 MB.

I used netcat in an unrelated test weeks ago and have since kind of forgotten how to use it, but I could look into it again. Those ipfs bitswap commands seem like a better test; I might look into them first.

@MirceaKitsune
This is a screenshot from KSysGuard on my mother's computer showing the network transfer rate. This is all go-ipfs, no other process should have been sending or receiving any significant amount of data.

Sharing it here because the transfer rate is abnormally erratic: one moment it's receiving at over 1 MB/s, the next at 100 KB/s. I see no explanation as to why it wouldn't stay above 1 MB/s the whole time.

[screenshot_20180629_145000: KSysGuard network transfer graph]

@Stebalien (Member)

Hm. Yeah, that doesn't look right at all. I'd expect it to be a bit erratic (known issues) but not that slow.

@piedar (Author) commented Jun 30, 2018

@MirceaKitsune Have you tried running the receiver with ipfs daemon --routing=none? It drastically improved the speed for me. Also, check the hard drive activity light! The IO hurts even my old SSD, so I imagine an HDD could be brutally slow.

@ItalyPaleAle
I have the same issue. Freshly installed go-ipfs on macOS 10.13 (High Sierra). I tried it by requesting just a few pages (fewer than 10). After 15 minutes with the daemon running, it was using 168% CPU and significant energy.

@djdv (Contributor) commented May 14, 2019

@hannahhoward
I believe your work on bitswap, graphsync, etc. is likely relevant here.
If not feel free to un/re-assign.

@Stebalien (Member)

This issue predates the bitswap refactor. There are still known issues, but nothing here is likely to still be relevant.
