
rTorrent reading 4x more data than sending #443

Open
ghost opened this Issue Jun 14, 2016 · 106 comments

@ghost

ghost commented Jun 14, 2016

rTorrent seems to be reading 4-6x more data than it's seeding. In this case I'm seeing 100-160 MB/s of reads from rTorrent while it's only seeding about 20 MB/s worth of data.

--- IOTOP
Total DISK READ : 161.06 M/s | Total DISK WRITE : 27.28 M/s
Actual DISK READ: 161.06 M/s | Actual DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
42685 be/4 debian-d 161.06 M/s 27.28 M/s 0.00 % 70.79 % rtorrent

--- rTorrent
[Throttle off/off KB] [Rate 21780.5/27685.2 KB] [Port: 65300] [U 357/0] [D 141/0] [H 0/4096] [S 21/700/65023] [F 5339/32768]

Thoughts?
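The figures above can be sanity-checked directly (a quick back-of-the-envelope, not part of the original report): the status bar shows the upload rate in KB/s while iotop shows disk reads in MB/s, and the ratio between the two is the read amplification being described.

```python
# Sanity check of the reported figures: rtorrent's status bar shows
# the upload rate in KB/s, iotop shows disk reads in MB/s.
disk_read_MBps = 161.06            # iotop "Total DISK READ"
upload_MBps = 27685.2 / 1024       # status bar upload: 27685.2 KB/s

# ~6x, consistent with the "4-6x" claim above
amplification = disk_read_MBps / upload_MBps
```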

@ghost

ghost commented Jun 15, 2016

I've tried playing with the preload features to see whether they cause it: disabling preloading doesn't help, nor does enabling it or making the memory cache larger. rTorrent doesn't seem to utilize the piece cache anyway, even though I have it set to 16G and have 300 torrents seeding with active peers.

@Speeddymon

Contributor

Speeddymon commented Jul 9, 2016

I see you don't have throttling enabled.

Can you enable throttling for both upload and download and see if you can replicate it with throttling enabled? Your throttles should be about 80% of the maximum speeds from your ISP.

So if your upload speed is 20MB/s then set the upload throttle to around 17MB/s, and if your download speed is 100MB/s then set the download throttle to 80MB/s.

If you are able to replicate it with throttling enabled, then please set the download throttle to something very low and see if the issue still occurs (uploads drop to 1/4 of your throttled download speed).

@ghost

ghost commented Jul 9, 2016

I did try this with throttling too.

It seems to happen with lots of upload slots and lots of torrents. I think what's happening is: a slow client requests a piece, rTorrent loads the entire piece into memory and attempts to send it in smaller blocks. Meanwhile, many other torrents and clients are doing the same, causing internal cache thrashing: a piece has to be reloaded every time a peer requests its next block, because the piece gets kicked out of the cache by other pieces being loaded. So instead of reading the entire piece from disk, how about reading just the block the peer requests? I know this is less efficient for most setups, but an option to change how data gets loaded from disk would be nice.
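The thrashing theory described here can be illustrated with a toy model (this is not rtorrent's actual cache code; the piece, block, and cache sizes are made up):

```python
from collections import OrderedDict

PIECE_SIZE = 4 * 2**20   # hypothetical 4 MiB piece
BLOCK_SIZE = 16 * 2**10  # a BitTorrent request block, 16 KiB
CACHE_PIECES = 8         # toy cache: holds 8 whole pieces

def disk_read_per_upload(num_active_pieces):
    """Bytes read from disk per byte uploaded, when peers round-robin
    block requests across `num_active_pieces` distinct pieces and the
    client loads a whole piece into an LRU cache on every miss."""
    cache = OrderedDict()
    read = sent = 0
    for _ in range(PIECE_SIZE // BLOCK_SIZE):  # one block per piece per round
        for piece in range(num_active_pieces):
            if piece in cache:
                cache.move_to_end(piece)
            else:
                read += PIECE_SIZE             # whole piece re-read on a miss
                cache[piece] = True
                if len(cache) > CACHE_PIECES:
                    cache.popitem(last=False)  # evict least recently used
            sent += BLOCK_SIZE
    return read / sent
```

In this model the read amplification jumps from 1x to PIECE_SIZE/BLOCK_SIZE (256x here) as soon as the set of pieces being requested no longer fits in the cache, which is the qualitative behavior being described.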

@Speeddymon

Contributor

Speeddymon commented Jul 9, 2016

Hmm..

Preload options might help here.

pieces.preload.min_size.set
pieces.preload.min_rate.set
pieces.preload.type.set

min_size is set to 131072 (128 KiB).

min_rate appears to be set to 5192.

Sadly the documentation is lacking and I'm not an expert at reading this code, so I can't tell whether that's 5192 bytes, a time factor, or something else. Valid values for type look like 0 and 1, but I have no idea what either of those means.

Try lowering the min_size in your rtorrent config and see if that helps any.
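For reference, the corresponding lines in an .rtorrent.rc might look like this (the values are illustrative guesses based on the defaults mentioned above, not recommendations):

```
# Only preload pieces at least this large (default appears to be 131072)
pieces.preload.min_size.set = 65536
# Minimum peer rate before preloading kicks in (units undocumented)
pieces.preload.min_rate.set = 5192
# 0 or 1, per the comment above
pieces.preload.type.set = 1
```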

@ghost

ghost commented Jul 9, 2016

I've also tried with and without preload; it doesn't make a difference.

@Speeddymon

Contributor

Speeddymon commented Jul 9, 2016

You confirmed it happens even with just one active torrent and all others stopped?

If you could, please test with just one active torrent, and the rest stopped, assuming you haven't already. ;-)

If it doesn't happen there, then that will probably confirm the theory about slow clients, because you're then only dealing with a small subset of the number of peers you normally are interacting with.

@ghost

ghost commented Jul 15, 2016

Here's the result of downloading a SINGLE torrent, with 0 upload speed.

It's reading an insane amount of data while only downloading a torrent at 2 MB/s. I mean, what's the point of the freaking disk process? I'm using the published 0.9.6 build.

TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
24180 be/4 rtorrent 8.37 M/s 2.03 M/s 0.00 % 7.69 % rtorrent
24181 be/4 rtorrent 3.29 M/s 0.00 B/s 0.00 % 3.15 % rtorrent [rtorrent disk]

[Throttle 200/2048 KB] [Rate 2.3/2071.4 KB] [Port: 12345]

@ghost

ghost commented Jul 15, 2016

Using a completely stock rtorrent on a stock Debian install (no .rtorrent.rc file)... downloading a SINGLE torrent with ZERO UPLOAD.

TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
24861 be/4 rtorrent 10.21 M/s 2.99 M/s 0.00 % 7.71 % rtorrent

[Throttle off/off KB] [Rate 3.5/3091.0 KB] [Port: 6944]

This is a huge performance impact for rTorrent. This renders it unusable for me and is likely impacting the entire userbase and they don't even know it.

@Speeddymon

Contributor

Speeddymon commented Jul 15, 2016

I can see it reading that much with hashing enabled but you have it off.

Can you try to reproduce this with deluge?

It uses libtorrent also, and it will help narrow down whether it's a bug in libtorrent or rtorrent itself.

@ghost

ghost commented Jul 15, 2016

Deluge uses a different libtorrent actually - libtorrent-rasterbar.

I tried builds all the way back to 0.9.0 and they are all affected. I haven't tried earlier versions since they won't compile under newer versions of the C/C++ compilers.

Deluge does not have this same problem. It also performs significantly fewer write ops/s: I get maybe an average of 30 I/O ops every few seconds when the write cache is being flushed, whereas with rTorrent I was seeing a constant 300-400 I/O ops/s.

TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
15868 be/4 deluge 0.00 B/s 2.49 M/s 0.00 % 0.99 % python /usr/bin/deluged

@chros73

chros73 commented Jul 15, 2016

I can see it reading that much with hashing enabled but you have it off.

check_hash is on by default (e.g. when you don't use a config file). It should only mean "Check hash for finished torrents", but who knows; can you try disabling it first with:

pieces.hash.on_completion.set = no

This is a huge performance impact for rTorrent. This renders it unusable for me and is likely impacting the entire userbase and they don't even know it.

What OS is this exactly? Is it run on a physical device or in virtual environment?
I have just checked on Ubuntu 14.04 laptop. These are the figures that I'm getting with sudo iotop -ok (interactive mode) checking different times and with this config (using my rtorrent-ps fork, based on 0.9.6, not on master):

[Rate 1697.0/1465.2 KB]
19812 be/4 chros73     990.25 K/s    2.53 M/s  0.00 %  2.06 % rtorrent
19812 be/4 chros73    1668.26 K/s 1144.28 K/s  0.00 %  1.87 % rtorrent
19812 be/4 chros73       2.79 M/s 1509.73 K/s  0.00 %  3.11 % rtorrent

[Rate 1469.0/2372.0 KB]
19812 be/4 chros73    1175.79 K/s    2.93 M/s  0.00 %  2.71 % rtorrent
19812 be/4 chros73       3.14 M/s 1419.14 K/s  0.00 %  4.33 % rtorrent
19812 be/4 chros73       2.18 M/s    2.71 M/s  0.00 %  2.94 % rtorrent
19812 be/4 chros73    1304.04 K/s 1591.24 K/s  0.00 %  1.71 % rtorrent

[Rate 1944.1/4556.1 KB]
19812 be/4 chros73     697.34 K/s    5.38 M/s  0.00 %  3.83 % rtorrent
19812 be/4 chros73    1716.96 K/s    4.05 M/s  0.00 %  0.10 % rtorrent
19812 be/4 chros73    1239.27 K/s    4.83 M/s  0.00 %  3.54 % rtorrent

[Rate 2071.3/  3.8 KB] 
19812 be/4 chros73    4846.16 K/s    0.00 K/s  0.00 % 16.97 % rtorrent
19812 be/4 chros73    7732.49 K/s    0.00 K/s  0.00 % 22.20 % rtorrent
19812 be/4 chros73    3456.99 K/s    0.00 K/s  0.00 %  7.55 % rtorrent
19812 be/4 chros73    3328.77 K/s    0.00 K/s  0.00 %  9.09 % rtorrent
19812 be/4 chros73    5752.25 K/s    0.00 K/s  0.00 % 27.22 % rtorrent

As you can see, I don't have the insane fluctuation that you do.
The last readings are interesting, though: at times it reads almost 4x as much data as it uploads.
From my config, these are the values that may be relevant to this issue:

# Set the max amount of memory space used to mapping file chunks. This refers to memory mapping, not physical memory allocation. (max_memory_usage) This may also be set using ulimit -m where 3/4 will be allocated to file chunks.
pieces.memory.max.set = 2048M
# Adjust the send and receive buffer sizes for sockets. Disabled by default (0), meaning the OS default is used (you have to modify the system-wide settings!). (send_buffer_size, receive_buffer_size)
#   Increasing buffer sizes may help reduce disk seeking and connection polling, as more data is buffered each time the socket is written to. It will result in higher memory usage (not by the rtorrent process!).
network.receive_buffer.size.set =  4M
network.send_buffer.size.set    = 12M
# Preloading a piece of a file. (Default: 0) Possible values: 0 (Off) , 1 (Madvise) , 2 (Direct paging). (https://github.com/rakshasa/rtorrent/issues/418)
pieces.preload.type.set = 2
# Check hash for finished torrents. (check_hash)
pieces.hash.on_completion.set = no

I mean what's the point of the freaking disk process?

I don't have rtorrent [rtorrent disk] process at all, just rtorrent. :)

Edit: I should mention that I'm using a modded 3.16.0-pf4-chros02 #4 SMP PREEMPT Linux kernel with I/O tweaks; I don't know whether that matters.

@chros73

chros73 commented Jul 15, 2016

downloading a SINGLE torrent with ZERO UPLOAD.

I've just tried it, just in case (with the above linked config): downloading only 1 with zero upload.

[Rate   4.5/4067.1 KB]
 9094 be/4 chros73       0.00 K/s 3968.74 K/s  0.00 %  0.00 % rtorrent

This part seems to be good on my system.

@chros73

chros73 commented Jul 15, 2016

I made one more test:

  • switch off piece preloading (during runtime): pieces.preload.type.set = 0
  • all other settings are the same as before (the network buffer sizes are still huge)
  • just uploading at around 1130 KB/s (no download), using sudo iotop -aokP accumulated output to see what the sum is after 20 sec:
[Rate 1125.2/  2.6 KB] 
 expected: 20 x 1130 = 22600.00 K
 sum reported by iotop = 56868.00 K

It's about 2.5x.

Can you try the similar test with deluge if it has any buffering option? I'm just curious.
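The arithmetic in this test generalizes to a small helper (a convenience sketch; the function name is made up):

```python
def read_amplification(read_kB, upload_kBps, seconds):
    """Ratio of bytes read from disk to bytes actually uploaded."""
    expected_kB = upload_kBps * seconds
    return read_kB / expected_kB

# chros73's 20-second run: ~1130 KB/s uploaded, iotop sum 56868 K read.
ratio = read_amplification(56868.0, 1130.0, 20)  # ~2.5x
```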

@chros73

chros73 commented Jul 15, 2016

And the last test, testing buffer sizes, setting them during runtime:

network.receive_buffer.size.set =  208K
network.send_buffer.size.set    = 208K
pieces.preload.type.set = 0
  • all other settings are the same as before
  • just uploading at around 163 KB/s (no download), using sudo iotop -aokP accumulated output to see what the sum is after 20 sec:
[Rate 162.4/  8.8 KB]
 expected: 20 x 163 = 3260.00 K
 sum reported by iotop = 13292.00 K

It's about 4x.

So it seems that bigger buffer sizes help a bit with this, but not much.

This is a huge performance impact for rTorrent. ... and is likely impacting the entire userbase and they don't even know it.

I tend to think that you're right :) Now, all we need is somebody who can fix this. :)
Thanks for mentioning this bug as well.

@mrvn

mrvn commented Jul 18, 2016

Why does it read ANYTHING during download? Unless you download more chunks in parallel than the system can cache, all downloads should stay in memory until they are no longer needed.

I would only expect reads during the final hash check when the download finishes, unless the downloaded file is still in cache (likely for a single-download test smaller than RAM).

So for that single download without upload, what was the chunk size, how many chunks were downloaded in parallel, and how much memory was there?

@ghost

ghost commented Jul 18, 2016

The system had 8G of RAM, which was way more than the size of the torrent. As for the torrent, I don't remember the chunk size; the box was connected to a few dozen seeds. rTorrent was allowed to use 1G of memory for cache. There's no reason it should be reading anything while downloading anyway.

I'm wondering if rTorrent is trying to be so aggressive with its caching that it's actually cache thrashing. Deluge does not have any of the same problems.

Debian 8 system, virtualized, all measurements were taken inside the VM.

@mrvn

mrvn commented Jul 19, 2016

How does rtorrent's caching work at all? If it just mmap()s the file and then uses the mapping as the buffer for recv(), then every page will be read in before being overwritten by the download. That would at least explain reads equal to the download rate. I hope this isn't the case.
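The mmap() suspicion can be made concrete with a small sketch (a hypothetical illustration, not rtorrent's code): storing received bytes straight into a file-backed mapping forces the kernel to fault not-yet-resident pages in from disk before they are overwritten, whereas buffering in user space and overwriting whole blocks with pwrite() needs no read-back.

```python
import mmap
import os

PIECE = 64 * 1024  # hypothetical 64 KiB piece

def store_via_mmap(path, data):
    # Suspected rtorrent-style path: map the file and store the
    # "received" bytes straight into the mapping. Storing into a page
    # that is not yet resident makes the kernel read it in from disk
    # first, even though we immediately overwrite it.
    with open(path, "r+b") as f:
        f.truncate(PIECE)
        with mmap.mmap(f.fileno(), PIECE) as m:
            m[:len(data)] = data
            m.flush()

def store_via_pwrite(path, data):
    # Alternative: buffer in user space and pwrite() the block; a full
    # overwrite needs no read-back of the old contents.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.pwrite(fd, data, 0)
    finally:
        os.close(fd)
```

Both functions leave the same bytes on disk; the difference is only in how many pages the kernel has to read back in along the way.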

@chros73

chros73 commented Jul 19, 2016

@mrvn As I mentioned before:

  • I can not reproduce reading during downloading 1 torrent
  • I can reproduce the 3x-4x reading issue during seeding

Can you also try it out?

@mrvn

mrvn commented Jul 21, 2016

I have not benchmarked it but I have seen a lot more reads than my upload limit would suggest since basically forever.

@chros73

chros73 commented Jul 21, 2016

Maybe this issue is related in this way: #409

@ryandaniels

ryandaniels commented Jul 24, 2016

Reproduced with only 1 torrent downloading and everything else stopped, on a Raspberry Pi 2 / 3 (running a fresh install of Raspbian/Debian Jessie). Tried all of the above config settings with no change.
The issue seems to be related to non-local disks.
It's easy to detect, since downloading is limited by the Pi's 100 Mbit/s Ethernet. This really limits the Raspberry Pi!

For reference, this is my up/down limit setting:
download_rate = 3400
upload_rate = 2000

When downloading to local disk on RPi, no issue. Hit limit set in .rtorrent.rc config.
iotop - 17:20:16 2877 be/4 rtorrent 0.00 B/s 3.39 M/s 0.00 % 0.00 % rtorrent
ifstat - 17:20:16 28893.09 724.55

When downloading to NFS share, the issue happens. (Also tested with CIFS (samba) and still same issue). Tons of excessive reads when downloading 1 file!
iotop - 17:18:42 2877 be/4 rtorrent 5.01 M/s 1774.55 K/s 0.00 % 77.63 % rtorrent
iotop - 17:18:42 2879 be/4 rtorrent 2.36 M/s 0.00 B/s 0.00 % 37.71 % rtorrent [rtorrent disk]
ifstat - 17:18:42 88057.85 16187.82

FYI, when copying a file from local disk to the NFS share, the transfer is fast (hitting the max of the RPi's Ethernet), so it's not an issue with the NFS share.
iotop - 17:57:31 8058 be/4 rtorrent 11.31 M/s 11.49 M/s 0.00 % 78.37 % cp -rp /NFSshare/ISO/ubuntu-16.04.1-desktop-amd64.iso /home/rtorrent
ifstat - 17:57:31 94198.44 4390.52
iotop - 18:04:02 8197 be/4 rtorrent 12.60 M/s 12.37 M/s 0.00 % 86.25 % cp -rp /home/rtorrent/ubuntu-16.04.1-desktop-amd64.iso /NFSshare/ISO
ifstat - 18:04:02 2599.28 97872.17

Note: ifstat is in bits per second; iotop is in bytes per second.
Commands and command headers for reference:
$ iotop -botqqq
TIME TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
17:18:42 2877 be/4 rtorrent 5.01 M/s 1774.55 K/s 0.00 % 77.63 % rtorrent
17:18:42 2879 be/4 rtorrent 2.36 M/s 0.00 B/s 0.00 % 37.71 % rtorrent [rtorrent disk]

$ ifstat -b -t -i eth0 1
Time eth0
HH:MM:SS Kbps in Kbps out
17:18:42 88057.85 16187.82
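Since the two tools use different units, a tiny converter helps when comparing the samples above (the helper name is made up):

```python
def kbit_to_kbyte_per_s(kbps):
    # ifstat -b prints kilobits per second; iotop prints (kilo)bytes
    # per second, so divide by 8 to put them on the same scale.
    return kbps / 8.0

# e.g. the NFS download sample above: 88057.85 Kbit/s in ~= 11007 KB/s
net_in_kBps = kbit_to_kbyte_per_s(88057.85)
```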

@chros73

chros73 commented Jul 24, 2016

Thanks for the detailed report.
Can you also test the 3x-4x seeding issue with a local disk? That was the one I could confirm. Thanks

@ryandaniels

ryandaniels commented Jul 24, 2016

Upload limit 200 KB/s. 1 torrent.
About 2.75x when seeding from NFS disk.
About 1.75x when seeding from local disk.

NFS disk:
HH:MM:SS KB/s in KB/s out
08:50:27 464.41 208.26
08:50:28 628.22 277.58
08:50:29 628.28 214.16

TIME TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
08:50:27 11070 be/4 rtorrent 403.75 K/s 0.00 B/s 0.00 % 6.65 % rtorrent
08:50:28 11070 be/4 rtorrent 550.84 K/s 0.00 B/s 0.00 % 3.27 % rtorrent
08:50:29 11070 be/4 rtorrent 550.73 K/s 0.00 B/s 0.00 % 3.58 % rtorrent

Local disk:
09:20:25 7.04 229.83
09:20:26 6.51 218.40
09:20:27 4.59 240.26

09:20:25 11070 be/4 rtorrent 352.54 K/s 0.00 B/s 0.00 % 0.63 % rtorrent
09:20:26 11070 be/4 rtorrent 352.40 K/s 0.00 B/s 0.00 % 0.60 % rtorrent
09:20:27 11070 be/4 rtorrent 352.57 K/s 0.00 B/s 0.00 % 0.00 % rtorrent

Note: both iotop and ifstat in bytes

@chros73

chros73 commented Jul 24, 2016

Thanks, that indeed confirms this seeding bug with a local disk as well.

@chros73

chros73 commented Jul 30, 2016

Can this be related? rakshasa/libtorrent#83

@chros73

chros73 commented Aug 26, 2016

Has anybody tried this out with the master branch version?

@chros73

chros73 commented Aug 26, 2016

I tested the seeding issue again with a local disk using 0.9.6: [Rate 2100.5 / 7500.6 KB] ... [U 65/300]
pidstat -p $PID -dl 20 5 (runs 5 times, printing an average every 20 seconds and an overall average after the last run)

Note:

  • downloading isn't affected
  • the ratio depends on the upload slots in use: the more slots are used, the more unnecessary reads are made

I. if pieces.preload.type.set=0 or pieces.preload.type.set=2

19:09:31      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
19:09:51     1000     23536  11553.82   7841.68      0.00  /opt/rtorrent/bin/rtorrent
19:10:11     1000     23536   9645.20   7442.00      0.00  /opt/rtorrent/bin/rtorrent
19:10:31     1000     23536   9087.00   6860.80      0.00  /opt/rtorrent/bin/rtorrent
19:10:51     1000     23536   9763.00   7477.20      0.00  /opt/rtorrent/bin/rtorrent
19:11:11     1000     23536   9218.80   6956.40      0.00  /opt/rtorrent/bin/rtorrent
Average:     1000     23536   9853.73   7315.67      0.00  /opt/rtorrent/bin/rtorrent

It's 4.6x more than needed.

II. if pieces.preload.type.set=1

19:07:31      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
19:07:51     1000     23536  14749.60   6930.60      0.00  /opt/rtorrent/bin/rtorrent
19:08:11     1000     23536  15739.80   7864.20      0.00  /opt/rtorrent/bin/rtorrent
19:08:31     1000     23536  16461.80   7450.60      0.00  /opt/rtorrent/bin/rtorrent
19:08:51     1000     23536  15243.00   7157.60      0.00  /opt/rtorrent/bin/rtorrent
19:09:11     1000     23536  15412.80   7462.20      0.00  /opt/rtorrent/bin/rtorrent
Average:     1000     23536  15521.40   7373.04      0.00  /opt/rtorrent/bin/rtorrent

It's 7.4x more than needed.

How can we debug this?
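For reference, those multipliers fall straight out of the numbers above: upload is the first figure in rtorrent's [Rate up/down] pair (2100.5 KB/s here), writes (kB_wr/s) track the download side, so the read column is what amplifies. A quick check of the arithmetic, using the pidstat averages quoted in this comment:

```python
# Read amplification = average disk read rate / upload (seeding) rate.
# All figures copied from the pidstat runs above.
upload_kbps = 2100.5  # first number in rtorrent's [Rate up/down]

avg_read_kbps = {
    "preload type 0/2": 9853.73,   # section I. average kB_rd/s
    "preload type 1": 15521.40,    # section II. average kB_rd/s
}

for setting, rd in avg_read_kbps.items():
    print(f"{setting}: {rd / upload_kbps:.1f}x more read than uploaded")
```

This reproduces roughly the 4.6x and 7.4x figures quoted above.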

@chros73

chros73 commented Aug 30, 2016

Has anybody tried this out with the master branch version?

I checked, it suffers from the same issue.

... the ratio depends on the upload slots in use ...

It doesn't, as it turned out: I checked the same instance at a different time and I got better results: [Rate 2100.5 / 7500.6 KB] ... [U 162/300]

12:31:59      UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
12:32:19     1000     23536   3342.60   8174.60      0.00  /opt/rtorrent/bin/rtorrent
12:32:39     1000     23536   3854.00   7937.60      0.00  /opt/rtorrent/bin/rtorrent
12:32:59     1000     23536   2982.80   7713.40      0.00  /opt/rtorrent/bin/rtorrent
12:33:19     1000     23536   2559.00   7169.80      0.00  /opt/rtorrent/bin/rtorrent
12:33:39     1000     23536   2736.20   8156.40      0.00  /opt/rtorrent/bin/rtorrent
Average:     1000     23536   3094.92   7830.36      0.00  /opt/rtorrent/bin/rtorrent

What's interesting is the output of dstat -ctdD sdb:

----total-cpu-usage---- ----system---- --dsk/sdb--
usr sys idl wai hiq siq|     time     | read  writ
  8   4  63  17   0   8|27-08 12:34:07|2928k   21M
  2   1   5  86   0   6|27-08 12:34:08| 256k   61M
  6   4  43  45   0   3|27-08 12:34:09|1444k   62M
  3   2  46  43   0   6|27-08 12:34:10|1104k   89M
 12   5  55  24   0   3|27-08 12:34:11|7080k   18M
  7   3  78   8   0   4|27-08 12:34:12|2188k  176k
  5   2  83   5   0   5|27-08 12:34:13|2304k    0
  6   1  74  15   0   4|27-08 12:34:14|2816k    0
  6   3  81   6   0   5|27-08 12:34:15|2436k    0
  6   2  83   5   0   5|27-08 12:34:16|2448k    0

As we can see, writes of incoming data are buffered, but reads are not.

Somebody has to take a look at this issue who is smarter than me. :) @rakshasa ? :)

@ghost

ghost commented Nov 6, 2016

It sounds/looks like rTorrent is reading larger chunks than needed for each piece request from each seed slot. That would suggest the problem is actually in libtorrent.

@rakshasa

Owner

rakshasa commented May 5, 2017

There is no unneeded copying with mmap.

@devlo

devlo commented May 5, 2017

Aren't you copying it to a user-space buffer (your caching solution) before writing to the socket? That would mean there is a copy.

@rakshasa

Owner

rakshasa commented May 5, 2017

No, there is no user-space cache, it is all done using mmap'ed file pages.

Only time there's ever any copying to a user-space buffer is when we deal with encrypted peer connections.

@devlo

devlo commented May 5, 2017

I thought you used mmap because of copying into user space. Then why are you using mmap instead of sendfile for uploading completed pieces? If you do not touch the data in user space (except when dealing with encrypted peer connections), then what's the point?

@rakshasa

Owner

rakshasa commented May 6, 2017

Because BitTorrent sends at most 16 KB per piece message, and using mmap'ed files with sockets is not really that different from what sendfile ends up doing.
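As an aside, the two approaches being debated can be sketched side by side. This is an illustrative Python/Linux sketch of the general techniques, not libtorrent's C++ code; the temp file, socketpair, and chunk layout are invented for the demo:

```python
import mmap
import os
import socket
import tempfile

CHUNK = 16 * 1024  # a BitTorrent piece message carries at most 16 KiB

data = os.urandom(CHUNK * 2)
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(data)                 # stand-in for a completed piece on disk
    path = f.name

a, b = socket.socketpair()        # stand-in for a peer connection
fd = os.open(path, os.O_RDONLY)

# Variant 1 (roughly the mmap approach under discussion): map the file
# and send a slice of the mapping.  In C, write(2) on mapped pages reads
# the page cache directly; Python additionally copies the slice.
mm = mmap.mmap(fd, 0, prot=mmap.PROT_READ)
a.sendall(mm[:CHUNK])

# Variant 2: sendfile(2) moves the second chunk entirely kernel-side,
# without ever mapping it into this process.
sent = 0
while sent < CHUNK:
    sent += os.sendfile(a.fileno(), fd, CHUNK + sent, CHUNK - sent)

received = b""
while len(received) < CHUNK * 2:
    received += b.recv(65536)

mm.close(); os.close(fd); a.close(); b.close(); os.unlink(path)
```

Both variants deliver the same bytes; the trade-off discussed here is that sendfile never maps the data into the process, while mmap leaves the readahead and paging behaviour to the kernel's fault handler.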

@devlo

devlo commented May 6, 2017

I've swept through the source code, and based on that and the many straces we did, you are making three times more syscalls and three times more context switches with the current design. I don't see any reason or advantage for using mmap in this case, only disadvantages. If you are doing 3 torrents on a 1 Mbit connection with 5 peers then sure, you are right, the difference is not noticeable, but we are not in that category. For us it is a difference, and for many people here, for many months, some other design choices were an issue and still are. For us it's not "reasonable", as you wrote a couple of comments ago; wasting so much hardware without any clear benefit is a long way from reasonable at our scale. It seems we are not alone, judging by the other comments in this issue.

Based on what you wrote, it doesn't seem like my use case and needs will be satisfied by rtorrent anytime soon. But of course it's not your fault: we do not pay you for what you do, we can't demand anything from you, and you can design rtorrent as you please. In the end it's my own fault, I just chose my bittorrent client poorly. I thought rtorrent was the right tool for the job, but I was wrong; it just doesn't scale. I need to research alternatives and change the client on all our servers, or better yet, get some additional people from the team and tailor something that will satisfy our needs. Don't get me wrong, this doesn't make rtorrent a bad client. It's good software, a lot of people are using it and it works for them; it just doesn't work for us anymore. Anyway, thanks for your time, I will not bother you anymore.

@rakshasa

Owner

rakshasa commented May 6, 2017

BitTorrent is a protocol created for the generic purpose of file sharing in a non-homogeneous swarm of untrusted peers, and it seems like you want a solution optimized for a controlled environment with clear top-down transfer of data.

@chros73

chros73 commented May 6, 2017

If you are doing ... on 1 mbit connection with ... peers then sure you are right, difference is not noticeable

Today's connection speed (even home connections) is not like it was 15 years ago: 100-500-1000 Mb/s. That makes this issue a problem.

@devlo

devlo commented May 7, 2017

BitTorrent is a protocol created for the generic purpose of file sharing in a non-homogeneous swarm of untrusted peers, and it seems like you want a solution optimized for a controlled environment with clear top-down transfer of data.

Protocol specification doesn't require anything that would make it inefficient for high traffic/high performance environments. It's all about implementation, that's all.

Today's connection speed (even home connections) is not like it was 15 years ago: 100-500-1000 Mb/s. That makes this issue a problem.

Indeed. On boxes with 2x10 GbE cards you can see it even more clearly. In the future, the Linux kernel itself, in its current shape, will be the bottleneck if you stick in 40 or even 100 GbE cards. With one 10 GbE card you have around 1 us (microsecond) between 1538-byte packets; for 40 GbE it's around 300 ns, and for 100 GbE around 120 ns.

From around kernel 4.8 you get eXpress Data Path, which is used for example by Facebook for packet processing before packets go into the kernel, but it's a solution oriented more around firewalls/ddos filtering etc. It gives you direct access to Tx/Rx buffers. You can also get IRQ storms, so it's better to poll directly on DMA buffers instead of relying on interrupts, provided you have a steady flow of traffic that will not starve your loop. What I see as promising are ideas about making layer-7 protocol proxying happen inside the kernel. At the moment, if you want to get 40/100 GbE line rate with non-trivial data processing, you probably need some kind of kernel bypass like the netmap kernel module that Cloudflare is using, or a user-space network stack like DPDK, which only needs around 80 cycles per packet. Anyway, the future looks promising for the Linux kernel network stack. You could build a new bittorrent client around one of those solutions, depending on needs.

@rakshasa

Owner

rakshasa commented May 8, 2017

Protocol specification doesn't require anything that would make it inefficient for high traffic/high performance environments. It's all about implementation, that's all.

It is not the protocol itself that is the major issue, it is the non-homogeneous swarm.

In most cases there is no way to predict what piece chunk the peer is going to request next, and while it is (mostly) going to work through a whole piece requesting chunks sequentially, that is not a guarantee. And once finished with a whole piece most clients randomize the next piece index to request.

The stuff you talk about above is a separate issue, one that can be approached if there are cases of the rtorrent process maxing out cpu usage rather than io.
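The non-homogeneous swarm point has a concrete consequence for disk reads: if each uncached 16 KiB request faults in a whole kernel readahead window, random request order can read close to the whole window per request, while sequential order amortizes it. A toy model (entirely hypothetical parameters: a 128 KiB readahead window, a one-window cache, 8192 chunks; this is not libtorrent code):

```python
import random

CHUNK = 16 * 1024        # bytes served per piece-message request
READAHEAD = 128 * 1024   # hypothetical kernel readahead window
N_CHUNKS = 8192          # requests in the workload

def disk_read(requests, cache_windows=1):
    """Bytes pulled from disk if every cache-miss request faults in a
    whole readahead window and only the last few windows stay cached."""
    cached, total = [], 0
    for chunk in requests:
        window = (chunk * CHUNK) // READAHEAD
        if window not in cached:
            total += READAHEAD
            cached = (cached + [window])[-cache_windows:]
    return total

served = N_CHUNKS * CHUNK
sequential = disk_read(range(N_CHUNKS))

shuffled = list(range(N_CHUNKS))
random.shuffle(shuffled)
scattered = disk_read(shuffled)

print(f"sequential order: {sequential / served:.2f}x read amplification")
print(f"random order:     {scattered / served:.2f}x read amplification")
```

Under these assumptions, sequential requests come out at exactly 1x while shuffled requests approach 8x (window/chunk ratio), which is in the same ballpark as the 4-8x amplification reported throughout this issue.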

@devlo

devlo commented May 8, 2017

In most cases there is no way to predict what piece chunk the peer is going to request next, and while it is (mostly) going to work through a whole piece requesting chunks sequentially, that is not a guarantee. And once finished with a whole piece most clients randomize the next piece index to request.

I wrote that in my comment 16 days ago in this issue.

The stuff you talk about above is a separate issue, one that can be approached if there are cases of the rtorrent process maxing out cpu usage rather than io

Yes, and this issue will arise if you switch to 40/100 gbe cards today.

I don't know yet (as bittorrent usage is declining year after year, making it a minority of our business), but we did an early stage of designing/planning a new internal client that would fit our needs: Linux only (4.9 kernel minimum), multi-threaded with a share-nothing design, NUMA aware, future proof (scaling up to 2x40 GbE cards with minimum overhead).

The main point of all this is not to waste resources unnecessarily, which rtorrent does, as this issue confirms.

@Ondjultomte

Ondjultomte commented May 27, 2017

More data showing extremely excessive reads from rtorrent.
I have one host running rtorrent, writing to a fileserver via NFS (10 Gbit network).

I tried a torrent with 1k seeders and no leechers, so rtorrent didn't do much UL, only DL.
I had rtorrent download a torrent and write to an NFS share. This is the internet/WAN traffic from the router:
https://www.dropbox.com/s/6s9ru1cpzc7qns5/rtorrentWAN.PNG?dl=0

This is the LAN traffic on the host that runs rtorrent.
Normal traffic is the incoming 400Mbit from the swarm, plus writing that data to the NFS share at the same 400Mbit. But we are seeing 5Gbit reads from the NFS, even though there is no real seeding going on!
https://www.dropbox.com/s/no3zvzvgjf4p4jv/rtorrent.PNG?dl=0

@colinhd8

colinhd8 commented May 30, 2017

I have the same problem (local raid0 disk, ~200 torrents, UL at 20 MiB/s but reads at 80-120 MiB/s); hope it will be fixed soon.

@plutohiyo

plutohiyo commented Jun 4, 2017

I have the same problem:
2Tx2 soft raid0 disk, 30 torrents, UL at 30 MiB/s but reads at 70-110 MiB/s; hope it will be fixed soon.

I have 16G of memory and the memory usage is low.

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27563 xxxx+ 20 0 4256808 816736 694652 D 11.3 5.0 341:44.81 rtorrent main

pieces.memory.max.set = 12000M
pieces.preload.type.set = 1
network.receive_buffer.size.set = 64M
network.send_buffer.size.set = 64M
debian 8

@chros73

chros73 commented Jun 4, 2017

I have 16G memory and the memry useage is low.

That's normal, that's the reason why.

@plutohiyo

plutohiyo commented Jun 4, 2017

@chros73 Thanks, I understand now.

@Ondjultomte

Ondjultomte commented Jun 24, 2017

Any news here?

@ghartz

ghartz commented Aug 5, 2017

I eventually found a workaround regarding this issue for my use of rtorrent. Hopefully others might find this useful too! :)

I run rtorrent on servers with 12x4TB in raid6 (mdadm / 512K chunk) and dozens of rtorrent instances per server. Before the workaround, a typical day averaged >500MB/s read from disks for less than 35MB/s of actual seeding, with IOwait easily over 30-40%.
Now, for 40-50MB/s read from disks, seeding is at 25-30MB/s with IOwait <10%.

Magic numbers are:
pieces.preload.type.set = 1
pieces.preload.min_size.set = 1
pieces.preload.min_rate.set = 1

I'm not really sure about the reason why it's working (didn't dig into the code...) but the numbers are working for me.
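As discussed further down, pieces.preload.type.set = 1 selects the madvise-based preload: before the chunk is uploaded, the kernel is hinted to page it in so the socket write doesn't stall on a fault. A rough Python 3.8+/Linux illustration of that general technique (not libtorrent's implementation; the file and offsets are made up):

```python
import mmap
import os
import tempfile

CHUNK = 16 * 1024            # size of one piece-message payload
offset = 512 * 1024          # page-aligned chunk a peer just requested

data = os.urandom(1024 * 1024)
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(data)            # stand-in for a completed piece's file
    path = f.name

fd = os.open(path, os.O_RDONLY)
mm = mmap.mmap(fd, 0, prot=mmap.PROT_READ)

# Ask the kernel to fault these pages in ahead of use, so the later
# access (the peer-socket write in a real client) doesn't stall on disk.
mm.madvise(mmap.MADV_WILLNEED, offset, CHUNK)

chunk = mm[offset:offset + CHUNK]   # would feed the peer socket here

mm.close(); os.close(fd); os.unlink(path)
```

MADV_WILLNEED only hints at the exact range asked for, which may be why forcing it (min_size/min_rate = 1) tames the wider readahead the kernel would otherwise do on each fault.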

@chros73

chros73 commented Aug 5, 2017

a typical day was an average >500MB/s read from disks for less 35MB/s of actual seed

:D
With the above settings you force libtorrent to almost always use madvise (and here) type of preloading.

The real question is why it works for you, and why not pieces.preload.type.set = 2 :)

a. How much RAM do you have in those boxes?
b. Have you modified the following settings?

pieces.memory.max
network.receive_buffer.size
network.send_buffer.size

c. Which version of rtorrent/libtorrent?

@ghartz

ghartz commented Aug 5, 2017

During my testing, I noticed that pieces.preload.type.set = 0 and pieces.preload.type.set = 2 ended up with more iops than pieces.preload.type.set = 1 (with defaults pieces.preload.min_size.set pieces.preload.min_rate.set) but still a lot of iops. It's only the below settings that worked:
pieces.preload.type.set = 1
pieces.preload.min_size.set = 1
pieces.preload.min_rate.set = 1

Note that I didn't test:
pieces.preload.type.set = 2
pieces.preload.min_size.set = 1
pieces.preload.min_rate.set = 1

I'll try this setting sometime in the upcoming days and maybe lower the iops a little more :)
By the way, you seem surprised that madvise gives better numbers than direct paging. Could you elaborate on that? Why is direct paging expected to be better?

Regarding your questions:
a. 64GB memory with <20GB used with all rtorrent clients running ( having php-fpm/nginx each)
b. "Slightly" modified:

network.receive_buffer.size.set = 64M
network.send_buffer.size.set = 64M
pieces.memory.max.set = 5120M

I'm not quite sure regarding the benefits of such huge buffer to be honest.

c. latest version from feature-bind branch: rtorrent/3ab1c69 and libtorrent/c38ec6f
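Pulled together, the settings reported above as reducing iops would look like this in an .rtorrent.rc. The comments reflect my reading of the thread (type 1 = madvise hinting, type 2 = direct paging); verify the type numbering against your build's documentation:

```
# Piece preloading: 0 = off, 1 = madvise hint, 2 = direct paging (assumed mapping)
pieces.preload.type.set = 1
# Thresholds of 1 effectively enable preloading for every piece size and peer rate
pieces.preload.min_size.set = 1
pieces.preload.min_rate.set = 1
```

Whether these values suit a given box will depend on RAM, disk layout, and link speed, so treat them as a starting point rather than a recommendation.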

chros73 commented Aug 7, 2017

I noticed that pieces.preload.type.set = 0 and pieces.preload.type.set = 2 ended up with more iops than pieces.preload.type.set = 1 (with defaults pieces.preload.min_size.set pieces.preload.min_rate.set)
Could you elaborate ... ?

I posted above that it was completely the opposite for me: madvise gave the worst result of all (note that I only used the default values for min_size and min_rate).

4GB RAM (local pc, 1 hdd, ext4)
pieces.memory.max.set = 2048M
network.receive_buffer.size.set = 4M
network.send_buffer.size.set = 12M

Why is direct paging expected to be better?

It's what rakshasa mentioned in the comment linked above, but we're not sure anymore :)

Ondjultomte commented Aug 10, 2017

Regarding IO efficiency and speed: when 35 MB/s of upload requires 500 MB/s of disk reads, we have long since passed sane speed optimizations.

It seems to me that the disk readahead cache is optimized for very slow broadband links, i.e. the first generations of DSL and DOCSIS, not for today's FE and GE connections with FTTH and colo/vPC.

chros73 commented Aug 11, 2017

when 35 MB/s upload requires 500 MB/s ...

It probably happened due to the multiple rtorrent instances. If that's the case, then it makes sense why the madvise type of caching helped and the other 2 didn't: it relies on the OS page cache.
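At the syscall level, the madvise preload type being discussed presumably boils down to hinting the kernel's page cache about upcoming access to a mapped file. The sketch below is a standalone illustration of that mechanism (mmap + madvise), not rtorrent's actual code; the piece size and file layout are assumptions:

```python
# Standalone sketch of madvise-style preloading: map a file, mark overall
# access as random, then ask the kernel to page in one "piece" before use.
# Requires Python 3.8+ on a platform exposing the MADV_* constants (Linux).
import mmap
import os
import tempfile

PIECE_SIZE = 256 * 1024  # a typical torrent piece size (assumption)

# Create a file standing in for a downloaded torrent payload (4 pieces).
fd, path = tempfile.mkstemp()
os.write(fd, b"\0" * (4 * PIECE_SIZE))
os.close(fd)

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ)
    # Peers request scattered pieces, so tell the kernel not to read ahead
    # aggressively across the whole mapping...
    if hasattr(mmap, "MADV_RANDOM"):
        mm.madvise(mmap.MADV_RANDOM)
    # ...but when a specific piece is about to be uploaded, ask the kernel to
    # page just that range in asynchronously, so the send path doesn't block.
    if hasattr(mmap, "MADV_WILLNEED"):
        mm.madvise(mmap.MADV_WILLNEED, PIECE_SIZE, PIECE_SIZE)
    piece = mm[PIECE_SIZE:2 * PIECE_SIZE]  # now likely served from page cache
    mm.close()
os.remove(path)
print(len(piece))  # 262144
```

This also matches the observation about multiple instances: madvise-based preloading shares one OS page cache across processes, whereas per-process buffering would duplicate the cached data.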

Nottt commented Oct 26, 2017

I'm seeing this too, just in case any random person drops by here... this issue has probably not been fixed. Don't try to use rutorrent... it will be a waste of time :(

I have rutorrent downloading to a mergerfs pool of cifs drives...

bcronce commented Jan 14, 2018

Possibly relevant to both the read amplification and the reading-while-writing: I wonder if there is a block-size mismatch.

e.g. the reading-while-writing issue:
If the FS has 8KiB blocks but you only write 4KiB of data, then the FS has to read the 8KiB block from the HD, change 4KiB of the block, then write it back out. This would also have the unhealthy side effect of write amplification on SSDs.

e.g. read amplification:
Reading a subset of an FS block requires reading the entire block and then discarding the rest. Given that torrent IO patterns tend to be fairly random, the disk caches could be thrashing.

It could be something as simple as tuning filestream buffer sizes.
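The read-modify-write penalty described above can be made concrete with a toy model: a store with fixed-size blocks must read back any block that a write only partially covers. All names and sizes here are illustrative, not rtorrent or filesystem internals:

```python
# Toy model of read-modify-write: writing less than a full block forces the
# store to fetch the old block first, doubling the IO for that block.
FS_BLOCK = 8 * 1024   # filesystem block size (example)
WRITE_SZ = 4 * 1024   # size of an incoming torrent chunk (example)

class BlockStore:
    def __init__(self, block_size):
        self.block_size = block_size
        self.bytes_read = 0
        self.bytes_written = 0

    def write(self, offset, length):
        # Round the write span out to whole blocks.
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        for blk in range(first, last + 1):
            start = blk * self.block_size
            end = start + self.block_size
            covered = min(end, offset + length) - max(start, offset)
            if covered < self.block_size:
                # Partial overwrite: must read the old block before rewriting.
                self.bytes_read += self.block_size
            self.bytes_written += self.block_size

store = BlockStore(FS_BLOCK)
store.write(offset=0, length=WRITE_SZ)        # 4 KiB into an 8 KiB block
print(store.bytes_read, store.bytes_written)  # 8192 8192

aligned = BlockStore(FS_BLOCK)
aligned.write(offset=0, length=FS_BLOCK)      # full-block write
print(aligned.bytes_read, aligned.bytes_written)  # 0 8192
```

In the unaligned case every 4KiB chunk costs an 8KiB read plus an 8KiB write, while block-aligned writes of at least one full block incur no extra reads at all.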

phxn commented Sep 27, 2018

Is this fixed in 0.9.7? I suppose not, as otherwise this issue would be closed...

chros73 commented Sep 28, 2018

No, it isn't yet.
