Seeding a million torrents - two major issues #63

Closed
dessalines opened this issue Aug 2, 2015 · 22 comments

Comments

@dessalines

@arvidn @aldenml

Hey guys, I've hit a brick wall with a libtorrent issue. I've been trying for over a month to fix it on my own, but I'm coming up empty-handed. I'm currently using jlibtorrent (the Java wrapper for libtorrent), version 1.0.5.0.

I've used both of these articles as reference points for my testing and settings:
http://www.libtorrent.org/tuning.html
http://blog.libtorrent.org/2012/01/seeding-a-million-torrents/

My project is called TorrentTunes, a distributed torrent-based music service where I'm basically turning each song into a .torrent file, tagging it with MusicBrainz, and sharing it on an open BitTorrent tracker. Here's a beta of the website up and running.

Note: I'm not using automanage for these (after extensive testing I found it inadequate for making all the files highly available).

The two issues:

  1. Libtorrent will stall/freeze if I attempt to add too many torrents at once.
    I have ~10k torrents in my library, and when I loop over all of them calling torrent.resume(), libtorrent stalls or becomes unresponsive and stops adding them after about torrent #1200.

    My workaround (although it took me a few weeks to figure out) is to wait for a tracker reply for each torrent, pause that torrent, and only then continue to the next one. After all the torrents have received replies and have been paused, I loop over all of them and call torrent.resume(). I don't know whether this works because of memory concerns or something else, but it's the only way I've been able to seed more than 1200 or so torrents.

  2. I still cannot seed more than about 7000 torrents.
    The workaround raises the number of torrents I'm able to seed from 1200 to 7000. If I try it with more than 7000 torrents, libtorrent simply stops requesting tracker replies for any of the torrents, and every torrent becomes unavailable.

Here are my settings for reference:

sessionSettings.setActiveDownloads(10);
sessionSettings.setActiveLimit(999999);
sessionSettings.setActiveSeeds(999999);
sessionSettings.setActiveTrackerLimit(999999);
sessionSettings.setUploadRateLimit(0);
sessionSettings.setDownloadRateLimit(0);
session.stopLSD();
session.stopDHT();
sessionSettings.announceDoubleNAT(true);
sessionSettings.setPeerConnectTimeout(60);
sessionSettings.useReadCache(false);
sessionSettings.setMaxPeerlistSize(500);
sessionSettings.setHalfOpenLimit(5);
sessionSettings.setMixedModeAlgorithm(BandwidthMixedAlgo.PEER_PROPORTIONAL);
sessionSettings.setMinAnnounceInterval(1740);
sessionSettings.setFilePoolSize(200000);
sessionSettings.setIncomingStartsQueuedTorrents(true);
sessionSettings.setTrackerCompletionTimeout(10);

I've tried asking this on stackoverflow, but haven't received anything helpful. Thanks in advance for any help you can give me.

@aldenml
Contributor

aldenml commented Aug 2, 2015

Hi @tchoulihan, I think jlibtorrent could be the problem, but not sure if it's related to something in libtorrent or not. I recommend you move this issue to jlibtorrent and close this one, until we have a way to isolate the issue in the native part (if there is any). Regards

@ruslanfedoseenko

Sorry if I'm not the right person to answer this.

  1. If you want the active torrent or active seed limits to be infinite, just set them to -1.
  2. I haven't actually looked at your code, but it probably has problems with updating the UI.
    If you store a list of torrent_handles and use them to get info about each torrent, then your situation sounds plausible. Any call to torrent_handle::status or any similar function puts some load on the network thread. To update torrent information efficiently you should handle state_update_alert.

@dessalines
Author

Hey @aldenml , in a few hours I'll try to mock up a simple example of creating a ton of fake torrents, and try to recreate the problem with jlibtorrent. Once that's done I'll open up the issue over there if I still get the issue; I can't rule out libtorrent as a problem though unless someone has actually tried seeding a million torrents with libtorrent.

I've asked around, but I haven't talked with anyone who has successfully done it yet.

@dessalines
Author

@ruslanfedoseenko, I don't think it's a UI issue, because I never call torrent.getstatus. I use state update alerts for everything.

@aldenml
Contributor

aldenml commented Aug 2, 2015

Good. The info from @ruslanfedoseenko is interesting to me.

@arvidn
Owner

arvidn commented Aug 2, 2015

The problem with this question (and I believe I said this on the mailing list a while ago) is that we don't know what the problem is, so I can't propose a solution. The only thing I can do is to help you diagnose the problem (i.e. identify what it is). The best way of doing this is to build libtorrent with session stats logging, and then generate graphs from those logs. At least there's a chance that bottlenecks can be identified from that (but it's not always straight-forward and sometimes may require quite a lot of analysis).

I understand that auto-managed is not very scalable. In fact, still in trunk every time it determines which torrents to have started and paused, it builds a list of all torrents and sorts them. This is expensive. However, in master at least only auto-managed torrents are part of this process, so it's basically free when you don't have any auto-managed torrents. In 1.0.x it still loops over all torrents to find out which ones are auto-managed and which are not.

1.0.x is not very well optimized for having many torrents. I would expect there to be lots of bottlenecks to hit. One that comes to mind is the single disk I/O thread. For every torrent you add, libtorrent will stat all the files in it (which takes a lot of time). If you don't have resume data, it will even read the whole file to compare it to the piece hashes. The one way to avoid this cost is to add the torrent in seed_mode (where you promise you're seeding it).

There are a lot of other aspects of 1.0.x that do not scale very well, like ticking every torrent every second, and trying to connect to more peers etc. I don't know why the tracker response would make a difference in your case. Perhaps all the tracker announces start using a lot of RAM, but they should all be happening in parallel. The only way to know is to profile, either with a regular CPU and memory profiler or with the built-in session stats.

If you control the tracker, really what you should do is to make it always include your seed(s) in its responses, and have your seed not have to announce to it. That would save the seed a lot of time and bandwidth.

As for (2), you mean libtorrent stops announcing to trackers? Are you sure the torrents aren't stopped/paused? Do you have any error alerts? Really though, I don't see a good reason for you to announce to the tracker. The tracker announces in libtorrent aren't deliberately spread out over time to balance the load. If you add all torrents at the same time, they will all announce immediately, which will cause a load spike, and it will keep repeating every announce interval. (Some trackers add some randomness to reannounce intervals, and so does libtorrent, but that will only spread it out so much.)
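To make the load spike concrete, here is a minimal sketch of how an application could stagger its own torrent starts so that the first announces, and their repeats every announce interval, are spread evenly. This is plain arithmetic and not a libtorrent or jlibtorrent feature; the class and method names are hypothetical, and the 1740-second interval is borrowed from the `setMinAnnounceInterval` value in the settings above purely for illustration.

```java
// Hypothetical helper, not part of libtorrent or jlibtorrent.
final class AnnounceStagger {

    // Delay in milliseconds before starting torrent number `index` (0-based)
    // so that `total` torrents are spread evenly over one announce interval.
    static long startDelayMillis(int index, int total, int intervalSeconds) {
        if (total <= 0 || index < 0 || index >= total || intervalSeconds <= 0) {
            throw new IllegalArgumentException("arguments out of range");
        }
        return (long) index * intervalSeconds * 1000L / total;
    }

    // Largest number of torrents whose start falls into the same one-second
    // bucket when the delays above are used.
    static int peakStartsPerSecond(int total, int intervalSeconds) {
        int[] buckets = new int[intervalSeconds];
        int peak = 0;
        for (int i = 0; i < total; i++) {
            int second = (int) (startDelayMillis(i, total, intervalSeconds) / 1000L);
            buckets[second]++;
            peak = Math.max(peak, buckets[second]);
        }
        return peak;
    }

    public static void main(String[] args) {
        // 10,000 torrents over an assumed 1740 s interval.
        System.out.println(peakStartsPerSecond(10_000, 1740)); // 6
    }
}
```

Started all at once, the 10,000 torrents would all announce in the same instant; staggered this way, no one-second window sees more than 6 starts.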

However, I went through a lot of this a few years ago and fixed master. You really should just use master. (If you do, please let me know if there are any issues in there you would consider blockers for releasing a stable build from there).

As for whether jlibtorrent causes some problems, that's definitely possible. Especially if it uses some of the older APIs (and some of the new ones are only available in master). You may want to check out these posts as well:

http://blog.libtorrent.org/2012/12/principles-of-high-performance-programs/
http://blog.libtorrent.org/2011/11/scalable-interfaces/

As for your settings: all of ActiveDownloads, ActiveLimit and ActiveSeeds only apply to auto-managed torrents. If none of your torrents are, they won't affect anything.

The half-open limit feature is deprecated and has been removed in master. It's also a very complicated feature that I don't have a lot of confidence in working 100% correctly. You may want to disable that (i.e. set it to infinite) and see if the tracker announces still get stuck. (It's only useful for a brief window from Windows XP SP2 up until Vista SP2; starting with Vista SP2 Microsoft removed this silly limitation.) It's possible that it interacts poorly with the 10 second tracker timeout, for instance, because TCP connections will get queued and it's possible the 10 second timeout doesn't account for that.
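The suspected interaction can be illustrated with a deliberately simplified queueing model. It is not libtorrent's implementation: it assumes, hypothetically, that every connection attempt holds a half-open slot for a fixed time and that the tracker timeout clock starts when the request is queued rather than when the connection attempt begins. All names are made up for the sketch.

```java
// Toy model of a half-open connection limit; all names are hypothetical.
final class HalfOpenQueueModel {

    // Seconds a request at queue position `position` (0-based) waits for a
    // slot if `limit` attempts run at once and each holds its slot for
    // `attemptSeconds` (an assumed, fixed duration).
    static long waitSeconds(int position, int limit, int attemptSeconds) {
        if (limit <= 0 || position < 0 || attemptSeconds < 0) {
            throw new IllegalArgumentException("arguments out of range");
        }
        return (long) (position / limit) * attemptSeconds;
    }

    // How many of `queued` requests have already waited `timeoutSeconds`
    // or longer before their connection attempt even starts.
    static int timedOutBeforeStart(int queued, int limit,
                                   int attemptSeconds, int timeoutSeconds) {
        int count = 0;
        for (int position = 0; position < queued; position++) {
            if (waitSeconds(position, limit, attemptSeconds) >= timeoutSeconds) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Limit 5 and timeout 10 s come from the settings in this thread;
        // one second per attempt is an assumption.
        System.out.println(timedOutBeforeStart(10_000, 5, 1, 10)); // 9950
    }
}
```

Under these assumptions only the first 50 of 10,000 queued requests would get a slot before the 10 second timeout elapses, which is the kind of stall that raising or disabling the limit would rule out.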

Instead of stopping LSD and DHT, you can avoid starting them by passing in flags 0 to the session constructor. (That will also prevent you from requesting nodes from your DHT router node every time on startup.)

Is there a reason to disable the read cache? Did you find it perform poorly?

@arvidn
Owner

arvidn commented Aug 2, 2015

I would probably suggest organizing songs as a torrent per album, or some kind of aggregation. You can still just download a few files from a torrent, but you'll cut a factor out of the overhead associated with a single torrent.

@dessalines
Author

@arvidn Thanks for all of this, it's extremely helpful.

> The best way of doing this is to build libtorrent with session stats logging, and then generate graphs from those logs. At least there's a chance that bottlenecks can be identified from that (but it's not always straight-forward and sometimes may require quite a lot of analysis).

I'm not sure if there's any way to turn on session stats through the jlibtorrent layer I'm using. Perhaps @aldenml could chime in and I can figure out how to do this. I realize that I'm asking for help, and you need more info to provide it, so I need to try to come up with those logs in some way.

> However, in master at least only auto-managed torrents are part of this process, so it's basically free when you don't have any auto-managed torrents.

In my testing, I found that automanage, even with only a small number of torrents active, did not alleviate the stalling problems. The only thing that did was actively pausing them after receiving a reply or checking them. I'm not sure if it's a memory or disk issue.

> If you don't have resume data, it will even read the whole file to compare it to the piece hashes. The one way to avoid this cost is to add the torrent in seed_mode (where you promise you're seeding it).

I do use resume data, but I'll try setting this seed_mode flag (something to be added to jlibtorrent, not your problem) and see if it fixes my problems.

> If you control the tracker, really what you should do is to make it always include your seed(s) in its responses, and have your seed not have to announce to it. That would save the seed a lot of time and bandwidth.

I don't control the tracker; I'm using open.demonii.com. But still, how do I avoid announcing to a tracker? I've looked through the documentation many times, and can't figure out how to prevent a torrent_handle from announcing after it's been resumed. I had assumed, though, that this occasional announcing was necessary, or the tracker would forget you as a peer. For my purposes, I just extended the min_announce_interval to above what it usually is.

At some point though, I have to receive that tracker reply for every torrent, because otherwise, when other clients add that torrent, they won't get that peer, since it hasn't announced. Am I correct there, or does DHT get around this problem in a way that I don't fully understand?

> However, I went through a lot of this a few years ago and fixed master. You really should just use master. (If you do, please let me know if there are any issues in there you would consider blockers for releasing a stable build from there).

I would bet that those fixes you made are in the version I'm using, libtorrent_RC_1.0.5.0

> The half-open limit feature is deprecated and has been removed in master.

I will remove this. I'd suggest also updating the libtorrent/tuning doc to reflect this.

> Is there a reason to disable the read cache? Did you find it perform poorly?

I can try removing that too, it's probably just a leftover setting from when I was spamming pretty much every setting I could find trying to get it to work.

> I would probably suggest organizing songs as a torrent per album, or some kind of aggregation.

I'm really wanting to fight for single-song torrents, for several reasons:

  • Songs can be on multiple albums (sometimes as many as 20 albums!)
  • Users can download individual songs without having to deal with selecting songs within an album torrent (whether that selection be automatic or manual)
  • Data on song seeders is simplified (because each torrent is a song). If the torrent were per album or group, the seeder info could potentially be wrong.

@arvidn
Owner

arvidn commented Aug 2, 2015

> I'm not sure if there's any way to turn on session stats through the jlibtorrent layer I'm using. Perhaps @aldenml could chime in and I can figure out how to do this.

It's a compile-time switch. You have to build libtorrent with statistics=on (in master it's controllable programmatically and no longer a compile-time switch, but instead you have to write the values to disk yourself). When it's enabled in 1.0.x, the current working directory must be writeable by libtorrent. It will create a directory with the logs there.

> In my testing, I found that automanage, even when having only a small number of torrents active, did not alleviate the stalling problems.

Yeah, I don't think that's the bottleneck you're hitting.

> I don't control the tracker; I'm using open.demonii.com. But still, how do I avoid announcing to a tracker? I've looked through the documentation many times, and can't figure out how to prevent a torrent_handle from announcing after it's been resumed.

You can remove the trackers from the torrent_info before adding it (or the &tr= keys if you're adding magnet links).

> I had assumed, though, that this occasional announcing was necessary, or the tracker would forget you as a peer. For my purposes, I just extended the min_announce_interval to above what it usually is.

That's right. That's why you can only apply this optimization if you control the tracker, to make it always know about you regardless of if you announce or not.

> At some point though, I have to receive that tracker reply for every torrent, because otherwise, when other clients add that torrent, they won't get that peer, since it hasn't announced. Am I correct there

That's right.

> or does DHT get around this problem in a way that I don't fully understand?

The DHT is like its own tracker. You still need to announce to it regularly. The main distinction is that announcing to the DHT is more expensive than announcing to a tracker (but the cost is possibly more distributed). The original DHT requires you to reannounce every 15 minutes; libtorrent relaxes this a bit to 30 minutes. Busy trackers can set the reannounce interval to a lot longer though, e.g. 1 or 2 hours.
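As a rough back-of-the-envelope sketch, the intervals mentioned above translate into a steady-state announce rate as follows. The class name is hypothetical, the calculation counts one announce per torrent per interval, and it ignores the extra messages that make each DHT announce more expensive than a tracker announce.

```java
// Hypothetical helper for estimating announce load; not a libtorrent API.
final class AnnounceRate {

    // Average announces per second needed to keep `torrents` torrents listed
    // when each one must re-announce every `intervalSeconds`.
    static double perSecond(long torrents, long intervalSeconds) {
        if (torrents < 0 || intervalSeconds <= 0) {
            throw new IllegalArgumentException("arguments out of range");
        }
        return (double) torrents / intervalSeconds;
    }

    public static void main(String[] args) {
        System.out.println(perSecond(10_000, 15 * 60));    // original DHT interval
        System.out.println(perSecond(10_000, 30 * 60));    // libtorrent's relaxed DHT interval
        System.out.println(perSecond(10_000, 2 * 60 * 60)); // a busy tracker asking for 2 hours
    }
}
```

For the 10,000 torrents in this thread that works out to roughly 11, 5.6 and 1.4 announces per second respectively; for a million torrents at the 30 minute interval it would be roughly 556 per second.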

> I would bet that those fixes you made are in the version I'm using, libtorrent_RC_1.0.5.0

No, that's still the 1.0.x branch. master is still unreleased and unofficially the 1.1.x branch. I would really like to release a stable build from master though. I've spent a lot of time lately on its test coverage to make sure it's stable enough.

@dessalines
Author

Ah, it seems that it's unavoidable for me then; I have to announce every torrent on the tracker periodically.

I read an article that you wrote about the difference between peer-centric networks and content-centric networks. Wouldn't it be possible to combine these things? I.e., if the DHT doesn't already do this, use the DHT to get a list of active torrents from other peers, after only receiving the reply for one torrent?

This would mean you'd potentially only have to announce 1 torrent, but could get information about thousands. I suppose this functionality could be done with either the DHT or trackers (maybe it is and I'm just ignorant of it). Certainly you use methods like PEX to find extra peers; I could see the same type of functionality being used to find more content.

Currently, in my project I have both DHT and LSD turned off, because they were too expensive network-wise, and now just announce to a single UDP tracker.

> No, that's still the 1.0.x branch. master is still unreleased and unofficially the 1.1.x branch.

Ah gotcha, I'll have to wait for the jlibtorrent devs to catch up to that one once you do a proper release.

@dessalines
Author

Hey @arvidn, @aldenml successfully got benchmarking working with jlibtorrent as you might know, and now I have a file_access.log file.

I tried to read http://www.libtorrent.org/tuning.html#benchmarking to see how to interpret the log (I can't read the file).

I'm trying to run parse_disk_log.py on the file_access.log file, and it's not working. Is that the right script? I just guessed that it is; there are other Python scripts there, but none of them apparently process a file_access.log file.

$: python parse_disk_log.py file_access.log 
Traceback (most recent call last):
  File "parse_disk_log.py", line 43, in <module>
    new_time = long(l[0])
ValueError: invalid literal for long() with base 10: ''

The only search result I could find of file_access.log was its reference in a .cpp file, not a python script.

@arvidn
Owner

arvidn commented Aug 9, 2015

Even more interesting would be to get session stats. Those actually count internal events, buffer sizes, limits and rates. They can give a much better understanding of what's going on internally.

@dessalines
Author

Ah. I think you showed @aldenml how to turn on default_storage.disk_write_access_log(true); but not those. I'd much appreciate any type of logging I can get in order to try to debug this issue.

@arvidn
Owner

arvidn commented Aug 9, 2015

You get session stats by building with statistics=on (if building with boost-build) or defining TORRENT_STATS.

@dessalines
Author

@arvidn Okay, here are my session stats, finally (with uTP turned off). Is tarred and gzipped fine?

https://drive.google.com/file/d/0B0PA7XM1BmLMOWV3b2FVV1R1Nkk/view?usp=sharing

Run:
tar -vxzf session_stats.tar.gz

This is just adding all of my 10k torrents to the session, and resuming them at the same time. It got up to about 500 before it stopped resuming the rest.

@arvidn
Owner

arvidn commented Aug 13, 2015

Almost all of your torrents are stopped. That's why they're not announcing to the tracker and aren't available (see the pink plot on the top graph, "torrents"). They are also not auto-managed, so libtorrent is not responsible for stopping and starting them.

@arvidn
Owner

arvidn commented Aug 13, 2015

You added the torrents as non-auto-managed, right? And then you simply called torrent_handle::resume() on all of them, but only 500 started?

@dessalines
Author

@arvidn ooh you might be right. Currently I'm adding them to the session, then disabling automanage. I suppose I need to set the seed_mode and automanage flags. I'll do this today and see if it makes a difference.

@arvidn
Owner

arvidn commented Aug 14, 2015

Please make sure you understand what seed-mode means. Doing it the way I described should work. Was that what failed for you?


@arvidn
Owner

arvidn commented Aug 14, 2015

Oh, I see. If you add them started and auto-managed, then disable auto-managed, libtorrent may already have stopped them before you disable auto-managed.


@dessalines
Author

@arvidn This appears to have worked! Ever since I added the torrents with seed_mode = true and auto_manage = false, I can add all my 10k torrents to the session, and it appears to be working fine (with uTP turned off though; see #77).
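The ordering problem behind this fix can be illustrated with a small toy model. It is not libtorrent or jlibtorrent code: the `Torrent` class, the queue-manager pass and the limit of 500 are all hypothetical, with 500 chosen only to echo the figure reported earlier in the thread. The thread does not establish exactly which mechanism stopped the real torrents; the sketch only shows why clearing the auto-managed flag after adding can come too late, while setting the flags at add time cannot.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; every name here is hypothetical.
final class AutoManageRaceModel {

    static final class Torrent {
        boolean autoManaged;
        boolean paused;

        Torrent(boolean autoManaged, boolean paused) {
            this.autoManaged = autoManaged;
            this.paused = paused;
        }
    }

    // Stand-in for one queue-manager pass: auto-managed torrents beyond the
    // active limit are paused; non-auto-managed torrents are never touched.
    static void queueManagerPass(List<Torrent> torrents, int activeLimit) {
        int active = 0;
        for (Torrent t : torrents) {
            if (!t.autoManaged) {
                continue;
            }
            t.paused = active >= activeLimit;
            if (!t.paused) {
                active++;
            }
        }
    }

    static int countRunning(List<Torrent> torrents) {
        int running = 0;
        for (Torrent t : torrents) {
            if (!t.paused) {
                running++;
            }
        }
        return running;
    }

    // Add as started + auto-managed, let the manager run, then clear the flag.
    static int addThenDisable(int total, int activeLimit) {
        List<Torrent> torrents = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            torrents.add(new Torrent(true, false));
        }
        queueManagerPass(torrents, activeLimit);
        for (Torrent t : torrents) {
            t.autoManaged = false; // too late: most are already paused
        }
        return countRunning(torrents);
    }

    // Add as started + non-auto-managed from the beginning.
    static int addWithFlags(int total, int activeLimit) {
        List<Torrent> torrents = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            torrents.add(new Torrent(false, false));
        }
        queueManagerPass(torrents, activeLimit);
        return countRunning(torrents);
    }

    public static void main(String[] args) {
        System.out.println(addThenDisable(10_000, 500)); // 500
        System.out.println(addWithFlags(10_000, 500));   // 10000
    }
}
```

In the model, torrents that were paused while still auto-managed stay paused once the flag is cleared, because nothing is responsible for starting them again.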

I'm pretty excited and have been stress-testing my application all day, and so far every one of the 10k torrents is seeding correctly... http://torrenttunes.ml .

Thanks a ton @arvidn @aldenml , I hit a brick wall on this issue for about a month, and I'm only past it cause you guys kick ass.

You can close this issue out, unless you want to solve the uTP issue first.

@arvidn
Owner

arvidn commented Aug 14, 2015

Let's track the uTP issue in the other ticket.
