Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve startup #13348

Closed
wants to merge 5 commits into from
Closed

Improve startup #13348

wants to merge 5 commits into from

Conversation

glassez
Copy link
Member

@glassez glassez commented Sep 8, 2020

No description provided.

@glassez glassez added the Core label Sep 8, 2020
@glassez
Copy link
Member Author

glassez commented Sep 8, 2020

This PR is an attempt to improve performance when loading torrents into session (especially when restoring a session, so many torrents are loaded in a row).
Can someone test it with a large number of torrents (on the order of several hundred or thousands) and tell us how it behaves?
It is unlikely to be able to accelerate the restoring of session, but it should make UI more responsive at this time.
PR is in the draft stage (it is also based on other not merged changes), so do not pay attention to the "dirty" git history, etc.
A side effect is that there is a certain time interval between the UI is displayed and the torrent list starts to fill up. If it turns out that it does its main job well, then I'm waiting for your suggestions for improving its interface (for example, displaying progress in some way, or just displaying the "busy" status).

@jagannatharjun, @An0n666, ping.

@ghost
Copy link

ghost commented Sep 8, 2020

I can test on windows if someone can provide me with a test build.
Ping @FranciscoPombal

@c0re100
Copy link

c0re100 commented Sep 8, 2020

I can test on windows if someone can provide me with a test build.
Ping @FranciscoPombal

qb: master+improve-startup
libtorrent: 1.2.10(70eb2b4)

https://drive.google.com/file/d/1sZNbT5tvoPP2d_wVTFQ3mohJ1-Y-ZPr5/view?usp=sharing

@FranciscoPombal
Copy link
Member

@c0re100

qb: master+improve-startup
libtorrent: 1.2.10(70eb2b4)

https://drive.google.com/file/d/1sZNbT5tvoPP2d_wVTFQ3mohJ1-Y-ZPr5/view?usp=sharing

Thanks, beat me to it. Is this before or after the force push from 37aa89d to db3bd7a?

@c0re100
Copy link

c0re100 commented Sep 8, 2020

@c0re100

qb: master+improve-startup
libtorrent: 1.2.10(70eb2b4)

https://drive.google.com/file/d/1sZNbT5tvoPP2d_wVTFQ3mohJ1-Y-ZPr5/view?usp=sharing

Thanks, beat me to it. Is this before or after the force push from 37aa89d to db3bd7a?

After the force push(db3bd7a)

@ghost
Copy link

ghost commented Sep 9, 2020

Ok so I ran multiple types of tests with 5000 torrents on a SSD storage.

First I tried with the last build that ned-martin used. #12825 (comment)

It took 13 seconds for the UI to become responsive(white blank UI to transfer list) with only 1 non-working tracker.

It took 2 mins and 29 seconds with 10 non-working trackers.

Then I tried with the build provided here #13348 (comment)

It took around ~10 seconds for UI to load and display no torrents. But then it went back to Not responding for another 2 mins and 40 seconds. So ultimately the UI remains unusable. I'm not seeing any improvement here. Except now the UI loads earlier with no torrents at all and it remains non responsive for almost the same amount of time.

@glassez
Copy link
Member Author

glassez commented Sep 9, 2020

Well, I will continue further research.

@ghost
Copy link

ghost commented Sep 9, 2020

The load time is faster with less trackers.
5000 torrents 10 trackers each = 50000 announces per endpoint. I have a total of 5 endpoints out of which only 1 has internet. Others are just 127.0.0.1, link local v6 and 2 VM adapters.

@xavier2k6
Copy link
Member

@An0n666

Windows test build of 4.3.0(Alpha1) with listed libraries:

qBittorrent master   | 4.3.0 +git 2e6236e PR #13348 (Sep 09, 2020)
libtorrent-rasterbar | 1.2.10.0 +git 6ed8872 (Sep 09, 2020)
Qt                   | 5.15.1
OpenSSL              | 1.1.1g
zlib                 | 1.2.11
Boost                | 1.74

Download Link

@ghost
Copy link

ghost commented Sep 10, 2020

@An0n666

Windows test build of 4.3.0(Alpha1) with listed libraries:

qBittorrent master   | 4.3.0 +git 2e6236e PR #13348 (Sep 09, 2020)
libtorrent-rasterbar | 1.2.10.0 +git 6ed8872 (Sep 09, 2020)
Qt                   | 5.15.1
OpenSSL              | 1.1.1g
zlib                 | 1.2.11
Boost                | 1.74

Download Link

With zero trackers, session restores completely within 8 seconds. That is 5k torrents get loaded within 8 seconds.
No not responding issue. But the UI remains non responsive for that 5-6 seconds after loading till all torrents restore.
Memory usage is fairly low at 80 MiB.

With 10 non working trackers in each torrent, it takes around 2 min 40 seconds for the UI to go from Not responding to a choppy one that lags like hell. The memory usages gradually climbs up to 800 MiB + and 50% of all 4 CPU cores were being utilised constantly during the Not responding period. The memory seem to get freed pretty slowly.

@ghost
Copy link

ghost commented Sep 10, 2020

Ok so I think I got something:

The trackers that I was using were like this:
http://example1.com
http://example2.com
http://example3.com
and so on...

I changed them to
http://example1.com/announce
http://example2.com/announce
http://example3.com/announce
etc..
And now the locking issue is gone! qBt loads fine and then the TRACKERS section is gradually being populated with errored torrents!
No lag or choppiness as well.

So I think something is breaking due to not having /announce in the tracker URL?
@arvidn ping!

Edit: The issue could be at qBt end as well. Like how the TRACKERS section parses the announce URLS. @glassez

@FranciscoPombal
Copy link
Member

@An0n666

And now the locking issue is gone! qBt loads fine and then the TRACKERS section is gradually being populated with errored torrents!
No lag or choppiness as well.

Did you double-check that the locking occurs without this patch even when using http://examplexxx.com/announce URLs? Just to check if this patch actually improves things and this wasn't just the URL issue all along.

@ghost
Copy link

ghost commented Sep 10, 2020

@An0n666

And now the locking issue is gone! qBt loads fine and then the TRACKERS section is gradually being populated with errored torrents!
No lag or choppiness as well.

Did you double-check that the locking occurs without this patch even when using http://examplexxx.com/announce URLs? Just to check if this patch actually improves things and this wasn't just the URL issue all along.

Checked 3 times. The patch actually doesn't improve anything.
Locking issue is gone when /announce is appended to URLs.

@ghost
Copy link

ghost commented Sep 10, 2020

There's the white loading UI for 10-15 seconds in this build with /announce appended.
#12825 (comment)
Without /announce the lock extends to 2 mins 30 seconds.

On this build #13348 (comment)
The UI loads within 2-3 seconds and remains non responsive for another 5-10 seconds till session gets restored with /announce appended.
Without /announce the locking issue appears and the lock extends for 2 mins 40 seconds.

@xavier2k6
Copy link
Member

xavier2k6 commented Sep 10, 2020

@An0n666 are you testing with "qBittorrent defaults"?

For testing purposes:
Have you tried changing async i/o from 4 -> 8
process memory priority below normal -> normal

@arvidn
Copy link
Contributor

arvidn commented Sep 10, 2020

has anyone profiled the process during the time of lag/stall/choppiness?
I would think the most important thing to understand is what the various threads are doing and/or waiting for.
libtorrent itself doesn't treat tracker URLs that end with /announce any differently than any other URL (except that trackers whose URL does not end with /announce cannot be scraped). Perhaps the trackers send different responses that are quicker to handle when you change the URLs. The best way to know is to profile the problematic case I'd say.

Perhaps a full alert log might help too, but it would still be trying to deduce the cause from bits and pieces, rather than actually looking at the cause of the lag in a profiler.

@FranciscoPombal
Copy link
Member

FranciscoPombal commented Sep 10, 2020

@An0n666 to get the full alert log, you probably want to attempt to reproduce this with client_test. It would be interesting to see how this problem manifests itself with client_test I suppose what we are looking for here is scrape_reply_alert and scrape_failed_alert.

@arvidn

libtorrent itself doesn't treat tracker URLs that end with /announce any differently than any other URL (except that trackers whose URL does not end with /announce cannot be scraped)

qBittorrent will end up trying to scrape those URLs due to the auto scrape interval configuration, configured here:

pack.set_int(lt::settings_pack::auto_scrape_interval, 1200); // 20 minutes

The results are processed here:

int TrackerEntry::numSeeds() const

Would that be the problem? These URLs (http://examplexxx.com) cannot be scraped.

@arvidn
Copy link
Contributor

arvidn commented Sep 10, 2020

scrape attempts for HTTP trackers whose URL does not end with /announce will just fail immediately, here: https://github.com/arvidn/libtorrent/blob/RC_1_2/src/http_tracker_connection.cpp#L78

@ghost
Copy link

ghost commented Sep 10, 2020

One difference that I forgot to mention between /announce and w/o /announce case.
When the lock ends with the latter, the errored tracker torrent count shows 5000, meaning all had finished announcing and the lock was released.
While with the former, there’s no UI lock and the errored tracker count gradually increases.
I’ll try to reproduce this on ubuntu with a build having debug symbols then get a stack trace if I get time.

@glassez
Copy link
Member Author

glassez commented Sep 10, 2020

@An0n666, are all 5000+ mentioned torrents that you use in tests really running (i.e. started in non-paused state)?

It took 2 mins and 29 seconds with 10 non-working trackers.

IIRC, all your torrents have the same 10 trackers. Am I right?

@ghost
Copy link

ghost commented Sep 10, 2020

@An0n666, are all 5000+ mentioned torrents that you use in tests really running (i.e. started in non-paused state)?

Yes. Running in downloading state(stalled).

@ghost
Copy link

ghost commented Sep 10, 2020

It took 2 mins and 29 seconds with 10 non-working trackers.

IIRC, all your torrents have the same 10 trackers. Am I right?

Yes same 10 http trackers without the /announce part in their URL.

@glassez
Copy link
Member Author

glassez commented Sep 10, 2020

@An0n666
Can you measure it under the following conditions:

  1. All 5000 torrents are paused
  2. One from 5000 is running (all others are paused)
  3. 10 from 5000 are running

@xavier2k6
Copy link
Member

@An0n666
Can you measure it under the following conditions:

  1. All 5000 torrents are paused
  2. One from 5000 is running (all others are paused)
  3. 10 from 5000 are running

@glassez do you want me to provide a build with latest commit ecf8c37 for this or previous build with 2e6236e is ok?

@ghost
Copy link

ghost commented Sep 10, 2020

@An0n666
Can you measure it under the following conditions:

  1. All 5000 torrents are paused
  2. One from 5000 is running (all others are paused)
  3. 10 from 5000 are running
  1. 10 seconds white UI followed by empty transfer list followed by session restored on 17th second. No lag/choppiness afterwards.
  2. Same as number 1.
  3. Same as number 1.
  4. I ran another test with 500 torrents running. This time 10 seconds white UI followed by empty transfer list followed by session restored on 22nd second. No lag/choppiness afterwards.

Ran the same test with torrents with /announce appended to tracker.

  1. 10 seconds white UI followed by empty transfer list followed by session restored on 17th second. No lag/choppiness afterwards.
  2. Same as number 1.
  3. Same as number 1.
  4. Same as number 1.

I have found more interesting thing:
With /announce appended, libtorrent announces to each tracker in a loop, that is it moves to the next tracker once the upper one fails.
Capture with /announce appended URL : https://gyazo.com/b1aa352b233fd07ed36067b4ed21f59b

But without /announce appended, it announces to all trackers at once...skipping the looping. I think this generates 5000*10 = 50000 alerts at once per endpoint. And this is causing the locking issue.
Capture without /announce : https://gyazo.com/d5c1c6ed78b906e4644056bb7b4e840c

@glassez
Copy link
Member Author

glassez commented Sep 27, 2020

what is the alert_queue_size set to?

By default libtorrent sets it to 2000.

AFAIK, it is redefined in qBittorrent. But why do you ask about it?

@ghost
Copy link

ghost commented Sep 27, 2020

what is the alert_queue_size set to?

By default libtorrent sets it to 2000.

AFAIK, it is redefined in qBittorrent. But why do you ask about it?

settingsPack.set_int(lt::settings_pack::alert_queue_size, std::numeric_limits<int>::max() / 2);

@xavier2k6
Copy link
Member

AFAIK, it is redefined in qBittorrent. But why do you ask about it?

Just for myself to better understand the internals of the code & also if there was anything there that could've been tweaked.

@xavier2k6
Copy link
Member

xavier2k6 commented Sep 27, 2020

memory priority was changed previously...could the same be done for I/O priority??

I'm using a program called TaskExplorer v1.2.8 to change (increase) the I/O priority & it is making a positive difference in my testing...

also for reference

@glassez
Copy link
Member Author

glassez commented Sep 28, 2020

Well, I don't see much excitement...
But what do you expect to get in the end? Best of all, it's GUI that doesn't hang, but has some responsiveness.
As you can see, the more responsive GUI you want to get at the start, the longer it takes to restore the session.

Personally, I don't like any of these options.
As I mentioned earlier, the best performance option is to populate GUI elements (transfer list etc.) once the session is fully restored. Up to this point, you can only show some "lightweight" progress indicator.

@ghost
Copy link

ghost commented Sep 28, 2020

Well, I don't see much excitement...
But what do you expect to get in the end? Best of all, it's GUI that doesn't hang, but has some responsiveness.
As you can see, the more responsive GUI you want to get at the start, the longer it takes to restore the session.

Personally, I don't like any of these options.
As I mentioned earlier, the best performance option is to populate GUI elements (transfer list etc.) once the session is fully restored. Up to this point, you can only show some "lightweight" progress indicator.

What about the cases where people have 100+ trackers per torrent in different tiers and all announce at once at startup? In such cases the GUI will take couple of minutes to load.
Specially when many trackers fail immediately due to a bad response from hostname lookups.

@glassez
Copy link
Member Author

glassez commented Sep 28, 2020

What about the cases where people have 100+ trackers per torrent in different tiers and all announce at once at startup? In such cases the GUI will take couple of minutes to load.
Specially when many trackers fail immediately due to a bad response from hostname lookups.

If I understand correctly, your comment refers to my last suggestion to restore the session behind the scenes, displaying only the progress indicator, am I right?
But what essentially changes in this case? It just adds more work that you can't get away from. In any case, this work can be done much faster if it doesn't affect the GUI every step of the way.
The only alternative I can think of is to restore all torrents in a "paused" state, and then "resume" them when all are restored.

@ghost
Copy link

ghost commented Sep 28, 2020

What about the cases where people have 100+ trackers per torrent in different tiers and all announce at once at startup? In such cases the GUI will take couple of minutes to load.
Specially when many trackers fail immediately due to a bad response from hostname lookups.

If I understand correctly, your comment refers to my last suggestion to restore the session behind the scenes, displaying only the progress indicator, am I right?
But what essentially changes in this case? It just adds more work that you can't get away from. In any case, this work can be done much faster if it doesn't affect the GUI every step of the way.
The only alternative I can think of is to restore all torrents in a "paused" state, and then "resume" them when all are restored.

I mean process the initial tracker alerts behind the scene as well, not just restoring the session.

@arvidn
Copy link
Contributor

arvidn commented Sep 28, 2020

here's a crazy thought:

What if the session is paused on startup, all torrents are added, then the session is resumed. Perhaps that will mitigate an "alert storm" to some extent. Clearly the torrent_added alerts would still be posted.

@ghost
Copy link

ghost commented Sep 28, 2020

When the torrents are being resumed after restoring the session, they could be resumed in batches, to further reduce the rate of alerts. Otherwise the user may experience another lock after the session is restored resumed.

@glassez
Copy link
Member Author

glassez commented Sep 28, 2020

here's a crazy thought:

What if the session is paused on startup, all torrents are added, then the session is resumed. Perhaps that will mitigate an "alert storm" to some extent. Clearly the torrent_added alerts would still be posted.

It's not such a crazy idea. At least we're crazy the same way, since it also crossed my mind. But I haven't had time to study it in detail yet.
My initial attempts failed due to an "assertion fauled" error related to the tracker response, when trying to resume the session.
I also noticed that there is no way to create a session in a paused state, I have to manually call "paused" after it is created.

@glassez
Copy link
Member Author

glassez commented Sep 28, 2020

When the torrents are being resumed after restoring the session, they could be resumed in batches, to further reduce the rate of alerts. Otherwise the user may experience another lock after the session is restored resumed.

Don't forget that different alerts have different processing time.
Anyway we should investigate it to see how it works.

@glassez
Copy link
Member Author

glassez commented Sep 28, 2020

In addition, I noticed that tracker event processing is too wasteful. It affects the GUI with every related alert handling, although we could only do this with regular updates.

@xavier2k6
Copy link
Member

@glassez just as a follow up really....do you want me to provide another test build? I didn't get a ping from you, so was just wondering.....

@glassez
Copy link
Member Author

glassez commented Oct 7, 2020

@glassez just as a follow up really....do you want me to provide another test build? I didn't get a ping from you, so was just wondering.....

I haven't added anything new here since.

@xavier2k6
Copy link
Member

I have identified 4x users through the issue tracker that have ~1800, ~7950, ~12,000 & ~36,000 torrents, one user was already pinged here to try a build/get feedback but no response from them as of yet........should we ping the other 3?

Would be interested to see feedback from the user with ~36,000 torrents........if willing/able to test.

@glassez @FranciscoPombal @thalieht What do you guys think? (these users have experienced the white screen/delayed startup etc)

I can provide a build with glassez:improve-startup commit: dd1e7d0 with latest libtorrent RC_1_2 commit: d622951

@glassez
Copy link
Member Author

glassez commented Oct 11, 2020

Please take a look at another approach!
#13498 (It is still draft too).

@jagannatharjun
Copy link
Member

@glassez how about transferring alert handling into a separate thread, my main objective is snappy GUI with no performance lost on libtorrent side and I suspect most of the slow GUI issues can be related to alert processing (I need to test it but I am too lazy).

@arvidn
Copy link
Contributor

arvidn commented Oct 12, 2020

calling pop_alerts() is thread safe in the sense that you can call it from any thread. As long as all of the alerts from the last call have been processed and forgotten by the time pop_alerts() is called again, you're safe. Having a thread dedicated to processing alerts would be safe from the libtorrent API point of view, since that would be the only thread calling it. Some care would have to be taken to ensure that thread communicates safely with the GUI though. I imagine that part could get hairy.

@glassez
Copy link
Member Author

glassez commented Oct 12, 2020

Some care would have to be taken to ensure that thread communicates safely with the GUI though. I imagine that part could get hairy.

👍

how about transferring alert handling into a separate thread, my main objective is snappy GUI with no performance lost on libtorrent side and I suspect most of the slow GUI issues can be related to alert processing (I need to test it but I am too lazy).

Core processing of alerts is mostly fairly trivial and cannot cause too much load, so we are unlikely to benefit from transferring it to a separate thread, but on the contrary, we can only lose due to multithread interaction.
Thing that can really cause the GUI to slow down is the processing of core events in the GUI classes themselves. But your suggestion won't help either.
In addition, as I understand it, the main problem is the launch of the application. But there, as I have repeatedly mentioned, the best solution is to create a GUI after restoring the Session and loading the rest of the core components (#13498). Otherwise, we have completely meaningless behavior when it is updated so frequently that it does not have time to draw it. In addition, all these intermediate updates during session restoring do not make sense to the user. Why, for example, do you need to see the current number of torrents in a particular filter, if it does not reflect any current state of the session?

@xavier2k6 xavier2k6 mentioned this pull request Oct 12, 2020
@glassez glassez closed this Oct 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants