Add an option to avoid a single user returning 1000+ results instead of 50 (limit results per user, exclude users from a search, include only results from one country...) #2133

Closed
MasterSuggester opened this issue Jul 31, 2022 · 34 comments

Comments

@MasterSuggester

MasterSuggester commented Jul 31, 2022

Nicotine+ version: 3.2.2 • GTK 3.24.33
Operating System/Distribution: Windows 10 Enterprise 2016 LTSB

Describe the bug

There seems to be a search result limit of 25000 that you can't change. That can be an issue in itself, but let's move on to the actual bug. When you search for a word that returns many thousands of results (potentially far more than the default limit of 25000, such as "01"), the system somehow detects this (probably because every single user sharing files has some file with "01" in its name) and simply stops your search at about 5000 results shown.

Expected behavior

If the search result limit is not going to be extended beyond 25000, I would suggest at least allowing these kinds of massive searches to reach the limit of 25000 instead of stopping after 5000 results.

Steps to reproduce the bug

Simply search for "01" and observe the low number of results returned and how fast they seem to appear, which seems to indicate the program suddenly "gets scared" and stops by default.

You can also try to run the same massive search only on the room "Nicotine" and observe how the number of results is roughly the same, which makes it very unlikely that these are the only results.

Additional context

Screenshots, logs, stacktraces or relevant information.
(N/A)

@MasterSuggester MasterSuggester changed the title Massive searches show much less than the actual results (and much less than the limit of 25000) Massive searches show much less results than the actual number (and much less than the limit of 25000) Jul 31, 2022
@mathiascode
Member

mathiascode commented Jul 31, 2022

There seems to be a search result limit of 25000 that you can't change.

It's there because the program hasn't been tested and optimized for more results than the limit yet. You can manually bypass this limit by editing the config file, but I don't recommend this.

"01" in the file name

Keep in mind that this search term is shorter than three characters, so you'll only get results from Soulseek NS and some older clients. A search term with more results would be e.g. "music".

I would suggest at least allowing these kinds of massive searches to reach the limit of 25000 instead of stopping after 5000 results.

We're not intentionally stopping at 5000.

There have been some changes related to searches in Nicotine+ 3.2.3rc3. Could you check if the issue still occurs there? https://nicotine-plus.org/doc/TESTING.html

@MasterSuggester
Author

MasterSuggester commented Jul 31, 2022

I shouldn't have tried because the issue is clearly what you mentioned about the minimum of 3 characters, so I assume only SoulseekNS results are appearing. But anyway, I tried with the Unstable release and the "issue" persists (not an actual issue as we now know it's by design!). Thanks!

EDIT: I just tried searching for "mp3", which should return a huge result list, and the same "issue" happens. I get far fewer results than when searching for "madonna", for example (the scroll bar handle is twice as long, which means a much shorter list). So something seems to be stopping Nicotine+ after 2K-3K results instead of 25K when a massive number of results is detected (or maybe the minimum number of characters is 4 instead of 3?).

@mathiascode
Member

mathiascode commented Jul 31, 2022

What is your operating system? Is your listening port open? Something strange is still going on here (assuming your search scope is set to "Global").

My listening port is closed, and these are the results I get:

  • "01": 20k results
  • "music": 25k results (limit hit)
  • "mp3": 25k results (limit hit)
  • "madonna": 25k results (limit hit)

Note that SoulseekQt clients don't send results for the search term "mp3" either (presumably blacklisted in the client), which makes it even more strange that you're receiving so few results.

@MasterSuggester
Author

MasterSuggester commented Jul 31, 2022

Thanks! I'm glad that it works for you, I knew something was off.

Yeah I just searched for "music" and I have the same issue! Very few results, search stops after a few seconds.

My operating system is Windows 10, but I'm almost sure my listening port is not open. Let me check.

Okay, I checked via Nicotine+ and the page says: "IP: 93.117.84.244 Port: 2234/tcp open. Your router and Soulseek client is configured correctly. Don't forget that you need to restart your client if you modify your listening port."

Maybe it's my ISP?

@mathiascode
Member

Could you do the following?

  1. Enable debug logging to file in Preferences -> Logging
  2. Right-click the log pane (three dots in the bottom right corner) -> log categories -> enable "messages" and "connections".
  3. Perform a search and send the logs to the email on my profile (right-click the log pane -> "open log folder" to see the logs)

Also, when performing a search, does the connection count in the status bar get stuck at a number close to 512 for a longer time?

@MasterSuggester
Author

MasterSuggester commented Jul 31, 2022

I still haven't done that procedure, but I just saw the number when running another massive search ("grunge"). Exactly 512.

EDIT: As you predicted, the number goes up to 512 (strangely, long after the search has stopped adding results to the list), but it never gets higher than 512 and it stays there for about a minute.

EDIT 2: My app is in Spanish, so now I have a second reason to change the language to English: the logs are translated into Spanish. How can I change the language to English? I didn't find an option. Maybe editing the config file?

@mathiascode
Member

EDIT 2: My app is in Spanish, so now I have a second reason to change the language to English: the logs are translated into Spanish. How can I change the language to English? I didn't find an option. Maybe editing the config file?

You can set the LANGUAGE environment variable to en_US.UTF-8. I don't quite remember how to do that on Windows, but there should be tutorials. In any case, the relevant debug logs should always be in English anyway.
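As a minimal sketch of the environment variable approach, the Python snippet below builds a copy of the current environment with `LANGUAGE` set to `en_US.UTF-8`. The helper name `english_environment` is hypothetical and only illustrates the idea; it is not part of Nicotine+.

```python
import os


def english_environment(base_env=None):
    """Return a copy of the environment with LANGUAGE forced to English."""
    env = dict(os.environ if base_env is None else base_env)
    env["LANGUAGE"] = "en_US.UTF-8"
    return env


# Example with a Spanish-language environment as the starting point
env = english_environment({"PATH": "/usr/bin", "LANGUAGE": "es_ES.UTF-8"})
print(env["LANGUAGE"])
```

The resulting dictionary could then be passed as the `env` argument of `subprocess.Popen` when launching the program from a script; the path to the executable depends on your installation.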

it never gets higher than 512 and it stays there for about a minute.

Coincidentally, I was looking into a similar issue the other day. Once you've gathered logs for the current testing build, could you check if this build fixes the issue (scroll down to the bottom of the page): https://github.com/mathiascode/nicotine-plus/actions/runs/2769641053

@MasterSuggester
Author

UPDATES:

  • I tried with the build you suggested, but the issue persists. Searching for "mp3" returns half as many results as searching for any famous rock band.

  • Also, I noticed you assumed you weren't experiencing the issue just because you saw "25000+ results", so a warning: this counter only shows the "detected" results, not the "shown" results. You can test this yourself by comparing how short the list is when you search for "mp3" or "music" with the list for any random band; it should be at least as long. (You mention that "01" has 20K results, which is clearly too low since every single user has a "01" in their files, but we can ignore that one because of the 2-character issue you mentioned.)

@mathiascode
Member

mathiascode commented Jul 31, 2022

  • I tried with the build you suggested, but the issue persists. Searching for "mp3" returns half as many results as searching for any famous rock band.

I think it boils down to the connections clogging up and preventing further connections from coming through (I don't have this issue on my end). I need the logs to debug this further though.

  • Also, I noticed you assumed you weren't experiencing the issue just because you saw "25000+ results", so a warning: this counter only shows the "detected" results, not the "shown" results. You can test this yourself by comparing how short the list is when you search for "mp3" or "music" with the list for any random band; it should be at least as long. (You mention that "01" has 20K results, which is clearly too low since every single user has a "01" in their files, but we can ignore that one because of the 2-character issue you mentioned.)

It's definitely the shown results; the counter increments for every file row added to the GUI. I searched for "01" again, and verified that I had 25k visible results from 72 users (presumably more users were logged in compared to last time).

@MasterSuggester
Author

MasterSuggester commented Jul 31, 2022

Could you do the following?

  1. Enable debug logging to file in Preferences -> Logging
  2. Right-click the log pane (three dots in the bottom right corner) -> log categories -> enable "messages" and "connections".
  3. Perform a search and send the logs to the email on my profile (right-click the log pane -> "open log folder" to see the logs)

Also, when performing a search, does the connection count in the status bar get stuck at a number close to 512 for a longer time?

When I click on the three dots at the bottom right corner, nothing happens. No menu appears.

@slook
Member

slook commented Jul 31, 2022

  • Main Menu > View > Show Log History pane (or keyboard shortcut is Ctrl+L )

  • Then Right click on the log history pane area > Log Categories > "[Debug] Connections"

  • Again Right click on the log history pane area > Log Categories > "[Debug] Messages"

@MasterSuggester
Author

MasterSuggester commented Jul 31, 2022

Thanks slook! I just sent the log file to mathiascode. However, Nicotine+ crashed while I was collecting the logs... too much information I guess! The .log file was 64MB already.

@slook
Member

slook commented Jul 31, 2022

Wow! That sure is a lot of information! It might take some time to investigate, but hopefully it can lead to some useful fixes for this sort of edge case - thank you for your time in reporting this.

What happened in the crash? Did you get a traceback in a Critical Error dialog? In fact, I think some of the CPU hogging that occurs with logging is improved in Nicotine+ 3.2.3... Can you verify that the same crash exists in that version?

If you want to test any recent bug fixes as they get added see:

It would be most useful to test a package from the latest master branch, to verify whether any problematic behaviour still exists or whether it has already been solved by recent cleanups (or whether a bug presents itself in a slightly different way).

@mathiascode
Member

Based on the log output, could you shed some light on these questions?

  • Do you receive a lot of results for an initial search, if you restart Nicotine+ and search for e.g. "music"?
  • Were the searches with fewer results performed within a minute of the previous search, while the connection count was stuck on 512?

@MasterSuggester
Author

MasterSuggester commented Jul 31, 2022

The answer to both questions is "no".

The process is:

  1. I search for "mp3"
  2. Nicotine+ appears to go crazy for some seconds, "hard lock" of the OS included
  3. The results start appearing and suddenly stop after a few seconds
  4. The results count keeps on getting bigger until reaching "25000+", even if no new results appear
  5. The Connection count slowly gets bigger and bigger until reaching 512, even if no new results appear
  6. When the Connection count gets to 512, it stays there for 30 seconds-1 minute
  7. Then the Connection count starts to get smaller
  8. When you look at the final results, you see things that make no sense when searching for "mp3", such as only 6 users from my country vs. around 20 if I search for "rosalia".

@slook
Member

slook commented Jul 31, 2022

Which OS / platform?

Which Nicotine+ version are you running for the above test?

@MasterSuggester
Author

The Nicotine+ version is the last stable one, that's already in the bug info. But my OS was not, so I added it. Now it looks like this:

Nicotine+ version: 3.2.2 • GTK 3.24.33
Operating System/Distribution: Windows 10 Enterprise 2016 LTSB

@mathiascode
Member

I have added some code to the latest testing build to close inactive connections if the connection limit is reached, but

The results count keeps on getting bigger until reaching "25000+", even if no new results appear

is the crucial part.

It shouldn't be possible for the result counter and actual results to go out of sync, since the counter is incremented here when a result is added: https://github.com/nicotine-plus/nicotine-plus/blob/3.2.x/pynicotine/gtkgui/search.py#L810

Not sure what's happening here... It's not the case of a single user having thousands of search results either?
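The idea of closing inactive connections once the connection limit is reached can be sketched roughly as follows. This is only an illustration, not the actual Nicotine+ code: the `ConnectionPool` class, its method names and the 60-second idle timeout are all assumptions, and only the 512 figure comes from the connection count reported in this thread.

```python
import time


class ConnectionPool:
    """Hypothetical pool that evicts idle connections when it is full."""

    def __init__(self, limit=512, idle_timeout=60.0):
        self.limit = limit                # maximum simultaneous connections
        self.idle_timeout = idle_timeout  # assumed value, in seconds
        self.last_active = {}             # connection id -> last activity time

    def touch(self, conn_id, now=None):
        """Record activity on an existing connection."""
        self.last_active[conn_id] = time.monotonic() if now is None else now

    def close_inactive(self, now):
        """Drop every connection that has been idle for too long."""
        stale = [conn_id for conn_id, last in self.last_active.items()
                 if now - last >= self.idle_timeout]
        for conn_id in stale:
            del self.last_active[conn_id]
        return len(stale)

    def open(self, conn_id, now=None):
        """Try to register a new connection; return False if still full."""
        now = time.monotonic() if now is None else now
        if len(self.last_active) >= self.limit:
            self.close_inactive(now)
        if len(self.last_active) >= self.limit:
            return False
        self.last_active[conn_id] = now
        return True
```

Without the eviction step, a full pool would reject every new peer until old connections time out on their own, which would be consistent with the count sitting at 512 for about a minute.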

@slook
Member

slook commented Jul 31, 2022

@MasterSuggester Please install the latest unstable build of Nicotine+ 3.3.0dev1 to test the recent fix that was pushed. It can be installed alongside your existing install without losing your settings (although, just in case, it's always a good idea to Export your Preferences when flipping between older versions).

@mathiascode There might be a series of commits to resolve this, which should ideally be properly tested before considering backporting into the 3.2.x branch.

@MasterSuggester
Author

MasterSuggester commented Jul 31, 2022

I wasn't taking into account that these searches return thousands of results per user instead of 1-100. So, ironically, they hit the limit with fewer users. Maybe that's all! Now if only I could reduce the number of results so I can find more users, perhaps by adding "mp3 01"...

(Unless you guys see something else and believe this is a bug, of course, but this theory would explain why I see "less results". I'm only seeing the user list, not the full list of files! Maybe I've been wasting your time)

@MasterSuggester
Author

Perhaps I should have raised this as a feature request: "Return a single result from each user". This option could help limit the search results when you're only looking for someone who has "X".

I would also like to say that it would be great if filters actually affected the maximum number of results, by removing the 90% of results that don't match the filter and leaving room for more results that do. I guess that's not possible, though, as the filter acts on completed searches.

@slook
Member

slook commented Jul 31, 2022

You do not have any control over how many results per user are received, since it is the peer on the other end that decides this in their Preferences > Searches > "Maximum search results to send per search request" (default is 150). Obviously, if some user has set a very high number, this will reduce the number of results you will see from other users.
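To illustrate how a few peers with a very high per-request limit can crowd out everyone else, here is a rough simulation. It assumes results are added in arrival order until a global limit of 25000 is reached; the function name and the sample numbers are made up for the example.

```python
def users_shown(responses, max_results=25000):
    """Count how many users fit before the global result limit is hit.

    `responses` is a list of (username, result_count) pairs in arrival order.
    """
    shown = 0
    users = 0
    for _username, count in responses:
        if shown >= max_results:
            break
        shown += min(count, max_results - shown)
        users += 1
    return users


# 1000 peers that each send 50 results
modest = [(f"user{i}", 50) for i in range(1000)]
# The same peers, preceded by two peers that each send 10000 results
heavy = [("big1", 10000), ("big2", 10000)] + modest

print(users_shown(modest))  # 500
print(users_shown(heavy))   # 102
```

In this toy scenario, two heavy senders reduce the number of visible users from 500 to 102, even though the total result count is the same.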

For the full view select "Ungrouped" from the Result Grouping Mode menu on the upper right of the Search Files toolbar, although it might be the case that some bug exists when building/counting the results in a particular view mode.

"hard lock" of the OS included

For me this is an important factor that would be good to identify and avoid if possible.

I guess that's not possible as the filter acts on completed searches.

In past versions, there was another setting for storing a higher number of results than the number displayed on-screen. I believe this was indeed for the purpose of showing more results when filtered, as you have suggested. Nowadays this isn't needed, since connection speeds and the application's improved performance are intended to be good enough to display large lists without any limit in most cases. The edge case of massive searches using popular common-word queries is an area that still requires improvement.

It is possible to exclude search results that contain certain words by using the - operator in the search entry, which does not eat into your maximum displayed results and thus may allow better filtering in some cases.
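A simplified sketch of how the `-` operator might be interpreted is shown below. It uses plain substring matching on the file path, which is an assumption made for brevity; the real matching rules in Nicotine+ may differ, and both function names are hypothetical.

```python
def split_search_terms(query):
    """Split a query into words to include and words to exclude."""
    included, excluded = [], []
    for word in query.lower().split():
        if word.startswith("-") and len(word) > 1:
            excluded.append(word[1:])
        else:
            included.append(word)
    return included, excluded


def matches(path, query):
    """Return True if the path has every included word and no excluded one."""
    included, excluded = split_search_terms(query)
    path = path.lower()
    return (all(word in path for word in included)
            and not any(word in path for word in excluded))


print(matches("Music/Album/01 Intro.mp3", "mp3 -live"))  # True
print(matches("Music/Live/01 Intro.mp3", "mp3 -live"))   # False
```

Because excluded results are rejected before they are added, they never count towards the displayed maximum.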

@mathiascode
Member

mathiascode commented Jul 31, 2022

Perhaps I should have raised this as a feature request: "Return a single result from each user". This option could help limit the search results when you're only looking for someone who has "X".

This shouldn't be necessary if your search term is specific enough. May I ask why you're using such a generic search term? Are you trying to find as many online users as possible?

Perhaps there would be another, less cryptic way to accomplish your goal.

Obviously if some user has set a very high number then this will reduce the number of results you will see from other users.

I believe the official clients don't limit the number of results they send.

@slook
Member

slook commented Jul 31, 2022

I think the OP might be trying to establish how many users have a certain file, perhaps this would be interesting to gauge the popularity of said file or something along those lines?... but for most normal use cases it probably isn't required as a core feature.

Some parts of this bug report do have relevance in that more testing and debugging might be required in the future on the matter of application freezing in certain edge case scenarios, that could be interesting to investigate.

@mathiascode
Member

"hard lock" of the OS included

For me this is an important factor that would be good to identify and avoid if possible.

Inserting 25k results at once isn't really optimized yet. I think you should be able to reproduce the issue to some degree on your end as well, if you increase the limit to 25k.

@slook
Member

slook commented Jul 31, 2022

increase the limit to 25k.

Actually, in this scenario I consider the performance of Nicotine+ 3.2.3rc3 and 3.3.0dev1 to be quite impressive on a Linux machine. Some effort from the dual-core CPU with hyper-threading is indeed evident, and scrolling might be somewhat jerky for a few seconds while the list is being populated, but this is not unreasonable. It certainly does not freeze the OS, as the Windows user has reported when using Nicotine+ 3.2.2.

@MasterSuggester
Author

Perhaps I should have raised this as a feature request: "Return a single result from each user". This option could help limit the search results when you're only looking for someone who has "X".

This shouldn't be necessary if your search term is specific enough. May I ask why you're using such a generic search term? Are you trying to find as many online users as possible?

Perhaps there would be another, less cryptic way to accomplish your goal.

Obviously if some user has set a very high number then this will reduce the number of results you will see from other users.

I believe the official clients don't limit the number of results they send.

Let's imagine someone was trying to find as many users online as possible. Any suggestions of less cryptic methods to find them?

@MasterSuggester
Author

MasterSuggester commented Aug 1, 2022

Hey guys, I was trying to understand the "issue" of a single user returning 5000 results instead of 50. So I went user by user in the "mp3" results and found out that most of the users return exactly 50, but 2 or 3 users return 1000+ results.

I believe these users probably didn't know what they were doing when they increased the "50" to "1000" or maybe "5000". They probably thought this was affecting their own searches, not the searches of other users who find their files.

In any case, the possible solutions to this "problem" would actually be suggestions for potential features:

  1. Add an option to exclude one or several users from a search.

  2. Add an option to limit the results returned per user to a certain value.

  3. Change the way filters work so that they can be applied to the search itself, not to the list of results after the list is created (so that the 25,000 limit is applied only to the results in your filter, instead of your filter reducing the 25,000 results to 5000).
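Suggestion 2 could look something like the sketch below, which caps the number of results accepted from each user before the global limit is applied. This is a hypothetical illustration of the requested feature, not existing Nicotine+ behaviour; the function name and default values are assumptions.

```python
from collections import defaultdict


def cap_per_user(results, per_user_limit=50, max_results=25000):
    """Keep at most `per_user_limit` results per user, up to `max_results`.

    `results` is an iterable of (username, path) pairs in arrival order.
    """
    kept = []
    seen = defaultdict(int)  # username -> number of results kept so far
    for username, path in results:
        if len(kept) >= max_results:
            break
        if seen[username] >= per_user_limit:
            continue  # this user has already used up their share
        seen[username] += 1
        kept.append((username, path))
    return kept


# One user sends 1000 results, another sends a single one
results = [("big", f"{i}.mp3") for i in range(1000)] + [("small", "a.mp3")]
kept = cap_per_user(results, per_user_limit=50)
print(len(kept))  # 51
```

With a per-user cap, the heavy sender contributes only 50 rows, so the remaining capacity stays available for other users.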

I hope you can consider adding one of these! : )

@slook slook added enhancement and removed bug labels Aug 1, 2022
@MasterSuggester MasterSuggester changed the title Massive searches show much less results than the actual number (and much less than the limit of 25000) Add an option to avoid a single user returning 1000+ results instead of 50 (limit results per user, exclude users from a search, include only results from one country...) Aug 1, 2022
@mathiascode
Member

mathiascode commented Aug 1, 2022

  1. Add an option to exclude one or several users from a search.

In Nicotine+ 3.3.0.dev1, you can ignore a user to prevent their search results from appearing.

I could probably backport this behavior to 3.2.3 as well. It's been in 3.3.0.dev1 long enough at this point.

@mathiascode
Member

What happened in the crash? Did you get a traceback in a Critical Error dialog?

Since the log view history wasn't limited in 3.2.2, GTK could crash if you added too many lines.

@MasterSuggester
Author

  1. Add an option to exclude one or several users from a search.

In Nicotine+ 3.3.0.dev1, you can ignore a user to prevent their search results from appearing.

I could probably backport this behavior to 3.2.3 as well. It's been in 3.3.0.dev1 long enough at this point.

Great! I do have a couple of questions, though:

  1. Does that option simply remove the results from the screen after they are "counted" for the limit or does it remove it from the actual search as if you included a "-term" statement?

  2. If the option works by excluding the user(s) from the actual search, and not simply by removing results from the screen once they are counted, is there a way I can install Nicotine+ 3.3.0.dev1?

@slook
Member

slook commented Aug 1, 2022

  1. The results are never added to the list in the first place, in the same way as "-term"
  2. Install package for latest 3.3.0dev1: https://github.com/nicotine-plus/nicotine-plus/blob/master/doc/TESTING.md#windows
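The difference between dropping ignored users before results are counted and filtering them out afterwards can be sketched as follows. Both functions are hypothetical illustrations with a tiny limit for readability; they are not taken from the Nicotine+ source.

```python
def filter_before_limit(incoming, ignored, max_results):
    """Skip ignored users first, so they never count towards the limit."""
    shown = []
    for user, path in incoming:
        if user in ignored:
            continue
        if len(shown) >= max_results:
            break
        shown.append((user, path))
    return shown


def filter_after_limit(incoming, ignored, max_results):
    """Apply the limit first, then hide ignored users from what is left."""
    capped = incoming[:max_results]
    return [(user, path) for user, path in capped if user not in ignored]


incoming = ([("big", f"{i:04}.mp3") for i in range(8)]
            + [("ana", "a.mp3"), ("ben", "b.mp3")])

print(len(filter_before_limit(incoming, {"big"}, 5)))  # 2
print(len(filter_after_limit(incoming, {"big"}, 5)))   # 0
```

The first variant corresponds to the behaviour described above, where ignored users' results are never added to the list in the first place.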

@mathiascode
Member

Is this issue solved?

@slook
Member

slook commented Aug 9, 2022

  1. Add an option to exclude one or several users from a search.

Since Nicotine+ 3.2.4, the search results now exclude 'ignored' users. That would seem to solve the issue.

Please feel free to re-open the issue if you feel that something more is required.

@slook slook closed this as completed Aug 9, 2022