Fixed: (AnimeBytes) RateLimit 1req per 10s #1573

Merged
merged 1 commit into from
Apr 5, 2023

Conversation

bakerboy448
Contributor

Database Migration

NO

Description

title

Screenshot (if UI related)

Todos

  • Tests
  • Translation Keys (./src/NzbDrone.Core/Localization/Core/en.json)
  • Wiki Updates

Issues Fixed or Closed by this PR

@bakerboy448 bakerboy448 added the Status: Ready for Review Awaiting review label Apr 5, 2023
@mynameisbogdan mynameisbogdan changed the title Fixed: (AnimeBytes) Respect Scape RateLimit 1req/10s Fixed: (AnimeBytes) Respect Scape RateLimit Apr 5, 2023
@bakerboy448 bakerboy448 changed the title Fixed: (AnimeBytes) Respect Scape RateLimit Fixed: (AnimeBytes) RateLimit 1req per 6s Apr 5, 2023
mynameisbogdan
mynameisbogdan previously approved these changes Apr 5, 2023
@proton-ab

Please note official limit is 1 request per 10 seconds. While we indeed don't enforce it at that granularity, this should not be exploited in such ways.

I realize the limit might be too aggressive, however I also do see quite a lot of requests from Prowlarr that do not include any search query. How often Prowlarr needs to perform requests and on average how many requests would it fire per single user interaction (eg. I'd expect user searching for entry on Sonarr to generate 1 request from Prowlarr)?

@bakerboy448
Contributor Author

bakerboy448 commented Apr 5, 2023

however I also do see quite a lot of requests from Prowlarr that do not include any search query.

Prowlarr does not do any searching on its own; either users or third-party apps (i.e. Sonarr/Radarr etc.) are triggering the searches.

Empty requests are likely either RSS searches or tests from clients adding the site to Prowlarr.

Starr apps enforce a minimum RSS interval of 15min and will attempt to page back - if paging is supported - up to 10 pages / 1000 results to find the last release seen.
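The paging behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Sonarr/Radarr code: `RssPagingSketch`, `FetchUntilSeen`, and the `fetchPage` delegate (a stand-in for one indexer request) are hypothetical names, and only the 10-page cap comes from the comment.

```csharp
using System;
using System.Collections.Generic;

public static class RssPagingSketch
{
    // Page cap taken from the comment above ("up to 10 pages").
    public const int MaxPages = 10;

    // fetchPage is a hypothetical stand-in for one indexer request; it returns
    // the release IDs on the given page, newest first.
    public static List<string> FetchUntilSeen(
        Func<int, IReadOnlyList<string>> fetchPage, string lastSeenId)
    {
        var fresh = new List<string>();

        for (var page = 1; page <= MaxPages; page++)
        {
            var releases = fetchPage(page);

            foreach (var id in releases)
            {
                // Stop as soon as we reach the release seen on the previous sync.
                if (id == lastSeenId)
                {
                    return fresh;
                }

                fresh.Add(id);
            }

            // An empty page means there is nothing further back to fetch.
            if (releases.Count == 0)
            {
                break;
            }
        }

        return fresh;
    }
}
```

Under this sketch, one RSS sync costs a single request when the last seen release is still on the first page, and at most 10 requests otherwise.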

How often Prowlarr needs to perform requests and on average how many requests would it fire per single user interaction (eg. I'd expect user searching for entry on Sonarr to generate 1 request from Prowlarr)?

It's not possible to know, as each user's setup is unique and depends on how many apps they are running and the RSS interval for each app. In theory, in a perfect world, it's 1 request per app per RSS interval (15 minutes minimum).

Note that Prowlarr nightly also has improvements around paging which should reduce additional requests as well.

Perhaps that'd resolve the root cause for the increased rate limiting?

Can easily swap it to 10s instead of 6s if that's still needed with the additional context.
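For context, a minimum-interval limiter of the kind being discussed could look roughly like the sketch below. This is not Prowlarr's actual rate-limit code; `MinIntervalLimiter` and `Reserve` are hypothetical names, and the caller is assumed to sleep for the returned delay before sending the request.

```csharp
using System;

public sealed class MinIntervalLimiter
{
    private readonly TimeSpan _interval;
    private DateTime? _lastRequest;

    public MinIntervalLimiter(TimeSpan interval)
    {
        _interval = interval;
    }

    // Returns how long the caller should wait before sending a request at 'now',
    // and records the time at which that request will actually go out.
    public TimeSpan Reserve(DateTime now)
    {
        if (_lastRequest == null)
        {
            _lastRequest = now;
            return TimeSpan.Zero;
        }

        var earliest = _lastRequest.Value + _interval;
        var wait = earliest > now ? earliest - now : TimeSpan.Zero;

        _lastRequest = now + wait;
        return wait;
    }
}
```

Switching between the 6s and 10s variants is then only a matter of the `TimeSpan` passed to the constructor.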

Note that the majority of your requests are going to be Sonarr searches. Due to the nature of anime, the lack of standard series naming, and the absence of season packs for anime, these can be several requests per episode: one for each alias * number of episodes being searched.

@proton-ab

proton-ab commented Apr 5, 2023

Starr apps enforce a minimum RSS interval of 15min and will attempt to page back up to 10 pages / 1000 results to find the last release seen.

Have you considered using the actual RSS endpoints for RSS requests? They are far less costly than performing an empty query across the entire dataset. I'm not particularly a fan of the /scrape.php endpoint being used for RSS needs; it's supposed to expose search.

number of episodes being searched.

We (like most other sites that originally started from the Gazelle codebase) perform search on a group basis, not a torrent basis. I'd expect your software to be aware of this by now and to consolidate requests for a single episode into a request for the base series name.

@mynameisbogdan
Contributor

mynameisbogdan commented Apr 5, 2023

Hello @proton-ab

Please note official limit is 1 request per 10 seconds. While we indeed don't enforce it at that granularity, this should not be exploited in such ways.

Based on #1572, we thought one request per 6s was a nice compromise. Should we change to 1req/10s?

This affects Radarr/Sonarr when searching for something with multiple international titles, but I'm not 100% sure.

I realize the limit might be too aggressive, however I also do see quite a lot of requests from Prowlarr that do not include any search query.

This is the RSS sync functionality: Radarr/Sonarr trigger one request per user-defined interval to fetch the latest releases and grab new ones if the user is monitoring them.

In this case it helps if the page returns 100 releases.

How often Prowlarr needs to perform requests and on average how many requests would it fire per single user interaction (eg. I'd expect user searching for entry on Sonarr to generate 1 request from Prowlarr)?

For interactive search, sadly it depends on how many international titles a series has. If it has 10 titles but finds something on the second request, then it takes 2 requests. If it doesn't find anything for any of them, then it makes 10 requests. Also note that Radarr/Sonarr is making these requests, since Prowlarr is more like a "proxy".

I disabled pagination recently since scrape.php didn't seem to support it anyways, thus you should see a lower number of requests due to this too.

We (like most other sites that originally started from the Gazelle codebase) perform search on a group basis, not a torrent basis. I'd expect your software to be aware of this by now and to consolidate requests for a single episode into a request for the base series name.

It's already doing that.

private string StripEpisodeNumber(string term)
{
    // Tracker does not support searching with episode number so strip it if we have one
    // Trailing " 05" or " 2x05"
    term = Regex.Replace(term, @"\W(\dx)?\d?\d$", string.Empty);
    // Trailing " S02E05"
    term = Regex.Replace(term, @"\W(S\d\d?E)?\d?\d$", string.Empty);
    // Any remaining trailing number, e.g. " 1078"
    term = Regex.Replace(term, @"\W\d+$", string.Empty);
    return term;
}
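To make the effect of those three replacements concrete, here is a small self-contained sketch that wraps the same regular expressions in a static class and runs them over a few search terms. The class name, the `PrintExamples` helper, and the search terms are made up for illustration; they are not captured Sonarr queries.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StripEpisodeNumberDemo
{
    // Same three replacements as the snippet above, made static so the
    // method can be called without an indexer instance.
    public static string StripEpisodeNumber(string term)
    {
        term = Regex.Replace(term, @"\W(\dx)?\d?\d$", string.Empty);
        term = Regex.Replace(term, @"\W(S\d\d?E)?\d?\d$", string.Empty);
        term = Regex.Replace(term, @"\W\d+$", string.Empty);
        return term;
    }

    public static void PrintExamples()
    {
        var terms = new[]
        {
            "Attack on Titan 05",
            "Attack on Titan 2x05",
            "Attack on Titan S02E05",
            "Detective Conan 1078"
        };

        foreach (var term in terms)
        {
            // Each line prints the original term and the stripped search term.
            Console.WriteLine($"{term} -> {StripEpisodeNumber(term)}");
        }
    }
}
```

All four terms reduce to the bare series name ("Attack on Titan" or "Detective Conan"), so different episodes of one series map to the same search term. As written, a title that itself ends in a number is trimmed too: "Mob Psycho 100" becomes "Mob Psycho".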

@bakerboy448
Contributor Author

@DevYukine can likely provide some background on the workarounds required by AB's non-standard series naming, which no other site, and thus neither Sonarr nor XEM, will ever support, as well as details on the RSS feed or lack thereof.

@bakerboy448
Contributor Author

For interactive search, sadly it depends on how many international titles a series has. If it has 10 titles but finds something on the second request, then it takes 2 requests. If it doesn't find anything for any of them, then it makes 10 requests. Also note that Radarr/Sonarr is making these requests, since Prowlarr is more like a "proxy".

Radarr will likely search only a few titles: "Search will use the Movie's Original Title, English Title, and Translated Title from whatever languages you have preferred in the movie's quality profile and any custom formats with scores in the quality profile greater than zero."

Sonarr will search for each and every series alias on xem & sonarr services & each and every applicable season alias - regardless of results.

Taking AOT for example - https://thexem.info/xem/show/5576

Seasons 2-4 will be 5 searches per episode
Season 1 will be 2 searches per episode

Sadly this is necessary and due to Anime sites often not supporting ID based searches combined with the lack of any standard anime release naming by uploaders.
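As a rough worked example of what those per-episode numbers mean under a rate limit, the sketch below multiplies aliases by episodes and computes the minimum wall-clock time needed to send that many requests. It assumes one request per alias per episode, as described above, and that the first request goes out immediately; `SearchCostSketch` and its methods are hypothetical names.

```csharp
using System;

public static class SearchCostSketch
{
    // One request per alias for every episode being searched.
    public static int Requests(int aliases, int episodes) => aliases * episodes;

    // Minimum time to send 'requests' requests when consecutive requests
    // must be at least 'interval' apart (the first one is sent immediately).
    public static TimeSpan MinimumDuration(int requests, TimeSpan interval)
    {
        if (requests <= 1)
        {
            return TimeSpan.Zero;
        }

        return TimeSpan.FromTicks(interval.Ticks * (requests - 1));
    }
}
```

For a hypothetical 12-episode season at 5 searches per episode this gives 60 requests, which take at least 590 seconds at 1 request per 10s, or 354 seconds at 1 request per 6s. At 2 searches per episode it is 24 requests and at least 230 seconds at the 10s limit.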

@proton-ab

I disabled pagination recently since scrape.php didn't seem to support it anyways, thus you should see a lower number of requests due to this too.

We have supported pagination for over 3 years now, by passing the page query parameter with the page number. In fact, we recently decreased the number of results (which are individual groups consisting of many torrents) from 50 to 10.

@proton-ab

Sadly this is necessary and due to Anime sites often not supporting ID based searches combined with the lack of any standard anime release naming by uploaders.

I understand the need to search aliases, and we can't do anything about them; however, I fail to see why you'd issue a separate request per episode. From the results you receive, searching for episode 4 of season 3 will produce exactly the same result set as searching for episode 3 of season 3, because both will return the Group entity which contains all episodes. You can easily cache recently performed requests per search query and consolidate external requests from Sonarr/Radarr to cut off episode number and instead return cached dataset from previous search.
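The caching idea suggested here could be sketched along these lines. This is purely illustrative and does not exist in Prowlarr: `SearchCache`, `GetOrFetch`, and the `fetch` delegate (a stand-in for the real tracker request) are hypothetical, the key is assumed to be the search term with the episode number already stripped, and the sketch is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

public sealed class SearchCache<T>
{
    private readonly TimeSpan _ttl;
    private readonly Dictionary<string, (DateTime At, T Value)> _entries =
        new Dictionary<string, (DateTime At, T Value)>(StringComparer.OrdinalIgnoreCase);

    public SearchCache(TimeSpan ttl)
    {
        _ttl = ttl;
    }

    // Returns the cached result for 'term' if it is younger than the TTL;
    // otherwise calls 'fetch' (one external request) and stores the result.
    public T GetOrFetch(string term, DateTime now, Func<string, T> fetch)
    {
        if (_entries.TryGetValue(term, out var hit) && now - hit.At < _ttl)
        {
            return hit.Value;
        }

        var value = fetch(term);
        _entries[term] = (now, value);
        return value;
    }
}
```

With such a cache, searches for episode 3 and episode 4 of the same series within the TTL would share one external request, since both reduce to the same stripped term.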

@bakerboy448
Contributor Author

You can easily cache recently performed requests per search query and consolidate external requests from Sonarr/Radarr to cut off episode number and instead return cached dataset from previous search.

No plans for Prowlarr to perform any caching.

Again, anime has no concept of season packs and Sonarr has no concept of Group. No other site really uses Group either; they have normal names for their releases. Of note, Yukine had to use the filename for single-episode releases for AnimeBytes to be compatible with Sonarr at all, because the Group names were never supported by Sonarr/XEM.

@mynameisbogdan
Contributor

I disabled pagination recently since scrape.php didn't seem to support it anyways, thus you should see a lower number of requests due to this too.

We have supported pagination for over 3 years now, by passing the page query parameter with the page number. In fact, we recently decreased the number of results (which are individual groups consisting of many torrents) from 50 to 10.

10 results per page is sadly very low. 50 is okay, 100 is perfect. This explains why you're seeing a lot of empty requests: users set a lower interval to make sure they catch new releases.

@mynameisbogdan mynameisbogdan changed the title Fixed: (AnimeBytes) RateLimit 1req per 6s Fixed: (AnimeBytes) RateLimit 1req per 10s Apr 5, 2023
@proton-ab

proton-ab commented Apr 5, 2023

Again, anime has no concept of season packs and Sonarr has no concept of Group. No other site really uses Group either; they have normal names for their releases.

On the contrary, almost all of our collection consists of "season packs". If you assume "Attack on Titan The Final Season" is season 4 (and the name maps directly via XEM), then all torrents under it are season packs which contain all episodes.

[Screenshot: firefox_2023-04-05_19-34-00]

Where this might get confusing is content that is ongoing and does not yet have any "season packs" (i.e. the season has not finished yet), in which case individual torrents are marked as containing their respective episode.

[Screenshot: chrome_2023-04-05_19-43-19]

Or in API representation:

  [...]
      "Torrents": [
        {
          "ID": 1030600,
          "EditionData": {
            "EditionTitle": "Episode 1078"
          },
          "RawDownMultiplier": 0,
          "RawUpMultiplier": 1,
          "Link": "https://animebytes.tv/torrent/1030600/download/{:passkey}",
          "Property": "Web | MKV | h264 | 720p | AAC 2.0 | Softsubs (SubsPlease) | Freeleech",
          "Snatched": 61,
          "Seeders": 57,
          "Leechers": 0,
          "Size": 765051013,
          "FileCount": 1,
          "FileList": [
            {
              "filename": "[SubsPlease] Detective Conan - 1078 (720p) [6F2EC19D].mkv",
              "size": "765051013"
            }
          ],
          "UploadTime": "2023-04-01 11:48:21"
        },
        {
          "ID": 1030599,
          "EditionData": {
            "EditionTitle": "Episode 1078"
          },
          "RawDownMultiplier": 0,
          "RawUpMultiplier": 1,
          "Link": "https://animebytes.tv/torrent/1030599/download/{:passkey}",
          "Property": "Web | MKV | h264 | 1080p | AAC 2.0 | Softsubs (SubsPlease) | Freeleech",
          "Snatched": 121,
          "Seeders": 109,
          "Leechers": 0,
          "Size": 1507376202,
          "FileCount": 1,
          "FileList": [
            {
              "filename": "[SubsPlease] Detective Conan - 1078 (1080p) [E48EDDA8].mkv",
              "size": "1507376202"
            }
          ],
          "UploadTime": "2023-04-01 11:34:23"
        },
  [...]

I have no idea how this is handled by either Prowlarr or Sonarr/Radarr to be honest.

@proton-ab

10 results per page is sadly very low. 50 is okay, 100 is perfect. This explains why you're seeing a lot of empty requests: users set a lower interval to make sure they catch new releases.

On the contrary, since the change we've noticed decreased load. Note that our daily torrent upload count is not high enough for Prowlarr to need to paginate, or to be at risk of missing a new upload, when searching every 15 minutes.

@proton-ab

nor does any other site really use Group

On the contrary, many sites use groups to categorize singular series-season relation. Your perspective might be skewed naturally because of what sites *arr software generally indexes which are mostly 0-day trackers which indeed do not have such concept, but take a look at RED and you will find that each album is in its own group and searching for FLAC or MP3 version of same album will return same group from search.

@mynameisbogdan
Contributor

mynameisbogdan commented Apr 5, 2023

On the contrary, many sites use groups to categorize singular series-season relation. Your perspective might be skewed naturally because of what sites *arr software generally indexes which are mostly 0-day trackers which indeed do not have such concept, but take a look at RED and you will find that each album is in its own group and searching for FLAC or MP3 version of same album will return same group from search.

This is not quite the same thing, for albums it's exactly the same songs just different format.

I don't have an AB account, so let's take how anime is usually released. For example One Piece I'm getting One Piece 629-746, One Piece - 747-782, One Piece - 892-928, and so on. Might be the same group aka series title, but it's not the same thing.

@proton-ab

For example One Piece I'm getting One Piece 629-746, One Piece - 747-782, One Piece - 892-928, and so on. Might be the same group aka series title, but it's not the same thing.

You've picked a rather unusual one. As a general rule AB follows the anidb season scheme, so episode 629 would be in the same season as episode 747, unlike tvdb, which splits seasons by ??? (indeed, by what is a good question, because official media does not split seasons in any shape or form).

@mynameisbogdan mynameisbogdan merged commit 3c60159 into Prowlarr:develop Apr 5, 2023
33 checks passed
@bakerboy448 bakerboy448 deleted the patch-7 branch May 17, 2023 23:23
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 17, 2024

Successfully merging this pull request may close these issues.

[AnimeBytes] Ratelimited during search.
3 participants