Missing results in Jackett (search : manual and auto) #10733

Closed
WoisWoi opened this issue Jan 4, 2021 · 20 comments

@WoisWoi

WoisWoi commented Jan 4, 2021

Environment

OS: Linux Ubuntu Server 20.04 (x64)

Jackett Version: v0.17.197

Last Working Jackett Version: v0.17.197 I guess

Are you using a proxy or VPN? no

Description

When I search with Radarr, Sonarr, or manually in Jackett itself, results are missing compared to what I get by searching directly on the tracker. In this case the tracker is YGGTorrent, and I'm using FlareSolverr with the hcaptcha-solver.

Log

"2021-01-04 23:37:23.6561 Info Manual search in yggtorrent for Supernatural => Found 100 releases "
There are no errors on the search. Is there a limit of 100 releases? I found 346 results on the tracker with the same query.

@ilike2burnthing
Contributor

URL when searching on ygg:
https://www2.yggtorrent.si/engine/search?name=supernatural&do=search

URL when searching through Jackett (using default indexer settings):
https://www2.yggtorrent.si/engine/search?category=all&name=supernatural&description=&file=&uploader=&sub_category=&do=search&order=desc&sort=publish_date

Both return 346 results. The difference is just that the latter is sorted by date (which you can change in the indexer's settings).

We only retrieve the first page of results (and we do so twice, due to #9114 - @xfouloux, was that unavoidable? We end up with most/all results duplicated), which is why some results are missing. We could increase the number of pages retrieved, but with the current setup that means requesting 4 pages instead of 2 and still getting only 100 unique results, which increases search times for users and server load for ygg (which I'm pretty sure is already an issue for them).

Does that explain what you're seeing, or was there a different issue?
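
To make the arithmetic above concrete, here is a minimal sketch. It assumes 50 results per page (inferred from the `&page=50` offset used for the second page) and that both search paths return the same rows, as they would for a keyword such as "Supernatural" that contains no season token. The names `fetch_page` and `collect` are hypothetical; this is not Jackett code.

```python
PAGE_SIZE = 50  # assumption: inferred from the &page=50 offset used for page 2


def fetch_page(total_results, offset, page_size=PAGE_SIZE):
    """Simulate one tracker page as a list of result ids (hypothetical helper)."""
    return list(range(offset, min(offset + page_size, total_results)))


def collect(total_results, offsets, variants=2):
    """Request every offset once per search variant; return (raw, unique) counts."""
    raw = []
    for _ in range(variants):
        for offset in offsets:
            raw.extend(fetch_page(total_results, offset))
    return len(raw), len(set(raw))


# Current setup: first page only, requested by both paths.
print(collect(346, [0]))       # (100, 50)
# Two pages per path: four requests in total.
print(collect(346, [0, 50]))   # (200, 100)
```

Under these assumptions, the "100 releases" in the log would correspond to 50 unique rows, and doubling the requests would only bring the unique count up to 100.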

@xfouloux
Contributor

xfouloux commented Jan 5, 2021

Well, it's been a long time since we stopped requesting multiple pages in the YGG indexer, I think, hence the 100-result limit.

If there is a way to detect that there is more than one page of results and then request page 2, and so on, that would help, but I don't remember whether anything like that is possible in YML.

My pull request included the fix for matching the French word 'Saison' (season) on search. Because of the YML indexer's limitations, I found no other way than running a second search with a regex replace of "Season" by "Saison" alongside the normal search with the word "Season", then a bunch of regexes so that the results display "Season" everywhere.

Unless someone finds a better way, or rewrites it as a C# indexer, there will always be issues :(

Or maybe the Jackett core could have a remove-duplicates function?
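
For illustration only, a remove-duplicates step of the kind suggested here might look like the sketch below. It is not Jackett's code: the `remove_duplicates` name, the dictionary rows, and the choice of a `link` field as the identity of a release are all assumptions.

```python
def remove_duplicates(releases, key="link"):
    """Keep the first release seen for each key value, preserving order."""
    seen = set()
    unique = []
    for release in releases:
        value = release[key]
        if value not in seen:
            seen.add(value)
            unique.append(release)
    return unique


# Hypothetical rows, as the two search paths might return them.
rows = [
    {"title": "Show S01", "link": "https://example.invalid/t/1"},
    {"title": "Show S02", "link": "https://example.invalid/t/2"},
    {"title": "Show S01", "link": "https://example.invalid/t/1"},
]
print(len(remove_duplicates(rows)))  # 2
```

Keeping the first occurrence preserves the tracker's sort order, which matters if results are sorted by publish date.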

@xfouloux
Contributor

xfouloux commented Jan 5, 2021

In any case, I remember that we removed the multiple pages because Sonarr, Radarr and the like should be precise enough not to send YGG a request that returns lots of results. It can still happen, I guess, especially for anime.

It really needs a big rework in C# someday.

@ilike2burnthing
Contributor

ilike2burnthing commented Jan 5, 2021

Yea, the request for the second page ...&page=50 was swapped out for the saison fix, so if we wanted the second page, we'd need to (well, should) include it for the fix as well, therefore making 4 requests to the site.

Though we could change it so at least the keywordless search gets the second page, no duplicates:

- path: "https://{{ .Config.searchanddlurl }}/{{ if .Config.betasearchengine }}new_search{{ else }}engine{{ end }}/search?category={{ .Config.category }}&name={{ re_replace .Keywords \"[sS]0(\\d{1,2})\" \"Saison.$1\"}}&description=&file=&uploader=&sub_category=&do=search&order={{ .Config.type }}&sort={{ .Config.sort }}"

...&name={{ if .Keywords }}{{ re_replace .Keywords \"[sS]0(\\d{1,2})\"  \"Saison.$1\"}}{{ else }}&page=50{{ end }}&description=...

I'll test it in a bit.

EDIT: yep, that worked. Will have another look at the saison fix, but I very much doubt I'll get anywhere.
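
The effect of the `re_replace` pattern and the `if`/`else` in the fragment above can be tried out with a small sketch. Python's `re` module stands in for the regex engine Jackett uses, and the helper names `saison_keywords` and `name_fragment` are hypothetical. URL encoding is left out to keep the example short.

```python
import re

# Same pattern as in the path above; Python writes the group reference as \1
# where the template's replacement string uses $1.
SEASON_TOKEN = re.compile(r"[sS]0(\d{1,2})")


def saison_keywords(keywords):
    """Rewrite S0x-style season tokens to the French 'Saison.x' form."""
    return SEASON_TOKEN.sub(r"Saison.\1", keywords)


def name_fragment(keywords):
    """Mimic the if/else: rewritten keywords, or an empty name plus &page=50."""
    if keywords:
        return "&name=" + saison_keywords(keywords)
    return "&name=" + "&page=50"


print(name_fragment("Supernatural S05"))  # &name=Supernatural Saison.5
print(name_fragment(""))                  # &name=&page=50
```

As written, the pattern requires a `0` right after the `S`, so in this sketch a token such as `s11` is left unchanged.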

@WoisWoi
Author

WoisWoi commented Jan 5, 2021

image
The problem is still there, I haven't noticed any differences.

@ilike2burnthing
Contributor

I've not changed anything for the repo just yet, and when I do it will only affect keywordless searches (what Sonarr/Radarr/*arr use to check for newly posted torrents), not keyword searches.

I'll post when the commit is done, and again when the new stable build is out. Feel free to try out the above if you want, just edit yggtorrent.yml from:

...&name={{ re_replace .Keywords \"[sS]0(\\d{1,2})\"  \"Saison.$1\"}}&description=...

to:

...&name={{ if .Keywords }}{{ re_replace .Keywords \"[sS]0(\\d{1,2})\"  \"Saison.$1\"}}{{ else }}&page=50{{ end }}&description=...

and restart Jackett.

@WoisWoi
Author

WoisWoi commented Jan 5, 2021

image
I did this as you posted ("2nd page for keywordless search"), but the problem is still there.

Results on sonarr for example :
image

some of the missing results from the tracker for this example
image

@ilike2burnthing
Contributor

ilike2burnthing commented Jan 6, 2021

it will only affect keywordless searches [...] not keyword searches

Open the search for ygg in Jackett, don't type anything in the query box, and click the search button. This should show the same results as the first two pages on ygg when you perform a keywordless/empty search (sorted by date).

Nothing has changed when using keyword searches (e.g. Supernatural).

If you don't need the saison fix, you can just use this instead of the current second path:

    - path: "https://{{ .Config.searchanddlurl }}/{{ if .Config.betasearchengine }}new_search{{ else }}engine{{ end }}/search?category={{ .Config.category }}&name={{ if .Config.betasearchengine }}{{ .Keywords }}{{ else }}{{ re_replace .Keywords \"\\b[^\\s]+\\b\"  \"\"$&\"\"}}{{ end }}&description=&file=&uploader=&sub_category=&do=search&order={{ .Config.type }}&sort={{ .Config.sort }}&page=50"

You'll either have to change this with each update to Jackett or create a copy of the .yml file in /config/cardigann/definitions/, with that line edited, change the name and id as well, then use this new indexer in Jackett (you can find more details here, just ignore the bits about Docker - #10646).

Doing so though will mean that you'll have to keep an eye out for any updates to the file and apply them yourself/repeat the above - https://github.com/Jackett/Jackett/commits/master/src/Jackett.Common/Definitions/yggtorrent.yml
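
A rough sketch of the "change the name and id" step for a copied definition follows. It assumes the definition has top-level `id:` and `name:` lines starting in column 0; `rename_definition` and the sample values are hypothetical, and copying the file into the definitions folder is left out.

```python
import re


def rename_definition(yaml_text, new_id, new_name):
    """Replace the first top-level id: and name: lines of a definition."""
    # Lambdas keep backslashes in the new values from being read as escapes.
    text = re.sub(r"(?m)^id:.*$", lambda m: f"id: {new_id}", yaml_text, count=1)
    text = re.sub(r"(?m)^name:.*$", lambda m: f"name: {new_name}", text, count=1)
    return text


# Made-up sample text, only to show the two lines being rewritten.
sample = "---\nid: yggtorrent\nname: YGGtorrent\nsearch:\n  paths: []\n"
print(rename_definition(sample, "yggtorrent-custom", "YGGtorrent (custom)"))
```

Giving the copy its own id and name keeps it from colliding with the bundled indexer when both are present.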

@xfouloux
Contributor

xfouloux commented Jan 6, 2021

Because several minds thinking together can do much better things =) !

Here is what I propose:

If there are no keywords, it returns 4 pages of results (200).
If there is a keyword, it runs 2 sets of 2 search queries:

  • Searching normally with Sxx
  • Searching with the "Saison" keyword

It works fine. Yes, there will sometimes be duplicates on a keyword search, and that's unavoidable, but it doesn't break anything (maybe Sonarr when manually sorting sometimes, and for Radarr it will run the same search 4 times :/ ).

  paths:
    #Normal Search
    - path: "https://{{ .Config.searchanddlurl }}/{{ if .Config.betasearchengine }}new_search{{ else }}engine{{ end }}/search?category={{ .Config.category }}&name={{ if .Config.betasearchengine }}{{ .Keywords }}{{ else }}{{ re_replace .Keywords \"\\b[^\\s]+\\b\"  \"\"$&\"\"}}{{ end }}&description=&file=&uploader=&sub_category=&do=search&order={{ .Config.type }}&sort={{ .Config.sort }}"
      followredirect: true
    - path: "https://{{ .Config.searchanddlurl }}/{{ if .Config.betasearchengine }}new_search{{ else }}engine{{ end }}/search?category={{ .Config.category }}&name={{ if .Config.betasearchengine }}{{ .Keywords }}{{ else }}{{ re_replace .Keywords \"\\b[^\\s]+\\b\"  \"\"$&\"\"}}{{ end }}&description=&file=&uploader=&sub_category=&do=search&order={{ .Config.type }}&sort={{ .Config.sort }}&page=50"
      followredirect: true
    #Search with Saison word instead of Sxx
    - path: "https://{{ .Config.searchanddlurl }}/{{ if .Config.betasearchengine }}new_search{{ else }}engine{{ end }}/search?category={{ .Config.category }}&name={{ if .Keywords }}{{ re_replace .Keywords \"[sS]0(\\d{1,2})\"  \"Saison.$1\"}}{{ else }}&page=100{{ end }}&description=&file=&uploader=&sub_category=&do=search&order={{ .Config.type }}&sort={{ .Config.sort }}"
      followredirect: true
    - path: "https://{{ .Config.searchanddlurl }}/{{ if .Config.betasearchengine }}new_search{{ else }}engine{{ end }}/search?category={{ .Config.category }}&name={{ if .Keywords }}{{ re_replace .Keywords \"[sS]0(\\d{1,2})\"  \"Saison.$1\"}}&page=50{{ else }}&page=150{{ end }}&description=&file=&uploader=&sub_category=&do=search&order={{ .Config.type }}&sort={{ .Config.sort }}"
      followredirect: true
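
To see which requests the four paths above would produce, here is a simplified sketch. It ignores the beta-engine switch and the word quoting done by the first two paths, and `planned_requests` is a hypothetical helper that returns (name, page offset) pairs rather than full URLs.

```python
import re


def saison(keywords):
    """Apply the Sxx-to-Saison rewrite used by the last two paths."""
    return re.sub(r"[sS]0(\d{1,2})", r"Saison.\1", keywords)


def planned_requests(keywords):
    """Return the (name, page offset) pairs for the four proposed paths."""
    if keywords:
        return [
            (keywords, 0),           # normal search, first page
            (keywords, 50),          # normal search, second page
            (saison(keywords), 0),   # Saison search, first page
            (saison(keywords), 50),  # Saison search, second page
        ]
    # Keywordless: the Saison paths fall back to pages three and four.
    return [("", 0), ("", 50), ("", 100), ("", 150)]


for request in planned_requests("Supernatural S05"):
    print(request)
```

When the keywords contain no season token, the last two pairs equal the first two, which is where the duplicated results come from.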

Think it's a good alternative @ilike2burnthing ?

Keyword search

image

No Keywords

image

Proof Search

You can see it works fine: searching for "Battlestar Galactica S1", for example, one of the torrent names is "Battlestar Galactica Saison 1 [720P] [FR] [ENG]", and Jackett displays it correctly as "Battlestar Galactica S01 [720P] [FR] [ENG]" among the other results.

image
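
The title rewrite described in the proof search can be illustrated with the sketch below. The pattern is an assumption made for this example, not the regex from the indexer definition, and `normalise_title` is a hypothetical name.

```python
import re

# Assumed pattern: "Saison", an optional space or dot, then one or two digits.
SAISON_TITLE = re.compile(r"\bsaison[ .]?(\d{1,2})\b", re.IGNORECASE)


def normalise_title(title):
    """Turn 'Saison 1' style markers into zero-padded 'S01' markers."""
    return SAISON_TITLE.sub(lambda m: f"S{int(m.group(1)):02d}", title)


print(normalise_title("Battlestar Galactica Saison 1 [720P] [FR] [ENG]"))
# Battlestar Galactica S01 [720P] [FR] [ENG]
```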

@WoisWoi
Author

WoisWoi commented Jan 6, 2021

Better, for example :
image
image
But why do I get 200 results when you get 14 with Jackett (on the same manual keyword search)?

And there is still a "problem":

image
image

In any case, thanks a lot for your (lovely) help ! :)

@xfouloux
Contributor

xfouloux commented Jan 6, 2021

Hmm, you have the Beta search engine option checked; I get 113 results too when using it.
Unfortunately, the beta search option is good in some cases and bad in others.

The Beta search matches anything on the YGG website that has 'Supernatural' in it. This is not a Jackett issue: ygg takes the search term and removes everything that looks like an episode number and such, which makes for a wider search.

Test with the option unchecked and see if that suits you better.

If you want to try it on the website, replace /engine/ with /new_search/ in the URL:

https://www2.yggtorrent.si/engine/search?name=Supernatural+s11&do=search
BETA SEARCH
https://www2.yggtorrent.si/new_search/search?name=Supernatural+s11&do=search

image

Also, in your second screenshot you are searching for s5 in Jackett and s05 on ygg, which can return different results.

Regards
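
For comparing the two engines side by side, a small sketch that builds both URLs from the same keywords may help. The host and the `name`/`do` parameters come from the URLs above; `search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE = "https://www2.yggtorrent.si"  # host taken from the URLs in this thread


def search_url(keywords, beta=False):
    """Build a search URL for the regular or the beta engine."""
    engine = "new_search" if beta else "engine"
    query = urlencode({"name": keywords, "do": "search"})
    return f"{BASE}/{engine}/search?{query}"


print(search_url("Supernatural s11"))
print(search_url("Supernatural s11", beta=True))
```

Running the same keywords through both URLs makes it easier to tell whether a difference in result counts comes from the engine or from the query itself (for example s5 versus s05).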

@WoisWoi
Author

WoisWoi commented Jan 6, 2021

Thanks a lot ! 🥇
Or should I say "merci beaucoup", since you are a French speaker 👍

@ilike2burnthing
Contributor

I had considered this, but had a few reservations:

  1. a similar method was rejected in the initial fix - 748bff1
  2. for a site that already seems to be under strain, doubling our contribution to their server load is probably not a great idea - presumably the reason for 1)
  3. keywordless searches will return 200 results, but Sonarr/Radarr/*arr will only take the first 100, so this won't improve anything in that case
  4. keyword searches with >100 results will face the same issue as 3)
  5. keyword searches with >50 but <100 results would be the only ones to benefit from this change without waste

@xfouloux
Contributor

xfouloux commented Jan 7, 2021

Yeah, unfortunately there is no perfect solution without writing a C# indexer :/
Looking forward to having some docs about it so I can tackle it.

@xfouloux
Contributor

xfouloux commented Jan 7, 2021

Is there any way to hack this, and make the if/else do something on a keyword search and fail (bad URL) when keywordless?

@ilike2burnthing
Contributor

Changing &page=100 and &page=150 to / results in HTTP error 500 in browser, though I haven't tested it in Jackett.

If it doesn't result in errors in Jackett's logs (which I imagine it will), then it could be a possible hacky way to only pull 2 keywordless results pages (i.e. the vast majority of any traffic we would be causing), but still get 4 for keyword searches.

@WoisWoi
Author

WoisWoi commented Jan 7, 2021

My problem now is that Radarr and Sonarr quite often report that the indexer (ygg) failed due to errors, but when I test it, it goes back to normal. Would you have a solution, perhaps? Or a way to automatically re-test the indexer when there is a failure?

@ilike2burnthing
Contributor

I'd suggest asking over on https://forums.sonarr.tv/ but it will retest in the sense that it will keep trying to use the indexer every 15/30/60mins or whatever you have it set to. If errors continue for extended periods of time, it will increase the delay - https://wiki.servarr.com/Sonarr_System#indexers_are_unavailable_due_to_failures

@xfouloux
Contributor

I guess it's an issue with the captcha not always being solved via FlareSolverr, after which the indexer goes into a failed state :/

@xfouloux
Contributor

@WoisWoi, and for whoever wants to do this, here is how you can run a testall on the indexers via a bash script (I run mine every hour to keep the indexers OK, especially YGG):

#!/bin/bash

# API keys for each application (replace the placeholders)
SonarrMultiApiKey=xxxxxxxxxxxxx
SonarrApiKey=xxxxxxxxxxxxx
RadarrApiKey=xxxxxxxxxxxxxxxx
LidarrApiKey=xxxxxxxxxxxxxxx

# Docker container names. Without Docker, skip the lookups below and set the
# *IP variables directly to your host's IP (or 127.0.0.1), depending on your config.
SonarrMultiDockerName=sonarrmulti
SonarrDockerName=sonarr
RadarrDockerName=radarr-v3
LidarrDockerName=lidarr

# Resolve each container's IP address ("cloudbox" is the Docker network name in this setup)
SonarrMultiIP=$(docker inspect "$SonarrMultiDockerName" | jq -r '.[].NetworkSettings.Networks.cloudbox.IPAddress')
SonarrIP=$(docker inspect "$SonarrDockerName" | jq -r '.[].NetworkSettings.Networks.cloudbox.IPAddress')
RadarrIP=$(docker inspect "$RadarrDockerName" | jq -r '.[].NetworkSettings.Networks.cloudbox.IPAddress')
LidarrIP=$(docker inspect "$LidarrDockerName" | jq -r '.[].NetworkSettings.Networks.cloudbox.IPAddress')

# Ask each application to test all of its indexers (URLs quoted so the shell leaves "?" alone)
curl -H "Content-Type: application/json" -X POST -d '{length:30}' "http://$SonarrMultiIP:8989/api/v3/indexer/testall?apikey=$SonarrMultiApiKey"
curl -H "Content-Type: application/json" -X POST -d '{length:30}' "http://$SonarrIP:8989/api/v3/indexer/testall?apikey=$SonarrApiKey"
curl -H "Content-Type: application/json" -X POST -d '{length:30}' "http://$RadarrIP:7878/api/v3/indexer/testall?apikey=$RadarrApiKey"
curl -H "Content-Type: application/json" -X POST -d '{length:30}' "http://$LidarrIP:8686/api/v1/indexer/testall?apikey=$LidarrApiKey"
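
The four curl calls differ only in host, port, API version and key, so the same idea can be sketched as a loop. This is a hedged Python rendering of the script above, not a tested replacement: the endpoint path, ports and request body are copied from the curl lines, while `build_testall_request`, the `APPS` list and the placeholder keys are assumptions. The request is only built and printed here; sending it is left as a comment.

```python
import urllib.request


def build_testall_request(host, port, api_version, api_key):
    """Build the POST request that the matching curl line would send."""
    url = f"http://{host}:{port}/api/{api_version}/indexer/testall?apikey={api_key}"
    return urllib.request.Request(
        url,
        data=b"{length:30}",  # same body the curl calls send
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder hosts and keys; replace with your own values.
APPS = [
    ("127.0.0.1", 8989, "v3", "SONARR_KEY"),
    ("127.0.0.1", 7878, "v3", "RADARR_KEY"),
    ("127.0.0.1", 8686, "v1", "LIDARR_KEY"),
]

if __name__ == "__main__":
    for app in APPS:
        request = build_testall_request(*app)
        print(request.get_method(), request.full_url)
        # To actually send it: urllib.request.urlopen(request, timeout=30)
```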
