core: categories cleanup #10006

Merged
merged 6 commits into from Oct 28, 2020

Conversation

ngosang
Member

@ngosang ngosang commented Oct 27, 2020

  • swap categories 2050 Movies/BluRay / 2060 Movies/3D
  • swap categories 6050 XXX/Pack / 6070 XXX/Other
  • swap categories 7010 Books/Mags / 7020 Books/EBook / 7030 Books/Comics
  • category validation is case sensitive
  • renamed some categories to follow Newznab specs
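For illustration, a minimal Python sketch of what case-sensitive category validation could look like. This is not Jackett's actual code: the table only contains the id/name pairs listed in the description above, and `category_id` is a hypothetical helper.

```python
# Id/name pairs exactly as listed in the PR description (not the full list).
NEWZNAB_CATEGORIES = {
    2050: "Movies/BluRay",
    2060: "Movies/3D",
    6050: "XXX/Pack",
    6070: "XXX/Other",
    7010: "Books/Mags",
    7020: "Books/EBook",
    7030: "Books/Comics",
}

# Reverse index for name lookups; keys keep their original casing.
_ID_BY_NAME = {name: cat_id for cat_id, name in NEWZNAB_CATEGORIES.items()}


def category_id(name: str) -> int:
    """Return the id for an exact (case-sensitive) category name."""
    try:
        return _ID_BY_NAME[name]
    except KeyError:
        raise ValueError(f"unknown category: {name!r}") from None
```

With this lookup, `category_id("Movies/BluRay")` resolves, while `category_id("movies/bluray")` raises `ValueError` instead of silently matching.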

@ngosang
Member Author

ngosang commented Oct 27, 2020

@garfield69 take a look at this PR.
The category renames are fine, but I have found more categories that do not follow these specs => https://newznab.readthedocs.io/en/latest/misc/api/#predefined-categories

@ngosang
Member Author

ngosang commented Oct 27, 2020

@Taloth Sorry to bother you but we need an expert in Newznab / Torznab.
Recently we have found that Jackett categories are different from this specification => https://newznab.readthedocs.io/en/latest/misc/api/#predefined-categories

  • Should we follow those specs?
  • Do the Torznab specs say anything different about categories?
  • We have some "unofficial" categories like 1180, "Console/PS4" because Newznab is a bit outdated. Is there a new version of Newznab / Torznab specification with newer categories?

@garfield69 garfield69 linked an issue Oct 28, 2020 that may be closed by this pull request
@Taloth

Taloth commented Oct 28, 2020

The best approach there is to look at newznab and nzedb for those special categories and just pick something that's best. (https://github.com/nZEDb/nZEDb/blob/b485fa326a0ff1f47ce144164eb1f070e406b555/resources/db/schema/data/10-categories.tsv)
For example nzedb has PS3 releases as 1080, and 1180 for PS4.
You can also look at existing newznab indexers and see if they already have the category and use that.
I checked and one private one uses 1085 for PS4, and 1090 for XBOX ONE.
Another one, oznzb seems to be newznab-based and has 1100 as PS4.
I suspect nzedb started using 11xx for 'new' categories to avoid conflicting with new newznab ones in the near future, but that's gonna overlap pretty soon.
Regardless, this is one of the reasons I added the dropdown in Sonarr, so it doesn't really matter.

For the categories specified in the newznab documentation you definitely should try to follow it, because it means defaults like 5070 will work properly.
It's also the reason why when Jackett first started out I recommended to put the indexer specific categories in the 100000 range, which is being done for the bulk of jackett indexers. Each indexer specific category then gets aliased to one or more main categories like 5070. (Categories are like tags, a release can have multiple, in fact they always do, because releases in 5070 are also in 5000) It's rarely a 1 to 1 mapping.
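A rough Python sketch of the aliasing idea described above, under stated assumptions: `CategoryMap`, `parent_of`, and the tracker category used in the usage note are hypothetical names, not Jackett's API. It only shows how one indexer-specific id in the 100000 range can carry several standard categories as tags, and how a subcategory such as 5070 implies its main category 5000.

```python
from dataclasses import dataclass, field

CUSTOM_BASE = 100000  # indexer-specific ids start in this range


def parent_of(cat_id: int) -> int:
    """Main category of a standard id, e.g. 5070 -> 5000."""
    return cat_id - cat_id % 1000


@dataclass
class CategoryMap:
    # tracker category string -> (custom id, aliased standard ids)
    entries: dict = field(default_factory=dict)

    def add(self, tracker_cat: str, tracker_id: int, *aliases: int) -> None:
        """Register a tracker category and the standard ids it aliases to."""
        self.entries[tracker_cat] = (CUSTOM_BASE + tracker_id, aliases)

    def tags_for(self, tracker_cat: str) -> set:
        """All category ids a release in tracker_cat should be tagged with."""
        custom_id, aliases = self.entries[tracker_cat]
        tags = {custom_id}
        for alias in aliases:
            tags.add(alias)
            tags.add(parent_of(alias))  # a release in 5070 is also in 5000
        return tags
```

For example, after `m = CategoryMap()` and `m.add("Anime", 12, 5070)` (a made-up tracker category), `m.tags_for("Anime")` contains 100012, 5070 and 5000.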

The torznab spec refers to newznab and only adds a few new attributes for seeds, not categories. So there's no 'one truth'. It's perhaps possible to contact bb and see if the official list can be extended. But there's already some divergence so it probably doesn't matter much.
Note that nzedb is essentially a fork/competitor of newznab so it might not be easy to unify something.

Edit to add: I just saw you swapped 2050 and 2060... That's clearly newznab vs nzedb. nzedb has them inverted.

@ngosang
Member Author

ngosang commented Oct 28, 2020

> The best approach there is to look at newznab and nzedb for those special categories and just pick something that's best.
> Edit to add: I just saw you swapped 2050 and 2060... That's clearly newznab vs nzedb. nzedb has them inverted.

Since there is no clear standard, I think we should follow Newznab as closely as possible and add new categories when required. This PR moves in that direction. @garfield69 We should update the Wiki with reference links to make clear which specs we are following.

> You can also look at existing newznab indexers and see if they already have the category and use that.

I don't have access, but I think it's not really important since they are not standard. It would be interesting to see if they have more new categories. @Taloth could you share the caps of some newznab indexers so we can add newer cats?

> It's also the reason why when Jackett first started out I recommended to put the indexer specific categories in the 100000 range, which is being done for the bulk of jackett indexers. Each indexer specific category then gets aliased to one or more main categories like 5070. (Categories are like tags, a release can have multiple, in fact they always do, because releases in 5070 are also in 5000) It's rarely a 1 to 1 mapping.

We are already adding custom categories in most trackers, but there are some bugs in the code, like #9746. I'm working on it.

Offtopic: the Torznab caps in Jackett are wrong too. In one indexer with only 2 categories (Movie/HD/Bluray and Movie/Xvid) we are generating this tree.

<categories>
  <category id="100048" name="Movie/HD/Bluray"/>
  <category id="100007" name="Movie/Xvid"/>
  <category id="2000" name="Movies">
    <subcat id="2010" name="Movies/Foreign"/>
    <subcat id="2020" name="Movies/Other"/>
    <subcat id="2030" name="Movies/SD"/>
    <subcat id="2040" name="Movies/HD"/>
    <subcat id="2045" name="Movies/UHD"/>
    <subcat id="2050" name="Movies/3D"/>
    <subcat id="2060" name="Movies/BluRay"/>
    <subcat id="2070" name="Movies/DVD"/>
    <subcat id="2080" name="Movies/WEBDL"/>
  </category>
  <category id="2030" name="Movies/SD"/>
  <category id="2040" name="Movies/HD"/>
</categories>

It's wrong because we are adding all subcategories of Movies (most of them are not supported) and some categories like Movies/SD are duplicated. With the fix I'm going to push soon the new tree will be:

<categories>
  <category id="2000" name="Movies">
    <subcat id="2030" name="Movies/SD"/>
    <subcat id="2040" name="Movies/HD"/>
  </category>
  <category id="100048" name="Movie/HD/Bluray"/>
  <category id="100007" name="Movie/Xvid"/>
</categories>

This should solve some issues in the Sonarr category tree.
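A minimal Python sketch of how a caps tree like the corrected one could be generated, assuming the indexer's supported standard categories and its custom categories are already known. `build_caps_categories` is a hypothetical helper, not the fix itself: it emits each parent once, nests only the supported subcategories under it, and appends the custom categories afterwards.

```python
import xml.etree.ElementTree as ET


def build_caps_categories(standard: dict, custom: dict) -> ET.Element:
    """Build a <categories> element.

    standard: {id: name} for the standard categories the indexer supports
    custom:   {id: name} for indexer-specific categories
    """
    root = ET.Element("categories")
    parents = {}
    for cat_id in sorted(standard):
        parent_id = cat_id - cat_id % 1000
        if parent_id not in parents:
            # Fall back to the part before the slash, e.g. "Movies".
            parent_name = standard.get(parent_id, standard[cat_id].split("/")[0])
            parents[parent_id] = ET.SubElement(
                root, "category", id=str(parent_id), name=parent_name
            )
        if cat_id != parent_id:
            # Only supported subcategories are nested, each exactly once.
            ET.SubElement(
                parents[parent_id], "subcat", id=str(cat_id), name=standard[cat_id]
            )
    for cat_id, name in custom.items():
        ET.SubElement(root, "category", id=str(cat_id), name=name)
    return root
```

Calling it with `{2030: "Movies/SD", 2040: "Movies/HD"}` and `{100048: "Movie/HD/Bluray", 100007: "Movie/Xvid"}` yields one Movies parent with two subcats, followed by the two custom categories; `ET.tostring(root, encoding="unicode")` serializes it.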

@garfield69 If you agree you can merge this PR.

@Taloth

Taloth commented Oct 28, 2020

> I don't have access, but I think it's not really important since they are not standard. It would be interesting to see if they have more new categories. @Taloth could you share the caps of some newznab indexers so we can add newer cats?

I could, but most indexers do not require an apikey:

https://nzbgeek.info/api?t=caps (newznab 1.0.0, not sure if that's an actual newznab version)
https://api.oznzb.com/api?t=caps (newznab 0.2.3p)
https://drunkenslug.com/api?t=caps (nzedb 0.8.21)
https://nzb.cat/api?t=caps (nzedb 0.8.19)
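A small Python sketch of how the categories in a caps response could be listed for comparison. The sample document is hand-written from the tree posted earlier in this thread, not fetched from any of the indexers above, and `parse_categories` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hand-written sample (assumption), shaped like a caps <categories> section.
SAMPLE_CAPS = """
<caps>
  <categories>
    <category id="2000" name="Movies">
      <subcat id="2030" name="Movies/SD"/>
      <subcat id="2040" name="Movies/HD"/>
    </category>
    <category id="100048" name="Movie/HD/Bluray"/>
  </categories>
</caps>
""".strip()


def parse_categories(caps_xml: str) -> dict:
    """Return {id: name} for every category and subcat in a caps document."""
    root = ET.fromstring(caps_xml)
    found = {}
    for cat in root.iter("category"):
        found[int(cat.get("id"))] = cat.get("name")
        for sub in cat.findall("subcat"):
            found[int(sub.get("id"))] = sub.get("name")
    return found
```

Running `parse_categories` over the caps of several indexers and diffing the resulting dictionaries would show which ids each one assigns to newer categories.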

> We are already adding custom categories in most trackers, but there are some bugs in the code, like #9746. I'm working on it.

Right, so you have to map the string category to an internal number, and alias that to an official newznab category.
I mentioned this because those 100000 categories show up in the dropdown, and they could be useful for indexers that have no categories at all or have odd categories that don't map 1-to-1. Like the 8000 problem. You could add 100001 "All" and then alias those releases to both Movies and TV.
It doesn't really matter, since the user has to change the category anyway because Sonarr uses 5030 and 5040 as default.

Here's what we did for hdaccess (no longer exists, but was one of the two trackers that natively supported torznab).

> Offtopic: the Torznab caps in Jackett are wrong too. In one indexer with only 2 categories (Movie/HD/Bluray and Movie/Xvid) we are generating this tree.
> ...
>
> It's wrong because we are adding all subcategories of Movies (most of them are not supported) and some categories like Movies/SD are duplicated. With the fix I'm going to push soon the new tree will be:
>
> <categories>
>   <category id="2000" name="Movies">
>     <subcat id="2030" name="Movies/SD"/>
>     <subcat id="2040" name="Movies/HD"/>
>   </category>
>   <category id="100048" name="Movie/HD/Bluray"/>
>   <category id="100007" name="Movie/Xvid"/>
> </categories>
>
> This should solve some issues in the Sonarr category tree.

It's not necessarily wrong to have subcategories that don't contain any releases, in fact it could be useful in some cases. It's like "Yes I recognize this category". In fact, by default Sonarr only selects 5030 and 5040.
But yes, the dupe would be an issue.
Are those Movie/HD/Bluray releases actually transcoded bluray (blurip) or raw untouched bluray? afaik that's what determines if it's 2040 or 2060.

add 6045(XXXSD), 6080(XXXWEBDL), 6090(XXXUHD)
@garfield69
Contributor

@garfield69 garfield69 merged commit 767700d into Jackett:master Oct 28, 2020
@garfield69
Contributor

the dashboard continued to use the old names for categories until I used Shift-F5 to flush the Chrome browser cache
do we need to add a dummy ?changed=2020103001 to a script src reference in the index.html to force a refresh for the next update?

@ngosang ngosang deleted the feature/cats2 branch October 29, 2020 08:17
@ngosang
Member Author

ngosang commented Oct 29, 2020

> do we need to add a dummy ?changed=2020103001 to a script src reference in the index.html to force a refresh for the next update?

That's not going to work and it's not necessary. Some users may need to close and reopen the browser to see the new cats, but it's not a big deal. 😃

ngosang added a commit to ngosang/Jackett that referenced this pull request Oct 31, 2020
* Core: Categories are stored in a real tree
* Sorting: First Torznab categories sorted by Id and then custom cats sorted by Name
* Filtering: Results with child category are not removed when searching by parent category. Details in Jackett#8049
* Jackett UI: Add parent category when at least one child category exists
* Torznab (caps): Remove non existent children categories. Remove duplicated categories. Details in Jackett#10006
ngosang added a commit that referenced this pull request Nov 1, 2020
* Core: Categories are stored in a real tree
* Sorting: First Torznab categories sorted by Id and then custom cats sorted by Name
* Filtering: Results with child category are not removed when searching by parent category. Details in #8049
* Jackett UI: Add parent category when at least one child category exists
* Torznab (caps): Remove non existent children categories. Remove duplicated categories. Details in #10006
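
The sorting and filtering rules in the commit message above can be sketched in Python as follows. This is an illustration under assumptions, not the committed code: `matches` and `sort_categories` are hypothetical names, and ids of 100000 and above are treated as custom categories, following the 100000 range mentioned in the discussion.

```python
CUSTOM_BASE = 100000  # assumed boundary between standard and custom ids


def matches(query_cat: int, release_cats: set) -> bool:
    """True if a release should survive a search for query_cat."""
    if query_cat in release_cats:
        return True
    if query_cat % 1000 == 0 and query_cat < CUSTOM_BASE:
        # Parent query: keep releases tagged with any child of that parent.
        return any(
            c - c % 1000 == query_cat for c in release_cats if c < CUSTOM_BASE
        )
    return False


def sort_categories(cats: dict) -> list:
    """Standard categories by id first, then custom ones by name."""
    standard = sorted(i for i in cats if i < CUSTOM_BASE)
    custom = sorted((i for i in cats if i >= CUSTOM_BASE), key=lambda i: cats[i])
    return standard + custom
```

Under these rules a release tagged only with 5070 is kept when searching for 5000, but a release tagged only with 5000 is not returned for a 5070 search.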
Development

Successfully merging this pull request may close these issues.

Core: correct Other as 8000 and Books as 7000