Some personal requests/considerations about this beautiful project #22

Open
Jorman opened this issue Nov 2, 2019 · 14 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@Jorman

Jorman commented Nov 2, 2019

Hi, I just discovered this project and I think it will be very good!
First, I want to thank the owner!

I have some questions about it, to better understand the whole idea:

  1. Is there any possibility of having (in the future) regex capability like Radarr? It is quite convenient to create custom filters to include/exclude releases.
  2. Is there any plan to support more torrent clients, like qBittorrent or others?
  3. When you say "Supports multiple languages", is that only for movies or also for TV shows? The downside of Sonarr, for example, is that it doesn't grab translated titles from TVDB like Radarr does with TMDB. For that reason I proposed a simple idea (just a starting point) on the Jackett side; you can find it here: Feature request - list of replacement by language for foreign languages Jackett/Jackett#2743. I then asked the Sonarr guys, but on the Sonarr side it seems that foreign languages are not given much consideration.

Thanks and have a nice day.

Jo

@lardbit
Owner

lardbit commented Nov 4, 2019

  1. I think a regex could be a nice feature. There's already a similar feature in the settings called Keyword Exclusions which lets you filter out specific words.
  2. I've thought about swapping out Transmission for another torrent client with more capabilities (like renaming individual files), but I doubt I'd support multiple torrent clients. I'd prefer to keep it simple. Maybe rtorrent? Any suggestions?
  3. nefarious supports multiple languages for Movies & TV Shows, but loosely. For instance, it really supports whatever languages http://tmdb.org/ supports, which will internationalize Titles, Descriptions and Poster artwork on the output. But it will also query Jackett using the chosen language. For instance, if your language is set to English, then it will query the movie title Howl's Moving Castle rather than the original Japanese title ハウルの動く城. But I see what you're saying about TV Shows: the search uses the English word "Season" and the "S01E01" schema. I haven't thought about this yet but will try to.

@Jorman
Author

Jorman commented Nov 4, 2019

Yep, but you know, regex always wins. For example, I have a regex for movies because I can't simply reject eng: sometimes there's more than one language inside the movie. In my case I use this regex to prevent that: /(?:^|])(?:(?!\b(?:eng|ita)\b)[^]\n])*\b(sub|eng(?:\W+\w+)?\W+(sub|softsub|hardsub|multisub)\W+ita)\b/i but it's only a small example.
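To make the difference between plain keyword exclusions and regex exclusions concrete, here is a minimal sketch. It is not nefarious's actual code: the rule lists, the `is_excluded` helper and the whole-word keyword matching are all assumptions for illustration, and the regex is a deliberately simpler pattern than the one quoted above.

```python
import re

# Hypothetical exclusion rules; neither the names nor the patterns come from nefarious.
KEYWORD_EXCLUSIONS = ["cam", "hdts"]
REGEX_EXCLUSIONS = [re.compile(r"\bsub\W*ita\b", re.IGNORECASE)]


def is_excluded(release_name: str) -> bool:
    """Return True if the release name matches any keyword or regex exclusion."""
    # Plain keywords: compare against whole alphanumeric words, case-insensitively.
    words = set(re.findall(r"[a-z0-9]+", release_name.lower()))
    if any(keyword in words for keyword in KEYWORD_EXCLUSIONS):
        return True
    # Regex rules can express context that a single keyword cannot,
    # e.g. "sub" only when it is directly followed by "ita".
    return any(pattern.search(release_name) for pattern in REGEX_EXCLUSIONS)


for name in ["Movie.2019.CAM.x264", "Movie.2019.1080p.ENG.Sub.ITA", "Movie.2019.1080p.ITA.ENG"]:
    print(name, is_excluded(name))
```

The keyword list can only say "reject anything containing this word", while the regex list can reject a word only in a given context, which is the case described above.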

What about the torrent client? I recently switched from Transmission to qBittorrent because of some bugs/problems in Transmission, like very heavy performance degradation with large torrents (I opened a GitHub issue a year ago), and qBittorrent is more flexible with qbittorrent-cli https://github.com/fedarovich/qbittorrent-cli. But I agree with you: if the purpose is to keep it simple, maybe Transmission is the best one, with rtorrent for those who want a very good seedbox or a 24/7 home server.

Like you said, for movies it's quite simple: just take what TMDB has. Maybe a fallback language is needed, since sometimes the movie entry is not yet updated, so a refresh is needed when this happens.
For TV shows it's more complex. I think there are 2 possibilities:

  1. nefarious offers a kind of conversion table, where the user specifies one or more substitutions for the configured languages. That way, during a search, S01 can be translated into season or stagione or temporada, or full season or stagione completa, or whatever.
  2. Jackett has to make this conversion. Like I said, I already opened a request about this, but I don't know if or when Jackett will make this change.
    Any other ideas?

What about a switch to the .NET framework? Is it too late, like for Sonarr and Radarr, or is it possible? Mono always has some residual issues ...

@lardbit
Owner

lardbit commented Nov 6, 2019

I think your TV conversion table makes sense. I borrowed Sonarr's TV regex, which searches for the English "Season" and "Episode" strings.

See https://github.com/lardbit/nefarious/blob/master/src/nefarious/parsers/tv.py. For example, when looking for the format: Season 01 Episode 03

regex.compile(r'(?:.*(?:"|^))(?<title>.*?)(?:[-_\W](?<![()\[]))+(?:\W?Season\W?)(?<season>(?<!\d+)\d{1,2}(?!\d+))(?:\W|_)+(?:Episode\W)(?:[-_. ]?(?<episode>(?<!\d+)\d{1,2}(?!\d+)))+', regex.I)

we'd have to swap out "Season" and "Episode" with a user-specified language translation. I would propose that nefarious comes with a pre-built translation table for user-contributed languages.

Then, if the user's language was defined in nefarious as Italian, and the translation table already existed, the regex search would automatically swap the words in. Sound reasonable?
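The swap described above could look roughly like the following sketch. It uses the standard-library `re` module and a much simpler pattern than the one in `parsers/tv.py` (which relies on the third-party `regex` module), and the `TRANSLATIONS` table and `build_tv_pattern` function are hypothetical names, not part of nefarious.

```python
import re

# Hypothetical translation table; only the Italian words appear in this thread.
TRANSLATIONS = {
    "en": {"season": "Season", "episode": "Episode"},
    "it": {"season": "Stagione", "episode": "Episodio"},
}


def build_tv_pattern(language: str) -> re.Pattern:
    """Build a simplified 'Season NN Episode NN' pattern for the given language."""
    words = TRANSLATIONS.get(language, TRANSLATIONS["en"])  # fall back to English
    season = re.escape(words["season"])
    episode = re.escape(words["episode"])
    return re.compile(
        rf"^(?P<title>.+?)[-_.\s]+{season}[-_.\s]*(?P<season>\d{{1,2}})"
        rf"[-_.\s]+{episode}[-_.\s]*(?P<episode>\d{{1,2}})(?!\d)",
        re.IGNORECASE,
    )


match = build_tv_pattern("it").search("Rick and Morty Stagione 03 Episodio 01 720p")
if match:
    print(match.group("title"), int(match.group("season")), int(match.group("episode")))
```

Falling back to English for an unknown language keeps today's behaviour for anyone whose language has no table entry yet.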

@lardbit
Owner

lardbit commented Nov 6, 2019

As a workaround, though, you can use the Manual Search feature which lets you search for whatever string you want. It sends the query directly to jackett.


@Jorman
Author

Jorman commented Nov 6, 2019

Sounds very reasonable, and the manual search is good too!
In my opinion, having a pre-built translation table is very good, though it may get quite long. For example, I have no idea what the terms are in other languages: I know that in Italian season can be stagione, but in French, or others?
So a pre-built translation is OK, but with the option to edit or specify all the possible substitutions, so a user can do everything. In the end it's a table (one for each language specified other than the default English) with some considerations:
One section for the complete season, pipe separated
S01 -> Stagione | S | Stagione completa
So when a complete season has to be searched, all the elements are searched
Then a section for the rest, divided into 2 parts, also pipe separated
S -> Stagione | S
E -> Episodio | X | E
Obviously always case insensitive
So during a search, if the specified language is in use, nefarious will search all combinations instead:

Stagione 01 Episodio 01
Stagione 01 X 01
Stagione 01 E 01
S 01 Episodio 01
S 01 X 01
S 01 E 01

The only "problem" is the surrounding space. When a single-letter separator like X or E is in use, it's almost certain that there's no space, so S01E01 or S01X01. When a full word is used, the space is usually present, but this varies; unfortunately there are a lot of bad releasers out there! Maybe the problem can easily be solved through the separator in the table, so if
**E** -> Episodio | X | E
it means that the X has a space before and after, and if
**E** -> Episodio |X|E
it means that the X and E don't have any space.
This is only one idea. I think a pre-built table is very good, but I also think it's best to let the user choose all the possible translations.
That way nefarious does all the dirty work and there's no need to have personalized indexers inside Jackett (I personally edited a lot of local indexer definitions to avoid this problem!)
What do you think about it?
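As a rough illustration of the table idea, the sketch below enumerates the six combinations listed above and, as an alternative to encoding spaces in the table, builds a single regex that tolerates optional separators. The list layout and both function names are assumptions made for this example.

```python
import itertools
import re

# Substitution table from the comment above (Italian); the list layout is an assumption.
SEASON_WORDS = ["Stagione", "S"]
EPISODE_WORDS = ["Episodio", "X", "E"]


def search_strings(season: int, episode: int) -> list[str]:
    """Enumerate every season/episode word combination as a search string."""
    return [
        f"{s} {season:02d} {e} {episode:02d}"
        for s, e in itertools.product(SEASON_WORDS, EPISODE_WORDS)
    ]


def tolerant_pattern(season: int, episode: int) -> re.Pattern:
    """One case-insensitive regex covering all combinations, with optional separators."""
    s_alt = "|".join(re.escape(w) for w in SEASON_WORDS)
    e_alt = "|".join(re.escape(w) for w in EPISODE_WORDS)
    sep = r"[\s._-]*"  # zero or more spaces, dots, underscores or dashes
    return re.compile(
        rf"\b(?:{s_alt}){sep}0*{season}{sep}(?:{e_alt}){sep}0*{episode}\b",
        re.IGNORECASE,
    )


print(search_strings(1, 1))
pattern = tolerant_pattern(1, 1)
for name in ["Show.S01E01.720p", "Show Stagione 1 Episodio 01", "Show S01E11"]:
    print(name, bool(pattern.search(name)))
```

Making the separator optional in the regex sidesteps the question of whether a releaser wrote `S01E01` or `S 01 E 01`, at the cost of a slightly looser match.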

@lardbit
Owner

lardbit commented Nov 7, 2019

As far as the translation table, I'm suggesting we start off with only Italian and then future users can add to the translation table with additional languages. So, nefarious would check to see if the translation table matched the current language and then substitute every word found in the jackett results to find a match.

Currently, nefarious's approach is to send two TV queries to jackett. Let's say the show is Rick and Morty, and the user is looking for season 3, episode 1. The two queries would be:

  1. Rick and Morty
  2. Rick and Morty S03E01

nefarious scans through every result and tries to find a matching regex.

With translations in place, the query sent to jackett would need to be adjusted (however, in Italian it sounds like it would still be S03E01). And the regex patterns would definitely need to use the translations to look for episodio, stagione etc.

By querying jackett with only the show's name (1. Rick and Morty), nefarious throws a wider net and then looks for the specific episode/season. I've found this gives better results than only doing 2. Rick and Morty S03E01, based on how indexing sites' search functions work.
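The "wide net, then filter" approach can be sketched as follows. This is an illustration under assumptions, not nefarious's implementation: `find_episode` is a hypothetical helper, the filter is a bare `SxxEyy` check rather than the real parser regexes, and `fake_search` is a stand-in for a Jackett request.

```python
import re
from typing import Callable, Iterable


def find_episode(
    search: Callable[[str], Iterable[str]],
    show: str,
    season: int,
    episode: int,
) -> list[str]:
    """Query with the bare title and with SxxEyy, then keep only matching results."""
    tag = f"S{season:02d}E{episode:02d}"
    queries = [show, f"{show} {tag}"]
    wanted = re.compile(rf"\b{tag}\b", re.IGNORECASE)

    matches: list[str] = []
    seen: set[str] = set()
    for query in queries:
        for name in search(query):
            # Skip duplicates returned by both queries and anything off-target.
            if name not in seen and wanted.search(name):
                seen.add(name)
                matches.append(name)
    return matches


# Stand-in for a Jackett call; a real client would make an HTTP request here.
def fake_search(query: str) -> list[str]:
    catalog = [
        "Rick.and.Morty.S03E01.720p",
        "Rick.and.Morty.S03E02.720p",
        "Rick and Morty s03e01 1080p",
    ]
    terms = query.lower().split()
    return [n for n in catalog if all(t in n.lower().replace(".", " ") for t in terms)]


print(find_episode(fake_search, "Rick and Morty", 3, 1))
```

Passing the search function in as a parameter keeps the filtering logic testable without a running Jackett instance.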

@Jorman
Author

Jorman commented Nov 7, 2019

Yep, basically you're right, but remember that on the Jackett side most trackers are configured to return only some pages, most of the time only the first one, not all results. So, for example, if you search only for rick and morty, you may get other results instead of the one you're looking for.
I think in the past I edited one Jackett tracker to do this kind of search, because of some problems with characters like è or ò and so on.
I know that making more than one search (because of permutations) is a long and sometimes unnecessary job.
Just one question: why do you suggest starting with only one language profile? I thought that one "universal" page for each profile would be enough. Am I wrong? Like on the Sonarr side.
Each language profile, which a user can add or not (the default is English), has its own table, so it can be different for each language.
Do you think it's best to make a wider search and then filter with the regex and the language table?

@lardbit
Owner

lardbit commented Nov 7, 2019

I see what you're saying about Jackett and some sites only returning the first page for results, so only searching Rick and Morty would be really limiting the results. This is why I'm searching Rick and Morty S01E03 AND Rick and Morty as a "catch all" in case Rick and Morty S01E03 didn't match with anything on the tracker's site. We could add more permutations like you're saying, by searching:

  • Rick and Morty
  • Rick and Morty S01E03
  • Rick and Morty 01x03
  • Rick and Morty Season 1 Episode 3

But searching that many variations would hit overhead and performance pretty hard.
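For a sense of scale, the four variations above can be generated mechanically, as in this small sketch. The function names are hypothetical, and the cost model (every query sent to every configured indexer) is an assumption about how the searches fan out.

```python
def query_variations(show: str, season: int, episode: int) -> list[str]:
    """Return the four query strings listed above for one episode."""
    return [
        show,
        f"{show} S{season:02d}E{episode:02d}",
        f"{show} {season:02d}x{episode:02d}",
        f"{show} Season {season} Episode {episode}",
    ]


def request_count(queries: list[str], indexer_count: int) -> int:
    """Assumed cost model: each query is sent to every indexer."""
    return len(queries) * indexer_count


queries = query_variations("Rick and Morty", 1, 3)
print(queries)
print(request_count(queries, indexer_count=10))  # 10 indexers is an arbitrary example
```

Under that assumption, every extra variation adds one request per indexer for each episode searched, which is the overhead being weighed here.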

Regarding the language profile, I would prefer to make users' language translations automatically available in the main translation table. For instance, if I let every user define their own translations, it wouldn't benefit the next Italian user. They'd have to do it themselves as well. I'm envisioning a simple file in the application that would define language translations, and people could submit GitHub pull requests to add/update their own translations so future users could benefit. I just want to reduce the redundancy of translations. Does that make sense?

@Jorman
Author

Jorman commented Nov 7, 2019

Yep, permutations are very bad for performance!
I can only suggest searching
Rick and Morty 01 03 instead of all the others, then applying all the language regex translations. Maybe that's better? It's only one idea, maybe bad, maybe not; it has to be tested.
Yes, that makes sense, and in the end I think it's the best solution to avoid mistakes. It will mean a lot of work on your side, because when the project starts to support more languages you'll have to manage every single pull request that adds and/or corrects a substitution. Maybe one possibility is to collect all the custom substitutions that users make (with user agreement), filter out duplicates and apply the most used ones.
I suggest making all the agreed language substitutions visible to users (inside the language profiles), in order to avoid useless requests for ones that are already present.
What do you think?
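The "collect, deduplicate, keep the most used" idea could be sketched like this. Everything here is hypothetical: no such collection feature exists in this thread, and the submission format (one dict per user) and the `most_used` helper are invented for the example.

```python
from collections import Counter


def most_used(submissions: list[dict[str, str]], key: str, top: int = 3) -> list[str]:
    """Return the most frequently submitted substitutions for one table key."""
    # Normalise case and surrounding whitespace so near-duplicates count together.
    counts = Counter(s[key].strip().lower() for s in submissions if s.get(key))
    return [word for word, _ in counts.most_common(top)]


# Hypothetical user submissions for the Italian "season" entry.
submissions = [
    {"season": "Stagione"},
    {"season": "stagione "},
    {"season": "S"},
    {"season": "Stagione completa"},
    {"season": "S"},
    {"season": "stagione"},
]
print(most_used(submissions, "season", top=2))
```

A maintainer would still want to review the aggregated result before shipping it, since frequency alone does not guarantee a substitution is correct.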

@lardbit
Owner

lardbit commented Nov 11, 2019

Thanks for the suggestions. I think there are definitely some great ideas here. If you're interested in learning Python, let me know and I can help you get up and running to make a pull request. But I'm going to brainstorm and hopefully come up with a solution soon.

@Jorman
Author

Jorman commented Nov 14, 2019

You're welcome! I always have some ideas, maybe not always good ones, but with a little brainstorming anything can become good!
As you can see I try to use some languages, but I never properly learned one, so I don't know if I'm able to learn Python. Sometimes I feel like that scene in The Matrix, where a guy watching the falling code says that he can see it and understand it, but he can't write it!
Lol, by the way, if you think it's worth it, maybe I can give it a try.
J

@lardbit lardbit added enhancement New feature or request help wanted Extra attention is needed labels Dec 14, 2019
@karibuTW

Hello, back on this subject: any plan to move from Transmission to rtorrent? (Or to allow connecting to an rtorrent daemon?)

Thank you

@lardbit
Owner

lardbit commented Apr 25, 2022

@karibuTW, I do not have any plans to support additional torrent clients. I'm trying to keep nefarious as basic as possible.

@karibuTW

Thanks @lardbit for the reply.

3 participants