[Cardigann] Add infohash feature for download block #12258
Question: is the infohash routine in the download block aware of the before block?
---
id: kinozal
name: Kinozal
description: "Kinozal is a RUSSIAN Semi-Private Torrent Tracker for MOVIES / TV / MUSIC"
language: ru-ru
type: semi-private
encoding: windows-1251
links:
  - http://kinozal.tv/ # site forces http, https is not supported
caps:
  categorymappings:
    # TV
    - {id: 1001, cat: TV, desc: "All TV Shows"}
    - {id: 45, cat: TV, desc: "Russian TV Series"}
    - {id: 46, cat: TV, desc: "TV Series"}
    # Movies
    - {id: 1002, cat: Movies, desc: "All Movies"}
    - {id: 8, cat: Movies, desc: "Movies - Comedy"}
    - {id: 6, cat: Movies, desc: "Movies - Action / War"}
    - {id: 15, cat: Movies, desc: "Movies - Thriller / Detective"}
    - {id: 17, cat: Movies, desc: "Movies - Drama"}
    - {id: 35, cat: Movies, desc: "Movies - Melodrama"}
    - {id: 39, cat: Movies, desc: "Movies - Indian"}
    - {id: 13, cat: Movies, desc: "Movies - Science Fiction"}
    - {id: 14, cat: Movies, desc: "Movies - Fantasy"}
    - {id: 24, cat: Movies, desc: "Movies - Horror / Mystery"}
    - {id: 11, cat: Movies, desc: "Movies - Adventure"}
    - {id: 10, cat: Movies, desc: "Movies - Russian Movies"}
    - {id: 9, cat: Movies, desc: "Movies - Historical"}
    - {id: 47, cat: Movies, desc: "Movies - Asian"}
    - {id: 18, cat: Movies, desc: "Movies - Documentaries"}
    - {id: 37, cat: Movies, desc: "Movies - Sport"}
    - {id: 12, cat: Movies, desc: "Movies - Kids / Family"}
    - {id: 7, cat: Movies, desc: "Movies - Classic"}
    - {id: 48, cat: Movies, desc: "Movies - Concerts"}
    - {id: 49, cat: Movies, desc: "Movies - Shows / TV Shows"}
    - {id: 50, cat: Movies, desc: "Movies - TV Show Mir"}
    - {id: 38, cat: Movies, desc: "Movies - Theatre, Opera, Ballet"}
    - {id: 16, cat: Movies, desc: "Movies - Erotica"}
    # Cartoons
    - {id: 1003, cat: TV/Anime, desc: "All Cartoons/Anime"}
    - {id: 21, cat: TV/Anime, desc: "Cartoons"}
    - {id: 22, cat: TV/Anime, desc: "Cartoons - Russian"}
    - {id: 20, cat: TV/Anime, desc: "Cartoons - Anime"}
    # Music
    - {id: 1004, cat: Audio, desc: "All Music"}
    - {id: 3, cat: Audio, desc: "Music"}
    - {id: 4, cat: Audio, desc: "Music - Russian"}
    - {id: 5, cat: Audio, desc: "Music - Collections"}
    - {id: 42, cat: Audio, desc: "Music - Classical"}
    # Other
    - {id: 1006, cat: TV/Other, desc: "Shows, Concerts, Sports"}
    - {id: 2, cat: Audio/Audiobook, desc: "Other - AudioBooks"}
    - {id: 1, cat: Audio/Video, desc: "Other - Music Video's"}
    - {id: 23, cat: Console, desc: "Other - Games"}
    - {id: 32, cat: PC, desc: "Other - Programs"}
    - {id: 40, cat: Other, desc: "Other - Design / Graphics"}
    - {id: 41, cat: Books, desc: "Other - Library"}
  modes:
    search: [q]
    tv-search: [q, season, ep]
    movie-search: [q]
    music-search: [q]
    book-search: [q]
settings:
  - name: username
    type: text
    label: Username
  - name: password
    type: password
    label: Password
  - name: freeleech
    type: checkbox
    label: Search freeleech only
    default: false
  - name: striprussian
    type: checkbox
    label: Strip Russian Letters
    default: true
  - name: sort
    type: select
    label: Sort requested from site
    default: 0
    options:
      0: created
      1: seeders
      3: size
  - name: type
    type: select
    label: Order requested from site
    default: 0
    options:
      0: desc
      1: asc
login:
  path: takelogin.php
  method: post
  inputs:
    username: "{{ .Config.username }}"
    password: "{{ .Config.password }}"
  error:
    - selector: div.bx1:has(div.red)
      message:
        selector: div.bx1 div.red
  test:
    path: userdetails.php
download:
  before:
    path: get_srv_details.php
    inputs:
      action: 2
      id: "{{ .DownloadUri.Query.id }}"
  infohash:
    hash:
      selector: li:first-child
      filters:
        - name: regexp
          args: ([A-F|0-9]{40})
        - name: strdump
          args: hash
    title:
      selector: div.b
      filters:
        - name: trim
        - name: strdump
          args: title
search:
  paths:
    # http://kinozal.tv/browse.php?s=lucifer+2017&g=0&c=0&v=0&d=0&w=0&t=0&f=0
    - path: browse.php
  keywordsfilters:
    # - name: diacritics # 8686
    #   args: replace
    - name: re_replace # S01 to 1
      args: ["(?i)\\bS0*(\\d+)\\b", "$1"]
    - name: re_replace # S01E01 to 1 1
      args: ["(?i)\\bS0*(\\d+)E0*(\\d+)\\b", "$1 $2"]
  inputs:
    # multi cat is not supported. so defaulting to ALL
    c: 0
    s: "{{ .Keywords }}"
    # where 0 title, 1 person, 2 genres, 3 regular expression
    g: 0
    # format 0 all
    v: 0
    # released 0 all
    d: 0
    # filter 0 all, 1 today, 2 yesterday, 3 in 3 days, 4 this week, 5 per month, 6-10 size ranges, 11 gold, 12 silver
    w: "{{ if .Config.freeleech }}11{{ else }}0{{ end }}"
    t: "{{ .Config.sort }}"
    f: "{{ .Config.type }}"
  rows:
    selector: table > tbody > tr:has(td.bt)
  fields:
    category:
      selector: td.bt img
      attribute: onclick
      filters:
        - name: re_replace
          args: ["[^\\d+]", ""]
    title:
      selector: td.nam a[href^="/details.php?id="]
      filters:
        # normalize to SXXEYY format
        - name: replace
          args: [" / ", " "]
        - name: replace
          args: ["Кураж-Бамбей", "kurazh"]
        - name: replace
          args: ["Кубик в Кубе", "Kubik"]
        - name: replace
          args: ["Кравец", "Kravec"]
        - name: re_replace
          args: ["\\((\\d+)\\s+[Сс]езон:\\s+(?:(\\d+-*\\d*)\\s+[Сс]ери[ия]\\s+.*\\d+)\\)(.*)\\s([12][0-9]{3})\\s(.*)", "$3 - S$1E$2 - rus $5"]
        - name: re_replace
          args: ["(\\([А-Яа-яЁё\\W]+\\))|(^[А-Яа-яЁё\\W\\d]+\\/ )|([а-яА-ЯЁё \\-]+,+)|([а-яА-ЯЁё]+)", "{{ if .Config.striprussian }}{{ else }}$1$2$3$4{{ end }}"]
        - name: re_replace
          args: ["\\((\\d+p)\\)", "$1"]
        - name: replace
          args: ["-Rip", "Rip"]
        - name: replace
          args: ["WEB-DL", "WEBDL"]
        - name: replace
          args: ["WEBDLRip", "WEBDL"]
        - name: replace
          args: ["HDTVRip", "HDTV"]
    details:
      selector: td.nam a[href^="/details.php?id="]
      attribute: href
    download:
      selector: td.nam a[href^="/details.php?id="]
      attribute: href
    size:
      selector: td:nth-child(4)
      filters:
        - name: replace
          args: ["ТБ", "TB"]
        - name: replace
          args: ["ГБ", "GB"]
        - name: replace
          args: ["МБ", "MB"]
        - name: replace
          args: ["КБ", "KB"]
    seeders:
      selector: td:nth-child(5)
    leechers:
      selector: td:nth-child(6)
    # dates come in four flavours:
    date:
      # now
      # Today 09:10
      # Yesterday 13:04
      selector: td:nth-child(7):not(:contains("."))
      optional: true
      filters:
        - name: replace
          args: [" в", ""]
        - name: replace
          args: ["сейчас", "now"]
        - name: replace
          args: ["сегодня", "Today"]
        - name: replace
          args: ["вчера", "Yesterday"]
    date:
      # 24.10.2017 at 23:44
      selector: td:nth-child(7):contains(".")
      optional: true
      filters:
        - name: replace
          args: [" в", ""]
        - name: append
          args: " +00:00" # auto adjusted by site account profile
        - name: dateparse
          args: "02.01.2006 15:04 -07:00"
    downloadvolumefactor:
      case:
        a.r1: 0 # gold
        a.r2: 0.5 # silver
        "*": 1
    uploadvolumefactor:
      text: 1
    minimumratio:
      text: 1.0
# engine n/a
I can send you creds for my Kinozal account if you want to do your own debugging.
Yes, the infohash block doesn't affect any other block except selectors. The only change in behaviour is that if an infohash block is present, the selectors block won't be processed, whether it is present or not.
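The precedence described in this comment can be sketched roughly as follows. This is a minimal Python illustration only, not Jackett's actual code (Jackett is written in C#); the function name `pick_download_method` and the `"direct"` fallback label are hypothetical.

```python
def pick_download_method(download_block: dict) -> str:
    """Return which part of a parsed `download:` block would be used.

    Hypothetical helper: it mirrors the rule stated in the comment,
    not Jackett's real implementation.
    """
    if "infohash" in download_block:
        # infohash wins: selectors are ignored even if they are present
        return "infohash"
    if "selectors" in download_block:
        return "selectors"
    # Assumption: with neither block, the download link is used as-is
    return "direct"


print(pick_download_method({"infohash": {}, "selectors": []}))  # infohash
print(pick_download_method({"selectors": []}))  # selectors
```

The `before` block is deliberately not consulted here, since the comment says infohash only changes how selectors are treated.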
Just spotted an error in the regexp for the hash args: ([A-F|0-9]{40})
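One plausible reading of the error, offered as an assumption since the comment does not spell it out: inside a character class `|` is a literal pipe rather than alternation, so `[A-F|0-9]` also accepts `|` characters. A small Python sketch of the difference (the sample strings are made up, and Python's `re` stands in for the regex engine Jackett uses):

```python
import re

LOOSE = re.compile(r"([A-F|0-9]{40})")   # pattern as posted
STRICT = re.compile(r"([A-F0-9]{40})")   # same pattern without the pipe

valid_hash = "0123456789ABCDEF0123456789ABCDEF01234567"  # made-up, 40 hex chars
with_pipes = "0123456789ABCDEF||||456789ABCDEF01234567"  # made-up, contains pipes


def matches(pattern: re.Pattern, text: str) -> bool:
    """True if the pattern finds a 40-character run anywhere in the text."""
    return pattern.search(text) is not None


print(matches(LOOSE, valid_hash), matches(STRICT, valid_hash))  # True True
print(matches(LOOSE, with_pipes), matches(STRICT, with_pipes))  # True False
```

Note that neither pattern matches lowercase hex digits, so a page that prints the hash in lowercase would need `a-f` added as well.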
shit, I'm too tired, I may have been testing kinorun by mistake. I need to go to bed.
Don't worry, you got it. Hopefully, will issue a new pull request.
OK, so two main things changed.
I've pushed a commit that fixes these 2 problems, and I have tested the kinozal indexer, which now works great; I'm opening a separate pull request for it. EDIT: If possible, please test some indexers that use the download before block to check that this doesn't break anything. If it does, we would probably want to fix that before merging this.
awesome find!
Sure, go ahead with that and let me know if I can help in any way or if more changes are required.
I just noticed in siambit and if you would've done the same
I was confused for a bit: while testing the siambit indexer with your code, it failed to download when I tried the yaml with action, and again when I reset it back to _action, so I was starting to think your code was broken.
LGTM, tested kinozal, siambit, and EbookParadijs plus assorted public indexers
This is the doc I added to the wiki: https://github.com/Jackett/Jackett/wiki/Definition-format#download
Example of a download block using the infohash method:
download:
  # [OPTIONAL] HTTP request which needs to be done before downloading the file
  before:
    path: get_srv_details.php
    inputs:
      action: 2
      id: "{{ .DownloadUri.Query.id }}"
  # [OPTIONAL] If you only have a magnet hash to work with, this method will allow you to automatically generate a magnet URI
  infohash:
    # [OPTIONAL] if you want the infohash and title to come from the page generated by the previous BEFORE block then include this clause.
    # The default is false, which causes the infohash and title to come from the page you provided the link for in the search download block.
    before: true
    # [REQUIRED] Use this selector to provide the file hash for the &xt parameter of the magnet URI
    hash:
      # [REQUIRED] the selector to use to find the file hash
      selector: a[href^="magnet:?xt="]
      attribute: href
      # [OPTIONAL] a list of filters which should be applied to the result of this selector
      filters:
        - name: querystring
          args: xt
        - name: replace
          args: ["urn:btih:", ""]
    # [REQUIRED] Use this selector to provide the title for the &dn parameter of the magnet URI
    title:
      # [REQUIRED] The selector used to find the title
      selector: meta[property="og:title"]
      attribute: content
      # [OPTIONAL] a list of filters which should be applied to the result of this selector
      filters:
        - name: trim
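To illustrate what the `hash` filters in this example do, here is a rough Python equivalent of `querystring` followed by `replace`. It only approximates the Cardigann filters using the standard library; the function name and the sample magnet link are invented for the illustration.

```python
from urllib.parse import parse_qs, urlsplit


def hash_from_magnet_href(href: str) -> str:
    """Approximate `querystring: xt` then `replace: ["urn:btih:", ""]`."""
    query = urlsplit(href).query                  # everything after the "?"
    xt_values = parse_qs(query).get("xt", [""])   # value(s) of the xt parameter
    return xt_values[0].replace("urn:btih:", "")  # strip the URN prefix


sample = "magnet:?xt=urn:btih:0123456789ABCDEF0123456789ABCDEF01234567&dn=Example"
print(hash_from_magnet_href(sample))
# 0123456789ABCDEF0123456789ABCDEF01234567
```

If the link has no `xt` parameter, this sketch returns an empty string; how Jackett itself handles that case is not covered here.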
Happy to help :)
True, the sad part of testing.
This PR intends to provide a way to fix #11585 and #11389.
It adds an infohash block to the download block. The infohash block uses 2 selectors, one to get the hash and one to get the title of the torrent, both of which are needed to generate the magnet URL.
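As a rough illustration of why both values are needed, a magnet URL can be assembled from the hash (the `xt` parameter) and the title (the `dn` parameter). The Python sketch below is not the code in this PR; the function name and sample values are hypothetical, and a real implementation may also append tracker (`tr`) parameters.

```python
from urllib.parse import quote


def build_magnet_uri(info_hash: str, title: str) -> str:
    """Build a minimal magnet URI from an info hash and a display name."""
    # The title is percent-encoded so spaces and "&" cannot break the query
    return f"magnet:?xt=urn:btih:{info_hash}&dn={quote(title, safe='')}"


print(build_magnet_uri("0123456789ABCDEF0123456789ABCDEF01234567", "Some Show S01E01"))
# magnet:?xt=urn:btih:0123456789ABCDEF0123456789ABCDEF01234567&dn=Some%20Show%20S01E01
```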
As mentioned in the issues, since I was unable to test directly on the aforementioned sites, I modified the torrentv.yml definition to demonstrate how the infohash block works; the modified definition is published as a gist.
The infohash block looks like this (an excerpt taken directly from the modified definition):
My only concern is with the code, where a lot of duplication has arisen, and I can understand if it doesn't meet the quality bar of the project. However, if it shows good results with other trackers, we can definitely improve it and add documentation.