[vxxx] add new extractors for vxxx and "friend" sites #31288
Thanks for your work!
I've checked that none of provided URLs violate any copyrights
Really?
Generally we assume that even an apparently fly-by-night site like these has permission to serve the media if it appears to operate a DMCA policy. A non-JS example page (that yt-dl would see if it downloaded the target URL) doesn't show evidence of this but perhaps the JS-enabled pages do?
I've made some suggestions. The main thing is that this should all go in one module and avoid duplicated code in derived extractor classes. Otherwise it's all pretty good.
Co-authored-by: dirkf <fieldhouse@gmx.net>
Sorry I've been quite busy over the last few weeks. I've applied your suggested changes and rebased onto the latest master.
Yah they're quite sketchy. They do have DMCA policy pages, but some are just straight-out blank... (Again, all sites below are NSFW.)
Maybe we could remove support for those with an empty DMCA page?
Please exclude any sites that don't have a working DMCA page (minimal requirement: valid email address or working contact page). If you want to make it easier to revert any excluded sites, omit them from
Done. Thank you very much!
Some changes needed based on the CI tests.
    unified_timestamp,
    url_or_none,
)

str.maketrans() and str.translate() need to be shimmed for Python 2.
If this works, the definitions can eventually be moved into compat.py.
try:
    compat_str_maketrans, compat_str_translate = (
        compat_str.maketrans,
        lambda s, table: s.translate(table)
    )
except AttributeError:
    # Python 2
    def compat_str_maketrans(x, *args):
        if not args:
            return x
        y, z = args[0], args[1] if len(args) > 1 else ''
        if len(x) != len(y):
            raise ValueError(
                'the first two maketrans arguments must have equal length')
        tbl = dict(zip(x, y))
        tbl.update((k, None) for k in z)
        return tbl

    def compat_str_translate(s, table):
        def xlate(c):
            try:
                return table[c] or ''
            except LookupError:
                return c
        return ''.join(xlate(c) for c in s)
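One way to sanity-check the Python 2 fallback is to run it under Python 3 and compare it with the built-in string methods. The sketch below copies the fallback under the hypothetical names py2_maketrans and py2_translate; the sample text and translation arguments are made up for illustration and are not part of the PR.

```python
def py2_maketrans(x, *args):
    # Copy of the suggested fallback, renamed so it can coexist with str.maketrans.
    if not args:
        return x
    y, z = args[0], args[1] if len(args) > 1 else ''
    if len(x) != len(y):
        raise ValueError(
            'the first two maketrans arguments must have equal length')
    tbl = dict(zip(x, y))
    tbl.update((k, None) for k in z)
    return tbl


def py2_translate(s, table):
    # Characters mapped to None are dropped; unmapped characters pass through.
    def xlate(c):
        try:
            return table[c] or ''
        except LookupError:
            return c
    return ''.join(xlate(c) for c in s)


text = 'hello, world'  # illustrative input only
builtin = text.translate(str.maketrans('lo', '01', ','))
fallback = py2_translate(text, py2_maketrans('lo', '01', ','))
print(builtin == fallback)
```

The comparison covers the three-argument form (map, map, delete), which is a superset of the two-argument call the extractor makes.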
def get_trans_tbl(from_, to, tbl={}):
    k = (from_, to)
    if not tbl.get(k):
        tbl[k] = str.maketrans(from_, to)
- tbl[k] = str.maketrans(from_, to)
+ tbl[k] = compat_str_maketrans(from_, to)
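The helper relies on a mutable default argument acting as a cache, so each (from_, to) pair is only converted to a table once. A minimal Python 3 sketch of that behaviour follows; it uses str.maketrans as a stand-in for the compat shim, and the final return line is an assumption, since the diff context stops before the end of the function.

```python
def get_trans_tbl(from_, to, tbl={}):
    # The default dict is created once at definition time, so it persists
    # across calls and serves as a simple cache keyed by (from_, to).
    k = (from_, to)
    if not tbl.get(k):
        tbl[k] = str.maketrans(from_, to)
    return tbl[k]  # assumed: not visible in the quoted diff context


first = get_trans_tbl('ab', 'xy')
second = get_trans_tbl('ab', 'xy')
print(first is second)  # the second call reuses the cached table
```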
trans_tbl = get_trans_tbl(
    '\u0410\u0412\u0421\u0415\u041c.,~',
    'ABCEM+/=')
return base64.b64decode(e.translate(trans_tbl)).decode()
- return base64.b64decode(e.translate(trans_tbl)).decode()
+ return base64.b64decode(compat_str_translate(e, trans_tbl)).decode()
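To show what this decoding step does, here is a minimal self-contained sketch for Python 3 only, so it uses the plain str methods rather than the compat shims. The two translation strings are taken from the diff; the function names and the sample path are hypothetical, and the encoder exists only to build an input for the demo.

```python
import base64

# From the diff: Cyrillic look-alike letters plus '.', ',' and '~' stand in
# for the standard base64 characters 'A', 'B', 'C', 'E', 'M', '+', '/', '='.
_FROM = '\u0410\u0412\u0421\u0415\u041c.,~'
_TO = 'ABCEM+/='
_DECODE_TABLE = str.maketrans(_FROM, _TO)
_ENCODE_TABLE = str.maketrans(_TO, _FROM)


def decode_lookalike_base64(encoded):
    """Hypothetical helper: undo the substitution, then base64-decode."""
    return base64.b64decode(encoded.translate(_DECODE_TABLE)).decode()


def encode_lookalike_base64(text):
    """Inverse of the above, used only to produce a sample input."""
    return base64.b64encode(text.encode()).decode().translate(_ENCODE_TABLE)


sample = encode_lookalike_base64('/get_file/1/abc.mp4')  # made-up path
print(sample)
print(decode_lookalike_base64(sample))
```

Because the substituted characters look identical to their Latin counterparts in most fonts, decoding the string with base64 directly fails until the table has been applied.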
        self._BASE_URL,
        self._decode_base164(format_object[0]['video_url'])
    ),
    video_id, 'mp4')
Let's try this.
- video_id, 'mp4')
+ video_id, 'mp4', entry_protocol='m3u8_native')
Otherwise the download tests will have to be tweaked to skip the actual download.
        'categories': ['Asian', 'Brunette', 'Casting', 'HD', 'Japanese',
                       'JAV Uncensored'],
        'age_limit': 18,
    },
If m3u8_native doesn't work, put this in (here and in the other tests). The skip_download line can be commented out for local testing.
- },
+ },
+ 'params': {
+     # ffmpeg download
+     'skip_download': True,
+ },
Before submitting a pull request make sure you have:
In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:
What is the purpose of your pull request?
Description of your pull request and other information
This pull request adds an extractor for vxxx.com (NSFW!) and its "friend" sites, which presumably use the same technology stack and can therefore be extracted in a similar way. These sites are:
All sites below are NSFW!
Since there is no existing issue requesting support for the above-mentioned sites, I'm attaching the site support request info here:
Checklist
Example URLs
All links below are NSFW!
None of these sites supports playlists.