[vxxx] add new extractors for vxxx and "friend" sites #31288

tabjy · 2022-10-14T06:49:17Z

Before submitting a pull request make sure you have:

Searched the bugtracker for similar pull requests
Read adding new extractor tutorial
Read youtube-dl coding conventions and adjusted the code to meet them
Covered the code with tests (note that PRs without tests will be REJECTED)
Checked the code with flake8

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

I am the original author of this code and I am willing to release it under Unlicense
I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

Bug fix
Improvement
New extractor
New feature

Description of your pull request and other information

This pull request adds an extractor for vxxx.com (NSFW!) and its "friend" sites, which presumably use the same technology stack and can therefore be extracted in a similar way. These sites are:

All sites below are NSFW!

Since there is no existing issue requesting support for the above-mentioned sites, I'm attaching the site support request info here:

Checklist

I'm reporting a new site support request
I've verified that I'm running youtube-dl version 2021.12.17
I've checked that all provided URLs are alive and playable in a browser
I've checked that none of provided URLs violate any copyrights
I've searched the bugtracker for similar site support requests including closed ones

Example URLs

All links below are NSFW!

Single video: https://vxxx.com/video-80747/
Single video: https://bdsmx.tube/video/127583/latex-puppy-leashed/
Single video: https://inporn.com/video/533613/2k-t-2nd-season-parm-151/
Single video: https://xmilf.com/video/143777/big-boob-brunette-masturbates3/
Single video: https://blackporn.tube/video/10043813/young-ebony-babe-gets-super-wet/
Single video: https://mrgay.com/video/10169199/jpn-crossdresser-6/

None of these sites supports playlists.

dirkf

Thanks for your work!

I've checked that none of provided URLs violate any copyrights

Really?

Generally we assume that even an apparently fly-by-night site like these has permission to serve the media if it appears to operate a DMCA policy. A non-JS example page (that yt-dl would see if it downloaded the target URL) doesn't show evidence of this but perhaps the JS-enabled pages do?

I've made some suggestions. The main thing is that this should all go in one module and avoid duplicated code in derived extractor classes. Otherwise it's all pretty good.

youtube_dl/extractor/bdsmxtube.py

youtube_dl/extractor/xmilf.py

youtube_dl/extractor/bdsmxtube.py

youtube_dl/extractor/vxxx.py

youtube_dl/extractor/xmilf.py

youtube_dl/extractor/blackporntube.py

youtube_dl/extractor/vxxx.py

Co-authored-by: dirkf <fieldhouse@gmx.net>

tabjy · 2022-10-29T06:53:50Z

@dirkf

Sorry I've been quite busy over the last few weeks. I've applied your suggested changes and rebased onto the latest master.

Generally we assume that even an apparently fly-by-night site like these has permission to serve the media if it appears to operate a DMCA policy.

Yah they're quite sketchy. They do have DMCA policy pages, but some are just straight-out blank...

(Again, all sites below are NSFW.)

https://vxxx.com/information/dmca
https://bdsmx.tube/information/dmca/ (empty content)
https://inporn.com/information/dmca/
https://xmilf.com/information/dmca/ (empty content)
https://blackporn.tube/information/dmca/ (empty content)
https://mrgay.com/information/dmca/

Maybe we could remove support for those with empty DMCA pages?

dirkf · 2022-10-30T12:38:44Z

Please exclude any sites that don't have a working DMCA page (minimal requirement: valid email address or working contact page).

If you want to make it easier to revert any excluded sites, omit them from extractor/extractors.py and either set the class var _WORKING to False with an appropriate comment or just wrap a block of excluded sites in """...""", so that yt-dl can't see the sites.
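As a rough illustration of the `_WORKING` option, the sketch below marks one site class as not working. The `InfoExtractor` class here is only a minimal stand-in so that the snippet runs on its own, and `ExampleFriendSiteIE` with its URL pattern is hypothetical; in a real extractor module the base class would be imported from youtube-dl instead.

```python
class InfoExtractor(object):
    """Minimal stand-in for youtube-dl's base extractor class (assumption)."""
    _WORKING = True

    @classmethod
    def working(cls):
        # Reports whether the extractor is considered usable.
        return cls._WORKING


class ExampleFriendSiteIE(InfoExtractor):  # hypothetical extractor name
    # Excluded for now: the site's DMCA page has no contact details.
    _WORKING = False
    _VALID_URL = r'https?://(?:www\.)?friend-site\.example/video/(?P<id>\d+)'


print(ExampleFriendSiteIE.working())
```

Keeping the class in the module but flagged this way means that re-enabling a site later only needs the flag flipped and the class listed again in extractor/extractors.py.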

tabjy · 2022-11-02T18:26:49Z

@dirkf

Please exclude any sites that don't have a working DMCA page

Done. Thank you very much!

dirkf

Some changes needed based on the CI tests.

dirkf · 2023-02-03T03:26:30Z

youtube_dl/extractor/vxxx.py

+    unified_timestamp,
+    url_or_none,
+)
+


str.maketrans() and str.translate() need to be shimmed for Python 2.

If this works the definitions can eventually be moved into compat.py.

Suggested change

+try:
+    compat_str_maketrans, compat_str_translate = (
+        compat_str.maketrans,
+        lambda s, table: s.translate(table)
+    )
+except AttributeError:
+    # Python 2
+    def compat_str_maketrans(x, *args):
+        if not args:
+            return x
+        y, z = args[0], args[1] if len(args) > 1 else ''
+        if len(x) != len(y):
+            raise ValueError(
+                'the first two maketrans arguments must have equal length')
+        tbl = dict(zip(x, y))
+        tbl.update((k, None) for k in z)
+        return tbl
+    def compat_str_translate(s, table):
+        def xlate(c):
+            try:
+                return table[c] or ''
+            except LookupError:
+                return c
+        return ''.join(xlate(c) for c in s)
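A quick way to sanity-check the fallback branch is to run it next to the built-ins on Python 3 and compare results. In this sketch the names `py2_maketrans` and `py2_translate` are hypothetical copies of the Python 2 branch above, and the sample string is made up.

```python
def py2_maketrans(x, *args):
    # Copy of the Python 2 fallback; with one argument the mapping is
    # passed through unchanged.
    if not args:
        return x
    y, z = args[0], args[1] if len(args) > 1 else ''
    if len(x) != len(y):
        raise ValueError(
            'the first two maketrans arguments must have equal length')
    tbl = dict(zip(x, y))             # characters to replace
    tbl.update((k, None) for k in z)  # characters to delete
    return tbl


def py2_translate(s, table):
    def xlate(c):
        try:
            return table[c] or ''     # None means "delete this character"
        except LookupError:
            return c                  # unmapped characters pass through
    return ''.join(xlate(c) for c in s)


sample = 'a.b,c~d!'
builtin = sample.translate(str.maketrans('.,~', '+/=', '!'))
fallback = py2_translate(sample, py2_maketrans('.,~', '+/=', '!'))
print(builtin, fallback)
```

Note that the fallback table is keyed by characters while the built-in table is keyed by code points, so the two tables are not interchangeable; only the translated output should match.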

dirkf · 2023-02-03T03:30:39Z

youtube_dl/extractor/vxxx.py

+        def get_trans_tbl(from_, to, tbl={}):
+            k = (from_, to)
+            if not tbl.get(k):
+                tbl[k] = str.maketrans(from_, to)


Suggested change

-                tbl[k] = str.maketrans(from_, to)
+                tbl[k] = compat_str_maketrans(from_, to)

dirkf · 2023-02-03T03:32:27Z

youtube_dl/extractor/vxxx.py

+        trans_tbl = get_trans_tbl(
+            '\u0410\u0412\u0421\u0415\u041c.,~',
+            'ABCEM+/=')
+        return base64.b64decode(e.translate(trans_tbl)).decode()


Suggested change

-        return base64.b64decode(e.translate(trans_tbl)).decode()
+        return base64.b64decode(compat_str_translate(e, trans_tbl)).decode()
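For context, the decoding step in this hunk amounts to a character substitution followed by ordinary Base64 decoding. The sketch below shows that idea on Python 3 with the same two alphabets as the diff; the helper names, the `encode_obfuscated` round-trip helper, and the sample path are all made up for illustration and are not taken from the extractor.

```python
import base64

# Cyrillic letters that look like Latin A, B, C, E, M, plus '.', ',' and '~'
# standing in for '+', '/' and '=' (same alphabets as in the diff above).
OBFUSCATED = '\u0410\u0412\u0421\u0415\u041c.,~'
STANDARD = 'ABCEM+/='
DECODE_TBL = str.maketrans(OBFUSCATED, STANDARD)
ENCODE_TBL = str.maketrans(STANDARD, OBFUSCATED)


def decode_obfuscated(e):
    # Map the look-alike characters back, then decode as normal Base64.
    return base64.b64decode(e.translate(DECODE_TBL)).decode()


def encode_obfuscated(s):
    # Inverse operation, only used here to build a test input.
    return base64.b64encode(s.encode()).decode().translate(ENCODE_TBL)


path = '/get_file/1/example.mp4'  # made-up value
scrambled = encode_obfuscated(path)
print(decode_obfuscated(scrambled))
```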

dirkf · 2023-02-03T03:34:41Z

youtube_dl/extractor/vxxx.py

+                self._BASE_URL,
+                self._decode_base164(format_object[0]['video_url'])
+            ),
+            video_id, 'mp4')


Let's try this.

Suggested change

-            video_id, 'mp4')
+            video_id, 'mp4', entry_protocol='m3u8_native')

Otherwise the download tests will have to be tweaked to skip the actual download.

dirkf · 2023-02-03T03:37:57Z

youtube_dl/extractor/vxxx.py

+            'categories': ['Asian', 'Brunette', 'Casting', 'HD', 'Japanese',
+                           'JAV Uncensored'],
+            'age_limit': 18,
+        },


If m3u8_native doesn't work, put this in (here and in the other tests). The skip_download line can be commented out for local testing.

Suggested change

-        },
+        },
+        'params': {
+            # ffmpeg download
+            'skip_download': True,
+        },

dirkf added the nsfw and site-support-request (Add extractor(s) for a new domain) labels Oct 15, 2022

dirkf requested changes Oct 15, 2022


tabjy and others added 11 commits October 29, 2022 02:42

[VXXX] Implement extractor for vxxx.com

6b7441e

[VXXX] Fix the non-standard base164 encoding

c0bda23

[VXXX] Support "friend" site: bdsmx.tube

a59f77e

[VXXX] Support "friend" site: inporn.com

ba4c5b3

[VXXX] Support "friend" site: xmilf.com

aaafaa2

[VXXX] Support "friend" site: blackporn.tube

a6a1c14

[VXXX] Support "friend" site: mrgay.com

f2398c0

[VXXX] Explicitly set age_limit to 18

9c5c778

[VXXX] Switch to HSL for much faster downloads

1e52250

Apply suggestions from code review

8414d8d

Co-authored-by: dirkf <fieldhouse@gmx.net>

[VXXX] Refactor and apply further code review suggestions

9d2b2f9

tabjy force-pushed the vxxx_extractor branch from 23b1ba5 to 9d2b2f9 Compare October 29, 2022 06:42

[VXXX] Remove supports for site missing DMCA notices

191d1d0

tabjy requested a review from dirkf November 11, 2022 21:05

[VXXX] fix liting

76738e4

dirkf requested changes Feb 3, 2023
