Move porn sites extractors to a separate branch or repo or simple way to delete them #6497

Closed
remitamine opened this issue Aug 8, 2015 · 9 comments

Comments

@remitamine
Collaborator

@remitamine remitamine commented Aug 8, 2015

Some way to make this type of extractor optional and not installed by default.

@dstftw
Collaborator

@dstftw dstftw commented Aug 8, 2015

I don't see much sense in this, since it will result in additional headaches and bogus bug reports. There is an --age-limit option that can disable sites for adults.
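To illustrate the idea behind age-based filtering, here is a minimal sketch. It is a toy model only: the classes and the `AGE_LIMIT` attribute are hypothetical stand-ins and not youtube-dl's real extractor API. The only name borrowed from the project is the `is_suitable(age_limit)` method, which the script later in this thread also calls.

```python
# Toy model of age-based extractor filtering.
# ToyExtractor, NewsExtractor, AdultExtractor and AGE_LIMIT are
# hypothetical; they only mimic the shape of an is_suitable() check.


class ToyExtractor(object):
    IE_NAME = 'toy'
    AGE_LIMIT = 0  # minimum viewer age this site requires (assumption)

    def is_suitable(self, age_limit):
        # age_limit is the viewer's age; None means "no restriction".
        if age_limit is None:
            return True
        return age_limit >= self.AGE_LIMIT


class NewsExtractor(ToyExtractor):
    IE_NAME = 'news'


class AdultExtractor(ToyExtractor):
    IE_NAME = 'adult'
    AGE_LIMIT = 18


def suitable_extractors(extractors, age_limit):
    """Return the names of extractors usable by a viewer of the given age."""
    return [ie.IE_NAME for ie in extractors if ie.is_suitable(age_limit)]


extractors = [NewsExtractor(), AdultExtractor()]
print(suitable_extractors(extractors, 10))
print(suitable_extractors(extractors, None))
```

With filtering of this kind the extractors stay installed and are only skipped at run time, which is the difference from removing them from the package.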

@dstftw dstftw closed this Aug 8, 2015
@remitamine
Collaborator Author

@remitamine remitamine commented Aug 8, 2015

I know about --age-limit.
But I think it's better to keep them in a separate place and not ship them by default.

@dstftw
Collaborator

@dstftw dstftw commented Aug 8, 2015

No. Not shipping them by default will definitely result in tons of bogus reports for unsupported URLs that are actually supported but not included in the default package.

@remitamine
Collaborator Author

@remitamine remitamine commented Aug 8, 2015

I don't mean removing them altogether, just moving them. If someone wants them, they can get them from a separate place (for example, a separate repo).
Maybe keep the issue open for discussion as a feature request.
Others may have ideas about a better way to do this.

@dstftw
Collaborator

@dstftw dstftw commented Aug 8, 2015

Once again: an average user will never bother doing anything more complex than grabbing the binary or copy-pasting the install command, let alone taking additional steps. Adult sites have been part of youtube-dl for years now, so imagine what will happen if we stop shipping them by default: we will drown in bogus bug reports (same as the Gentoo no-offensive-sites package, but on a much larger scale). It probably makes some sense to offer a non-default lite version, but that still does not resolve the aforementioned problem, since one may mistakenly grab the wrong package.

@remitamine
Collaborator Author

@remitamine remitamine commented Aug 8, 2015

simple way to delete them

Then what about shipping a small script to remove them? If someone doesn't need them, they just execute it.

@dstftw
Collaborator

@dstftw dstftw commented Aug 8, 2015

I don't know if this will work for the py2exe binary.

@remitamine
Collaborator Author

@remitamine remitamine commented Aug 8, 2015

Is it possible to also add it as a compile-time flag (option), so that versions can be built with and without this type of site?
But I can't test it; I'm working on Linux.
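A rough sketch of what such a build-time option could look like, purely as an illustration: everything here (the `render_imports` helper, the `include_adult` flag, and the example module names) is hypothetical and does not exist in youtube-dl. The idea is that a build step generates the extractor import list and leaves out the flagged modules for a lite build.

```python
# Hypothetical build step: generate the extractor import list with or
# without adult-site modules. Module names and the flag are illustrative.


def render_imports(modules, include_adult=True):
    """Build import lines for the selected extractor modules.

    modules: list of (module_name, class_names, is_adult) tuples.
    """
    lines = []
    for module_name, class_names, is_adult in modules:
        if is_adult and not include_adult:
            continue  # left out of the lite build
        lines.append(
            'from .%s import %s' % (module_name, ', '.join(class_names)))
    return '\n'.join(lines) + '\n'


MODULES = [
    ('examplenews', ['ExampleNewsIE'], False),
    ('exampleadult', ['ExampleAdultIE', 'ExampleAdultPlaylistIE'], True),
]

full = render_imports(MODULES)
lite = render_imports(MODULES, include_adult=False)
print(lite)
```

This does not address the concern raised earlier in the thread: a user who picks up the lite build by mistake would still see supported sites reported as unsupported.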

@remitamine
Collaborator Author

@remitamine remitamine commented Aug 8, 2015

I created this script to remove the unneeded extractors:

import os
import re

from youtube_dl.extractor import gen_extractors


def list_non_suitable_extractors(age_limit):
    """
    Return a list of extractors that are not suitable for the given age,
    sorted by extractor ID.
    """

    return sorted(
        filter(lambda ie: not ie.is_suitable(age_limit), gen_extractors()),
        key=lambda ie: ie.IE_NAME.lower())


# Matches one "from .<module> import ..." statement, either on a single
# line or as a parenthesized multi-line import. %s is the module name.
IMPORT_RE = r'from \.%s import (?:[A-Za-z0-9,\s]+|\([^\)]+\))\n'
# Matches from a '#' up to the next '#' when the text in between mentions
# the module. Crude: it can also remove code that sits between comments.
COMMENT_RE = r'#[^#]+%s[^#]+'


def strip_references(path, module_names, strip_comments=False):
    """Remove imports (and optionally comment blocks) for the given modules."""
    with open(path) as f:
        content = f.read()
    for name in module_names:
        # Escape the name so it is always matched literally.
        content = re.sub(IMPORT_RE % re.escape(name), '', content)
        if strip_comments:
            content = re.sub(COMMENT_RE % re.escape(name), '', content)
    with open(path, 'w') as f:
        f.write(content)


non_suitable_extractors = list_non_suitable_extractors(0)

# One module can define several extractors, so deduplicate module names.
module_names = sorted(set(
    ie.__module__.split('.')[-1] for ie in non_suitable_extractors))

# Paths are relative to the root of a youtube-dl source checkout.
extractors_dir = 'youtube_dl/extractor/'

strip_references(extractors_dir + '__init__.py', module_names)
# generic.py gets the comment pass as well, to drop blocks naming these sites.
strip_references(extractors_dir + 'generic.py', module_names, strip_comments=True)

# Finally delete the extractor modules themselves.
for name in module_names:
    extractor_filename = extractors_dir + name + '.py'
    if os.path.isfile(extractor_filename):
        os.remove(extractor_filename)
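
To see what the import-matching regular expression from the script does without touching a real checkout, the following sketch applies the same pattern to an in-memory sample. The module names (`alpha`, `beta`, `gamma`), the `SAMPLE` text, and the `strip_imports` helper are made up for illustration; `re.escape` is added here as a defensive measure.

```python
import re

# Same import-matching pattern as in the script; %s is the module name.
IMPORT_RE = r'from \.%s import (?:[A-Za-z0-9,\s]+|\([^\)]+\))\n'

# Hypothetical stand-in for the contents of extractor/__init__.py.
SAMPLE = (
    'from .alpha import AlphaIE\n'
    'from .beta import (\n'
    '    BetaIE,\n'
    '    BetaPlaylistIE,\n'
    ')\n'
    'from .gamma import GammaIE\n'
)


def strip_imports(content, module_names):
    """Remove the import statement of each named module from content."""
    for name in module_names:
        content = re.sub(IMPORT_RE % re.escape(name), '', content)
    return content


result = strip_imports(SAMPLE, ['beta'])
print(result)
```

Both the single-line and the parenthesized multi-line import forms are removed as a whole, while the imports of other modules are left in place.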