vshare.io seems revision #14473

Closed
awei78 opened this issue Oct 12, 2017 · 3 comments
Labels
bug

Comments

@awei78 awei78 commented Oct 12, 2017

Please follow the guide below

  • You will be asked some questions and requested to provide some information, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your issue (like this: [x])
  • Use the Preview tab to see what your issue will actually look like

Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2017.10.12. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.

  • I've verified and I assure that I'm running youtube-dl 2017.10.12

Before submitting an issue make sure you have:

  • At least skimmed through the README, most notably the FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

It seems the vshare.io website has been revised, and we can no longer get the video URL from links such as https://vshare.io/d/0f64ce6. The download link in the page source now has an empty href; the code looks like this:
document.getElementById('download-link').innerHTML = '<a style="text-decoration:none;" href="">Click here</a> to download requested file.';

How can the video URL be obtained now?
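
To illustrate the failure, here is a minimal sketch that applies the link pattern from the commented-out extractor code quoted later in this thread to the snippet above. The `find_download_url` helper and the `old_page` sample are hypothetical and exist only for this illustration; they are not part of youtube-dl.

```python
import re

# Link pattern copied from the commented-out extractor code quoted in this thread.
LINK_RE = re.compile(
    r'<a[^>]+href=(["\'])(?P<url>(?:https?:)?//.+?)\1[^>]*>[Cc]lick\s+here')

# What the revised page serves now: the href is empty.
new_page = ('<a style="text-decoration:none;" href="">Click here</a>'
            ' to download requested file.')

# Hypothetical example of a page that still carries a direct link.
old_page = ('<a style="text-decoration:none;" '
            'href="https://example.com/video.mp4">Click here</a>')


def find_download_url(html):
    """Return the linked URL, or None when the href is empty or missing."""
    match = LINK_RE.search(html)
    return match.group('url') if match else None


print(find_download_url(new_page))
print(find_download_url(old_page))
```

With an empty href the pattern has nothing to capture, so the URL has to come from somewhere else on the page.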

Author

@awei78 awei78 commented Oct 12, 2017

Decode the packed JS code directly! This needs the PyExecJS package for Python.
The code looks like this:

    def _real_extract(self, url):
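        # Try the official extractor first
        try:
            return super(VShareIE, self)._real_extract(url)
        except Exception:
            pass

        # The old approach fails when parsing the title, so it is abandoned
        video_id = self._match_id(url)
        # Download the page (also used for the thumbnail below)
        webpage = self._download_webpage(url, video_id)
        # Decode the packed JS code directly and extract the video info from it
        formats = self.get_formats(webpage)
        if not formats: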
            return super(VShareIE, self)._real_extract(url)

        title = self._html_search_regex(
            r'(?<=<title>)(.*)(?=</title>)', webpage, 'video title',
            default='video')
        ridx = title.rfind(' - ')
        if ridx != -1:
            title = title[0:ridx]
        description = self._og_search_description(webpage)
        thumbnail = self._html_search_regex(
            r'poster="([^"]+)"', webpage, 'thumbnail')
        thumbnail = thumbnail if thumbnail.startswith('http') else 'https:' + thumbnail
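        # Get the actual video content... vshare was revised and no longer exposes
        # its download URL here, so its packed JS code is decoded directly instead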
        # webpage = self._download_webpage(
        #     'https://vshare.io/d/%s' % video_id, video_id)

        # title = self._html_search_regex(
        #     r'(?s)<div id="root-container">\s+<[^\n]*\s+(.+?)<br\s?/>', webpage, 'title')
        # video_url = self._search_regex(
        #     r'<a[^>]+href=(["\'])(?P<url>(?:https?:)?//.+?)\1[^>]*>[Cc]lick\s+here',
        #     webpage, 'video url', group='url')

        return {
            'id': video_id,
            'title': title,
            'thumbnail': thumbnail,
            'formats': formats,
            'description': description
        }

    def get_formats(self, webpage):
        import re
        import execjs

        # Find the body of the eval call
        pattern = r'eval\((.+)\)'
        packed_js = self._search_regex(pattern, webpage, 'js_code')
        js = execjs.eval(packed_js)

        # Convert it into a parseable form
        base = self._search_regex(r'-(\d+)\)', js, 'base')
        args = self._search_regex(r'(\[[^\]]+\])', js, 'args')
        js = '''
            function f(){
                var r='';
                var v=%s;
                for (var k = 0, length = v.length; k < length; k++) {
                    r+=String.fromCharCode(parseInt(v[k])-%s);
                }
                return r;
            }
        ''' % (args, base)
        ctx = execjs.compile(js)
        source = ctx.call('f', '')
        pattern = r'src="([^"]+)"[^>]type="([^"]+)"[^>]label="([^"]+)"[^>]res="([^"]+)"'
        formats = []
        for url, ext, format_id, height in re.findall(pattern, source):
            formats.append({
                'url': url,
                'ext': ext.replace('video/', ''),
                'format_id': format_id,
                'height': height,
            })

        self._sort_formats(formats)
        return formats
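
The second half of `get_formats` only subtracts a fixed offset from a list of character codes, so it does not strictly need a JavaScript engine. Below is a minimal pure-Python sketch of that step, assuming the unpacked script has the shape the regexes above expect (a bracketed list of numbers and a `-<base>)` offset). The names `decode_char_codes` and `parse_formats` and the sample data are hypothetical, the attribute separators are matched with `\s+` rather than a single character, and `res` is assumed to be numeric.

```python
import re

SOURCE_RE = re.compile(
    r'src="([^"]+)"\s+type="([^"]+)"\s+label="([^"]+)"\s+res="([^"]+)"')


def decode_char_codes(unpacked_js):
    """Rebuild the HTML hidden as an array of shifted character codes."""
    # The offset is the number subtracted inside String.fromCharCode(...).
    base = int(re.search(r'-(\d+)\)', unpacked_js).group(1))
    # Assumption: the first bracketed list in the script is the code array.
    array_text = re.search(r'\[([^\]]+)\]', unpacked_js).group(1)
    return ''.join(chr(int(n) - base) for n in re.findall(r'\d+', array_text))


def parse_formats(source_html):
    """Turn <source ...> attributes into youtube-dl style format dicts."""
    formats = []
    for url, mime, label, res in SOURCE_RE.findall(source_html):
        formats.append({
            'url': url,
            'ext': mime.replace('video/', ''),
            'format_id': label,
            'height': int(res),  # assumes res is a plain number
        })
    return formats


# Hypothetical round trip: hide a made-up <source> tag the same way, then recover it.
sample_tag = ('<source src="https://example.com/v.mp4" type="video/mp4" '
              'label="480p" res="480">')
sample_js = 'var v=[%s];r+=String.fromCharCode(parseInt(v[k])-%d);' % (
    ','.join(str(ord(c) + 3) for c in sample_tag), 3)

print(parse_formats(decode_char_codes(sample_js)))
```

This only covers the character-code step; unpacking the `eval(...)` payload itself is a separate problem.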
Collaborator

@yan12125 yan12125 commented Oct 12, 2017

FYI: youtube_dl.utils.decode_packed_codes can handle packed JS code. Don't use PyExecJS; executing JavaScript fetched from a remote page is dangerous.
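
For readers unfamiliar with the packed format, here is a rough standalone sketch of how such code can be unpacked by plain symbol substitution, without running any JavaScript. It is an illustration under assumptions, not the youtube-dl implementation: `unpack`, `to_base_n`, the regex, and the sample string are all hypothetical, and it assumes the common `}('payload',base,count,'a|b|c'.split('|')` layout with a base of at most 62.

```python
import re

DIGITS = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
PACKED_RE = re.compile(r"}\('(.+)',(\d+),(\d+),'([^']*)'\.split\('\|'\)")


def to_base_n(num, base):
    """Encode a non-negative integer with the digit alphabet above."""
    if num == 0:
        return '0'
    out = ''
    while num:
        num, rem = divmod(num, base)
        out = DIGITS[rem] + out
    return out


def unpack(packed):
    """Replace each base-N token in the payload with its dictionary word."""
    match = PACKED_RE.search(packed)
    if match is None:
        raise ValueError('no packed payload found')
    payload, base, count, words = match.groups()
    base, count = int(base), int(count)
    words = words.split('|')
    table = {}
    for i in range(count):
        key = to_base_n(i, base)
        # An empty dictionary entry means the token stands for itself.
        table[key] = words[i] if i < len(words) and words[i] else key
    return re.sub(r'\b\w+\b',
                  lambda w: table.get(w.group(0), w.group(0)), payload)


# Hypothetical packed snippet, made up for this illustration.
sample = ("eval(function(p,a,c,k,e,d){return p}"
          "('0 1=2;',10,3,'var|x|42'.split('|'),0,{}))")
print(unpack(sample))
```

Because the page content is only ever treated as text, nothing fetched from the site is executed.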

@yan12125 yan12125 added the bug label Oct 12, 2017
Author

@awei78 awei78 commented Oct 12, 2017

@yan12125 It really can! Thank you very much! I will optimize my code.

@timendum timendum mentioned this issue Oct 23, 2017
5 of 9 tasks complete
@dstftw dstftw closed this in 0987f2d Nov 14, 2017