[wholecloud] Unable to extract filekey #10472

Closed
ghost opened this issue Aug 28, 2016 · 6 comments

@ghost ghost commented Aug 28, 2016

  • I've verified and I assure that I'm running youtube-dl 2016.08.28

Before submitting an issue make sure you have:

  • At least skimmed through README and most notably FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other
$ y -v 'http://www.wholecloud.net/video/559e28be54d96'
[debug] System config: []
[debug] User config: ['-4', '-f', 'best', '--prefer-free-formats', '--no-cache-dir', '--no-mtime', '--youtube-skip-dash-manifest']
[debug] Command-line args: ['-v', 'http://www.wholecloud.net/video/559e28be54d96']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2016.08.28
[debug] Python version 3.6.0a4 - Linux-4.7.0-rc7-686-i686-with-debian-stretch-sid
[debug] exe versions: ffmpeg 3.1.3, ffprobe 3.1.3, rtmpdump 2.4
[debug] Proxy map: {}
[wholecloud] 559e28be54d96: Downloading video page
[wholecloud] 559e28be54d96: Downloading continue to the video page
ERROR: Unable to extract filekey; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/youtube_dl/YoutubeDL.py", line 691, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/lib/python3.6/dist-packages/youtube_dl/extractor/common.py", line 347, in extract
    return self._real_extract(url)
  File "/usr/local/lib/python3.6/dist-packages/youtube_dl/extractor/novamov.py", line 79, in _real_extract
    filekey = extract_filekey()
  File "/usr/local/lib/python3.6/dist-packages/youtube_dl/extractor/novamov.py", line 55, in extract_filekey
    self._FILEKEY_REGEX, webpage, 'filekey', default=default)
  File "/usr/local/lib/python3.6/dist-packages/youtube_dl/extractor/common.py", line 650, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
youtube_dl.utils.RegexNotFoundError: Unable to extract filekey; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

http://www.wholecloud.net/video/559e28be54d96
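
The traceback above ends in `_search_regex` raising `RegexNotFoundError` because the filekey pattern did not match and no default was supplied. Below is a minimal stand-alone sketch of that sentinel-default behaviour using only the standard library. It is a simplified stand-in, not youtube-dl's code: the `search_regex` function, the local `RegexNotFoundError` class, and the `filekey="..."` pattern are assumptions made for illustration.

```python
import re

# Sentinel meaning "the caller supplied no fallback value"
NO_DEFAULT = object()


class RegexNotFoundError(Exception):
    """Local stand-in for the error named in the traceback."""


def search_regex(pattern, text, name, default=NO_DEFAULT):
    """Return the first capture group, the default, or raise if neither exists."""
    match = re.search(pattern, text)
    if match:
        return match.group(1)
    if default is not NO_DEFAULT:
        return default
    raise RegexNotFoundError('Unable to extract %s' % name)


# Hypothetical pattern; the real _FILEKEY_REGEX is not shown in this thread
FILEKEY_PATTERN = r'filekey="([^"]+)"'

# With a default, a missing key is reported as None instead of an exception
print(search_regex(FILEKEY_PATTERN, '<html></html>', 'filekey', default=None))
```

Calling the same function without `default` on a page that lacks the key raises the error, which matches the failure mode in the log.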

Collaborator

@yan12125 yan12125 commented Sep 2, 2016

I see no videos in Firefox. Can you watch it?

Collaborator

@yan12125 yan12125 commented Sep 2, 2016

Related: #10414

@yan12125 yan12125 mentioned this issue Sep 2, 2016

@abejfehr abejfehr commented Mar 24, 2017

This issue doesn't seem to be resolved for wholecloud.

Collaborator

@yan12125 yan12125 commented Mar 24, 2017

Nothing has happened here because the original video http://www.wholecloud.net/video/559e28be54d96 is broken in browsers too, so I can't test it. @abejfehr Do you have another example?


@abejfehr abejfehr commented Mar 24, 2017

This is the link I was using: http://www.wholecloud.net/video/fb7c8fc058bf0

@yan12125 yan12125 added bug and removed clarification-needed labels Mar 24, 2017
@yan12125 yan12125 self-assigned this Mar 25, 2017

@FDMX2 FDMX2 commented Apr 7, 2017

Novamov completely changed its pages: the user is now presented with a POST form containing a random key, and only after the form is submitted does the server return the page with the video link (embedded via HTML5).
As a result, the old way of resolving the video link is obsolete and no filekey is available anymore.

This affects the following services:

  • WholeCloud
  • NowVideo
  • (VideoWeed) BitVid
  • CloudTime
  • AuroraVid

Steps to get the video url:

  1. Download the webpage at the original URL
  2. Extract the POST key (now called 'stepkey') from the downloaded webpage
  3. Submit the form with the extracted 'stepkey'
  4. Extract the video link from the POST response
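
The steps above can be sketched with the standard library alone, leaving out the network calls so the parsing logic is easy to check. This is only an illustration under assumptions: the two sample pages are hypothetical stand-ins for the real responses, and the helper names are made up here. The patterns are simplified versions of the hidden-input and `<source>` markup used in the implementation further down.

```python
import re
from urllib.parse import urlencode

# Simplified patterns for the hidden form field and the HTML5 <source> tag
STEPKEY_RE = re.compile(r'<input type="hidden" name="stepkey" value="([^"]+)">')
SOURCE_RE = re.compile(r'<source src="([^"]+)" type=\'video/mp4\'>')


def extract_stepkey(html):
    """Step 2: return the stepkey from the form page, or None if absent."""
    match = STEPKEY_RE.search(html)
    return match.group(1) if match else None


def build_post_body(stepkey):
    """Step 3: encode the form fields as a urlencoded POST body."""
    return urlencode({'stepkey': stepkey, 'submit': 'submit'}).encode('ascii')


def extract_video_url(html):
    """Step 4: return the video URL from the POST response, or None."""
    match = SOURCE_RE.search(html)
    return match.group(1) if match else None


# Hypothetical sample pages standing in for the two real responses
FORM_PAGE = '<form method="post"><input type="hidden" name="stepkey" value="abc123"></form>'
VIDEO_PAGE = '<video><source src="http://example.com/v.mp4" type=\'video/mp4\'></video>'

stepkey = extract_stepkey(FORM_PAGE)
body = build_post_body(stepkey)
video_url = extract_video_url(VIDEO_PAGE)
print(stepkey, body, video_url)
```

In a real extractor, step 1 and the POST in step 3 would go through the downloader's own request helpers rather than being fed fixed strings.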

VideoWeed changed its name to BitVid, along with its hostname (old URLs redirect directly to the new one).

The issues linked to this are:

Right now I have no access to git, so here is my implementation of novamov.py (License: Unlicense):

from __future__ import unicode_literals

import re

from .common import InfoExtractor
from ..utils import (
    ExtractorError,
    NO_DEFAULT,
    sanitized_Request,
    urlencode_postdata,
)


class NovaMovIE(InfoExtractor):
    IE_NAME = 'novamov'
    IE_DESC = 'NovaMov'

    _VALID_URL_TEMPLATE = r'''(?x)
                            http://
                                (?:
                                    (?:www\.)?%(host)s/(?:file|video|mobile/\#/videos)/|
                                    (?:(?:embed|www)\.)%(host)s/embed(?:\.php|/)?\?(?:.*?&)?\bv=
                                )
                                (?P<id>[a-z\d]{13})
                            '''
    _VALID_URL = _VALID_URL_TEMPLATE % {'host': r'novamov\.com'}

    _HOST = 'www.novamov.com'

    _FILE_DELETED_REGEX = r'This file no longer exists on our servers!</h2>'
    _STEPKEY_REGEX = r'<input type="hidden" name="stepkey" value="(?P<stepkey>"?[^"]+"?)">'
    _URL_REGEX = r'<source src="(?P<url>"?[^"]+"?)" type=\'video/mp4\'>'
    # An unescaped '|' is regex alternation, so only the part before it ever captured the title
    _TITLE_REGEX = r'<meta name="title" content="Watch (?P<title>"?[^"]+"?) online '
    _URL_TEMPLATE = 'http://%s/video/%s'

    _TEST = None

    def _check_existence(self, webpage, video_id):
        if re.search(self._FILE_DELETED_REGEX, webpage) is not None:
            raise ExtractorError('Video %s does not exist' % video_id, expected=True)

    def _real_extract(self, url):
        video_id = self._match_id(url)

        url = self._URL_TEMPLATE % (self._HOST, video_id)

        # 1. get the website
        webpage = self._download_webpage(
            url, video_id, 'Downloading video page')

        self._check_existence(webpage, video_id)

        # 2. extract the 'stepkey' value from form
        def extract_stepkey(default=NO_DEFAULT):
            stepkey = self._search_regex(
                self._STEPKEY_REGEX, webpage, 'stepkey', default=default)
            return stepkey

        stepkey = extract_stepkey(default=None)

        if not stepkey:
            raise ExtractorError('Unable to read stepkey for %s; please report this error' % video_id, expected=True)

        # 3. send the post request
        data = urlencode_postdata({
            'stepkey': stepkey,
            'submit': 'submit',
        })
        request = sanitized_Request(url, data)
        request.add_header('Content-Type', 'application/x-www-form-urlencoded')

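        # The second argument is the video id used in log output, not the URL
        webpage = self._download_webpage(
            request, video_id, 'Downloading continue to the video page')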

        # 4. extract the real video url from response
        video_url = self._search_regex(self._URL_REGEX, webpage, 'video url')

        if hasattr(self, '_TITLE_REGEX'):
            title = self._search_regex(self._TITLE_REGEX, webpage, 'title')
        else:
            title = video_id
        if hasattr(self, '_DESCRIPTION_REGEX'):
            description = self._html_search_regex(self._DESCRIPTION_REGEX, webpage, 'description', default='', fatal=False)
        else:
            description = ''

        return {
            'id': video_id,
            'url': video_url,
            'title': title,
            'description': description
        }


class WholeCloudIE(NovaMovIE):
    IE_NAME = 'wholecloud'
    IE_DESC = 'WholeCloud'

    _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': r'(?:wholecloud\.net|movshare\.(?:net|sx|ag))'}

    _HOST = 'www.wholecloud.net'

    _FILE_DELETED_REGEX = r'>This file no longer exists on our servers.<'
    #_TITLE_REGEX = r'<strong>Title:</strong> ([^<]+)</p>'
    # An unescaped '|' is regex alternation, so only the part before it ever captured the title
    _TITLE_REGEX = r'<meta name="title" content="Watch (?P<title>"?[^"]+"?) online '
    _DESCRIPTION_REGEX = r'<strong>Description:</strong> ([^<]+)</p>'

    _TEST = None


class NowVideoIE(NovaMovIE):
    IE_NAME = 'nowvideo'
    IE_DESC = 'NowVideo'

    _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': r'nowvideo\.(?:to|ch|ec|sx|eu|at|ag|co|li)'}

    _HOST = 'www.nowvideo.to'

    _FILE_DELETED_REGEX = r'>This file no longer exists on our servers.<'
    _TITLE_REGEX = r'<h4>([^<]+)</h4>'
    _DESCRIPTION_REGEX = r'</h4>\s*<p>([^<]+)</p>'

    _TEST = None

# VideoWeed is now BitVid
class BitVidIE(NovaMovIE):
    IE_NAME = 'bitvid'
    IE_DESC = 'Bitvid'

    _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': r'bitvid\.(?:sx)'}

    _HOST = 'www.bitvid.sx'

    _FILE_DELETED_REGEX = r'>This file no longer exists on our servers.<'
    _TITLE_REGEX = r'<h1 class="text_shadow">([^<]+)</h1>'
    _URL_TEMPLATE = 'http://%s/file/%s'

    _TEST = None


class CloudTimeIE(NovaMovIE):
    IE_NAME = 'cloudtime'
    IE_DESC = 'CloudTime'

    _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': r'cloudtime\.to'}

    _HOST = 'www.cloudtime.to'

    _FILE_DELETED_REGEX = r'>This file no longer exists on our servers.<'

    _TEST = None


class AuroraVidIE(NovaMovIE):
    IE_NAME = 'auroravid'
    IE_DESC = 'AuroraVid'

    _VALID_URL = NovaMovIE._VALID_URL_TEMPLATE % {'host': r'auroravid\.to'}

    _HOST = 'www.auroravid.to'

    _FILE_DELETED_REGEX = r'This file no longer exists on our servers!<'

    _TEST = None

My upload bandwidth is too poor right now to provide any example videos.
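
Even without example videos, the URL patterns can be sanity-checked offline against the two links already posted in this thread. The sketch below copies `_VALID_URL_TEMPLATE` and the WholeCloud host pattern from the implementation above and applies them with plain `re`. The `match_id` helper is a hypothetical stand-in for the extractor's own id matching, written here only for illustration.

```python
import re

# URL template copied from the implementation above
VALID_URL_TEMPLATE = r'''(?x)
    http://
        (?:
            (?:www\.)?%(host)s/(?:file|video|mobile/\#/videos)/|
            (?:(?:embed|www)\.)%(host)s/embed(?:\.php|/)?\?(?:.*?&)?\bv=
        )
        (?P<id>[a-z\d]{13})
'''

# Host pattern copied from WholeCloudIE
WHOLECLOUD_URL = VALID_URL_TEMPLATE % {
    'host': r'(?:wholecloud\.net|movshare\.(?:net|sx|ag))',
}


def match_id(url):
    """Return the 13-character video id, or None if the URL does not match."""
    match = re.match(WHOLECLOUD_URL, url)
    return match.group('id') if match else None


# The two links posted earlier in this thread
for url in (
    'http://www.wholecloud.net/video/559e28be54d96',
    'http://www.wholecloud.net/video/fb7c8fc058bf0',
):
    print(url, '->', match_id(url))
```

Note that the template only accepts `http://`, so an `https://` link to the same video would not match as written.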

Edit:
The final version with tests can be found in #12704

@remitamine remitamine closed this Mar 12, 2019