extraction for theChive.com seems to return the same wrong video for Kaltura video embeds #8155

Open
bubblme opened this issue Jan 6, 2016 · 3 comments


@bubblme bubblme commented Jan 6, 2016

Hi -
I'm trying to extract a video from theChive.com that plays in a Kaltura embed player, but the extractor always returns the same unrelated video (a dog with some background music), at least for the three pages I've tried. An example:

youtube-dl http://thechive.com/2016/01/05/lighing-mortars-under-the-ice-leads-to-interesting-results-video/

output:
[generic] lighing-mortars-under-the-ice-leads-to-interesting-results-video: Requesting header
WARNING: Falling back on generic information extractor.
[generic] lighing-mortars-under-the-ice-leads-to-interesting-results-video: Downloading webpage
[generic] lighing-mortars-under-the-ice-leads-to-interesting-results-video: Extracting information
[download] Downloading playlist: Guy lights a mortar under the ice and it explodes : theCHIVE
[generic] playlist Guy lights a mortar under the ice and it explodes : theCHIVE: Collected 1 video ids (downloading 1 of them)
[download] Downloading video 1 of 1
[youtube] y9K18CGEeiI: Downloading webpage
[youtube] y9K18CGEeiI: Downloading video info webpage
[youtube] y9K18CGEeiI: Extracting video information
[youtube] y9K18CGEeiI: Downloading DASH manifest
[youtube] y9K18CGEeiI: Downloading DASH manifest
[download] Destination: DOG-y9K18CGEeiI.f135.mp4
[download] 100% of 1.17MiB in 00:00
[download] Destination: DOG-y9K18CGEeiI.f141.m4a
[download] 100% of 434.44KiB in 00:00
[ffmpeg] Merging formats into "DOG-y9K18CGEeiI.mp4"
Deleting original file DOG-y9K18CGEeiI.f135.mp4 (pass -k to keep)
Deleting original file DOG-y9K18CGEeiI.f141.m4a (pass -k to keep)
[download] Finished downloading playlist: Guy lights a mortar under the ice and it explodes : theCHIVE

youtube-dl http://thechive.com/2016/01/06/amazing-sportsmanship-at-australian-tennis-match-video/
also returns the same file, DOG-y9K18CGEeiI.mp4, and every URL I've tried with a Kaltura embed player returns this same video. I read in a previous (closed) issue (#5072) that theChive is supported by the generic extractor. Any idea why this is happening?

Collaborator

@dstftw dstftw commented Jan 6, 2016

They have this video embedded on every page:

    <!-- easter egg modal -->
    <div id="konami-modal" class="modal konami-modal">
      <div class="modal-content">
        <iframe id="konami-video" width="420" height="315" src="https://www.youtube.com/embed/y9K18CGEeiI" frameborder="0" allowfullscreen></iframe>
      </div>
    </div>
    <!-- end easter egg modal -->
Author

@bubblme bubblme commented Jan 6, 2016

Ah, so the generic extractor is pulling that video instead of the Kaltura embed on the page. I'd have to assume they added this gem after Kaltura embed support for theChive was added to the generic extractor. Is there any way for the generic extractor to ignore this easter egg in favor of the Kaltura embed (wishful thinking here), or does an extractor specifically for theChive need to be written?
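
As a rough illustration of the "ignore the easter egg" idea, here is a minimal standalone sketch. It is not youtube-dl's generic extractor code: the function name `pick_embed`, both regexes, and the sample Kaltura URL are assumptions made for this example. It drops the block between the two easter-egg comments before looking for iframes, then prefers a Kaltura iframe over any other.

```python
import re

# Hypothetical patterns for illustration; not taken from youtube-dl's source.
KONAMI_BLOCK_RE = re.compile(
    r'<!-- easter egg modal -->.*?<!-- end easter egg modal -->', re.DOTALL)
IFRAME_SRC_RE = re.compile(r'<iframe[^>]+\bsrc=(["\'])(?P<url>.+?)\1')


def pick_embed(html):
    """Return an iframe URL found outside the easter-egg modal, preferring Kaltura."""
    # Remove the easter-egg modal so its YouTube iframe is never considered.
    cleaned = KONAMI_BLOCK_RE.sub('', html)
    urls = [m.group('url') for m in IFRAME_SRC_RE.finditer(cleaned)]
    kaltura = [u for u in urls if 'kaltura.com' in u]
    if kaltura:
        return kaltura[0]
    return urls[0] if urls else None


# Sample page: the modal markup mirrors the snippet quoted in this thread,
# the Kaltura iframe URL is made up.
sample = '''
<!-- easter egg modal -->
<div id="konami-modal" class="modal konami-modal">
  <iframe id="konami-video" src="https://www.youtube.com/embed/y9K18CGEeiI"></iframe>
</div>
<!-- end easter egg modal -->
<iframe src="https://cdnapi.kaltura.com/hypothetical/embed"></iframe>
'''
print(pick_embed(sample))  # prints the Kaltura URL, not the YouTube easter egg
```

Keying on the comment markers is fragile, since the site could rename or remove them at any time; a site-specific extractor would be the more robust route.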


@InfiniteStyles InfiniteStyles commented Nov 14, 2016

I get this error when trying to use that same command. Why the difference?

Command:
youtube-dl http://thechive.com/2016/01/05/lighing-mortars-under-the-ice-leads-to-interesting-results-video/

[generic] lighing-mortars-under-the-ice-leads-to-interesting-results-video: Requesting header
WARNING: Falling back on generic information extractor.
[generic] lighing-mortars-under-the-ice-leads-to-interesting-results-video: Downloading webpage
[generic] lighing-mortars-under-the-ice-leads-to-interesting-results-video: Extracting information
[generic] 1289861?iframeembed=true&playerId=kaltura_player_1370631785&entry_id=0_4r7awiw0: Requesting header
Traceback (most recent call last):
  File "C:\Users\Anton\AppData\Local\Programs\Python\Python35\Scripts\youtube-dl-script.py", line 9, in <module>
    load_entry_point('youtube-dl==2016.9.27', 'console_scripts', 'youtube-dl')()
  File "c:\users\anton\appdata\local\programs\python\python35\lib\site-packages\youtube_dl\__init__.py", line 449, in main
    _real_main(argv)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\site-packages\youtube_dl\__init__.py", line 439, in _real_main
    retcode = ydl.download(all_urls)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\site-packages\youtube_dl\YoutubeDL.py", line 1791, in download
    url, force_generic_extractor=self.params.get('force_generic_extractor', False))
  File "c:\users\anton\appdata\local\programs\python\python35\lib\site-packages\youtube_dl\YoutubeDL.py", line 705, in extract_info
    return self.process_ie_result(ie_result, download, extra_info)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\site-packages\youtube_dl\YoutubeDL.py", line 758, in process_ie_result
    extra_info=extra_info)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\site-packages\youtube_dl\YoutubeDL.py", line 694, in extract_info
    ie_result = ie.extract(url)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\site-packages\youtube_dl\extractor\common.py", line 355, in extract
    return self._real_extract(url)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\site-packages\youtube_dl\extractor\generic.py", line 1560, in _real_extract
    fatal=False)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\site-packages\youtube_dl\extractor\common.py", line 402, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\site-packages\youtube_dl\YoutubeDL.py", line 2001, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\urllib\request.py", line 466, in open
    response = self._open(req, data)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\urllib\request.py", line 484, in _open
    '_open', req)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\urllib\request.py", line 444, in _call_chain
    result = func(*args)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\site-packages\youtube_dl\utils.py", line 1011, in https_open
    req, **kwargs)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\urllib\request.py", line 1254, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\http\client.py", line 1106, in request
    self._send_request(method, url, body, headers)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\http\client.py", line 1151, in _send_request
    self.endheaders(body)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\http\client.py", line 1102, in endheaders
    self._send_output(message_body)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\http\client.py", line 934, in _send_output
    self.send(msg)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\http\client.py", line 877, in send
    self.connect()
  File "c:\users\anton\appdata\local\programs\python\python35\lib\http\client.py", line 1260, in connect
    server_hostname=server_hostname)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\ssl.py", line 377, in wrap_socket
    _context=self)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\ssl.py", line 752, in __init__
    self.do_handshake()
  File "c:\users\anton\appdata\local\programs\python\python35\lib\ssl.py", line 988, in do_handshake
    self._sslobj.do_handshake()
  File "c:\users\anton\appdata\local\programs\python\python35\lib\ssl.py", line 638, in do_handshake
    match_hostname(self.getpeercert(), self.server_hostname)
  File "c:\users\anton\appdata\local\programs\python\python35\lib\ssl.py", line 297, in match_hostname
    % (hostname, ', '.join(map(repr, dnsnames))))
ssl.CertificateError: hostname 'cdnapi.kaltura.com' doesn't match either of 'a248.e.akamai.net', '*.akamaized.net', '*.akamaihd-staging.net', '*.akamaihd.net', '*.akamaized-staging.net'
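
The last line of the traceback says that the certificate served for `cdnapi.kaltura.com` lists only Akamai hostnames. The sketch below reproduces that comparison in a simplified form. The helper `hostname_matches` is hypothetical and is not the standard library's implementation; it assumes a wildcard may appear only as the whole leftmost label and ignores the other rules of real certificate matching.

```python
def hostname_matches(hostname, pattern):
    """Simplified check: a leading '*' label matches exactly one hostname label."""
    host_labels = hostname.lower().split('.')
    pattern_labels = pattern.lower().split('.')
    if len(host_labels) != len(pattern_labels):
        return False
    if pattern_labels[0] == '*':
        # Wildcard covers only the first label; the rest must match exactly.
        return host_labels[1:] == pattern_labels[1:]
    return host_labels == pattern_labels


# Names copied from the error message above.
CERT_NAMES = ['a248.e.akamai.net', '*.akamaized.net', '*.akamaihd-staging.net',
              '*.akamaihd.net', '*.akamaized-staging.net']

print(any(hostname_matches('cdnapi.kaltura.com', name) for name in CERT_NAMES))  # prints False
```

None of the listed names covers `cdnapi.kaltura.com`, so the TLS handshake is rejected before any page content is fetched. youtube-dl does have a `--no-check-certificate` option that suppresses HTTPS certificate validation, though whether that is appropriate here is a separate question.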