Support request for cbc.ca (CBCNEWS) #5309

Closed
defiantredpill opened this issue Mar 29, 2015 · 9 comments
Labels
bug

Comments

@defiantredpill defiantredpill commented Mar 29, 2015

Please add support for CBCNEWS at cbc.ca such as
http://www.cbc.ca/news/canada/windsor/man-beats-cancer-four-times-in-five-years-1.1005720

The widow of this man would be eternally grateful.
Thanks in advance.

Maybe get two birds with one stone.
#5156 (Unsupported URL with CBC radio)

Contributor

@julianrichen julianrichen commented Apr 25, 2015

I'm not talented enough to find the source url but you can take a url like so:
http://www.cbc.ca/news/canada/windsor/man-beats-cancer-four-times-in-five-years-1.1005720

and add json right after the domain:
http://www.cbc.ca/json/news/canada/windsor/man-beats-cancer-four-times-in-five-years-1.1005720

This returns the page's info as JSON.

You can also use the following, which redirects to the URL above:
http://www.cbc.ca/json/1.1005720

If you look in the json it has a link to the "player":
http://www.cbc.ca/player/News/Canada/Windsor/ID/2164403761/

After that I'm not sure :( Maybe this will help push this issue & #5156
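The URL rewrite described in this comment can be sketched in a few lines of Python. This is only an illustration of the pattern reported above (prefixing the path with /json, or using /json/ followed by the trailing numeric ID); the function names are hypothetical, and whether cbc.ca still serves these endpoints is not something this sketch checks.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def cbc_json_url(page_url):
    """Return the JSON variant of a cbc.ca page URL (path prefixed with /json)."""
    parts = urlsplit(page_url)
    # Assumption taken from the comment: the JSON view lives at /json + original path.
    return urlunsplit(parts._replace(path="/json" + parts.path))


def cbc_json_short_url(page_url):
    """Return the short /json/<id> form, using the numeric ID that ends the path."""
    parts = urlsplit(page_url)
    match = re.search(r"(\d+\.\d+)/?$", parts.path)
    if match is None:
        raise ValueError("no trailing numeric ID in %r" % page_url)
    return urlunsplit(parts._replace(path="/json/" + match.group(1)))


if __name__ == "__main__":
    url = ("http://www.cbc.ca/news/canada/windsor/"
           "man-beats-cancer-four-times-in-five-years-1.1005720")
    print(cbc_json_url(url))
    print(cbc_json_short_url(url))
```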

Contributor

@misterhat misterhat commented Apr 25, 2015

I tried looking into ripping CBC players earlier, and it turns out they use ThePlatform. CBC's ThePlatform subsection is cbc, and you can find a show's PID with the following request (normally made through XHR): http://tpfeed.cbc.ca/f/h9dtGB/5akSXx4Ng_Zn?range=1-1&byContent=byReleases=byId%3D2164403761

From there you should be able to download it with youtube-dl with the following URL: http://link.theplatform.com/s/cbc/agUOFlsSe1xtiuqw_DJ5HYGKTqi5SA7c

But I've noticed that this doesn't work for certain videos, including this one for some reason.
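As a rough sketch of the lookup described in this comment, the snippet below pulls the numeric ID out of a CBC player URL and builds the matching feed query. The feed path is copied verbatim from the comment and treated as an opaque constant; the function names are hypothetical, and no request is made, so this does not confirm that the feed still answers.

```python
import re

# Feed path copied from the comment above; treated here as an opaque constant.
CBC_FEED_BASE = "http://tpfeed.cbc.ca/f/h9dtGB/5akSXx4Ng_Zn"


def player_id(player_url):
    """Extract the numeric ID from a URL such as .../player/.../ID/2164403761/."""
    match = re.search(r"/ID/(\d+)", player_url)
    if match is None:
        raise ValueError("no /ID/<digits> segment in %r" % player_url)
    return match.group(1)


def feed_url_for_release(release_id):
    """Build the feed query quoted in the comment for one release ID."""
    # The '=' before the ID is percent-encoded (%3D), exactly as in the comment.
    return f"{CBC_FEED_BASE}?range=1-1&byContent=byReleases=byId%3D{release_id}"


if __name__ == "__main__":
    rid = player_id("http://www.cbc.ca/player/News/Canada/Windsor/ID/2164403761/")
    print(feed_url_for_release(rid))
```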

Collaborator

@dstftw dstftw commented Feb 11, 2016

Example URL from OP does not seem to work but can be watched in browser:

[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'http://www.cbc.ca/news/canada/windsor/man-beats-cancer-four-times-in-five-years-1.1005720', u'-F', u'-v']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2016.02.10
[debug] Git HEAD: fc3810f
[debug] Python version 2.6.6 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-77953-gcc83177, ffprobe N-77953-gcc83177, rtmpdump 2.4
[debug] Proxy map: {}
[CBC] man-beats-cancer-four-times-in-five-years-1.1005720: Downloading webpage
[ThePlatformFeed] 2164402062: Downloading JSON metadata
[ThePlatformFeed] 2164402062: Downloading SMIL data for PZF7qicxxPwS
ERROR: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "C:\Dev\git\youtube-dl\master\youtube_dl\extractor\theplatform.py", line 37, in _extract_theplatform_smil
    for n in meta.findall(_x('.//smil:ref'))
StopIteration
Traceback (most recent call last):
  File "C:\Dev\git\youtube-dl\master\youtube_dl\YoutubeDL.py", line 666, in extract_info
    ie_result = ie.extract(url)
  File "C:\Dev\git\youtube-dl\master\youtube_dl\extractor\common.py", line 315, in extract
    return self._real_extract(url)
  File "C:\Dev\git\youtube-dl\master\youtube_dl\extractor\theplatform.py", line 291, in _real_extract
    cur_formats, cur_subtitles = self._extract_theplatform_smil(smil_url, video_id, 'Downloading SMIL data for %s' % cur_video_id)
  File "C:\Dev\git\youtube-dl\master\youtube_dl\extractor\theplatform.py", line 49, in _extract_theplatform_smil
    transform_rtmp_url=lambda streamer, src: (streamer, 'mp4:' + src))
  File "C:\Dev\git\youtube-dl\master\youtube_dl\extractor\common.py", line 1267, in _parse_smil_formats
    self._sort_formats(formats)
  File "C:\Dev\git\youtube-dl\master\youtube_dl\extractor\common.py", line 828, in _sort_formats
    raise ExtractorError('No video formats found')
ExtractorError: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
@dstftw dstftw reopened this Feb 11, 2016
Collaborator

@remitamine remitamine commented Feb 11, 2016

The problem is not in the CBC extractor; it's an error in ThePlatformFeedIE, because it requests formats=MPEG4,F4M.

Collaborator

@remitamine remitamine commented Feb 11, 2016

You can check again with a7cab4d.

Collaborator

@yan12125 yan12125 commented May 20, 2016

Still not working. My Firefox on Linux issues the following request:

$ curl "http://link.theplatform.com/s/ExhSPC/media/xBC_TtO8DS6_?mbr=true&format=SMIL&player=default-prod-vms&policy=68761604&manifest=f4m&Tracking=true&Embedded=true&formats=MPEG4,F4M,FLV,MP3" 
<smil xmlns="http://www.w3.org/2005/SMIL21/Language">
<head>
    <meta name="refreshToken" content="0154cded14ee129a42ac1c6c65ab3625a108d88d10d47378c9f42dc2acda230e4be97ce6c163"/>
    <metadata>
    <seq>
    <ref src="http://pubads.g.doubleclick.net/gampad/ads?sz=320x240&amp;iu=/5876/&amp;ciu_szs=300x250&amp;impl=s&amp;gdfp_req=1&amp;env=vp&amp;output=xml_vast2&amp;unviewed_position_start=1&amp;correlator=[time]&amp;cmsid=5491&amp;vid=2164402062&amp;ad_rule=1&amp;url=[url]&amp;cust_params=kgender%3Dm" type="application/vmap+xml" no-skip="true" tags="midroll">
    </ref>
    </seq>
    </metadata>
</head>
<body>
<seq>
    <ref src="http://pubads.g.doubleclick.net/gampad/ads?sz=320x240&amp;iu=/5876/&amp;ciu_szs=300x250&amp;impl=s&amp;gdfp_req=1&amp;env=vp&amp;output=xml_vast2&amp;unviewed_position_start=1&amp;correlator=[time]&amp;cmsid=5491&amp;vid=2164402062&amp;ad_rule=1&amp;url=[url]&amp;cust_params=kgender%3Dm" type="application/vmap+xml" no-skip="true" tags="preroll">
    </ref>
    <video src="http://mobilehls-vh.akamaihd.net/z/prodVideo/news/CBC_News_VMS/277/67/cancer__090573.flv,,.csmil/manifest.f4m?hdnea=ip=140.112.230.216~st=1463743437~exp=1463743827~acl=/z/*~id=a8f235e0-7705-44bf-bfae-e1fb1d356eef~hmac=339165761bc8b95ea17db41b1d72dc31ece8b89437e6daec8cfb01d5d6e2a97c" title="Cancer survivor four times over" author="Allison Johnson" abstract="Tim Mayer has beaten three different forms of cancer four times in five years." copyright="CBC Production" dur="186866ms" guid="2164402062" categories="News/Canada/Windsor" keywords="cancer" provider="CBC News VMS" type="application/f4m+xml" height="360" width="640">
        <param name="adCategory" value="News"/>
        <param name="adSite" value="cbc.news.ca"/>
        <param name="aired" value="false"/>
        <param name="allTime" value="0"/>
        <param name="audioVideo" value="Video"/>
        <param name="availableInMobile" value="false"/>
        <param name="contentArea" value="News"/>
        <param name="dayPlayCount" value="0"/>
        <param name="genre" value="News"/>
        <param name="last30Days" value="0"/>
        <param name="last7Days" value="0"/>
        <param name="lastDay" value="1"/>
        <param name="liveOndemand" value="On-Demand"/>
        <param name="region" value="Windsor"/>
        <param name="shareable" value="Yes"/>
        <param name="show" value="CBC News: Windsor at 6:00"/>
        <param name="sport" value="(not specified)"/>
        <param name="sportGroup" value="(not specified)"/>
        <param name="subtitles" value="No"/>
        <param name="syndicate" value="Yes"/>
        <param name="type" value="Excerpt"/>
        <param name="unapprovedDate" value="2524626000000"/>
        <param name="trackingData" value="aid=2655402169|b=500000000|bc=CBCC-NEW|ci=1|cid=505954371947|d=1463743467009|l=186866|mediaPid=xBC_TtO8DS6_|pd=1320410746000|pid=PZF7qicxxPwS|pl=default-prod-vms|prid=68761604|pvid=455224899801|rid=505956419612"/>
    </video>
</seq>
</body>
</smil>

While in youtube-dl a different URL is requested:

$ youtube-dl -vF "http://www.cbc.ca/news/canada/windsor/man-beats-cancer-four-times-in-five-years-1.1005720"   
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['-vF', 'http://www.cbc.ca/news/canada/windsor/man-beats-cancer-four-times-in-five-years-1.1005720']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2016.05.16
[debug] Git HEAD: 31a7019
[debug] Python version 3.5.1 - Linux-4.5.1-1-ARCH-x86_64-with-arch
[debug] exe versions: ffmpeg 3.0.2, ffprobe 3.0.2, rtmpdump 2.4
[debug] Proxy map: {}
[CBC] man-beats-cancer-four-times-in-five-years-1.1005720: Downloading webpage
http://link.theplatform.com/s/ExhSPC/media/guid/2655402169/2164402062?mbr=true
[ThePlatform] 2164402062: Downloading SMIL data
ERROR: None of the allowed formats for this user agent were available.
Traceback (most recent call last):
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/YoutubeDL.py", line 676, in extract_info
    ie_result = ie.extract(url)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/extractor/common.py", line 341, in extract
    return self._real_extract(url)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/extractor/theplatform.py", line 242, in _real_extract
    formats, subtitles = self._extract_theplatform_smil(smil_url, video_id)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/extractor/theplatform.py", line 38, in _extract_theplatform_smil
    raise ExtractorError(error_element.attrib['abstract'], expected=True)
youtube_dl.utils.ExtractorError: None of the allowed formats for this user agent were available.
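To see what a SMIL response like the one earlier in this comment actually offers, it can be parsed with the standard library. The sketch below is not youtube-dl's own code: it uses a shortened sample document with placeholder URLs, and only the namespace is taken from the response above. The helper name list_media is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the SMIL response shown above.
SMIL_NS = {"smil": "http://www.w3.org/2005/SMIL21/Language"}

# Shortened stand-in for a real response; the URLs are placeholders.
SAMPLE_SMIL = """<smil xmlns="http://www.w3.org/2005/SMIL21/Language">
<body>
<seq>
    <ref src="http://ads.example.invalid/preroll" type="application/vmap+xml"/>
    <video src="http://media.example.invalid/manifest.f4m" type="application/f4m+xml" height="360" width="640"/>
</seq>
</body>
</smil>"""


def list_media(smil_text):
    """Return (tag, type, src) for every <video> and <ref> element, videos first."""
    root = ET.fromstring(smil_text)
    media = []
    for tag in ("video", "ref"):
        for node in root.findall(".//smil:%s" % tag, SMIL_NS):
            media.append((tag, node.get("type"), node.get("src")))
    return media


if __name__ == "__main__":
    for tag, mime, src in list_media(SAMPLE_SMIL):
        print(tag, mime, src)
```

In the response above, the only playable entry is the video element of type application/f4m+xml; the ref elements point at ad requests.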
@yan12125 yan12125 added bug and removed site-support-request labels May 20, 2016
Collaborator

@yan12125 yan12125 commented May 20, 2016

By the way, http://link.theplatform.com/s/ExhSPC/media/guid/2655402169/2164402062?mbr=true&format=SMIL&formats=MPEG4,F4M,FLV,MP3 returns a correct result. I don't know when the formats parameter is necessary.

Collaborator

@remitamine remitamine commented May 20, 2016

It stopped working again after 52f7c75.

> By the way, http://link.theplatform.com/s/ExhSPC/media/guid/2655402169/2164402062?mbr=true&format=SMIL&formats=MPEG4,F4M,FLV,MP3 returns a correct result. I don't know when the formats parameter is necessary.

I will check it and see whether it affects other URLs.

Collaborator

@remitamine remitamine commented May 20, 2016

Adding &formats=MPEG4,FLV,MP3,M3U doesn't work:

[ThePlatform] 2164402062: Downloading SMIL data
[ThePlatform] 2164402062: Downloading m3u8 information
WARNING: Failed to download m3u8 information: HTTP Error 403: Forbidden
ERROR: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Adding &formats=MPEG4,FLV,MP3,F4M doesn't work either:

[ThePlatform] 2164402062: Downloading SMIL data
[ThePlatform] 2164402062: Downloading f4m manifest
WARNING: Unable to download f4m manifest: HTTP Error 403: Forbidden
ERROR: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Adding &formats=MPEG4,FLV,MP3 works:

[CBC] man-beats-cancer-four-times-in-five-years-1.1005720: Downloading webpage
[ThePlatform] 2164402062: Downloading SMIL data
[ThePlatform] 2164402062: Checking video URL
[ThePlatform] 2164402062: Downloading JSON metadata
[download] Destination: Cancer survivor four times over-2164402062.flv
[#befde4 11MiB/12MiB(98%) CN:2 DL:216KiB]                                      
05/20 12:54:19 [NOTICE] Download complete: /home/amine/youtube-dl/youtube_dl/Cancer survivor four times over-2164402062.flv.part

Download Results:
gid   |stat|avg speed  |path/URI
======+====+===========+=======================================================
befde4|OK  |   198KiB/s|/home/amine/youtube-dl/youtube_dl/Cancer survivor four times over-2164402062.flv.part

Status Legend:
(OK):download completed.
[aria2c] Downloaded 12704185 bytes
[download] 100% of 12.12MiB
[ffmpeg] There aren't any subtitles to convert
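The three experiments above differ only in the formats query parameter, so a small helper makes them easy to reproduce by hand. This is a sketch using the standard library, not the change that went into youtube-dl; the function name smil_url is hypothetical, and the format lists are the ones tried in this comment.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def smil_url(media_url, formats):
    """Return media_url with format=SMIL and the given formats list in its query."""
    parts = urlsplit(media_url)
    # Drop any existing format/formats values so they are not duplicated.
    query = [(k, v) for k, v in parse_qsl(parts.query)
             if k not in ("format", "formats")]
    query.append(("format", "SMIL"))
    query.append(("formats", ",".join(formats)))
    # safe="," keeps the commas literal, matching the URLs quoted in this thread.
    return urlunsplit(parts._replace(query=urlencode(query, safe=",")))


if __name__ == "__main__":
    base = "http://link.theplatform.com/s/ExhSPC/media/guid/2655402169/2164402062?mbr=true"
    # Combinations tried above; only the last one led to a successful download.
    for formats in (["MPEG4", "FLV", "MP3", "M3U"],
                    ["MPEG4", "FLV", "MP3", "F4M"],
                    ["MPEG4", "FLV", "MP3"]):
        print(smil_url(base, formats))
```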
9 participants
@julianrichen @defiantredpill @jaimeMF @misterhat @dstftw @yan12125 @remitamine and others