Am I reading correctly that updated youtube-dl still can't download videos from bbc.com's web site? #27125

Closed
3 tasks done
antdude opened this issue Nov 21, 2020 · 6 comments
@antdude

antdude commented Nov 21, 2020

Checklist

  • I'm asking a question
  • I've looked through the README and FAQ for similar questions
  • I've searched the bugtracker for similar questions including closed ones

Question

Am I reading correctly that updated youtube-dl still can't download videos from bbc.com's web site? https://github.com/ytdl-org/youtube-dl/issues?q=is%3Aissue+is%3Aopen+bbc.com shows #23232. Results seem to be different as shown below:

$ youtube-dl https://www.bbc.com/reel/video/p08yxrlb/why-our-dreams-could-be-the-key-to-time-travel
[bbc] why-our-dreams-could-be-the-key-to-time-travel: Downloading webpage
ERROR: no suitable InfoExtractor for URL https://www.bbc.co.uk/programmes/None

Or is this a different issue that I need to report as a new bug issue?

Thank you for reading and hopefully answering soon. :)

@october262

I just used the Firefox add-on The Stream Detector to grab the master m3u8 file
and successfully downloaded this video: https://www.bbc.com/reel/video/p08yxrlb/why-our-dreams-could-be-the-key-to-time-travel

@hairycactus

youtube-dl has been broken for bbc.com & bbc.co.uk videos since at least v2019.11.28, i.e. back in Nov 2019. (Yeah, I was taking notes for every version until 2020 Q1, when I gave up hoping it would be fixed.)

It was also broken for some audio at bbc.co.uk/sounds, but the latest v2020.11.21.1 now seems to work okay for that domain, although I haven't tested every URL.

For BBC Reel (but not non-Reel) videos, previously one could work around the no suitable InfoExtractor error by specifying the Programme ID (PID) instead -- or at least until sometime in early 2020 (still okay in Jan/Feb 2020).

E.g. for https://www.bbc.com/reel/video/p08yxrlb/why-our-dreams-could-be-the-key-to-time-travel the PID is p08yxrlb.

So youtube-dl https://www.bbc.co.uk/programmes/p08yxrlb would have been able to fetch the video (back in Jan/Feb 2020 & earlier). However, with v2020.11.21.1, it now shows ERROR: No video formats found.

I also tried youtube-dl https://www.bbc.com/programmes/p08yxrlb -- but it shows ERROR: no suitable InfoExtractor for URL https://www.bbc.co.uk/programmes/None.

As such, the latest youtube-dl is totally broken for all BBC videos, unless perhaps one resorts to using 3rd-party manual extraction methods.

@Vangelis66

@hairycactus :

_MEDIASELECTOR_URLS = [
    # Provides HQ HLS streams with even better quality than pc mediaset but fails
    # with geolocation in some cases when it's even not geo restricted at all (e.g.
    # http://www.bbc.co.uk/programmes/b06bp7lf). Also may fail with selectionunavailable.
    'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/%s',
    'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/%s',
]

As you say, for

https://www.bbc.com/reel/video/p08yxrlb/why-our-dreams-could-be-the-key-to-time-travel

you'd have to manipulate it to

https://www.bbc.co.uk/programmes/p08yxrlb

for the bbc.co.uk InfoExtractor (IE) to recognise it...
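The URL manipulation described above can be sketched in a few lines of Python. This is only an illustration: the helper name `reel_to_programmes_url` and the regular expression are assumptions made for this example (the pattern simply assumes Reel URLs look like `/reel/video/<8-character pid>/<slug>`), and none of it is taken from youtube-dl's own code.

```python
import re

# Hypothetical pattern: assumes BBC Reel URLs have the form
# https://www.bbc.com/reel/video/<pid>/<slug>, with an 8-character PID.
_REEL_URL_RE = re.compile(
    r'https?://(?:www\.)?bbc\.com/reel/video/(?P<pid>[0-9a-z]{8})(?:/|$)'
)


def reel_to_programmes_url(url):
    """Map a bbc.com Reel URL to the bbc.co.uk/programmes/<pid> form."""
    match = _REEL_URL_RE.match(url)
    if match is None:
        raise ValueError('not a recognised BBC Reel URL: %s' % url)
    return 'https://www.bbc.co.uk/programmes/%s' % match.group('pid')
```

Calling it with the Reel URL from this issue returns https://www.bbc.co.uk/programmes/p08yxrlb, which can then be passed to youtube-dl as before.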

For pid=p08yxrlb (included in the clip's URI), yt-dl correctly retrieves that vpid=p08yxrld, as can be seen by

youtube-dl -F "https://www.bbc.co.uk/programmes/p08yxrlb" -v =>

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-F', 'https://www.bbc.co.uk/programmes/p08yxrlb', '-v']
[debug] Encodings: locale cp1253, fs mbcs, out cp737, pref cp1253
[debug] youtube-dl version 2020.11.24
[debug] Python version 3.4.4 (CPython) - Windows-Vista-6.0.6003-SP2
[debug] exe versions: ffmpeg N-97309-g4e0cf81b49, ffprobe N-97309-g4e0cf81b49, phantomjs 2.1.1, rtmpdump 2.4
[debug] Proxy map: {}
[bbc.co.uk] p08yxrlb: Downloading video page
[bbc.co.uk] p08yxrld: Downloading media selection XML
ERROR: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
<redacted>

However, as instructed by the code referenced above, that vpid string is only tried with the first mediaselector URI, the one with mediaset=iptv-all:

https://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/p08yxrld

which doesn't yield any media streams info (only subs/captions info) 😭 ; however, and this is a yt-dl bug in this case, the vpid string isn't tried with the second mediaselector URI (mediaset=pc), which is actually the one that does return media streams info:

https://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/p08yxrld
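The fallback argued for here (try each mediaselector URL in turn and keep the first response that actually contains media streams, not just captions) could look roughly like the sketch below. It is not youtube-dl's implementation. The function names, the injected `fetch` callable, and the assumption that the XML marks streams with `<media kind="video">` or `<media kind="audio">` and subtitles with `<media kind="captions">` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The two mediaselector URL templates quoted above, in the same order.
MEDIASELECTOR_URLS = [
    'https://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/%s',
    'https://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/%s',
]


def _local_name(tag):
    # Drop an XML namespace prefix such as '{http://...}media' -> 'media'.
    return tag.rsplit('}', 1)[-1]


def has_playable_media(xml_text):
    """Return True if the XML lists at least one video or audio <media> element."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    # Assumption: captions-only responses carry kind="captions" and nothing else.
    return any(
        _local_name(element.tag) == 'media'
        and element.get('kind') in ('video', 'audio')
        for element in root.iter()
    )


def select_media_xml(vpid, fetch, url_templates=MEDIASELECTOR_URLS):
    """Try each template in order; return (url, xml_text) for the first usable one.

    `fetch` is a hypothetical callable taking a URL and returning the response
    body as text; it is injected so the sketch does not depend on the network.
    """
    for template in url_templates:
        url = template % vpid
        try:
            xml_text = fetch(url)
        except OSError:
            # Network-level failure: move on to the next mediaset.
            continue
        if has_playable_media(xml_text):
            return url, xml_text
    return None
```

With a `fetch` that returns captions-only XML for iptv-all and a video entry for pc, `select_media_xml('p08yxrld', fetch)` falls through to the mediaset=pc URL instead of giving up after the first one.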

But BBC Reel video-clips constitute edge cases for the bbcIE: They are (usually) globally available (non-geofenced), served from the bbc.com domain, which the bbcIE does not officially support; bbcIE focuses mainly on video content from BBC iPlayer (geofenced) and audio content from BBC Sounds (partly geofenced, overseas locations are served lower bitrates), not random bbc.co* clips...

Workaround: Unfortunately, I don't "speak" Python, so I cannot offer a PR to fix this... Should you wish to fetch the above BBC Reel video, you could comment out line 56 of the code snippet quoted above, inside bbc.py:

#        'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/%s',

recompile yt-dl (or invoke directly from source) and issue:
youtube-dl "https://www.bbc.co.uk/programmes/p08yxrlb" =>

[bbc.co.uk] p08yxrlb: Downloading video page
[bbc.co.uk] p08yxrld: Downloading media selection XML
[bbc.co.uk] p08yxrld: Downloading MPD manifest
[bbc.co.uk] p08yxrld: Downloading MPD manifest
[bbc.co.uk] p08yxrld: Downloading MPD manifest
[bbc.co.uk] p08yxrld: Downloading MPD manifest
[dashsegments] Total fragments: 103
[download] Destination: BBC - Could your dreams predict the future-p08yxrld.fstream-nonuk-pc_streaming_concrete_combined_sd_mf_limelight_world_dash_https-video=5070000.mp4
[download]  11.1% of ~198.10MiB at 816.44KiB/s ETA 04:08

@Vangelis66

Vangelis66 commented Nov 24, 2020

Another workaround would be to move away completely from the deprecated mediaselector/5 API and switch to the current mediaselector/6 one. However, v6 produces JSON-formatted content by default, while the existing parser inside bbc.py expects XML; you can still request a compatible XML-formatted response by appending /format/xml:

-        'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/%s', 
-        'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/%s',
+        'http://open.live.bbc.co.uk/mediaselector/6/select/version/2.0/mediaset/iptv-all/vpid/%s/format/xml',
+        'http://open.live.bbc.co.uk/mediaselector/6/select/version/2.0/mediaset/pc/vpid/%s/format/xml',
[bbc.co.uk] p08yxrlb: Downloading video page
[bbc.co.uk] p08yxrld: Downloading media selection XML
[bbc.co.uk] p08yxrld: Downloading m3u8 information
[bbc.co.uk] p08yxrld: Downloading m3u8 information
WARNING: Failed to download m3u8 information: HTTP Error 403: Forbidden
[bbc.co.uk] p08yxrld: Downloading m3u8 information
[bbc.co.uk] p08yxrld: Downloading m3u8 information
WARNING: Failed to download m3u8 information: HTTP Error 403: Forbidden
[bbc.co.uk] p08yxrld: Downloading MPD manifest
[bbc.co.uk] p08yxrld: Downloading MPD manifest
[bbc.co.uk] p08yxrld: Downloading m3u8 information
[bbc.co.uk] p08yxrld: Downloading m3u8 information
WARNING: Failed to download m3u8 information: HTTP Error 403: Forbidden
[bbc.co.uk] p08yxrld: Downloading m3u8 information
[bbc.co.uk] p08yxrld: Downloading m3u8 information
WARNING: Failed to download m3u8 information: HTTP Error 403: Forbidden
[bbc.co.uk] p08yxrld: Downloading MPD manifest
[bbc.co.uk] p08yxrld: Downloading MPD manifest
[dashsegments] Total fragments: 103
[download] Destination: BBC - Could your dreams predict the future-p08yxrld.f_deprecated__mf_limelight-video=5070000-1.mp4
[download]   2.2% of ~171.52MiB at 960.82KiB/s ETA 04:04
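For anyone who prefers to derive the v6 URLs rather than edit them by hand, the change in the diff above amounts to a small string rewrite. The helper below is a hypothetical illustration (the name `to_v6_xml` is not part of bbc.py); it only reproduces the two substitutions shown in the diff, swapping `/mediaselector/5/` for `/mediaselector/6/` and appending `/format/xml`.

```python
def to_v6_xml(template):
    """Rewrite a mediaselector/5 URL template to mediaselector/6 with XML output."""
    # Only the first occurrence is replaced, so the rest of the path is untouched.
    upgraded = template.replace('/mediaselector/5/', '/mediaselector/6/', 1)
    # Appending /format/xml asks v6 for XML instead of its default JSON.
    if not upgraded.endswith('/format/xml'):
        upgraded += '/format/xml'
    return upgraded


V5_URLS = [
    'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/%s',
    'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/%s',
]
V6_URLS = [to_v6_xml(url) for url in V5_URLS]
```

The `%s` placeholder is preserved, so the resulting templates can still be filled in with a vpid such as p08yxrld in the same way as the originals.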

@ajj8

ajj8 commented Dec 12, 2020

This has been fixed for AGES by my pull request (almost a year now) which the youtube-dl maintenance team is refusing to merge
#23415

@dirkf
Contributor

dirkf commented Mar 2, 2021

Fixed in e465b25
