bbc cannot extract playlist #28115

johnnytornado3 · 2021-02-08T11:51:14Z

Checklist

I'm reporting a broken site support issue
I've verified that I'm running youtube-dl version 2021.02.04.1
I've checked that all provided URLs are alive and playable in a browser
I've checked that all URLs and arguments with special characters are properly quoted or escaped
I've searched the bugtracker for similar bug reports including closed ones
I've read bugs section in FAQ

Verbose log


Description

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

N:\Movies>youtube-dl --version
2021.02.04.1

N:\Movies>youtube-dl --verbose https://www.bbc.com/reel/playlist/mind-matters?vpid=p0962h5x
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', 'https://www.bbc.com/reel/playlist/mind-matters?vpid=p0962h5x']
[debug] Encodings: locale cp1252, fs mbcs, out cp437, pref cp1252
[debug] youtube-dl version 2021.02.04.1
[debug] Python version 3.4.4 (CPython) - Windows-XP-5.1.2600-SP3
[debug] exe versions: ffmpeg N-77883-gd7c75a5, ffprobe N-77883-gd7c75a5, phantomjs 1.9.7
[debug] Proxy map: {}
[bbc] mind-matters: Downloading webpage
ERROR: Unable to extract playlist data; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpgi7ngq0n\build\youtube_dl\YoutubeDL.py", line 806, in wrapper
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpgi7ngq0n\build\youtube_dl\YoutubeDL.py", line 827, in __extract_info
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpgi7ngq0n\build\youtube_dl\extractor\common.py", line 532, in extract
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpgi7ngq0n\build\youtube_dl\extractor\bbc.py", line 1176, in _real_extract
File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpgi7ngq0n\build\youtube_dl\extractor\common.py", line 1010, in _search_regex
youtube_dl.utils.RegexNotFoundError: Unable to extract playlist data; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dl -U to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

N:\Movies>


Vangelis66 · 2021-02-10T01:32:31Z

This is yet another duplicate of
#27125
#23660
#21870
#18308
and, possibly, many others...

TL;DR (or DS=didn't search):

*bbc.com/reel/* URIs are not supported by current bbcIE; or, if support was meant, it's currently broken...

Workarounds:

For single clip, like the one in OP's log:

Reformat
https://www.bbc.com/reel/playlist/mind-matters?vpid=p0962h5x
to
https://www.bbc.co.uk/programmes/p0962h5x
and feed that to yt-dl
(auto-redirection to
https://www.bbc.co.uk/programmes/p095rkvg
will take place)
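
For anyone who wants to script that rewrite, here is a minimal Python sketch. It assumes the clip's pid is always carried in the vpid query parameter, as in OP's URL; the helper name reel_to_programmes_url is made up for illustration and is not part of youtube-dl.

```python
from urllib.parse import urlparse, parse_qs

PROGRAMMES_BASE = 'https://www.bbc.co.uk/programmes/'


def reel_to_programmes_url(reel_url):
    """Turn a bbc.com/reel URL with ?vpid=<pid> into a bbc.co.uk/programmes URL."""
    query = parse_qs(urlparse(reel_url).query)
    vpids = query.get('vpid')
    if not vpids:
        # Assumption: without a vpid there is nothing to rewrite.
        raise ValueError('no vpid parameter in %r' % reel_url)
    return PROGRAMMES_BASE + vpids[0]


print(reel_to_programmes_url(
    'https://www.bbc.com/reel/playlist/mind-matters?vpid=p0962h5x'))
# prints https://www.bbc.co.uk/programmes/p0962h5x
```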

For the whole 10 clips of the "mind-matters" playlist

Inspect Page Source and search for clipPID":" string; you'll find ten instances, like below:

clipPID&quot;:&quot;p095rkvg
clipPID&quot;:&quot;p07rr51d
clipPID&quot;:&quot;p08d15ny
clipPID&quot;:&quot;p07jmww3
clipPID&quot;:&quot;p06l4bv9
clipPID&quot;:&quot;p05vt4yl
clipPID&quot;:&quot;p06qhcmy
clipPID&quot;:&quot;p06s9whb
clipPID&quot;:&quot;p06rw723
clipPID&quot;:&quot;p084qhnf

What's important is the pid values, e.g. p07rr51d for the second clip; create the following list of bbc.co.uk URIs:

https://www.bbc.co.uk/programmes/p095rkvg
https://www.bbc.co.uk/programmes/p07rr51d
https://www.bbc.co.uk/programmes/p08d15ny
https://www.bbc.co.uk/programmes/p07jmww3
https://www.bbc.co.uk/programmes/p06l4bv9
https://www.bbc.co.uk/programmes/p05vt4yl
https://www.bbc.co.uk/programmes/p06qhcmy
https://www.bbc.co.uk/programmes/p06s9whb
https://www.bbc.co.uk/programmes/p06rw723
https://www.bbc.co.uk/programmes/p084qhnf

save it as a text file named mind-matters-pl.txt, put it adjacent to youtube-dl.exe and then issue:
youtube-dl -a "mind-matters-pl.txt"
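
The manual steps above could also be scripted. The following is only a sketch: it assumes you have already saved the page source to a string, and that the pids appear after clipPID with HTML-escaped quotes as in the listing above. The function names are hypothetical.

```python
import re

PROGRAMMES_BASE = 'https://www.bbc.co.uk/programmes/'

# Matches clipPID followed by an escaped (&quot;) or literal quote, a colon,
# another quote, and then the pid itself.
CLIP_PID_RE = re.compile(r'clipPID(?:&quot;|"):(?:&quot;|")([0-9a-z]+)')


def clip_urls_from_source(page_source):
    """Return bbc.co.uk/programmes URLs for each distinct clipPID, in page order."""
    pids = []
    for pid in CLIP_PID_RE.findall(page_source):
        if pid not in pids:  # drop repeats but keep the original order
            pids.append(pid)
    return [PROGRAMMES_BASE + pid for pid in pids]


def write_batch_file(urls, path='mind-matters-pl.txt'):
    """Write one URL per line, the format that youtube-dl -a expects."""
    with open(path, 'w') as f:
        f.write('\n'.join(urls) + '\n')


sample = 'x clipPID&quot;:&quot;p095rkvg&quot; y clipPID&quot;:&quot;p07rr51d&quot; z'
print(clip_urls_from_source(sample))
```

Calling write_batch_file(clip_urls_from_source(page_source)) on the real page source would then produce the text file to pass to youtube-dl -a.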

dirkf · 2021-03-24T15:51:32Z

Fixed in a400024.

remitamine · 2021-03-24T16:31:46Z

BBC reel playlist URLs are not handled properly (the extractor does not handle the --no-playlist/--yes-playlist options), so this issue will be kept open until this is fixed.

dirkf · 2021-03-27T12:33:36Z

It would be easy to fix if there were a way of distinguishing when the noplaylist option is False by default and when it is set by --yes-playlist. Apparently self._downloader.params.get('noplaylist') is False instead of None when neither --xx-playlist option was set. Surely these are boolean options that should set params['noplaylist'] only if given:

--- a/youtube_dl/options.py
+++ b/youtube_dl/options.py
@@ -330,11 +330,11 @@
         ))
     selection.add_option(
         '--no-playlist',
-        action='store_true', dest='noplaylist', default=False,
+        action='store_true', dest='noplaylist', default=None,
         help='Download only the video, if the URL refers to a video and a playlist.')
     selection.add_option(
         '--yes-playlist',
-        action='store_false', dest='noplaylist', default=False,
+        action='store_false', dest='noplaylist', default=None,
         help='Download the playlist, if the URL refers to a video and a playlist.')
     selection.add_option(
         '--age-limit',
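
The effect of changing the default can be checked in isolation with the standard library's optparse, which options.py is built on. This is a stand-alone sketch with only the two options in question, not the real youtube-dl parser; build_parser and parse are names invented for the demonstration.

```python
import optparse


def build_parser(default):
    """A toy parser with just the two playlist options sharing one dest."""
    parser = optparse.OptionParser()
    parser.add_option('--no-playlist', action='store_true',
                      dest='noplaylist', default=default)
    parser.add_option('--yes-playlist', action='store_false',
                      dest='noplaylist', default=default)
    return parser


def parse(default, argv):
    """Return the resulting noplaylist value for the given command line."""
    opts, _args = build_parser(default).parse_args(argv)
    return opts.noplaylist


for default in (False, None):
    results = [parse(default, argv)
               for argv in ([], ['--yes-playlist'], ['--no-playlist'])]
    print(default, results)
# prints:
# False [False, False, True]
# None [None, False, True]
```

With default=False, an empty command line and --yes-playlist are indistinguishable; with default=None, all three cases yield different values.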

remitamine · 2021-03-27T13:00:53Z

The extractor doesn't need to know whether the parameter was set explicitly by the user or is the default value (the default value has to be treated the same way as if the user had passed the option); you can look at other extractors that handle the noplaylist option.

dirkf · 2021-03-27T13:49:23Z

The default is wrong: it should be None. A simple boolean value can't represent the three cases. Otherwise there needs to be a boolean params['yesplaylist'] set by --yes-playlist and params['noplaylist'] should only be set by --no-playlist.

Apparently the other extractors can't handle the case where basically the same page is fetched using different URLs that imply distinct default playlist handling. For instance:

The correct logic for a URL that has both a single video and a playlist is:

URL implies a playlist and --no-playlist => single video (params['noplaylist'] == False)~
URL implies a playlist and --no-playlist not used => playlist (params['noplaylist'] == False)~
URL implies a playlist and --yes-playlist => playlist (params['noplaylist'] == True)
URL implies single video and --yes-playlist => playlist (params['noplaylist'] == False)*
URL implies single video and --yes-playlist not used => single video (params['noplaylist'] == False)*
URL implies single video and --no-playlist => single video (params['noplaylist'] == True)

If params['noplaylist'] defaults to False, the cases marked ~ can't be identified, nor can those marked *, because params['noplaylist'] has the same value, but the desired outcome is different.

Even if the different URL formats were handled by separate extractors, it wouldn't help to disambiguate the params['noplaylist'] value.
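
To make the argument concrete, the six desired outcomes can be written as one small function, provided noplaylist is allowed to be None when neither option was given. This is a sketch under that assumption (which is the proposed change, not current behaviour); wants_playlist is a hypothetical helper, not an existing youtube-dl function.

```python
def wants_playlist(url_implies_playlist, noplaylist):
    """Decide whether to extract the playlist (True) or the single video (False).

    noplaylist is assumed to be tri-state:
      True  -> --no-playlist was given
      False -> --yes-playlist was given
      None  -> neither option was given, so the URL form decides
    """
    if noplaylist is None:
        return url_implies_playlist
    return not noplaylist


cases = [
    # (URL implies playlist, noplaylist, description)
    (True, True, 'playlist URL, --no-playlist'),
    (True, None, 'playlist URL, no option'),
    (True, False, 'playlist URL, --yes-playlist'),
    (False, False, 'video URL, --yes-playlist'),
    (False, None, 'video URL, no option'),
    (False, True, 'video URL, --no-playlist'),
]
for implies_playlist, noplaylist, label in cases:
    outcome = 'playlist' if wants_playlist(implies_playlist, noplaylist) else 'single video'
    print('%s => %s' % (label, outcome))
```

If None is replaced by False in the two "no option" rows, the function can no longer return different results for "video URL, no option" and "video URL, --yes-playlist", which is the ambiguity being described.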

remitamine · 2021-03-27T15:03:53Z

https://www.bbc.com/reel/playlist/mind-matters?vpid=p0962h5x => video

This URL would not be considered to imply a video; instead, that is determined after checking the noplaylist value.

dirkf · 2021-03-27T16:10:17Z

Actually, like the other URL formats of this type, the vpid= type has a focused video; unlike the others it's not the first in the list under the video. So it does imply a video, especially as its vpid is mentioned.

There is also the third case https://www.bbc.com/reel/video/p099tghy/is-phrenology-the-weirdest-pseudoscience-of-them-all- which is apparently identical to https://www.bbc.com/reel/playlist/mind-matters.

I reviewed the results of find youtube_dl -name '*.py' -exec grep -HE "'noplaylist'" "{}" \; again. Some extractors report that a single video is being processed because of --no-playlist. None mention using --yes-playlist. It's clear that --yes-playlist is being used as just a way to turn off --no-playlist (as if it were --no-no-playlist). If params['noplaylist'] defaults to False there's no other reason for it.

Suppose that a user goes to a page with a video and wants to archive the video. The page (say, https://www.bbc.com/reel/video/p099tghy/is-phrenology-the-weirdest-pseudoscience-of-them-all-) happens to have a playlist that can be extracted, so the user ends up with 57 (12, in this case) other unexpected videos. The unhappy user can make --no-playlist the configuration default to avoid such a surprise. Then the same user goes to a playlist page (say, https://www.bbc.com/reel/playlist/mind-matters) that happens to have an active video and finds that only that video is fetched. The user is unhappy again.

Whereas, if --no-playlist and --yes-playlist operate independently (equivalently, params['noplaylist'] defaults to None), with the first page the user gets the one video expected, and could have used --yes-playlist to get the playlist; there is no need to set any non-default configuration; with the second page, the user gets the playlist expected, and could have used --no-playlist to get just the video. Surely that's what was intended?

remitamine · 2021-03-27T16:34:38Z

Suppose that a user goes to a page with a video and wants to archive the video. The page (say, https://www.bbc.com/reel/video/p099tghy/is-phrenology-the-weirdest-pseudoscience-of-them-all-) happens to have a playlist that can be extracted, so the user ends up with 57 (12, in this case) other unexpected videos. The unhappy user can make --no-playlist the configuration default to avoid such a surprise. Then the same user goes to a playlist page (say, https://www.bbc.com/reel/playlist/mind-matters) that happens to have an active video and finds that only that video is fetched. The user is unhappy again.

for https://www.bbc.com/reel/playlist/mind-matters, it's a playlist URL and it will be treated this way regardless of noplaylist value.

dirkf · 2021-03-27T20:28:54Z

The same page can have both a video and a playlist and the interpretation of which is to be processed depends only on the URL.

https://www.bbc.com/reel/playlist/mind-matters is a URL "referring to a video and a playlist", to quote the manual, so --no-playlist ought to be respected.

But the other two URL styles that I quoted, which are plainly video and not playlist URLs, refer to an essentially identical page. They should yield the video by default but then it's impossible to override that with --yes-playlist because the option processing doesn't record that --yes-playlist was used.

In summary, the change from False to None in youtube_dl/options.py as suggested

would have no effect on existing extractors, except to allow them to respond to --yes-playlist meaningfully,
would match the description of the --no/yes-playlist options in the manual, and
would allow potentially confusing URLs to be treated more flexibly and as described in the manual.

See https://github.com/dirkf/youtube-dl/tree/df-bbcreel-playlist-patch.

remitamine · 2021-03-27T23:12:25Z

would have no effect on existing extractors, except to allow them to respond to --yes-playlist meaningfully

it's a breaking change, it would change the default behaviour.

would match the description of the --no/yes-playlist options in the manual, and
would allow potentially confusing URLs to be treated more flexibly and as described in the manual.

the description of the option states this:

if the URL refers to a video and a playlist

so the option would apply only to https://www.bbc.com/reel/playlist/mind-matters?vpid=p0962h5x, because the URL refers to mind-matters playlist and p0962h5x video id.
the https://www.bbc.com/reel/playlist/mind-matters URL links to a playlist because the URL refers only to mind-matters playlist.
the https://www.bbc.com/reel/video/p099tghy/is-phrenology-the-weirdest-pseudoscience-of-them-all- URL links to a video because the URL refers only to p099tghy video id.
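
A rough sketch of this reading, where noplaylist only matters for the URL form that names both a playlist and a video. This is illustrative Python, not the extractor's actual code; resolve_reel_url and its return shape are assumptions made for the example.

```python
from urllib.parse import urlparse, parse_qs


def resolve_reel_url(url, noplaylist=False):
    """Return ('video', id) or ('playlist', id) for a bbc.com/reel URL."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split('/') if p]
    if len(parts) < 3 or parts[0] != 'reel':
        raise ValueError('not a BBC Reel URL: %r' % url)
    kind, ident = parts[1], parts[2]
    if kind == 'video':
        # The URL refers only to a video id: always a single video.
        return ('video', ident)
    if kind == 'playlist':
        vpid = parse_qs(parsed.query).get('vpid', [None])[0]
        # Only a URL naming both a playlist and a vpid consults noplaylist.
        if vpid and noplaylist:
            return ('video', vpid)
        return ('playlist', ident)
    raise ValueError('unrecognised BBC Reel URL: %r' % url)


both = 'https://www.bbc.com/reel/playlist/mind-matters?vpid=p0962h5x'
print(resolve_reel_url(both, noplaylist=True))   # ('video', 'p0962h5x')
print(resolve_reel_url(both, noplaylist=False))  # ('playlist', 'mind-matters')
```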

dirkf added the duplicate label Jan 29, 2022
