Some new BBC Radio Programs not downloading (old ones working fine) #17270

Closed
NielsMayer opened this issue Aug 18, 2018 · 2 comments

@NielsMayer NielsMayer commented Aug 18, 2018

Please follow the guide below

  • You will be asked some questions and requested to provide some information, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your issue (like this: [x])
  • Use the Preview tab to see what your issue will actually look like

Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2018.08.04. If it's not, read this FAQ entry and update. Issues reported against an outdated version will be rejected.

  • I've verified and I assure that I'm running youtube-dl 2018.08.04

Before submitting an issue make sure you have:

  • At least skimmed through the README, most notably the FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones
  • Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

If the purpose of this issue is a bug report or a site support request, or if you are not completely sure, provide the full verbose output as follows:

Add the -v flag to the command line you run youtube-dl with (youtube-dl -v <your command line>), copy the whole output and insert it here. It should look similar to the one below (replace it with your log, inserted between triple ```):

/usr/local/bin/youtube-dl --verbose https://www.bbc.co.uk/programmes/m00005xn
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--verbose', u'https://www.bbc.co.uk/programmes/m00005xn']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2018.08.04
[debug] Python version 2.7.15rc1 (CPython) - Linux-4.15.0-32-generic-x86_64-with-Ubuntu-18.04-bionic
[debug] exe versions: ffmpeg 3.4.2-2, ffprobe 3.4.2-2
[debug] Proxy map: {}
[bbc] m00005xn: Downloading webpage
ERROR: Unable to extract playlist data; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 792, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 502, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/bbc.py", line 1114, in _real_extract
    webpage, 'playlist data'),
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 972, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
RegexNotFoundError: Unable to extract playlist data; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Description of your issue, suggested solution and other information

Most BBC radio downloads work fine.
However, three recently published programmes fail to download:

https://www.bbc.co.uk/programmes/m00005xn
https://www.bbc.co.uk/programmes/m00005xj
https://www.bbc.co.uk/programmes/m00005xq

They all fail the same way; see the verbose log above.

@Vangelis66 Vangelis66 commented Aug 18, 2018

The issue is caused by the BBC's introduction of Programme IDs (PIDs) starting with the letter m. Until now, we've had PIDs starting with b, p and w (the last one for World Service Radio). It should be easy to fix by adjusting the regex in bbc.py to accept the new prefix. Sadly, I can't code, but from a quick look at:
https://github.com/rg3/youtube-dl/blob/0a74b4519147aec45dc39756dc55b5755beefbc2/youtube_dl/extractor/bbc.py#L32

-    _ID_REGEX = r'[pbw][\da-z]{7}'
+    _ID_REGEX = r'[bmpw][\da-z]{7}'

should be enough...

EDIT: The code I quoted assumes an 8-character alphanumeric string; however, since last year we've seen 15-character PIDs for BBC WS Radio, e.g.
https://www.bbc.co.uk/programmes/w172w4dww1jqt5s
so perhaps yt-dl needs to be patched for that, too?

-    _ID_REGEX = r'[pbw][\da-z]{7}'
+    _ID_REGEX = r'[bmpw][\da-z]{7,14}'

Again, not a coder here, just an educated iPlayer user... 😉
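
For anyone who wants to check the proposed patterns before touching bbc.py, here is a minimal sketch that runs the three regexes quoted in this thread against the PIDs from the URLs above. The names OLD, PATCHED, EXTENDED and the accepts() helper are illustrative only and are not part of youtube-dl. The sketch also assumes the whole PID must match, which may be stricter than how the extractor embeds the pattern in its URL regex.

```python
import re

# Patterns quoted in this thread (variable names are illustrative).
OLD = r'[pbw][\da-z]{7}'
PATCHED = r'[bmpw][\da-z]{7}'
EXTENDED = r'[bmpw][\da-z]{7,14}'

# PIDs taken from the URLs mentioned in this issue.
PIDS = ['m00005xn', 'm00005xj', 'm00005xq', 'w172w4dww1jqt5s']


def accepts(pattern, pid):
    """Return True if the entire PID matches the pattern.

    Anchoring with \\Z is an assumption of this sketch; it keeps the
    check working on both Python 2.7 and Python 3.
    """
    return re.match(r'(?:%s)\Z' % pattern, pid) is not None


if __name__ == '__main__':
    for pid in PIDS:
        results = [accepts(p, pid) for p in (OLD, PATCHED, EXTENDED)]
        print(pid, results)
```

Under these assumptions, the m-prefixed PIDs are rejected by the old pattern and accepted by both proposals, while the 15-character World Service PID is only accepted in full by the {7,14} variant.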

@NielsMayer NielsMayer commented Aug 19, 2018

The first suggestion above by @Vangelis66 works. Here is the patch:

diff --git a/youtube_dl/extractor/bbc.py b/youtube_dl/extractor/bbc.py
index 641bf6073..f802da33a 100644
--- a/youtube_dl/extractor/bbc.py
+++ b/youtube_dl/extractor/bbc.py
@@ -29,7 +29,7 @@ from ..compat import (
 class BBCCoUkIE(InfoExtractor):
     IE_NAME = 'bbc.co.uk'
     IE_DESC = 'BBC iPlayer'
-    _ID_REGEX = r'[pbw][\da-z]{7}'
+    _ID_REGEX = r'[pbwm][\da-z]{7}'
     _VALID_URL = r'''(?x)
                     https?://
                         (?:www\.)?bbc\.co\.uk/

Thanks!
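
To illustrate how the patched _ID_REGEX behaves once it is interpolated into a URL pattern, here is a rough sketch. SIMPLE_URL is a hypothetical, heavily simplified stand-in for the extractor's _VALID_URL (the real pattern in bbc.py covers more URL shapes), and programme_id() is a helper made up for this example.

```python
import re

_ID_REGEX = r'[pbwm][\da-z]{7}'  # value from the patch above

# Hypothetical simplification of _VALID_URL: it only handles
# /programmes/<pid> URLs, which is all this issue is about.
SIMPLE_URL = r'''(?x)
    https?://
        (?:www\.)?bbc\.co\.uk/
        programmes/
        (?P<id>%s)
        /?$
''' % _ID_REGEX

# The three URLs reported as failing in this issue.
REPORTED_URLS = [
    'https://www.bbc.co.uk/programmes/m00005xn',
    'https://www.bbc.co.uk/programmes/m00005xj',
    'https://www.bbc.co.uk/programmes/m00005xq',
]


def programme_id(url):
    """Return the PID captured from the URL, or None if it does not match."""
    match = re.match(SIMPLE_URL, url)
    return match.group('id') if match else None


if __name__ == '__main__':
    for url in REPORTED_URLS:
        print(url, '->', programme_id(url))
```

With this simplified pattern the three reported URLs yield their m-prefixed PIDs, while the 15-character World Service URL mentioned earlier still would not match, since the patch keeps the 8-character length.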

@dstftw dstftw closed this in 6f356cb Aug 19, 2018