BBC weather #27780

Closed
SKWDiesel1 opened this issue Jan 12, 2021 · 7 comments

Comments

@SKWDiesel1

Checklist

  • [x] I'm reporting a broken site support
  • [x] I've verified that I'm running youtube-dl version 2021.01.08
  • [x] I've checked that all provided URLs are alive and playable in a browser
  • [x] I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • [x] I've searched the bugtracker for similar issues including closed ones

Verbose log

youtube-dl --verbose https://www.bbc.co.uk/weather/features/55581056
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', 'https://www.bbc.co.uk/weather/features/55581056']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.01.08
[debug] Python version 3.9.1 (CPython) - Linux-5.4.88-1-lts-x86_64-with-glibc2.32
[debug] exe versions: ffmpeg 4.3.1, ffprobe 4.3.1, rtmpdump 2.4
[debug] Proxy map: {}
[bbc] 55581056: Downloading webpage
ERROR: Unable to extract playlist data; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
File "/usr/lib/python3.9/site-packages/youtube_dl/YoutubeDL.py", line 803, in wrapper
return func(self, *args, **kwargs)
File "/usr/lib/python3.9/site-packages/youtube_dl/YoutubeDL.py", line 824, in __extract_info
ie_result = ie.extract(url)
File "/usr/lib/python3.9/site-packages/youtube_dl/extractor/common.py", line 532, in extract
ie_result = self._real_extract(url)
File "/usr/lib/python3.9/site-packages/youtube_dl/extractor/bbc.py", line 1174, in _real_extract
self._search_regex(
File "/usr/lib/python3.9/site-packages/youtube_dl/extractor/common.py", line 1010, in _search_regex
raise RegexNotFoundError('Unable to extract %s' % _name)
youtube_dl.utils.RegexNotFoundError: Unable to extract playlist data; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Description

This error is reported when trying to extract the video report from the page...
https://www.bbc.co.uk/weather/features/5558105

@georgeahill

Hi! Thanks for creating an issue. I've just checked, and I get this issue too. I think this might be a duplicate of #14168, which hasn't had a response yet. It looks like either a bug in the BBC extractor, a lack of proper support for BBC Weather, or some change in BBC's delivery system. This may be fixed by #23415, which hasn't been merged yet.

@Vangelis66

Vangelis66 commented Jan 19, 2021

Manual workaround for fetching to disk the "Weather for the Week Ahead" clip found on:

https://www.bbc.com/weather/features/55581056

  1. Use your browser to inspect the Page Source
  2. Search for the data-parent-pid string
  3. Note down its value, which in this case is p093xhx6
  4. Manually rewrite the original URI as https://www.bbc.co.uk/programmes/p093xhx6
  5. Feed yt-dl the above URI
  6. Profit:

youtube-dl -F "https://www.bbc.co.uk/programmes/p093xhx6" =>

[bbc.co.uk] p093xhx6: Downloading video page
[bbc.co.uk] p093xhxl: Downloading media selection JSON
[bbc.co.uk] p093xhxl: Downloading m3u8 information
[bbc.co.uk] p093xhxl: Downloading m3u8 information
[bbc.co.uk] p093xhxl: Downloading MPD manifest
[bbc.co.uk] p093xhxl: Downloading MPD manifest
[bbc.co.uk] p093xhxl: Downloading m3u8 information
[bbc.co.uk] p093xhxl: Downloading m3u8 information
[bbc.co.uk] p093xhxl: Downloading MPD manifest
[bbc.co.uk] p093xhxl: Downloading MPD manifest
[info] Available formats for p093xhxl:
format code                        extension  resolution note
mf_akamai-audio_eng_1=128000-0     m4a        audio only [en] DASH audio  128k , m4a_dash container, mp4a.40.2 (48000Hz)
mf_akamai-audio_eng_1=128000-1     m4a        audio only [en] DASH audio  128k , m4a_dash container, mp4a.40.2 (48000Hz)
mf_limelight-audio_eng_1=128000-0  m4a        audio only [en] DASH audio  128k , m4a_dash container, mp4a.40.2 (48000Hz)
mf_limelight-audio_eng_1=128000-1  m4a        audio only [en] DASH audio  128k , m4a_dash container, mp4a.40.2 (48000Hz)
mf_akamai-video=827000-0           mp4        704x396    DASH video  827k , mp4_dash container, avc3.4D401F, 25fps, video only
mf_akamai-video=827000-1           mp4        704x396    DASH video  827k , mp4_dash container, avc3.4D401F, 25fps, video only
mf_limelight-video=827000-0        mp4        704x396    DASH video  827k , mp4_dash container, avc3.4D401F, 25fps, video only
mf_limelight-video=827000-1        mp4        704x396    DASH video  827k , mp4_dash container, avc3.4D401F, 25fps, video only
mf_akamai-video=1570000-0          mp4        704x396    DASH video 1570k , mp4_dash container, avc3.64001F, 50fps, video only
mf_akamai-video=1570000-1          mp4        704x396    DASH video 1570k , mp4_dash container, avc3.64001F, 50fps, video only
mf_limelight-video=1570000-0       mp4        704x396    DASH video 1570k , mp4_dash container, avc3.64001F, 50fps, video only
mf_limelight-video=1570000-1       mp4        704x396    DASH video 1570k , mp4_dash container, avc3.64001F, 50fps, video only
mf_akamai-video=2812000-0          mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only
mf_akamai-video=2812000-1          mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only
mf_limelight-video=2812000-0       mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only
mf_limelight-video=2812000-1       mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only
mf_akamai-video=5070000-0          mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only
mf_akamai-video=5070000-1          mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only
mf_limelight-video=5070000-0       mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only
mf_limelight-video=5070000-1       mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only
mf_akamai-1013-0                   mp4        704x396    1013k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.2@128k
mf_akamai-1013-1                   mp4        704x396    1013k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.2@128k
mf_limelight-1013-0                mp4        704x396    1013k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.2@128k
mf_limelight-1013-1                mp4        704x396    1013k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.2@128k
mf_akamai-1800-0                   mp4        704x396    1800k , avc1.64001F@1570k, 50.0fps, mp4a.40.2@128k
mf_akamai-1800-1                   mp4        704x396    1800k , avc1.64001F@1570k, 50.0fps, mp4a.40.2@128k
mf_limelight-1800-0                mp4        704x396    1800k , avc1.64001F@1570k, 50.0fps, mp4a.40.2@128k
mf_limelight-1800-1                mp4        704x396    1800k , avc1.64001F@1570k, 50.0fps, mp4a.40.2@128k
mf_akamai-3117-0                   mp4        960x540    3117k , avc1.64001F@2812k, 50.0fps, mp4a.40.2@128k
mf_akamai-3117-1                   mp4        960x540    3117k , avc1.64001F@2812k, 50.0fps, mp4a.40.2@128k
mf_limelight-3117-0                mp4        960x540    3117k , avc1.64001F@2812k, 50.0fps, mp4a.40.2@128k
mf_limelight-3117-1                mp4        960x540    3117k , avc1.64001F@2812k, 50.0fps, mp4a.40.2@128k
mf_akamai-5510-0                   mp4        1280x720   5510k , avc1.640020@5070k, 50.0fps, mp4a.40.2@128k
mf_akamai-5510-1                   mp4        1280x720   5510k , avc1.640020@5070k, 50.0fps, mp4a.40.2@128k
mf_limelight-5510-0                mp4        1280x720   5510k , avc1.640020@5070k, 50.0fps, mp4a.40.2@128k
mf_limelight-5510-1                mp4        1280x720   5510k , avc1.640020@5070k, 50.0fps, mp4a.40.2@128k (best)
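
Steps 1 to 4 of the workaround could be automated roughly as follows. This is only a minimal sketch and not part of youtube-dl: the function name `programmes_url_from_html` is hypothetical, the sample markup is made up for illustration, and it assumes the page source is already available as a string and carries the attribute in the form `data-parent-pid="..."`.

```python
import re

def programmes_url_from_html(html):
    """Return the /programmes/ URL for the first data-parent-pid found, or None."""
    # Assumption: the PID is a lowercase alphanumeric string inside double quotes.
    match = re.search(r'data-parent-pid="([0-9a-z]+)"', html)
    if match is None:
        return None
    return 'https://www.bbc.co.uk/programmes/' + match.group(1)

# Hypothetical fragment standing in for the real page source.
sample = '<div class="player" data-parent-pid="p093xhx6"></div>'
print(programmes_url_from_html(sample))
```

The resulting URL is what step 5 feeds to youtube-dl.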

@SKWDiesel1
Author

I have found that the data-parent-pid is shown if you right click on the video. This saves having to inspect the page source.

Thanks for the feedback and workaround.

@Vangelis66

@SKWDiesel1 last wrote:

I have found that the data-parent-pid is shown if you right click on the video.
This saves having to inspect the page source.

Strictly speaking (and being pedantic), what you write is NOT exact... 👎
The clip's PID is only recoverable by inspecting the Page Source and, as already posted, is:

data-parent-pid="p093xhx6"

The embedded player's HTML5 context menu (which requires starting video playback first, something not always wanted...) displays the clip's VersionPID (aka vpid, not easily copied from there). The vpid also appears twice in the Page Source, as:

data-vpid="p093xhxl"
(redacted)
"versionPid":"p093xhxl"

But, pid(=p093xhx6) != vpid(=p093xhxl)

yt-dl must be fed the PID string (this advice applies BBC-wide). Luckily for you, though, bbc.co.uk silently redirects a vpid URI to the corresponding pid URI:

https://www.bbc.co.uk/programmes/p093xhxl =>

https://www.bbc.co.uk/programmes/p093xhx6

(you can check/verify the redirection in your browser...)
Thus, youtube-dl "https://www.bbc.co.uk/programmes/p093xhxl" simply works, too... 😜
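
To make the pid/vpid distinction concrete, here is a small sketch that collects both kinds of identifier from a page source string. The helper name `find_bbc_ids` and the sample fragment are hypothetical; the three patterns are simply the ones quoted in this comment, and real pages may differ.

```python
import re

def find_bbc_ids(html):
    """Collect programme PIDs and version PIDs (vpids) found in the markup."""
    pids = set(re.findall(r'data-parent-pid="([0-9a-z]+)"', html))
    # The vpid was seen in two forms: an HTML attribute and a JSON key.
    vpids = set(re.findall(r'data-vpid="([0-9a-z]+)"', html))
    vpids.update(re.findall(r'"versionPid"\s*:\s*"([0-9a-z]+)"', html))
    return {'pid': sorted(pids), 'vpid': sorted(vpids)}

# Hypothetical fragment combining the three snippets quoted above.
sample = (
    '<div data-parent-pid="p093xhx6" data-vpid="p093xhxl"></div>'
    '<script>{"versionPid":"p093xhxl"}</script>'
)
ids = find_bbc_ids(sample)
print(ids)
print(ids['pid'] != ids['vpid'])  # the two identifiers differ
```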

@SKWDiesel1
Author

I like pedantic as it usually means correct! It is fortunate for me/us that the redirection works!

Thanks again.

@dirkf
Contributor

dirkf commented Mar 28, 2021

This weather page stashes the page details as JSON in a JS call to Morph.setPayload(), as seems to be typical in non-iPlayer pages. While this pattern is found by the extractor, the current logic may not find the correct instance of the pattern and doesn't capture the required data from the JSON as currently served.

See PR #28577
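
For illustration, a rough sketch of scanning a page for every `Morph.setPayload()` call rather than only the first one. This is not the code from the PR: the helper name `iter_morph_payloads` is hypothetical, and the call shape (a first argument without commas, followed by a JSON object) is an assumption, as are the sample paths and keys.

```python
import json
import re

def iter_morph_payloads(html):
    """Yield the JSON object passed as second argument to each Morph.setPayload() call."""
    decoder = json.JSONDecoder()
    # Assumed call shape: Morph.setPayload(<first arg without commas>, {...});
    for match in re.finditer(r'Morph\.setPayload\([^,]+,\s*(?=\{)', html):
        try:
            # raw_decode parses one JSON value starting at the given index,
            # so nested braces are handled by the JSON parser, not the regex.
            payload, _end = decoder.raw_decode(html, match.end())
        except ValueError:
            continue  # skip anything that is not valid JSON
        yield payload

# Hypothetical page fragment with two payloads; only the second has the data we want.
sample = (
    "Morph.setPayload('/data/a', {\"body\": {\"x\": 1}});"
    "Morph.setPayload('/data/b', {\"body\": {\"versionPid\": \"p093xhxl\"}});"
)
for payload in iter_morph_payloads(sample):
    print(payload['body'].get('versionPid'))
```

Iterating over all instances lets the caller pick the payload that actually holds the media data instead of stopping at the first match.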

dirkf closed this as not planned on Mar 22, 2023
@Vangelis66

Vangelis66 commented Mar 23, 2023

... FWIW, the BBC weather clip from the OP's log can now be fetched with the git-master version of youtube-dl (or even with the "overhauled" bbcIE found here 😜):

yt-dl -vF "https://www.bbc.com/weather/features/55581056" => 

[bbc] 55581056: Downloading webpage
[bbc] 55581056: Extracting from __INITIAL_DATA__
[bbc] p093xhxl: Downloading media selection JSON
[bbc] p093xhxl: Downloading m3u8 information
[bbc] p093xhxl: Downloading m3u8 information
[bbc] p093xhxl: Downloading m3u8 information
[bbc] p093xhxl: Downloading m3u8 information
[bbc] p093xhxl: Downloading MPD manifest
[bbc] p093xhxl: Downloading MPD manifest
[bbc] p093xhxl: Downloading MPD manifest
[bbc] p093xhxl: Downloading MPD manifest
[bbc] p093xhxl: Downloading media selection JSON
[download] Downloading playlist: Weather for the Week Ahead
[bbc] playlist Weather for the Week Ahead: Collected 1 video ids (downloading 1of them)
[download] Downloading video 1 of 1
[info] Available formats for p093xhxl:
format code                      extension  resolution note
mf_akamai-audio_eng=96000-0      m4a        audio only [en] DASH audio   96k , m4a_dash container, mp4a.40.5 (48000Hz)
mf_akamai-audio_eng=96000-1      m4a        audio only [en] DASH audio   96k , m4a_dash container, mp4a.40.5 (48000Hz)
mf_cloudfront-audio_eng=96000-0  m4a        audio only [en] DASH audio   96k , m4a_dash container, mp4a.40.5 (48000Hz)
mf_cloudfront-audio_eng=96000-1  m4a        audio only [en] DASH audio   96k , m4a_dash container, mp4a.40.5 (48000Hz)
mf_akamai-video=281000-0         mp4        384x216    DASH video  281k , mp4_dash container, avc3.42C015, 25fps, video only
mf_akamai-video=281000-1         mp4        384x216    DASH video  281k , mp4_dash container, avc3.42C015, 25fps, video only
mf_cloudfront-video=281000-0     mp4        384x216    DASH video  281k , mp4_dash container, avc3.42C015, 25fps, video only
mf_cloudfront-video=281000-1     mp4        384x216    DASH video  281k , mp4_dash container, avc3.42C015, 25fps, video only
mf_akamai-video=437000-0         mp4        512x288    DASH video  437k , mp4_dash container, avc3.4D4015, 25fps, video only
mf_akamai-video=437000-1         mp4        512x288    DASH video  437k , mp4_dash container, avc3.4D4015, 25fps, video only
mf_cloudfront-video=437000-0     mp4        512x288    DASH video  437k , mp4_dash container, avc3.4D4015, 25fps, video only
mf_cloudfront-video=437000-1     mp4        512x288    DASH video  437k , mp4_dash container, avc3.4D4015, 25fps, video only
mf_akamai-video=827000-0         mp4        704x396    DASH video  827k , mp4_dash container, avc3.4D401F, 25fps, video only
mf_akamai-video=827000-1         mp4        704x396    DASH video  827k , mp4_dash container, avc3.4D401F, 25fps, video only
mf_cloudfront-video=827000-0     mp4        704x396    DASH video  827k , mp4_dash container, avc3.4D401F, 25fps, video only
mf_cloudfront-video=827000-1     mp4        704x396    DASH video  827k , mp4_dash container, avc3.4D401F, 25fps, video only
mf_akamai-video=1604000-0        mp4        960x540    DASH video 1604k , mp4_dash container, avc3.64001F, 25fps, video only
mf_akamai-video=1604000-1        mp4        960x540    DASH video 1604k , mp4_dash container, avc3.64001F, 25fps, video only
mf_cloudfront-video=1604000-0    mp4        960x540    DASH video 1604k , mp4_dash container, avc3.64001F, 25fps, video only
mf_cloudfront-video=1604000-1    mp4        960x540    DASH video 1604k , mp4_dash container, avc3.64001F, 25fps, video only
mf_akamai-video=2812000-0        mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only
mf_akamai-video=2812000-1        mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only
mf_cloudfront-video=2812000-0    mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only
mf_cloudfront-video=2812000-1    mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only
mf_akamai-video=5070000-0        mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only
mf_akamai-video=5070000-1        mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only
mf_cloudfront-video=5070000-0    mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only
mf_cloudfront-video=5070000-1    mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only
mf_akamai-400-0                  mp4        384x216     400k , avc1.42C015@ 281k, 25.0fps, mp4a.40.5@ 96k
mf_akamai-400-1                  mp4        384x216     400k , avc1.42C015@ 281k, 25.0fps, mp4a.40.5@ 96k
mf_cloudfront-400-0              mp4        384x216     400k , avc1.42C015@ 281k, 25.0fps, mp4a.40.5@ 96k
mf_cloudfront-400-1              mp4        384x216     400k , avc1.42C015@ 281k, 25.0fps, mp4a.40.5@ 96k
mf_akamai-565-0                  mp4        512x288     565k , avc1.4D4015@ 437k, 25.0fps, mp4a.40.5@ 96k
mf_akamai-565-1                  mp4        512x288     565k , avc1.4D4015@ 437k, 25.0fps, mp4a.40.5@ 96k
mf_cloudfront-565-0              mp4        512x288     565k , avc1.4D4015@ 437k, 25.0fps, mp4a.40.5@ 96k
mf_cloudfront-565-1              mp4        512x288     565k , avc1.4D4015@ 437k, 25.0fps, mp4a.40.5@ 96k
mf_akamai-979-0                  mp4        704x396     979k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.5@ 96k
mf_akamai-979-1                  mp4        704x396     979k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.5@ 96k
mf_cloudfront-979-0              mp4        704x396     979k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.5@ 96k
mf_cloudfront-979-1              mp4        704x396     979k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.5@ 96k
mf_akamai-1802-0                 mp4        960x540    1802k , avc1.64001F@1604k, 25.0fps, mp4a.40.5@ 96k
mf_akamai-1802-1                 mp4        960x540    1802k , avc1.64001F@1604k, 25.0fps, mp4a.40.5@ 96k
mf_cloudfront-1802-0             mp4        960x540    1802k , avc1.64001F@1604k, 25.0fps, mp4a.40.5@ 96k
mf_cloudfront-1802-1             mp4        960x540    1802k , avc1.64001F@1604k, 25.0fps, mp4a.40.5@ 96k (best)
[download] Finished downloading playlist: Weather for the Week Ahead

NB: For some peculiar (?) reason, the (720|540)p50 encodes are being offered solely via DASH; HLS only goes as high as 540p25. This was fetched from an overseas location, BTW...
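
That observation can be checked mechanically against a saved `-F` listing. The sketch below is an illustration only: `best_video_by_protocol` is a hypothetical helper, and it assumes that any video row whose note lacks "DASH video" is an HLS format, which matches the listing above but is not guaranteed in general.

```python
import re

def best_video_by_protocol(lines):
    """Map 'DASH'/'HLS' to the highest (height, fps) found among the format rows."""
    best = {}
    for line in lines:
        res = re.search(r'\b(\d+)x(\d+)\b', line)
        fps = re.search(r'(\d+(?:\.\d+)?)fps', line)
        if not (res and fps):
            continue  # audio-only rows and non-format lines
        # Assumption: rows not marked "DASH video" are the HLS variants.
        protocol = 'DASH' if 'DASH video' in line else 'HLS'
        candidate = (int(res.group(2)), float(fps.group(1)))
        if candidate > best.get(protocol, (0, 0.0)):
            best[protocol] = candidate
    return best

# A few rows copied from the listing above.
rows = [
    'mf_akamai-audio_eng=96000-0      m4a        audio only [en] DASH audio   96k , m4a_dash container, mp4a.40.5 (48000Hz)',
    'mf_akamai-video=2812000-0        mp4        960x540    DASH video 2812k , mp4_dash container, avc3.64001F, 50fps, video only',
    'mf_akamai-video=5070000-0        mp4        1280x720   DASH video 5070k , mp4_dash container, avc3.640020, 50fps, video only',
    'mf_akamai-979-0                  mp4        704x396     979k , avc1.4D401F@ 827k, 25.0fps, mp4a.40.5@ 96k',
    'mf_akamai-1802-0                 mp4        960x540    1802k , avc1.64001F@1604k, 25.0fps, mp4a.40.5@ 96k',
]
print(best_video_by_protocol(rows))
```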

EDIT: (720|540)p50 encodes over HLS will appear when issuing

yt-dl -vF "https://www.bbc.co.uk/programmes/p093xhx6"

instead of

yt-dl -vF "https://www.bbc.com/weather/features/55581056"

Also, if you compare today's log with the one from more than two years ago, it seems the Beeb have ditched the Limelight CDNs for the Cloudfront ones...
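
The CDN comparison can be reproduced from the two format listings with a few lines of Python. This is a sketch under the assumption that the CDN name is the token between `mf_` and the first hyphen of each format code; `cdns_in_format_list` is a hypothetical helper name.

```python
import re

def cdns_in_format_list(lines):
    """Return the set of CDN names that appear in 'mf_<cdn>-...' format codes."""
    cdns = set()
    for line in lines:
        match = re.match(r'mf_([a-z]+)-', line)
        if match:
            cdns.add(match.group(1))
    return cdns

# Format codes taken from the 2021 and 2023 listings in this thread.
log_2021 = ['mf_akamai-video=827000-0', 'mf_limelight-video=827000-0', 'mf_limelight-5510-1']
log_2023 = ['mf_akamai-video=827000-0', 'mf_cloudfront-video=827000-0', 'mf_cloudfront-1802-1']

old, new = cdns_in_format_list(log_2021), cdns_in_format_list(log_2023)
print(sorted(old - new))  # CDNs that disappeared
print(sorted(new - old))  # CDNs that were added
```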
