[PBS]PBSKids.org #2440

Closed
7 tasks done
wertercatt opened this issue Jan 22, 2022 · 4 comments · Fixed by #7602
Labels
account-needed Account details are needed to test/fix this can-share-account Someone is willing to provide account details for development site-request Request to support a new website

Comments

@wertercatt

wertercatt commented Jan 22, 2022

Checklist

Region

United States

Description

Streaming video content from https://pbskids.org/cyberchase/videos and other pbskids.org pages doesn't trigger the PBS extractor, and is not downloaded. I found a metadata source at https://producerplayer.services.pbskids.org/show-list/?shows=cyberchase&shows_title=Cyberchase&page=1&page_size=200&available=public&sort=-encored_on if that's useful.

Verbose log

C:\Programs\yt-dlp_win>yt-dlp -vU -o "E:/RIPs for Jellyfin/Streaming Shows/%(series)s/Season %(season_number)s/Episode S%(season_number)sE%(episode_number)s.%(ext)s" https://pbskids.org/cyberchase/videos
[debug] Command-line config: ['-vU', '-o', 'E:/RIPs for Jellyfin/Streaming Shows/%(series)s/Season %(season_number)s/Episode S%(season_number)sE%(episode_number)s.%(ext)s', 'https://pbskids.org/cyberchase/videos']
[debug] Portable config "C:\Programs\yt-dlp_win\yt-dlp.conf": ['--write-subs', '--write-description', '--write-info-json', '--write-comments', '--write-thumbnail', '-o', 'D:/YouTube/%(channel)s/%(upload_date)s - %(title)s.%(ext)s', '--embed-subs', '--embed-thumbnail', '--embed-metadata', '--cookies-from-browser', 'firefox']
[Cookies] Extracting cookies from firefox
[debug] Extracting cookies from: "C:\Users\wertc\AppData\Roaming\Mozilla\Firefox\Profiles\duilr32u.default-release-1626019036382\cookies.sqlite"
[Cookies] Extracted 1948 cookies from firefox
[debug] Encodings: locale cp1252, fs utf-8, out utf-8, err utf-8, pref cp1252
[debug] yt-dlp version 2022.01.21 [f20d607] (win_exe)
[debug] Python version 3.8.10 (CPython 64bit) - Windows-10-10.0.19043-SP0
[debug] exe versions: ffmpeg n4.4-80-gbf87bdd3f6-20210822 (setts), ffprobe n4.4-80-gbf87bdd3f6-20210822
[debug] Optional libraries: Cryptodome, mutagen, sqlite, websockets
[debug] Proxy map: {}
Latest version: 2022.01.21, Current version: 2022.01.21
yt-dlp is up to date (2022.01.21)
[debug] [generic] Extracting URL: https://pbskids.org/cyberchase/videos
[generic] videos: Requesting header
WARNING: [generic] Falling back on generic information extractor.
[generic] videos: Downloading webpage
[generic] videos: Extracting information
[debug] Looking for video embeds
ERROR: Unsupported URL: https://pbskids.org/cyberchase/videos
Traceback (most recent call last):
  File "yt_dlp\YoutubeDL.py", line 1367, in wrapper
  File "yt_dlp\YoutubeDL.py", line 1437, in __extract_info
  File "yt_dlp\extractor\common.py", line 612, in extract
  File "yt_dlp\extractor\generic.py", line 3966, in _real_extract
yt_dlp.utils.UnsupportedError: Unsupported URL: https://pbskids.org/cyberchase/videos
@wertercatt wertercatt added site-bug Issue with a specific website triage Untriaged issue labels Jan 22, 2022
@Ashish0804
Contributor

Doesn't look like this was ever supported

@Ashish0804 Ashish0804 added site-request Request to support a new website and removed site-bug Issue with a specific website labels Jan 22, 2022
@Ashish0804 Ashish0804 changed the title from "[PBS] Streaming videos from PBSKids.org can't be downloaded." to "[PBS]PBSKids.org" Jan 22, 2022
@Ashish0804 Ashish0804 added account-needed Account details are needed to test/fix this can-share-account Someone is willing to provide account details for development labels Jan 22, 2022
@pukkandan pukkandan removed the triage Untriaged issue label Jan 22, 2022
@wertercatt
Author

Doesn't look like this was ever supported

It's the same video streaming used by the supported PBS sites. Digging into the JSON metadata, pulling out the 'player.pbs.org/partnerplayer/' link, and manually feeding it to yt-dlp properly engages the PBS extractor and downloads the video. The main issue is that it doesn't get any metadata that way, and yt-dlp still doesn't recognize the page that the player is embedded in.
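
As a rough illustration of the manual step described above, here is a minimal sketch that walks a decoded JSON document and collects every string containing 'player.pbs.org/partnerplayer/'. It deliberately assumes nothing about the field names in the real response, so it searches the whole tree. The function name and the sample data are made up for the example.

```python
import json
from typing import Any, Iterator

# Substring mentioned in the comment above; everything else here is illustrative.
PARTNER_PLAYER_MARKER = "player.pbs.org/partnerplayer/"


def iter_partner_player_urls(node: Any) -> Iterator[str]:
    """Yield every string in a decoded JSON tree that contains the marker."""
    if isinstance(node, str):
        if PARTNER_PLAYER_MARKER in node:
            yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_partner_player_urls(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_partner_player_urls(item)


# Made-up sample; the real response layout is not documented in this thread.
sample = json.loads(
    '{"items": [{"title": "Example", '
    '"embed": "https://player.pbs.org/partnerplayer/EXAMPLE_ID/"}, '
    '{"title": "No link here"}]}'
)
urls = list(iter_partner_player_urls(sample))
```

Each URL found this way could then be passed to yt-dlp on the command line, with the caveat already noted: the metadata from the surrounding JSON is not carried along.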

@wertercatt
Author

wertercatt commented Jan 22, 2022

The media.services.pbs.org domain is a red herring, by the way; I think only PBS employees and partners can access that API. The public domains are:

  • pbskids.org - Public-facing, human-readable site. The video players embedded here access the other domains.
  • producerplayer.services.pbskids.org - Metadata host for the video players. It is called directly by the JavaScript and is therefore publicly accessible. It appears to be an internal API that mirrors JSON from media.services.pbs.org.
  • image.pbs.org - Image CDN; hosts thumbnails.
  • player.pbs.org - A secondary video player, apparently intended for third-party embeds. yt-dlp can already pull video, audio, and subtitles from it, but they lack the metadata provided higher up the chain. The JSON for each video from producerplayer.services.pbskids.org includes a corresponding link to this domain.
  • kids.video.cdn.pbs.org - Hosts the video, audio, and subtitles, mainly in m3u8/ts formats. The JSON links directly only to the subtitles; the video and audio links go through urs.pbs.org.
  • urs.pbs.org - Redirector, used to obscure the direct links to the m3u8 streams in the producerplayer.services.pbskids.org JSON.
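
For anyone scripting against these hosts, the list above could be captured as a small lookup table. This is only a sketch: the role strings paraphrase the descriptions above, and `host_role` and the `www.` stripping are assumptions added for the example.

```python
from urllib.parse import urlsplit

# Roles paraphrased from the list above; the wording is illustrative only.
HOST_ROLES = {
    "pbskids.org": "public site with embedded players",
    "producerplayer.services.pbskids.org": "player metadata (JSON)",
    "image.pbs.org": "image CDN (thumbnails)",
    "player.pbs.org": "secondary player for third-party embeds",
    "kids.video.cdn.pbs.org": "video, audio and subtitle files",
    "urs.pbs.org": "redirector to the m3u8 streams",
    "media.services.pbs.org": "believed restricted to PBS staff and partners",
}


def host_role(url: str) -> str:
    """Return the role of the URL's host, or "unknown" if it is not listed."""
    # urlsplit().hostname is already lowercased, so mixed-case hosts still match.
    host = urlsplit(url).hostname or ""
    if host.startswith("www."):  # assumption: treat www.pbskids.org like pbskids.org
        host = host[4:]
    return HOST_ROLES.get(host, "unknown")
```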

@wertercatt
Author

wertercatt commented Jan 23, 2022

producerplayer.services.pbskids.org API info:
There are only two relevant endpoints.

  • /show-tracking - Returns the full list of available series slugs (87 items). It takes no parameters.
  • /show-list - Returns the metadata and sources for the actual streaming videos. It takes several URL query parameters:
  1. shows=cyberchase - The slug of the show, as listed by the show-tracking endpoint. For example, Cyberchase is cyberchase and Wild Kratts is wild-kratts. Technically optional, but without it the API indexes into the full 8191-item set of all videos, which presumably puts undue stress on PBS' servers.
  2. page=1 - The API response is paginated, and pages count from 1. If more pages are available, increment this value to fetch them all.
  3. page_size=50 - Controls how many video items are returned per page. It caps out at 50, which the server handles just fine, so it should be kept constant.
  4. available=public - If removed, the API also returns videos that are no longer accessible. Those videos don't return any usable links, so it's safe to discard them by keeping this parameter.
  5. sort=-encored_on - Optional; this is the default when the parameter is omitted. Acceptable values are encored_on, -encored_on, premiered_on, -premiered_on, title_sortable, and -title_sortable. A value without a dash sorts ascending; a value with a dash sorts descending. Each value names the JSON field used to sort the items. encored_on is like the YouTube public date for a video, while premiered_on is like the upload date.
  6. type= - Optional; defaults to an empty value. Acceptable values are the empty value, clip, preview, and full_length. It filters the returned videos by type. When empty or omitted, the API returns all available videos. clip videos are clips taken from episodes or small trailers for games on the PBSKids website. The preview type is unused: filtering by it against the full 8191-item set returned a 204 because no videos have that type. full_length videos are entire episodes; the type covers both standard-length cartoon episodes that aired on TV and short 'minisode' specials that only aired online.
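
The parameters above could be assembled roughly as follows. This is a sketch, not how any extractor actually does it: the function names are made up, the /show-list/ path with a trailing slash is taken from the URL in the issue description, and the stop condition for pagination (an empty or missing page) is an assumption, since the thread does not say how the API signals the last page. The caller supplies `fetch`, so no HTTP library is assumed.

```python
from typing import Any, Callable, Iterator
from urllib.parse import urlencode

API_BASE = "https://producerplayer.services.pbskids.org"
SORT_VALUES = {
    "encored_on", "-encored_on",
    "premiered_on", "-premiered_on",
    "title_sortable", "-title_sortable",
}
TYPE_VALUES = {"", "clip", "preview", "full_length"}
PAGE_SIZE = 50  # the cap reported above


def show_list_url(show: str, page: int = 1,
                  sort: str = "-encored_on", video_type: str = "") -> str:
    """Build a /show-list/ URL from the parameters described above."""
    if page < 1:
        raise ValueError("pages start counting from 1")
    if sort not in SORT_VALUES:
        raise ValueError(f"unsupported sort value: {sort!r}")
    if video_type not in TYPE_VALUES:
        raise ValueError(f"unsupported type value: {video_type!r}")
    params = {
        "shows": show,
        "page": page,
        "page_size": PAGE_SIZE,
        "available": "public",
        "sort": sort,
    }
    if video_type:  # an empty type means "no filter", so leave it out
        params["type"] = video_type
    return f"{API_BASE}/show-list/?{urlencode(params)}"


def iter_pages(show: str, fetch: Callable[[str], Any],
               max_pages: int = 200, **kwargs: Any) -> Iterator[Any]:
    """Yield decoded pages until fetch() returns something falsy.

    fetch(url) is caller-supplied and hypothetical: it should return the
    decoded JSON page, or None/empty when nothing more is available.
    """
    for page in range(1, max_pages + 1):
        data = fetch(show_list_url(show, page=page, **kwargs))
        if not data:
            return
        yield data
```

Keeping `fetch` as a parameter also makes the paging logic easy to exercise without touching PBS' servers.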

pukkandan pushed a commit that referenced this issue Jul 31, 2023
aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this issue Apr 21, 2024
3 participants