NHK World vods posted after September 26th, 2023 are broken #8242

Closed
11 tasks done
chrominance opened this issue Sep 30, 2023 · 5 comments · Fixed by #8249
Labels
site-bug Issue with a specific website

Comments

chrominance commented Sep 30, 2023


Region

Canada (but presumably worldwide)

Provide a description that is worded well enough to be understood

It appears NHK World has changed their video player recently, as well as the underlying infrastructure. As a result, episode VODs posted after the transition can't be downloaded by yt-dlp.

There is a Reddit post with other people who have noticed the issue: https://reddit.com/r/NHKWorldFans/comments/16tkdyu/new_video_player_on_nhk/

Examples of URLs that still work for now:
https://www3.nhk.or.jp/nhkworld/en/ondemand/video/4026201/ (Document 72 Hours, Sept. 12)
https://www3.nhk.or.jp/nhkworld/en/ondemand/video/2049133/ (Japan Railway Journal, Sept. 14)

Examples of URLs that don't work:
https://www3.nhk.or.jp/nhkworld/en/ondemand/video/2049134/ (Japan Railway Journal, Sept. 28)
https://www3.nhk.or.jp/nhkworld/en/ondemand/video/2073139/ (J-Arena, Sept. 29)
https://www3.nhk.or.jp/nhkworld/en/ondemand/video/2093047/ (Zero Waste Life, Sept. 29)

The problem is that the Piksel movie-s.nhk.or.jp URL (https://movie-s.nhk.or.jp/v/refid/nhkworld/prefid/nw_vod_v_en_2049_134_20230928233000_01_1695913422) returns a "no data found" message from the server instead of the usual HTML page, which the Piksel extractor parses to retrieve the app token and other values.

A quick scan of the assets API endpoint (https://movie-s.nhk.or.jp/ws/ws_asset/api/67f5b750-b419-11e9-8a16-0e45e8988f42/mode/json/apiv/5?sortdir=desc) shows nothing new since September 26th, and querying that API for the affected episodes returns no results. Compare https://movie-s.nhk.or.jp/ws/ws_asset/api/67f5b750-b419-11e9-8a16-0e45e8988f42/mode/json/apiv/5?title=%25umineko%25&sortdir=asc&start=0&end=99, which returns the Document 72 Hours episode above that still works, with https://movie-s.nhk.or.jp/ws/ws_asset/api/67f5b750-b419-11e9-8a16-0e45e8988f42/mode/json/apiv/5?title=%25hokuriku%25&sortdir=asc&start=0&end=99, which returns two Cycle Around Japan episodes but NOT the Japan Railway Journal episode that also has "Hokuriku" in its title.
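For anyone who wants to repeat the title lookup with other keywords, here is a minimal sketch of how the search URLs quoted in this comment can be built. The endpoint and project UUID are copied from those URLs; the function name is hypothetical, and treating `%keyword%` as a wildcard match is an assumption based only on how the example queries behave.

```python
from urllib.parse import urlencode

# Endpoint and project UUID copied from the URLs quoted in this comment.
ASSETS_API = (
    "https://movie-s.nhk.or.jp/ws/ws_asset/api/"
    "67f5b750-b419-11e9-8a16-0e45e8988f42/mode/json/apiv/5"
)


def build_title_search_url(keyword, start=0, end=99):
    """Return an assets-API URL that searches for titles containing `keyword`.

    Hypothetical helper: it only reproduces the query-string shape seen in
    the issue and does not contact the server.
    """
    params = {
        "title": f"%{keyword}%",  # urlencode escapes each "%" as "%25"
        "sortdir": "asc",
        "start": start,
        "end": end,
    }
    return f"{ASSETS_API}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_title_search_url("hokuriku"))
```

Calling `build_title_search_url("hokuriku")` reproduces the second comparison URL quoted in this comment, so the same helper can be pointed at other title fragments.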

It is possible to retrieve video streams from the new episodes, though I can't find any 1080p versions anymore. On the NHK World VOD pages, a script calls api01-platform.stream.co.jp/apiservice/getMediaByParam/, which returns a JSON response with a movie_url object containing several m3u8 URLs. These contain a 320x180, 640x360 and two 1280x720 stream playlists. Dropping one of the playlist URLs directly into yt-dlp works fine, though I do get a warning "Live HLS streams are not supported by the native downloader. If this is a livestream, please add "--downloader ffmpeg --hls-use-mpegts" to your command".
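To check which resolutions a given playlist offers before downloading, something like the following sketch could be used. It parses the standard `#EXT-X-STREAM-INF` tags of an HLS master playlist. The sample playlist is illustrative only: the bandwidth values and paths are made up, and only the four resolutions match what is described in this comment. It also assumes the URL in question is a master playlist that lists these variants.

```python
import re

# Illustrative master playlist. Bandwidths and paths are invented;
# only the resolutions mirror the ones reported in the issue.
SAMPLE_MASTER = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=400000,RESOLUTION=320x180
180p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=900000,RESOLUTION=640x360
360p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2000000,RESOLUTION=1280x720
720p_low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=3500000,RESOLUTION=1280x720
720p_high/index.m3u8
"""


def list_variants(master_text):
    """Return (width, height, bandwidth, uri) tuples for each variant stream.

    Variants without a RESOLUTION attribute are skipped in this sketch.
    """
    lines = [ln.strip() for ln in master_text.splitlines() if ln.strip()]
    variants = []
    for i, line in enumerate(lines):
        if not line.startswith("#EXT-X-STREAM-INF:"):
            continue
        bandwidth = re.search(r"[:,]BANDWIDTH=(\d+)", line)
        resolution = re.search(r"[:,]RESOLUTION=(\d+)x(\d+)", line)
        # The variant URI is the next line that is not a tag or comment.
        uri = next((ln for ln in lines[i + 1:] if not ln.startswith("#")), None)
        if bandwidth and resolution and uri:
            variants.append((
                int(resolution.group(1)),
                int(resolution.group(2)),
                int(bandwidth.group(1)),
                uri,
            ))
    return variants


def best_variant(variants):
    """Pick the tallest variant, breaking ties by higher bandwidth."""
    return max(variants, key=lambda v: (v[1], v[2]))


if __name__ == "__main__":
    for width, height, bandwidth, uri in list_variants(SAMPLE_MASTER):
        print(f"{width}x{height} @ {bandwidth} bps -> {uri}")
```

With two 1280x720 entries, the tie-break on bandwidth is what separates the higher-bitrate 720p stream from the lower one.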

Provide verbose output that clearly demonstrates the problem

Complete Verbose Output

[debug] Command-line config: ['https://www3.nhk.or.jp/nhkworld/en/ondemand/video/2049134/', '-vU']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2023.09.24 [088add956] (pip)
[debug] Python 3.10.8 (CPython AMD64 64bit) - Windows-10-10.0.22621-SP0 (OpenSSL 1.1.1q  5 Jul 2022)
[debug] exe versions: ffmpeg git-2020-08-09-6e951d0, ffprobe 6.0-full_build-www.gyan.dev
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15.1, mutagen-1.45.1, sqlite3-3.37.2, websockets-10.3
[debug] Proxy map: {}
[debug] Loaded 1886 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Available version: stable@2023.09.24, Current version: stable@2023.09.24
yt-dlp is up to date (stable@2023.09.24)
[NhkVod] Extracting URL: https://www3.nhk.or.jp/nhkworld/en/ondemand/video/2049134/
[NhkVod] 2049-134: Downloading JSON metadata
[Piksel] Extracting URL: https://movie-s.nhk.or.jp/v/refid/nhkworld/prefid/nw_vod_v_en_2049_134_20230928233000_01_1695913422
[Piksel] nw_vod_v_en_2049_134_20230928233000_01_1695913422: Downloading webpage
ERROR: [Piksel] nw_vod_v_en_2049_134_20230928233000_01_1695913422: Unable to extract app token; please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U
  File "C:\Users\userX\AppData\Local\Programs\Python\Python310\lib\site-packages\yt_dlp\extractor\common.py", line 715, in extract
    ie_result = self._real_extract(url)
  File "C:\Users\userX\AppData\Local\Programs\Python\Python310\lib\site-packages\yt_dlp\extractor\piksel.py", line 82, in _real_extract
    app_token = self._search_regex([
  File "C:\Users\userX\AppData\Local\Programs\Python\Python310\lib\site-packages\yt_dlp\extractor\common.py", line 1263, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
chrominance added the site-bug (Issue with a specific website) and triage (Untriaged issue) labels Sep 30, 2023
@garret1317
Collaborator

new system

prod: {
     postMessageTargetOrigin: window.location.origin,
     allowDomainListJsonUrl: "https://movie-a.nhk.or.jp/world/player/config/domain-list.json",
     geoApiUrl: "https://geocontrol1.stream.ne.jp/nhk-a-geo/check.jsonp",
     playerBaseUrl: "/nhkworld/common/player/tv/vod/world/player/",
     eq: {
         apiUrl: "https://api01-platform.stream.co.jp/apiservice/getMediaByParam/",
         subtitleApiUrl: "https://api01-platform.stream.co.jp/apiservice/getSubtitleList/",
         token: "NDc4NThCNTkxQzFCNkQ3ODA4NjcwNTZGREYzNURBNzM=",
         customMetadataNames: {
             domesticFlag: "domesticFlag",
             embedType: "embedType",
             aaType: "aaType"
         }
     },

in https://movie-a.nhk.or.jp/world/player/js/movie-player.js (minified)

https://api01-platform.stream.co.jp/apiservice/getMediaByParam/?token=NDc4NThCNTkxQzFCNkQ3ODA4NjcwNTZGREYzNURBNzM=&type=json&optional_id=nw_vod_v_en_2049_134_20230928233000_01_1695913422&active_flg=1
stream metadata
optional_id is not optional if you want to download the right programme
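A rough sketch of how that request URL is put together, in case it helps whoever writes the fix. The endpoint, token and parameter names are taken from the config and URL in this comment; the function name is hypothetical, and nothing here is the extractor's actual code.

```python
from urllib.parse import urlencode

# Values copied from the player config quoted in this comment.
API_URL = "https://api01-platform.stream.co.jp/apiservice/getMediaByParam/"
TOKEN = "NDc4NThCNTkxQzFCNkQ3ODA4NjcwNTZGREYzNURBNzM="


def build_media_url(optional_id):
    """Return the getMediaByParam URL for one programme.

    Hypothetical helper: it only builds the URL and does not send a request.
    """
    params = {
        "token": TOKEN,
        "type": "json",
        "optional_id": optional_id,  # selects the programme
        "active_flg": 1,
    }
    # safe="=" keeps the base64 padding unescaped, matching the URL above.
    return f"{API_URL}?{urlencode(params, safe='=')}"


if __name__ == "__main__":
    print(build_media_url("nw_vod_v_en_2049_134_20230928233000_01_1695913422"))
```

The `optional_id` here is the same identifier that appears at the end of the old Piksel URL in the verbose log.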

Screenshot from 2023-09-30 11-19-37
lots of m3u8s
the auto ones are the ones you want
site uses auto_pc, but mb_auto has more info and a higher-bitrate 720p format (in addition to the lower-rate one)
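A small sketch of that preference, assuming `movie_url` is a flat mapping from names such as `mb_auto` and `auto_pc` to playlist URLs. The exact JSON shape is not shown in this thread, so that layout is an assumption, and the sample URLs are placeholders.

```python
# Preference order suggested in this comment: mb_auto first, then auto_pc.
PREFERRED_KEYS = ("mb_auto", "auto_pc")


def pick_playlist(movie_url):
    """Return (key, url) for the preferred playlist, or None if neither exists.

    Assumes `movie_url` maps playlist names to m3u8 URLs; the real
    response layout may differ.
    """
    for key in PREFERRED_KEYS:
        url = movie_url.get(key)
        if url:
            return key, url
    return None


if __name__ == "__main__":
    # Placeholder data, not a real API response.
    sample = {
        "auto_pc": "https://example.invalid/auto_pc.m3u8",
        "mb_auto": "https://example.invalid/mb_auto.m3u8",
    }
    print(pick_playlist(sample))
```

Falling back to `auto_pc` keeps downloads working if a response ever lacks the `mb_auto` entry.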

the js has a subtitleApiUrl but I'm not sure if it's actually needed; the subtitles appear to be included in the m3u8

garret1317 removed the triage (Untriaged issue) label Sep 30, 2023
october262 commented Oct 1, 2023

for this link - https://www3.nhk.or.jp/nhkworld/en/ondemand/video/2049134/
use the extension called "the stream detector" - set it to yt-dlp to grab the m3u8 and download the video.

Contik commented Oct 7, 2023

While English-language videos at /en/ now download correctly, I'm seeing that, for example, the daily noon and evening news videos in Japanese at https://www3.nhk.or.jp/nhkworld/ja/ondemand/video produce:

[debug] Command-line config: ['-v', 'https://www3.nhk.or.jp/nhkworld/ja/ondemand/video/0451269387/']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2023.10.07 [377e85a17]
[debug] Python 3.11.5 (CPython x86_64 64bit) - Linux-6.5.5-arch1-1-x86_64-with-glibc2.38 (OpenSSL 3.1.3 19 Sep 2023, glibc 2.38)
[debug] exe versions: ffmpeg 6.0 (setts), ffprobe 6.0, rtmpdump 2.4
[debug] Optional libraries: Cryptodome-3.12.0, certifi-2023.07.22, sqlite3-3.43.1, websockets-10.4
[debug] Proxy map: {}
[debug] Loaded 1886 extractors
[NhkVod] Extracting URL: https://www3.nhk.or.jp/nhkworld/ja/ondemand/video/0451269387/
[NhkVod] 0451-269: Downloading JSON metadata
ERROR: list index out of range
Traceback (most recent call last):
  File "/usr/lib/python3.11/site-packages/yt_dlp/YoutubeDL.py", line 1567, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/yt_dlp/YoutubeDL.py", line 1702, in __extract_info
    ie_result = ie.extract(url)
                ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/yt_dlp/extractor/common.py", line 715, in extract
    ie_result = self._real_extract(url)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/yt_dlp/extractor/nhk.py", line 205, in _real_extract
    return self._extract_episode_info(url)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/yt_dlp/extractor/nhk.py", line 77, in _extract_episode_info
    episode = self._call_api(
              ^^^^^^^^^^^^^^^
IndexError: list index out of range
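The traceback is cut off, so the exact expression that failed is not shown. One plausible reading, offered only as an assumption, is that the API returned an empty list and the code then took its first element. A minimal illustration of that failure mode and a guarded alternative (the helper name is hypothetical and is not yt-dlp code):

```python
def first_episode(episodes):
    """Return the first episode entry, or None when the list is empty."""
    return episodes[0] if episodes else None


if __name__ == "__main__":
    empty_response = []  # stand-in for an API result with no episodes
    try:
        empty_response[0]
    except IndexError as exc:
        print(f"IndexError: {exc}")
    print(first_episode(empty_response))
```

A guard like this would let an extractor report a clearer "no episode found" error instead of a bare IndexError.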

Not sure if these videos were ever intended to function correctly. When I first tried this a few days ago I couldn't download them and simply assumed it had the same root cause mentioned in this ticket.

These videos are intended for Japanese viewers abroad, so they can be downloaded only from outside Japan. The site sends different HTTP response bodies depending on whether it perceives the request's source IP address to be inside or outside Japan. When outside of Japan the page shows:



As I understand it, the videos highlighted with a red border have no retention: NHK only ever offers the current day's video for download. These two examples are today's 7 pm news (ニュース7 aka nyusu 7) and noon news (正午のニュース aka shogo no nyusu).

"Within" Japan you'll get:


Please use NHK Plus or NHK World Premium


Basically, it asks you to use your NHK Plus account to watch a show you missed, or to get yourself an NHK World Premium subscription.

Is the list index out of range situation intentional for videos from /ja/ URLs?

Member

bashonly commented Oct 7, 2023

@Contik list index out of range is not intentional. Could you open a new issue about this, please?

Contik commented Oct 7, 2023

@bashonly sure thing, opened issue #8303

aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this issue Apr 21, 2024