Broken novaplus.nova.cz extractor #24700

Closed
petr-ujezdsky opened this issue Apr 8, 2020 · 4 comments

Comments


@petr-ujezdsky petr-ujezdsky commented Apr 8, 2020

Checklist

  • I'm reporting a broken site support
  • I've verified that I'm running youtube-dl version 2020.03.24
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar issues including closed ones

Verbose log

$ youtube-dl --verbose https://novaplus.nova.cz/porad/masterchef-cesko/epizoda/43352-masterchef-cesko-16-dil
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'--verbose', u'https://novaplus.nova.cz/porad/masterchef-cesko/epizoda/43352-masterchef-cesko-16-dil']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2020.03.24
[debug] Python version 2.7.17 (CPython) - Darwin-18.7.0-x86_64-i386-64bit
[debug] exe versions: ffmpeg 4.2.2, ffprobe 4.2.2, rtmpdump 2.4
[debug] Proxy map: {}
[Nova] 43352-masterchef-cesko-16-dil: Downloading webpage
[NovaEmbed] F49yZucqsa3: Downloading webpage
ERROR: Unable to extract formats; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 797, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 530, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/nova.py", line 38, in _real_extract
    r'(?s)(?:src|bitrates)\s*=\s*({.+?})\s*;', webpage, 'formats'),
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 1005, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
RegexNotFoundError: Unable to extract formats; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Description

Unfortunately, none of the site's URLs work anymore. They still worked about a week ago.
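
The traceback above ends in `_search_regex` failing on the `(?:src|bitrates)` pattern from nova.py. The following is a minimal sketch of that failure mode, using the pattern exactly as it appears in the log. The two page fragments are made up for illustration, and the idea that the embed page served to a blocked client simply lacks the `src`/`bitrates` assignment is an assumption, not something confirmed in this thread.

```python
import re

# Pattern copied from the traceback (youtube_dl/extractor/nova.py, line 38).
FORMATS_RE = r'(?s)(?:src|bitrates)\s*=\s*({.+?})\s*;'


def find_formats_blob(webpage):
    """Return the raw object literal assigned to src/bitrates, or None."""
    match = re.search(FORMATS_RE, webpage)
    return match.group(1) if match else None


# Hypothetical page fragments, invented for this example only.
page_with_player = 'var src = {"mp4": ["https://example.invalid/video.mp4"]};'
page_without_player = '<div class="error">not available</div>'

print(find_formats_blob(page_with_player))     # the {...} literal
print(find_formats_blob(page_without_player))  # None
```

In youtube-dl the missing match is turned into `RegexNotFoundError`, which is the "Unable to extract formats" message in the log.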

Collaborator

@dstftw dstftw commented Apr 8, 2020

You must provide CZ proxy/VPN/SSH tunnel.

@dstftw dstftw closed this Apr 8, 2020
dstftw added a commit that referenced this issue Apr 8, 2020
Author

@petr-ujezdsky petr-ujezdsky commented Apr 8, 2020

Is this useful somehow, or are there some API calls involved?

curl https://novaplus.nova.cz/porad/masterchef-cesko/epizoda/43352-masterchef-cesko-16-dil > out.txt

out.txt


@cotwitch cotwitch commented Apr 23, 2020

@dstftw Hello Sergey, is there anything prepared for cases like this, for example a Docker image? I'm from the Czech Republic (a geo-allowed country for Nova.cz sites) and I should be able to prepare and manage a tunnel for you; just let me know. Best regards.


@Vangelis66 Vangelis66 commented May 2, 2020

You must provide CZ proxy/VPN/SSH tunnel.

I'm from Czech republic (geo-allowed country for Nova.cz sites)
and I should be able to prepare/manage tunnel

... Perhaps it would be useful (e.g. for future troubleshooting and/or for Czech expats) to clarify that, while the playlist API itself is indeed geo-fenced, neither the master MPEG-DASH manifest (the one offered to a desktop browser) nor the stream fragments are!

I used one of the freely available VPN extensions for Chrome (in reality, these are HTTPS proxies) with a Czech node and successfully loaded the player at the VOD URL posted by the OP:

https://novaplus.nova.cz/porad/masterchef-cesko/epizoda/43352-masterchef-cesko-16-dil

I then started playback (while connected through the browser proxy) and, in the Network tab of DevTools, captured the following MPD URI:

https://nova-ott-vod-prep-sec.ssl.cdn.cra.cz/Goa3-x2oTUX7HvkAXv2hwQ==,1588464979/vod_Nova/_definst_/0060/6437/cze-sd1-sd2-sd3-sd4-hd1-hd2-bz4nvVWp.smil/manifest.mpd

Fortunately, it is not tokenised (though it does carry an expiration Unix timestamp, 1588464979), so I then (quickly) used youtube-dl from my non-Czech IP to issue:

youtube-dl -F "https://nova-ott-vod-prep-sec.ssl.cdn.cra.cz/Goa3-x2oTUX7HvkAXv2hwQ==,1588464979/vod_Nova/_definst_/0060/6437/cze-sd1-sd2-sd3-sd4-hd1-hd2-bz4nvVWp.smil/manifest.mpd"

which returned:

[generic] manifest: Requesting header
WARNING: Falling back on generic information extractor.
[generic] manifest: Downloading webpage
[generic] manifest: Extracting information
[info] Available formats for manifest:
format code     extension  resolution note
p0aa0br128443   m4a        audio only [cze] DASH audio  128k , m4a_dash container, mp4a.40.2 (44100Hz)
p0va0br399738   mp4        512x288    DASH video  399k , mp4_dash container, avc1.4d4015, 25fps, video only
p0va0br799257   mp4        640x360    DASH video  799k , mp4_dash container, avc1.4d401e, 25fps, video only
p0va0br998948   mp4        768x432    DASH video  998k , mp4_dash container, avc1.4d401e, 25fps, video only
p0va0br1497710  mp4        1024x576   DASH video 1497k , mp4_dash container, avc1.4d401f, 25fps, video only
p0va0br3496750  mp4        1280x720   DASH video 3496k , mp4_dash container, avc1.4d401f, 25fps, video only
p0va0br4988812  mp4        1920x1080  DASH video 4988k , mp4_dash container, avc1.4d4028, 25fps, video only (best)
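
Since the manifest URL only works until the embedded timestamp passes, it can help to decode that value before starting a long download. Below is a small sketch that does so. It assumes, as stated above, that the number after the comma in the first path segment is a Unix expiry time; the helper name `manifest_expiry` is made up for this example.

```python
import re
from datetime import datetime, timezone

MPD_URL = (
    "https://nova-ott-vod-prep-sec.ssl.cdn.cra.cz/"
    "Goa3-x2oTUX7HvkAXv2hwQ==,1588464979/vod_Nova/_definst_/0060/6437/"
    "cze-sd1-sd2-sd3-sd4-hd1-hd2-bz4nvVWp.smil/manifest.mpd"
)


def manifest_expiry(url):
    """Return the expiry as an aware UTC datetime, or None if absent.

    Assumes the first path segment looks like "<opaque>,<unix-timestamp>".
    """
    match = re.search(r"//[^/]+/[^/,]+,(\d+)/", url)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


expiry = manifest_expiry(MPD_URL)
print(expiry.isoformat())  # 2020-05-03T00:16:19+00:00
```

Comparing the result with `datetime.now(timezone.utc)` shows how much time is left before the link stops working.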

I then verified that the fragments aren't blocked either by issuing:

youtube-dl --console-title -c --no-part -f "p0va0br3496750+p0aa0br128443" "https://nova-ott-vod-prep-sec.ssl.cdn.cra.cz/Goa3-x2oTUX7HvkAXv2hwQ==,1588464979/vod_Nova/_definst_/0060/6437/cze-sd1-sd2-sd3-sd4-hd1-hd2-bz4nvVWp.smil/manifest.mpd" -o "MasterChef Česko - 16. díl.mp4"

...which indeed started fetching the 720p25 encode (from a DIRECT, non-proxied connection):

[generic] manifest: Requesting header
WARNING: Falling back on generic information extractor.
[generic] manifest: Downloading webpage
[generic] manifest: Extracting information
[dashsegments] Total fragments: 1222
[download] Destination: MasterChef ?esko - 16. d?l.fp0va0br3496750.mp4
[download]  19.9% of ~1.50GiB at 658.51KiB/s ETA 40:24

...So, @dstftw, wouldn't it be nice if the nova.py InfoExtractor were made to work with the --geo-verification-proxy switch? Preliminary tests I ran with verified Czech HTTPS proxies indicate that it currently doesn't:

youtube-dl --geo-verification-proxy="http://94.230.153.117:80" -F "https://novaplus.nova.cz/porad/masterchef-cesko/epizoda/43352-masterchef-cesko-16-dil" =>

[Nova] 43352-masterchef-cesko-16-dil: Downloading webpage
[NovaEmbed] F49yZucqsa3: Downloading webpage
ERROR: Unable to extract formats; please report this issue on https://yt-dl.org/bug . 
Make sure you are using the latest version; type  youtube-dl -U  to update. 
Be sure to call youtube-dl with the --verbose flag and include its complete output.

while, OTOH:

youtube-dl --proxy="http://94.230.153.117:80" -F "https://novaplus.nova.cz/porad/masterchef-cesko/epizoda/43352-masterchef-cesko-16-dil" => 

[Nova] 43352-masterchef-cesko-16-dil: Downloading webpage
[NovaEmbed] F49yZucqsa3: Downloading webpage
[NovaEmbed] F49yZucqsa3: Downloading m3u8 information
[NovaEmbed] F49yZucqsa3: Downloading MPD manifest
[info] Available formats for F49yZucqsa3:
format code          extension  resolution note
dash-p0aa0br128443   m4a        audio only [cze] DASH audio  128k , m4a_dash container, mp4a.40.2 (44100Hz)
dash-p0va0br399738   mp4        512x288    DASH video  399k , mp4_dash container, avc1.4d4015, 25fps, video only
dash-p0va0br799257   mp4        640x360    DASH video  799k , mp4_dash container, avc1.4d401e, 25fps, video only
dash-p0va0br998948   mp4        768x432    DASH video  998k , mp4_dash container, avc1.4d401e, 25fps, video only
dash-p0va0br1497710  mp4        1024x576   DASH video 1497k , mp4_dash container, avc1.4d401f, 25fps, video only
dash-p0va0br3496750  mp4        1280x720   DASH video 3496k , mp4_dash container, avc1.4d401f, 25fps, video only
dash-p0va0br4988812  mp4        1920x1080  DASH video 4988k , mp4_dash container, avc1.4d4028, 25fps, video only
hls-122              mp4        160x90      122k
hls-540              mp4        512x288     540k
hls-950              mp4        640x360     950k
hls-1155             mp4        768x432    1155k
hls-1667             mp4        1024x576   1667k
hls-3715             mp4        1280x720   3715k
hls-5251             mp4        1920x1080  5251k  (best)
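
For context on what the request would involve: as far as I understand it, youtube-dl extractors opt in to `--geo-verification-proxy` by attaching an extra `Ytdl-request-proxy` header to just the geo-fenced request, via the `geo_verification_headers()` helper on `InfoExtractor`. The standalone function below is only a sketch of that idea, not the library's code. Which request in nova.py would need the header is an assumption left open here.

```python
def geo_verification_headers(params):
    """Build extra request headers from a youtube-dl style options dict.

    Standalone illustration: returns a per-request proxy header only when
    the "geo_verification_proxy" option is set.
    """
    headers = {}
    proxy = params.get("geo_verification_proxy")
    if proxy:
        headers["Ytdl-request-proxy"] = proxy
    return headers


# Proxy address taken from the commands above.
with_proxy = geo_verification_headers(
    {"geo_verification_proxy": "http://94.230.153.117:80"}
)
without_proxy = geo_verification_headers({})

print(with_proxy)     # {'Ytdl-request-proxy': 'http://94.230.153.117:80'}
print(without_proxy)  # {}
```

With such headers applied only to the geo-fenced playlist request, the manifest and fragments could still be fetched over the direct connection, which matches the behaviour observed above.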

Many thanks for your excellent work 👍, and stay safe from Covid-19!

bbepis referenced this issue in animelover1984/youtube-dl May 14, 2020