VrtNU extractor broken #27707

Open
5 tasks done
covert8 opened this issue Jan 7, 2021 · 1 comment

covert8 commented Jan 7, 2021

Checklist

  • I'm reporting a broken site support
  • I've verified that I'm running youtube-dl version 2021.01.03
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar issues including closed ones

Verbose log

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-f', 'best', 'https://www.vrt.be/vrtnu/a-z/terzake/', '--verbose', '--print-traffic']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.01.03
[debug] Python version 3.9.1 (CPython) - Linux-5.9.14-arch1-1-x86_64-with-glibc2.32
[debug] exe versions: ffmpeg 4.3.1, ffprobe 4.3.1, rtmpdump 2.4
[debug] Proxy map: {}
[VrtNU] terzake: Downloading webpage
send: b'GET /vrtnu/a-z/terzake/ HTTP/1.1\r\nHost: www.vrt.be\r\nUser-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3775.5 Safari/537.36\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nAccept-Encoding: gzip, deflate\r\nAccept-Language: en-us,en;q=0.5\r\nConnection: close\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Content-Type: text/html;charset=utf-8
header: Content-Length: 7721
header: Connection: close
header: Date: Thu, 07 Jan 2021 10:23:07 GMT
header: X-Content-Type-Options: nosniff
header: Expires: Thu, 07 Jan 2021 10:24:01 GMT
header: Cache-Control: max-age=300
header: X-UA-Compatible: IE=edge
header: Content-Encoding: gzip
header: X-Served-By: i-0632883e90d7e8d22
header: Accept-Ranges: bytes
header: Vary: Accept-Encoding
header: X-Cache: Miss from cloudfront
header: Via: 1.1 7d12bef71f48487e9202b581d949876e.cloudfront.net (CloudFront)
header: X-Amz-Cf-Pop: BRU50-C1
header: X-Amz-Cf-Id: TVWg6_jEmI_EI5KtRYQZEIYXxXz5UK52rlU9mc3BoKH_JVJTlOrIAw==
header: Age: 245
[VrtNU] terzake: Downloading JSON metadata
send: b'GET /vrtnu/a-z/terzake.mssecurevideo.json HTTP/1.1\r\nHost: www.vrt.be\r\nUser-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3775.5 Safari/537.36\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nAccept-Encoding: gzip, deflate\r\nAccept-Language: en-us,en;q=0.5\r\nConnection: close\r\n\r\n'
reply: 'HTTP/1.1 410 Gone\r\n'
header: Content-Type: text/html; charset=utf-8
header: Content-Length: 238
header: Connection: close
header: Date: Thu, 07 Jan 2021 10:23:07 GMT
header: Server: Varnish
header: X-Varnish: 179536045
header: X-Robots-Tag: noindex, nofollow
header: Cache-Control: max-age=604800
header: Retry-After: 5
header: X-Cache: Error from cloudfront
header: Via: 1.1 32e3b86ae254a231182567c0124af893.cloudfront.net (CloudFront)
header: X-Amz-Cf-Pop: FRA2-C2
header: X-Amz-Cf-Id: 8fI1TmRF_DZywQ5k0BDRZfiAuK-ioTVYDRlO7EkzFRReDmL_e5A0vA==
ERROR: Unable to download JSON metadata: HTTP Error 410: Gone (caused by <HTTPError 410: 'Gone'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
  File "/usr/lib/python3.9/site-packages/youtube_dl/extractor/common.py", line 632, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/usr/lib/python3.9/site-packages/youtube_dl/YoutubeDL.py", line 2248, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/usr/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.9/urllib/request.py", line 561, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.9/urllib/request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
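
Judging only from the two requests in the log above, the metadata URL that returns 410 appears to be the page URL with its trailing slash removed and `.mssecurevideo.json` appended. A minimal sketch of that derivation follows; the helper name `metadata_url` is hypothetical, and this is not necessarily how the extractor builds the URL internally.

```python
from urllib.parse import urlsplit, urlunsplit


def metadata_url(page_url: str) -> str:
    """Rebuild the metadata URL seen in the log from a page URL.

    Assumption: the suffix '.mssecurevideo.json' is appended to the
    page path after stripping the trailing slash, as the log suggests.
    """
    parts = urlsplit(page_url)
    path = parts.path.rstrip("/") + ".mssecurevideo.json"
    # Drop query and fragment; the logged request carries neither.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(metadata_url("https://www.vrt.be/vrtnu/a-z/terzake/"))
```

Requesting the resulting URL is what currently yields `HTTP Error 410: Gone`.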

Description

The website vrt.nu has switched its delivery to Akamai, and the relevant IDs seem to be located in a different JSON (e.g. https://remix-cf-vrt.akamaized.net/remix/$ID/remix.ism/.m3u8). The mentioned ID is part of the JSON response from https://media-services-public.vrt.be. The request seems to originate from Sentry (https://github.com/getsentry/sentry-javascript). I don't have the time to investigate further; hopefully I'll be looking into this myself at a later date. Pointers on how to find the source of the magic JSON would be gladly received.
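
To make the URL pattern above concrete, here is a rough sketch that fills the `$ID` placeholder in the Akamai manifest URL. The function name `manifest_url` and the sample ID are hypothetical; how to obtain a real ID from the https://media-services-public.vrt.be response is exactly the open question, so it is not covered here.

```python
from string import Template
from urllib.parse import quote

# URL pattern copied from the issue description; $ID is the only variable part.
MANIFEST_TEMPLATE = Template(
    "https://remix-cf-vrt.akamaized.net/remix/$ID/remix.ism/.m3u8"
)


def manifest_url(media_id: str) -> str:
    """Return the HLS manifest URL for an ID (assumed to be known already)."""
    # Percent-encode the ID so it cannot alter the path structure.
    return MANIFEST_TEMPLATE.substitute(ID=quote(media_id, safe=""))


# 'example-id' is a placeholder, not a real VRT media ID.
print(manifest_url("example-id"))
```

If the pattern holds, the resulting `.m3u8` URL could presumably be passed to youtube-dl directly as a workaround, though that has not been verified here.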

@covert8 covert8 changed the title VRT extractor broken VrtNU extractor broken Jan 7, 2021

covert8 commented Jan 7, 2021

#11873
