aljazeera.com: Unable to download videos not embedded in articles #29517

Open
5 tasks done
sebix opened this issue Jul 10, 2021 · 2 comments

Comments

@sebix

sebix commented Jul 10, 2021

Checklist

  • I'm reporting a broken site support
  • I've verified that I'm running youtube-dl version 2021.06.06
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar issues including closed ones

Verbose log

$ youtube-dl -v https://www.aljazeera.com/program/generation-change/2021/7/7/us-police-brutality-and-black-lives-matter
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v', 'https://www.aljazeera.com/program/generation-change/2021/7/7/us-police-brutality-and-black-lives-matter']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.06.06
[debug] Python version 3.8.10 (CPython) - Linux-5.12.13-1-default-x86_64-with-glibc2.2.5
[debug] exe versions: ffmpeg 4.4, ffprobe 4.4
[debug] Proxy map: {}
[AlJazeera] us-police-brutality-and-black-lives-matter: Downloading JSON metadata
ERROR: Unable to download JSON metadata: HTTP Error 400: Bad Request (caused by <HTTPError 400: 'Bad Request'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type  youtube-dl -U  to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
  File "/usr/bin/youtube-dl/youtube_dl/extractor/common.py", line 634, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/usr/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 2288, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib64/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib64/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/usr/lib64/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/lib64/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/usr/lib64/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)

Description

Trying to download any video from https://www.aljazeera.com/videos/ results in the error shown above. The Stream Detector Firefox add-on shows several m3u8 playlists, some of which point to further m3u8 playlists, but so far I have only found video streams (ts), and some of those also return errors. I was able to extract all the video streams, but so far without audio.
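
To help narrow down the nested playlists described above, here is a minimal sketch that classifies an m3u8 file by the HLS tags it contains. It assumes standard HLS tags (`#EXT-X-STREAM-INF`, `#EXT-X-MEDIA`, `#EXTINF`); the helper name `classify_m3u8` and the sample playlist are hypothetical and not taken from Al Jazeera's actual streams. One possible (unconfirmed) reason for video-only streams is that the audio is listed as a separate rendition in the master playlist.

```python
import re


def classify_m3u8(text):
    """Roughly classify an HLS playlist by the tags it contains (hypothetical helper)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    info = {
        # A master playlist lists variant streams via #EXT-X-STREAM-INF.
        'is_master': any(l.startswith('#EXT-X-STREAM-INF') for l in lines),
        # A media playlist lists the actual segments via #EXTINF.
        'has_segments': any(l.startswith('#EXTINF') for l in lines),
        'audio_uris': [],
    }
    for l in lines:
        # Separate audio renditions are declared with #EXT-X-MEDIA:TYPE=AUDIO.
        if l.startswith('#EXT-X-MEDIA:') and 'TYPE=AUDIO' in l:
            m = re.search(r'URI="([^"]+)"', l)
            if m:
                info['audio_uris'].append(m.group(1))
    return info


# Illustrative playlist only; the URIs are made up.
SAMPLE_MASTER = """#EXTM3U
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",NAME="English",URI="audio/en.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=1280000,AUDIO="audio"
video/720p.m3u8
"""

result = classify_m3u8(SAMPLE_MASTER)
print(result)
```

If `audio_uris` is non-empty for a master playlist, the audio would have to be fetched from those playlists and merged with the video afterwards.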

Videos that are embedded in articles work, e.g. https://www.aljazeera.com/economy/2021/7/10/g20-signs-off-on-landmark-global-tax-reform
Here, youtube-dl downloads the video just fine:

$ youtube-dl https://www.aljazeera.com/economy/2021/7/10/g20-signs-off-on-landmark-global-tax-reform
[generic] g20-signs-off-on-landmark-global-tax-reform: Requesting header
WARNING: Falling back on generic information extractor.
[generic] g20-signs-off-on-landmark-global-tax-reform: Downloading webpage
[generic] g20-signs-off-on-landmark-global-tax-reform: Extracting information
[download] Downloading playlist: G20 backs landmark global tax reform
[generic] playlist G20 backs landmark global tax reform: Collected 3 video ids (downloading 3 of them)
[download] Downloading video 1 of 3
[brightcove:new] 6257606655001: Downloading JSON metadata
[brightcove:new] 6257606655001: Downloading JSON metadata
[brightcove:new] 6257606655001: Downloading m3u8 information
[brightcove:new] 6257606655001: Downloading m3u8 information
[brightcove:new] 6257606655001: Downloading m3u8 information
[brightcove:new] 6257606655001: Downloading m3u8 information
[brightcove:new] 6257606655001: Downloading MPD manifest
[brightcove:new] 6257606655001: Downloading MPD manifest
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 16
[download] Destination: G7 nations reach historic deal to tax multinational corporations-6257606655001.fhls-4521-1.mp4
[download] 100% of 75.97MiB in 02:43
[dashsegments] Total fragments: 27
[download] Destination: G7 nations reach historic deal to tax multinational corporations-6257606655001.fdash-61c93800-abf5-4f58-8a09-4db01cac7056-1.m4a
[download] 100% of 2.41MiB in 01:29
[ffmpeg] Merging formats into "G7 nations reach historic deal to tax multinational corporations-6257606655001.mp4"
Deleting original file G7 nations reach historic deal to tax multinational corporations-6257606655001.fhls-4521-1.mp4 (pass -k to keep)
Deleting original file G7 nations reach historic deal to tax multinational corporations-6257606655001.fdash-61c93800-abf5-4f58-8a09-4db01cac7056-1.m4a (pass -k to keep)
[download] Downloading video 2 of 3
...

I'm sorry that I can't contribute more here. Please let me know if there's anything that I could help with.

@8chanAnon

The relevant HTML snippet looks like this:

"embedUrl": "https://players.brightcove.net/665003303001/6tKQRAx7lu_default/index.html?videoId=6262384329001"

Al Jazeera pages normally only contain the Brightcove account number and video id, not the full video link.
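
As a rough illustration, the `embedUrl` above can be split into the Brightcove account number, player id and video id. This is only a sketch: the helper name `parse_brightcove_embed_url` is hypothetical, and the URL layout is assumed from the single example in the snippet, not from any Brightcove documentation.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed path layout, based on the one embedUrl shown above:
#   /<account_id>/<player_id>_<embed>/index.html
_BRIGHTCOVE_PATH_RE = re.compile(
    r'^/(?P<account_id>\d+)/(?P<player_id>[^/]+)_(?P<embed>[^/]+)/index\.html$')


def parse_brightcove_embed_url(embed_url):
    """Split a Brightcove embedUrl into its parts (hypothetical helper)."""
    parsed = urlparse(embed_url)
    if parsed.netloc != 'players.brightcove.net':
        return None
    m = _BRIGHTCOVE_PATH_RE.match(parsed.path)
    video_ids = parse_qs(parsed.query).get('videoId')
    if not m or not video_ids:
        return None
    return {
        'account_id': m.group('account_id'),
        'player_id': m.group('player_id'),
        'embed': m.group('embed'),
        'video_id': video_ids[0],
    }


info = parse_brightcove_embed_url(
    'https://players.brightcove.net/665003303001/6tKQRAx7lu_default/'
    'index.html?videoId=6262384329001')
print(info)
```

The resulting player URL is of the kind that the `[brightcove:new]` extractor handled in the working log above, so passing it to youtube-dl directly may be a workaround.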

@sebix
Author

sebix commented Aug 23, 2021

yt-dlp/yt-dlp#763 could be a related fix

nixxo pushed a commit to nixxo/yt-dlp that referenced this issue Nov 22, 2021