[instagram] "General metadata extraction failed" / extractor needs better fallback(s) #7165

Open
9 of 11 tasks
clemantino opened this issue May 29, 2023 · 5 comments
Labels
site-bug Issue with a specific website

Comments

@clemantino
DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE

  • I understand that I will be blocked if I intentionally remove or skip any mandatory* field

Checklist

Region

Austria

Provide a description that is worded well enough to be understood

Whenever I try to download an Instagram Reel with yt-dlp, I get a "General metadata extraction failed" warning. The fallback attempts that follow (main webpage, then embed webpage) fail as well, so the process aborts with an error and the video is not downloaded.

I have attached the code that showcases the error message for reference.

Code:
.\yt-dlp.exe -f bestvideo[ext=mp4]+bestaudio[ext=m4a]/mp4 -o "%(title)s.%(ext)s" https://www.instagram.com/reel/CrkqPiFBUvl/
[Instagram] Extracting URL: https://www.instagram.com/reel/CrkqPiFBUvl/
[Instagram] CrkqPiFBUvl: Setting up session
WARNING: [Instagram] CrkqPiFBUvl: No csrf token set by Instagram API
[Instagram] CrkqPiFBUvl: Downloading JSON metadata
WARNING: [Instagram] CrkqPiFBUvl: General metadata extraction failed (some metadata might be missing).
[Instagram] CrkqPiFBUvl: Downloading webpage
WARNING: [Instagram] unable to extract shared data; please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U
WARNING: [Instagram] Main webpage is locked behind the login page. Retrying with embed webpage (some metadata might be missing).
[Instagram] CrkqPiFBUvl: Downloading embed webpage
WARNING: [Instagram] unable to extract additional data; please report this issue on https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using yt-dlp -U
ERROR: [Instagram] CrkqPiFBUvl: Requested content is not available, rate-limit reached or login required. Use --cookies, --cookies-from-browser, --username and --password, or --netrc (instagram) to provide account credentials

Provide verbose output that clearly demonstrates the problem

  • Run your yt-dlp command with -vU flag added (yt-dlp -vU <your command line>)
  • If using API, add 'verbose': True to YoutubeDL params instead
  • Copy the WHOLE output (starting with [debug] Command-line config) and insert it below

Complete Verbose Output

.\yt-dlp.exe -vU .\yt-dlp.exe -f bestvideo[ext=mp4]+bestaudio[ext=m4a]/mp4 -o "%(title)s.%(ext)s" https://www.instagram.com/reel/CrkqPiFBUvl/
[debug] Command-line config: ['-vU', '.\\yt-dlp.exe', '-f', 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/mp4', '-o', '%(title)s.%(ext)s', 'https://www.instagram.com/reel/CrkqPiFBUvl/']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version nightly@2023.05.29.101744 [2d306c03d] (win_exe)
[debug] Python 3.8.10 (CPython AMD64 64bit) - Windows-10-10.0.22621-SP0 (OpenSSL 1.1.1k  25 Mar 2021)
[debug] exe versions: ffmpeg 6.0-essentials_build-www.gyan.dev (setts), ffprobe 6.0-essentials_build-www.gyan.dev
[debug] Optional libraries: Cryptodome-3.18.0, brotli-1.0.9, certifi-2023.05.07, mutagen-1.46.0, sqlite3-2.6.0, websockets-11.0.3
[debug] Proxy map: {}
[debug] Loaded 1836 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp-nightly-builds/releases/latest
Available version: nightly@2023.05.29.105216, Current version: nightly@2023.05.29.101744
Current Build Hash: 78139f063bf3c5939c7f8a473cc1b1eafbae74b5e5d327a1ca140080caab244d
[debug] Downloading _update_spec from https://github.com/yt-dlp/yt-dlp-nightly-builds/releases/latest/download/_update_spec
Updating to nightly@2023.05.29.105216 ...
[debug] Downloading yt-dlp.exe from https://github.com/yt-dlp/yt-dlp-nightly-builds/releases/latest/download/yt-dlp.exe
[debug] Downloading SHA2-256SUMS from https://github.com/yt-dlp/yt-dlp-nightly-builds/releases/latest/download/SHA2-256SUMS
Updated yt-dlp to nightly@2023.05.29.105216
[debug] Restarting: "C:\Users\cleme\yt-dlp.exe" -vU ".\yt-dlp.exe" -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/mp4" -o "%(title)s.%(ext)s" "https://www.instagram.com/reel/CrkqPiFBUvl/"
[debug] Command-line config: ['-vU', '.\\yt-dlp.exe', '-f', 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/mp4', '-o', '%(title)s.%(ext)s', 'https://www.instagram.com/reel/CrkqPiFBUvl/']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version nightly@2023.05.29.105216 [489f51279] (win_exe)
[debug] Python 3.8.10 (CPython AMD64 64bit) - Windows-10-10.0.22621-SP0 (OpenSSL 1.1.1k  25 Mar 2021)
[debug] exe versions: ffmpeg 6.0-essentials_build-www.gyan.dev (setts), ffprobe 6.0-essentials_build-www.gyan.dev
[debug] Optional libraries: Cryptodome-3.18.0, brotli-1.0.9, certifi-2023.05.07, mutagen-1.46.0, sqlite3-2.6.0, websockets-11.0.3
[debug] Proxy map: {}
[debug] Loaded 1837 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp-nightly-builds/releases/latest
Available version: nightly@2023.05.29.105216, Current version: nightly@2023.05.29.105216
Current Build Hash: e8f51980cb49f239307fa9c3cbc477e9ac9328a72ca8805205951405486a00b5
yt-dlp is up to date (nightly@2023.05.29.105216)
[generic] Extracting URL: .\yt-dlp.exe
ERROR: [generic] None: '.\\yt-dlp.exe' is not a valid URL. Set --default-search "ytsearch" (or run  yt-dlp "ytsearch:.\yt-dlp.exe" ) to search YouTube
  File "yt_dlp\extractor\common.py", line 698, in extract
  File "yt_dlp\extractor\generic.py", line 2403, in _real_extract

[Instagram] Extracting URL: https://www.instagram.com/reel/CrkqPiFBUvl/
[Instagram] CrkqPiFBUvl: Setting up session
WARNING: [Instagram] CrkqPiFBUvl: No csrf token set by Instagram API
[Instagram] CrkqPiFBUvl: Downloading JSON metadata
WARNING: [Instagram] CrkqPiFBUvl: General metadata extraction failed (some metadata might be missing).
[Instagram] CrkqPiFBUvl: Downloading webpage
WARNING: [Instagram] unable to extract shared data; please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U
WARNING: [Instagram] Main webpage is locked behind the login page. Retrying with embed webpage (some metadata might be missing).
[Instagram] CrkqPiFBUvl: Downloading embed webpage
WARNING: [Instagram] unable to extract additional data; please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U
ERROR: [Instagram] CrkqPiFBUvl: Requested content is not available, rate-limit reached or login required. Use --cookies, --cookies-from-browser, --username and --password, or --netrc (instagram) to provide account credentials
  File "yt_dlp\extractor\common.py", line 698, in extract
  File "yt_dlp\extractor\instagram.py", line 456, in _real_extract
  File "yt_dlp\extractor\common.py", line 1158, in raise_login_required
clemantino added the site-bug (Issue with a specific website) and triage (Untriaged issue) labels May 29, 2023
@Irevirt
Irevirt commented May 30, 2023

I have this problem too. I also noticed that trying to open a reel on a browser results in a blank page (note that I don't have an account).

bashonly mentioned this issue May 30, 2023
bashonly changed the title from "General metadata extraction failed (some metadata might be missing)" to "[instagram] General metadata extraction failed - extractor needs better fallback(s)" May 30, 2023
bashonly removed the triage (Untriaged issue) label May 30, 2023
bashonly changed the title from "[instagram] General metadata extraction failed - extractor needs better fallback(s)" to '[instagram] "General metadata extraction failed" / extractor needs better fallback(s)' May 30, 2023
@bashonly
Member

bashonly commented May 30, 2023

WARNING: [Instagram] Main webpage is locked behind the login page.
ERROR: [Instagram] CrkqPiFBUvl: Requested content is not available, rate-limit reached or login required. Use --cookies, --cookies-from-browser

The warning/error messages point to a workaround: pass cookies from a logged-in browser session. NOTE: this is risky, as IG can detect yt-dlp usage and does not hesitate to ban accounts.
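
For anyone using the Python API rather than the CLI, the cookie workaround could look roughly like the sketch below. It only builds the options dict; `build_opts` and `download` are hypothetical helper names, and the format/output values are copied from the command in the report. The `cookiesfrombrowser` and `cookiefile` keys are the API counterparts of `--cookies-from-browser` and `--cookies`. The same account-ban risk applies.

```python
URL = "https://www.instagram.com/reel/CrkqPiFBUvl/"


def build_opts(browser=None, cookie_file=None):
    """Build a YoutubeDL params dict (hypothetical helper, not part of yt-dlp)."""
    opts = {
        "verbose": True,  # API equivalent of -v, as the issue template asks
        "format": "bestvideo[ext=mp4]+bestaudio[ext=m4a]/mp4",
        "outtmpl": "%(title)s.%(ext)s",
    }
    if browser:
        # Tuple form: (browser_name,) with optional profile/keyring/container
        opts["cookiesfrombrowser"] = (browser,)
    if cookie_file:
        # Path to a Netscape-format cookies file
        opts["cookiefile"] = cookie_file
    return opts


def download(url, opts):
    """Run the download; requires the third-party yt_dlp package and network access."""
    import yt_dlp  # imported lazily so the helper above works without it

    with yt_dlp.YoutubeDL(opts) as ydl:
        return ydl.download([url])
```

Calling `download(URL, build_opts(browser="firefox"))` would then use the cookies of a logged-in Firefox session, assuming such a session exists on the machine.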

Also note that IG blocks data center IP addresses, and it seems that even some users on residential IPs can't access IG's GraphQL API at all (unrelated to the rate limit). Because of this, the extractor should be improved to have a better fallback when possible. IG has added a quasi-JSON-LD block to the HTML of some post pages; we could check for this.

However, regarding Irevirt's comment:

"I have this problem too. I also noticed that trying to open a reel on a browser results in a blank page (note that I don't have an account)."

If you can't view the post in your browser without logging in, then yt-dlp is not going to be able to download it without cookies.
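
As a rough illustration of the JSON-LD fallback idea mentioned above, here is a minimal standalone sketch. It is not the extractor's actual code: it assumes the block is a standard `<script type="application/ld+json">` element holding a `VideoObject` with a `contentUrl` field, which may not match what IG really serves. `find_json_ld`, `first_video_url` and `SAMPLE_HTML` are hypothetical names.

```python
import json
import re

# Matches <script type="application/ld+json"> ... </script> blocks.
_LD_JSON_RE = re.compile(
    r'<script[^>]+type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def find_json_ld(html):
    """Return every JSON-LD object embedded in html, skipping unparsable blocks."""
    found = []
    for match in _LD_JSON_RE.finditer(html):
        try:
            data = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # "quasi" JSON-LD may not be strict JSON; ignored in this sketch
        # A block may hold a single object or a list of objects.
        found.extend(data if isinstance(data, list) else [data])
    return found


def first_video_url(html):
    """Return the contentUrl of the first VideoObject found, or None."""
    for item in find_json_ld(html):
        if isinstance(item, dict) and item.get("@type") == "VideoObject":
            return item.get("contentUrl")
    return None


# Made-up page used only to exercise the functions above.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@type": "VideoObject", "name": "example", "contentUrl": "https://example.invalid/video.mp4"}
</script>
</head></html>
"""
```

A fallback like this only helps when the post page is served without a login wall; if the HTML is the login page, no such block will be present and `first_video_url` returns `None`.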
