[archive.org] HQ download fails for Ireland: The Making of a Republic #3832

Closed
6 of 7 tasks
NanaMizukiAnother7 opened this issue May 22, 2022 · 2 comments · Fixed by #4461
Assignees
Labels
account-needed Account details are needed to test/fix this site-bug Issue with a specific website

Comments

@NanaMizukiAnother7

Checklist

Region

United States

Description

Downloading the "Ireland: The Making of a Republic" videos from Internet Archive fails with "HTTP Error 403: Forbidden". However, I can watch the videos in the browser.

Verbose log

[debug] Command-line config: ['https://archive.org/details/irelandthemakingofarepublic', '-o', 'D:\\Google Drive & YouTube Video Downloads and more\\%(title)s.%(ext)s', '-v']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8 (No ANSI), error utf-8 (No ANSI), screen utf-8 (No ANSI)
[debug] yt-dlp version 2022.05.18 [b14d523] (win_exe)
[debug] Python version 3.8.10 (CPython 64bit) - Windows-7-6.1.7601-SP1
[debug] Checking exe version: ffprobe -bsfs
[debug] Checking exe version: ffmpeg -bsfs
[debug] exe versions: ffmpeg 2022-05-04-git-0914e3a14a-full_build-www.gyan.dev (setts), ffprobe N-106528-g4fbf3c828b-sherpya
[debug] Optional libraries: Cryptodome-3.14.1, brotli-1.0.9, certifi-2021.10.08, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] [archive.org] Extracting URL: https://archive.org/details/irelandthemakingofarepublic
[archive.org] irelandthemakingofarepublic: Downloading webpage
[archive.org] irelandthemakingofarepublic: Downloading JSON metadata
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, hdr:12(7), vcodec:vp9.2(10), acodec, filesize, fs_approx, tbr, vbr, abr, asr, proto, vext, aext, hasaud, source, id
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, hdr:12(7), vcodec:vp9.2(10), acodec, filesize, fs_approx, tbr, vbr, abr, asr, proto, vext, aext, hasaud, source, id
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, hdr:12(7), vcodec:vp9.2(10), acodec, filesize, fs_approx, tbr, vbr, abr, asr, proto, vext, aext, hasaud, source, id
[download] Downloading playlist: Ireland: The Making of a Republic
[archive.org] playlist Ireland: The Making of a Republic: Collected 3 videos; downloading 3 of them
[download] Downloading video 1 of 3
[debug] Default format spec: bestvideo*+bestaudio/best
[info] irelandthemakingofarepublic/irelandthemakingofarepublicreel1_01.mov: Downloading 1 format(s): 2
[debug] Invoking http downloader on "https://archive.org/download/irelandthemakingofarepublic/irelandthemakingofarepublicreel1_01.mov"
ERROR: unable to download video data: HTTP Error 403: Forbidden
Traceback (most recent call last):
  File "yt_dlp\YoutubeDL.py", line 3130, in process_info
  File "yt_dlp\YoutubeDL.py", line 2838, in dl
  File "yt_dlp\downloader\common.py", line 445, in download
  File "yt_dlp\downloader\http.py", line 373, in real_download
  File "yt_dlp\downloader\http.py", line 130, in establish_connection
  File "yt_dlp\YoutubeDL.py", line 3596, in urlopen
  File "urllib\request.py", line 531, in open
  File "urllib\request.py", line 640, in http_response
  File "urllib\request.py", line 563, in error
  File "urllib\request.py", line 502, in _call_chain
  File "urllib\request.py", line 755, in http_error_302
  File "urllib\request.py", line 531, in open
  File "urllib\request.py", line 640, in http_response
  File "urllib\request.py", line 569, in error
  File "urllib\request.py", line 502, in _call_chain
  File "urllib\request.py", line 649, in http_error_default
urllib.error.HTTPError: HTTP Error 403: Forbidden

[download] Downloading video 2 of 3
[debug] Default format spec: bestvideo*+bestaudio/best
[info] irelandthemakingofarepublic/irelandthemakingofarepublicreel1_02.mov: Downloading 1 format(s): 2
[debug] Invoking http downloader on "https://archive.org/download/irelandthemakingofarepublic/irelandthemakingofarepublicreel1_02.mov"
ERROR: unable to download video data: HTTP Error 403: Forbidden
Traceback (most recent call last):
  File "yt_dlp\YoutubeDL.py", line 3130, in process_info
  File "yt_dlp\YoutubeDL.py", line 2838, in dl
  File "yt_dlp\downloader\common.py", line 445, in download
  File "yt_dlp\downloader\http.py", line 373, in real_download
  File "yt_dlp\downloader\http.py", line 130, in establish_connection
  File "yt_dlp\YoutubeDL.py", line 3596, in urlopen
  File "urllib\request.py", line 531, in open
  File "urllib\request.py", line 640, in http_response
  File "urllib\request.py", line 563, in error
  File "urllib\request.py", line 502, in _call_chain
  File "urllib\request.py", line 755, in http_error_302
  File "urllib\request.py", line 531, in open
  File "urllib\request.py", line 640, in http_response
  File "urllib\request.py", line 569, in error
  File "urllib\request.py", line 502, in _call_chain
  File "urllib\request.py", line 649, in http_error_default
urllib.error.HTTPError: HTTP Error 403: Forbidden

[download] Downloading video 3 of 3
[debug] Default format spec: bestvideo*+bestaudio/best
[info] irelandthemakingofarepublic/irelandthemakingofarepublicreel2.mov: Downloading 1 format(s): 2
[debug] Invoking http downloader on "https://archive.org/download/irelandthemakingofarepublic/irelandthemakingofarepublicreel2.mov"
ERROR: unable to download video data: HTTP Error 403: Forbidden
Traceback (most recent call last):
  File "yt_dlp\YoutubeDL.py", line 3130, in process_info
  File "yt_dlp\YoutubeDL.py", line 2838, in dl
  File "yt_dlp\downloader\common.py", line 445, in download
  File "yt_dlp\downloader\http.py", line 373, in real_download
  File "yt_dlp\downloader\http.py", line 130, in establish_connection
  File "yt_dlp\YoutubeDL.py", line 3596, in urlopen
  File "urllib\request.py", line 531, in open
  File "urllib\request.py", line 640, in http_response
  File "urllib\request.py", line 563, in error
  File "urllib\request.py", line 502, in _call_chain
  File "urllib\request.py", line 755, in http_error_302
  File "urllib\request.py", line 531, in open
  File "urllib\request.py", line 640, in http_response
  File "urllib\request.py", line 569, in error
  File "urllib\request.py", line 502, in _call_chain
  File "urllib\request.py", line 649, in http_error_default
urllib.error.HTTPError: HTTP Error 403: Forbidden

[download] Finished downloading playlist: Ireland: The Making of a Republic
@NanaMizukiAnother7 NanaMizukiAnother7 added site-bug Issue with a specific website triage Untriaged issue labels May 22, 2022
@coletdjnz
Member

If you go to https://archive.org/download/irelandthemakingofarepublic you'll notice the file it is trying to download is restricted.

yt-dlp detects the other converted formats, so it should be possible to de-prioritize (or exclude) the restricted ones:

yt-dlp -F https://archive.org/details/irelandthemakingofarepublic
[archive.org] irelandthemakingofarepublic: Downloading webpage
[archive.org] irelandthemakingofarepublic: Downloading JSON metadata
[download] Downloading playlist: Ireland: The Making of a Republic
[archive.org] playlist Ireland: The Making of a Republic: Collected 3 videos; downloading 3 of them
[download] Downloading video 1 of 3
[info] Available formats for irelandthemakingofarepublic/irelandthemakingofarepublicreel1_01.mov:
ID EXT RESOLUTION │  FILESIZE PROTO │ VCODEC  ACODEC
─────────────────────────────────────────────────────
0  ogv 532x300    │   9.30MiB https │ unknown unknown
1  mp4 640x360    │  12.93MiB https │ unknown unknown
2  mov 1280x720   │   1.22GiB https │ unknown unknown
[download] Downloading video 2 of 3
[info] Available formats for irelandthemakingofarepublic/irelandthemakingofarepublicreel1_02.mov:
ID EXT RESOLUTION │   FILESIZE PROTO │ VCODEC  ACODEC
──────────────────────────────────────────────────────
0  ogv 532x300    │   99.74MiB https │ unknown unknown
1  mp4 640x360    │  137.87MiB https │ unknown unknown
2  mov 1280x720   │   13.01GiB https │ unknown unknown
[download] Downloading video 3 of 3
[info] Available formats for irelandthemakingofarepublic/irelandthemakingofarepublicreel2.mov:
ID EXT RESOLUTION │   FILESIZE PROTO │ VCODEC  ACODEC
──────────────────────────────────────────────────────
0  ogv 532x300    │  114.60MiB https │ unknown unknown
1  mp4 640x360    │  158.79MiB https │ unknown unknown
2  mov 1280x720   │   14.93GiB https │ unknown unknown
[download] Finished downloading playlist: Ireland: The Making of a Republic
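
For illustration, de-prioritizing (or excluding) restricted formats could look roughly like the sketch below. It is not yt-dlp's code: the format records are simplified from the -F listing above, and the `restricted` flag, `rank` and `pick_best` are hypothetical names introduced only for this example.

```python
# Simplified format records based on the -F listing for one of the videos.
# ASSUMPTION: the "restricted" flag is made up for this sketch; yt-dlp does
# not print such a field.
formats = [
    {"id": "0", "ext": "ogv", "height": 300, "restricted": False},
    {"id": "1", "ext": "mp4", "height": 360, "restricted": False},
    {"id": "2", "ext": "mov", "height": 720, "restricted": True},
]


def rank(fmt):
    # Accessible formats always sort above restricted ones;
    # height only breaks ties within each group.
    return (not fmt["restricted"], fmt["height"])


def pick_best(formats, exclude_restricted=False):
    # With exclude_restricted=True, restricted formats are dropped entirely
    # instead of merely being ranked last.
    candidates = [
        f for f in formats if not (exclude_restricted and f["restricted"])
    ]
    return max(candidates, key=rank) if candidates else None


print(pick_best(formats)["id"])
```

With de-prioritization, the restricted 720p file is still selected when nothing else exists; with exclusion, no format is returned in that case.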

@coletdjnz coletdjnz removed the triage Untriaged issue label May 22, 2022
@coletdjnz coletdjnz changed the title "Ireland: The Making of a Republic" fails to download [archiveorg] Does not de-prioritize/skip files that are marked as not available for download May 22, 2022
@NanaMizukiAnother7 NanaMizukiAnother7 changed the title [archiveorg] Does not de-prioritize/skip files that are marked as not available for download [archive.org] Does not de-prioritize/skip files that are marked as not available for download May 22, 2022
@NanaMizukiAnother7 NanaMizukiAnother7 changed the title [archive.org] Does not de-prioritize/skip files that are marked as not available for download [archive.org] HQ download fails for Ireland: The Making of a Republic May 25, 2022
@coletdjnz
Member

coletdjnz commented Jul 26, 2022

I reckon the formats should be removed if we can't access them. The metadata the extractor uses tells us if a file is set to private, so we can lean on that.

However, I have no idea whether this field changes for logged-in users who have access to the file. This metadata is a different view from the one on the /download page, and I don't have an account with access to any private file to check. So I'm marking this as account required.

I'm thinking a workaround for now could be to skip the format unless archive.org cookies are present.
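
A minimal sketch of that workaround, under stated assumptions: the per-file metadata is modelled as a list of dicts, the `private` key and its `"true"` string value are assumed rather than confirmed here, and the file names and `visible_files` helper are hypothetical.

```python
def visible_files(files, has_archive_cookies):
    """Drop files flagged private unless archive.org cookies are present."""
    kept = []
    for f in files:
        # ASSUMPTION: a private file carries private == "true" in its metadata.
        is_private = str(f.get("private", "")).lower() == "true"
        if is_private and not has_archive_cookies:
            continue  # we could not download it anyway, so hide the format
        kept.append(f)
    return kept


# Hypothetical metadata for one video: two derived files and a private original.
files = [
    {"name": "reel2.ogv"},
    {"name": "reel2.mp4"},
    {"name": "reel2.mov", "private": "true"},
]

print([f["name"] for f in visible_files(files, has_archive_cookies=False)])
print([f["name"] for f in visible_files(files, has_archive_cookies=True)])
```

Keeping the private file when cookies are present leaves the decision to the server for logged-in users, which matches the open question about whether the field changes per account.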

Until this is fixed, --check-formats can be used.
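
Conceptually, checking formats means testing each candidate before committing to it and falling back to the next best one. The sketch below shows that idea only; it is not how yt-dlp implements the option, and `first_downloadable`, `fake_probe` and the format records are hypothetical stand-ins.

```python
def first_downloadable(formats, probe):
    """Return the best format for which probe(fmt) succeeds, else None."""
    for fmt in sorted(formats, key=lambda f: f["height"], reverse=True):
        if probe(fmt):
            return fmt
    return None


# Simplified records based on the format listing in this thread.
formats = [
    {"id": "0", "ext": "ogv", "height": 300},
    {"id": "1", "ext": "mp4", "height": 360},
    {"id": "2", "ext": "mov", "height": 720},
]


def fake_probe(fmt):
    # Stand-in for a real test request: pretends the .mov answers 403.
    return fmt["ext"] != "mov"


print(first_downloadable(formats, fake_probe)["ext"])
```

The cost of this approach is one extra request per rejected format, which is why filtering on metadata up front is the preferable long-term fix.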

@coletdjnz coletdjnz added the account-needed Account details are needed to test/fix this label Jul 26, 2022
@coletdjnz coletdjnz self-assigned this Jul 26, 2022
coletdjnz added a commit that referenced this issue Jul 29, 2022
* Ignore private formats if not logged in (fixes #3832)
* Prefer original formats
* Support mpg formats

Authored by: coletdjnz, pukkandan
2 participants