VLiveWebArchiveIE no longer works, archive.org excluded/banned all vlive links --> vlivearchive.com #8122

Closed
11 tasks done
snwefly opened this issue Sep 16, 2023 · 5 comments · Fixed by #8132
Labels: site-request, wontfix

Comments

@snwefly

snwefly commented Sep 16, 2023

DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE

  • I understand that I will be blocked if I intentionally remove or skip any mandatory* field

Checklist

Region

Albania

Provide a description that is worded well enough to be understood

The VLiveWebArchiveIE extractor no longer works because archive.org excluded/banned all vlive links. An alternative site is vlivearchive.com, and I think it uses the original vlive website framework, so please replace the old extractor with one for this site.

Thanks

Example link for the VLiveWebArchiveIE extractor

https://web.archive.org/web/20221221144331/http://www.vlive.tv/video/1326

Provide verbose output that clearly demonstrates the problem

  • Run your yt-dlp command with -vU flag added (yt-dlp -vU <your command line>)
  • If using API, add 'verbose': True to YoutubeDL params instead
  • Copy the WHOLE output (starting with [debug] Command-line config) and insert it below

Complete Verbose Output

yt-dlp https://web.archive.org/web/20221221144331/http://www.vlive.tv/video/1326 -v
[debug] Command-line config: ['https://web.archive.org/web/20221221144331/http://www.vlive.tv/video/1326', '-v']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version nightly@2023.09.15.171906 [7b71643cc] (win_exe)
[debug] Python 3.8.10 (CPython AMD64 64bit) - Windows-10-10.0.17763-SP0 (OpenSSL 1.1.1k  25 Mar 2021)
[debug] exe versions: ffmpeg 2023-06-21-git-1bcb8a7338-full_build-www.gyan.dev (setts), ffprobe 2023-06-21-git-1bcb8a7338-full_build-www.gyan.dev
[debug] Optional libraries: Cryptodome-3.18.0, brotli-1.1.0, certifi-2023.07.22, mutagen-1.47.0, sqlite3-2.6.0, websockets-11.0.3
[debug] Proxy map: {}
[debug] Loaded 1865 extractors
[web.archive:vlive] Extracting URL: https://web.archive.org/web/20221221144331/http://www.vlive.tv/video/1326
[web.archive:vlive] 1326: Downloading webpage
WARNING: [web.archive:vlive] HTTP Error 403: Forbidden. Retrying (1/3)...
[web.archive:vlive] 1326: Downloading webpage
WARNING: [web.archive:vlive] HTTP Error 403: Forbidden. Retrying (2/3)...
[web.archive:vlive] 1326: Downloading webpage
WARNING: [web.archive:vlive] HTTP Error 403: Forbidden. Retrying (3/3)...
[web.archive:vlive] 1326: Downloading webpage
ERROR: [web.archive:vlive] 1326: Unable to download webpage: HTTP Error 403: Forbidden (caused by <HTTPError 403: Forbidden>); please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U
  File "yt_dlp\extractor\common.py", line 715, in extract
  File "yt_dlp\extractor\archiveorg.py", line 1095, in _real_extract
  File "yt_dlp\extractor\archiveorg.py", line 1051, in _download_archived_page
  File "yt_dlp\utils\_utils.py", line 5092, in __iter__
  File "yt_dlp\extractor\common.py", line 3760, in _error_or_warning
  File "yt_dlp\utils\_utils.py", line 5100, in report_retry
  File "yt_dlp\extractor\archiveorg.py", line 1053, in _download_archived_page
  File "yt_dlp\extractor\common.py", line 1118, in _download_webpage
  File "yt_dlp\extractor\common.py", line 1069, in download_content
  File "yt_dlp\extractor\common.py", line 903, in _download_webpage_handle
  File "yt_dlp\extractor\common.py", line 860, in _request_webpage

  File "yt_dlp\networking\_urllib.py", line 428, in _send
  File "urllib\request.py", line 531, in open
  File "urllib\request.py", line 640, in http_response
  File "urllib\request.py", line 569, in error
  File "urllib\request.py", line 502, in _call_chain
  File "urllib\request.py", line 649, in http_error_default
urllib.error.HTTPError: HTTP Error 403: FORBIDDEN

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "yt_dlp\YoutubeDL.py", line 4059, in urlopen
  File "yt_dlp\networking\common.py", line 114, in send
  File "yt_dlp\networking\_helper.py", line 203, in wrapper
  File "yt_dlp\networking\common.py", line 325, in send
  File "yt_dlp\networking\_urllib.py", line 433, in _send
yt_dlp.networking.exceptions.HTTPError: HTTP Error 403: Forbidden

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "yt_dlp\extractor\common.py", line 847, in _request_webpage
  File "yt_dlp\YoutubeDL.py", line 4078, in urlopen
yt_dlp.networking.exceptions._CompatHTTPError: HTTP Error 403: Forbidden
@snwefly added the site-bug and triage labels Sep 16, 2023
@dirkf
Contributor

dirkf commented Sep 16, 2023

The page says:

Sorry.

This URL has been excluded from the Wayback Machine.

Extractors that use archive.org should catch 403 and check for this result, since retrying makes no sense, though retrying in a few months/years might work.
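
A minimal sketch of that idea, not yt-dlp's actual code: it assumes a hypothetical `fetch` callable that returns a `(status, body)` pair, and the names `WaybackExcluded` and `fetch_archived_page` are made up for illustration. The only detail taken from this thread is the exclusion message shown on the 403 page. The point is that an exclusion 403 fails immediately, while other failures still go through the retry loop.

```python
from typing import Callable, Tuple

# Text shown on the 403 page quoted above.
EXCLUSION_MARKER = "This URL has been excluded from the Wayback Machine"


class WaybackExcluded(Exception):
    """Hypothetical error: archive.org reports the URL as excluded."""


def is_wayback_exclusion(status: int, body: str) -> bool:
    # Only a 403 that carries the exclusion message counts as permanent.
    return status == 403 and EXCLUSION_MARKER in body


def fetch_archived_page(
    fetch: Callable[[str], Tuple[int, str]],  # hypothetical HTTP helper
    url: str,
    retries: int = 3,
) -> str:
    last_status = None
    # One initial attempt plus up to `retries` retries.
    for _attempt in range(retries + 1):
        status, body = fetch(url)
        if status == 200:
            return body
        if is_wayback_exclusion(status, body):
            # Retrying cannot help here, so stop right away.
            raise WaybackExcluded(f"{url} has been excluded from the Wayback Machine")
        last_status = status
    raise RuntimeError(
        f"giving up on {url} after {retries} retries (last status {last_status})"
    )
```

With a check like this, the log in the first post would show a single request and a clear "excluded" error instead of three pointless retries.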

@bashonly
Member

An alternative site is vlivearchive.com, and I think it uses the original vlive website framework, so please replace the old extractor with one for this site.

can you share example links and a log for this site?

@bashonly added the site-request, incomplete, and site-bug labels and removed the site-bug and triage labels Sep 16, 2023
@snwefly
Author

snwefly commented Sep 16, 2023

The page says:

Sorry.
This URL has been excluded from the Wayback Machine.

Extractors that use archive.org should catch 403 and check for this result, since retrying makes no sense, though retrying in a few months/years might work.

The archive.org list of banned URLs grows larger each day; I'm not sure why people even bother with this site.

An alternative site is vlivearchive.com, and I think it uses the original vlive website framework, so please replace the old extractor with one for this site.

can you share example links and a log for this site?

https://vlivearchive.com/post/0-18226748

I think it was made with https://github.com/jonathanlam/vlive-frontend

seproDev, the person who made the VLiveWebArchiveIE extractor, is also a contributor to vlive-frontend, so I think it would be easier to switch to the new one since vlive-frontend is already finished.

@seproDev
Collaborator

seproDev commented Sep 16, 2023

vlive.tv was most likely removed from the Wayback Machine due to a request from weverse. Nothing the Internet Archive can do about this. The good news is that the WARC files and actual CDN domain are still accessible, so an external lookup service could be built to return vodid/inkey/country, which can then be used to download the files from the Wayback Machine.
As this would require a third-party service, this seems more fit as a plugin than a built-in extractor.

As for vlivearchive.com, that is a fan-maintained archive I contributed to very slightly. The frontend is a recreation of the original vlive website.
I don't think the site meets the requirements to be added to yt-dlp. There is also a download button right on the video page.

@bashonly
Member

As this would require a third-party service, this seems more fit as a plugin than a built-in extractor.

Agreed

I don't think the site meets the requirements to be added to yt-dlp. There is also a download button right on the video page.

Also should be a plugin IMO. Thanks for your input @seproDev

Closing as out-of-scope

@bashonly closed this as not planned Sep 16, 2023
@bashonly added the wontfix label and removed the incomplete and site-bug labels Sep 16, 2023
bashonly added a commit that referenced this issue Sep 17, 2023
@gamer191 closed this as not planned Sep 17, 2023
aalsuwaidi pushed a commit to aalsuwaidi/yt-dlp that referenced this issue Apr 21, 2024
5 participants