[YouTube] Don't download webpage for videos that match in an archive file #17573

Open
mjolnir870 opened this issue Sep 14, 2018 · 1 comment

Comments

mjolnir870 commented Sep 14, 2018

Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2018.09.10. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.

  • I've verified and I assure that I'm running youtube-dl 2018.09.10

Before submitting an issue make sure you have:

  • At least skimmed through the README, most notably the FAQ and BUGS sections
  • Searched the bugtracker for similar issues including closed ones
  • Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser

What is the purpose of your issue?

  • Bug report (encountered problems with youtube-dl)
  • Site support request (request for adding support for a new site)
  • Feature request (request for a new functionality)
  • Question
  • Other

If the purpose of this issue is a bug report or a site support request, or if you are not completely sure, provide the full verbose output as follows:

Add the -v flag to the command line you run youtube-dl with (youtube-dl -v <your command line>), copy the whole output and insert it here. It should look similar to the one below (replace it with your log inserted between triple ```):

[debug] System config: []
[debug] User config: []
[debug] Custom config: ['--no-mtime', '--no-playlist', '--playlist-reverse', '-i', '-w', '--sleep-interval', '5', '--max-sleep-interval', '7']
[debug] Command-line args: ['--config-location', 'C:\\Users\\USER\\Scripts\\youtube-dl-ondemand.conf', '--ffmpeg-location', 'C:\\Users\\USER\\Apps\\ffmpeg\\bin\\ffmpeg.exe', '--download-archive', 'F:\\test\\~youtube-dl_history.txt', '-f', 'bestvideo+bestaudio/best', '--write-thumbnail', '--external-downloader', 'C:\\Users\\USER\\Apps\\aria2\\64bit\\aria2c.exe', '--external-downloader-args', '--file-allocation=none -x 4 -k 1M', '-o', 'F:\\test\\%(upload_date)s %(title)s_%(id)s.%(ext)s', '-v', 'https://www.youtube.com/watch?v=KzPNY5N97Ck']
[debug] Encodings: locale cp1252, fs mbcs, out cp437, pref cp1252
[debug] youtube-dl version 2018.09.10
[debug] Python version 3.4.4 (CPython) - Windows-8.1-6.3.9600
[debug] exe versions: ffmpeg N-90960-gbcff983dc3, ffprobe N-90960-gbcff983dc3
[debug] Proxy map: {}
[youtube] KzPNY5N97Ck: Downloading webpage
[youtube] KzPNY5N97Ck: Downloading embed webpage
[youtube] KzPNY5N97Ck: Refetching age-gated info webpage
[download] Ballantine's Space Glass: Teaser has already been recorded in archive

Description of your issue, suggested solution and other information

In the log above, I am trying to download a YouTube video that is already recorded in the archive. youtube-dl still downloads the webpage, and then the age-gated info webpage, just to get the video info.

Since youtube-dl already has the unique identifier from the URL, it should immediately reject this video BEFORE any downloading begins.

Current:

  1. youtube-dl has unique identifier (KzPNY5N97Ck).
  2. youtube-dl downloads the webpage (and the age-gated webpage after the first attempt fails).
  3. youtube-dl checks archive for KzPNY5N97Ck.
  4. youtube-dl rejects video with "[download] Ballantine's Space Glass: Teaser has already been recorded in archive"

Proposed:

  1. youtube-dl has unique identifier (KzPNY5N97Ck).
  2. youtube-dl checks archive for KzPNY5N97Ck.
  3. youtube-dl rejects video with "[download] video with identifier KzPNY5N97Ck has already been recorded in archive"
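
To illustrate the proposed order, here is a minimal standalone sketch. It is not youtube-dl's code: `extract_youtube_id`, `load_archive` and `should_skip` are hypothetical helper names, the regular expression only covers the common `watch?v=` and `youtu.be/` URL shapes, and the archive is assumed to hold one `<extractor> <id>` entry per line (for example `youtube KzPNY5N97Ck`).

```python
import re
from pathlib import Path
from typing import Optional, Set

# Hypothetical pattern: only handles "watch?v=<id>" and "youtu.be/<id>" URLs.
_YOUTUBE_ID_RE = re.compile(r"(?:v=|youtu\.be/)([0-9A-Za-z_-]{11})")


def extract_youtube_id(url: str) -> Optional[str]:
    """Return the 11-character video ID found in the URL, or None."""
    match = _YOUTUBE_ID_RE.search(url)
    return match.group(1) if match else None


def load_archive(path: Path) -> Set[str]:
    """Read archive entries, assumed to be one "<extractor> <id>" per line."""
    if not path.exists():
        return set()
    with path.open(encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def should_skip(url: str, archive: Set[str], extractor: str = "youtube") -> bool:
    """True when the ID parsed from the URL is already in the archive."""
    video_id = extract_youtube_id(url)
    if video_id is None:
        # No ID in the URL: cannot decide early, fall back to normal extraction.
        return False
    return "{} {}".format(extractor, video_id) in archive


if __name__ == "__main__":
    # In real use the set would come from load_archive(Path(...)).
    archive = {"youtube KzPNY5N97Ck"}
    url = "https://www.youtube.com/watch?v=KzPNY5N97Ck"
    if should_skip(url, archive):
        print("[download] video with identifier {} has already been recorded "
              "in archive".format(extract_youtube_id(url)))
```

The check runs before any network request, so the title is not known yet. That is why the proposed message reports the identifier instead of the title.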

syco commented Sep 21, 2018

I noticed the same with the bbc.co.uk extractor, and as you can see below, a lot more is downloaded in this case.

[bbc.co.uk] p05xxsmp: Downloading video page
[bbc.co.uk] p05xxsmp: Downloading playlist JSON
[bbc.co.uk] p05xxtvp: Downloading media selection XML
[bbc.co.uk] p05xxtvp: Downloading m3u8 information
[bbc.co.uk] p05xxtvp: Downloading m3u8 information
[bbc.co.uk] p05xxtvp: Downloading MPD manifest
[bbc.co.uk] p05xxtvp: Downloading MPD manifest
[bbc.co.uk] p05xxtvp: Downloading m3u8 information
[bbc.co.uk] p05xxtvp: Downloading m3u8 information
[bbc.co.uk] p05xxtvp: Downloading m3u8 information
[bbc.co.uk] p05xxtvp: Downloading m3u8 information
[bbc.co.uk] p05xxtvp: Downloading m3u8 information
[bbc.co.uk] p05xxtvp: Downloading m3u8 information
[bbc.co.uk] p05xxtvp: Downloading MPD manifest
[bbc.co.uk] p05xxtvp: Downloading MPD manifest
[bbc.co.uk] p05xxtvp: Downloading m3u8 information
[bbc.co.uk] p05xxtvp: Downloading m3u8 information
[bbc.co.uk] p05xxtvp: Downloading m3u8 information
[bbc.co.uk] p05xxtvp: Downloading m3u8 information
[bbc.co.uk] p05xxtvp: Downloading m3u8 information
[bbc.co.uk] p05xxtvp: Downloading m3u8 information
[download] Civilisations, Series 1, Second Moment of Creation has already been recorded in archive

Also, because I need to go through a proxy to get these pages, it takes a lot longer to check every video.
So, is this planned?
Thanks
