Optimize --download-archive #1745

Closed
tukkek opened this issue Nov 9, 2013 · 5 comments

@tukkek tukkek commented Nov 9, 2013

When a video is already in the archive created by --download-archive, youtube-dl still fetches the video and info pages (for YouTube, at least). Since I am downloading from a playlist, the code already has the IDs beforehand and could safely skip those steps.

I'm not sure whether there is a reason to check the page again, but it seems unnecessary to download two pages and only then decide to skip the download, when the information needed to skip all of these steps (the video ID) is already available beforehand.
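To make the request concrete, here is a minimal sketch of the check being asked for. It assumes the archive holds one "extractor id" entry per line; that layout, the function names, and the sample IDs are illustrative assumptions, not youtube-dl's actual code.

```python
def load_archive(lines):
    """Collect non-empty archive entries into a set for fast lookup."""
    return {line.strip() for line in lines if line.strip()}


def ids_to_fetch(playlist_ids, archive, extractor="youtube"):
    """Return only the playlist IDs that are not yet recorded in the archive.

    Assumption: each archive entry has the form "<extractor> <video id>".
    """
    return [vid for vid in playlist_ids if f"{extractor} {vid}" not in archive]


# Hypothetical archive contents and playlist IDs, for illustration only.
archive = load_archive(["youtube aaa111\n", "youtube ccc333\n"])
pending = ids_to_fetch(["aaa111", "bbb222", "ccc333"], archive)
```

With a check like this, only the IDs left in `pending` would need their video and info pages fetched.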

Thanks again!

@jaimeMF jaimeMF (Collaborator) commented Nov 10, 2013

That would easily work for YouTube videos and other sites where the URL contains the ID, but this is not true for all sites (for example, TED talks), so it will require some additional work.
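A rough illustration of the distinction, assuming the common `watch?v=` form of YouTube URLs. The function name and the example URLs are hypothetical, and this is not how youtube-dl itself matches URLs.

```python
from urllib.parse import urlparse, parse_qs


def id_from_url(url):
    """Return the video ID if the URL itself carries it, otherwise None.

    Only YouTube watch URLs are handled in this sketch. For sites whose
    URLs do not contain the ID, the page has to be fetched to learn it.
    """
    parsed = urlparse(url)
    if parsed.netloc.lower() in ("youtube.com", "www.youtube.com"):
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None


# Example URLs are made up for illustration.
youtube_id = id_from_url("https://www.youtube.com/watch?v=abc123")
ted_id = id_from_url("https://www.ted.com/talks/example_talk")
```

Here `youtube_id` is available without any network request, while `ted_id` is `None`, so an archive lookup before fetching is not possible in the same way.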

@tukkek tukkek (Author) commented Nov 10, 2013

It seems to me this could be implemented as InfoExtractor.should_fetch_pages(), with a default implementation that returns True, called near here: https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L340

Another option is to create a 'skip information fetching' exception and use it (for now) only on the YouTube extractor.

In both cases you maintain the current behavior (always fetch pages) for other websites, while YouTube can easily skip playlist items already present in the --download-archive file.
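A sketch of the first option, to show the shape of the idea. The classes below are simplified stand-ins, not the real youtube-dl classes; the method signature, the "youtube <id>" archive entry format, and the helper `process_playlist` are all assumptions made for illustration.

```python
class InfoExtractor:
    """Stand-in base class: by default, always fetch pages (current behavior)."""

    def should_fetch_pages(self, video_id, archive):
        return True


class YoutubeIE(InfoExtractor):
    """Stand-in YouTube extractor: skip IDs already recorded in the archive."""

    def should_fetch_pages(self, video_id, archive):
        # Assumption: archive entries look like "youtube <video id>".
        return ("youtube " + video_id) not in archive


def process_playlist(extractor, video_ids, archive):
    """Return the IDs whose pages would actually be fetched."""
    fetched = []
    for video_id in video_ids:
        if not extractor.should_fetch_pages(video_id, archive):
            continue  # already archived, so no network requests are needed
        fetched.append(video_id)
    return fetched


# Hypothetical archive and playlist, for illustration only.
archive = {"youtube aaa111"}
playlist = ["aaa111", "bbb222"]
default_result = process_playlist(InfoExtractor(), playlist, archive)
youtube_result = process_playlist(YoutubeIE(), playlist, archive)
```

Other extractors inherit the default and keep fetching everything, so only the YouTube stand-in changes behavior.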

@phihag phihag closed this in 7012b23 Nov 22, 2013
@phihag phihag (Contributor) commented Nov 22, 2013

Fixed, at least for YouTube playlists/user profiles/channels/..., as of youtube-dl v2013.11.22.2. Type youtube-dl -U to update.

@Hajitorus Hajitorus commented Jan 29, 2014

In 2014.01.23.4 this issue is present again: youtube-dl still goes out and fetches info for IDs that are already in the download archive. I can reproduce this by passing --download-archive and running youtube-dl ytuser:therealgiantbomb more than once in a row.

@phihag phihag reopened this Jan 29, 2014
@phihag phihag closed this in b11cec4 Jan 29, 2014
@phihag phihag (Contributor) commented Jan 29, 2014

@Hajitorus Sorry, we introduced a bug there. Fixed in youtube-dl 2014.01.29.
