Optimize --download-archive #10733
Comments
@dstftw Thanks. However, this doesn't fix the issue, just addresses it. Also, I've updated to 2016.09.24, and
@dstftw can this please be reopened?
Same issue with Vimeo.
I'm not sure how well --download-archive plays with other sources, but with Soundcloud (my use case) there's a lot of unnecessary overhead. Instead of just taking the track IDs found in a playlist and checking those against the archive list, youtube-dl also fetches track metadata from the Soundcloud API, wasting time on something redundant. In addition, it seems that youtube-dl doesn't cache (not to mention index) the archive file, making this process even slower.
Why does this matter to me? I have a growing list of liked tracks (265 as of now), which I like to back up in case an artist decides to monetize a track, deletes it, or Soundcloud goes down. I add maybe 1 track a week, but a simple sync with youtube-dl takes almost as much time as re-downloading everything (file download time is negligible compared to API querying latency).
I know nothing about the architecture of this project, but I'd suggest reading, caching and indexing the archive file on launch, and letting extractors do lookups, so that they can skip files at the earliest opportunity possible (or leave it up to the generic checker, if such an opportunity doesn't exist).
Something similar has been suggested in #8757, as a part of a larger refactor.
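
To make the suggestion above concrete, here is a rough sketch of what "read, cache and index the archive on launch" could look like. It is not youtube-dl's actual code: `ArchiveIndex` and `filter_new` are hypothetical names, and the sketch assumes the archive file holds one `<extractor> <id>` entry per line.

```python
import os


class ArchiveIndex:
    """Hypothetical helper: reads the archive file once, then answers lookups from memory."""

    def __init__(self, path):
        self.path = path
        self._entries = set()
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                # Assumed format: one "<extractor> <id>" entry per line.
                self._entries = {line.strip() for line in f if line.strip()}

    def __contains__(self, entry):
        # Set membership is O(1) on average, so no file re-read per track.
        return entry in self._entries

    def add(self, entry):
        """Record a newly downloaded item on disk and in the in-memory index."""
        if entry in self._entries:
            return
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(entry + "\n")
        self._entries.add(entry)


def filter_new(playlist_ids, extractor, archive):
    """Return the playlist IDs not yet archived, preserving playlist order.

    Only these IDs would then need a metadata request.
    """
    return [i for i in playlist_ids if f"{extractor} {i}" not in archive]
```

With something like this, an extractor that already has the track IDs from a playlist response could call `filter_new(ids, "soundcloud", archive)` and request metadata only for the tracks that come back.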