
REQ: build information cache; combined search #858

Open
luxifr opened this issue May 28, 2013 · 6 comments

@luxifr commented May 28, 2013

I'd love to use youtube-dl to completely automate my day-to-day YouTube needs ;) However, it is missing some functionality:

  • When searching for videos, youtube-dl should not download ALL video IDs over and over again, but instead build a database that holds the metadata associated with the IDs, such as date, user, and title.
  • It would also be nice to be able to combine searches, e.g. all videos from "foo" containing "bar" in the title, like "ytuser:foo ytsearch:bar".

With these additions, one could automate downloading videos from users one follows regularly and only download the ones of interest. The cache would also speed things up immensely if the IDs are retrieved sorted by date.
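The requested database could look roughly like the sketch below. It assumes a SQLite table keyed by video ID; the table layout, `open_cache`, `update_cache`, and the caller-supplied `fetch_metadata` callable are hypothetical names for illustration and are not part of youtube-dl.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a hypothetical metadata cache keyed by video ID."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS videos ("
        "id TEXT PRIMARY KEY, title TEXT, uploader TEXT, upload_date TEXT)"
    )
    return conn


def update_cache(conn, video_ids, fetch_metadata):
    """Fetch metadata only for IDs that are not cached yet.

    fetch_metadata is an assumed callable that takes a video ID and returns
    a dict with the keys "title", "uploader" and "upload_date".
    Returns the list of IDs that had to be fetched.
    """
    fetched = []
    for video_id in video_ids:
        row = conn.execute(
            "SELECT 1 FROM videos WHERE id = ?", (video_id,)
        ).fetchone()
        if row is not None:
            continue  # already cached, no network round trip needed
        meta = fetch_metadata(video_id)
        conn.execute(
            "INSERT INTO videos (id, title, uploader, upload_date) "
            "VALUES (?, ?, ?, ?)",
            (video_id, meta["title"], meta["uploader"], meta["upload_date"]),
        )
        fetched.append(video_id)
    conn.commit()
    return fetched
```

On a second run over the same channel, only IDs that are not yet in the table would trigger a metadata fetch.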

cheers

@evliu commented Dec 29, 2013

+1 for the second ability: multiple filters, so I can search for a specific video from a specific user [:

@phihag (Contributor) commented Dec 29, 2013

To keep issues here on GitHub from becoming too messy, we ignore all but the first request per issue (unless the requests are related). Luckily, your second request is already implemented: `youtube-dl http://youtube.com/user/phihag --match-title youtube-dl` works fine.
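The combined filter amounts to restricting one user's videos by a title pattern. A minimal sketch of that idea follows, assuming the title match is a case-insensitive regular-expression search; `match_title`, `filter_entries`, and the entry dicts are hypothetical and are not youtube-dl's actual implementation.

```python
import re


def match_title(title, pattern):
    """Return True if pattern (a regex) occurs in title, ignoring case."""
    return re.search(pattern, title, re.IGNORECASE) is not None


def filter_entries(entries, uploader, title_pattern):
    """Keep entries from one uploader whose title matches the pattern."""
    return [
        entry
        for entry in entries
        if entry["uploader"] == uploader
        and match_title(entry["title"], title_pattern)
    ]


# Illustrative data only.
entries = [
    {"uploader": "foo", "title": "Bar tutorial"},
    {"uploader": "foo", "title": "Unrelated"},
    {"uploader": "baz", "title": "bar review"},
]
matches = filter_entries(entries, "foo", "bar")
```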

As to the cache: the problem with any form of cache is invalidation. In most of the YouTube feeds I watch, the videos get uploaded with a title like "event2013a-1.mp4", and the title is changed later to "Tic-Tac-Toe World Cup 2013 Season 1 - Game 1". I want to be able to search for Tic Tac Toe without having to worry about the cache. I'd rather download metadata asynchronously.

I'm not saying a cache could not be useful, although I do wonder what a cache of, say, the video uploader would be useful for. Why would that help you? Can you elaborate a bit on your usage scenario, e.g. which fields you need for what purpose, how you use youtube-dl (Python API, command line, ...), and what a typical command looks like?
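The renamed-title case can be illustrated with a small sketch. It assumes a cache that stores one title per video ID and a plain case-insensitive substring search; `search_titles` and both dicts are made up for illustration.

```python
def search_titles(titles_by_id, query):
    """Return the sorted IDs whose title contains query (case-insensitive)."""
    needle = query.lower()
    return sorted(
        video_id
        for video_id, title in titles_by_id.items()
        if needle in title.lower()
    )


# Title as it was when the cache entry was written.
cached = {"vid1": "event2013a-1.mp4"}
# Title after the uploader renamed the video.
current = {"vid1": "Tic-Tac-Toe World Cup 2013 Season 1 - Game 1"}

stale_hits = search_titles(cached, "Tic-Tac-Toe")   # misses the video
fresh_hits = search_titles(current, "Tic-Tac-Toe")  # finds it
```

A search against the cached titles misses the video until the entry is refreshed, which is the invalidation problem described above.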

@luxifr (Author) commented Jan 2, 2014

Simple. I'd like to subscribe to a channel with several thousand videos and check for updates, say, hourly. Right now that means re-downloading and re-parsing loads of metadata on every update. And I watch quite a few channels, which exacerbates the problem even more. On the other hand, I don't think I watch even one channel where metadata gets changed after publishing, so the problem with cache invalidation would be tiny to non-existent.

Besides, wouldn't it be possible to fetch the video IDs sorted by date, and wouldn't this date be updated when metadata like the title changes? Even if not, you could make the cache optional.

It's just that it feels insane to download and parse over 5000 video metadata items every hour. And that's just ONE channel!
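The sorted-by-date idea could be sketched as follows. It assumes the channel feed yields video IDs newest first, which this thread does not confirm, and `new_video_ids` is a hypothetical helper rather than anything youtube-dl provides.

```python
def new_video_ids(ids_newest_first, known_ids):
    """Collect IDs from a newest-first iterable until a known ID appears.

    Assumes everything after the first known ID has been seen before, so
    the rest of the feed does not need to be downloaded or parsed.
    """
    fresh = []
    for video_id in ids_newest_first:
        if video_id in known_ids:
            break  # reached the part of the feed that is already cached
        fresh.append(video_id)
    return fresh


# Illustrative data only: v5 and v4 were uploaded since the last check.
feed = iter(["v5", "v4", "v3", "v2", "v1"])
fresh = new_video_ids(feed, {"v3", "v2", "v1"})
```

Because the loop stops at the first known ID, an hourly check would only touch the handful of entries at the head of the feed. If a title edit does not move a video back to the front of the feed, this approach would not notice the change.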

@luxifr (Author) commented Jan 2, 2014

I just saw that youtube-dl now seems to have a --cache-dir DIR and a --no-cache-dir option, with the cache enabled by default. However, I can't get it to create and use a cache.

Command line: youtube-dl -i --cache-dir ~/.cache/youtube-dl --skip-download http://www.youtube.com/user/diesuperhomies

Expected result: a cache of metadata is created, and successive calls don't re-download the metadata.

Actual result: no cache whatsoever is created, and successive calls download the metadata all over again.

@phihag (Contributor) commented Jan 2, 2014

--cache-dir is always set; the option only changes the location. At the moment, the cache does not include anything related to video metadata.

phihag added a commit that referenced this issue Jan 2, 2014

@luxifr (Author) commented Jan 3, 2014

The commit doesn't clarify anything. One still can't tell what is actually cached. What does this cache do, then? I haven't actually seen youtube-dl write anything to the cache location - not a single file with any content.

Also: I hope the elaboration of my use case was detailed enough.
