Feature request: Download all videos referenced in online page #7341

Closed · asgh opened this issue Nov 2, 2015 · 14 comments
@asgh commented Nov 2, 2015

Extend the -a option to allow passing in a URL: youtube-dl would fetch that page, parse it for links to sites it knows about (and for embedded videos, again from sites it knows about), and download all the referenced videos.
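
Roughly, the requested behaviour could look like this wrapper sketch (the function name and the regex are illustrative only; none of this is existing youtube-dl code):

import re
import subprocess
from urllib.request import urlopen

def download_linked_videos(page_url):
    # Fetch the page and collect every absolute http(s) URL referenced
    # via links or embeds (a deliberately naive regex).
    html = urlopen(page_url).read().decode('utf-8', 'replace')
    urls = sorted(set(re.findall(r'(?:href|src)=["\'](https?://[^"\'<>]+)', html)))
    # Hand each candidate to youtube-dl; unsupported URLs simply fail one by one.
    for url in urls:
        subprocess.call(['youtube-dl', '--', url])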

@dstftw (Collaborator) commented Nov 2, 2015

It's already implemented via the generic extractor.

@dstftw closed this Nov 2, 2015
@asgh (Author) commented Nov 2, 2015

Doesn't seem to work for me:

youtube-dl 'https://www.reddit.com/r/ArtisanVideos/comments/3r2g3d/multiple_world_record_whip_cracker_419/'

[generic] multiple_world_record_whip_cracker_419: Requesting header
WARNING: Falling back on generic information extractor.
[generic] multiple_world_record_whip_cracker_419: Downloading webpage
[generic] multiple_world_record_whip_cracker_419: Extracting information
ERROR: Unsupported URL: https://www.reddit.com/r/ArtisanVideos/comments/3r2g3d/multiple_world_record_whip_cracker_419/

There are plenty of links to YouTube videos on that page.

@yan12125 (Collaborator) commented Nov 2, 2015

Those are video links but not actual videos.

@asgh (Author) commented Nov 2, 2015

Yes, I KNOW they are video links! That's what I asked for in the feature request!

That's why this is a feature request and not a bug report. Can you please reopen it?

@yan12125 (Collaborator) commented Nov 2, 2015

In my personal view, youtube-dl is a tool for collecting videos and/or audio that play directly on web pages. I'm not sure whether other developers have a different expectation of youtube-dl. If so, just reopen it.

@asgh (Author) commented Nov 2, 2015

The -a option already accepts a list of videos to download. All I'm asking is that (1) it be extended to parse an HTML page to find the links, and (2) it accept a URL as well as a local file.

@Hrxn commented Nov 2, 2015

Please leave the -a option as it is; use a new switch for this feature, if it gets implemented.

Besides, I don't understand what you are trying to solve here, because you can do the following things:

  • Use the Dev Tools in Firefox/Chrome to run JS that extracts the links
  • Find a userscript that already does that
  • I think you can even use XPath for this kind of stuff
  • Use one of the many extensions for Firefox/Chrome that are already out there

It should not be hard to get the links. And once you have them, just paste them into a text file, hand that text file to youtube-dl -a, and voilà, profit.
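
Concretely, assuming some link-extraction step writes one URL per line (extract_links.py is a hypothetical helper, links.txt an arbitrary filename), the workflow would be:

python extract_links.py 'https://www.reddit.com/r/ArtisanVideos/comments/3r2g3d/multiple_world_record_whip_cracker_419/' > links.txt
youtube-dl -a links.txt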

@asgh (Author) commented Nov 2, 2015

It's just a timesaver. I have a page full of videos I want to watch. They play badly in my browser, so I play them with youtube-dl. This way I'd pass it the page, it would find all the videos and download them, and then I'd play them all in bulk.

There are lots of ways to do this, sure, including your suggestions. It was just a way of quickly grabbing a whole page full of videos, that's all. The -a option is already very similar except it doesn't parse HTML.

@Hrxn commented Nov 2, 2015

Yes, nothing wrong with the idea; I would like to use it myself.

But I think this is not so trivial to implement reliably.

Parsing a single HTML page for exactly one target (a link, or in yt-dl's case, a video) is one thing; parsing HTML, following the links there to other pages (more HTML), i.e. doing this recursively, is an entirely different beast.

That is, for example, why the recursion option in wget has a default depth limit of five, I think. A sensible depth limit always has to be specified manually; there is no way for a program to determine it automatically.
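
For comparison, wget's recursive mode with an explicit depth limit (the -l option; wget's documented default depth is 5, and the URL here is just a placeholder):

wget -r -l 1 https://example.com/page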

@asgh (Author) commented Nov 2, 2015

The idea is only to look for URLs on the page that match sites/URL patterns youtube-dl already knows how to parse.

I.e., for each URL found (in <a href> and <embed>), check whether there is a dedicated extractor for it (i.e. not the generic extractor).
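
youtube-dl exposes enough of its extractor machinery to express that check; a minimal sketch (gen_extractors() and suitable() are real youtube-dl APIs, while the function name is made up):

from youtube_dl.extractor import gen_extractors

def has_dedicated_extractor(url):
    # True when some extractor other than the catch-all generic one
    # claims this URL.
    return any(ie.suitable(url) and ie.IE_NAME.lower() != 'generic'
               for ie in gen_extractors())

Filtering a page's candidate URLs through a check like this would drop everything youtube-dl cannot handle before any download starts.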

@jaimeMF (Collaborator) commented Nov 2, 2015

The problem with extracting all URLs from a page is that with the most naive approach (extracting all <a href="*"> links) you can get a lot of unwanted results, for example if there are links in a sidebar. I think that extracting the links should be done with an external tool.

As @yan12125 has pointed out, we support videos that are played in the webpage, so if there's some embedded video that youtube-dl doesn't extract, feel free to open a new issue.

@asgh (Author) commented Nov 2, 2015

That's why I said to only extract videos if the URL matches a known pattern.

@jaimeMF (Collaborator) commented Nov 2, 2015

> That's why I said to only extract videos if the URL matches a known pattern.

Even if we only matched URLs supported by youtube-dl, there would still be unwanted results: for example, on any post at https://www.reddit.com/r/dota2 there are sidebar links to Twitch livestreams (which are supported by youtube-dl), and they would therefore also be downloaded, which is not the expected behaviour.

That's why I'd suggest you use a different tool that can ignore the sidebars and look only at the main content, which is probably not trivial to implement in a generic way and would have to be tuned for each website.
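
To illustrate what that per-site tuning might look like: a sketch using BeautifulSoup with a hand-picked selector per site (the selector table is an unverified guess, and beautifulsoup4 is an external dependency):

from urllib.parse import urlsplit
from urllib.request import urlopen

from bs4 import BeautifulSoup  # pip install beautifulsoup4

# Hand-tuned per site: where the main content lives. These selectors
# are illustrative guesses, not verified against the sites' markup.
MAIN_CONTENT_SELECTORS = {
    'www.reddit.com': '#siteTable',
}

def main_content_links(page_url):
    host = urlsplit(page_url).hostname
    selector = MAIN_CONTENT_SELECTORS.get(host)
    soup = BeautifulSoup(urlopen(page_url).read(), 'html.parser')
    # Fall back to the whole page when no selector is known or it misses.
    root = (soup.select_one(selector) if selector else None) or soup
    return [a['href'] for a in root.find_all('a', href=True)]

Combined with a dedicated-extractor check like the sketch above, this would skip both the sidebar noise and the unsupported links.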

@asgh (Author) commented Nov 3, 2015

You make a good point about the sidebar, but can I request the feature anyway (reopen it)? After all, it's on me to use it right; there are times it will be the right tool for the job. There's no reason to avoid a feature just because it sometimes isn't the right tool.
