Support for Instagram photos #9337

Open
EnginePod opened this issue Apr 28, 2016 · 46 comments

@EnginePod EnginePod commented Apr 28, 2016

I'll skip the formality as there already is support for Instagram videos, but not photos.
Here is an example photo: https://www.instagram.com/p/BEvqLm5mcSK/?tagged=test

Collaborator

@dstftw dstftw commented Apr 28, 2016

Photos are out of scope for youtube-dl.

@dstftw dstftw closed this Apr 28, 2016
Author

@EnginePod EnginePod commented Apr 28, 2016

Why? Since it already downloads Instagram videos, it could download photos as well.
I can't see why Instagram wouldn't be eligible as an exception.

Collaborator

@yan12125 yan12125 commented Apr 28, 2016

Unlike on other websites, right-clicking an image on Instagram does not offer a save option. There's no easy way to get the image without knowledge of HTML or network sniffing. Maybe the policy can be changed.

Collaborator

@dstftw dstftw commented Apr 28, 2016

Supporting static images would also introduce a semantic clash with thumbnails.

Collaborator

@yan12125 yan12125 commented Apr 28, 2016

For static pictures, thumbnail and url can be the same one.

Collaborator

@dstftw dstftw commented Apr 28, 2016

So in general you are suggesting duplicating formats in thumbnails?

Collaborator

@yan12125 yan12125 commented Apr 28, 2016

Yeah. url points to the file of interest, and for static pictures, thumbnail can be the target itself.


@Hrxn Hrxn commented Apr 29, 2016

You still can do this:
document.querySelector("[id^='pImage']").getAttribute("src");

(Run against Instagram post page, i.e. https://www.instagram.com/p/XXXXXXXXXXX/)

😄

Their source is an awful mess of nested div elements infused with React garbage, boy that is ugly..

But I think that RipMe still works with Instagram:
https://github.com/4pr0n/ripme

(Java based program)

Author

@EnginePod EnginePod commented Apr 29, 2016

That line works perfectly in the console, but it would be useful to have support for it in youtube-dl so that all solutions are gathered in one place.


@Hrxn Hrxn commented Apr 29, 2016

Hmm, you can still automate this further. The post IDs are inside some JSON on the profile page, they can be extracted, then you can construct the URLs to the single post pages, fetch the HTML documents from each post page, and then query again.

😄

I just tested some images there, and maybe it is just me, but the picture quality is horrible. Total potato quality. I guess you never notice this when browsing Instagram normally (Web or Mobile), because the user agent always scales down the images and displays them this way...

Author

@EnginePod EnginePod commented Apr 29, 2016

All that needs to be done is to get the value of the content attribute from the image meta tag, <meta property="og:image" content="">.

This contains the URL to the highest quality version of the image.
Once the value is retrieved it can be set as both the thumbnail and the URL for the output.
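
A minimal sketch of the og:image idea described above, written in Python with only the standard library. It is not youtube-dl's actual extractor code; the names OgImageParser, extract_og_image and build_info are hypothetical, and it assumes the page HTML has already been downloaded into a string.

```python
from html.parser import HTMLParser


class OgImageParser(HTMLParser):
    """Remembers the content of the first <meta property="og:image"> tag."""

    def __init__(self):
        super().__init__()
        self.og_image = None

    def handle_starttag(self, tag, attrs):
        # Only the first og:image tag is kept; later ones are ignored.
        if tag != "meta" or self.og_image is not None:
            return
        attr_map = dict(attrs)
        if attr_map.get("property") == "og:image":
            self.og_image = attr_map.get("content")


def extract_og_image(html):
    """Return the og:image URL found in html, or None if there is none."""
    parser = OgImageParser()
    parser.feed(html)
    return parser.og_image


def build_info(html):
    """Use the same URL as both the download URL and the thumbnail."""
    image_url = extract_og_image(html)
    if image_url is None:
        return None
    return {"url": image_url, "thumbnail": image_url}
```

Whether og:image really holds the highest quality version is the claim made in the comment above; the sketch only shows how the value could be read.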

Contributor

@TRox1972 TRox1972 commented Jun 2, 2016

Any progress on whether this will be implemented?

Collaborator

@yan12125 yan12125 commented Jun 2, 2016

There's currently no consensus on whether youtube-dl should support non-audio/video files. It would change the essence of youtube-dl, so I guess it's better for the developers to reach a common conclusion before moving on. My personal view is that supporting static pictures on already-supported websites such as Facebook, Tumblr and Instagram is OK, but adding new websites that have static pictures only is not.

Contributor

@TRox1972 TRox1972 commented Jun 3, 2016

@yan12125 I agree. Would this be a big change, requiring much new/changed code?

Collaborator

@yan12125 yan12125 commented Jun 4, 2016

Nope. Just a single line for Instagram.


@ghost ghost commented Jun 8, 2016

I use JDownloader for image downloads on social networks. Most social networks make it a pain to download the full-resolution image, and the implementation changes very often. Lots of pic hunters have trouble downloading a good-quality image. Mostly they distribute/share low-resolution pics.

I would love it if the best open source tool supported images.


@Hrxn Hrxn commented Jun 10, 2016

So there are different resolutions, or formats, in youtube-dl parlance, to deal with.
Is this documented somewhere, for Instagram, for example? I am not so sure anymore if getting the image from the page source is really enough. The API might return something different..

Collaborator

@yan12125 yan12125 commented Jun 10, 2016

@Hrxn Sorry, I don't get it. What do you mean by "The API might return something different.."?


@Hrxn Hrxn commented Jun 10, 2016

https://www.instagram.com/developer/

Like this
'https://api.instagram.com/v1/users/self/media/recent?access_token=.............&count=40'

I am currently testing this, will see if I get another image,

Because I don't know if anyone else noticed, but the image quality on Instagram seems horrible. You don't normally realize this when using it on mobile, because the images get scaled down and mobile displays have a higher pixel density (dpi).
I've looked at a lot of different Instagrams yesterday, and this really seems to be their quality..

Collaborator

@yan12125 yan12125 commented Jun 10, 2016

I see. Usually youtube-dl tries to parse the web page instead of using APIs. Parsing user pages should be possible.

Contributor

@TRox1972 TRox1972 commented Jun 10, 2016

On user pages it is possible to get different image resolutions the same way that we do it now for videos.


@Hrxn Hrxn commented Jun 10, 2016

On user pages it is possible to get different image resolutions [...]

Can you please elaborate? I don't see what you mean right now.

API access looks like a dead end. Apparently, all registered API clients run in a sandbox mode by default, which limits access to what is otherwise considered public data on Instagram. Full access is only available to 'production' clients, which apparently have to get through a review process first.
Only god knows why...

Whatever, I doubt that this would've helped with the crappy image quality of Instagram ;)

Contributor

@TRox1972 TRox1972 commented Jun 10, 2016

@Hrxn /<user>/media gives some info about the images/videos on the user's Instagram, including the available resolutions. Though as you say, they don't seem to be very high quality:

format code          extension  resolution note
low_resolution       jpg        150x150
thumbnail            jpg        320x320
standard_resolution  jpg        640x640    (best)
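
A rough sketch of how the entry marked (best) in a listing like the one above could be picked, assuming "best" simply means the largest pixel count. The resolutions are copied from the listing; the dictionary keys and the helper names pixel_count, sort_formats and best_format are illustrative, not taken from the extractor.

```python
# Resolutions copied from the listing above; the keys are illustrative.
FORMATS = [
    {"format_id": "low_resolution", "ext": "jpg", "width": 150, "height": 150},
    {"format_id": "thumbnail", "ext": "jpg", "width": 320, "height": 320},
    {"format_id": "standard_resolution", "ext": "jpg", "width": 640, "height": 640},
]


def pixel_count(fmt):
    """Total number of pixels, used as a crude quality measure."""
    return fmt["width"] * fmt["height"]


def sort_formats(formats):
    """Return the formats ordered from worst to best."""
    return sorted(formats, key=pixel_count)


def best_format(formats):
    """Return the format with the most pixels, or None for an empty list."""
    ordered = sort_formats(formats)
    return ordered[-1] if ordered else None
```

With this ordering, any larger variant added to the list would automatically become the best entry.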

@Hrxn Hrxn commented Jun 10, 2016

Oh, okay. Guess I misunderstood. With 'user pages', I thought you maybe meant the normal web URL, i.e. instagram.com/<user>.

I feared that I've failed to see some kind of menu button or something ;)

Okay, but <user>/media is an API endpoint, so I guess the sandbox problem still applies. Or could you do this successfully for another <user>, just some random Instagram account?

My initial plan was to try to find if the original upload would still be available somewhere. Because Instagram keeps that, AFAIK. At least I heard that somewhere.

I don't know about their format list, but there is some more I suspect. One Instagram post I chose for my test is a JPEG with 1080 x 1349, for example.

But quality is still not that good. Ridiculous, if compared to Flickr or 500px, for example.

Contributor

@TRox1972 TRox1972 commented Jun 10, 2016

Or could you do this successfully for another <user>, just some random Instagram account?

The method works for all users I've tested so far (not many). It is also the current method we use in the Instagram extractor only for videos. But because it is an API endpoint, as you mentioned, Instagram may treat us differently.

The best way would be to mimic the browser: Using the <user>/media method for a picture yielded the resolutions mentioned above, but looking at the page source 1080x809 resolution is available, which is not in itself very good quality but, compared to the other resolutions, it's significant.

Author

@EnginePod EnginePod commented Jun 10, 2016

From pages that I've checked the photo with the highest quality is the one in the <meta> tag (or <meta property="og:image" content="[HIGH QUALITY URL HERE]"> to be more specific) and this line is available in the source code straight from the beginning so no browser emulation or API needed.

But if you're trying to get all the photos of a user then it'll be a more complicated task.
The /<user>/media page returns some images, but it doesn't provide the highest quality version of the photo.

I came to this conclusion by comparing the highest quality version of a photo from /<user>/media with the <meta> version that I was talking about earlier.

There is however one solution to getting the highest quality version of the photo, but it's a bit annoying.
The /<user>/media endpoint returns the photo IDs of a user's images, so it'd be possible to collect all of those IDs, fetch each page source by visiting /p/<photo_ID>/, and download the <meta> photo URL that has the highest quality.

Contributor

@TRox1972 TRox1972 commented Jun 10, 2016

The /<user>/media endpoint returns the photo IDs of a user's images, so it'd be possible to collect all of those IDs, fetch each page source by visiting /p/<photo_ID>/, and download the <meta> photo URL that has the highest quality.

If InstagramIE extracts the high-quality image from /p/<photo_id>, then InstagramUserIE could collect all the photo_ids and pass them to InstagramIE.
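
The hand-off described above could look roughly like this. It is a sketch only: the photo IDs are assumed to have been collected already, the /p/<photo_id>/ URL shape is the one mentioned earlier in the thread, and post_url and build_entries are hypothetical helpers rather than methods of InstagramUserIE or InstagramIE.

```python
def post_url(photo_id):
    """Build the single-post page URL for one photo ID."""
    return "https://www.instagram.com/p/%s/" % photo_id


def build_entries(photo_ids):
    """Turn photo IDs into per-post entries, skipping repeated IDs.

    Each entry only carries the post URL; extracting the image itself
    would be left to whatever handles a single post page.
    """
    seen = set()
    entries = []
    for photo_id in photo_ids:
        if photo_id in seen:
            continue
        seen.add(photo_id)
        entries.append({"id": photo_id, "url": post_url(photo_id)})
    return entries
```

Keeping the user-level step limited to building URLs means the image lookup lives in one place, at the cost of one extra page request per post.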

Collaborator

@yan12125 yan12125 commented Jun 11, 2016

youtube-dl will NEVER use any API that requires a key. Please discuss API usage privately, not here.


@Hrxn Hrxn commented Jun 11, 2016

This was never my intention. I'm aware that's not feasible, in addition, not even possible. I said that earlier.

The point was only to find out if Instagram stores higher-resolution images at all, or rather the original resolution, i.e. the uploaded file. Because cell phone cameras may be crap, but they're not as bad as what Instagram is serving as pictures.

Contributor

@TRox1972 TRox1972 commented Jul 28, 2016

Should there be a vote between you developers whether or not youtube-dl will support static images?

Collaborator

@yan12125 yan12125 commented Jul 29, 2016

I guess we can have a simple vote: Click +1 on @EnginePod's original post if you like this idea, and -1 if you don't.

Contributor

@TRox1972 TRox1972 commented Sep 28, 2016

It has been some time now. Any development on the issue?

Collaborator

@yan12125 yan12125 commented Sep 28, 2016

There are not so many +1's. Maybe people don't know about this debate on expanding youtube-dl's coverage.

Contributor

@TRox1972 TRox1972 commented Oct 5, 2016

Is anyone against implementing this? If not, why not implement it?

Collaborator

@yan12125 yan12125 commented Oct 5, 2016


@Tratoschek Tratoschek commented Jun 5, 2017

Don't know if I should make a new issue for this, but I already found a comment about Tumblr in #9337 (comment).

Even so, extracting the best-quality images from Tumblr is also not straightforward. Take e.g. http://programmerryangosling.tumblr.com/post/76642406919. This page links directly to the image, but not in the best quality. To get the best quality you also have to parse http://programmerryangosling.tumblr.com/image/76642406919 (and the browser shows the image only with JavaScript enabled).
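
The step from the post URL to the image page can be sketched as a plain URL rewrite. This assumes every Tumblr blog follows the /post/<id> and /image/<id> pattern of the single example above, which the thread does not confirm; tumblr_image_page is a hypothetical helper name.

```python
import re

# Matches the blog host and the numeric post ID of a /post/<id> URL.
_POST_RE = re.compile(r"^(https?://[^/]+)/post/(\d+)")


def tumblr_image_page(post_url):
    """Map a /post/<id> URL to the matching /image/<id> URL.

    Returns None when the URL does not look like a post URL.
    """
    match = _POST_RE.match(post_url)
    if match is None:
        return None
    return "%s/image/%s" % (match.group(1), match.group(2))
```

The image page would still have to be fetched and parsed afterwards; the rewrite only produces its address.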


@Hrxn Hrxn commented Jun 5, 2017

?

Tumblr?

But this issue is about Instagram?

Author

@EnginePod EnginePod commented Jun 5, 2017

@Tratoschek, this was about Instagram, but the og:image method seems to work well with tumblr too [see https://github.com//issues/9337#issuecomment-225213571].

In other words, you just need to view the source for the page with the specific post and extract og:image.
For the post you linked to: http://68.media.tumblr.com/77b54f57528c86f6c05ec55d1ce948f1/tumblr_n0zx7rXYk71r8lg7to1_1280.jpg


@Tratoschek Tratoschek commented Jun 5, 2017

@Hrxn I know, but this issue has evolved to the central "Support images in youtube-dl"-issue.
@EnginePod Thank you, you're right, haven't seen this.

On the general discussion here: as far as I understand, the current facts are:

  • Photos are out of scope (source)
  • Technically no problem
  • Other youtube-dl features are not meaningful for pictures, e.g. conversion, good picture title (source)

Maybe some of this can be solved with rules for what a "good" picture site is. To suggest some:

  • The main content of the site has to be an (only one) image.
  • Only this image would be downloaded (if different qualities are available, youtube-dl can list them).
  • The generic extractor never does anything special for images (just ignore them).

(These rules would include sites such as Instagram, Tumblr, Imgur, ...)
Because the website is dedicated to the image, it is likely that the image has a meaningful title. FFmpeg supports conversion of images (if this is wanted). The extended scope of youtube-dl would then be clearly defined and partly similar to the current one (extract the one video from the site).

Author

@EnginePod EnginePod commented Jun 5, 2017

"The main content of the site has to be an (only one) image."
I agree with you on many points, especially this one.

As you sort of mentioned, it would be easy to implement by just adding the image as a format, example:
format: image, ext: jpg, url: http://68.media.tumblr.com/77b54f57528c86f6c05ec55d1ce948f1/tumblr_n0zx7rXYk71r8lg7to1_1280.jpg and setting thumbnail to either the image URL or leaving it at null.

To keep images and videos separated on sites that provide both (like FB, IG, etc.) you would have the image support only as a fallback. So if there was no video on the page then it'd fall back to extracting the image (unless there's a 404).
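
The fallback order described above can be sketched as follows. This is an illustration under assumptions, not youtube-dl code: build_media_info and guess_ext are hypothetical helpers, the "video" and "image" format names follow the example in this comment, and the extension is simply read from the URL path.

```python
import os
from urllib.parse import urlparse


def guess_ext(url, default):
    """Take the file extension from the URL path, or fall back to default."""
    ext = os.path.splitext(urlparse(url).path)[1].lstrip(".")
    return ext.lower() or default


def build_media_info(video_url=None, image_url=None):
    """Prefer the video; use the image as a format only when no video exists."""
    if video_url:
        fmt = {"format_id": "video", "ext": guess_ext(video_url, "mp4"), "url": video_url}
        return {"formats": [fmt], "thumbnail": image_url}
    if image_url:
        fmt = {"format_id": "image", "ext": guess_ext(image_url, "jpg"), "url": image_url}
        # For a static picture the thumbnail is the target itself.
        return {"formats": [fmt], "thumbnail": image_url}
    # Neither a video nor an image was found (for example on a 404 page).
    return None
```

Because the image branch is only reached when no video URL was found, pages with both keep their current behaviour.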

The problem doesn't seem to be the technical implementation, but rather that photos shouldn't be supported by youtube-dl (see @dstftw's reply) even though it'd make life easier for everyone.


@fletom fletom commented Nov 23, 2017

+1 for Twitter and Instagram photo downloads. especially with gifs nowadays the distinction is somewhat baseless

youtube-dl doesn't just work on YouTube anymore, and it doesn't have to just work for video either

Collaborator

@yan12125 yan12125 commented Nov 23, 2017

With 14 +1 votes, I consider this a valid request. The implementation still requires discussion.

@yan12125 yan12125 reopened this Nov 23, 2017
@yan12125 yan12125 added the request label Nov 23, 2017

@dardo82 dardo82 commented Nov 30, 2017

I'd like this feature, how can I help?

@leonklingele leonklingele mentioned this issue Oct 13, 2018
Contributor

@haasn haasn commented Nov 10, 2018

I realize this is an old issue with a lot of historical discussion on the topic already, so adding my own flavor of "+1" probably isn't going to do much good.

Nonetheless, as a long-time user of youtube-dl I'd like to emphasize just how usable the internet is with youtube-dl existing. Without youtube-dl I would basically be unable to use any video site currently in existence - and it's simply staggering the amount of logic youtube-dl has for nearly every media hoster in use, to the point where youtube-dl even supports crap like dropbox - as long as what's pointed to by it is a video file.

In fact, if somebody sends me, say, a dropbox link that isn't a video file, I'm unable to open it. This restriction has always seemed sort of artificial to me. youtube-dl clearly has logic for downloading things from dropbox - why should this wonderfully useful logic have a built-in check that prevents it from operating if the link you got sent doesn't have the right file type? It's this weird disconnect between what the code supports and what the tool allows it to do that has always sort of bothered me.

By limiting the scope of youtube-dl you're essentially saying that we need to create a second tool that replicates all of the site-specific logic that youtube-dl has, but with the intention of allowing non-videos to be downloaded as well. This seems sort of silly to me. Massive amounts of work clearly go into youtube-dl and it seems to me it would be much easier and more logical to extend youtube-dl than it would be to recreate it from scratch.

Add to this the fact that images are sort of a gray area in between videos and general files and it makes my opinion even stronger that youtube-dl should be able to handle them, since images are just more media files. It doesn't matter whether the tool at the other end (e.g. mpv) is opening one type of media file or another, it will still open. Now, normally I don't really care much about the image use case because browsers are normally able to display images. But then you have complete garbage like imgur which these days simply doesn't display anything at all, and I find myself wishing youtube-dl could help me every single time it happens. At some point I may try ripping out the imgur-specific logic from youtube-dl and wrapping it into my own command line tool to grab them, simply so I can view imgur links again. But it still seems like it would be a waste for youtube-dl not to add support directly.

Anyway, since I don't commit to youtube-dl my words are cheap so take them for what they are. Just a frustrated user who's tired of having to ask people to re-upload imgur links to other websites simply so he can view them..

@ytdl-org ytdl-org locked and limited conversation to collaborators May 13, 2019
@ytdl-org ytdl-org deleted a comment from spiralofhope May 13, 2019
@ytdl-org ytdl-org deleted a comment from ealgase May 13, 2019
@ytdl-org ytdl-org deleted a comment from Hrxn May 13, 2019
@ytdl-org ytdl-org deleted a comment from ealgase May 13, 2019
@ytdl-org ytdl-org deleted a comment from dardo82 May 13, 2019