Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fetchart: Use the right filename extension according to the image's MIME type #2053

Closed
broddo opened this issue Jun 16, 2016 · 12 comments · Fixed by #2068
Closed

fetchart: Use the right filename extension according to the image's MIME type #2053

broddo opened this issue Jun 16, 2016 · 12 comments · Fixed by #2068
Assignees
Labels
bug bugs that are confirmed and actionable

Comments

@broddo
Copy link
Contributor

broddo commented Jun 16, 2016

This is something I noticed today when using fetchart with the fanarttv plugin - one random albumart file it downloaded was actually a PNG file, not a JPG. This wasn't fetchart's fault - I verified that the source image was indeed a PNG from fanarttv despite having the .jpg extension.

But, I think a nice feature would be to include some form of image validty check and perhaps offer to convert to the image to the desired default format.

@sampsyo sampsyo added the needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." label Jun 16, 2016
@sampsyo
Copy link
Member

sampsyo commented Jun 16, 2016

Wow! That's too bad. Is there a way we can reproduce this (i.e., can we fake fetching art for the album you have to get the same image)?

@broddo
Copy link
Contributor Author

broddo commented Jun 16, 2016

Ha, it's hardly a troubling issue, but thought it was worth mentioning! The album in question was 9 by Damien Rice. If you grab that from fanarttv you'll see what I mean.

Gnome image viewer wouldn't open it. Gimp opened it (I assume it ignores file types and goes by the file header instead) and when I checked it with the file command, it said it was a PNG.

@jackwilsdon
Copy link
Member

jackwilsdon commented Jun 16, 2016

$ wget -q https://fanart.tv/detailpreview/fanart/music/2119beb8-6ac5-4f21-82a4-b831c90c0024/albumcover/9-4f31935ae2a25.jpg
$ file 9-4f31935ae2a25.jpg 
9-4f31935ae2a25.jpg: PNG image data, 280 x 280, 8-bit/color RGB, non-interlaced

Yep, certainly seems like an issue! Thanks @broddo 😄.

@broddo
Copy link
Contributor Author

broddo commented Jun 16, 2016

That's the one. Sorry I should have provided you with the link. I'm half asleep here.

@jackwilsdon
Copy link
Member

Haha, no problem! 👍 @sampsyo Do we actually make an attempt at detecting the file type currently and converting if it's not a JPEG?

@sampsyo
Copy link
Member

sampsyo commented Jun 16, 2016

Aha!

We don't check the data itself, but we do check the MIME type that the server reports. Fortunately, it looks like fanart.tv doesn't lie there:

$ curl -I "https://fanart.tv/detailpreview/fanart/music/2119beb8-6ac5-4f21-82a4-b831c90c0024/albumcover/9-4f31935ae2a25.jpg"
HTTP/1.1 200 OK
Date: Thu, 16 Jun 2016 23:03:59 GMT
Content-Type: image/png

The trick is that our fetch_image method only validates that the file is either reported as a PNG or a JPEG; it doesn't assign the extension based on that. In fact, unless I'm reading this wrong, I think the file extension is actually hard-coded as JPEG? In any case, we should instead use the MIME type to decide the extension for the downloaded file.

@sampsyo sampsyo added bug bugs that are confirmed and actionable and removed needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." labels Jun 16, 2016
@sampsyo sampsyo changed the title fetchart: check images for validity fetchart: Use the right filename extension according to the image's MIME type Jun 16, 2016
@wisp3rwind
Copy link
Member

Assigning this to myself as a reminder, but I'll likely not tackle it until in a few weeks. So if anyone wants to step up, feel free to do so, no need to notify me.

@wisp3rwind
Copy link
Member

wisp3rwind commented Jun 21, 2016

Unfortunately the files @jackwilsdon and @sampsyo mentioned are only the website previews, not the full-size images returned by the API, and for the latter fanart.tv does lie:

curl -I https://fanart.tv/fanart/music/2119beb8-6ac5-4f21-82a4-b831c90c0024/albumcover/9-4f31935ae2a25.jpg
HTTP/1.1 200 OK
Date: Tue, 21 Jun 2016 09:20:57 GMT
Content-Type: image/jpeg

so the fix in #2068 will not solve your problem.

On our side, this would need some file-like type detection. I'll probably also report this upstream. IIRC the guideline on fanart.tv is only 1000px jpeg, but apparently, images can slip through that requirement.

EDIT: link to the artist for my own reference

@jackwilsdon
Copy link
Member

jackwilsdon commented Jun 21, 2016

What do you think of using python-magic, @wordofglass? The issue we have then is people need to install libmagic, which can be especially a pain on Windows.

$ wget -q https://fanart.tv/fanart/music/2119beb8-6ac5-4f21-82a4-b831c90c0024/albumcover/9-4f31935ae2a25.jpg
$ python -c "import magic; print(magic.from_file('./9-4f31935ae2a25.jpg'))"
PNG image data, 1000 x 1000, 8-bit colormap, non-interlaced

Admittedly we'd need to parse the magic type but it's much better than any other solutions I can come up with see below.


Another idea I just came up with is using the identify command that comes with ImageMagick (which is probably better, although it would add a hard dependency on ImageMagick for fetchart):

$ wget -q https://fanart.tv/fanart/music/2119beb8-6ac5-4f21-82a4-b831c90c0024/albumcover/9-4f31935ae2a25.jpg
$ identify -format "%m" 9-4f31935ae2a25.jpg
PNG

@wisp3rwind
Copy link
Member

I've already updated the PR (#2068) using the standard library imghdr. This module is pretty basic, but I think it's sufficient for our purposes and does not introduce a new dependency. Unless there's a significant functional advantage of your suggestions, I'd stick to this.
Is it correct that really only jpeg and png are relevant image formats for beets? Then I see no need for an additional library.

@jackwilsdon
Copy link
Member

Ah, I didn't realise you'd already sorted it out (the PR still says "EDIT: doesn't fix #2053, see there").

@wisp3rwind
Copy link
Member

Updated the post ✨

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug bugs that are confirmed and actionable
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants