[BUG] - Scraper and image URL getter can't identify .jpeg #3714

Closed
6 tasks done
xtraorange opened this issue Jun 7, 2024 · 5 comments
Labels
bug (Something isn't working), can't reproduce, stale

Comments

@xtraorange

xtraorange commented Jun 7, 2024

First Check

  • This is not a feature request.
  • I added a very descriptive title to this issue (title field is above this).
  • I used the GitHub search to find a similar issue and didn't find it.
  • I searched the Mealie documentation, with the integrated search.
  • I already read the docs and didn't find an answer.
  • This issue can be replicated on the demo site (https://demo.mealie.io/).

What is the issue you are experiencing?

When using either the scraper or the get image by URL dialogue, Mealie is unable to identify a .jpeg image. The get image by URL dialogue produces a "Url is not an image" error. Interestingly, uploading the same kind of file from a computer works fine.

Steps to Reproduce

  1. Click Create.
  2. Click Import.
  3. Provide a URL that has a feature image with an extension of .jpeg. (example: https://emeals.com/recipes/recipe-47273-303002-Shrimp-and-Sausage-Jambalaya)
  4. Click Create.
  5. Note that no image is detected in the resulting recipe. (Pages on the same website that use an image with a .jpg extension import without issues.)

OR

  1. Edit an existing recipe
  2. Click Image.
  3. Provide the URL to an image with a .jpeg extension. (example: https://emeals-menubuilder.s3.amazonaws.com/v1/recipes/766433/pictures/large_shrimp-and-sausage-jambalaya.jpeg)
  4. Click Get.
  5. Note the error.

Please provide relevant logs

INFO 2024-06-07T13:41:45 - Image URL: https://emeals-menubuilder.s3.amazonaws.com/v1/recipes/766433/pictures/large_shrimp-and-sausage-jambalaya.jpeg
INFO 2024-06-07T13:41:45 - HTTP Request: GET https://emeals-menubuilder.s3.amazonaws.com/v1/recipes/766433/pictures/large_shrimp-and-sausage-jambalaya.jpeg "HTTP/1.1 200 OK"
ERROR 2024-06-07T13:41:46 - Content-Type: binary/octet-stream is not an image
ERROR 2024-06-07T13:41:46 - Content-Type: binary/octet-stream is not an image

Mealie Version

v1.8.0 - 583bd742fb7bbeee0191a6e6601677df57d86a11

Deployment

Docker (Linux)

Additional Deployment Details

No response

@xtraorange added the bug (Something isn't working) and triage labels Jun 7, 2024
@boc-the-git
Collaborator

I can upload a .jpeg just fine. Created one in MS Paint and uploaded: https://demo.mealie.io/g/home/r/pannkakor

I expect there's something wrong with the images on that website. Here's the check it's failing:

content_type = r.headers.get("content-type", "")
if "image" not in content_type:
    self.logger.error(f"Content-Type: {content_type} is not an image")
    raise NotAnImageError(f"Content-Type {content_type} is not an image")
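
To make the failure concrete, here is a minimal sketch of that check pulled out into a standalone function. It is an illustration rather than Mealie's actual code: `check_content_type` and `is_accepted` are hypothetical names, the `NotAnImageError` class is a local stand-in, and a plain dict with lowercase keys stands in for the response headers.

```python
class NotAnImageError(Exception):
    """Local stand-in for the exception named in the snippet above."""


def check_content_type(headers: dict) -> str:
    # Same substring test as the quoted snippet, minus the logger call.
    content_type = headers.get("content-type", "")
    if "image" not in content_type:
        raise NotAnImageError(f"Content-Type {content_type} is not an image")
    return content_type


def is_accepted(headers: dict) -> bool:
    # Hypothetical helper: turn the raise/return behaviour into a boolean.
    try:
        check_content_type(headers)
        return True
    except NotAnImageError:
        return False


# The header value from the logs in this issue is rejected,
# while a conventional image type passes.
print(is_accepted({"content-type": "binary/octet-stream"}))
print(is_accepted({"content-type": "image/jpeg"}))
```

Under these assumptions the file extension in the URL never enters the decision; only the header value does.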

@xtraorange
Author

> I can upload a .jpeg just fine. Created one in MS Paint and uploaded: https://demo.mealie.io/g/home/r/pannkakor
>
> I expect there's something wrong with the images on that website. Here's the check it's failing:
>
> content_type = r.headers.get("content-type", "")
> if "image" not in content_type:
>     self.logger.error(f"Content-Type: {content_type} is not an image")
>     raise NotAnImageError(f"Content-Type {content_type} is not an image")

As I mentioned, uploading works fine. The problem only occurs when scraping or when fetching an image by URL where the image has a .jpeg extension.

It looks like the failure is because it thinks that's not an image, which could be because it doesn't recognize .jpeg as a valid image extension.
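
One way to separate the extension hypothesis from the header is to look at what Python's standard `mimetypes` module makes of the URL. The sketch below is only an illustration, not something Mealie is known to do: `effective_content_type` and `GENERIC_TYPES` are hypothetical names, and falling back to the extension is an assumption about one possible workaround. It trusts the declared Content-Type unless it is a generic binary type, and only then guesses from the path.

```python
import mimetypes
from urllib.parse import urlparse

# Hypothetical set of header values treated as "no real type declared".
GENERIC_TYPES = {"", "application/octet-stream", "binary/octet-stream"}


def effective_content_type(url: str, header_value: str) -> str:
    # Drop parameters such as "; charset=utf-8" and normalise case.
    declared = header_value.split(";", 1)[0].strip().lower()
    if declared not in GENERIC_TYPES:
        return declared
    # Fall back to the file extension in the URL path (an assumption,
    # weaker evidence than a correct header).
    guessed, _encoding = mimetypes.guess_type(urlparse(url).path)
    return guessed or declared


url = (
    "https://emeals-menubuilder.s3.amazonaws.com/v1/recipes/766433/"
    "pictures/large_shrimp-and-sausage-jambalaya.jpeg"
)
print(effective_content_type(url, "binary/octet-stream"))
```

The standard library maps `.jpeg` to `image/jpeg`, so the extension itself is recognisable; the logged `binary/octet-stream` header is what differs from a typical image response.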

@hay-kot
Collaborator

hay-kot commented Jun 11, 2024

It's the Content-Type header returned from the S3 bucket that's causing the issue. We trust the Content-Type to ensure that the response is actually an image. To handle this case, we'd either have to try to parse the contents or find some other way to check whether it's an image.

[Screenshot: CleanShot 2024-06-10 at 21 51 56@2x]
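
As a rough illustration of the "parse the contents" option, the sketch below checks the leading bytes of a downloaded body against a few well-known image signatures. It is not a proposed patch: `sniff_image_type` is a hypothetical name, the format list is deliberately short, and a matching signature only shows that the data starts like an image, not that it decodes cleanly.

```python
from typing import Optional


def sniff_image_type(data: bytes) -> Optional[str]:
    """Guess an image MIME type from leading magic bytes, or return None."""
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None


# A body that starts with the JPEG signature is identified regardless of
# the Content-Type header the server sent along with it.
print(sniff_image_type(b"\xff\xd8\xff\xe0" + b"\x00" * 16))
```

A check like this could run only when the header is a generic binary type, so that correctly labelled responses keep the existing fast path. Fully decoding the file with an imaging library would be a stricter alternative.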

@xtraorange
Copy link
Author

I wonder why the .jpg images on the same site work then? How strange. Well, thanks for looking into it!

Contributor

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions bot added the stale label Jul 12, 2024
@github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Jul 17, 2024