PdfPageBuilder.AddJpeg fails to find the correct header values in the image #394

Closed
cremor opened this issue Nov 29, 2021 · 3 comments

cremor commented Nov 29, 2021

I found a JPEG image file that cannot be added to a PDF by calling PdfPageBuilder.AddJpeg(). The resulting image in the PDF just shows a mostly black area.

I debugged the problem a bit and it seems like there are at least two problems:

The image has a width of 3024 pixels and a height of 4032 pixels. But the internal PdfPig call to JpegHandler.GetInformation(fileStream) returns a width of 160 and a height of 120. So the values are not only wrong but also represent the wrong aspect ratio.

The wrong aspect ratio seems to be because the image contains an EXIF orientation tag that specifies a 90 degree rotation. (EXIF tag id 0x112 and value 6.) It seems like PdfPig doesn't check this EXIF tag to get the image dimensions.
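To illustrate the orientation point, here is a minimal Python sketch (not PdfPig code) of how the displayed dimensions follow from the stored ones. The function name `displayed_size` is hypothetical. It relies on the EXIF convention that orientation values 5 to 8 involve a 90 degree rotation or transposition, so width and height swap for display. The stored size of 4032 x 3024 used below is an assumption; the issue only states the displayed size.

```python
def displayed_size(stored_width, stored_height, orientation):
    """Return (width, height) as a viewer would show the image.

    orientation is the value of EXIF tag 0x112. Values 5, 6, 7 and 8
    rotate or transpose the image by 90 degrees, so the stored width
    and height swap. Values 1 to 4 only mirror or rotate by 180 degrees.
    """
    if orientation in (5, 6, 7, 8):
        return stored_height, stored_width
    return stored_width, stored_height


# Assumed stored size for the image from this issue, with orientation 6.
print(displayed_size(4032, 3024, 6))
```

A reader that ignores the tag would report the stored landscape size for a photo that viewers show in portrait.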

I've then used the Windows image viewer to resave the image. This seems to remove the EXIF orientation tag and instead save the image already "correctly" orientated.
But even after that the image can't be added via PdfPig. JpegHandler.GetInformation(fileStream) then returns a width of 192 and a height of 256. So the aspect ratio (orientation) is now correct, but the image dimensions are still way off.

From what I could figure out with https://cyber.meme.tips/jpdump/ the problem might be that PdfPig uses the wrong bytes to get the dimensions. The bytes that represent the dimension values returned by PdfPig start at offset 0x1b8. But according to jpdump the correct header only starts at offset 0x299b. So it seems like PdfPig finds the "header marker" value of 0xffc0 in an earlier part of the file that isn't actually the header yet. According to jpdump that early part of the file contains various "application segments", and the first of those segments seems to be quite big and to contain a thumbnail of the image.
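The difference between scanning for the bytes 0xffc0 and reading the real frame header can be sketched in Python as follows. This is an illustration only, not PdfPig's implementation, and `jpeg_dimensions` is a hypothetical name. The idea is to walk the top-level segments by their declared length, so that an application segment with an embedded thumbnail is skipped as a whole instead of being searched for marker bytes.

```python
import struct

# Start-of-frame markers. 0xC4 (DHT), 0xC8 (JPG) and 0xCC (DAC) share
# the 0xC0 to 0xCF range but are not frame headers.
SOF_MARKERS = {0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
               0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF}


def jpeg_dimensions(data: bytes) -> tuple[int, int]:
    """Return (width, height) from the first top-level SOF segment."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos:#x}")
        marker = data[pos + 1]
        if marker == 0xFF:
            # Fill byte in front of a marker: step over it.
            pos += 1
            continue
        if marker in (0x01, 0xD8) or 0xD0 <= marker <= 0xD7:
            # Standalone markers carry no length field.
            pos += 2
            continue
        if marker in (0xD9, 0xDA):
            # End of image or start of scan: no frame header came first.
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            # Segment body: precision (1 byte), height (2), width (2).
            _precision, height, width = struct.unpack(
                ">BHH", data[pos + 4:pos + 9])
            return width, height
        # The length counts its own two bytes but not the marker, so this
        # jumps over the whole segment, including any embedded thumbnail.
        pos += 2 + length
    raise ValueError("no SOF segment found")
```

With a file laid out like the one described above, this walk would skip the large first application segment and read the dimensions from the later frame header, whereas a plain byte search would stop at the thumbnail's header inside that segment.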

Sadly, I don't know if I can provide the problematic image file. It contains personal data so I'd have to check that with the owner of the data. And if I try to modify the image to redact the personal data the problem doesn't happen any more.
But I can provide specific bytes of the files if that helps. Or I could provide you with more information from https://cyber.meme.tips/jpdump/

@cremor changed the title from "PdfPageBuilder.AddJpeg fails with a specific file" to "PdfPageBuilder.AddJpeg fails to find the correct header values in the image" on Nov 30, 2021
@EliotJones EliotJones added the bug label Jan 10, 2022
@EliotJones
Member

I know it's probably not an issue for you since you probably found a workaround, but I added a change to handle thumbnails in images. This still might not work for EXIF images, but it should fix most JPGs.

0b876cb

@cremor
Author

cremor commented Jan 18, 2022

I can confirm that version 0.1.6-alpha-20220116-e54cd fixes the problem with thumbnails, thanks!

So only the EXIF orientation problem remains. Do you want to handle that in this issue or should I open a new one?
Btw, you can find EXIF orientation sample images here: https://github.com/recurser/exif-orientation-examples

@EliotJones
Member

Sorry, won't get to EXIF stuff and absent any contributors picking it up it's just going to languish here forever 😢
