Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for reading DPI information from JPEG2000 images #5568

Merged
merged 8 commits into from
Aug 2, 2021

Conversation

rogermb
Copy link
Contributor

@rogermb rogermb commented Jun 30, 2021

Changes proposed in this pull request:

  • Currently, Pillow does not read the DPI information for JPEG2000 images, unlike it does for other image formats.
    This PR seeks to remedy this issue by also parsing the "resc" JPEG2000 header box if it is present.
  • I didn't want to copy & paste the same box header reading code a third time due to the hierarchical format of the JPEG2000 header, so I wrapped the common header reading code in a small helper class (BoxReader), see the first commit.
  • However, this only adds support for reading DPI information, as OpenJPEG currently lacks support for writing the required "res " and "resc" header boxes. There are already years-old open issues addressing this shortcoming: OpenJPEG/#378, OpenJPEG/#788.

This is my first contribution to Pillow, so please let me know if I messed something up :)

@radarhere radarhere changed the title Add support for reading dpi information for JPEG2000 images Add support for reading DPI information from JPEG2000 images Jun 30, 2021
@radarhere radarhere added the JPEG label Jun 30, 2021
@rogermb
Copy link
Contributor Author

rogermb commented Jun 30, 2021

I'm not sure what's up with the pypy-3.6 x86 build failing on Windows. It worked on my fork and I haven't even modified the file that seems to be causing problems, Tests/test_image_getextrema.py.

Perhaps it's just a random failure and running the Windows build again will fix it?

@radarhere
Copy link
Member

I ran the job again and it is passing now

@@ -154,9 +154,6 @@ def _parse_jp2_header(fp):
if reader.read_fields(">4s")[0] == b"jpx ":
mimetype = "image/jpx"

if header is None:
raise SyntaxError("Could not find JP2 header")
Copy link
Contributor Author

@rogermb rogermb Aug 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, this is unreachable, reader.has_next_box() will never return False (because reader = BoxReader(fp) is initialized without a length), so the only possible outcome for a malformed image not containing a "jp2h" header is that reader.next_box_type() will eventually fail with some kind of exception.

While not exactly elegant, that's definitely better than having an if statement that misleads the reader. 👍

@@ -33,12 +33,12 @@ def __init__(self, fp, length=-1):
self.remaining_in_box = -1

def _can_read(self, num_bytes):
if self.has_length and self.fp.tell() + num_bytes > self.length:
# Outside box: ensure we don't read past the known file length
return False
if self.remaining_in_box >= 0:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could technically also be an elif for symmetry and whatnot -- not that it matters, since we return from the previous if branch anyway. 👍 on spotting and fixing the bug where we could read past the parent box length!

)
return (254 * num * (10 ** exp)) / (10000 * denom)
if denom != 0:
return num / denom * (10 ** exp) * 0.0254
Copy link
Contributor Author

@rogermb rogermb Aug 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should've probably clarified this in a code comment: the reason for my roundabout way of calculating the resolution was to remain precise by working with ints for as long as possible, and to only have a single (division) operation which introduces floating point errors.

While your code is more elegant and more easily readable, there are now 3 floating point operations that can introduce slight numeric errors.

I'm not sure whether my approach is overkill, though. It's not like those last few mantissa bits are ever realistically going to matter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok. I was thinking of being consistent with how this value is used elsewhere in Pillow.

I'll switch back to your version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks! 😄

@rogermb
Copy link
Contributor Author

rogermb commented Aug 1, 2021

Also, thank you for the additional test cases! 🚀

@rogermb
Copy link
Contributor Author

rogermb commented Aug 2, 2021

LGTM! Thank you for all of the code improvements 😄

@radarhere radarhere merged commit 6406dab into python-pillow:master Aug 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants