
Fix Pillow high depth images #627

Merged 5 commits into OCR-D:master on Oct 23, 2020

Conversation

@bertsky (Collaborator) commented Oct 18, 2020

Digitisation sometimes produces 16-bit or even 32-bit grayscale scans, which Pillow still (as of 8.0) supports poorly. This affects us severely, because we rely on PIL's median, convert and paste to generate page and segment images for all processors. Due to the pertinent bugs (at least python-pillow/Pillow#3159, python-pillow/Pillow#3838, python-pillow/Pillow#3011), we then get fully or partially blacked-out crops.

This workaround attempts to convert such images to 8-bit depth before doing anything else (which is hard enough to get right). The idea is that as soon as images enter the API, be they original or derived, we prevent PIL from doing any harm (at the price of sacrificing some precision).
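
Below is a minimal sketch of the idea (illustrative only, not the actual diff; the reduce_depth helper and the shift-by-8 scaling are my assumptions here):

    import numpy as np
    from PIL import Image

    def reduce_depth(img):
        """Reduce high-depth grayscale to 8-bit before any PIL operation (sketch)."""
        if img.mode.startswith('I;16'):
            # 16-bit unsigned grayscale: keep the most significant byte
            arr = np.asarray(img, dtype=np.uint16)
            return Image.fromarray((arr >> 8).astype(np.uint8), mode='L')
        if img.mode == 'F':
            # 32-bit float grayscale: scale the observed range into 0..255
            arr = np.asarray(img, dtype=np.float32)
            lo, hi = float(arr.min()), float(arr.max())
            if hi > lo:
                arr = (arr - lo) / (hi - lo) * 255
            return Image.fromarray(arr.astype(np.uint8), mode='L')
        return img  # 1-bit and 8-bit modes pass through unchanged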

Here is an example image (which, being TIFF, GitHub unfortunately won't embed directly):

OCR-D-IMG_APBB_Mitteilungen_62.0002.zip

I'm sure you'll (rightly) ask for more "correct" (or at least well-defined) test images, but curating these may take us some time. I can already point you to Pillow's own test set, but I have not properly screened it for good case studies yet.

As an additional difficulty, many image viewer tools don't handle these formats correctly either. IM's display does, but its identify output is not easy to interpret here (though it does still help). It seems to distinguish between actual depth (as in pixel statistics?) and formal depth (as in metadata). You'll get

  • actual depth via -format "%[bit-depth]" or the part after the slash in -verbose's Depth: description (if any)
  • formal depth via -format "%z" or the part before the slash in -verbose's Depth: description (always)

For example, I've seen 16-bit, 16/8-bit, 16/15-bit, 32/16-bit images.
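
Pillow offers a rough equivalent of that distinction: the image mode gives the formal depth, while the pixel extrema give the actual depth. A quick probe (the filename refers to the TIFF from the zip above, so treat it as an assumption):

    import numpy as np
    from PIL import Image

    img = Image.open('OCR-D-IMG_APBB_Mitteilungen_62.0002.tif')
    print(img.mode)                     # formal depth, e.g. 'I;16' for 16-bit grayscale
    arr = np.asarray(img)
    print(arr.dtype)                    # uint16 for 'I;16', but int32 for mode 'I'
    print(int(arr.max()).bit_length())  # actual depth: bits used by the largest pixel value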

Anyway, I suggest separating full regression-test coverage from the actual workaround for now and fast-tracking this. I'll re-run this on a couple of other workspaces and watch for discrepancies.

@codecov-io commented Oct 18, 2020

Codecov Report

Merging #627 into master will decrease coverage by 0.39%.
The diff coverage is 22.22%.

@@            Coverage Diff             @@
##           master     #627      +/-   ##
==========================================
- Coverage   84.72%   84.32%   -0.40%     
==========================================
  Files          52       52              
  Lines        2952     2966      +14     
  Branches      575      579       +4     
==========================================
  Hits         2501     2501              
- Misses        336      349      +13     
- Partials      115      116       +1     
Impacted Files           Coverage  Δ
ocrd/ocrd/workspace.py   66.06%    <22.22%> (-2.93%) ⬇️

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 56c4aa6...e202bb8.

@kba (Member) commented Oct 19, 2020

OK, so while this incurs a bit of a computing penalty, it will also reduce memory consumption, right?

If it improves handling with Pillow: fine with me, though I would really like to see a unit test for this.
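
Such a test could look roughly like this (a sketch assuming a depth-reducing helper like the reduce_depth outlined above, not the test that eventually landed):

    import numpy as np
    from PIL import Image

    def test_16bit_grayscale_not_blacked_out():
        # synthetic 16-bit gradient spanning the full uint16 range
        arr = np.linspace(0, 65535, 256, dtype=np.uint16).reshape(16, 16)
        img = Image.fromarray(arr)  # uint16 maps to mode 'I;16'
        out = reduce_depth(img)     # hypothetical helper from the sketch above
        assert out.mode == 'L'
        assert np.asarray(out).max() > 0  # a blacked-out crop would be all zeros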

@kba (Member) commented Oct 19, 2020

AFAICT, we don't have any 16-or-more-bit images in assets currently:

identify -format "%[filename]: %[bit-depth]\n" **/*tif **/*.png

@bertsky (Collaborator, Author) commented Oct 19, 2020

> AFAICT, we don't have any 16-or-more-bit images in assets currently:
>
> identify -format "%[filename]: %[bit-depth]\n" **/*tif **/*.png

Yes, only 8-bit and 1-bit so far.

> OK, so while this incurs a bit of a computing penalty, it will also reduce memory consumption, right?

Actually, I think it reduces both CPU time and memory: in typical workflows, lots of operations follow up on that first open, and they all run faster at smaller depths.

The only true cost here is precision – which I would have preferred to avoid, but Pillow leaves us no choice. (I thought about using RGB for this, because 24 bits should not incur the large quantization error of single-channel 8-bit. But simply having three identical 8-bit channels would not help at all, and we would have to pay attention to color profiles everywhere, especially linear vs. logarithmic, so that we could afterwards convert back to a luma-equivalent grayscale value.)

@kba merged commit affe113 into OCR-D:master on Oct 23, 2020
@bertsky mentioned this pull request on Jan 13, 2021
@bertsky (Collaborator, Author) commented Nov 5, 2021

Sorry, I just found another case where the current implementation does not work yet: 16-bit grayscale which Pillow imports as mode I (not I;16), so its NumPy interface yields int32, which causes our entire value range to get scaled down to 0. (I cannot share the image for privacy reasons, but I will dig up an equivalent and then open an issue...)
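
One way to handle this case is to scale by the observed extrema rather than the formal 32-bit range (a sketch of a possible fix, not the code that was eventually committed):

    import numpy as np
    from PIL import Image

    def i_mode_to_8bit(img):
        # Pillow imports some 16-bit TIFFs as mode 'I' (32-bit signed),
        # so np.asarray yields int32; dividing by the formal 32-bit range
        # would map everything to 0. Scale by the actual extrema instead.
        arr = np.asarray(img, dtype=np.float64)
        lo, hi = float(arr.min()), float(arr.max())
        if hi > lo:
            arr = (arr - lo) / (hi - lo) * 255
        return Image.fromarray(arr.astype(np.uint8), mode='L')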
