improve handling of rotation #311

bertsky · 2019-09-11T16:59:08Z

Workspace.image_from_*: when rotation is necessary, do not always fill with white; instead, determine the background color by median, and only use white for binary images; moreover, add a transparency
channel if the input mode allows it

codecov-io · 2019-09-11T17:32:34Z

Codecov Report

Merging #311 into master will decrease coverage by 0.97%.
The diff coverage is 33.33%.

@@            Coverage Diff             @@
##           master     #311      +/-   ##
==========================================
- Coverage   90.82%   89.85%   -0.98%     
==========================================
  Files          30       30              
  Lines        1603     1626      +23     
  Branches      309      317       +8     
==========================================
+ Hits         1456     1461       +5     
- Misses        111      125      +14     
- Partials       36       40       +4

Impacted Files	Coverage Δ
ocrd/ocrd/workspace.py	`62.27% <31.81%> (-3.27%)`	⬇️
ocrd_utils/ocrd_utils/__init__.py	`78.17% <35.71%> (-3.75%)`	⬇️
ocrd/ocrd/resolver.py	`98.79% <0%> (+3.61%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0410153...1d82c42. Read the comment docs.

do not always fill with white; instead, determine the background color by median, and only use white for binary images; moreover, add a transparency channel if the input mode allows it

(image_from_polygon), keep the input image mode; moreover, add a transparency channel if the input image allows it

wrznr · 2019-09-12T07:27:44Z

All CI tests fail!

wrznr

🥇

bertsky · 2019-09-20T06:34:53Z

Please don't merge, yet! Just as we expected earlier, this does cause problems with some consumers:

OSError: cannot write mode LA as JPEG

This is tesserocr saying it cannot convert the image to libleptonica's Pixa, because JPGs must not have a transparency channel. (The choice JPEG is a fixed default in tesserocr.pyx.) Others like Ocropy might complain they suddenly see too many (2 or 4) dimensions in their Numpy arrays (instead of 1 or 3).

It is too much an imposition to expect the transparency reduction to white from the consumer. But it is also hard to tell from the API which consumers want which behaviour. Maybe we should spend a new option fill like so:

white: fill areas between segment polygon and rectangle with white
background: fill with the background colour, measured via median
transparent: add a transparency channel (current state of the PR)

@wrznr @kba what do you think?

bertsky · 2019-09-20T13:02:48Z

Correction: The above error for Tesseract is a shortcoming entirely attributable to tesserocr. And since in Ocropy/ocrolib with pil2image we have full control of what behaviour we want with transparent inputs, I propose to stick with fill as a parameter, but use transparent as default. Thus, we get the current behaviour, but can opt-out. Commit is following shortly. Please re-review!

- image_from_page, image_from_segment, image_from_polygon: add parameter ``fill`` - possible values white/background/transparent, with ``transparent`` (behaviour introduced by this branch) as default

bertsky · 2019-09-20T20:54:57Z

A little elaboration is due on why we want transparency. Transparency can preserve the information which image regions are of interest (as a mask), without clipping the rest to some (not so) random colour like white or background. Especially if we have to "invent" the color as in rotation with reshaping.

But this is only helpful for

image-only consumers (like CNN segmentation, dewarping or recognition), and among that only
those that do not already rely on binarized input, and among that only
those that are capable of interpreting a transparency channel.

By image-only I mean processors that cannot mask image rectangles via coordinates in their inner representation. By 2 I exclude approaches like Ocropy which do everything in binary anyway (where we can and must safely clip to white). And 3 could be e.g. a CNN segmentation without alpha channel in the model input.

So this positively includes Tesseract recognition (as it turns out – at least in principle), and upcoming segmentation tools like ocrd_segment and probably the modules from Würzburg and DFKI.

But by making transparent images the default, we put a burden on all consumers to be able to cope with the extra channels (and usually ignore them). As it stands, Ocropy (in ocrd_cis) is already robust. But we should check the other existing processors (calamari, kraken, anyOCR, typegroups classifier, larex?) before merging this.

bertsky · 2019-09-20T21:07:26Z

And on the producer side: clipping would be much more useful if it did not have to actually insert any colours, but could export its shrinked mask (which is completely free of the restrictions of coordinate polygons) as transparency.

bertsky · 2019-09-21T09:53:53Z

So this positively includes Tesseract recognition (as it turns out – at least in principle)

No, Tesseract can cope with raw colors (by converting them to 8-bit grayscale), but it always converts alpha channels to white background! The model does not have an extra colour for transparency. (See here for relevant links into the code.)

That leaves ocrd_segment et al. as the only use-case. Maybe not enough to swing the fill=transparent default anymore? Rather the contrary: Tesseract with transparent (becoming white) at runtime is worse than filling with background when running on raw images. The network is more perplexed by white/grey mix than by pure grey or pure binarized.

- image_from_page, image_from_segment, image_from_polygon: add parameter ``transparency``, independent of ``fill`` - an alpha channel with the mask will be added iff ``transparency``, colour in ``fill`` will be used regardless (for consumers which cannot handle alpha channels)

…b.com/bertsky/core into rotate-with-background-and-transparency

- image_from_polygon: regardless of the ``transparency`` parameter, if the input already has an alpha channel, then shrink its mask from the polygon mask

bertsky · 2019-09-23T15:31:22Z

The last 3 commits explained:

Having seen how Tesseract can cope with transparency, but does it rather badly (by reducing to white instead of background), and others like Ocropy ignoring transparency completely (thereby reverting to the color information in the other channels outside the mask), I think we need both:

having a way to add/preserve mask information if some workflow configuration demands it, but
not depending on alpha channels at all, but use estimated background colour everywhere outside the mask.

So the idea is to separate the transparency opt-in from the fill decision. And to nevertheless disregard transparency if the input already contains an alpha channel. Thus we get the following behaviour:

1 is satisfied by having some producer in the workflow chain use transparency=True or add images with an alpha mask itself; but by default no alpha channels will be created
2 is satisfied because fill=background is always there as a fallback for dumb consumers

wrznr · 2019-09-24T06:54:12Z

That is rather disappointing, right? Is there a way we can improve Tesseract's handling of transparency? Maybe via tesserocr and/or tesstrain?

bertsky · 2019-09-24T08:36:19Z

That is rather disappointing, right? Is there a way we can improve Tesseract's handling of transparency? Maybe via tesserocr and/or tesstrain?

Not without changing the internal network topology and retraining. Transparency needs to be an extra input, or at least a "special colour".

kba

Looks good, thanks

ocrd/ocrd/workspace.py

cneud

LGTM following #311 (comment)

Update Dockerfile, Makefile and GitHub action for Docker images

bertsky added the enhancement label Sep 11, 2019

bertsky requested a review from kba September 11, 2019 16:59

kba approved these changes Sep 11, 2019

View reviewed changes

bertsky force-pushed the rotate-with-background-and-transparency branch from 114e4f0 to 9409694 Compare September 11, 2019 17:29

Workspace.image_from_*: when rotation is necessary,

fe925a0

do not always fill with white; instead, determine the background color by median, and only use white for binary images; moreover, add a transparency channel if the input mode allows it

bertsky force-pushed the rotate-with-background-and-transparency branch from 9409694 to c3d217b Compare September 11, 2019 22:22

Workspace.image_from_*: when cropping is necessary

a10c9f2

(image_from_polygon), keep the input image mode; moreover, add a transparency channel if the input image allows it

bertsky force-pushed the rotate-with-background-and-transparency branch from c3d217b to a10c9f2 Compare September 11, 2019 22:42

bertsky requested a review from wrznr September 11, 2019 22:49

wrznr approved these changes Sep 12, 2019

View reviewed changes

fix debug message

440863f

bertsky requested review from kba and wrznr September 20, 2019 13:12

cropping/rotation: caller can opt out of transparency:

d4b46a3

- image_from_page, image_from_segment, image_from_polygon: add parameter ``fill`` - possible values white/background/transparent, with ``transparent`` (behaviour introduced by this branch) as default

bertsky force-pushed the rotate-with-background-and-transparency branch from b49648d to d4b46a3 Compare September 20, 2019 19:35

Merge branch 'master' into rotate-with-background-and-transparency

7f62ec9

bertsky added 3 commits September 23, 2019 12:03

Merge branch 'rotate-with-background-and-transparency' of ssh://githu…

dfb7a57

…b.com/bertsky/core into rotate-with-background-and-transparency

cropping: if input already has alpha, then use it unconditionally

215105a

- image_from_polygon: regardless of the ``transparency`` parameter, if the input already has an alpha channel, then shrink its mask from the polygon mask

bertsky mentioned this pull request Sep 24, 2019

move to AlternativeImage feature selectors in OCR-D/core#294: OCR-D/ocrd_tesserocr#75

Merged

polygon_mask: outline must have color of fill

1d82c42

wrznr approved these changes Sep 26, 2019

View reviewed changes

bertsky mentioned this pull request Oct 7, 2019

Fix image API (for rotation) #327

Merged

kba approved these changes Oct 8, 2019

View reviewed changes

ocrd/ocrd/workspace.py Show resolved Hide resolved

This was referenced Oct 22, 2019

fix deskewing OCR-D/ocrd_tesserocr#80

Closed

many more Ocropy improvements cisocrgroup/ocrd_cis#17

Merged

cneud self-requested a review November 4, 2019 17:29

cneud approved these changes Nov 4, 2019

View reviewed changes

kba merged commit 1d82c42 into OCR-D:master Nov 5, 2019

bertsky deleted the rotate-with-background-and-transparency branch October 8, 2020 06:41

bertsky pushed a commit to bertsky/core that referenced this pull request Apr 26, 2023

Merge pull request OCR-D#311 from stweil/docker-action

ad8b3aa

Update Dockerfile, Makefile and GitHub action for Docker images

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

improve handling of rotation #311

improve handling of rotation #311

bertsky commented Sep 11, 2019

codecov-io commented Sep 11, 2019 •

edited

Loading

wrznr commented Sep 12, 2019

wrznr left a comment

bertsky commented Sep 20, 2019

bertsky commented Sep 20, 2019

bertsky commented Sep 20, 2019

bertsky commented Sep 20, 2019

bertsky commented Sep 21, 2019 •

edited

Loading

bertsky commented Sep 23, 2019

wrznr commented Sep 24, 2019

bertsky commented Sep 24, 2019

kba left a comment

cneud left a comment

improve handling of rotation #311

improve handling of rotation #311

Conversation

bertsky commented Sep 11, 2019

codecov-io commented Sep 11, 2019 • edited Loading

Codecov Report

wrznr commented Sep 12, 2019

wrznr left a comment

Choose a reason for hiding this comment

bertsky commented Sep 20, 2019

bertsky commented Sep 20, 2019

bertsky commented Sep 20, 2019

bertsky commented Sep 20, 2019

bertsky commented Sep 21, 2019 • edited Loading

bertsky commented Sep 23, 2019

wrznr commented Sep 24, 2019

bertsky commented Sep 24, 2019

kba left a comment

Choose a reason for hiding this comment

cneud left a comment

Choose a reason for hiding this comment

codecov-io commented Sep 11, 2019 •

edited

Loading

bertsky commented Sep 21, 2019 •

edited

Loading