
Add simple orientation detection #34

Merged
merged 4 commits into from Jun 5, 2022

Conversation

@robertknight (Owner) commented Jun 2, 2022

Add simple orientation detection using Leptonica's pixOrientDetect function. This was used instead of Tesseract's implementation because the latter requires the legacy (non-LSTM) engine, which is not compiled in. Leptonica's algorithm relies mostly on "the preponderance of ascenders over descenders in languages with roman characters", per this paper. Tesseract's approach, which is not used here, is described here.

TODO:

  • Investigate issues with same rotated image producing different results when loaded in different browsers (see notes in second commit)
  • Perhaps add a way for the getOrientation API to indicate uncertainty in the result or errors in the process. Currently it returns 0 in the event of any error and has no way to represent confidence in the result.
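One way to leave room for uncertainty would be to return a small result object rather than a bare number. This is only a sketch of a possible shape; the `Orientation` interface and `describeOrientation` helper here are hypothetical, not part of the current API:

```typescript
// Hypothetical result shape for getOrientation, assuming rotation is
// always one of the four axis-aligned orientations.
interface Orientation {
  rotation: 0 | 90 | 180 | 270;
  // 0 = detection failed or unknown, 1 = detection succeeded.
  confidence: number;
}

// Example consumer: treat a zero-confidence result as "unknown"
// instead of silently assuming the image is upright.
function describeOrientation(o: Orientation): string {
  if (o.confidence === 0) {
    return "orientation unknown";
  }
  return `rotate ${o.rotation} deg (confidence ${o.confidence})`;
}
```

A caller can then distinguish "the image is upright" from "detection failed", which the current `0`-on-error behaviour conflates.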

@robertknight (Owner, Author)
After some local tests I think an alternative approach might be to:

  1. Run layout analysis
  2. Sample a few words or lines and try running text recognition on them in each of the 4 orientations
  3. Pick the orientation which gives the highest mean confidence score
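The selection step above can be sketched as follows. The `recognize` callback is a stand-in assumption for "run text recognition on a few sampled words at a given rotation and collect per-word confidence scores"; it is not an existing function in this project:

```typescript
type Rotation = 0 | 90 | 180 | 270;

function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Try recognition in each of the four orientations and pick the one
// whose sampled words recognize with the highest mean confidence.
function detectOrientation(
  recognize: (rotation: Rotation) => number[],
): Rotation {
  const rotations: Rotation[] = [0, 90, 180, 270];
  let best: Rotation = 0;
  let bestScore = -Infinity;
  for (const r of rotations) {
    const score = mean(recognize(r));
    if (score > bestScore) {
      bestScore = score;
      best = r;
    }
  }
  return best;
}
```

The mean score for the winning orientation could also double as the confidence value for the result.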

Tesseract has built-in script and orientation detection, but it is part of the
classic (pre-LSTM) engine, which has been compiled out to reduce binary size.
Hence this initial implementation uses Leptonica's simpler orientation
detection, which is based on counting ascenders and descenders, as
described on pages 12-14 of http://www.leptonica.org/papers/skew-measurement.pdf.
In adding this I encountered issues where the same rotated image dropped into
Safari, Chrome and Firefox could give different results. I believe this has to
do with how the EXIF rotation information is handled by the various browser APIs
used to load and draw images, but this still needs to be debugged.
The confidence value is currently 0 if an error occurred or 1 otherwise.
This at least creates a space in the API to include a confidence score
in the result.
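To illustrate the intuition behind the ascender/descender heuristic (not Leptonica's actual pixel-level implementation, which operates on connected components in the bitmap): in upright Roman text, letters with ascenders outnumber letters with descenders, so the sign of the difference gives a crude upright-vs-flipped signal. A toy character-level sketch:

```typescript
// Letters that extend above the x-height vs below the baseline.
const ASCENDERS = new Set(["b", "d", "f", "h", "k", "l", "t"]);
const DESCENDERS = new Set(["g", "j", "p", "q", "y"]);

// Positive score -> text is probably upright; negative -> probably
// upside down (ascenders would appear as descenders when flipped).
function upsideDownScore(text: string): number {
  let asc = 0;
  let desc = 0;
  for (const ch of text.toLowerCase()) {
    if (ASCENDERS.has(ch)) asc++;
    else if (DESCENDERS.has(ch)) desc++;
  }
  return asc - desc;
}
```

Because the signal is statistical, it degrades on short samples and on scripts without this asymmetry, which is consistent with the browser-dependent flakiness noted above being worth separate investigation.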