Auto correct image rotation (-180, -90, 0, +90) #46

Closed
fritz-hh opened this issue Jan 8, 2014 · 3 comments

@fritz-hh
Owner

fritz-hh commented Jan 8, 2014

No description provided.

@fritz-hh
Owner Author

It seems that orientation detection will be supported in the next version of the tesseract command-line interface:
http://code.google.com/p/tesseract-ocr/issues/detail?id=955

@eloops

eloops commented Dec 10, 2014

I have been testing with v3.04 (compiled from git source). With -psm 0 it reports the orientation as well as a confidence value and an integer, but that means you have to run tesseract-ocr over the page twice (first for orientation, then for OCR).

In -psm 1 mode it adds a 'textangle ###' attribute to the tags in the hocr file, so at the moment I am using the following to detect the rotation and correct it after hocrTransform.py is called:

# Code removed

$curOCRedPDFRotated translates to a *.ocred.rotated.pdf file so should still be caught by the gs concatenation.

Unfortunately this doesn't work. If I rotate the image after OCR (and orientation detection) but before calling hocrTransform.py, the image is not rotated correctly (it retains its original dimensions) and the OCR'ed text is overlaid sideways.

If I rotate the image after the PDF is generated, it doesn't rotate correctly and/or the OCR'ed text is correct but not laid out correctly.

So it looks like the only way to do it properly is to call tesseract-ocr twice: once to determine the orientation (rotating the image if necessary), and a second time to perform the OCR.

Edit:
Removed code. It really doesn't work. I kludged up an extra bit that runs tesseract in -psm 0 mode over the .pnm file and then gets convert to rotate the image before passing it back to tesseract for OCR'ing. (I use graphicsmagick convert; I'll test both and also econvert to see what speed difference there is.) I don't think the second pass added much to it, although it would be nice to only have to do one pass.
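
For anyone following along, here is a minimal Python sketch of the two-pass idea described above. It is not the removed code. It assumes the -psm 0 output contains a line of the form "Orientation in degrees: N" (this may differ between tesseract versions, as may the stream it is printed on), and the function names and file names are hypothetical. Whether the reported angle has to be applied as-is or inverted depends on the tesseract version and the rotation tool, so the sketch takes the correction angle as a parameter and only builds the command lines rather than running them.

```python
import re


def osd_command(image):
    """Command line for the first pass (orientation detection only)."""
    return ["tesseract", image, "stdout", "-psm", "0"]


def parse_osd_rotation(osd_text):
    """Return the reported orientation in degrees, or None if absent.

    Assumes a line like 'Orientation in degrees: 90' in the OSD output.
    """
    match = re.search(r"Orientation in degrees:\s*(\d+)", osd_text)
    return int(match.group(1)) % 360 if match else None


def build_commands(image, rotated, out_base, correction_degrees):
    """Build the rotate step (if needed) and the second, OCR pass.

    correction_degrees is the angle to hand to 'convert -rotate';
    deriving it from the OSD result is left to the caller.
    """
    commands = []
    source = image
    if correction_degrees:
        commands.append(
            ["convert", image, "-rotate", str(correction_degrees), rotated]
        )
        source = rotated
    # Second pass: full OCR with hOCR output on the upright image.
    commands.append(["tesseract", source, out_base, "-psm", "1", "hocr"])
    return commands
```

Each returned list can be passed to subprocess.run once the first pass has been run and its output parsed.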

@eloops

eloops commented Sep 8, 2015

I ported this to a node library (here); part of that was implementing auto-rotation. I added a prototype to find the general rotation (by finding the most frequent textangle in the hocr). Also, by climbing up/down the DOM to the ocr_line class <span> elements and grabbing the textangle, I could correct it when writing the words to the canvas. I'm still not sure whether it's faster to do a separate -psm 0 (OSD only) and then a -psm 6 for the OCR text, or just the -psm 1 (get everything).
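
To illustrate the "general rotation" idea, here is a rough sketch in Python (not the node code). It assumes each ocr_line <span> carries its properties in the title attribute, with an optional "textangle N" entry, and it treats lines without a textangle as 0 degrees. The function name is made up for the example, and a regex is used instead of a real HTML parser to keep it short.

```python
import re
from collections import Counter

# Opening tags of ocr_line spans; properties live in the title attribute.
LINE_TAG = re.compile(r"<span[^>]*class=['\"]ocr_line['\"][^>]*>")
TEXTANGLE = re.compile(r"textangle\s+(-?\d+(?:\.\d+)?)")


def dominant_textangle(hocr):
    """Return the most frequent textangle over all ocr_line spans.

    Lines with no textangle entry are counted as 0.0 (an assumption).
    Returns None when the document has no ocr_line spans at all.
    """
    counts = Counter()
    for tag in LINE_TAG.findall(hocr):
        match = TEXTANGLE.search(tag)
        counts[float(match.group(1)) if match else 0.0] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

The per-line angle found by the same regex could also be used when placing each line's words, along the lines of the canvas correction mentioned above.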
