TextLine coordinates too coarse #33
Comments
@bertsky We can make the text lines tighter, but that has its own disadvantages. By the way, we will publish a new tool that outputs contours for text lines rather than rectangles. However, that method costs more processing time!
Then why not make that behaviour optional (with an ocrd-tool.json parameter), so the user can decide what is needed (precision or performance) for her workflow?
Where? And why did you close the issue already?
Dear @bertsky,
@vahidrezanezhad understood – I'll try to follow. Thanks for clarifying!
Would it be possible to get good polygonal outlines from the text line segmentation instead of coarse bounding boxes?
There is a stark contrast between the precise contours of the text regions (which never overlap) and the coarse rectangles of the text lines inside them (which often protrude beyond their parent region and overlap with adjacent lines).
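To make the overlap concrete, here is a minimal sketch in plain Python. The coordinates are invented for illustration and are not taken from the example in #29; the helper names `bounding_box` and `boxes_overlap` are hypothetical and not part of any OCR-D tool. It assumes two slightly skewed lines whose polygons are parallel and 2 px apart by construction.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def bounding_box(polygon: List[Point]) -> Box:
    """Return the axis-aligned rectangle enclosing all polygon points."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if two axis-aligned rectangles share a region of non-zero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


# Two hypothetical skewed text lines (invented coordinates).
# The lower edge of line_1 runs from (0, 20) to (100, 30), the upper edge
# of line_2 from (0, 22) to (100, 32): parallel, with a 2 px gap throughout.
line_1 = [(0, 0), (100, 10), (100, 30), (0, 20)]
line_2 = [(0, 22), (100, 32), (100, 52), (0, 42)]

box_1 = bounding_box(line_1)
box_2 = bounding_box(line_2)

# The rectangles share the vertical band between y = 22 and y = 30.
print(boxes_overlap(box_1, box_2))
```

In this sketch the two rectangles intersect even though the line polygons themselves never touch, so each cropped line image would contain part of its neighbour.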
This makes it risky to apply line-level dewarping afterwards, and requires an OCR engine that can cope with intruders in the line image. In the example given in #29, I get these line images from ocrd-cis-ocropy-dewarp: