Implemented improved character bounding box algorithm #2576

Merged
merged 1 commit into from Jul 16, 2019

Contributor

noahmetzger commented Jul 16, 2019

The algorithm uses the new character bounding boxes, which were calculated
to create better symbol choices for lstm_choice_mode.
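
For anyone who wants to compare the boxes before and after this change, here is a minimal sketch of how the command line could be assembled. It assumes the `tesseract` binary is on the PATH and uses the standard `-c VAR=VALUE` option together with the `hocr` config; the helper name `build_tesseract_command` and the file names are hypothetical and not part of this pull request.

```python
import shlex


def build_tesseract_command(image_path, output_base, choice_mode=2):
    """Build a tesseract CLI call that writes hOCR with symbol choices.

    Hypothetical helper for illustration only. lstm_choice_mode accepts
    0 (off, the default), 1 (choices per timestep) or 2 (choices taken
    from the CTC process and mapped per character).
    """
    if choice_mode not in (0, 1, 2):
        raise ValueError("lstm_choice_mode must be 0, 1 or 2")
    return [
        "tesseract",
        image_path,
        output_base,
        "-c",
        f"lstm_choice_mode={choice_mode}",
        "hocr",  # config that selects hOCR output
    ]


# Assumed example file names; pass the list to subprocess.run() to execute it.
cmd = build_tesseract_command("page.png", "page")
print(shlex.join(cmd))
```

Running the printed command on the same image with a build from before and after this commit should make any difference in the character boxes visible in the resulting hOCR files.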

Signed-off-by: Noah Metzger <noah.metzger@bib.uni-mannheim.de>
Contributor Author

noahmetzger commented Jul 16, 2019

Here are some examples of the old bounding boxes vs the new ones:

Example 1

Old: [image: 52Old]

New: [image: 52New]

Example 2

Old: [image: smaraOld]

New: [image: smaraNew]

Example 3

Old: [image: rbbOld]

New: [image: rbbNew]

stweil changed the title from "Implemented improved bounding box algorithm" to "Implemented improved character bounding box algorithm" Jul 16, 2019
Contributor

stweil commented Jul 16, 2019

Hopefully this fixes several issues: #1276, #2024, #2521.

Contributor

zdenop commented Jul 16, 2019

thanks!

zdenop merged commit c8374cc into tesseract-ocr:master Jul 16, 2019
@StephenRUK

Fantastic, sure this will fix our issues too 👍

5 participants