Don't disable built-in TESSDATA_PREFIX #261
See related discussion for #240.
Ok, but then we should also follow up on that:
The main problem was where the minimal models (eng+osd) would be installed to during install-tesseract. The default under $prefix/share/tessdata would only serve the standalone CLI, but we could not make Tesseract take our new XDG path (at compile time), because it does not end in /tessdata. It's a dilemma. We decided to place them according to XDG conventions, so that at least our wrapper would work without extra download steps (which is important for the segmentation functionality, something users do not expect), rendering the standalone CLI unusable without --tessdata-dir anyway. We figured it would be more consistent that way, but I guess it is not. Users could still place models into the conventional tessdata path themselves, and then rightly expect the CLI to work as usual. What's more, we could even place a symlink there ourselves at install time (if the FS supports that).
Could you please add the symlink (or copy) of TESSDATA to the configure/hardcoded-default path in TESSERACT_TRAINEDDATA and see if that makes the standalone CLI work out of the box again?
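For illustration, the symlink-or-copy idea could be sketched as follows. This is only a rough sketch in Python, not what the build actually does: `link_tessdata`, `models_dir` and `default_dir` are hypothetical names, where `models_dir` stands for the XDG-style directory holding the models and `default_dir` for the tessdata path compiled into Tesseract.

```python
import os
import shutil
from pathlib import Path


def link_tessdata(models_dir: Path, default_dir: Path) -> str:
    """Expose models_dir at default_dir via symlink, falling back to a copy.

    Both arguments are hypothetical placeholders: models_dir is the
    directory that already holds the *.traineddata files, default_dir is
    the hardcoded tessdata path that the standalone CLI looks into.
    Returns "exists", "symlink" or "copy" to report what happened.
    """
    # Never overwrite anything a user may have placed there already.
    if default_dir.is_symlink() or default_dir.exists():
        return "exists"
    default_dir.parent.mkdir(parents=True, exist_ok=True)
    try:
        os.symlink(models_dir, default_dir, target_is_directory=True)
        return "symlink"
    except (OSError, NotImplementedError):
        # File system (or platform) does not support symlinks: copy instead.
        shutil.copytree(models_dir, default_dir)
        return "copy"
```

Returning early when the target already exists is a deliberate choice here, so that models installed by hand into the conventional path are left untouched.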
This allows using the `tesseract` CLI again without setting an explicit TESSDATA_PREFIX.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
That is now done in a 2nd commit 067955f.
Sure, that was the purpose of the 2nd commit.
That's not a functional test though. Can you please try the built programs themselves?
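One way such a functional check might look, as a hedged sketch: run the built `tesseract --list-langs` with TESSDATA_PREFIX removed from the environment and see whether `eng` is listed. The helper names `env_without_prefix` and `cli_finds_models` are made up for this example, and the assumption that the language list is printed one name per line may not hold for every Tesseract version.

```python
import os
import shutil
import subprocess
from typing import Optional


def env_without_prefix(env: dict) -> dict:
    """Return a copy of env with TESSDATA_PREFIX removed."""
    return {k: v for k, v in env.items() if k != "TESSDATA_PREFIX"}


def cli_finds_models(binary: str = "tesseract", lang: str = "eng") -> Optional[bool]:
    """Check whether the standalone CLI lists `lang` using only its built-in path.

    Returns None when the binary is not on PATH. Otherwise returns True if
    `lang` appears as its own line in the output of `--list-langs`
    (assumption: one language name per line after a header line).
    """
    exe = shutil.which(binary)
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--list-langs"],
        env=env_without_prefix(dict(os.environ)),
        capture_output=True,
        text=True,
    )
    # Older versions may print the list to stderr, so look at both streams.
    lines = (result.stdout + "\n" + result.stderr).splitlines()
    return lang in (line.strip() for line in lines)


if __name__ == "__main__":
    print(cli_finds_models())
```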
...and ocrd_tesserocr?
The pull request does not change anything for
Thanks.
Unrelated side note: I just tried current Tesseract: 25.39s, 25.35s, 26.31s, 25.48s. This is not a representative test, but it indicates that the test time decreased by more than a second, so it looks like the latest Tesseract gives a performance gain of at least 4 percent.
@kba, can we merge the pull request?