FreeBSD tesseract installation - missing configs dir #3674

Open
SKB-CGN opened this issue Dec 9, 2021 · 20 comments
Comments

@SKB-CGN

SKB-CGN commented Dec 9, 2021

Hi,
When installing Tesseract on FreeBSD, the configs directory, which should be in "/usr/local/share/tessdata/", is not created.

I had to download the complete package from here, extract the configs folder and place it on my server.

After that, tesseract was working fine.

Please fix that!

@stweil
Contributor

stweil commented Dec 9, 2021

Which version of Tesseract did you try to install?
How did you build / install Tesseract?

Additional information is needed to understand, reproduce and finally fix the issue.

@SKB-CGN
Author

SKB-CGN commented Dec 9, 2021

Hi,
I installed the following:
pkg install tesseract-5.0.0 tesseract-data-4.1.0

After that, I ran the following command:
ocrmypdf -v --redo-ocr 2020_11_30_Kellerdecke_neu.pdf Keller_neu.pdf

This told me that tesseract was not able to open the "params_file" txt and pdf, so ocrmypdf did not create such a file.

I downloaded the package from here, extracted the configs dir and placed it into the tessdata folder.

After that, it was working.
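
For anyone hitting the same error, a small check like the following may help confirm whether the config files are actually missing before copying anything by hand. This is only a sketch: the function name `missing_config_files` is made up for illustration, and the path and the file names `pdf` and `txt` are taken from the report above, so adjust them if your installation differs.

```python
from pathlib import Path
from typing import List, Sequence


def missing_config_files(tessdata: Path, names: Sequence[str] = ("pdf", "txt")) -> List[str]:
    """Return the config file names that are absent from <tessdata>/configs.

    Hypothetical helper for illustration; it is not part of Tesseract or ocrmypdf.
    """
    configs = tessdata / "configs"
    # A missing configs directory simply means every requested file is missing.
    return [name for name in names if not (configs / name).is_file()]


# Path as reported in this issue for the FreeBSD package (an assumption elsewhere).
missing = missing_config_files(Path("/usr/local/share/tessdata"))
if missing:
    print("missing config files:", ", ".join(missing))
else:
    print("configs look complete")
```

If the list is non-empty, the workaround described above (copying the configs directory from the source package into tessdata) should apply.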

@stweil
Contributor

stweil commented Dec 9, 2021

Did you report the problem to the maintainer of the FreeBSD package?

I am afraid this is the wrong place for your report.

@stweil
Contributor

stweil commented Dec 9, 2021

Cc'ing @pkubaj (maintainer of FreeBSD tesseract-5.0.0).

@SKB-CGN
Author

SKB-CGN commented Dec 9, 2021

Where is that package located?

@stweil
Contributor

stweil commented Dec 9, 2021

It looks like FreeBSD uses the cmake build. Debian-based distributions use the autotools build (which I normally use, too).

@stweil
Contributor

stweil commented Dec 9, 2021

@SKB-CGN
Author

SKB-CGN commented Dec 9, 2021

OK.
And who is maintaining this package? I ask because I installed it via pkg.

@stweil
Contributor

stweil commented Dec 9, 2021

@egorpugin, I just tried make install with a cmake build. It did not install /usr/local/share/tessdata. Is that a regression compared to 4.1.3?

@egorpugin
Contributor

Does it work on 4.1.3?

@stweil
Contributor

stweil commented Dec 9, 2021

I simply did not know, that's why I asked. Now I have run a test with 4.1.3, and it also does not install tessdata. So this is not a regression. Nevertheless it might be added to the cmake build, too.

@amitdo
Collaborator

amitdo commented Dec 9, 2021

https://github.com/marketplace/actions/freebsd-vm

Maybe someone here wants to use this and write an action to test Tesseract on FreeBSD.

@egorpugin
Contributor

I did not implement our cmake install.
It was probably contributed via PRs, so not all expected functionality may be present.

@stweil
Contributor

stweil commented Dec 9, 2021

Maybe that is also not necessary. @pkubaj, could FreeBSD use autotools like the other distributions?

@pkubaj
Contributor

pkubaj commented Dec 12, 2021

I switched back to autotools, please test.

@pkubaj
Contributor

pkubaj commented Dec 12, 2021

You should clearly state that cmake support is incomplete to avoid such issues.

@stweil
Contributor

stweil commented Dec 12, 2021

Do you have a suggestion where and how we should state that?

@pkubaj
Contributor

pkubaj commented Dec 12, 2021

@egorpugin
Contributor

Yes, the cmake build was never complete.
I uploaded the initial bits a long time ago.
There was no install(), and some other important things were missing. Some of these were contributed in PRs, but people usually tune the cmake build for themselves rather than adding truly general support.

@zdenop
Contributor

zdenop commented Dec 13, 2021

After this commit #2551 there is no need to install configs... ;-) Everything can be replaced by tesseract option -c or with SetVariable.
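
As a rough illustration of the `-c` route, the sketch below builds a tesseract command line that sets variables directly instead of naming config files. The helper `build_tesseract_command` and the file names `page.png` / `page` are hypothetical; `tessedit_create_txt` and `tessedit_create_pdf` are, as far as I know, the variables that the `txt` and `pdf` config files set, but please verify that against your version.

```python
from typing import Dict, List


def build_tesseract_command(image: str, output_base: str, variables: Dict[str, str]) -> List[str]:
    """Build a tesseract argv list that passes each variable via -c NAME=VALUE.

    Hypothetical helper for illustration; it only assembles arguments and runs nothing.
    """
    cmd = ["tesseract", image, output_base]
    for name, value in variables.items():
        cmd += ["-c", f"{name}={value}"]
    return cmd


# Ask for text and PDF output without relying on the configs directory.
cmd = build_tesseract_command(
    "page.png",
    "page",
    {"tessedit_create_txt": "1", "tessedit_create_pdf": "1"},
)
print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where tesseract and the language data are installed.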

Installing configs via autotools is IMO a relic from tesseract version 3 (autotools also installed langdata, and config files have to be at the same location as langdata). At that time it made sense.

This changed with tesseract 4: there are 3 sets of language data files, and we stopped installing langdata with autotools. At the moment configs are still part of the langdata structure, and therefore they should be part of the installation process of langdata, not of the library. Actually, you can run OCR with tesseract without configs, but you cannot run it without langdata, and nobody complains that make install is broken...

There was some discussion in the past, and an attempt to improve this situation, but AFAIR no progress was made.

Personally, I like the idea of storing my preferred settings in a text file, but I think we should get rid of the old logic (why configs and tessconfigs subdirectories? Why use the same directory for configs and lang data, when I need 3 or more directories for langdata???) and enhance the search for a user location (~/.config/tessdata ?).

And yes, as a (nowadays) Windows user I have been a happy user of the tesseract cmake installation process for several years, without any major problem (usually any new feature is implemented in autotools and cmake at the same time), since I need to install/update langdata by a separate process anyway...

This was referenced Dec 17, 2021