Missing support for Tesseract5? #338
Comments
The maintainer is long gone. Anyways, since you are on Windows, you shouldn't need to pre-install Tesseract. For Windows, the Tesseract model is bundled with the |
tesserocr supports tesseract 5; see the tesserocr code. Building tesserocr from source (tesserocr-2.6.2.tar.gz) also requires the tesseract development files (or building leptonica and tesseract from source); otherwise the tesserocr build fails. Details are in the README. |
He clearly isn't building tesserocr from source. |
I’m trying to simply pip install it with a GitHub pipeline. Any help is greatly appreciated.
https://github.com/dickreuter/Poker/blob/master/.github/workflows/windows-build.yml
…On Fri, 29 Dec 2023 at 11:42, Winston H. ***@***.***> wrote:
He clearly isn't building tesserocr from source.
|
@dickreuter I have sent you a PR regarding the pipeline. |
Also, I noticed that you have |
If this is correct:
then he is 100% building from source. Maybe not intentionally, but this is source code, not a wheel (binary build)... |
The log here already tells you that he is doing a I am using |
And??? pip invokes a build from source if it does not find a wheel... Are you familiar with the tools you are trying to use? |
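For readers following along, the distinction being argued here can be read straight off the file name that pip reports downloading. The sketch below is only an illustration: `distribution_kind` is a hypothetical helper, not part of pip or tesserocr, and it looks at nothing but the file extension.

```python
def distribution_kind(filename: str) -> str:
    """Classify a downloaded distribution by its file extension.

    Hypothetical helper for illustration; not part of pip or tesserocr.
    """
    if filename.endswith(".whl"):
        # Prebuilt binary distribution: installed without compiling anything.
        return "wheel"
    if filename.endswith((".tar.gz", ".zip")):
        # Source distribution: pip has to build it on the local machine.
        return "sdist"
    return "unknown"


# File name taken from the pip log in this issue.
print(distribution_kind("tesserocr-2.6.2.tar.gz"))  # sdist
```

If a source build is not wanted at all, pip's `--only-binary=:all:` option makes the install fail early instead of falling back to the sdist.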
What exactly is outdated in the README? |
Why does this matter? OP is using Windows and installing with
The entire requirements section. Instead, he should add that to a section specifically for building from source / development. |
Much appreciated. Merged the PR.
…On Fri, 29 Dec 2023 at 13:14, Winston H. ***@***.***> wrote:
@dickreuter <https://github.com/dickreuter> I have sent you a PR
regarding the pipeline.
|
Seriously?? This one?
Do you understand that text? What is outdated there? Please state facts, not vague accusations.
tesserocr (this project where the issue was created) NEVER produced a Windows binary version. It was always created externally.
Whatever the reason, the latest Windows wheel is 2.6.0 |
It is truly amazing how you missed this entire part
Exactly, and that's the problem. If you are going to commit to supporting a platform, the maintainer should do it well. |
I did not miss it. It is correct and relevant. Or do you claim you can run tesserocr on Debian without these libraries???
It is not a problem. E.g. tesseract and leptonica support many platforms but they never provide binary packages, just source code. |
I am just saying that there is no longer a need to explicitly install these dependencies. You were even a participant on the PR for this change.
We can agree to disagree then. I believe it's the maintainer's responsibility to ensure that the DX for installing their libraries should always be seamless. In one of my projects, I made sure to bundle the nvidia cublas and cudnn libraries along with the wheel. I know some people may argue that it could be a redundant install if the user already has the dependencies installed in the machine, but relying on the user's PATH to properly resolve these dependencies, in my experience and many others, usually just leads to pain. To reiterate, the only reason why I, and many others are using this library instead of |
... until you start to face the problems - see e.g. #337. Other problems were reported for Mac. Distributing your own binary libraries on Linux is not a good idea. The Linux philosophy is to use system shared libraries => tesserocr should be linked against the system leptonica and tesseract and not against a custom build.
No. It is a packager's responsibility. Packager != maintainer. There is a split of tasks and responsibilities and it is right. |
You misread me. I am saying that I prefer
Is this issue not because the maintainer failed to properly pre-compile
And you're right, they don't have to because they do not explicitly support these platforms. This is unlike All I am saying is that |
Is there no support for tesseract 5?
In this pipeline I install tesseract with chocolatey. That works fine, and it installs tesseract 5, but then tesserocr gives the following error:
Supporting tesseract v3.04.00
Collecting tesserocr (from -r requirements.txt (line 31))
Downloading tesserocr-2.6.2.tar.gz (58 kB)
---------------------------------------- 58.9/58.9 kB 3.0 MB/s eta 0:00:00
Installing build dependencies: started
Installing build dependencies: finished with status 'done'
Getting requirements to build wheel: started
Getting requirements to build wheel: finished with status 'error'
error: subprocess-exited-with-error
Getting requirements to build wheel did not run successfully.
exit code: 1
[54 lines of output]
Failed to extract tesseract version number from: tesseract v5.3.3.20231005
leptonica-1.83.1
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found libarchive 3.7.2 zlib/1.3 liblzma/5.4.4 bz2lib/1.0.8 liblz4/1.9.4 libzstd/1.5.5
Found libcurl/8.3.0 Schannel zlib/1.3 brotli/1.1.0 zstd/1.5.5 libidn2/2.3.4 libpsl/0.21.2 (+libidn2/2.3.3) libssh2/1.11.0
Supporting tesseract v3.04.00
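
The log above suggests that the build script could not read a version number out of the string `tesseract v5.3.3.20231005` and then fell back to v3.04.00. As a rough sketch of what a more tolerant parser might look like (this is an assumption for illustration, not tesserocr's actual setup code; `parse_tesseract_version` is a hypothetical name), the pattern below accepts an optional `v` prefix and ignores any trailing fourth component:

```python
import re

# Hypothetical pattern: "tesseract", whitespace, optional "v", then
# major.minor with an optional patch. Anything after that (for example
# the ".20231005" build date) is ignored.
_VERSION_RE = re.compile(r"tesseract\s+v?(\d+)\.(\d+)(?:\.(\d+))?", re.IGNORECASE)


def parse_tesseract_version(output: str):
    """Return (major, minor, patch) from `tesseract --version` output, or None."""
    match = _VERSION_RE.search(output)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


# First lines of the version output quoted in the log above.
sample = "tesseract v5.3.3.20231005\n leptonica-1.83.1\n"
print(parse_tesseract_version(sample))  # (5, 3, 3)
```

Returning `None` instead of silently assuming an old version lets the caller decide whether to stop the build or continue with a default.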