Tesseract outputs empty files for valid png image #1361

Closed
ghost opened this issue Mar 5, 2018 · 9 comments

Comments

@ghost

ghost commented Mar 5, 2018

I have Tesseract installed on Gentoo Linux with all of its components enabled. Here is how I run it:

$ tesseract test.png out
[DS] Profile file not available (tesseract_opencl_profile_devices.dat); performing profiling.

[DS] Device: "AMD CARRIZO (DRM 3.23.0 / 4.15.6-gentoo, LLVM 5.0.1)" (OpenCL) evaluation...
[OD] write binary[kernel-AMD_CARRIZO_(DRM_3.23.0___4.15.6-gentoo,_LLVM_5.0.1).bin] successfully
[DS] Device: "AMD CARRIZO (DRM 3.23.0 / 4.15.6-gentoo, LLVM 5.0.1)" (OpenCL) evaluated
[DS] composeRGBPixel: 0.060458 (w=1.2)
[DS] HistogramRect: 0.032581 (w=2.4)
[DS] ThresholdRectToPix: 0.018510 (w=4.5)
[DS] getLineMasksMorph: 0.014743 (w=5.0)
[DS] Score: 0.307755

[DS] Device: "(null)" (Native) evaluation...
[DS] Device: "(null)" (Native) evaluated
[DS] composeRGBPixel: 0.041112 (w=1.2)
[DS] HistogramRect: 0.076788 (w=2.4)
[DS] ThresholdRectToPix: 0.020729 (w=4.5)
[DS] getLineMasksMorph: 0.188474 (w=5.0)
[DS] Score: 1.269281
[DS] Scores written to file (tesseract_opencl_profile_devices.dat).
[DS] Device[1] 1:AMD CARRIZO (DRM 3.23.0 / 4.15.6-gentoo, LLVM 5.0.1) score is 0.307755
[DS] Device[2] 0:(null) score is 1.269281
[DS] Selected Device[1]: "AMD CARRIZO (DRM 3.23.0 / 4.15.6-gentoo, LLVM 5.0.1)" (OpenCL)
Tesseract Open Source OCR Engine v3.05.01 with Leptonica

attached files:

png image test

tesseract --print-parameters print-parameters.txt

tesseract -v version-info.txt

I could not install the latest Tesseract, version 4, because it fails to build. However, the version I am using is stable. I would appreciate any pointers on how to remedy this.
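
The `[DS]` lines in the log above record which device the OpenCL profiling step picked. As a rough diagnostic aid, the following Python sketch pulls the selected device out of captured output. The function name `selected_device` is hypothetical, and the regular expression assumes the line format matches the one shown in this report.

```python
import re

# Matches lines such as:
#   [DS] Selected Device[1]: "AMD CARRIZO (...)" (OpenCL)
# The pattern is an assumption based only on the log in this report.
_SELECTED = re.compile(
    r'^\[DS\] Selected Device\[\d+\]: "(.*)" \((\w+)\)\s*$',
    re.MULTILINE,
)


def selected_device(log_text):
    """Return (device_name, kind) from a profiling log, or None if absent."""
    match = _SELECTED.search(log_text)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

In this report the two kinds that appear are `OpenCL` and `Native`, so a result of `OpenCL` suggests the run went through the OpenCL code path.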

@ghost ghost changed the title from "Tesseract outputs empty files for valid tiff image" to "Tesseract outputs empty files for valid png image" on Mar 5, 2018
@stweil
Contributor

stweil commented Mar 5, 2018

I suggest disabling OpenCL support when building Tesseract (any version) unless you want to improve the OpenCL code. And of course it should be possible to build Tesseract 4.

@ghost
Author

ghost commented Mar 5, 2018

I will try to rebuild it without OpenCL, thanks for the suggestion. Can you tell me whether the Windows version ships with OpenCL support, or is OpenCL still under development and not intended for use?

@Shreeshrii
Collaborator

Shreeshrii commented Mar 5, 2018 via email

@ghost
Author

ghost commented Mar 5, 2018

I will rebuild version 4 and post the build log. However, I must say that disabling OpenCL has resolved the issue. After testing, I found Tesseract to be a poor OCR engine; I'm sure Acrobat is better.
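
One way to confirm that a rebuild fixed the symptom is to check the text file Tesseract writes (`out.txt` for the command `tesseract test.png out`). A minimal Python sketch follows; `output_is_empty` is a hypothetical helper name, and treating a whitespace-only file as empty is an assumption about what "empty" should mean here.

```python
from pathlib import Path


def output_is_empty(path):
    """Return True if the file is missing or contains only whitespace."""
    p = Path(path)
    if not p.is_file():
        return True
    # str.strip() also removes form feeds, which can appear as page separators.
    return p.read_text(encoding="utf-8", errors="replace").strip() == ""
```

Running this on `out.txt` before and after the rebuild gives a quick yes/no answer without opening the file.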

@Shreeshrii
Collaborator

Shreeshrii commented Mar 5, 2018 via email

@ghost
Author

ghost commented Mar 5, 2018

Strangely, disabling OpenCL for version 4 allowed the build to complete without failure. However, here is something I found: in my previous experience with Acrobat, the box labeled "swanson" would be OCR'ed because most of its text is legible. Not so with Tesseract. Don't get me wrong, I really like that it is open source and that it supports writing text-searchable PDFs, which is what I do. But maybe I should disable training support and rebuild for better results? Can better results be achieved?

Test image:
test

resulting pdf:
out.pdf

Please tell me if you achieve better results.

@stweil
Contributor

stweil commented Mar 5, 2018

Disabling training would not change the results. Using data from http://github.com/tesseract-ocr/tessdata_fast would (unless you already used that).

@ghost
Author

ghost commented Mar 5, 2018

Is there a guide on how to use this?
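
As a rough answer, and only a sketch: the `tessdata_fast` repository holds `.traineddata` files for Tesseract 4, and the `tesseract` command can be pointed at a directory containing them with `--tessdata-dir`. The Python below only assembles and runs such a command. The directory `/opt/tessdata_fast`, the helper names `build_command` and `run_ocr`, and the choice of `eng` are assumptions for illustration, not something stated in this thread.

```python
import subprocess


def build_command(image, output_base, tessdata_dir, lang="eng", pdf=True):
    """Assemble a tesseract command line that uses a specific tessdata directory."""
    cmd = ["tesseract", image, output_base,
           "--tessdata-dir", tessdata_dir, "-l", lang]
    if pdf:
        # The "pdf" config file requests a searchable PDF (output_base + ".pdf").
        cmd.append("pdf")
    return cmd


def run_ocr(image, output_base, tessdata_dir, lang="eng", pdf=True):
    """Run tesseract; raises CalledProcessError if it exits with an error."""
    cmd = build_command(image, output_base, tessdata_dir, lang, pdf)
    return subprocess.run(cmd, check=True)


# Example with hypothetical paths:
# run_ocr("test.png", "out", "/opt/tessdata_fast")
```

This presumes `eng.traineddata` from the repository has already been placed in the chosen directory, for example by cloning the repository there.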

@Shreeshrii
Collaborator

disabled opencl has resolved the issue

Then please close the issue.

This issue was closed.