Simple image one line text not recognized for some mysterious reason #4120

Open
naourass opened this issue Aug 23, 2023 · 2 comments

@naourass

naourass commented Aug 23, 2023

Current Behavior

Among many similar images (same dimensions, layout, and content) that were OCR'd correctly, this one returns an empty string:
Input Image

I tried both the ara and Arabic models, in both the fast and best variants, with different PSMs. I always get an empty string for that image.

Any idea what could cause this behavior? Is there a workaround?
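For anyone trying to reproduce this, the language/PSM combinations described above could be swept with a small script along these lines. This is only a sketch: the PSM values (6, 7, 13) are an assumption, since the report does not say which modes were tried, and the helper names `build_sweep` and `run_sweep` are made up for illustration. It also assumes the `tesseract` binary is on the PATH and that both models are installed under the names `ara` and `Arabic`.

```python
import itertools
import subprocess


def build_sweep(image, langs=("ara", "Arabic"), psms=(6, 7, 13)):
    """Return one tesseract argument list per (language, PSM) pair.

    The default PSM values are an assumption; the issue does not list them.
    """
    return [
        ["tesseract", image, "stdout", "-l", lang, "--psm", str(psm)]
        for lang, psm in itertools.product(langs, psms)
    ]


def run_sweep(image):
    """Run every combination and collect the recognized text.

    An empty string for a key means tesseract produced no text for that run.
    """
    results = {}
    for cmd in build_sweep(image):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        lang, psm = cmd[4], cmd[6]
        results[(lang, psm)] = proc.stdout.strip()
    return results
```

Calling `run_sweep("image.png")` returns a dictionary keyed by (language, PSM), which makes it easy to see whether every combination comes back empty or only some of them.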

Version

tesseract 5.3.0-31-g9d71
leptonica-1.79.0
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 2.0.3) : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.1
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found OpenMP 201511
Found libarchive 3.4.0 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.8 liblz4/1.9.2 libzstd/1.4.4
Found libcurl/7.68.0 OpenSSL/1.1.1f zlib/1.2.11 brotli/1.0.7 libidn2/2.3.0 libpsl/0.21.0 (+libidn2/2.2.0) libssh/0.9.3/openssl/zlib nghttp2/1.40.0 librtmp/2.3

Operating System

Ubuntu 20.04 Focal under WSL2 Windows 11
Linux DESKTOP-586TDC4 5.15.90.1-microsoft-standard-WSL2 #1 SMP Fri Jan 27 02:56:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

CPU

Intel(R) Core(TM) i7-9700

@naourass changed the title from "Image not recognized for some reason" to "Simple image one line text not recognized for some mysterious reason" on Aug 24, 2023
@amitdo
Collaborator

amitdo commented Aug 25, 2023

You can try this command:

convert image.png -bordercolor White -border 4x4 image-4-4-wb.png
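If a single border width does not help, the same command can be generated for several widths. The sketch below only builds the ImageMagick argument list used above and follows the same output naming (`image-4-4-wb.png`); the helper names `border_command` and `pad_variants` and the extra widths are hypothetical, and it assumes ImageMagick's `convert` is on the PATH.

```python
import subprocess
from pathlib import Path


def border_command(src, width, color="White"):
    """Build the ImageMagick command that pads `src` with a solid border.

    Returns the argument list and the output path, named like
    image-4-4-wb.png for width 4.
    """
    src = Path(src)
    dst = src.with_name(f"{src.stem}-{width}-{width}-wb{src.suffix}")
    cmd = [
        "convert", str(src),
        "-bordercolor", color,
        "-border", f"{width}x{width}",
        str(dst),
    ]
    return cmd, dst


def pad_variants(src, widths=(4, 10, 20)):
    """Write one padded copy per width (the widths are illustrative)."""
    outputs = []
    for width in widths:
        cmd, dst = border_command(src, width)
        subprocess.run(cmd, check=True)  # raises if convert fails
        outputs.append(dst)
    return outputs
```

Each file returned by `pad_variants("image.png")` can then be passed to tesseract to check whether any margin changes the result.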

@naourass
Author

@amitdo
I tried that and unfortunately it still doesn't work. I tried different crops and margins (both odd and even numbers), but it always fails to detect the content of that image.
