Invalid output with vertical japanese #4128

Open
ubitux opened this issue Sep 22, 2023 · 1 comment

Comments

@ubitux

ubitux commented Sep 22, 2023

Current Behavior

tesseract -l jpn_vert --psm 5 -c preserve_interword_spaces=1 haiku.png - gives the following output:

酒維包記臣
とても増える、
。 本

[Image: 2023-09-22-192717-AhW8PaiM]

Expected Behavior

減る記憶、
それでも増える、
パスワード

Suggested Fix

No response

tesseract -v

tesseract 5.3.2
 leptonica-1.83.1
  libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 2.1.5.1) : libpng 1.6.40 : libtiff 4.6.0 : zlib 1.2.13 : libwebp 1.3.2 : libopenjp2 2.5.0
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found OpenMP 201511
 Found libarchive 3.7.2 zlib/1.3 liblzma/5.4.4 bz2lib/1.0.8 liblz4/1.9.4 libzstd/1.5.5
 Found libcurl/8.3.0 OpenSSL/3.1.2 zlib/1.3 brotli/1.0.9 zstd/1.5.5 libidn2/2.3.4 libpsl/0.21.2 (+libidn2/2.3.4) libssh2/1.11.0 nghttp2/1.56.0

Operating System

No response

Other Operating System

Archlinux

uname -a

Linux bee 6.4.12-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 24 Aug 2023 00:38:14 +0000 x86_64 GNU/Linux

Compiler

No response

CPU

12th Gen Intel(R) Core(TM) i7-12700

Virtualization / Containers

No response

Other Information

Most of the options listed at
https://github.com/tesseract-ocr/tessdoc/blob/main/tess3/ControlParams.md#useful-parameters-for-japanese-and-chinese are either already the defaults or have been removed (renamed?), so they are not helpful.

@hglee

hglee commented Oct 13, 2023

Could you try binarizing your image manually?

[Image: binary]

 .\tesseract.exe -l jpn_vert --psm 5 binary.png -
減る記憶、
それでも増える、
パスワード
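
The comment above does not say how binary.png was produced. As one possible approach, here is a minimal sketch of manual binarization using Otsu's method in plain Python. The function names `otsu_threshold` and `binarize` are hypothetical, and the pixel data is assumed to be 8-bit grayscale values already loaded from the image by some imaging library; this is not necessarily what was used for the result above.

```python
def otsu_threshold(pixels):
    """Return an Otsu threshold (0-255) for an iterable of 8-bit gray values.

    Assumption: `pixels` holds integers in the range 0..255.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of gray values at or below the candidate threshold
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # no foreground pixels left
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor)
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(rows, threshold):
    """Map each gray value to 0 (dark) or 255 (light), keeping polarity."""
    return [[255 if p > threshold else 0 for p in row] for row in rows]


if __name__ == "__main__":
    # Tiny synthetic "image": dark strokes (about 10) on a light page (about 200).
    image = [[10, 200, 205], [210, 12, 10]]
    t = otsu_threshold(p for row in image for p in row)
    print(binarize(image, t))
```

The binarized rows would then be written back to an image file (binary.png in the comment above) and passed to the same tesseract command.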
