segmentation fault with --psm 0 #821
Comments
Try |
Also try with this image: |
Find below:
tesseract phototest.tif stdout --oem 0 --psm 0 -l eng
tesseract phototest.tif stdout --oem 1 --psm 0 -l eng
tesseract phototest.tif stdout --oem 1 --psm 3 -l eng
The quick brown dog jumped over the
Pritam |
The warnings are ugly but seem harmless. With With For now, the solution is to always use |
BTW, you should use Using |
From what I have read, Tesseract v4 greatly improves OCR thanks to its LSTM engine. If I know that my text will always have a certain orientation and script (top to bottom, English), how do I take advantage of the newer engine? Thanks for the help, and sorry for the delay in my response. |
The 4.00 version is in alpha stage. It's not yet considered ready to replace the stable 3.05 version. |
Is there any update on this? Let me know if you want me to do any testing etc. Thanks! |
I'm sorry, but there is no update on this issue. |
If you want to OCR English text, use the program (latest version built from the master branch on GitHub) with default options, or specify the language as English.
or |
With tesseract 4.0.0 alpha, I executed the following command:
[root@localhost workspace]# /opt/tesseract4.0/bin/tesseract pic/tesseract-chinese-1.png stdout --psm 0
Warning. Invalid resolution 0 dpi. Using 70 instead.
Failed loading language 'osd'
Tesseract couldn't load any languages!
Warning: Auto orientation and script detection requested, but osd language failed to load
Estimating resolution as 219
Segmentation fault (core dumped)
But osd.traineddata is in the right place:
[root@localhost workspace]# ls /opt/tesseract4.0/share/tessdata/
chi_sim.traineddata chi_tra.traineddata configs/ ori.traineddata pdf.ttf
chi_sim_vert.traineddata chi_tra_vert.traineddata eng.traineddata osd.traineddata tessconfigs/
Then I added the "--oem 0" option, and it printed the following:
[root@localhost workspace]# /opt/tesseract4.0/bin/tesseract pic/tesseract-chinese-1.png stdout --psm 0 --oem 0
Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.
Then I tried tesseract 3.05.1. It seems that tesseract always detects the script as "Latin", which is not what I expected. |
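A quick way to rule out the simplest cause of "Failed loading language" is to confirm that each traineddata file exists and is not empty. The sketch below does only that; a file that is present can still fail to load, so a clean result here does not explain the error. The function name check_traineddata is made up for this example.

```shell
#!/bin/sh
# Report whether each named language has a non-empty <lang>.traineddata
# file in the given tessdata directory.
# Returns non-zero if at least one file is missing or empty.
check_traineddata() {
    dir=$1
    shift
    missing=0
    for lang in "$@"; do
        file="$dir/$lang.traineddata"
        if [ -s "$file" ]; then
            echo "found:   $file"
        else
            echo "missing: $file"
            missing=1
        fi
    done
    return "$missing"
}

# Usage, with the directory from the listing above:
#   check_traineddata /opt/tesseract4.0/share/tessdata osd eng
#   /opt/tesseract4.0/bin/tesseract --list-langs   # what tesseract itself reports
```

Comparing the result with the output of --list-langs shows whether tesseract is looking in the same directory you are.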
try this:
|
I think this was fixed in commit 27ce472, so this issue could be closed. |
The command below results in a segmentation fault:
tesseract a.jpg stdout --oem 1 --psm 0 -l eng
Environment details:
Operating system - Ubuntu 16.10 Yakkety Yak on x86_64
Version/commit of tesseract - top of Changelog says 2017-03-24 - v4.00.00-alpha
How tesseract was built - I compiled it from source
The command above works when --psm 3 is used instead.
Pritam Dodeja
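For anyone retesting this, a small script can compare the two page segmentation modes from the report and tell a crash apart from an ordinary failure. This is a minimal sketch: the helper names describe_status and try_psm are made up for this example, and it relies on the usual shell convention that a process killed by signal N exits with status 128 + N, so SIGSEGV (11) shows up as 139.

```shell
#!/bin/sh
# Translate an exit status into a short description.
# 139 = 128 + 11, i.e. the process was killed by SIGSEGV.
describe_status() {
    case "$1" in
        0)   echo "ok" ;;
        139) echo "segmentation fault" ;;
        *)   echo "failed with status $1" ;;
    esac
}

# Run tesseract on one image with the given --psm value and print how the
# process ended. OCR output is discarded; only the exit status is reported.
try_psm() {
    image=$1
    psm=$2
    tesseract "$image" stdout --oem 1 --psm "$psm" -l eng >/dev/null 2>&1
    status=$?
    echo "--psm $psm: $(describe_status "$status")"
}

# Usage (a.jpg is the image from the report):
#   try_psm a.jpg 0
#   try_psm a.jpg 3
```

If tesseract is not on the PATH, the shell reports status 127, which the script prints as an ordinary failure.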