Closed
Description
Hi,
I have an issue after installing the latest simple-ocr 2.3.1 with the pdfsandwich OCR engine on Alfresco 5.2.0 and Ubuntu 16.04 LTS.
All supporting applications were installed with apt-get / dpkg, as follows:
- TESSERACT:
# apt-get install tesseract-ocr tesseract-ocr-eng tesseract-ocr-ind
# tesseract -v
tesseract 3.04.01
leptonica-1.73
libgif 5.1.2 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.4.4 : libopenjp2 2.1.0
- PDFSANDWICH:
# dpkg -i /media/sf_Downloads/Alfresco/Addons/simple-ocr/apps-supported/pdfsandwich_0.1.6_amd64.deb
# apt-get -fy install
# pdfsandwich -version
pdfsandwich version 0.1.6
Error when trying to OCR a document by clicking the OCR button on the document page:
Exception in thread "defaultAsyncAction5" java.lang.RuntimeException: java.lang.RuntimeException: org.alfresco.service.cmr.repository.ContentIOException: 02290026 Failed to perform OCR transformation:
Execution result:
os: Linux
command: /opt/alfresco-community/bin/bw-pdfsandwich.sh -verbose -lang eng+ind -rgb /opt/alfresco-community/tomcat/temp/Alfresco/OCRTransformWorker_source_2287725343312660166.pdf -o /opt/alfresco-community/tomcat/temp/Alfresco/OCRTransformWorker_source_2287725343312660166_ocr.pdf
succeeded: false
exit code: 2
out: pdfsandwich version 0.1.6
Checking for convert:
convert -version
Version: ImageMagick 7.0.5-2 Q16 x86_64 2017-04-04 http://www.imagemagick.org
Copyright: © 1999-2017 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Featur
err: tesseract: /opt/alfresco-community/common/lib/libtiff.so.5: no version information available (required by /usr/lib/liblept.so.5)
tesseract 3.04.01
leptonica-1.73
libgif 5.1.2 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.56 : libtiff 4.0.7 : zli
at es.keensoft.alfresco.ocr.OCRExtractAction.executeImplInternal(OCRExtractAction.java:183)
at es.keensoft.alfresco.ocr.OCRExtractAction.access$200(OCRExtractAction.java:38)
at es.keensoft.alfresco.ocr.OCRExtractAction$1.execute(OCRExtractAction.java:164)
at es.keensoft.alfresco.ocr.OCRExtractAction$1.execute(OCRExtractAction.java:161)
at org.alfresco.repo.transaction.RetryingTransactionHelper.doInTransaction(RetryingTransactionHelper.java:464)
at es.keensoft.alfresco.ocr.OCRExtractAction.executeInNewTransaction(OCRExtractAction.java:169)
at es.keensoft.alfresco.ocr.OCRExtractAction.access$100(OCRExtractAction.java:38)
at es.keensoft.alfresco.ocr.OCRExtractAction$ExtractOCRTask.run(OCRExtractAction.java:151)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: org.alfresco.service.cmr.repository.ContentIOException: 02290026 Failed to perform OCR transformation:
Execution result:
os: Linux
command: /opt/alfresco-community/bin/bw-pdfsandwich.sh -verbose -lang eng+ind -rgb /opt/alfresco-community/tomcat/temp/Alfresco/OCRTransformWorker_source_2287725343312660166.pdf -o /opt/alfresco-community/tomcat/temp/Alfresco/OCRTransformWorker_source_2287725343312660166_ocr.pdf
succeeded: false
exit code: 2
out: pdfsandwich version 0.1.6
Checking for convert:
convert -version
Version: ImageMagick 7.0.5-2 Q16 x86_64 2017-04-04 http://www.imagemagick.org
Copyright: © 1999-2017 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Featur
err: tesseract: /opt/alfresco-community/common/lib/libtiff.so.5: no version information available (required by /usr/lib/liblept.so.5)
tesseract 3.04.01
leptonica-1.73
libgif 5.1.2 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.56 : libtiff 4.0.7 : zli
at es.keensoft.alfresco.ocr.OCRTransformWorker.transform(OCRTransformWorker.java:86)
at es.keensoft.alfresco.ocr.OCRExtractAction.executeImplInternal(OCRExtractAction.java:181)
... 10 more
Caused by: org.alfresco.service.cmr.repository.ContentIOException: 02290026 Failed to perform OCR transformation:
Execution result:
os: Linux
command: /opt/alfresco-community/bin/bw-pdfsandwich.sh -verbose -lang eng+ind -rgb /opt/alfresco-community/tomcat/temp/Alfresco/OCRTransformWorker_source_2287725343312660166.pdf -o /opt/alfresco-community/tomcat/temp/Alfresco/OCRTransformWorker_source_2287725343312660166_ocr.pdf
succeeded: false
exit code: 2
out: pdfsandwich version 0.1.6
Checking for convert:
convert -version
Version: ImageMagick 7.0.5-2 Q16 x86_64 2017-04-04 http://www.imagemagick.org
Copyright: © 1999-2017 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Featur
err: tesseract: /opt/alfresco-community/common/lib/libtiff.so.5: no version information available (required by /usr/lib/liblept.so.5)
tesseract 3.04.01
leptonica-1.73
libgif 5.1.2 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.56 : libtiff 4.0.7 : zli
at es.keensoft.alfresco.ocr.OCRTransformWorker.transform(OCRTransformWorker.java:79)
... 11 more
These are the libtiff files I found on the server:
root@alflab:/usr# ls -l /usr/lib/x86_64-linux-gnu/libtiff.*
lrwxrwxrwx 1 root root 16 Mar 20 23:42 /usr/lib/x86_64-linux-gnu/libtiff.so.5 -> libtiff.so.5.2.4
-rw-r--r-- 1 root root 475496 Mar 20 23:42 /usr/lib/x86_64-linux-gnu/libtiff.so.5.2.4
root@alflab:/usr# ls -l /opt/alfresco-community/common/lib/libtiff.*
-rw-r--r-- 1 root root 781854 Jun 16 2017 /opt/alfresco-community/common/lib/libtiff.a
-rwxr-xr-x 1 root root 1099 Jun 16 2017 /opt/alfresco-community/common/lib/libtiff.la
lrwxrwxrwx 1 root root 16 Mar 24 22:54 /opt/alfresco-community/common/lib/libtiff.so -> libtiff.so.5.2.5
lrwxrwxrwx 1 root root 16 Mar 24 22:54 /opt/alfresco-community/common/lib/libtiff.so.5 -> libtiff.so.5.2.5
-rwxr-xr-x 1 root root 525016 Jun 16 2017 /opt/alfresco-community/common/lib/libtiff.so.5.2.5
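It seems that Ubuntu's leptonica (/usr/lib/liblept.so.5) does not match the libtiff version bundled with Alfresco (/opt/alfresco-community/common/lib/libtiff.so.5). Correct me if I'm wrong.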
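One way to check this hypothesis might be to compare which libtiff the dynamic loader picks for tesseract with and without Alfresco's library directory on LD_LIBRARY_PATH. This is only a sketch: the helper name `resolved_lib` is hypothetical, and it assumes that Alfresco's scripts put /opt/alfresco-community/common/lib on LD_LIBRARY_PATH, which the error message suggests but which I have not confirmed.

```shell
#!/bin/sh
# Hypothetical helper: read ldd-style output on stdin and print the path
# that the library whose name starts with "$1" resolves to.
# ldd lines look like: "libtiff.so.5 => /path/to/libtiff.so.5 (0x...)"
resolved_lib() {
    awk -v name="$1" 'index($1, name) == 1 && $2 == "=>" { print $3 }'
}

# Compare the resolution with the default loader path and with Alfresco's
# library directory forced to the front.
# ASSUMPTION: tesseract is on PATH, and Alfresco's environment sets
# LD_LIBRARY_PATH to include /opt/alfresco-community/common/lib.
if command -v tesseract >/dev/null 2>&1; then
    bin=$(command -v tesseract)
    echo "system:   $(ldd "$bin" | resolved_lib libtiff)"
    echo "alfresco: $(LD_LIBRARY_PATH=/opt/alfresco-community/common/lib ldd "$bin" | resolved_lib libtiff)"
fi
```

If the second line shows the copy under /opt/alfresco-community/common/lib while the first shows the one under /usr/lib/x86_64-linux-gnu, the mismatch would come from the environment that the wrapper script inherits rather than from the Ubuntu packages themselves.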
How can I fix this error?
Thank you,
[bayu]