This repository has been archived by the owner on Mar 17, 2022. It is now read-only.

Pdf renderer isn't working with jpg input images #122

Closed
ArjanSchouten opened this issue Oct 29, 2015 · 5 comments


@ArjanSchouten

When I put a jpg with 24-bit depth into the PDF renderer, the output is a corrupt PDF file. The text is still present in the file, but the image is missing. Is this a known issue? Is it because libjpeg is not present?

I tried to compile tess-two with libjpeg using the following makefile:

LOCAL_PATH := $(call my-dir)

include $(CLEAR_VARS)

LOCAL_MODULE    := libjpeg

LOCAL_EXPORT_C_INCLUDE_DIRS := $(LOCAL_PATH)

LOCAL_SRC_FILES := jaricom.c jcapimin.c jcapistd.c jcarith.c jccoefct.c jccolor.c jcdctmgr.c jchuff.c jcinit.c jcmainct.c jcmarker.c jcmaster.c jcomapi.c jcparam.c jcprepct.c jcsample.c jctrans.c jdapimin.c jdapistd.c jdarith.c jdatadst.c jdatasrc.c jdcoefct.c jdcolor.c jddctmgr.c jdhuff.c jdinput.c jdmainct.c jdmarker.c jdmaster.c jdmerge.c jdpostct.c jdsample.c jdtrans.c jerror.c jfdctflt.c jfdctfst.c jfdctint.c jidctflt.c jidctfst.c jidctint.c jquant1.c jquant2.c jutils.c jmemmgr.c jmemname.c

include $(BUILD_STATIC_LIBRARY)

I changed the leptonica makefile to:

LOCAL_PATH := $(call my-dir)

include $(CLEAR_VARS)

LOCAL_MODULE := liblept

# leptonica (minus freetype)

BLACKLIST_SRC_FILES := \
  %endiantest.c \
  %freetype.c \
  %xtractprotos.c

LEPTONICA_SRC_FILES := \
  $(subst $(LOCAL_PATH)/,,$(wildcard $(LEPTONICA_PATH)/src/*.c))

LOCAL_SRC_FILES := \
  $(filter-out $(BLACKLIST_SRC_FILES),$(LEPTONICA_SRC_FILES))

LOCAL_CFLAGS := \
  -DHAVE_CONFIG_H \
  -DHAVE_LIBJPEG

LOCAL_LDLIBS := \
  -lz

# jni

LOCAL_SRC_FILES += \
  box.cpp \
  boxa.cpp \
  pix.cpp \
  pixa.cpp \
  utilities.cpp \
  readfile.cpp \
  writefile.cpp \
  jni.cpp

LOCAL_C_INCLUDES += \
  $(LOCAL_PATH) \
  $(LEPTONICA_PATH)/src \
  $(LIBPNG_PATH) \
  $(LIBJPEG_PATH)

LOCAL_LDLIBS += \
  -ljnigraphics \
  -llog

# common
LOCAL_SHARED_LIBRARIES := libpngt
LOCAL_STATIC_LIBRARIES += libjpeg
LOCAL_PRELINK_MODULE := false

include $(BUILD_SHARED_LIBRARY)

I can't find anything saying that jpgs are not supported by Tesseract. However, I'm not sure whether libjpeg is needed for this. Is it needed, and if so, could we include it in tess-two?

I also asked this question on Stack Overflow: http://stackoverflow.com/questions/33394810/tesseract-pdf-renderer-with-24-bit-depth-jpg-image

@ArjanSchouten
Author

@rmtheis I just got it working. I had compiled libjpeg incorrectly. Now that I've compiled tess-two with libjpeg, it's working! Would it be an idea to bundle libjpeg with tess-two by default?

Final leptonica Android.mk

LOCAL_PATH := $(call my-dir)

include $(CLEAR_VARS)

LOCAL_MODULE := liblept

# leptonica (minus freetype)

BLACKLIST_SRC_FILES := \
  %endiantest.c \
  %freetype.c \
  %xtractprotos.c

LEPTONICA_SRC_FILES := \
  $(subst $(LOCAL_PATH)/,,$(wildcard $(LEPTONICA_PATH)/src/*.c))

LOCAL_SRC_FILES := \
  $(filter-out $(BLACKLIST_SRC_FILES),$(LEPTONICA_SRC_FILES))

LOCAL_CFLAGS := \
  -DHAVE_CONFIG_H \
  -DHAVE_LIBJPEG

LOCAL_LDLIBS := \
  -lz

# jni

LOCAL_SRC_FILES += \
  box.cpp \
  boxa.cpp \
  pix.cpp \
  pixa.cpp \
  utilities.cpp \
  readfile.cpp \
  writefile.cpp \
  jni.cpp

LOCAL_C_INCLUDES += \
  $(LOCAL_PATH) \
  $(LEPTONICA_PATH)/src \
  $(LIBPNG_PATH) \
  $(LIBJPEG_PATH)

LOCAL_LDLIBS += \
  -ljnigraphics \
  -llog

# common
LOCAL_SHARED_LIBRARIES := libpngt libjpegt
LOCAL_PRELINK_MODULE := false

include $(BUILD_SHARED_LIBRARY)

Libjpeg Android.mk

LOCAL_PATH := $(call my-dir)

include $(CLEAR_VARS)

LOCAL_MODULE    := libjpegt

LOCAL_EXPORT_C_INCLUDE_DIRS := $(LOCAL_PATH)

LOCAL_SRC_FILES := jaricom.c jcapimin.c jcapistd.c jcarith.c jccoefct.c jccolor.c jcdctmgr.c jchuff.c jcinit.c jcmainct.c jcmarker.c jcmaster.c jcomapi.c jcparam.c jcprepct.c jcsample.c jctrans.c jdapimin.c jdapistd.c jdarith.c jdatadst.c jdatasrc.c jdcoefct.c jdcolor.c jddctmgr.c jdhuff.c jdinput.c jdmainct.c jdmarker.c jdmaster.c jdmerge.c jdpostct.c jdsample.c jdtrans.c jerror.c jfdctflt.c jfdctfst.c jfdctint.c jidctflt.c jidctfst.c jidctint.c jquant1.c jquant2.c jutils.c jmemmgr.c jmemname.c

include $(BUILD_SHARED_LIBRARY)
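
The leptonica makefile above references `$(LIBJPEG_PATH)`, but the thread never shows where it is defined. As a rough sketch only, a top-level Android.mk could set it and then pull in the per-library makefiles. The directory name `libjpeg` is an assumption made for illustration, and the other path variables (`LEPTONICA_PATH`, `LIBPNG_PATH`) are assumed to be defined the same way elsewhere in that file.

```makefile
# Top-level jni/Android.mk (sketch; the directory layout is an assumption)
LOCAL_PATH := $(call my-dir)

# Hypothetical location of the unpacked IJG sources plus the libjpeg Android.mk.
LIBJPEG_PATH := $(LOCAL_PATH)/libjpeg

# Include the Android.mk found in each direct subdirectory,
# so the liblept and libjpegt modules are both built.
include $(call all-subdir-makefiles)
```

Compared with the first attempt, the working setup builds the JPEG module as a shared library named `libjpegt` and lists it under `LOCAL_SHARED_LIBRARIES` next to `libpngt`, instead of linking a static `libjpeg`.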

@rmtheis
Owner

rmtheis commented Oct 29, 2015

Cool--thanks for the follow-up. I'm glad to hear that adding libjpeg built and worked for you. Yes I do think it should be added back in.

@ArjanSchouten
Author

OK, it would be nice to have libjpeg in the project. I saw you're using Android's external png library.
I tried to use the jpeg version of it, but that failed. Now I'm using http://www.ijg.org/files/jpegsr8c.zip. But you probably saw that on Stack Overflow.

@rmtheis
Owner

rmtheis commented Feb 3, 2016

@ArjanSchouten I don't understand how having libjpeg would help resolve this specific problem. But libjpeg has been added to the project and it lets you do some cool stuff generally so I think it's a good addition.

@ArjanSchouten
Author

My input image was a color jpg image. You have created a fallback for when the image can't be read by leptonica, but that fallback returns a greyscale bitmap. For leptonica to read a jpg itself, it needs libjpeg.

I'm sorry that I didn't have enough time to send a PR with libjpeg.
