tesseract: make sure tessdata default is version specific, e.g. tessdata4 #1

Open
GerHobbelt opened this issue May 17, 2021 · 0 comments
Labels
enhancement New feature or request

Comments

@GerHobbelt
Owner

After initial installer tests this popped up:

We must ensure that any tessdata install (which will be huge for tesseract 4.x/5.x) DOES NOT collide with the old tesseract 3 data directory in Qiqqa.

One of the easiest ways to accomplish this under all circumstances is to make the default tessdata directory version-specific, e.g. tessdata4 or tessdata5 instead of tessdata.

We could also consider using an engine-algorithm-specific tessdata directory, e.g. tessdata-lstm, but that would be a little more obscure to users going through their setup directories...
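
A minimal sketch of the version-specific default, assuming the directory name is derived from the engine's major version. The helper names `versioned_tessdata_dir` and `default_tessdata_path` are hypothetical and are not part of tesseract or Qiqqa; the actual change may well be a build-time constant instead.

```cpp
#include <string>

// Hypothetical helper (not a tesseract API): builds a default tessdata
// directory name that carries the engine's major version, so tesseract 4/5
// data never lands in the legacy tesseract 3 "tessdata" folder.
std::string versioned_tessdata_dir(int major_version) {
    return "tessdata" + std::to_string(major_version);
}

// Hypothetical helper: joins an install prefix and the versioned directory
// name, adding a separator only when the prefix does not already end in one.
std::string default_tessdata_path(const std::string& install_prefix,
                                  int major_version) {
    std::string path = install_prefix;
    if (!path.empty() && path.back() != '/' && path.back() != '\\') {
        path += '/';
    }
    return path + versioned_tessdata_dir(major_version);
}
```

With this shape, a tesseract 3 install keeps using `tessdata` untouched, while a tesseract 5 install under the same prefix would resolve to a sibling `tessdata5` directory.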

@GerHobbelt GerHobbelt added the enhancement New feature or request label May 17, 2021
GerHobbelt pushed a commit that referenced this issue May 21, 2021
The purpose of a "snapshot" save is to allow us to dump the current
state of a PDF document (including edits, but excluding undo/redo
history) in such a way that the version in memory remains unchanged.

There are a couple of use cases for this:

1) Load a form, fill in some fields, print it. In order to do the
print, we need to save the document as a standard valid PDF to send
to a remote print service. After printing, if we then edit the
document some more and save it out, we only want to see 1 incremental
section used, rather than 2 (i.e. the saving for printing should not
cause the 'underlying' document to be updated).

2) When running as an app on a mobile device, when we are put into
the background, we need to save our state so that if the app is killed
and later restarted, we can pick up where we left off. Again this
should not involve writing a new incremental section to the document.

This commit solves for case #1.

Case #2 will require this, plus both the ability to save undo/redo
history, and the ability to 'reopen' the last incremental update.
GerHobbelt pushed a commit that referenced this issue Dec 9, 2022
$ ./build/sanitize/mutool draw -Dst ./x/tiff/segfault/goat.tiff
page ./x/tiff/segfault/goat.tiff 1AddressSanitizer:DEADLYSIGNAL
=================================================================
==3377970==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000018 (pc 0x55dfed459a1b bp 0x7ffecf93ebc0 sp 0x7ffecf93eac0 T0)
==3377970==The signal is caused by a READ memory access.
==3377970==Hint: address points to the zero page.
    #0 0x55dfed459a1b in fz_convert_pixmap_samples source/fitz/colorspace.c:1421
    #1 0x55dfed57bad0 in fz_convert_pixmap source/fitz/pixmap.c:1065
    #2 0x55dfed481194 in convert_pixmap_for_painting source/fitz/draw-device.c:1682
    #3 0x55dfed482e2c in fz_draw_fill_image source/fitz/draw-device.c:1852
    #4 0x55dfed461d34 in fz_fill_image source/fitz/device.c:351
    #5 0x55dfed7841a0 in img_run_page source/cbz/muimg.c:105
    #6 0x55dfed466fe9 in fz_run_page_contents source/fitz/document.c:642
    #7 0x55dfed467358 in fz_run_page source/fitz/document.c:692
    #8 0x55dfed3ebbc9 in drawband source/tools/mudraw.c:624
    #9 0x55dfed3f0e91 in dodrawpage source/tools/mudraw.c:1125
    #10 0x55dfed3f32c1 in drawpage source/tools/mudraw.c:1460
    #11 0x55dfed3f3716 in drawrange source/tools/mudraw.c:1499
    #12 0x55dfed3f8fcf in mudraw_main source/tools/mudraw.c:2501
    #13 0x55dfed3e9736 in main source/tools/mutool.c:152
    #14 0x7fae19829209 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #15 0x7fae198292bb in __libc_start_main_impl ../csu/libc-start.c:389
    #16 0x55dfed3e8f60 in _start (/home/sebras/src/mupdf/build/sanitize/mutool+0x21bf60)