Adding annotations to PDF leads to memory corruption and crashes #1388
Comments
Cannot reproduce: |
Also tested with a new empty file / page ... no issues with 200 annotations. |
Strange, this is a boiled-down example of something we observed on multiple machines. We're preparing to open-source a tool for LaTeX/git annotation that crashes on a regular basis. I'll try to determine the common factor. User @fiee is also collaborating on this project; I'll ask him for feedback. |
How did you install PyMuPDF? Wheel? |
Using pip3, so I guess no wheel involved. But I'm not an expert in Python library distribution at all. |
BTW: Did you use valgrind? I ask because the corruption seems to start with the very first annotation and only destabilizes the executable over time. |
Depends - probably a wheel if pip was involved. |
No, just python3.8 in a normal terminal window |
I’m on Python 3.9 (python.org) on MacOS 10.14; PyMuPDF was installed with pip3 (yes wheel). |
@fiee - and you experienced crashes on Mac OSX, too? |
Yes, Python segfaults (don’t know if this crash report helps in any way):
On the console:
|
True, it runs blazingly fast, but it seems to be a subtle memory problem in the C part of things. I worked with C extensively for several years but stopped doing much there about ten years ago. |
@fiee - no, the trace as such won't help me. I also have no Mac at hand. @fiee - @ptwz - question to both of you: |
I never tested another version. Actually, I can run @ptwz’s example with 1000 annotations without crashes (needs a minute* or so on my 2012 Mac mini); it happens only in my actual code. Might it be related to annots on more or fewer pages? (Test with 10,000 annotations is still running.) EDIT: * no, much less:
But the 10,000-annotation test never finished. |
Intermittent errors are always a pain in the neck, even more so when they occur only on selected platforms. I repeated the run with my 200 annots under valgrind - never used it before. |
Or it’s related to our TeX-produced PDF files; they seem to use “uncommon” structures like iref streams (see highkite/pdfAnnotate#52 where we tried the same with JavaScript). |
Did you call valgrind using Python's conventional memory management? Python has its own memory allocator that totally confuses valgrind, so the Python team implemented more conventional memory management for use with valgrind. With that you see the true problems, not all the messages generated because valgrind does not understand what Python does. |
Maybe, but my test file was generated with pdftk. I also used scans from my printer, exports from LibreOffice, LuaTeX, ... all with the same result. |
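The setting named in the comment above is not preserved here. One way to get the effect it describes is the `PYTHONMALLOC=malloc` environment variable (available since Python 3.6), which makes CPython use the C library allocator. The following is a minimal sketch under that assumption; the helper name `valgrind_invocation` and the script name are hypothetical.

```python
import os
import sys


def valgrind_invocation(script, python=sys.executable):
    """Build (argv, env) for running a Python script under valgrind.

    Assumption: PYTHONMALLOC=malloc is the switch meant in the thread.
    It routes Python's allocations through the C library malloc, which
    valgrind can track, instead of Python's own small-object allocator.
    """
    env = dict(os.environ)          # copy, so the caller's environment is untouched
    env["PYTHONMALLOC"] = "malloc"
    argv = ["valgrind", python, script]
    return argv, env
```

The result can be passed to `subprocess.run(argv, env=env)` on a machine where valgrind is installed, for example with a hypothetical `annotate_test.py` as the script.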
Python version: 3.9.6 (with optimisations enabled)
Added code from @ptwz and set
Used a newly created "hello-world" article made by TeXstudio:
\documentclass[]{article}
%opening
\title{}
\author{}
\begin{document}
\maketitle
\begin{abstract}
\end{abstract}
\section{}
\end{document}
No issues appeared. |
Your TeX (Live) version might be relevant. I’m on TL2021, and @ptwz and our Jenkins server probably are too, since we needed current packages for CJK. Linux distros often are a year or so behind. Oh, and we use LuaLaTeX (LuaHBTeX, Version 1.13.2). OTOH @ptwz tested with other PDFs with the same result; I still get the segfaults only with my “complete” code. |
@fiee Still no complaints with |
Did you run it through valgrind? A use-after-free might not crash the minimal example because of its simplicity, but the corruption should be visible. |
Could one of you send me a file that reliably crashes on Linux, please? |
It would also be helpful if someone could try a failing script with a pre-1.19.0 version, e.g. 1.18.19. |
From my Mac:
Looks like no problems; script ran with 1000 annotations. On my Linux laptop running Python3.7 and PureOS, valgrind found many problems (I reduced to only 10 annotations) and suggested the |
This looks like what I've seen, especially the invalid reads into freed memory. The leak check is not so critical, as we don't seem to eat all the RAM. |
Just noticed: Reverting to version 1.18.19 fixes both the problems in @fiee's code and in my minimal example: Except for some (maybe spurious) warning about uninitialized memory, valgrind is finally happy with my test code.
|
Had it run with valgrind and
Same Program and stuff, just output with PyMuPDF==1.18.19:
With version 1.18.18:
|
Hi everyone, I think I have found the issue: It was introduced in v1.19.x. Thanks to everyone for bringing this up and all the help to track this down! |
Great news! Thank you for staying with us on this quite weird problem. |
If you give me a list of platform / Py version, I can make preliminary wheels to try out / confirm for you. |
For me, it’s Py3.9 / Mac-i64 and Py3.7 / Linux-64. |
@fiee - ok, I will let you know from where to download. |
Ok these are the wheels. Look at the bottom under "Artifacts". The file there is a ZIP which wraps the resp. wheel. Linux Py 3.6: https://github.com/JorjMcKie/py-mupdf/actions/runs/1454208384 |
Works like a charm for 3.7 and 3.8 on linux 64, both on minimal example and production code. Thank you very much, it seems to fix the problem in my test cases. @fiee, @swamper123 : Please recheck. |
Works flawlessly on my Mac, but valgrind still lists many problems on my Linux machine. They’re all about (possibly/definitely) lost blocks in lost records (from vg_replace_malloc.c:299) – I guess that might be harmless? |
Looks good, same output as in the older versions. |
Resolved by version 1.19.2. |
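Going by the thread, the regression came in with v1.19.x, 1.18.19 is unaffected, and 1.19.2 contains the fix. A small check for whether an installed version falls into that window could look like the sketch below. The function names are hypothetical, and the exact range (1.19.0 up to but not including 1.19.2) is inferred from the comments above rather than from release notes.

```python
def parse_version(text):
    """Turn a version string such as '1.19.1' into a tuple of integers."""
    return tuple(int(part) for part in text.split(".")[:3])


def is_affected(version):
    """True if the version lies in the range the thread reports as broken.

    Assumption: affected means 1.19.0 <= version < 1.19.2.
    """
    return (1, 19, 0) <= parse_version(version) < (1, 19, 2)
```

With PyMuPDF installed, the string to check should be available as `fitz.VersionBind`.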
Describe the bug
When adding multiple pop-up annotations to PDF files, Python crashes randomly with a segfault, a bus error, a corrupted linked list, or a similar memory error.
To Reproduce
Using a document object created with fitz.open, creating an annotation rectangle using fitz.Rect and passing it to the page's add_highlight_annot method, we set up an annotation.
Using set_info we set content, subject and title of the pop-up to be displayed.
Finally we use set_popup to add the popup to the highlight and call the update method of the annotation.
The corruption seems to occur somewhere before calling the update method, as valgrind complains about a use after free right after calling update. Sorry, printf debugging...
See this minimized example:
It does not seem to matter which PDF file is annotated, but I attached a simple one that definitely shows this behavior anyway.
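The steps described above can be sketched roughly as follows. This is a reconstruction from the description, not the reporter's original script: the function names, file paths, rectangle layout, and annotation texts are all placeholders, and PyMuPDF is imported inside the function so that the pure layout helper works without it.

```python
def highlight_rects(count, left=50.0, top=50.0, width=200.0, height=10.0, gap=2.0):
    """Return `count` (x0, y0, x1, y1) tuples stacked down the page.

    Hypothetical layout: rows wrap after 50 entries so they stay on an A4 page.
    """
    rects = []
    for i in range(count):
        y0 = top + (i % 50) * (height + gap)
        rects.append((left, y0, left + width, y0 + height))
    return rects


def add_popup_annotations(pdf_in, pdf_out, count=200):
    """Add `count` highlight annotations with pop-ups to the first page."""
    import fitz  # PyMuPDF; imported here so highlight_rects works without it

    doc = fitz.open(pdf_in)
    page = doc[0]
    for i, coords in enumerate(highlight_rects(count)):
        rect = fitz.Rect(*coords)
        annot = page.add_highlight_annot(rect)   # create the highlight
        annot.set_info(content=f"Comment {i}", subject="Test", title="Reviewer")
        annot.set_popup(rect)                    # attach the pop-up
        annot.update()                           # valgrind reported the error after this call
    doc.save(pdf_out)
    doc.close()
```

Calling `add_popup_annotations("in.pdf", "out.pdf")` on an affected version, ideally under valgrind, would be the way to check for the reported use after free. As the thread shows, a plain run may finish without crashing.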
Your configuration (mandatory)