get_pixmap method stuck on one page and runs forever #3125

Closed
SofiiaChaban opened this issue Feb 2, 2024 · 11 comments
Labels
fix developed release schedule to be determined Fixed in next release upstream bug bug outside this package

Comments

@SofiiaChaban

Description of the bug

I have a script that takes a PDF document URL, iterates through all the pages, generates a pixmap for each page, and uses it to create an image. However, the get_pixmap method gets stuck indefinitely on a particular page, and I'm unable to resolve it even after attempting to add timeouts.

How to reproduce the bug

Code snippet:

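import fitz  # PyMuPDF

DPI = 300
# `page` is one page of the opened document; 72 is the PDF default resolution in points per inch
pixmap = page.get_pixmap(matrix=fitz.Matrix(DPI / 72, DPI / 72))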

The issue occurs on one specific page of the PDF file, even though other pages also contain numerous graphic elements.
Here is the page on which the method gets stuck:
I would appreciate any insights into what might be causing this problem or any guidance on how to handle it, perhaps with the use of timeouts or alternative approaches.

image

PyMuPDF version

1.23.21

Operating system

MacOS

Python version

3.10

@julian-smith-artifex-com
Collaborator

I think this is probably a duplicate of #3072.

@JorjMcKie JorjMcKie added bug duplicate fix developed release schedule to be determined labels Feb 6, 2024
@SofiiaChaban
Author

Hi @julian-smith-artifex-com!
I have tried the new PyMuPDF version (1.23.22) but still have the same problem. The program just gets stuck on this page when trying to execute get_pixmap().

@julian-smith-artifex-com
Collaborator

Ah, interesting. #3072 is definitely fixed in 1.23.22 so this looks like a new (but probably related) problem.

Can you supply your input file and python code that shows the problem?

@mikecummings34

Same problem here. 1.23.22 did not fix the issue for me either.
Not that it's critical information about the bug itself, but the CPU gets pinned at 100 percent, and stopping it requires either a reboot or killing the process. So it's kind of a serious issue and a deal breaker for any code using PyMuPDF.
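
One possible way to contain a hang like this until a fix is available is to render each page in a separate Python process and give up after a deadline. The following is a minimal sketch under assumptions, not something proposed by the maintainers: the names `RENDER_SCRIPT`, `run_with_timeout` and `render_page` are hypothetical, and the 60-second default is arbitrary. It does not fix the hang; it only keeps the calling program responsive and lets the stuck child be killed.

```python
import subprocess
import sys

# Script executed in the child interpreter: renders one page to an image file.
# Running it in its own process means a hang cannot block the caller.
RENDER_SCRIPT = """
import sys
import fitz  # PyMuPDF

pdf_path, page_number, dpi, out_path = sys.argv[1:5]
doc = fitz.open(pdf_path)
page = doc[int(page_number)]
zoom = int(dpi) / 72
page.get_pixmap(matrix=fitz.Matrix(zoom, zoom)).save(out_path)
"""


def run_with_timeout(script, args, timeout_s):
    """Run `script` in a fresh Python process.

    Returns True if it exits with status 0 within `timeout_s` seconds,
    False if it fails or exceeds the deadline.
    """
    cmd = [sys.executable, "-c", script, *map(str, args)]
    try:
        # subprocess.run kills the child process when the timeout expires.
        result = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def render_page(pdf_path, page_number, out_path, dpi=300, timeout_s=60):
    """Hypothetical helper: render one page, giving up after `timeout_s`."""
    return run_with_timeout(
        RENDER_SCRIPT, [pdf_path, page_number, dpi, out_path], timeout_s
    )
```

A caller could loop over the page numbers, call `render_page(...)` for each one, and log the pages for which it returns False instead of waiting forever.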

@julian-smith-artifex-com
Collaborator

Just to be clear - we will need a reproducer for this problem if we are to investigate and fix it. If anyone has an example file that they can post, please do so here.

@kblevins

Unfortunately I am not able to share files that created this issue for me, but I can tell you that reverting to 1.21.0 has removed the problem. I will try to get a redacted file to you to test the issue with the current version.

@julian-smith-artifex-com
Collaborator

> Unfortunately I am not able to share files that created this issue for me, but I can tell you that reverting to 1.21.0 has removed the problem. I will try to get a redacted file to you to test the issue with the current version.

Thank you, I'm looking forward to receiving your redacted file.

@kblevins

kblevins commented Mar 7, 2024

@julian-smith-artifex-com how do you recommend creating a redacted file for testing this issue?

@julian-smith-artifex-com
Collaborator

> @julian-smith-artifex-com how do you recommend creating a redacted file for testing this issue?

You could use Document.select(page_number) then Document.ez_save(), to create a document that has only one page that shows the problem.

Alternatively, could you email the document directly to me at julian.smith@artifex.com? Then I'll be able to take a look without it becoming public.
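
The select-then-save suggestion above could look roughly like the sketch below. The helper names `save_single_page` and `make_reproducer` are hypothetical. It assumes 0-based page numbers and passes a one-element list, since `Document.select()` takes a sequence of page numbers. This only drops the other pages; it does not remove any content from the page that is kept.

```python
def save_single_page(doc, page_number, out_path):
    """Reduce an open PyMuPDF document to one page and save it.

    `doc` is expected to be a fitz.Document; `page_number` is 0-based.
    The document object is modified in memory; the source file is untouched.
    """
    if not 0 <= page_number < doc.page_count:
        raise IndexError(f"page {page_number} is out of range")
    doc.select([page_number])  # keep only the listed page
    doc.ez_save(out_path)      # write the reduced document to a new file


def make_reproducer(src_path, page_number, out_path):
    """Open `src_path` with PyMuPDF and write a one-page copy to `out_path`."""
    import fitz  # PyMuPDF, imported here so the helper above stays dependency-free

    with fitz.open(src_path) as doc:
        save_single_page(doc, page_number, out_path)
```

For example, `make_reproducer("input.pdf", 41, "reproducer.pdf")` would write a file containing only the 42nd page; both file names and the page number are placeholders.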

@julian-smith-artifex-com
Collaborator

An update on this: a build of PyMuPDF with the latest MuPDF in git does not hang.

So the problem will be fixed in PyMuPDF very soon after the next MuPDF release, which should be in the next week or two.

@julian-smith-artifex-com julian-smith-artifex-com added upstream bug bug outside this package fix developed release schedule to be determined and removed example required bug labels Mar 9, 2024
@julian-smith-artifex-com
Collaborator

Fixed in 1.24.0.
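
For code that has to run against whatever PyMuPDF version is installed, a small version gate is one option. This is a sketch that assumes plain numeric version strings such as "1.23.22"; `parse_version`, `has_fix` and `FIXED_IN` are hypothetical names. It only tests for the release that contains the fix. It says nothing about which older releases were affected (1.21.0 was reported in this thread as not hanging).

```python
FIXED_IN = (1, 24, 0)  # release named in this thread as containing the fix


def parse_version(text):
    """Turn a version string like '1.23.22' into a tuple of ints (1, 23, 22)."""
    return tuple(int(part) for part in text.split(".")[:3])


def has_fix(version_text):
    """Return True if `version_text` is 1.24.0 or later."""
    # Tuples compare element by element, so (1, 23, 22) < (1, 24, 0),
    # whereas comparing the raw strings would give unreliable results.
    return parse_version(version_text) >= FIXED_IN
```

The installed version string can be obtained with the standard library, for example `importlib.metadata.version("PyMuPDF")`.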


5 participants