segfault when getting the number of pages #452

Closed
lsamper opened this issue Feb 28, 2023 · 3 comments

lsamper commented Feb 28, 2023

pikepdf segfaults when the pages attribute is read directly from the result of Pdf.open(), without storing the Pdf in a variable first:

>>> import pikepdf
>>> pikepdf.Pdf.open("test_facture_template_word.pdf").pages
Segmentation fault (core dumped)

However, when the Pdf is first assigned to a variable, it works:

>>> import pikepdf
>>> pdf = pikepdf.Pdf.open("test_facture_template_word.pdf")
>>> pdf.pages
<pikepdf._core.PageList len=1>

It fails on the latest version of pikepdf (7.1.1), with any PDF file.

Steps to reproduce:

$ python3.8 -m venv ~/.virtualenvs/segfault_pikepdf
$ source ~/.virtualenvs/segfault_pikepdf/bin/activate
$ pip install pikepdf
Collecting pikepdf
  Downloading pikepdf-7.1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.3 MB)
     |████████████████████████████████| 2.3 MB 2.7 MB/s 
Collecting lxml>=4.8
  Downloading lxml-4.9.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (7.1 MB)
     |████████████████████████████████| 7.1 MB 11.3 MB/s 
Collecting deprecation
  Using cached deprecation-2.1.0-py2.py3-none-any.whl (11 kB)
Collecting packaging
  Using cached packaging-23.0-py3-none-any.whl (42 kB)
Collecting Pillow>=9.0
  Using cached Pillow-9.4.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.3 MB)
Installing collected packages: lxml, packaging, deprecation, Pillow, pikepdf
Successfully installed Pillow-9.4.0 deprecation-2.1.0 lxml-4.9.2 packaging-23.0 pikepdf-7.1.1
$ python
Python 3.8.10 (default, Nov 14 2022, 12:59:47) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pikepdf
>>> pikepdf.__version__
'7.1.1'
>>> pikepdf.__libqpdf_version__
'11.2.0'
>>> pikepdf.Pdf.open("Microsoft_invoice_2.pdf").pages
Segmentation fault (core dumped)

lsamper (Author) commented Feb 28, 2023

This segfault occurs with versions 7.1.0 and 7.1.1.

With version 7.0.0, the same code also fails, but it raises an exception instead of crashing:

>>> pikepdf.Pdf.open("filename.pdf").pages
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
pikepdf._core.PdfError: attempted to dereference an uninitialized pikepdf.Object

With version 6.2.9, it works.

mara004 (Contributor) commented Feb 28, 2023

Is it possible that this has to do with the v7 change that dependent objects no longer keep their parent alive?
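
To illustrate the kind of lifetime problem being suggested here, the following is a pure-Python analogy, not pikepdf's actual implementation. The classes Document and PageList are hypothetical stand-ins: the dependent object holds only a weak reference to its parent, so a temporary parent is released before the dependent object is used. The sketch relies on CPython's reference counting to free the temporary immediately.

```python
import weakref


class PageList:
    """Hypothetical dependent object that does NOT keep its parent alive."""

    def __init__(self, doc):
        # Only a weak reference: the parent may be freed while we still exist.
        self._doc = weakref.ref(doc)

    def __len__(self):
        doc = self._doc()
        if doc is None:
            # In native code, touching a freed parent could crash instead.
            raise RuntimeError("parent document has already been released")
        return doc.n_pages


class Document:
    """Hypothetical stand-in for a parent object such as an opened PDF."""

    def __init__(self, n_pages):
        self.n_pages = n_pages

    @property
    def pages(self):
        return PageList(self)


# Temporary parent: nothing references the Document once .pages returns,
# so (under CPython reference counting) it is freed right away.
orphaned = Document(1).pages
try:
    len(orphaned)
except RuntimeError as exc:
    print("direct access failed:", exc)

# Named parent: the variable keeps the Document alive while pages is used.
doc = Document(1)
print("with a variable:", len(doc.pages))
```

In this analogy the failure surfaces as a Python exception; the analogous situation in a C++ extension can surface as a segfault if the freed parent is dereferenced without a check.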

jbarlow83 (Member) commented

Yes, this is due to the v7 change. The expected behavior is now something like the 7.0 behavior. In any case, I won't have time to fix this for a while. As a workaround, assign pdf = pikepdf.Pdf.open(...) first and all will be well.
