Description
Description of the bug
Opening a PDF document from a byte stream with fitz.open(stream=pdf_content, filetype="pdf") in PyMuPDF version 1.26.4 fails with one of two errors, so the document never opens.
Environment
- PyMuPDF Version: 1.26.4
- Python Version: 3.12.11
- Operating System: Linux (Docker container)
- Installation Method: pip/uv
Bug 1: AttributeError in needs_pass property
Error Message
AttributeError: 'FzDocument' object has no attribute 'super'
Stack Trace
Traceback (most recent call last):
  File "pdf_processor.py", line 325, in _compress_with_settings
    doc = fitz.open(stream=pdf_content, filetype="pdf")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/pymupdf/__init__.py", line 3008, in __init__
    if self.needs_pass:
       ^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/pymupdf/__init__.py", line 5021, in needs_pass
    document = self.this if isinstance(self.this, mupdf.FzDocument) else self.this.super()
                                                                         ^^^^^^^^^^^^^^^
AttributeError: 'FzDocument' object has no attribute 'super'
Root Cause
In pymupdf/__init__.py line 5021, the isinstance(self.this, mupdf.FzDocument) check evaluates to false, so the code falls through to self.this.super(). The error message nevertheless reports self.this as an 'FzDocument' object, and that object has no super() method.
Bug 2: AssertionError in _loadOutline
Error Message
AssertionError
Stack Trace
Traceback (most recent call last):
  File "pdf_processor.py", line 362, in _compress_with_settings
    doc = fitz.open(stream=pdf_content, filetype="pdf")
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/pymupdf/__init__.py", line 3011, in __init__
    self.init_doc()
  File "/app/.venv/lib/python3.12/site-packages/pymupdf/__init__.py", line 4463, in init_doc
    self._outline = self._loadOutline()
                    ^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/pymupdf/__init__.py", line 3491, in _loadOutline
    assert isinstance( doc, mupdf.FzDocument)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
Root Cause
In pymupdf/__init__.py line 3491, an assertion fails when checking if doc is an instance of mupdf.FzDocument, even though it should be.
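Both errors involve an isinstance check against mupdf.FzDocument that fails for an object whose class is named FzDocument. One possible explanation, which is only a guess and not confirmed by anything in this report, is that two distinct class objects share that name. The sketch below shows how such a mismatch could be detected. The helper describe_type_mismatch and the two stand-in classes are hypothetical and are not part of PyMuPDF.

```python
def describe_type_mismatch(obj, expected_cls):
    """Compare the class of obj with expected_cls and report the differences."""
    actual = type(obj)
    return {
        "isinstance": isinstance(obj, expected_cls),
        "same_name": actual.__name__ == expected_cls.__name__,
        "same_object": actual is expected_cls,
        "actual_module": actual.__module__,
        "expected_module": expected_cls.__module__,
    }


# Stand-ins (assumption): two unrelated classes that merely share a name,
# as might happen if two copies of a binding module were loaded.
copy_a = type("FzDocument", (), {"__module__": "binding_copy_a"})
copy_b = type("FzDocument", (), {"__module__": "binding_copy_b"})

report = describe_type_mismatch(copy_a(), copy_b)
for key, value in report.items():
    print(f"{key}: {value}")
```

If the same comparison were run in a debugger at the failing line, with self.this as obj and mupdf.FzDocument as expected_cls, differing module names or a false same_object entry would point toward duplicate class objects rather than a logic error in the check itself.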
Reproduction Code
import fitz
import io
# Read any PDF file
with open("sample.pdf", "rb") as f:
    pdf_content = f.read()
# This fails with the bugs described above
try:
    doc = fitz.open(stream=pdf_content, filetype="pdf")
    print(f"Document opened: {len(doc)} pages")
    doc.close()
except (AttributeError, AssertionError) as e:
    print(f"Error: {type(e).__name__}: {e}")
Expected Behavior
The PDF should open successfully from a byte stream without errors.
Actual Behavior
Opening PDFs from byte streams fails with either:
- AttributeError: 'FzDocument' object has no attribute 'super'
- AssertionError in _loadOutline()
Workaround
Opening the PDF from a file path works correctly:
import fitz
import tempfile
import os
# Save to temporary file and open from path
with tempfile.NamedTemporaryFile(suffix='.pdf', delete=False) as tmp:
    tmp.write(pdf_content)
    tmp_path = tmp.name
try:
    doc = fitz.open(tmp_path)  # This works
    print(f"Document opened: {len(doc)} pages")
    doc.close()
finally:
    os.unlink(tmp_path)
Impact
This bug prevents using PyMuPDF with in-memory PDF data, which is critical for:
- Web applications processing uploaded PDFs
- Microservices handling PDF streams
- PDF processing pipelines that avoid disk I/O
- Cloud functions with read-only filesystems
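Until stream opening works, services like these could wrap the temporary-file workaround in a small context manager so the cleanup is not repeated at every call site. This is a minimal sketch using only the standard library. The name bytes_as_temp_path is hypothetical, and the approach still needs a writable temporary directory, so it may not help on a strictly read-only filesystem.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def bytes_as_temp_path(data: bytes, suffix: str = ".pdf"):
    """Write data to a temporary file, yield its path, then delete the file."""
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        with tmp:
            tmp.write(data)
        # The file is closed here, so it can be reopened by path.
        yield tmp.name
    finally:
        os.unlink(tmp.name)


# Intended use with the call from this report (not executed here):
# with bytes_as_temp_path(pdf_content) as path:
#     doc = fitz.open(path)
#     print(f"Document opened: {len(doc)} pages")
#     doc.close()
```

The document should be closed inside the with block, because the file is deleted as soon as the block exits.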
Additional Information
- The bugs appear to be related to how PyMuPDF handles the internal document object when created from streams vs. file paths
- Both fitz.open() and fitz.Document() constructors are affected
- The issue does NOT occur when opening PDFs from file paths
- This affects document processing workflows in production environments
Suggested Fix
- In line 5021, check if self.this has a super attribute before calling it
- In line 3491, review the assertion logic for stream-opened documents
- Ensure consistent behavior between file-based and stream-based document opening
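The first suggestion could look roughly like the sketch below. It uses stand-in classes rather than PyMuPDF's real types, and base_document is a hypothetical helper, not the library's actual code or an accepted fix. Note that a guard like this would only avoid the AttributeError; it would not explain why the isinstance check fails in the first place.

```python
class FakeFzDocument:
    """Stand-in for an object that is already the base document (no super())."""


class FakePdfDocument:
    """Stand-in for a wrapper that exposes its base document via super()."""

    def __init__(self, base):
        self._base = base

    def super(self):
        return self._base


def base_document(this):
    """Return the base document, calling super() only when it exists."""
    super_method = getattr(this, "super", None)
    return super_method() if callable(super_method) else this


base = FakeFzDocument()
wrapped = FakePdfDocument(base)
print(base_document(base) is base)     # object without super() is returned as-is
print(base_document(wrapped) is base)  # wrapper is unwrapped through super()
```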
Related Issues
Please let me know if this is related to any existing issues or if additional debugging information would be helpful.
How to reproduce the bug
PyMuPDF version
1.26.5
Operating system
Linux
Python version
3.12