When adding a transformation t to a page, e.g. to scale the content (t.scale(...)) or to move it around the page (t.translate(...)), we observe a huge increase in file size. For a 798 KB test file, the resulting PDF is 1.6 MB. For larger files, the size can grow by a factor of about 5, e.g. from ~11 MB to ~53 MB.
The size increase occurs even when no actual transformation is applied and we only call page.add_transformation(Transformation()).
Environment
Reproducible on Python 3.7+ with pypdf 3.16.2; also tested with 4.0.1 (latest), which shows the same behavior.
Code + PDF
This is a minimal, complete example that shows the issue:
for page in writer.pages:
    # ⚠️ This has to be done on the writer, not the reader!
    page.compress_content_streams(level=9)  # This is CPU intensive!

Adding this before writing the output results in a 715 KB file.
My guess is that we need to decompress the content to apply the transformation, but we don't re-compress it afterwards. It makes sense not to do this by default, as compression takes time.
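The effect this guess describes can be illustrated with the standard library alone. The sketch below does not use pypdf and does not prove what pypdf does internally; the sample `content_stream` is a made-up stand-in for a page content stream. It only shows how much a Flate (zlib) compressed stream grows when it is stored raw, and that re-compressing at level 9 shrinks it again.

```python
import zlib

# Hypothetical stand-in for a page content stream: repetitive PDF drawing operators.
content_stream = b"q 1 0 0 1 72 720 cm BT /F1 12 Tf (Hello) Tj ET Q\n" * 2000

# Roughly what a FlateDecode-compressed stream stores in the original file.
compressed = zlib.compress(content_stream)

# What ends up in the output if the stream is decoded for editing and written back raw.
rewritten_raw = zlib.decompress(compressed)

# Re-compressing at the highest level, analogous to level=9 above.
recompressed = zlib.compress(rewritten_raw, 9)

print("compressed:  ", len(compressed), "bytes")
print("raw:         ", len(rewritten_raw), "bytes")
print("recompressed:", len(recompressed), "bytes")
```

Real content streams are less repetitive than this sample, so the ratio will be smaller in practice, but the direction matches the sizes reported above.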
Attachments: result-file.pdf, test-file.pdf