Adding transformation to a page can increase the file size by a factor of >2 #2436

lschlesinger · 2024-02-02T10:33:29Z

When adding a transformation t to a page, e.g. to scale the content (t.scale(...)) or to move it around the page (t.translate(...)), we observe a large increase in file size. For a 798 KB test file, the resulting PDF is 1.6 MB. For larger files, the size seems to grow by as much as a factor of 5, e.g. from ~11 MB to ~53 MB.
The increase occurs even when no actual transformation is applied and we only call page.add_transformation(Transformation()).

Environment

Reproducible on Python 3.7+ with pypdf 3.16.2; also tested with 4.0.1 (latest), which shows the same behavior.

Code + PDF

This is a minimal, complete example that shows the issue:

from pypdf import PdfWriter, Transformation

def transform_pdf(
    pdf_path: str,
    output_pdf: str,
):
    writer = PdfWriter(clone_from=pdf_path)
    for page in writer.pages:
        page.add_transformation(Transformation())
    writer.write(output_pdf)


if __name__ == '__main__':
    transform_pdf("test-file.pdf", "result-file.pdf")

result-file.pdf
test-file.pdf

MartinThoma · 2024-02-03T07:39:34Z

Please have a look at https://pypdf.readthedocs.io/en/latest/user/file-size.html

Adding the block

    for page in writer.pages:
        # ⚠️ This has to be done on the writer, not the reader!
        page.compress_content_streams(level=9)  # This is CPU intensive!

will result in a 715K file.

My guess is that we need to decompress the page content to apply the transformation, but we don't re-compress it afterwards. It makes sense not to do that by default, as compression takes time.
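To illustrate why content that is written back without re-compression can grow this much, here is a minimal sketch using only Python's standard zlib module (the same DEFLATE algorithm behind PDF's FlateDecode filter). It does not use pypdf and does not reproduce its internals; the `raw_stream` bytes are a made-up stand-in for a page content stream, and `flate_size` is a hypothetical helper for this example only.

```python
import zlib

# Made-up stand-in for a page content stream: PDF drawing operators
# tend to be repetitive text, which is why they compress well.
raw_stream = b"q 1 0 0 1 72 720 cm BT /F1 12 Tf (Hello) Tj ET Q\n" * 2000


def flate_size(data: bytes, level: int = 9) -> int:
    """Return the size in bytes of `data` after zlib (DEFLATE) compression."""
    return len(zlib.compress(data, level))


raw_size = len(raw_stream)
compressed_size = flate_size(raw_stream)

print(f"uncompressed: {raw_size} bytes")
print(f"compressed (level 9): {compressed_size} bytes")
```

The gap between the two numbers depends on how repetitive the data is, so the ratio for a real PDF will differ from this toy input. Level 9 is the slowest zlib setting, which matches the "CPU intensive" warning above.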

MartinThoma · 2024-02-03T07:39:59Z

If this solves your issue, please close it :-)

If it doesn't, please let me know why.

lschlesinger · 2024-02-05T08:30:53Z

Thanks for the quick response @MartinThoma, it seems to solve the issue 👍

lschlesinger closed this as completed Feb 5, 2024
