ENH: Restore merge_transformed_page & co + add 'over' parameter #1567

Merged
merged 21 commits into from
Feb 5, 2023

Conversation

pubpub-zz
Collaborator

@pubpub-zz pubpub-zz commented Jan 21, 2023

Added the over parameter

Fixes #1426
Fixes #1601

pubpub-zz and others added 6 commits January 18, 2023 21:39
for performance
was not sure it would have worked😉

Co-authored-by: Martin Thoma <info@martin-thoma.de>
@pubpub-zz
Collaborator Author

still in progress to confirm use cases

@codecov

codecov bot commented Jan 21, 2023

Codecov Report

Base: 91.90% // Head: 91.94% // Increases project coverage by +0.03% 🎉

Coverage data is based on head (e24367b) compared to base (6ec88ad).
Patch coverage: 86.06% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1567      +/-   ##
==========================================
+ Coverage   91.90%   91.94%   +0.03%     
==========================================
  Files          33       33              
  Lines        6252     6368     +116     
  Branches     1244     1271      +27     
==========================================
+ Hits         5746     5855     +109     
+ Misses        326      325       -1     
- Partials      180      188       +8     
| Impacted Files | Coverage Δ |
|---|---|
| pypdf/_writer.py | 84.55% <85.71%> (-0.03%) ⬇️ |
| pypdf/_page.py | 90.51% <86.08%> (+0.66%) ⬆️ |

@pubpub-zz
Collaborator Author

pending a last check but ready for review

@MartinThoma
Member

This includes the (decimal -> float) change, so I suggest we first handle #1563 before we merge this PR.

@pubpub-zz
Collaborator Author

> This includes the (decimal -> float) change, so I suggest we first handle #1563 before we merge this PR.

Agreed; that was the subject of my comment in #1563. However, I needed the fix to complete my tests.

MartinThoma pushed a commit that referenced this pull request Feb 4, 2023
Decimal was replaced by float in order to fix bugs.

It might also improve speed in some cases.
It is a preparation for #1567

Fixes #1527
Fixes #1376
pypdf/_page.py Outdated
@@ -315,7 +315,9 @@ def __repr__(self) -> str:
         return f"Transformation(ctm={self.ctm})"

     def apply_on(
-        self, pt: Union[Tuple[Decimal, Decimal], Tuple[float, float], List[float]]
+        self,
+        pt: Union[Tuple[Decimal, Decimal], Tuple[float, float], List[float]],
Member

Is this still Decimal or is it float?

Collaborator Author

Agreed, Decimal is no longer required.

                value = value.decode()
            else:
                value = str_(value)
            value = float(value)
            return float.__new__(cls, value)
        except Exception as e:
            # If this isn't a valid decimal (happens in malformed PDFs)
Member

I guess this has to be replaced by "float"?

Collaborator Author

Normally part of the PR on float.
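
For readers following this review thread, the parsing pattern in the snippet above can be sketched roughly as follows. `ToyFloat` is a hypothetical name, and the fallback to 0.0 on malformed input is an assumption made for illustration; nothing in this thread confirms what pypdf does in that branch.

```python
from typing import Union


class ToyFloat(float):
    """Hypothetical float subclass that accepts str, bytes, or plain numbers."""

    def __new__(cls, value: Union[str, bytes, float, int] = "0.0") -> "ToyFloat":
        try:
            if isinstance(value, bytes):
                # Raw PDF tokens may arrive as bytes; decode before parsing.
                value = value.decode()
            return float.__new__(cls, float(value))
        except (ValueError, UnicodeDecodeError):
            # Assumption: malformed input falls back to 0.0 instead of raising.
            return float.__new__(cls, 0.0)


print(ToyFloat(b"1.5") + 1)      # 2.5
print(ToyFloat("not-a-number"))  # 0.0
```

Because the class derives from `float`, the comment about "a valid decimal" in the reviewed code is only a wording leftover: the parse itself is a plain `float(...)` call.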

@MartinThoma MartinThoma changed the title merge_transformed_page & co BUG: merge_transformed_page & co Feb 4, 2023
@MartinThoma MartinThoma added soon PRs that are almost ready to be merged, issues that get solved pretty soon is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF labels Feb 4, 2023
@pubpub-zz
Collaborator Author

All good?

@MartinThoma
Member

Github shows me that there is an update to the git submodule sample-files, but it doesn't show me to which commit it's updated. That is super weird.

Do you happen to know what was changed?

@MartinThoma MartinThoma changed the title BUG: merge_transformed_page & co BUG: Restore merge_transformed_page & co Feb 5, 2023
@MartinThoma MartinThoma merged commit 2d60c71 into py-pdf:main Feb 5, 2023
@MartinThoma
Member

Thank you for the PR @pubpub-zz 🤗 I'll release it today

@pubpub-zz
Collaborator Author

Just a small point from me on this:

> Github shows me that there is an update to the git submodule sample-files, but it doesn't show me to which commit it's updated. That is super weird.
>
> Do you happen to know what was changed?

I have issues with it too: Git reports changes in the submodule, but I am unable to discard them. I made no changes to it.

PS: Shouldn't we consider this PR an enhancement, since it adds some new functions?

@MartinThoma
Member

An enhancement for me is something new the user can do which they couldn't do before. Is there anything like that?

To me it looks like the merging functionality has been fixed, but it did work at some point before (right?)

@pubpub-zz
Collaborator Author

merge_transformed_page & co have been restored from the old names. These functions also now have the new over parameter.

@MartinThoma
Member

By the way: Why did you introduce that parameter? (I should have asked that before)

Aren't those two equivalent?

page1.merge_scaled_page(page2, over=True)
page2.merge_scaled_page(page1, over=False)

@MartinThoma MartinThoma removed the soon PRs that are almost ready to be merged, issues that get solved pretty soon label Feb 5, 2023
@pubpub-zz
Collaborator Author

> By the way: Why did you introduce that parameter? (I should have asked that before)
>
> Aren't those two equivalent?
>
> page1.merge_scaled_page(page2, over=True)
> page2.merge_scaled_page(page1, over=False)

No.

First, the syntax is page1.merge_scaled_page(page2, .5, over=True): the scale applies to page2 only, so you cannot simply swap page1 and page2.

Second, the standard case is that the page has already been added to the writer (I dislike the idea of modifying pages in readers), so page1 cannot easily be replaced.
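
The first point can be illustrated with a toy model. This is a sketch only: `ToyPage` is a hypothetical stand-in, not pypdf's `PageObject`, and it assumes that `over=True` paints the merged page on top while `over=False` paints it underneath, with the scale applied only to the page passed as argument.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyPage:
    # Ordered drawing operations as (label, scale); later entries are painted on top.
    ops: List[Tuple[str, float]] = field(default_factory=list)

    def merge_scaled_page(self, page2: "ToyPage", scale: float, over: bool = True) -> None:
        # Only the content coming from page2 is scaled; self keeps its own scale.
        scaled = [(label, s * scale) for label, s in page2.ops]
        # over=True: page2 goes on top (after); over=False: underneath (before).
        self.ops = self.ops + scaled if over else scaled + self.ops


a, b = ToyPage([("page1", 1.0)]), ToyPage([("page2", 1.0)])
a.merge_scaled_page(b, 0.5, over=True)

c, d = ToyPage([("page1", 1.0)]), ToyPage([("page2", 1.0)])
d.merge_scaled_page(c, 0.5, over=False)

print(a.ops)  # [('page1', 1.0), ('page2', 0.5)]
print(d.ops)  # [('page1', 0.5), ('page2', 1.0)]
```

Both calls leave page1 beneath page2, but a different page is scaled and a different object is modified, which is why the two calls are not interchangeable.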

@MartinThoma
Member

> first the syntax is page1.merge_scaled_page(page2, .5, over=True), therefore you can not "swap" page_1/page_2

Good point! That's something I completely missed 🤦

Yes, in that case I agree that it's actually a new feature / enhancement 👍 Thanks for explaining 🤗

@MartinThoma MartinThoma changed the title BUG: Restore merge_transformed_page & co ENH: Restore merge_transformed_page & co + add 'over' parameter Feb 5, 2023
MartinThoma added a commit that referenced this pull request Feb 5, 2023
NOTICE: pypdf changed the way it represents numbers parsed from PDF files.
  pypdf<3.4.0 represented numbers as Decimal, pypdf>=3.4.0 represents them as
  floats. Several other PDF libraries do this, as well as many PDF viewers.
  We hope to fix issues with too high precision like this and get a speed boost.
  In case your PDF documents rely on more than 18 decimals of precision you
  should check if it still works as expected.
  To clarify: This does not affect the text shown in PDF documents. It affects
  numbers, e.g. when graphics are drawn on the PDF or very exact positions are
  used. Typically, 5 decimals should be enough.
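
The precision trade-off described in the notice can be checked with the standard library alone. The sketch below parses a made-up 25-decimal value both ways; the value is illustrative and does not come from any real PDF. An IEEE 754 double keeps roughly 15 to 17 significant decimal digits, so the extra digits are rounded away, while a 5-decimal coordinate round-trips unchanged.

```python
from decimal import Decimal

# Made-up number with 25 decimals, standing in for a value read from a PDF.
raw = "0.1234567890123456789012345"

as_decimal = Decimal(raw)  # arbitrary precision: every digit is kept
as_float = float(raw)      # double precision: rounded to the nearest double

print(as_decimal)      # 0.1234567890123456789012345
print(repr(as_float))  # shorter than raw: the trailing digits are gone

# A typical coordinate with 5 decimals survives the conversion unchanged.
coord = float("12.34567")
print(f"{coord:.5f}")  # 12.34567
```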

New Features (ENH)
-  Enable merging forms with overlapping names (#1553)
-  Add 'over' parameter to merge_transformed_page & co (#1567)

Bug Fixes (BUG)
-  Fix getter of the PageObject.rotation property with an indirect object (#1602)
-  Restore merge_transformed_page & co (#1567)
-  Replace decimal by float (#1563)

Robustness (ROB)
-  PdfWriter.remove_images: /Contents might not be in page_ref (#1598)

Developer Experience (DEV)
-  Introduce ruff (#1586, #1609)

Maintenance (MAINT)
-  Remove decimal (#1608)

[Full Changelog](3.3.0...3.4.0)
@pubpub-zz pubpub-zz deleted the merge_trsf_page branch June 24, 2023 08:41
Labels
is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Page merging broken since #1371
Alternative to add_transformation translate
2 participants