MAINT: PdfReaderProtocol #1303

Merged
merged 5 commits into from Aug 31, 2022

MartinThoma
Member

@MartinThoma MartinThoma commented Aug 29, 2022

PyPDF2 has some circular dependencies between classes that make proper typing hard:

  • PdfReader has the pages property which returns a List[PageObject]
  • PageObject has the pdf property which returns the PdfReader it belongs to

The simplest solution would be to put both classes in the same file, but that makes PRs hard to read. Additionally, bigger files mean merge conflicts happen more often.

Another solution is to skip type annotations for one of the objects (or use Any as the type).

The solution implemented in this PR is to define a Protocol (PEP 544). A protocol just states which methods a class is expected to have, along with their signatures. It's duck typing that a type checker can verify: if it walks like a duck and quacks like a duck, it is treated as a duck.

So we define the expected behavior instead of referencing the specific class.
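
A minimal sketch of the idea, not the code from this PR: the names `ReaderProtocol`, `Page`, and `Reader` are hypothetical stand-ins, and everything lives in one file only so that the example runs on its own. The point is that `Page` is annotated against the protocol, so its module would not need to import the concrete reader class.

```python
from typing import Any, List, Protocol


class ReaderProtocol(Protocol):
    """Hypothetical protocol: any object with a `pages` property matches."""

    @property
    def pages(self) -> List[Any]:
        ...


class Page:
    """Stand-in page class; it is typed against the protocol only."""

    def __init__(self, pdf: ReaderProtocol) -> None:
        self.pdf = pdf


class Reader:
    """Stand-in reader class; it does not inherit from ReaderProtocol."""

    def __init__(self, page_count: int) -> None:
        self._pages = [Page(self) for _ in range(page_count)]

    @property
    def pages(self) -> List[Page]:
        return self._pages


reader = Reader(page_count=3)
first_page = reader.pages[0]
```

`Reader` satisfies `ReaderProtocol` purely by having a matching `pages` property, so a type checker accepts `Page(self)` without `Reader` ever naming the protocol.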

typing.Iterable is an example of a Protocol. In the Java world, one would call this an interface.
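
To illustrate with `typing.Iterable`: the hypothetical `Countdown` class below never mentions `Iterable`, yet it is accepted wherever an `Iterable[int]` is expected because it defines `__iter__`. This is only a sketch of the general behavior, not code from PyPDF2.

```python
from typing import Iterable, Iterator


class Countdown:
    """Hypothetical class that does not inherit from Iterable."""

    def __init__(self, start: int) -> None:
        self.start = start

    def __iter__(self) -> Iterator[int]:
        # Yields start, start - 1, ..., 1
        return iter(range(self.start, 0, -1))


def total(values: Iterable[int]) -> int:
    """Accepts anything that can be iterated over."""
    return sum(values)


result = total(Countdown(3))
is_iterable = isinstance(Countdown(3), Iterable)
```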

@pubpub-zz
Collaborator

@MartinThoma
can you add a small description of the objective please (not familiar with PEP-544)

@MartinThoma
Member Author

@pubpub-zz Sure! I've edited the first comment :-)

@MartinThoma
Member Author

There is actually another solution: guarded imports / being really careful with the import order. But I haven't figured out how to do it without cyclic imports.
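
For reference, one common form of a guarded import uses `typing.TYPE_CHECKING`. This is a sketch under assumptions, not something tried in this PR: the module name `reader` and the class name `Reader` are hypothetical, and whether this pattern would work across PyPDF2's modules is not verified here.

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Hypothetical module and class. TYPE_CHECKING is False at runtime, so
    # this import is only seen by static type checkers and cannot take part
    # in a runtime import cycle.
    from reader import Reader


class Page:
    """Stand-in page class that refers to the reader only in annotations."""

    def __init__(self, pdf: Optional["Reader"] = None) -> None:
        # The annotation is a string, so the name is not resolved at runtime.
        self.pdf = pdf


page = Page()
```

The trade-off is that the name is only available for annotations; any runtime use of the class, such as an `isinstance` check, would still need a real import.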

@codecov

codecov bot commented Aug 29, 2022

Codecov Report

Merging #1303 (aab6886) into main (d9ba817) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #1303   +/-   ##
=======================================
  Coverage   95.02%   95.02%           
=======================================
  Files          30       30           
  Lines        4986     4986           
  Branches     1025     1025           
=======================================
  Hits         4738     4738           
  Misses        141      141           
  Partials      107      107           
Impacted Files       Coverage Δ
PyPDF2/_page.py      94.36% <100.00%> (ø)
PyPDF2/_reader.py    91.21% <100.00%> (ø)
PyPDF2/_writer.py    91.55% <100.00%> (ø)
PyPDF2/types.py      100.00% <100.00%> (ø)


@MartinThoma MartinThoma merged commit c696192 into main Aug 31, 2022
@MartinThoma MartinThoma deleted the pdfreader-protocol branch August 31, 2022 04:45
MartinThoma added a commit that referenced this pull request Sep 4, 2022
Version 2.10.5, 2022-09-04
--------------------------

New Features (ENH):
-  Process XRefStm (#1297)
-  Auto-detect RTL for text extraction (#1309)

Bug Fixes (BUG):
-  Avoid scaling cropbox twice (#1314)

Robustness (ROB):
-  Fix offset correction in revised PDF (#1318)
-  Crop data of /U and /O in encryption dictionary to 48 bytes (#1317)
-  MultiLine bfrange in cmap (#1299)
-  Cope with 2 digit codes in bfchar (#1310)
-  Accept '/annn' charset as ASCII code (#1316)
-  Log errors during Float / NumberObject initialization (#1315)
-  Cope with corrupted entries in xref table (#1300)

Documentation (DOC):
-  Migration guide (PyPDF2 1.x ➔ 2.x) (#1324)
-  Creating a coverage report (#1319)
-  Fix AnnotationBuilder.free_text example (#1311)
-  Fix usage of page.scale by replacing it with page.scale_by (#1313)

Developer Experience (DEV):
-  Only run coverage for PyPDF2

Maintenance (MAINT):
-  PdfReaderProtocol (#1303)
-  Throw PdfReadError if Trailer can't be read (#1298)
-  Remove catching OverflowException (#1302)

Full Changelog: 2.10.4...2.10.5
@MasterOdin MasterOdin mentioned this pull request Nov 10, 2022
pubpub-zz added a commit to pubpub-zz/pypdf that referenced this pull request Nov 12, 2022
includes also reintroduction of py-pdf#1303 wrongly cancelled in py-pdf#1309