MAINT: PdfReaderProtocol #1303

MartinThoma · 2022-08-29T19:22:28Z

PyPDF2 has some circular dependencies between classes that make proper typing hard:

- PdfReader has the pages property, which returns a List[PageObject]
- PageObject has the pdf property, which returns the PdfReader it belongs to

The simplest solution would be to put both classes in the same file, but that makes PRs hard to read. Additionally, bigger files mean merge conflicts happen more often.

Another solution is to not use type annotations for one of the objects (or to use Any as the type).

The solution implemented in this PR is to define a Protocol (PEP 544): A protocol just states which methods a class is expected to have (with their function signature). It's duck typing: If it walks like a duck and it quacks like a duck, then it must be a duck.

So we define the expected behavior instead of referring to a specific class.

typing.Iterable is an example of a Protocol. In the Java world, one would call this an interface.
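To make the idea concrete, here is a minimal sketch of the pattern. It is not the code from this PR: the classes are heavily simplified stand-ins, `add_blank_page` is a hypothetical helper, and it assumes Python 3.8+ where `Protocol` lives in `typing` (older versions would need `typing_extensions`).

```python
from typing import List, Protocol


class PdfReaderProtocol(Protocol):
    """Structural type: any object with a matching `pages` property fits."""

    @property
    def pages(self) -> List["PageObject"]:
        ...


class PageObject:
    # The page only depends on the protocol, not on the concrete PdfReader,
    # so the module defining PageObject never has to import the reader.
    def __init__(self, pdf: PdfReaderProtocol) -> None:
        self.pdf = pdf


class PdfReader:
    # Note: PdfReader does not inherit from PdfReaderProtocol.
    # It satisfies the protocol simply by having a `pages` property.
    def __init__(self) -> None:
        self._pages: List[PageObject] = []

    @property
    def pages(self) -> List[PageObject]:
        return self._pages

    def add_blank_page(self) -> PageObject:  # hypothetical helper for the demo
        page = PageObject(pdf=self)
        self._pages.append(page)
        return page


reader = PdfReader()
page = reader.add_blank_page()
```

In a real layout the protocol would live in its own module (the coverage report lists PyPDF2/types.py), and both the reader and the page modules would import from it rather than from each other.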

pubpub-zz · 2022-08-29T19:45:19Z

@MartinThoma
Can you please add a short description of the objective? (I'm not familiar with PEP 544.)

MartinThoma · 2022-08-29T19:58:33Z

@pubpub-zz Sure! I've edited the first comment :-)

MartinThoma · 2022-08-29T20:02:43Z

There is actually another solution: guarded imports / being really careful with the import order. But I haven't figured out how to do it without cyclic imports.
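For reference, the general shape of a guarded import is sketched below. This is the common `typing.TYPE_CHECKING` idiom, not something taken from this PR, and `reader_module` is a hypothetical module name. Whether it can be applied cleanly to PyPDF2's module layout is exactly the open question above.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by static type checkers. At runtime TYPE_CHECKING is False,
    # so this import never executes and cannot create an import cycle.
    # `reader_module` is a hypothetical module name.
    from reader_module import PdfReader


class PageObject:
    def __init__(self, pdf: "PdfReader") -> None:
        # The annotation is a string, so it is not evaluated
        # when the function is defined.
        self.pdf = pdf


# Any object works at runtime; only the type checker enforces the annotation.
page = PageObject(pdf=object())
```

The trade-off is that the name `PdfReader` does not exist at runtime in this module, so it can only be used in annotations, not in `isinstance` checks or as a base class.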

codecov · 2022-08-29T20:08:33Z

Codecov Report

Merging #1303 (aab6886) into main (d9ba817) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #1303   +/-   ##
=======================================
  Coverage   95.02%   95.02%           
=======================================
  Files          30       30           
  Lines        4986     4986           
  Branches     1025     1025           
=======================================
  Hits         4738     4738           
  Misses        141      141           
  Partials      107      107

Impacted Files       Coverage Δ
PyPDF2/_page.py      94.36% <100.00%> (ø)
PyPDF2/_reader.py    91.21% <100.00%> (ø)
PyPDF2/_writer.py    91.55% <100.00%> (ø)
PyPDF2/types.py      100.00% <100.00%> (ø)


Version 2.10.5, 2022-09-04
--------------------------

New Features (ENH):
- Process XRefStm (#1297)
- Auto-detect RTL for text extraction (#1309)

Bug Fixes (BUG):
- Avoid scaling cropbox twice (#1314)

Robustness (ROB):
- Fix offset correction in revised PDF (#1318)
- Crop data of /U and /O in encryption dictionary to 48 bytes (#1317)
- MultiLine bfrange in cmap (#1299)
- Cope with 2 digit codes in bfchar (#1310)
- Accept '/annn' charset as ASCII code (#1316)
- Log errors during Float / NumberObject initialization (#1315)
- Cope with corrupted entries in xref table (#1300)

Documentation (DOC):
- Migration guide (PyPDF2 1.x ➔ 2.x) (#1324)
- Creating a coverage report (#1319)
- Fix AnnotationBuilder.free_text example (#1311)
- Fix usage of page.scale by replacing it with page.scale_by (#1313)

Developer Experience (DEV):
- Only run coverage for PyPDF2

Maintenance (MAINT):
- PdfReaderProtocol (#1303)
- Throw PdfReadError if Trailer can't be read (#1298)
- Remove catching OverflowException (#1302)

Full Changelog: 2.10.4...2.10.5

MartinThoma added 2 commits August 29, 2022 21:22

MAINT: PdfReaderProtocol

3684d42

Adjust type

9edadf7

Add pages

4d88e66

Remove unecessary import

81551eb

no cover

aab6886

MartinThoma merged commit c696192 into main Aug 31, 2022

MartinThoma deleted the pdfreader-protocol branch August 31, 2022 04:45

MasterOdin mentioned this pull request Nov 10, 2022

ENH: Add Cloning #1371

Merged

pubpub-zz added a commit to pubpub-zz/pypdf that referenced this pull request Nov 12, 2022

Rewriting using Protocols

e1c3ed3

includes also reintroduction of py-pdf#1303 wrongly cancelled in py-pdf#1309
