Fix reading more than last1K for EOF #642

Merged
merged 5 commits into from Apr 16, 2022

rltpoa
Contributor

@rltpoa rltpoa commented Oct 9, 2021

Added an optional parameter to readNextEndLine() that limits how far back in the stream it may read while looking for a line. read() now passes this parameter so that the search for "%%EOF" is confined to the last 1024 bytes (the last1K variable).

Resolves Issue #639

This is a better solution than PR #439: the optional parameter takes effect only when readNextEndLine() is used to search for the EOF marker, so it does not interfere with other calls.
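
For readers who want to see the idea in isolation, here is a minimal sketch of a bounded backward search for the EOF marker. It is not the code from this PR: `read_line_backwards`, `find_eof_marker`, `limit_offset`, `tail_size` and `EOFMarkerNotFound` are hypothetical names chosen for illustration, and the sketch assumes a seekable binary stream.

```python
import io
import os


class EOFMarkerNotFound(Exception):
    """Hypothetical error: "%%EOF" was not found in the searched tail."""


def read_line_backwards(stream, limit_offset=0):
    """Return the line that ends at the current position, reading backwards.

    Never reads a byte located before limit_offset. Leaves the stream
    positioned at the first byte of the returned line.
    """
    pos = stream.tell()
    line = bytearray()
    while pos > limit_offset:
        stream.seek(pos - 1)
        byte = stream.read(1)
        if byte in (b"\r", b"\n"):
            if line:
                # This end-of-line belongs to the previous line: stop here.
                break
            # Otherwise it is a trailing end-of-line of this line: skip it.
        else:
            line.append(byte[0])
        pos -= 1
    stream.seek(pos)
    line.reverse()  # bytes were collected from last to first
    return bytes(line)


def find_eof_marker(stream, tail_size=1024):
    """Return the offset of "%%EOF", searching only the last tail_size bytes."""
    stream.seek(0, os.SEEK_END)
    limit = max(0, stream.tell() - tail_size)
    while stream.tell() > limit:
        line = read_line_backwards(stream, limit_offset=limit)
        if line.startswith(b"%%EOF"):
            return stream.tell()
    raise EOFMarkerNotFound("no %%EOF in the last %d bytes" % tail_size)


if __name__ == "__main__":
    good = io.BytesIO(b"%PDF-1.4\n%%EOF\n")
    print(find_eof_marker(good))  # prints 9
```

Because the limit defaults to 0, a caller that does not pass it still reads back to the start of the stream, which mirrors the point made above about not affecting other uses of the line reader.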

added optional parameter in readNextEndLine() to limit the offset

then read() uses this parameter to limit the reading to last1K
@MartinThoma MartinThoma added the is-bug (From a user's perspective, this is a bug - a violation of the expected behavior with a compliant PDF) and PdfReader (The PdfReader component is affected) labels Apr 6, 2022
@codecov-commenter

codecov-commenter commented Apr 16, 2022

Codecov Report

Merging #642 (73b2bba) into main (d5a5eea) will not change coverage.
The diff coverage is 66.66%.

@@           Coverage Diff           @@
##             main     #642   +/-   ##
=======================================
  Coverage   70.59%   70.59%           
=======================================
  Files          10       10           
  Lines        3425     3425           
  Branches      798      798           
=======================================
  Hits         2418     2418           
  Misses        763      763           
  Partials      244      244           
Impacted Files    Coverage Δ
PyPDF2/pdf.py     72.42% <66.66%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d5a5eea...73b2bba.

@MartinThoma MartinThoma merged commit 03ea3ec into py-pdf:main Apr 16, 2022
@MartinThoma
Member

Thank you for your contribution! I've just merged the PR to main and will release a new version of PyPDF2 on PyPI within this month.

MartinThoma added a commit that referenced this pull request Apr 18, 2022
Deprecations (DEP):
-  Remove support for Python 2.6 and older (#776)

New Features (ENH):
-  Extract document permissions (#320)

Bug Fixes (BUG):
-  Clip by trimBox when merging pages, which would otherwise be ignored (#240)
-  Add overwriteWarnings parameter PdfFileMerger (#243)
-  IndexError for getPage() of decrypted file (#359)
-  Handle cases where decodeParms is an ArrayObject (#405)
-  Updated PDF fields don't show up when page is written (#412)
-  Set Linked Form Value (#414)
-  Fix zlib -5 error for corrupt files (#603)
-  Fix reading more than last1K for EOF (#642)
-  Accidental import

Robustness (ROB):
-  Allow extra whitespace before "obj" in readObjectHeader (#567)

Documentation (DOC):
-  Link to pdftoc in Sample_Code (#628)
-  Working with annotations (#764)
-  Structure history

Developer Experience (DEV):
-  Add issue templates (#765)
-  Add tool to generate changelog

Maintenance (MAINT):
-  Use grouped constants instead of string literals (#745)
-  Add error module (#768)
-  Use decorators for @staticmethod (#775)
-  Split long functions (#777)

Testing (TST):
-  Run tests in CI once with -OO Flags (#770)
-  Filling out forms (#771)
-  Add tests for Writer (#772)
-  Error cases (#773)
-  Check Error messages (#769)
-  Regression test for issue #88
-  Regression test for issue #327

Code Style (STY):
-  Make variable naming more consistent in tests

All changes: 1.27.5...1.27.6
@rltpoa rltpoa deleted the bugs branch April 19, 2022 00:45
VictorCarlquist pushed a commit to VictorCarlquist/PyPDF2 that referenced this pull request Apr 29, 2022
Successfully merging this pull request may close these issues.

PdfFileReader keep looking for "%%EOF" on more than the last 1024 bytes of stream in malformed PDF files