Fix IndexError for getPage() of decrypted file #359

denis-osipov · 2017-07-12T17:05:29Z

Issue #327
When a page is requested, the _flatten() method sets flattenedPages to an empty list first, and only then is the file checked for encryption and a decryption attempted. If that fails, PdfReadError("file has not been decrypted") is raised and flattenedPages is not reverted to None.
This fix allows getting a page from a PdfFileReader object after decryption.
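
The failure mode described above can be reproduced without PyPDF2 at all. What follows is only a minimal sketch: `ToyReader`, `_load_pages`, and `LockedError` are invented names for illustration, not the library's code. The point is that `_flatten()` assigns the cache before the step that can fail:

```python
class LockedError(Exception):
    """Stand-in for PdfReadError("file has not been decrypted")."""


class ToyReader:
    """Hypothetical model of the caching pattern; not PyPDF2's implementation."""

    def __init__(self, pages, password):
        self._pages = list(pages)
        self._password = password
        self._unlocked = False
        self.flattened_pages = None  # None means "not flattened yet"

    def decrypt(self, password):
        self._unlocked = password == self._password
        return self._unlocked

    def _load_pages(self):
        # Fails while the document is still locked.
        if not self._unlocked:
            raise LockedError("file has not been decrypted")
        return self._pages

    def _flatten(self):
        # Buggy ordering: the cache is marked as built *before* the step
        # that can raise, so a failure leaves an empty list behind.
        self.flattened_pages = []
        self.flattened_pages.extend(self._load_pages())

    def get_page(self, number):
        if self.flattened_pages is None:
            self._flatten()
        return self.flattened_pages[number]


reader = ToyReader(pages=["page 0"], password="test")
try:
    reader.get_page(0)
except LockedError as error:
    print("Expected error:", error)

reader.decrypt("test")
try:
    reader.get_page(0)
except IndexError as error:
    # flattened_pages is [] rather than None, so _flatten() is never retried.
    print("Unexpected error:", error)
```

After the first failed call the cache is an empty list instead of `None`, so every later `get_page()` skips `_flatten()` and indexes into that empty list.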

ameybh · 2019-07-03T17:48:47Z

This works for me. I hope they merge it soon.

MartinThoma · 2022-04-06T06:05:20Z

I'm sorry that it took so long. I just became a maintainer of this project.

Do you @ameybhavsar24 / @denis-osipov have an example file / code snippet that shows the issue?

MartinThoma · 2022-04-07T14:51:09Z

I don't understand why this changes anything

denis-osipov · 2022-04-07T17:44:08Z

Hi, @MartinThoma

Here is an encrypted PDF file to use in the example:

import PyPDF2

pdfFileObj = open('example.pdf', 'rb')
pdfReader = PyPDF2.PdfFileReader(pdfFileObj)

try:
    # Can't get access to content of encrypted file. It's OK.
    print(pdfReader.getPage(0))
except PyPDF2.utils.PdfReadError as error:
    print('Expected error:', error)

# Password is correct:)
pdfReader.decrypt('test')

try:
    # Current behaviour is unexpected (for me): we decrypted file and now should
    # have access to its content. But we'll get an error here.
    print(pdfReader.getPage(0))
except IndexError as error:
    print('Unexpected error (file is decrypted now):', error)

The problem appears because the _flatten() method sets self.flattenedPages to an empty list before it tries to get the pages, and doesn't set it back to None if that fails. This PR just makes _flatten() set self.flattenedPages to an empty list only after it has successfully got the pages.
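
A rough sketch of that reordering, using a made-up `FixedToyReader` class rather than the real `PyPDF2/pdf.py` code (all names here, and the use of `PermissionError`, are assumptions for illustration). The cache attribute is assigned only once the step that can raise has succeeded, so a failed attempt leaves it as `None` and the next call retries:

```python
class FixedToyReader:
    """Hypothetical stand-in for a reader with a lazily built page cache."""

    def __init__(self, pages, password):
        self._pages = list(pages)
        self._password = password
        self._unlocked = False
        self.flattened_pages = None  # None means "not flattened yet"

    def decrypt(self, password):
        self._unlocked = password == self._password
        return self._unlocked

    def _load_pages(self):
        # Stand-in for reading the page tree, which fails while locked.
        if not self._unlocked:
            raise PermissionError("file has not been decrypted")
        return self._pages

    def _flatten(self):
        # Do the step that can raise first; assign the cache only on success.
        pages = self._load_pages()
        self.flattened_pages = list(pages)

    def get_page(self, number):
        if self.flattened_pages is None:
            self._flatten()
        return self.flattened_pages[number]


reader = FixedToyReader(pages=["page 0"], password="test")
try:
    reader.get_page(0)
except PermissionError as error:
    print("Expected error:", error)

reader.decrypt("test")
print(reader.get_page(0))
```

An alternative would be to wrap the loading step in try/except and reset the attribute to `None` on failure, but moving the assignment is the smaller change.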

If there is a better solution or current behaviour is correct, just close the PR, please.

codecov-commenter · 2022-04-16T05:39:24Z

Codecov Report

Merging #359 (0417c1c) into main (733989a) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main     #359   +/-   ##
=======================================
  Coverage   69.57%   69.57%           
=======================================
  Files           9        9           
  Lines        3316     3316           
  Branches      783      783           
=======================================
  Hits         2307     2307           
  Misses        766      766           
  Partials      243      243

Impacted Files    Coverage Δ
PyPDF2/pdf.py     72.36% <100.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Credits to Denis Osipov: #359 (comment)
Co-authored-by: Denis Osipov <osipov_d@list.ru>

MartinThoma · 2022-04-16T06:08:54Z

@denis-osipov Thank you for all the time and patience 🙏

This PR was just merged and will go into the next release (some time this month)

@ameybhavsar24 It's merged :-) It was far from soon, but as the new maintainer I hope such things get resolved more quickly in the future.

Deprecations (DEP):
- Remove support for Python 2.6 and older (#776)

New Features (ENH):
- Extract document permissions (#320)

Bug Fixes (BUG):
- Clip by trimBox when merging pages, which would otherwise be ignored (#240)
- Add overwriteWarnings parameter PdfFileMerger (#243)
- IndexError for getPage() of decrypted file (#359)
- Handle cases where decodeParms is an ArrayObject (#405)
- Updated PDF fields don't show up when page is written (#412)
- Set Linked Form Value (#414)
- Fix zlib -5 error for corrupt files (#603)
- Fix reading more than last1K for EOF (#642)
- Accidental import

Robustness (ROB):
- Allow extra whitespace before "obj" in readObjectHeader (#567)

Documentation (DOC):
- Link to pdftoc in Sample_Code (#628)
- Working with annotations (#764)
- Structure history

Developer Experience (DEV):
- Add issue templates (#765)
- Add tool to generate changelog

Maintenance (MAINT):
- Use grouped constants instead of string literals (#745)
- Add error module (#768)
- Use decorators for @staticmethod (#775)
- Split long functions (#777)

Testing (TST):
- Run tests in CI once with -OO Flags (#770)
- Filling out forms (#771)
- Add tests for Writer (#772)
- Error cases (#773)
- Check Error messages (#769)
- Regression test for issue #88
- Regression test for issue #327

Code Style (STY):
- Make variable naming more consistent in tests

All changes: 1.27.5...1.27.6


Fix IndexError for getPage() of decryped file

f80eb5f

denis-osipov changed the title ~~Fix IndexError for getPage() of decryped file~~ Fix IndexError for getPage() of decryped file (Issue #327) Jul 12, 2017

denis-osipov mentioned this pull request Jul 19, 2017

IndexError: list index out of range after decrypting encrypted file #327

Closed

Add comment about issue

2813fcd

MartinThoma added the is-bug (From a user's perspective, this is a bug: a violation of the expected behavior with a compliant PDF) and Tiny (Pull requests that make a tiny change and thus should be easy to merge) labels Apr 6, 2022

MartinThoma added 2 commits April 9, 2022 21:51

Merge branch 'master' into fix-issue-327

c003bcb

Merge branch 'main' into fix-issue-327

0417c1c

MartinThoma added the soon (PRs that are almost ready to be merged, issues that get solved pretty soon) label Apr 16, 2022

MartinThoma merged commit bd7500d into py-pdf:main Apr 16, 2022

MartinThoma added a commit that referenced this pull request Apr 16, 2022

TST: Regression test for #327

d58a849


MartinThoma changed the title ~~Fix IndexError for getPage() of decryped file (Issue #327)~~ Fix IndexError for getPage() of decryped file Jun 8, 2022
