PdfFileReader is deprecated and was removed in PyPDF2 3.0.0 #339

Closed
szeswee opened this issue Dec 23, 2022 · 25 comments · Fixed by #307
Labels
bug Something isn't working

Comments

@szeswee

szeswee commented Dec 23, 2022

Describe the bug

Version 3.0.0 of PyPDF2 was released today (23 Dec 2022) and includes a breaking change: PdfFileReader has been removed (see changelog). As a result, all new installs of camelot-py raise the following exception when used:

Traceback (most recent call last):
  File "test.py", line 9, in <module>
    camelot.read_pdf(PDF_FILE_PATH)
  File ".venv/py37/lib/python3.7/site-packages/camelot/io.py", line 117, in read_pdf
    **kwargs
  File ".venv/py37/lib/python3.7/site-packages/camelot/handlers.py", line 172, in parse
    self._save_page(self.filepath, p, tempdir)
  File ".venv/py37/lib/python3.7/site-packages/camelot/handlers.py", line 111, in _save_page
    infile = PdfFileReader(fileobj, strict=False)
  File ".venv/py37/lib/python3.7/site-packages/PyPDF2/_reader.py", line 1974, in __init__
    deprecation_with_replacement("PdfFileReader", "PdfReader", "3.0.0")
  File ".venv/py37/lib/python3.7/site-packages/PyPDF2/_utils.py", line 369, in deprecation_with_replacement
    deprecation(DEPR_MSG_HAPPENED.format(old_name, removed_in, new_name))
  File ".venv/py37/lib/python3.7/site-packages/PyPDF2/_utils.py", line 351, in deprecation
    raise DeprecationError(msg)
PyPDF2.errors.DeprecationError: PdfFileReader is deprecated and was removed in PyPDF2 3.0.0. Use PdfReader instead.

Steps to reproduce the bug

  1. Create a new virtualenv
  2. Install camelot-py:
    pip install camelot-py[base]
    
  3. Run the following code:
    import camelot
    
    # replace with a valid path on your local filesystem
    PDF_FILE_PATH = "/path/to/file.pdf"
    
    # raises an exception from PyPDF2
    camelot.read_pdf(PDF_FILE_PATH)

Expected behavior

The code above should execute without any exceptions.

Environment

  • OS: macOS 12.3.1
  • Python version: 3.7
  • Numpy version: 1.24.0
  • OpenCV version: 4.6.0.66
  • Ghostscript version: 0.7
  • Camelot version: 0.10.1
@szeswee szeswee added the bug Something isn't working label Dec 23, 2022
@szeswee
Author

szeswee commented Dec 23, 2022

As a workaround, I've added this line to my requirements.txt for the time being:

PyPDF2~=2.0
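
For reference, `PyPDF2~=2.0` is a compatible-release pin: it accepts any 2.x release and rejects 3.0.0 and later. Below is a minimal sketch of that rule using only the standard library. The helper names `parse_version` and `satisfies_pin` are hypothetical, and the parsing is deliberately simpler than what pip actually does.

```python
import re


def parse_version(text: str) -> tuple:
    """Turn a string such as '2.12.1' into (2, 12, 1).

    Simplified on purpose: parsing stops at the first piece that does
    not start with a digit, and short versions are padded to two parts.
    """
    numbers = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    while len(numbers) < 2:
        numbers.append(0)
    return tuple(numbers)


def satisfies_pin(version: str) -> bool:
    """True when `version` is inside PyPDF2~=2.0, i.e. >=2.0 and <3.0."""
    return (2, 0) <= parse_version(version) < (3, 0)


if __name__ == "__main__":
    for candidate in ("2.0.0", "2.12.1", "3.0.0"):
        print(candidate, satisfies_pin(candidate))
```

The same range can be written as `'PyPDF2<3.0'`, as suggested later in this thread, except that the `<3.0` form also allows 1.x releases.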

@KshitizPandya

KshitizPandya commented Dec 27, 2022

Hey @szeswee,

I followed the steps: I created a virtual environment from the terminal and ran the script both in the IDE and in Colab, and I get the same error in both.

@saidakyuz

Hey @szeswee, I also tried on both platforms (see screenshots), like @KshitizPandya, and I still have the issue. Does it work for you? Can you explain in more detail?
image
image

@MartinThoma
Contributor

As a side-note: PyPDF2 should now be considered deprecated. We will continue development at pypdf, see History of pypdf

@owen800q

We are hitting this issue now. Is there a workaround?

@anakin87
Contributor

@vinayak-mehta if you want, I'm available to submit a PR to fix this issue.

@alprnyldz

As a workaround, I've added this line to my requirements.txt for the time being:

PyPDF2~=2.0

Thank you for the workaround fix

@KshitizPandya

KshitizPandya commented Dec 29, 2022

Hey @saidakyuz, I found a workaround and Camelot is now working fine for me. Here are the steps, in case they help.

  1. Set up an Anaconda environment (preferably with Python 3.7).
  2. Install Camelot:
    pip install camelot-py[base]
  3. This pulls in PyPDF2 3.0.0 by default, so you need to explicitly downgrade it:
    pip install 'PyPDF2<3.0'
  4. I use PyCharm for my script, so I selected this environment in the settings, and then it worked fine.
    image

Environment packages:
camelot-py 0.10.1
ghostscript 0.7
pypdf2 2.0.0

Note: You may sometimes get an error saying that Ghostscript is not installed. You can install it explicitly from https://ghostscript.com/releases/gsdnld.html, then add its install location to your computer's PATH environment variable and restart.

Then the issue should be resolved.

OUTPUT:
image

Hope it helps.

@MartinThoma
Contributor

@KshitizPandya Please replace pip install pyPDF2 == 2.0 by pip install 'PyPDF2<3.0'

@KshitizPandya

Thanks for the correction @MartinThoma

@dmitry-ra

Another option is to downgrade the installed version of PyPDF2:

pip install --upgrade PyPDF2==2.12.1

@Rhackleford

@vinayak-mehta if you want, I'm available to submit a PR to fix this issue.

I'm so close to getting my program finished. When I run the code in PyCharm it runs fine, but when I run the exe I get this error:
"PyPDF2.errors.DeprecationError: PdfFileReader is deprecated and was removed in PyPDF2 3.0.0. Use PdfReader instead.[17036] Failed to execute script 'Werks' due to unhandled exception!"

The suggested workarounds aren't working for me, probably because this is my first program.
I would LOVE a fix or a workaround that I can apply.
Please advise!

@Rhackleford

Another possible way is to downgrade installed version of PyPDF2:

pip install --upgrade PyPDF2==2.12.1

I tried this and also downgraded to PyPDF2 2.0, but I get the same result: the deprecation error. I'm sure I'm missing something dumb.
I am running the camelot-py[cv] version of Camelot; would that have anything to do with it?

@MartinThoma
Contributor

MartinThoma commented Dec 29, 2022

@RhacklefordGPT Most likely you downgraded it in the wrong environment, which means PyPDF2 is installed in two places. You need to make sure you downgrade it in the correct one.

For example, you might need to use pip3, or you might need to activate a virtual environment first.

To verify, you can add the following before you import camelot:

import PyPDF2

print("PyPDF2==" + PyPDF2.__version__)
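
A slightly fuller version of this check can also report which interpreter is running and whether it belongs to a virtual environment, which helps when a package was downgraded in one environment but the script runs in another. This is only a sketch: it assumes Python 3.8 or newer (for `importlib.metadata`), and the helper names `installed_version` and `describe_environment` are hypothetical.

```python
import sys
from importlib import metadata


def installed_version(package: str):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def describe_environment(package: str = "PyPDF2") -> dict:
    """Collect the facts needed to tell which environment is in use."""
    return {
        # Path of the Python executable that is running this script.
        "interpreter": sys.executable,
        # In a venv, sys.prefix points at the venv and differs from base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # Version visible to this interpreter, not to some other install.
        "version": installed_version(package),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running it from both the IDE and the plain terminal should show whether the two are using the same interpreter.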

@Rhackleford

Rhackleford commented Dec 29, 2022

Looks like that did the trick.
My project is in a venv, and I had been using the PyCharm terminal to pip install everything.
It looks like I had installed PyPDF2 directly from CMD before I started using the venv.
I'm still not clear on why PyInstaller, when run through PyCharm, picks up things that were installed from CMD. I would think those are separated somehow, but I only have a loose grasp of how this all works. Could you recommend some further reading so I can avoid this? Should I delete every installation of everything outside my venv?
PS: thank you very much, now I can bring a finished product into work and blow some minds with it!!! :)

@au3m

au3m commented Dec 30, 2022

import PyPDF2

print("PyPDF2==" + PyPDF2.__version__)

I have already done that, but now I get a different error:
OSError: Ghostscript is not installed. You can install it using the instructions here: https://camelot-py.readthedocs.io/en/master/user/install-deps.html
even though it is already installed.

@KshitizPandya

KshitizPandya commented Dec 31, 2022

Hey @au3m,
Installing Ghostscript alone is sometimes not enough. You may also need to add it to your computer's environment variables so that it can be found.
So try adding the Ghostscript install path to your PATH environment variable.

STEPS FOR REFERENCE:

  1. Copy the path where you installed Ghostscript.

  2. If you are using Windows, search for "Edit the system environment variables".
    image

  3. The dialog shown above should open. Click "Environment Variables".

  4. Under the "System variables" section, double-click "Path".
    image

  5. Click on an empty row and paste the copied Ghostscript path.
    image

  6. Click OK and, as a precaution, restart your device.

After this, your program should run without the Ghostscript-related error.
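
Before and after editing PATH, it can help to check what Python itself is able to find. The sketch below uses only the standard library. The helper name `locate_ghostscript` is hypothetical, the executable names are the usual ones for Ghostscript on Unix-like systems and Windows, and this is not necessarily the same lookup Camelot performs internally.

```python
import shutil
from ctypes.util import find_library

# Common Ghostscript executable names: `gs` on Linux/macOS,
# `gswin64c` / `gswin32c` for the Windows console builds.
GS_EXECUTABLES = ("gs", "gswin64c", "gswin32c")


def locate_ghostscript() -> dict:
    """Report where (if anywhere) Ghostscript is visible to this process."""
    executable = None
    for name in GS_EXECUTABLES:
        # shutil.which searches the directories listed in PATH.
        executable = shutil.which(name)
        if executable:
            break
    return {
        "executable": executable,
        # Looks for the shared library by its short name, or gives None.
        "library": find_library("gs"),
    }


if __name__ == "__main__":
    for key, value in locate_ghostscript().items():
        print(f"{key}: {value}")
```

If both values are `None`, the process running your script cannot see Ghostscript, even if an installer reported success.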

@au3m

au3m commented Dec 31, 2022

@KshitizPandya
Thanks bro it works now ☺️

@AChnki

AChnki commented Dec 31, 2022

I think the problem stems from a missed migration to the renamed classes in PyPDF2/pypdf; see the following doc: https://pypdf2.readthedocs.io/en/stable/user/migration-1-to-2.html

Following the deprecation process of PyPDF2/pypdf, the old names are no longer tolerated.

I replaced the handlers.py file with the one from the PR below, and the CLI is working again for me.
The PR from @MartinThoma can be found here: #307
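
To illustrate the kind of rename involved, here is a minimal sketch of a lookup that prefers the new class name and falls back to the old one only when the new one is missing. It is not the code from PR #307. The helper name `find_reader_class` is hypothetical, and the sketch tolerates neither library being installed.

```python
import importlib

# (module, class) pairs in order of preference: the pypdf successor
# first, then the new PyPDF2 name, then the legacy PyPDF2 name.
CANDIDATES = (
    ("pypdf", "PdfReader"),
    ("PyPDF2", "PdfReader"),
    ("PyPDF2", "PdfFileReader"),
)


def find_reader_class():
    """Return the first available PDF reader class, or None if none exists."""
    for module_name, class_name in CANDIDATES:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Library not installed in this environment; try the next one.
            continue
        reader = getattr(module, class_name, None)
        if reader is not None:
            return reader
    return None


if __name__ == "__main__":
    reader_class = find_reader_class()
    print("reader class:", reader_class)
```

With PyPDF2 3.0.0 installed, the legacy name is never reached because `PdfReader` is found first, so the `DeprecationError` from the traceback above is avoided.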

@kfunes3706

Hey, I am getting this error on an M2 MacBook Pro. Would anyone know how to fix it? I've verified gs is installed.
Screenshot 2023-01-07 at 11 22 12 PM

@Prathamesh-gunjal

image

Still getting this error, please help!!

@AayushSameerShah

AayushSameerShah commented May 25, 2023

If anyone is trying to do this on Colab, run the following steps:

!pip install ghostscript
!pip install camelot-py[cv]
!pip install excalibur-py
!apt install ghostscript python3-tk

And after that check if installed:

from ctypes.util import find_library

# It will display `libgs.so.9` if installed or will print `None` if not
print(find_library("gs")) 

If it still doesn't work:

!excalibur initdb

Source: here

@Gio-2020

import PyPDF2
print("PyPDF2==" + PyPDF2.__version__)

i have already do that but i get other error OSError: Ghostscript is not installed. You can install it using the instructions here: https://camelot-py.readthedocs.io/en/master/user/install-deps.html although it is installed already

Hey @au3m, Doing just the installation sometimes might not help. Sometimes you might need to set the things in the computer's environment variables to access it easily. So try setting "ghostscript" to your environment variables.

STEPS FOR REFERENCE:

  1. copy the path where you have installed ghostscript.
  2. If you are using windows - search for "Edit the system environment variable" .
    image
  3. above dialog should open. Click on the "environment variable" tab.
  4. Under "system variables" section double click "path".
    image
  5. Click on the open space and paste the copied path of the ghostscript.
    image
  6. Click OK and for precautions restart your device.

After this your program should run fine without giving the ghostscript related error.

Thanks! Helped me a lot.

@jmartle

jmartle commented May 29, 2023

pip install --upgrade PyPDF2==2.12.1

works

@juliatong

Hey @saidakyuz , I did a bit of work around and now my camelot is working fine. Am mentioning the steps below if you want to refer.

  1. Set anaconda env (preferably use python 3.7)
  2. Install camelot -
    pip install camelot-py[base]
  3. It will itself download the pyPDF2 version 3.0.0, so you need to extensively change the version -
    pip install 'PyPDF2<3.0'
  4. I used pyCharm to work with my script so I set the environment from settings and then it worked fine.
    image

Environment packages: camelot-py 0.10.1 ghostscript 0.7 pypdf2 2.0.0

Note: It might sometimes create error showing ghostscript is not installed. You can explicitly install it from: https://ghostscript.com/releases/gsdnld.html and then set it computer's environment variable to bypass any issues and restart.

Then the issue should be resolved.

OUTPUT: image

Hope it helps.

I set all the libs to the exact same versions as yours, yet the error remains...
Environment packages: camelot-py 0.10.1 ghostscript 0.7 pypdf2 2.0.0
