NotImplementedError: only algorithm code 1 and 2 are supported #325

Closed
RealDataLLC opened this issue May 12, 2019 · 12 comments


RealDataLLC commented May 12, 2019

I'm having trouble running this code on my Mac. I'm using a Conda virtual env and installed Camelot with conda. The PDF is not password protected.

import camelot
import pandas as pd
import re
import numpy as np
table1 = camelot.read_pdf('IEEJ - 2019 - Outlook.pdf')


NotImplementedError Traceback (most recent call last)
in
----> 1 table1 = camelot.read_pdf('IEEJ - 2019 - Outlook.pdf')#, pages = ex_page, password = None)#, area = (left, 112, right,112+ 90))
2 table1

/anaconda3/envs/tensorflow/lib/python3.6/site-packages/camelot/io.py in read_pdf(filepath, pages, password, flavor, suppress_stdout, layout_kwargs, **kwargs)
104 kwargs = remove_extra(kwargs, flavor=flavor)
105 tables = p.parse(flavor=flavor, suppress_stdout=suppress_stdout,
--> 106 layout_kwargs=layout_kwargs, **kwargs)
107 return tables

/anaconda3/envs/tensorflow/lib/python3.6/site-packages/camelot/handlers.py in parse(self, flavor, suppress_stdout, layout_kwargs, **kwargs)
153 with TemporaryDirectory() as tempdir:
154 for p in self.pages:
--> 155 self._save_page(self.filepath, p, tempdir)
156 pages = [os.path.join(tempdir, 'page-{0}.pdf'.format(p))
157 for p in self.pages]

/anaconda3/envs/tensorflow/lib/python3.6/site-packages/camelot/handlers.py in _save_page(self, filepath, page, temp)
98 infile = PdfFileReader(fileobj, strict=False)
99 if infile.isEncrypted:
--> 100 infile.decrypt(self.password)
101 fpath = os.path.join(temp, 'page-{0}.pdf'.format(page))
102 froot, fext = os.path.splitext(fpath)

/anaconda3/envs/tensorflow/lib/python3.6/site-packages/PyPDF2/pdf.py in decrypt(self, password)
1985 self._override_encryption = True
1986 try:
-> 1987 return self._decrypt(password)
1988 finally:
1989 self._override_encryption = False

/anaconda3/envs/tensorflow/lib/python3.6/site-packages/PyPDF2/pdf.py in _decrypt(self, password)
1994 raise NotImplementedError("only Standard PDF encryption handler is available")
1995 if not (encrypt['/V'] in (1, 2)):
-> 1996 raise NotImplementedError("only algorithm code 1 and 2 are supported")
1997 user_password, key = self._authenticateUserPassword(password)
1998 if user_password:

NotImplementedError: only algorithm code 1 and 2 are supported
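The last frame of the traceback shows PyPDF2 rejecting any file whose encryption dictionary has a `/V` value other than 1 or 2, before a password is even tried. A minimal sketch for inspecting that value follows. It assumes pikepdf is installed (it is not part of the original report) and that the file opens with an empty user password. `pypdf2_can_decrypt` and `describe_encryption` are hypothetical helper names.

```python
import sys


def pypdf2_can_decrypt(v):
    """Mirror the check in the traceback: only /V values 1 and 2 pass."""
    return v in (1, 2)


def describe_encryption(path):
    """Return (is_encrypted, V) for a PDF; V is None when not encrypted."""
    import pikepdf  # assumption: third-party package, imported lazily

    # pikepdf.open() tries an empty password by default.
    with pikepdf.open(path) as pdf:
        if not pdf.is_encrypted:
            return False, None
        return True, pdf.encryption.V


if __name__ == "__main__" and len(sys.argv) > 1:
    encrypted, v = describe_encryption(sys.argv[1])
    if not encrypted:
        print("not encrypted")
    else:
        print("encrypted, /V =", v, "| PyPDF2 check passes:", pypdf2_can_decrypt(v))
```

If the file reports a `/V` outside 1 and 2, the error above is expected even when no password prompt ever appears in a viewer.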


pachacamac commented Jul 30, 2019

Hate to be that guy but any update on this? Totally at a loss here. If it helps I'm running on Linux.

> pdftk "that.pdf" dump_data
WARNING: The creator of the input PDF:
   that.pdf
   has set an owner password (which is not required to handle this PDF).
   You did not supply this password. Please respect any copyright.
InfoBegin
InfoKey: Creator
InfoValue: IDM
InfoBegin
InfoKey: CreationDate
InfoValue: D:20180607145130+02'00'
InfoBegin
InfoKey: Producer
InfoValue: PDFlib+PDI 7.0.2 (COM/Win32)
InfoBegin
InfoKey: Author
InfoValue: IntegraDM
PdfID0: 939f2420294646f31f041d74020f2c30
PdfID1: 939f2420294646f31f041d74020f2c30
NumberOfPages: 10
PageMediaBegin
PageMediaNumber: 1
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 2
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 3
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 4
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 5
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 6
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 7
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 8
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 9
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92
PageMediaBegin
PageMediaNumber: 10
PageMediaRotation: 0
PageMediaRect: 0 0 595.2 841.92
PageMediaDimensions: 595.2 841.92

and

> file that.pdf
that.pdf: PDF document, version 1.7

Unfortunately I cannot share the original PDF because it contains sensitive data, but it opens and reads fine, and https://github.com/jcushman/pdfquery handles it without problems.


myleshk commented Aug 6, 2019

Same here. Please fix.


boranaf commented Aug 10, 2019

Hi, here is a file that gives the same error
"MGROS-2017Y.pdf only algorithm code 1 and 2 are supported"

MGROS-2017Y.pdf


myleshk commented Aug 15, 2019

I realized that this is an issue in the dependency PyPDF2, dating from 2015.


boranaf commented Aug 15, 2019

Thanks for your feedback, which prompted me to retry.
You are right, @myleshk.

@vinayak-mehta
Contributor

Is the PDF encrypted? Can you try decrypting it using qpdf and then try again?
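A PDF can carry an owner password and still open without a prompt, which is what the pdftk warning earlier in the thread reports. A minimal sketch of the qpdf suggestion follows, wrapped in Python so it can run unattended. It assumes the `qpdf` binary is on `PATH` and that the user password is empty. `qpdf_decrypt_command` and `decrypt_with_qpdf` are hypothetical helper names.

```python
import subprocess


def qpdf_decrypt_command(src, dst, password=""):
    """Build the argv for `qpdf --decrypt`; the password is optional."""
    cmd = ["qpdf", "--decrypt"]
    if password:
        cmd.append("--password=" + password)
    cmd += [src, dst]
    return cmd


def decrypt_with_qpdf(src, dst, password=""):
    """Write a decrypted copy of src to dst and return dst."""
    result = subprocess.run(qpdf_decrypt_command(src, dst, password))
    # qpdf exits with 3 when it succeeded but printed warnings.
    if result.returncode not in (0, 3):
        raise RuntimeError("qpdf failed with exit code %d" % result.returncode)
    return dst


# Usage (file names are placeholders):
#   import camelot
#   tables = camelot.read_pdf(decrypt_with_qpdf("that.pdf", "that-decrypted.pdf"))
```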

@pachacamac

@vinayak-mehta mine is not encrypted. And as I said, pdfquery, another Python library, can read it just fine.

@vinayak-mehta
Contributor

vinayak-mehta commented Aug 27, 2019

I understand. Looked at pdfquery, it looks nice! Interestingly, it also uses pdfminer under the hood. I'll look into this over the weekend.

@vinayak-mehta
Contributor

Sorry for the late responses to issues.

@alexxxkorolev

Camelot does not support Acrobat files version 6 or higher. Convert your PDF file to a lower version (I used Acrobat 4.0, PDF 1.3) with any online converter. That should solve the problem!

@pachacamac

@alexxxkorolev thanks for the tip! Any suggestion for a command line tool, preferably on Linux, that can downgrade PDFs? The problem is that I use camelot in an automated pipeline and cannot convert PDFs manually.


manohar9600 commented Sep 10, 2020

Using pikepdf solved it for me; see py-pdf/pypdf#378 (comment).
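For an automated pipeline, the pikepdf approach might look like the sketch below. It retries through an unencrypted copy only when Camelot raises the `NotImplementedError` from this issue. It assumes pikepdf and camelot are installed and that the file opens with an empty user password. `decrypted_path` and `read_pdf_with_fallback` are hypothetical helper names, not part of either library.

```python
from pathlib import Path


def decrypted_path(src):
    """Return a sibling path such as 'that-decrypted.pdf' for 'that.pdf'."""
    src = Path(src)
    return src.with_name(src.stem + "-decrypted" + src.suffix)


def read_pdf_with_fallback(src, **kwargs):
    """Try camelot directly; on the PyPDF2 error, re-save and retry."""
    import camelot  # assumption: both packages are installed
    import pikepdf

    try:
        return camelot.read_pdf(str(src), **kwargs)
    except NotImplementedError:
        dst = decrypted_path(src)
        # Saving without an `encryption` argument drops existing encryption.
        with pikepdf.open(src) as pdf:
            pdf.save(dst)
        return camelot.read_pdf(str(dst), **kwargs)


# Usage (file name is a placeholder):
#   tables = read_pdf_with_fallback("that.pdf", pages="1")
```

Catching only `NotImplementedError` keeps other failures, such as a wrong password or a corrupt file, visible instead of masking them behind the retry.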
