windows anaconda read_pdf #121

Closed
ghost opened this issue Oct 4, 2018 · 7 comments

@ghost

ghost commented Oct 4, 2018

Hi,
I am getting the following error when reading a PDF from a specific directory. I searched for this error in the forum but could not work out the cause. Please help, thank you.

My Environment
Windows 10
Anaconda
Python 3.6

Below is the session from the Windows command prompt:

(base) C:\work\python\pdf>python
Python 3.6.3 |Anaconda custom (64-bit)| (default, Oct 15 2017, 03:27:45) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.

>>> import camelot
>>> import os
>>> tables = camelot.read_pdf('c:\work\python\pdf\stt.pdf')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\ProgramData\Anaconda3\lib\site-packages\camelot\io.py", line 91, in read_pdf
    tables = p.parse(flavor=flavor, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\camelot\handlers.py", line 146, in parse
    t = parser.extract_tables(p)
  File "C:\ProgramData\Anaconda3\lib\site-packages\camelot\parsers\lattice.py", line 316, in extract_tables
    self._generate_image()
  File "C:\ProgramData\Anaconda3\lib\site-packages\camelot\parsers\lattice.py", line 180, in _generate_image
    if "ghostscript" in subprocess.check_output(["gs", "-version"]).decode('utf-8').lower():
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 403, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 997, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

@vinayak-mehta
Contributor

@raja-rahamath Thanks for reporting this with all the specifics! I was able to reproduce this bug on Windows. I have written a fix and will release it tonight (along with the conda package and updated installation instructions), then update this thread.

Meanwhile, you'll need to install ghostscript from https://www.ghostscript.com/download/gsdnld.html.

After installation, you'll need to reboot your system to make sure that the Ghostscript executable is in your PATH. Alternatively, if you don't want to reboot, you can manually add the path to the Ghostscript installation's bin folder to the Windows PATH variable, as shown here: https://java.com/en/download/help/path.xml
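
To confirm whether the PATH change took effect, you can ask Python which Ghostscript executables it can see. This is only a minimal sketch and not part of Camelot: the helper name find_ghostscript and the candidate list are assumptions for illustration. "gs" is the name the traceback above tries to run, while "gswin64c" and "gswin32c" are the console executables that Windows builds of Ghostscript install.

```python
import shutil

# Candidate executable names (an assumption for illustration):
# "gs" is what the traceback tries to run; "gswin64c" / "gswin32c"
# are the console executables shipped by Windows builds of Ghostscript.
CANDIDATES = ("gs", "gswin64c", "gswin32c")


def find_ghostscript(candidates=CANDIDATES):
    """Return {name: full_path} for every candidate found on PATH."""
    found = {}
    for name in candidates:
        path = shutil.which(name)  # None when the name is not on PATH
        if path is not None:
            found[name] = path
    return found


hits = find_ghostscript()
if hits:
    for name, path in hits.items():
        print(f"{name} -> {path}")
else:
    print("No Ghostscript executable found on PATH")
```

If a Windows name is listed but "gs" is not, the PATH entry itself is fine and the remaining problem is the executable name being looked up.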

@vinayak-mehta
Contributor

You can also check out this Stack Overflow answer for instructions on adding the Ghostscript executable to the Windows PATH variable.

@ghost
Author

ghost commented Oct 5, 2018

Hi, thanks for the reply. I tried the suggested solution: I installed Ghostscript, added it to the system environment variables, and rebooted, but I am still getting the same error. Kindly advise.

04/10/2018 08:22 AM 157,651 stt.pdf

           5 File(s)     60,068,493 bytes
           0 Dir(s)  27,476,795,392 bytes free

(base) C:\work\python\pdf>python
Python 3.6.3 |Anaconda custom (64-bit)| (default, Oct 15 2017, 03:27:45) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.

>>> import camelot
>>> tables = camelot.read_pdf('stt.pdf')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\ProgramData\Anaconda3\lib\site-packages\camelot\io.py", line 91, in read_pdf
    tables = p.parse(flavor=flavor, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\camelot\handlers.py", line 146, in parse
    t = parser.extract_tables(p)
  File "C:\ProgramData\Anaconda3\lib\site-packages\camelot\parsers\lattice.py", line 316, in extract_tables
    self._generate_image()
  File "C:\ProgramData\Anaconda3\lib\site-packages\camelot\parsers\lattice.py", line 180, in _generate_image
    if "ghostscript" in subprocess.check_output(["gs", "-version"]).decode('utf-8').lower():
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 403, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 997, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
>>> exit()

(base) C:\work\python\pdf>gswin64c
GPL Ghostscript 9.25 (2018-09-13)
Copyright (C) 2018 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
GS>
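
The session above shows that gswin64c starts from the prompt, while the traceback shows the library running ["gs", "-version"], which raises FileNotFoundError when no executable named "gs" is on PATH. The sketch below illustrates one way to probe several names and skip the missing ones. It is an assumption-laden illustration, not the fix that was released: the function name first_working_command, the probe order, and the 10-second timeout are all made up for this example.

```python
import subprocess


def first_working_command(commands):
    """Run each command in turn; return (command, output) for the first that succeeds.

    A missing executable raises FileNotFoundError (WinError 2 on Windows),
    the same error as in the traceback, so that name is simply skipped.
    """
    for cmd in commands:
        try:
            out = subprocess.check_output(
                cmd,
                stdin=subprocess.DEVNULL,   # never wait on an interactive prompt
                stderr=subprocess.STDOUT,
                timeout=10,                 # hypothetical safety limit
            )
        except FileNotFoundError:
            continue  # no executable with this name on PATH
        except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
            continue  # executable exists but failed or did not finish
        return cmd, out.decode("utf-8", errors="replace")
    return None


# Hypothetical probe order: the name from the traceback, then the Windows console names.
GS_COMMANDS = [[name, "-version"] for name in ("gs", "gswin64c", "gswin32c")]

result = first_working_command(GS_COMMANDS)
print(result[0] if result else "Ghostscript not found under any candidate name")
```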

@vinayak-mehta
Contributor

@raja-rahamath, I've added a fix for this in the master branch and will create a release today that you can install using pip.

However, if you want that fix ASAP (and help me test the fix!), you can install Camelot with the steps described here.

@vinayak-mehta
Contributor

@raja-rahamath You can now install 0.2.1 by running pip install camelot-py==0.2.1, which should fix this.
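
After upgrading, it can be worth checking which version the active environment actually picked up, since conda and pip installs can coexist. The sketch below is only an illustration: the helper names installed_version and version_tuple are hypothetical, it relies on importlib.metadata (standard library on Python 3.8 and newer, so not available on the Python 3.6 used earlier in this thread), and the version parsing only handles plain dotted numbers such as 0.2.1.

```python
from importlib import metadata  # standard library on Python 3.8+


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn a plain dotted version such as '0.2.1' into (0, 2, 1) for comparison.

    Only purely numeric components are supported (no 'rc1' or similar suffixes).
    """
    return tuple(int(part) for part in version.split("."))


found = installed_version("camelot-py")
if found is None:
    print("camelot-py is not installed in this environment")
elif version_tuple(found) >= (0, 2, 1):
    print(f"camelot-py {found} is 0.2.1 or newer")
else:
    print(f"camelot-py {found} predates 0.2.1")
```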

@ghost
Author

ghost commented Oct 6, 2018

Thank you very much for the support. It is working perfectly.

@arin1996

Hi, I installed Camelot using the conda install command. The package installs successfully, but when I try to import it, I get the following error:
ImportError                               Traceback (most recent call last)
in <module>
----> 1 import camelot

~\AppData\Local\Continuum\anaconda3\lib\site-packages\camelot\__init__.py in <module>
      6
      7 from .__version__ import __version__
----> 8 from .io import read_pdf
      9 from .plotting import PlotMethods
     10

~\AppData\Local\Continuum\anaconda3\lib\site-packages\camelot\io.py in <module>
      3 import warnings
      4
----> 5 from .handlers import PDFHandler
      6 from .utils import validate_input, remove_extra
      7

~\AppData\Local\Continuum\anaconda3\lib\site-packages\camelot\handlers.py in <module>
      7
      8 from .core import TableList
----> 9 from .parsers import Stream, Lattice
     10 from .utils import (TemporaryDirectory, get_page_layout, get_text_objects,
     11                     get_rotation, is_url, download_url)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\camelot\parsers\__init__.py in <module>
      2
      3 from .stream import Stream
----> 4 from .lattice import Lattice

~\AppData\Local\Continuum\anaconda3\lib\site-packages\camelot\parsers\lattice.py in <module>
     18                      merge_close_lines, get_table_index, compute_accuracy,
     19                      compute_whitespace)
---> 20 from ..image_processing import (adaptive_threshold, find_lines,
     21                                 find_contours, find_joints)
     22

~\AppData\Local\Continuum\anaconda3\lib\site-packages\camelot\image_processing.py in <module>
      3 from __future__ import division
      4
----> 5 import cv2
      6 import numpy as np
      7

ImportError: DLL load failed: The specified module could not be found.
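
The traceback stops at import cv2, so the failure comes from loading OpenCV rather than from Camelot's own code. One way to narrow this down is to import the modules named in the traceback one at a time, outside Camelot. The sketch below is only a diagnostic aid under that assumption; the helper name check_imports is hypothetical, and it reports the error without fixing it.

```python
import importlib


def check_imports(module_names):
    """Try importing each module; return {name: None on success, else the error text}."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as exc:  # also covers "DLL load failed" on Windows
            results[name] = str(exc)
    return results


# The two modules imported in image_processing.py, checked one at a time.
for name, error in check_imports(["numpy", "cv2"]).items():
    print(f"{name}: {'OK' if error is None else error}")
```

If cv2 fails here as well, the problem lies in the OpenCV installation in that environment rather than in Camelot.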
