Cant install on windows using pip #10

Closed
vnyk opened this issue Dec 30, 2017 · 12 comments

Comments

@vnyk

vnyk commented Dec 30, 2017

pip install pdftotext
Collecting pdftotext
Using cached pdftotext-2.0.1.tar.gz
Installing collected packages: pdftotext
Running setup.py install for pdftotext ... error
Complete output from command "c:\users\vinayak sharma\appdata\local\programs\python\python35\python.exe" -u -c "import setuptools, tokenize;__file__='C:\\Users\\Local\\Temp\\pip-build-6eh2vxu8\\pdftotext\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\VINAYA~1\AppData\Local\Temp\pip-kyy39x3a-record\install-record.txt --single-version-externally-managed --compile:
WARNING: pkg-config not found--guessing at poppler version.
If the build fails, install pkg-config and try again.
running install
running build
running build_ext
building 'pdftotext' extension
error: Unable to find vcvarsall.bat

----------------------------------------

Command ""c:\users\Local\\Temp\\pip-build-6eh2vxu8\\pdftotext\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\VINAYA~1\AppData\Local\Temp\pip-kyy39x3a-record\install-record.txt --single-version-externally-managed --compile" failed with error code 1 in C:\Users\VINAYA~1\AppData\Local\Temp\pip-build-6eh2vxu8\pdftotext\

@jalan
Owner

jalan commented Dec 31, 2017

Sorry, I don't use Windows, but I think you would need three things:

  • a compiler, like Visual Studio
  • Python native development tools (it looks like Visual Studio includes them now)
  • poppler development files

If you figure something out, please reply here, and I can add it to the README.
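As a rough first check of the prerequisites listed above, the sketch below looks for a compiler and pkg-config on PATH. The tool names in `CHECKS` and the helper `find_tools` are illustrative assumptions, not part of pdftotext. It does not verify that the poppler headers are installed, and on Windows `cl` is usually only on PATH inside a Visual Studio developer prompt.

```python
import shutil

# Assumed executable names; a hit on PATH only suggests the tool is installed.
CHECKS = {
    "C/C++ compiler": ["cl", "gcc", "g++", "clang"],
    "pkg-config": ["pkg-config"],
}


def find_tools(checks, which=shutil.which):
    """Return {prerequisite: path of the first candidate found, or None}."""
    found = {}
    for name, candidates in checks.items():
        # Try each candidate in order and keep the first one that resolves.
        found[name] = next((p for p in map(which, candidates) if p), None)
    return found


if __name__ == "__main__":
    for name, path in find_tools(CHECKS).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

A "not found" result for the compiler matches the "Unable to find vcvarsall.bat" failure reported at the top of this issue.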

@thunderkid

I'm also having problems installing on windows, but perhaps different ones. I've installed VS2017. Running pip install pdftotext gives this error:

pdftotext/pdftotext.cpp(4): fatal error C1083: Cannot open include file: 'poppler/cpp/poppler-document.h': No such file or directory

Googling around, it seems poppler is a PDF rendering library, so I tried to install it. Following its installation instructions, conda install -c conda-forge poppler gives an error saying the package is not available. And almost all of what I can find about it is 5-10 years old. I'm therefore assuming it's some old code automatically included in *nix and not really used much on Windows.

@kwkelly

kwkelly commented Jan 24, 2018

I too had trouble installing this on a restricted windows system. However, it might be possible to build an alternative to this package for windows based on xpdf. Poppler was originally based on xpdf (and maybe still is?). I've got a rudimentary xpdf-based pdftotext working using the subprocess module. The xpdf binaries do not need to be installed, so they can simply be downloaded and called on windows. Curiously, there are some small differences I'm seeing in the text extraction on some tables.
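The comment above does not include the code, so what follows is only a minimal sketch of what a subprocess-based wrapper could look like. It assumes a `pdftotext` executable from xpdf is on PATH or that its full path is passed as `exe`, and that this build accepts `-layout`, `-enc UTF-8`, and `-` (write to stdout), as the xpdf man page describes. The function names are hypothetical.

```python
import subprocess


def build_command(pdf_path, exe="pdftotext", layout=False):
    """Build the argument list for a pdftotext command-line call."""
    cmd = [exe]
    if layout:
        cmd.append("-layout")  # ask the tool to preserve the page layout
    # "-" as the output file is assumed to mean "write text to stdout".
    cmd += ["-enc", "UTF-8", str(pdf_path), "-"]
    return cmd


def extract_text(pdf_path, exe="pdftotext", layout=False):
    """Run the external tool and return the extracted text as a string."""
    result = subprocess.run(
        build_command(pdf_path, exe, layout),
        capture_output=True,  # requires Python 3.7+
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Passing the command as a list rather than one string keeps a path that contains spaces as a single argument.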

@manish59

manish59 commented Apr 6, 2018

@kwkelly can you let me know how you used pdftotext with subprocess? When I give the full path of the PDF, it says it can't find the path of the PDF.
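The cause of the "path not found" message is not stated in this thread. One way to narrow it down is to confirm that the path Python sees actually exists before handing it to the external program. The helper below is a hypothetical sketch for that check.

```python
from pathlib import Path


def resolve_pdf(path):
    """Return the absolute path as a string, or raise if the file is missing."""
    p = Path(path).expanduser().resolve()
    if not p.is_file():
        # Report the fully resolved path so typos and wrong working
        # directories are easy to spot.
        raise FileNotFoundError(f"No such file: {p}")
    return str(p)
```

If this check passes but the external tool still reports a missing file, the way the command line is assembled (for example, an unquoted path containing spaces) is a likely next thing to inspect.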

@mounikarudra

I am getting the same error. Was anyone able to resolve this issue or find a workaround for the text extraction?

@manish59

manish59 commented Jul 5, 2018 via email

@amansingh9097

Has anyone found a workaround for this? The issue still persists.

@SjRKV

SjRKV commented Jul 4, 2019

This is part of the error I am getting when running 'pip install pdftotext'. Does anybody have a solution for this?

ERROR: WARNING: pkg-config not found--guessing at poppler version.
If the build fails, install pkg-config and try again.
c:\python 3.7.1 (x86)\lib\distutils\dist.py:274: UserWarning: Unknown distribution option: 'long_description_content_type'
warnings.warn(msg)

running install
running build
running build_ext
building 'pdftotext' extension
creating build
creating build\temp.win32-3.7
creating build\temp.win32-3.7\Release

@vishwas097

@SjRKV did you find any solution for this error?

@ruian0

ruian0 commented Mar 30, 2021

In case you come here and did not check the readme, this can be fixed on ubuntu with
sudo apt install build-essential libpoppler-cpp-dev pkg-config python3-dev

@amansingh9097

> In case you come here and did not check the readme, this can be fixed on ubuntu with
> sudo apt install build-essential libpoppler-cpp-dev pkg-config python3-dev

In case you didn't read this Issue itself, this was specifically raised w.r.t. Windows!

@ruian0

ruian0 commented May 18, 2021

> > In case you come here and did not check the readme, this can be fixed on ubuntu with
> > sudo apt install build-essential libpoppler-cpp-dev pkg-config python3-dev
>
> In case you didn't read this Issue itself, this was specifically raised w.r.t. Windows!

In case my comment confused you for whatever reason, it was for Ubuntu users when they search the error since without running the command one would see the same error in Ubuntu.
