Does not have support for windows? #2

ghost · 2021-11-30T00:41:24Z

Hi, first of all the library is really good.

I tried to run this library on Windows 10 and it doesn't work. I believe I did everything right: I installed Tesseract and ran the following code:

from multilingual_pdf2text.pdf2text import PDF2Text
from multilingual_pdf2text.models.document_model.document import Document
import logging

from utils import write_txt

logging.basicConfig(level=logging.INFO)


def main():
    # Create the document to extract, with its configuration (path and OCR language)
    pdf_document = Document(document_path="./pdfs_samples/page1.pdf", language="por")
    pdf2text = PDF2Text(document=pdf_document)
    content = pdf2text.extract()
    for page in content:
        print(page["text"])
        write_txt(page["text"], filename="output_multilingual_pdf2text1.txt")


if __name__ == "__main__":
    main()

I ran this same code on Linux (Ubuntu 20.04) and it worked perfectly, so I was wondering whether the library supports Windows.

shahrukhx01 · 2021-11-30T07:57:18Z

@richecr As long as you are able to install Tesseract on Windows, this library should work fine. You can take a look at this article: "Installing and using Tesseract 4 on windows 10".
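A common reason for a Windows-only failure is that Tesseract is installed but its folder is not on PATH, so Python cannot find the executable. The thread does not confirm that this was the cause here; the snippet below is only a minimal sketch for checking it, using the standard library alone. The helper names `find_tesseract` and `tesseract_version` are hypothetical and are not part of multilingual_pdf2text.

```python
import shutil
import subprocess
from typing import Optional


def find_tesseract(executable: str = "tesseract") -> Optional[str]:
    """Return the full path of the executable if it is on PATH, else None.

    Hypothetical helper for illustration; not part of multilingual_pdf2text.
    """
    return shutil.which(executable)


def tesseract_version(path: str) -> str:
    """Return the first line printed by `tesseract --version`."""
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Some Tesseract builds print the version to stderr instead of stdout.
    output = result.stdout or result.stderr
    return output.splitlines()[0] if output else ""


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract was not found on PATH; add its install folder to PATH.")
    else:
        print(f"Found Tesseract at {path}: {tesseract_version(path)}")
```

If the check reports that Tesseract is missing, adding its install folder to the PATH environment variable and opening a new terminal is usually enough for the executable to be found.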

shahrukhx01 closed this as completed Nov 30, 2021
