This repository has been archived by the owner. It is now read-only.

Further testing OCR #4

Open
christian-oreilly opened this issue Sep 8, 2017 · 1 comment

Comments

@christian-oreilly
Contributor

We previously ran into problems when performing OCR on some documents (documented in ocrmypdf/OCRmyPDF#97). Since the OCRmyPDF developers have closed that issue, we need to retest and check that OCR is now working as expected in the REST server.

@pafonta
Contributor

pafonta commented Aug 21, 2018

OCRmyPDF should be upgraded. After the upgrade, the OCR text file can be obtained with:

ocrmypdf --sidecar output.txt input.pdf output.pdf

--sidecar is a feature of v5.0:

Add a new feature, --sidecar, which allows creating “sidecar” text files which contain the OCR results in plain text. These OCR text is more reliable than extracting text from PDFs. Closes #126.
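
For retesting from Python code, the ocrmypdf command above could be wrapped roughly as follows. This is only a sketch: the helper names `build_sidecar_command` and `ocr_to_text` are hypothetical and not part of OCRmyPDF or of this repository, and it assumes an `ocrmypdf` executable (v5.0 or later, so that `--sidecar` is available) is on the PATH.

```python
import subprocess
from pathlib import Path


def build_sidecar_command(input_pdf, output_pdf, sidecar_txt):
    """Build the argv list for: ocrmypdf --sidecar <txt> <input> <output>.

    Hypothetical helper; it only mirrors the command line shown above.
    """
    return [
        "ocrmypdf",
        "--sidecar", str(sidecar_txt),
        str(input_pdf),
        str(output_pdf),
    ]


def ocr_to_text(input_pdf, output_pdf, sidecar_txt):
    """Run OCRmyPDF and return the plain-text OCR result from the sidecar file.

    Hypothetical helper; assumes the ocrmypdf executable is installed.
    """
    command = build_sidecar_command(input_pdf, output_pdf, sidecar_txt)
    # check=True raises subprocess.CalledProcessError on a non-zero exit code.
    subprocess.run(command, check=True)
    return Path(sidecar_txt).read_text(encoding="utf-8")
```

Calling `ocr_to_text("input.pdf", "output.pdf", "output.txt")` would then produce both the OCRed PDF and the text, which could be compared against the documents that failed in ocrmypdf/OCRmyPDF#97.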
