Further testing OCR #4

christian-oreilly · 2017-09-08T15:53:47Z

We previously encountered issues when performing OCR on some documents (documented in ocrmypdf/OCRmyPDF#97). Since the OCRmyPDF developers have closed that issue, we need to retest and check that OCR now works as expected with the REST server.

pafonta · 2018-08-21T09:52:02Z

OCRmyPDF should be upgraded. After the upgrade, the OCR text can be written to a text file with:

ocrmypdf --sidecar output.txt input.pdf output.pdf

--sidecar is a feature of v5.0:

Add a new feature, --sidecar, which allows creating “sidecar” text files which contain the OCR results in plain text. These OCR text is more reliable than extracting text from PDFs. Closes #126.
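
For testing from Python, the command above could be wrapped roughly as follows. This is only a sketch: the helper names build_sidecar_command and ocr_to_text are hypothetical and not part of OCRmyPDF or of the REST server, and it assumes an ocrmypdf executable (v5.0 or later) is on the PATH.

```python
import subprocess
from pathlib import Path


def build_sidecar_command(input_pdf, output_pdf, sidecar_txt):
    """Build the argv list for: ocrmypdf --sidecar <txt> <input> <output>."""
    return ["ocrmypdf", "--sidecar", str(sidecar_txt), str(input_pdf), str(output_pdf)]


def ocr_to_text(input_pdf, output_pdf, sidecar_txt):
    """Run OCRmyPDF and return the sidecar text (hypothetical helper).

    Assumes ocrmypdf >= 5.0 is installed and available on the PATH.
    """
    cmd = build_sidecar_command(input_pdf, output_pdf, sidecar_txt)
    # check=True raises subprocess.CalledProcessError on a non-zero exit status.
    subprocess.run(cmd, check=True)
    return Path(sidecar_txt).read_text(encoding="utf-8")
```

Calling ocr_to_text("input.pdf", "output.pdf", "output.txt") would then return the recognized text, which a test could compare against the expected content of a known document.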

christian-oreilly added the low_priority label Oct 17, 2017

pafonta removed the low_priority label Jan 4, 2018
