This repository was archived by the owner on Oct 3, 2022. It is now read-only.

Description
Overview
On Void Linux, the tesseract binary is installed as /usr/bin/tesseract-ocr because of a naming conflict with the game Tesseract. It would be nice if the path to the OCR engine executable could be specified explicitly, e.g. via a command-line option, environment variable, or configuration file.
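To illustrate the environment-variable variant of this request, here is a minimal sketch of how the executable could be resolved. It is not ocrodjvu's actual code: the variable name OCRODJVU_TESSERACT, the function resolve_tesseract, and the candidate list are all hypothetical, and it uses Python 3's shutil.which, whereas ocrodjvu 0.11 runs on Python 2.7.

```python
import os
import shutil

# Hypothetical variable name; ocrodjvu does not define it.
ENV_VAR = 'OCRODJVU_TESSERACT'
# Candidate binary names, tried in order when no override is given.
CANDIDATES = ('tesseract', 'tesseract-ocr')

def resolve_tesseract(environ=None, which=shutil.which):
    """Return the tesseract executable to run, or None if none is found."""
    if environ is None:
        environ = os.environ
    override = environ.get(ENV_VAR)
    if override:
        # An explicit override wins, even if it is not on PATH.
        return override
    for name in CANDIDATES:
        path = which(name)
        if path is not None:
            return path
    return None
```

With something like this, running with OCRODJVU_TESSERACT=/usr/bin/tesseract-ocr set would select the right binary without patching. Name-based auto-detection alone would not be enough on Void Linux, because a tesseract found on PATH could be the game rather than the OCR engine, which is why an explicit override is what is being asked for.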
Version Information
$ ocrodjvu --version
ocrodjvu 0.11
+ Python 2.7.16
+ subprocess32
+ python-djvulibre 0.8.4
+ lxml 4.3.3
$ lsb_release --all
LSB Version: 1.0
Distributor ID: VoidLinux
Description: Void Linux
Release: rolling
Codename: void
Comments
For the moment, I am working around this issue by packaging ocrodjvu for my distro with the following patch:
--- a/lib/engines/tesseract.py
+++ b/lib/engines/tesseract.py
@@ -111,7 +111,7 @@
     image_format = image_io.TIFF
     needs_utf8_fix = True
-    executable = utils.property('tesseract')
+    executable = utils.property('tesseract-ocr')
     extra_args = utils.property([], shlex.split)
     use_hocr = utils.property(None, int)
     fix_html = utils.property(0, int)