Commit
updated to pack tesseract 5
chpoit committed Feb 3, 2022
1 parent 9f85ee9 commit bd62032
Showing 3 changed files with 18 additions and 4 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -225,12 +225,13 @@ Normal instructions apply once the application starts.
## Windows
- Requirements
- **Google Tesseract** installed and in path
-    - A copy of the version 4 is bundled with the release. Just run it, no extra packages needed
-    - You can download the installer here version here: [Installer here](https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-v4.1.0.20190314.exe)
+    - A copy of the version 5.0.1 is bundled with the release. Just run it, no extra packages needed
+    - You can download the installer here version here: [Installer here](https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-v5.0.1.20220118.exe)
- Make sure it's in path
- Python3 installed and in path
- Running:
- `source env/bin/activate` (if you use a virtual env)
- `python main.py`

# Extra command-line options
If you run from source, or call the executable from the terminal you can make use of the following flags/arguments to achieve different functionality.
2 changes: 1 addition & 1 deletion data/versions.json
@@ -1,5 +1,5 @@
 {
-    "app": "1.5.2",
+    "app": "1.5.3",
     "skills": "3.0",
     "languages": {
         "eng": "1.0.1",
15 changes: 14 additions & 1 deletion src/tesseract/tesseract_utils.py
@@ -25,7 +25,7 @@ def _is_pyinstaller():

 def _get_pyinstaller_tesseract_path():
     base_path = sys._MEIPASS
-    bundled_path = os.path.join(base_path, "Tesseract-OCR", "libtesseract-4.dll")
+    bundled_path = os.path.join(base_path, "Tesseract-OCR", "libtesseract-5.dll")
     return bundled_path


@@ -37,13 +37,26 @@ def find_tesseract():
     # TODO: Make this resilient to "change" (tesseract version), probably not necessary
     locations = [
         ctypes.util.find_library("libtesseract-4"), # win32
+        ctypes.util.find_library("libtesseract-5"), # win32
         ctypes.util.find_library("libtesseract302"), # win32 version 3.2
         ctypes.util.find_library("libtesseract"), # others
         ctypes.util.find_library("tesseract"), # others
     ]

     if WINDOWS:
         locations += [
+            os.path.join(
+                os.getenv("ProgramW6432"), "Tesseract-OCR", "libtesseract-5.dll"
+            ),
+            os.path.join(
+                os.getenv("LOCALAPPDATA"), "Tesseract-OCR", "libtesseract-5.dll"
+            ),
+            os.path.join(
+                os.getenv("ProgramFiles"), "Tesseract-OCR", "libtesseract-5.dll"
+            ),
+            os.path.join(
+                os.getenv("programfiles(x86)"), "Tesseract-OCR", "libtesseract-5.dll"
+            ),
             os.path.join(
                 os.getenv("ProgramW6432"), "Tesseract-OCR", "libtesseract-4.dll"
             ),
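
The TODO in `find_tesseract()` asks for the lookup to be resilient to Tesseract version changes. As a rough sketch only (not part of this commit), the per-version `os.path.join` entries could be generated in a loop. The helper name `candidate_dll_paths` and its `versions` and `environ` parameters are hypothetical, introduced here for illustration. The sketch also skips unset environment variables, since `os.getenv` returns `None` for them and `os.path.join(None, ...)` raises `TypeError`.

```python
import os


def candidate_dll_paths(versions=(5, 4), environ=None):
    """Return candidate libtesseract DLL paths, preferred version first.

    Hypothetical helper (an assumption, not code from the repository).
    """
    environ = os.environ if environ is None else environ
    # The same environment variables the diff consults.
    env_vars = ("ProgramW6432", "LOCALAPPDATA", "ProgramFiles", "programfiles(x86)")
    paths = []
    for version in versions:
        for var in env_vars:
            base = environ.get(var)
            if not base:
                # Unset variable: skip it instead of passing None to os.path.join.
                continue
            paths.append(
                os.path.join(base, "Tesseract-OCR", f"libtesseract-{version}.dll")
            )
    return paths
```

With a helper like this, supporting a future release would only need a new entry in `versions`; the order of the tuple decides which version is tried first.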