tesseract4java: Tesseract GUI
A graphical user interface for the Tesseract OCR engine. The program was introduced in the Master’s thesis “Analyses and Heuristics for the Improvement of Optical Character Recognition Results for Fraktur Texts” by Paul Vorbach (written in German).
Binary distributions and release notes are available in the releases section.
Features
- Box editor for training
- Glyph overview for easier detection of errors
- Comparison view to compare the original document with the perceived result
- Evaluation view with a transcription field
- Batch export functionality to handle large projects
Building and running the software
This software is written in Java and can be built using Apache Maven. To build it, follow these steps:
- Obtain a copy either by cloning the repository or downloading the current zip file.
- Also obtain a copy of a patched version of ocrevalUAtion (zip file).
- Open a command line in the ocrevalUAtion directory and run mvn clean install.
- cd to the tesseract4java directory and run mvn clean package -Pstandalone. This will include the Tesseract binaries for your platform. You can manually define the platform by providing the option -Djavacpp.platform=[PLATFORM] (available platforms are
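The build steps above can be sketched as a small shell helper. This is a minimal sketch, not part of the project: the function names `package_args` and `build_all`, the variables `OCREVAL_DIR` and `T4J_DIR`, and the assumption that both source trees sit side by side in the current directory are all illustrative. The platform identifier is passed through unchanged, so you still have to supply a value that the build accepts.

```shell
#!/bin/sh
# Assumed checkout locations (hypothetical defaults); adjust them to wherever
# you cloned or unzipped the two source trees.
OCREVAL_DIR=${OCREVAL_DIR:-ocrevalUAtion}
T4J_DIR=${T4J_DIR:-tesseract4java}

# Print the Maven arguments for the standalone package build.
# $1 (optional): platform identifier passed on as -Djavacpp.platform.
package_args() {
  if [ -n "${1:-}" ]; then
    printf '%s\n' "clean package -Pstandalone -Djavacpp.platform=$1"
  else
    printf '%s\n' "clean package -Pstandalone"
  fi
}

# Build the patched ocrevalUAtion first (mvn install puts it into the local
# Maven repository), then package tesseract4java.
build_all() {
  (cd "$OCREVAL_DIR" && mvn clean install) || return 1
  # The argument string is split into words on purpose.
  (cd "$T4J_DIR" && mvn $(package_args "${1:-}"))
}
```

After sourcing the file, calling `build_all` without arguments lets the build pick the platform on its own, while `build_all [PLATFORM]` forwards the given identifier.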
After completing all steps, the directory "tesseract4java/gui/target" will contain the file "tesseract4java-[VERSION]-[PLATFORM].jar", which you can run by double-clicking it or by executing java -jar tesseract4java-[VERSION]-[PLATFORM].jar.
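Because the jar name depends on the version and platform, it can be convenient to locate it with a glob instead of typing it out. The following is a minimal sketch under the assumption that you run it from the directory containing the tesseract4java checkout; `find_jar` and `run_gui` are hypothetical helper names, not something the project ships.

```shell
#!/bin/sh
# Print the first tesseract4java jar found in the given target directory.
# $1 (optional): directory to search, defaulting to the path named above.
find_jar() {
  target_dir=${1:-tesseract4java/gui/target}
  for jar in "$target_dir"/tesseract4java-*.jar; do
    # An unmatched glob stays literal, so check that the file really exists.
    if [ -f "$jar" ]; then
      printf '%s\n' "$jar"
      return 0
    fi
  done
  echo "no tesseract4java jar found in $target_dir" >&2
  return 1
}

# Start the GUI from whichever jar find_jar locates.
run_gui() {
  jar=$(find_jar "$@") || return 1
  java -jar "$jar"
}
```

Calling `run_gui` with no arguments searches the default directory. If several jars are present, the first match in glob order is used, so pass an explicit directory or remove stale builds when that matters.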
Credits
- This software uses the Tesseract OCR engine (Apache License 2.0).
- This software uses ocrevalUAtion by Rafael C. Carrasco for providing accuracy measures of the OCR results (GPLv3).
- This software uses the Silk icon set by Mark James (famfamfam.com) (CC-BY-3.0).
License
tesseract4java - a graphical user interface for the Tesseract OCR engine
Copyright (C) 2014-2016 Paul Vorbach

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.