Home
This software was designed with an open-source approach in mind, so we have focused on keeping its prerequisites free of licensing requirements. The software relies heavily on the Linux environment and on the open-source communities behind Tesseract OCR and OpenCV.
Since the server was developed in Java, we had to use a wrapper called Tess4J, which relies on the C++ header files (unfortunately, these are only accessible by installing Tesseract).
- Java 8+
- Tesseract v4.0+
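The two version requirements above can be checked before building. The following is a minimal sketch and not part of ocr-core: the class name `PrereqCheck` and its methods are hypothetical, and it assumes that the first line printed by `tesseract --version` starts with `tesseract` followed by the version number.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ocr-core.
public class PrereqCheck {

    // Returns the Java major version from a "java.version" style string.
    // Handles the legacy "1.8.0_292" scheme as well as "11.0.2" or "17".
    public static int javaMajor(String version) {
        String[] parts = version.split("[._\\-+]");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    // Extracts the major version from a line such as "tesseract 4.0.0-beta.1".
    // Returns -1 when the line does not match the assumed format.
    public static int tesseractMajor(String firstLine) {
        Matcher m = Pattern.compile("tesseract\\s+v?(\\d+)").matcher(firstLine);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    // Runs a command and returns the first line it prints (stdout and stderr merged).
    static String firstLineOf(String... command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).redirectErrorStream(true).start();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line = r.readLine();
            while (r.readLine() != null) {
                // drain remaining output so the process can exit
            }
            p.waitFor();
            return line == null ? "" : line;
        }
    }

    public static void main(String[] args) throws Exception {
        int java = javaMajor(System.getProperty("java.version"));
        System.out.println("Java major version " + java + (java >= 8 ? " (ok)" : " (need 8+)"));
        try {
            int tess = tesseractMajor(firstLineOf("tesseract", "--version"));
            System.out.println("Tesseract major version " + tess + (tess >= 4 ? " (ok)" : " (need 4.0+)"));
        } catch (IOException e) {
            // ProcessBuilder.start() throws IOException when the binary cannot be launched
            System.out.println("tesseract not found on PATH");
        }
    }
}
```

Run it with `java PrereqCheck` after compiling; a reported Tesseract version of `-1` means the output did not match the assumed format.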
The operating system matters for this software: we tested on a current operating system to provide as much support as possible and a degree of future-proofing.
The software solution was tested on Ubuntu 16.04 and later (including Ubuntu 18.04). This guide focuses specifically on installing the software solution in an Ubuntu 18.04 environment (since it is the latest release and will receive the most support in the coming months).
Install Ubuntu 18.04 Server on any PC or VM that has internet access, at least 1 GB of RAM, and 20 GB of hard drive space. The recommended processor is a dual-core processor with 4 threads.
_Since this software relies heavily on threading and concurrency, it is best to choose a processor with strong single-core performance and a high thread count._
Ensure that the server is running the latest kernel and software updates:
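To illustrate why thread count matters for this kind of workload, here is a minimal sketch of a fixed-size worker pool sized to the machine's available processors. It is an illustration only and does not show how ocr-core schedules its work: the class `WorkerPoolSketch`, the method `processAll`, and the placeholder task (taking a string's length) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration, not taken from ocr-core.
public class WorkerPoolSketch {

    // Runs one placeholder task per input on a fixed pool of `threads` workers
    // and returns the results in the same order as the inputs.
    public static List<Integer> processAll(List<String> inputs, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String input : inputs) {
                // Stand-in for real per-item work such as processing one image
                Callable<Integer> task = () -> input.length();
                futures.add(pool.submit(task));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get()); // blocks until that task has finished
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // On the recommended dual-core, 4-thread processor this typically reports 4
        int threads = Runtime.getRuntime().availableProcessors();
        System.out.println("Worker threads: " + threads);
        System.out.println(processAll(Arrays.asList("a", "bcd", ""), threads));
    }
}
```

With more hardware threads, more tasks run at the same time, while single-core performance determines how quickly each individual task completes.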
sudo apt update
sudo apt upgrade
sudo apt install tesseract-ocr
sudo apt install libtesseract-dev
sudo apt install git
sudo apt install maven
git clone https://github.com/Benehiko/ocr-core
cd ocr-core/
mvn clean install
This will generate a jar file called 'OCRv2-3.1-jar-with-dependencies.jar' inside the target/ folder of ocr-core/. Copy this jar anywhere you like and execute it like so:
java -jar OCRv2-3.1-jar-with-dependencies.jar
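If the version number in the jar name changes between builds, the assembled jar can be located by its suffix instead of its full name. This is a minimal sketch under that assumption: the class `JarLocator` and the method `findFatJar` are hypothetical, and the code only relies on the `-jar-with-dependencies.jar` suffix seen in the file name above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical helper, not part of ocr-core.
public class JarLocator {

    // Returns the first file (in sorted order) in targetDir whose name ends
    // with "-jar-with-dependencies.jar", or an empty Optional if none exists.
    public static Optional<Path> findFatJar(Path targetDir) throws IOException {
        try (Stream<Path> files = Files.list(targetDir)) {
            return files
                    .filter(p -> p.getFileName().toString().endsWith("-jar-with-dependencies.jar"))
                    .sorted()
                    .findFirst();
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to ./target, i.e. run from inside ocr-core/ after the build
        Path target = Paths.get(args.length > 0 ? args[0] : "target");
        Optional<Path> jar = findFatJar(target);
        System.out.println(jar.map(p -> "java -jar " + p).orElse("No assembled jar found in " + target));
    }
}
```

The program prints the `java -jar` command for the jar it finds, which can then be run as shown above.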