Nextcloud OCR (optical character recognition) processing for images and PDF with tesseract-ocr, OCRmyPDF and message queueing for asynchronous processing.

This project is not updated or maintained anymore. At the moment there is too much to do in other projects, so I won't have time for this in the near future. Sorry :-/

OCR


License: AGPL v3

Nextcloud OCR (optical character recognition) processing for images and PDF with tesseract-ocr and OCRmyPDF brings OCR capability to your Nextcloud. The app uses a Docker container running tesseract-ocr and OCRmyPDF and communicates with it over Redis to process images (PNG, JPEG, TIFF) and PDFs asynchronously. The output file is saved to the source folder in Nextcloud, which, for example, lets you search its content. (Hint: not all PDF types are currently supported; for more information see here)
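The asynchronous flow described above can be sketched roughly as follows. This is only an illustration of the queueing pattern, not the app's actual code: Python's in-process `queue.Queue` stands in for Redis, and `OcrJob`, `fake_ocr`, `output_path_for` and the `_ocr.pdf` naming are all hypothetical names chosen for the example.

```python
import queue
import threading
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class OcrJob:
    """Hypothetical job message; the real message format is not shown here."""
    source_path: str


def output_path_for(source_path: str) -> str:
    # Assumption: the result lands in the same folder as the source file.
    # The "_ocr.pdf" suffix is made up for this sketch.
    p = PurePosixPath(source_path)
    return str(p.with_name(p.stem + "_ocr.pdf"))


def fake_ocr(job: OcrJob) -> str:
    # Placeholder for the work tesseract-ocr / OCRmyPDF would do in the container.
    return f"text of {job.source_path}"


def worker(jobs: queue.Queue, results: dict) -> None:
    # The worker blocks on the queue and handles one job at a time.
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work
            break
        results[output_path_for(job.source_path)] = fake_ocr(job)


def process_async(paths):
    jobs = queue.Queue()  # stand-in for the Redis-backed queue
    results = {}
    t = threading.Thread(target=worker, args=(jobs, results))
    t.start()
    for path in paths:
        jobs.put(OcrJob(path))  # the app side only enqueues and returns
    jobs.put(None)
    t.join()
    return results
```

The point of the pattern is that the web application only enqueues a job and returns immediately, while the slow OCR work happens in a separate worker.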

Prerequisites, Requirements and Dependencies

The OCR app has some prerequisites:

  • Nextcloud 12 or 13. For older Nextcloud versions, use an older major version of this app.
  • A Linux server as environment (tested with Debian 8 and Ubuntu 14.04 (Trusty)). Currently not compatible with ARM processors such as the Raspberry Pi.
  • Docker is used for processing files: tesseract-ocr and OCRmyPDF run in a Docker container.
  • php-redis is used for the communication and must be part of your PHP installation.

Limitations

The app currently does not work with any kind of encryption enabled, nor with files shared via external storage or federated sharing. To process such a file, first copy it to the local environment.
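As a rough illustration of these rules, a pre-check for a single file might look like the sketch below. The `FileInfo` fields and the `rejection_reason` function are hypothetical and only mirror the limitations and file types named in this README; they are not the app's API.

```python
from dataclasses import dataclass
from typing import Optional

# File types named in this README: PNG, JPEG, TIFF images and PDF.
SUPPORTED_MIME_TYPES = {"image/png", "image/jpeg", "image/tiff", "application/pdf"}


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical description of a file selected for OCR."""
    mime_type: str
    encrypted: bool = False
    external_storage: bool = False
    federated_share: bool = False


def rejection_reason(info: FileInfo) -> Optional[str]:
    """Return why the file cannot be processed, or None if it is eligible."""
    if info.mime_type not in SUPPORTED_MIME_TYPES:
        return "unsupported file type"
    if info.encrypted:
        return "encryption is enabled"
    if info.external_storage:
        return "file is on external storage"
    if info.federated_share:
        return "file comes from a federated share"
    return None
```

For example, `rejection_reason(FileInfo("application/pdf", encrypted=True))` returns `"encryption is enabled"`, while a local, unencrypted PNG yields `None`.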

For further information see the homepage or the appropriate documentation in the wiki.

Installation

Install the app from the Nextcloud App Store, or download the release package from GitHub (NOT the sources) and place its contents in nextcloud/apps/ocr/.

Please note: the app will not work unless the Docker container is running (more information in the wiki).

Administration and Usage

Please read the related topics in the wiki.

Disclaimer

The software is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
