OCRWorker.php is running as a systemd service, as described in the documentation, with the correct user and group (www-data, the same as the web server).
Yet OCRWorker fails to OCR a newly uploaded PDF file.
Bug report / Feature request
Expected Behavior
The documentation states that OCRWorker.php is supposed to automatically OCR a newly added PDF file.
Current Behavior
It doesn't OCR the PDF file, even after waiting about an hour. However, the command in the overlay menu does OCR the file.
I would prefer the automatic OCR to work.
Possible Solution
Where to look to troubleshoot this?
Is there a log?
Has anyone had the same issue with OCRWorker and solved it?
Steps to Reproduce (for bugs)
Install the OCR app and its prerequisites, as described in the documentation.
Install the OCRWorker.php daemon using the systemd option as detailed in the wiki.
Upload a PDF file containing a scanned document to ownCloud/Nextcloud.
Watch the process list: OCRWorker.php does nothing, even after a long time.
Click on the overlay menu and start the OCR manually. In the process list, the OCRWorker daemon and tesseract run with high load for 10 seconds, and a new file with the _OCR.pdf suffix is produced next to the original file, correctly containing the OCR'ed data.
Context
Your Environment
OCR version used: latest version from here.
Browser Name and version: Firefox 52.0b3 latest version.
Operating System and version (desktop or mobile): Windows 10 latest updates. Linux Debian 8 server.
ownCloud/nextcloud version: (see ownCloud admin page or version.php) latest version nextcloud.
PHP version 7.0
Database version Mysql Mariadb 5.6
Are you using encryption: yes/no No.
Log File Content (nextcloud/owncloud.log of the "data"-directory)
Actually, the steps you describe are exactly what the app is meant to do. It offers the possibility to OCR a file (image or PDF) on demand. It does not run automatically right after a new file is added to Nextcloud, because that is not the behaviour most users would expect. One example:
You upload the photos (JPG, a type supported for OCR) from your last vacation, and there isn't much text in them. It's just a bunch of photos from sightseeing and so on. But if the app triggered an OCR process for every newly added file, all of those photos would be processed as well.
So the manual start has to be performed by the user.
If you need a different behavior, you can, for example, fork the GitHub repository and add a hook for newly uploaded files.
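As a rough illustration of what such a hook would have to decide, here is a minimal, language-agnostic sketch in Python (the app itself is PHP). The names should_auto_ocr and on_file_written and the in-memory queue are hypothetical and not part of the app or of ownCloud/Nextcloud. It assumes the hook receives the file path and MIME type, queues only PDFs so that photo uploads are not processed, and skips files that already carry the _OCR.pdf suffix so the app's own output does not trigger another run.

```python
from pathlib import PurePosixPath

# Suffix of the file the manual OCR run produces, as described in the report.
OCR_SUFFIX = "_OCR.pdf"


def should_auto_ocr(path: str, mime_type: str) -> bool:
    """Decide whether a newly written file should be queued for OCR."""
    if mime_type != "application/pdf":
        # Skip photos and other images to avoid OCR on every upload.
        return False
    if PurePosixPath(path).name.endswith(OCR_SUFFIX):
        # Skip the OCR output itself, otherwise each result would be queued again.
        return False
    return True


def on_file_written(path: str, mime_type: str, queue: list) -> None:
    """Hypothetical hook callback: enqueue the file if it qualifies."""
    if should_auto_ocr(path, mime_type):
        queue.append(path)


if __name__ == "__main__":
    pending = []
    on_file_written("/scans/invoice.pdf", "application/pdf", pending)
    on_file_written("/photos/beach.jpg", "image/jpeg", pending)
    on_file_written("/scans/invoice_OCR.pdf", "application/pdf", pending)
    print(pending)
```

In a real fork, the callback would be registered with whichever upload hook the server provides, and the queue would be whatever mechanism the worker already reads from.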
I will close this issue, as the app is supposed to work like this.
When you upload a new PDF file, would it be accurate to say that ownCloud creates the file and/or opens it for writing, writes it completely, and then closes it? After it is closed for the first time, is it the postCreate hook or the postWrite hook that gets triggered?