ocrmypdf with incrontab / inotify #90

Closed
segro21 opened this issue Sep 25, 2014 · 3 comments

segro21 commented Sep 25, 2014

Hi,
this is not really an issue with ocrmypdf, but I'm trying to get it to work on a Samba share with incrontab / inotify.
I've created a folder and watch activity in it with incrontab. That works fine for things like pdftk, but nothing happens with ocrmypdf. Syslog shows the correct command, but then nothing further happens.

my incrontab -e
/home/pdfin IN_CLOSE_WRITE /opt/ocrmypdf/ocrmypdf.sh $@/$# $@/out/$#
/home/pdfin/out IN_CLOSE_WRITE /bin/rm $@/../$#
-> This works fine for stamping PDFs with a logo:
/home/stamp IN_CLOSE_WRITE /usr/bin/pdftk $@/$# stamp $@/BB.pdf output $@/out/$#
/home/stamp/out IN_CLOSE_WRITE /bin/rm $@/../$#

Any ideas?

Manuel-J commented Feb 3, 2015

Hello,

incrond behaves like cron and runs commands in a sparse environment.
For me everything works fine if I use a non-root incrontab and a bash script that calls ocrmypdf.sh (use the full path to the script in the incrontab!).

$ sudo nano /usr/bin/incrondOCR.sh
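#!/bin/bash
# Called by incrond with the full path of the file that was just written.
case "$1" in
*_ocr.pdf)
        # Our own output lands in the watched folder too; skip it to avoid a loop.
        echo "File $1 is an ocr pdf."
        exit 0;;
*.pdf)
        echo "Run ocr on file $1."
        # Quote the paths so file names with spaces survive.
        /usr/share/OCRmyPDF/OCRmyPDF.sh -l deu "$1" "${1%.pdf}_ocr.pdf"
        exit 0;;
*)
        echo "Failure: File $1 is not a pdf."
        exit 1;;
esac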
$ sudo sh -c "echo 'nobody' >> /etc/incron.allow"
$ sudo incrontab -e -u nobody
/srv/data/scans/ IN_CLOSE_WRITE /usr/bin/incrondOCR.sh $@$#
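
To make the "sparse environment" point concrete, here is a minimal sketch of two helpers a wrapper script could call before starting the OCR. Neither is part of incron or OCRmyPDF: the function names `setup_env` and `dump_env`, the PATH value, and the log file argument are all assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical helper: give a script started by incrond a predictable environment.
setup_env() {
    # Like cron, incrond may start commands with a minimal PATH, so set one explicitly.
    export PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
    # Fall back to the C locale if none was inherited.
    export LANG="${LANG:-C}"
}

# Hypothetical helper: append the environment the script actually received to a log file.
dump_env() {
    local logfile="$1"
    {
        echo "cwd=$PWD"
        echo "user=$(id -un)"
        echo "PATH=$PATH"
    } >> "$logfile"
}
```

Calling `setup_env` and then `dump_env /some/writable/log` at the top of the wrapper shows which user, working directory, and PATH the script really runs with, which is usually the quickest way to see how it differs from an interactive shell.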

fpetry commented Dec 4, 2015

EDIT: Solved it by changing the path in the script (see below). Incron runs the script with "/" as its working directory, and therefore ocrmypdf is not able to create some temporary files...

I just tried a similar thing with version 3.1 and ocrmypdf fails while opening a database :-( :

Dec  4 16:47:10 notebook incrond[15320]: (six86) CMD (/media/DatenExt/Scanner/ocr.sh /media/DatenExt/Scanner/bbbb.pdf)
Dec  4 16:47:10 notebook ocr.sh: Traceback (most recent call last):
Dec  4 16:47:10 notebook ocr.sh:   File "/usr/local/bin/ocrmypdf", line 11, in <module>
Dec  4 16:47:10 notebook ocr.sh:     sys.exit(run_pipeline())
Dec  4 16:47:10 notebook ocr.sh:   File "/usr/local/lib/python3.5/dist-packages/ocrmypdf/main.py", line 898, in run_pipeline
Dec  4 16:47:10 notebook ocr.sh:     cmdline.run(options)
Dec  4 16:47:10 notebook ocr.sh:   File "/usr/local/lib/python3.5/dist-packages/ruffus/cmdline.py", line 824, in run
Dec  4 16:47:10 notebook ocr.sh:     **appropriate_options)
Dec  4 16:47:10 notebook ocr.sh:   File "/usr/local/lib/python3.5/dist-packages/ruffus/task.py", line 5567, in pipeline_run
Dec  4 16:47:10 notebook ocr.sh:     target_tasks, forcedtorun_tasks)
Dec  4 16:47:10 notebook ocr.sh:   File "/usr/local/lib/python3.5/dist-packages/ruffus/task.py", line 4452, in _pipeline_prepare_to_run
Dec  4 16:47:10 notebook ocr.sh:     job_history = open_job_history(history_file)
Dec  4 16:47:10 notebook ocr.sh:   File "/usr/local/lib/python3.5/dist-packages/ruffus/ruffus_utility.py", line 211, in open_job_history
Dec  4 16:47:10 notebook ocr.sh:     return dbdict.open(history_file, picklevalues=True)
Dec  4 16:47:10 notebook ocr.sh:   File "/usr/local/lib/python3.5/dist-packages/ruffus/dbdict.py", line 363, in open
Dec  4 16:47:10 notebook ocr.sh:     return DbDict(filename, picklevalues)
Dec  4 16:47:10 notebook ocr.sh:   File "/usr/local/lib/python3.5/dist-packages/ruffus/dbdict.py", line 112, in __init__
Dec  4 16:47:10 notebook ocr.sh:     self.con = sqlite3.connect(filename)
Dec  4 16:47:10 notebook ocr.sh: sqlite3.OperationalError: unable to open database file

My script:

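#!/bin/bash
# Send stdout and stderr to syslog, tagged with the script name.
exec 1> >(logger -s -t "$(basename "$0")") 2>&1

# incron starts the script in "/" (see EDIT above); switch to a writable directory first.
cd /media/DatenExt/Scanner/out/ || exit 1

INFILE="${1##*/}"                 # file name without the directory
OUTFILE="${INFILE%.*}_ocr.pdf"    # replace the extension with _ocr.pdf
/usr/local/bin/ocrmypdf -l deu "$1" "/media/DatenExt/Scanner/out/$OUTFILE"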

I suppose it has something to do with incron's environment; running the script separately works just fine.
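
For anyone adapting the script, the output file name comes from two bash parameter expansions: `${1##*/}` strips the directory, and `${INFILE%.*}` strips the last extension. A small sketch that isolates this step is below; the function name `ocr_output_name` is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: derive "<name>_ocr.pdf" from a full input path.
ocr_output_name() {
    local infile="${1##*/}"               # drop everything up to the last "/"
    printf '%s_ocr.pdf\n' "${infile%.*}"  # drop the last ".ext", then add the suffix
}
```

With the path from the log above, `ocr_output_name /media/DatenExt/Scanner/bbbb.pdf` prints `bbbb_ocr.pdf`. Only the last extension is removed, so a name like `scan.2015.pdf` becomes `scan.2015_ocr.pdf`.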

@jbarlow83
Collaborator

Yes, it needs to create that file, so it needs a writable CWD, I think. I will check.

The active project is now jbarlow83/ocrmypdf. Please record any other
issues there. Thanks.
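
If the writable-CWD explanation is right (it is unconfirmed in this thread), a wrapper can guard against it generically instead of hard-coding a directory. The following is only a sketch under that assumption; the function name `ensure_writable_cwd` and the fallback to `$TMPDIR` or `/tmp` are choices made for this example.

```shell
#!/bin/bash
# Hypothetical helper: make sure the current directory is writable before
# starting a tool that creates files in its working directory.
ensure_writable_cwd() {
    # Optional first argument: directory to fall back to.
    local fallback="${1:-${TMPDIR:-/tmp}}"
    if [ ! -w "$PWD" ]; then
        # The current directory (e.g. "/") is not writable for this user; move away.
        cd "$fallback" || return 1
    fi
}
```

A wrapper would call `ensure_writable_cwd || exit 1` before invoking ocrmypdf. Note that the check passes for root almost everywhere, so it mainly helps with non-root incrontabs such as the `nobody` setup above.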