Looking for text.pdf that does not exist #41

Closed
tadamhicks opened this issue Mar 22, 2016 · 8 comments

@tadamhicks

When I set up a watch in a directory, I frequently see

ERROR: Cannot find specified pdf file /path/to/file_text.pdf

This does not happen with every pdf I place in there, but when it does the watchdog tips over.

I don't know what it's looking for, and I would like for the watchdog to stay running despite errors.
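
For the "stay running despite errors" part, here is a minimal sketch of the general idea, not pypdfocr's actual code. The names `process_all` and `convert` are hypothetical; `convert` stands in for whatever handles a single PDF.

```python
import logging


def process_all(paths, convert):
    """Call convert(path) for each path and keep going when one fails.

    Hypothetical helper for illustration; returns the paths that failed.
    """
    failed = []
    for path in paths:
        try:
            convert(path)
        except Exception:
            # Log the traceback and continue, so one bad file does not stop the loop.
            logging.exception("Could not process %s", path)
            failed.append(path)
    return failed
```

Note that `except Exception` does not catch `SystemExit`. Whether that matters here depends on how the error is raised, which this thread does not show.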

@palica

palica commented Apr 13, 2016

You can patch pypdfocr_watcher.py like this:

diff --git a/pypdfocr/pypdfocr_watcher.py b/pypdfocr/pypdfocr_watcher.py
index 73581e0..cb78002 100755
--- a/pypdfocr/pypdfocr_watcher.py
+++ b/pypdfocr/pypdfocr_watcher.py
@@ -93,7 +93,7 @@ class PyPdfWatcher(FileSystemEventHandler):

         """
         if ev_path.endswith(".pdf"):
-            if not ev_path.endswith("_ocr.pdf"):
+            if not ev_path.endswith(("_ocr.pdf","_text.pdf")):
                 PyPdfWatcher.events_lock.acquire()
                 if not ev_path in PyPdfWatcher.events:
                     PyPdfWatcher.events[ev_path] = time.time()
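
The patched check relies on `str.endswith` accepting a tuple of suffixes. A standalone sketch of the same filter, assuming a hypothetical helper name `should_queue` (the real logic lives inside the `PyPdfWatcher` class):

```python
# Suffixes of files that should not be queued, as in the patch above.
SKIP_SUFFIXES = ("_ocr.pdf", "_text.pdf")


def should_queue(ev_path):
    """Return True for PDFs that do not end with one of the skipped suffixes."""
    return ev_path.endswith(".pdf") and not ev_path.endswith(SKIP_SUFFIXES)


for name in ("scan.pdf", "scan_ocr.pdf", "scan_text.pdf", "notes.txt"):
    print(name, should_queue(name))
```

Only `scan.pdf` passes the filter here; the `_ocr.pdf` and `_text.pdf` files are ignored, so no watch event is queued for them.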


@virantha
Owner

Thanks, I'll try to get this into the next release.

@virantha added this to the 0.9.1 milestone on Jun 23, 2016
@virantha self-assigned this on Jun 23, 2016

@mmatiaschek

I get this error regularly - a release of 0.9.1 would be great!

@mmatiaschek

The patch works fine. I incorporated it into my Docker image:
https://hub.docker.com/r/mmatiaschek/pypdfocr/

@danmash

danmash commented Oct 2, 2016

@mmatiaschek thank you!

@danmash

danmash commented Nov 20, 2016

@virantha You made a typo in pypdfocr_watcher.py
if not ev_path.endswith(("_ocr.pdf", "_test.pdf")):

_test instead of _text
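
A quick way to see why the typo matters, using the path from the original error message. The two tuples are copied from this thread; the variable names are only for illustration.

```python
released = ("_ocr.pdf", "_test.pdf")  # line quoted above, with the typo
intended = ("_ocr.pdf", "_text.pdf")  # line from palica's patch

path = "/path/to/file_text.pdf"

# True means the watcher would skip the file.
skipped_by_released = path.endswith(released)
skipped_by_intended = path.endswith(intended)

print(skipped_by_released, skipped_by_intended)  # prints: False True
```

With the typo, the `_text.pdf` file is not skipped, so it would still be picked up by the watcher.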

@mattobrien415

I am still having this problem with 0.9.1! My monitoring dies. I checked pypdfocr_watcher.py and the fix is clearly there. Am I the only one?

The only thing I can think of is that I'm running the process over multiple directories at once.

@solartune

I have the same problem. @virantha, please check this line:
if not ev_path.endswith(("_ocr.pdf", "_test.pdf")):
It should be _text, as @danmash said.
