Tesseract Multiple-Threading Issue #1019
I am trying to run multiple concurrent tesseract processes, but tesseract seems to deadlock and the processes never return.
I am testing tesseract 4.0, running it from a multi-threaded Java service. The service runs 5 threads, and each thread starts a single tesseract process. When multiple tesseract processes run at the same time, all of them hang indefinitely: they never exit and never write anything to the output file. I tried setting a 2-minute timeout on the tesseract process and destroying it afterwards, but that complicated the issue, because tesseract seems to spawn its own child processes, and those keep running and never return. I waited a full hour and the processes still had not exited, while htop showed every core at 100% usage by tesseract.
The command is: sudo bash -c tesseract image_list results -l lang -c preserve_interword_spaces=1 --tessdata-dir ./tessdata --oem 1
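
For reference, the timeout-and-destroy approach described above could be sketched roughly as follows. This is only an illustration, not the service's actual code: the class name `TimedProcess` and the method `runWithTimeout` are hypothetical, and it assumes Java 9 or later so that `Process.descendants()` is available to reach the child processes that would otherwise be left running.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of tesseract or of the original service.
final class TimedProcess {

    /**
     * Runs a command and waits up to timeoutSeconds for it to finish.
     * On timeout, kills the process and every descendant it spawned.
     * Returns the exit code, or -1 if the process had to be killed.
     */
    static int runWithTimeout(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Discard output so a full pipe buffer cannot block the child.
        builder.redirectErrorStream(true);
        builder.redirectOutput(ProcessBuilder.Redirect.DISCARD);

        Process process = builder.start();
        if (process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            return process.exitValue();
        }

        // Kill the descendants first, while they can still be found
        // through the parent, then kill the parent itself.
        process.descendants().forEach(ProcessHandle::destroyForcibly);
        process.destroyForcibly();
        process.waitFor();
        return -1;
    }
}
```

A call such as `TimedProcess.runWithTimeout(List.of("tesseract", "image_list", "results"), 120)` would apply the 2-minute limit. Note that if the command is launched through sudo, the resulting processes may run as root, in which case a JVM running as an ordinary user may not be permitted to kill them.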