Insufficient error message when output file cannot be created #1424

Closed
Shreeshrii opened this issue Mar 25, 2018 · 13 comments

@Shreeshrii
Collaborator

@stweil

With the new script subdirectory for the script traineddata files, tesseract gives an error if the language is given as script/Devanagari.

It works when /script is appended to the --tessdata-dir path and the language is given as Devanagari.

I am copying my generic bash script for running tesseract below, followed by the console output.

@Shreeshrii
Collaborator Author

#!/bin/bash
SOURCE="."
set -- "$SOURCE"/hin-eng.png
for img_file; do
    echo -e "\r\n File: $img_file"
    for tessdir in tessdata_fast
    do
        # Case 1: --tessdata-dir points at the script subdirectory,
        # language is given as plain "Devanagari".
        for lang in "Devanagari"
        do
            for psm_counter in 3 6
            do
                echo -e "\r\n ********** $lang --oem 1 psm " $psm_counter " \r\n"
                time tesseract --tessdata-dir "/mnt/c/Users/User/shree/$tessdir/script" "${img_file}" "${img_file%.*}$tessdir-$psm_counter-${lang}" --oem 1 --psm "$psm_counter" -l "${lang}"
            done
        done
        # Case 2: --tessdata-dir points at the top-level directory,
        # language is given as "script/Devanagari".
        for lang in "script/Devanagari"
        do
            for psm_counter in 3 6
            do
                echo -e "\r\n ********** $lang --oem 1 psm " $psm_counter " \r\n"
                time tesseract --tessdata-dir "/mnt/c/Users/User/shree/$tessdir" "${img_file}" "${img_file%.*}$tessdir-$psm_counter-${lang}" --oem 1 --psm "$psm_counter" -l "${lang}"
            done
        done
    done
done


@Shreeshrii
Collaborator Author

#bash ./tess1.sh

File: ./hin-eng.png

********** Devanagari --oem 1 psm 3

Tesseract Open Source OCR Engine v4.0.0-beta.1-42-g3fa25 with Leptonica

real 0m7.072s
user 0m6.359s
sys 0m0.516s

********** Devanagari --oem 1 psm 6

Tesseract Open Source OCR Engine v4.0.0-beta.1-42-g3fa25 with Leptonica

real 0m6.827s
user 0m6.125s
sys 0m0.500s

********** script/Devanagari --oem 1 psm 3

Tesseract Open Source OCR Engine v4.0.0-beta.1-42-g3fa25 with Leptonica
Error during processing.

real 0m1.424s
user 0m0.781s
sys 0m0.453s

********** script/Devanagari --oem 1 psm 6

Tesseract Open Source OCR Engine v4.0.0-beta.1-42-g3fa25 with Leptonica
Error during processing.

real 0m1.411s
user 0m0.734s
sys 0m0.438s

@stweil
Contributor

stweil commented Mar 25, 2018

That looks strange, as it works for me:

tesseract --tessdata-dir /tessdata_fast testing/devatest.png - -l script/Devanagari
CDAC-GISTYogesh A र
सभी मनुष्यों को गौरव और अधिकारों के मामले में जन्मजात स्वतन्त्रता और समानता प्रप्त हे ।

उन्हें बुद्धि और अन्तरात्मा की देन प्राप्त है और परस्पर उन्हें भाईचारे के भाव से बर्ताव करना चाहिए ।

Gargi
सभी मनुष्यों को गौरव और अधिकारों के मामले में जन्मजात स्वतन्त्रता और समानता प्राप्त हे।

उन्हें बुद्धि और अन्तरात्मा की देन प्राप्त हैऔर परस्पर उन्हें भाईचारे के भाव से बर्ताव करना चाहिए ।
JanaHindi

सभी मनुष्यों को गौरव और अधिकारों के मामले में जन्मजात स्वतन्त्रता और समानता प्रास हे ।
उन्हें बुद्धि और अन्तरात्मा की देन प्रास है और पररूपर उन्हें भाईचारे के भाव से बर्ताव करना चाहिए ।

Kalimati
सभी मनुष्यों को गोरव ओर अधिकारों के मामले में जन्मजात स्वतन्त्रता ओर समानता प्राप्त हे ।
उन्हें बुद्धि ओर अन्तरात्मा की देन प्राप्त है ओर परस्पर उन्हें भाईचारे के भाव से बर्ताव करना चाहिए ।

@Shreeshrii
Collaborator Author

Could it be related to how I am passing the variable in bash?

@stweil
Contributor

stweil commented Mar 25, 2018

"Error during processing." is a typical case where the user interface needs to be improved, because the message does not say what the real problem was.

Could you please create a very simple test environment with a newly created traineddata directory containing nothing but traineddata/script/Devanagari.traineddata from tessdata_fast? Then run tesseract --tessdata-dir traineddata testing/devatest.png - -l script/Devanagari. That works for me.

@stweil
Contributor

stweil commented Mar 25, 2018

> Could it be related to how I am passing the variable in bash?

I see no obvious problem with the script. Try running your script with bash -x ./tess1.sh to see what is really called.
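
As a rough illustration of what the trace shows, the sketch below rebuilds the output base name the same way the script above does and prints the expanded command with set -x. It is only a sketch: the fixed values are taken from the script, outbase is a hypothetical helper variable, and echo stands in for the real tesseract call so that nothing is actually run.

```shell
#!/bin/bash
# Sketch: trace how the arguments expand, without running tesseract.
img_file="./hin-eng.png"
tessdir="tessdata_fast"
psm_counter=3
lang="script/Devanagari"

# Same expansion as the output base argument in the script above.
# "outbase" is a hypothetical name used only in this sketch.
outbase="${img_file%.*}$tessdir-$psm_counter-${lang}"

set -x   # print each command after expansion, as bash -x does
echo tesseract "$img_file" "$outbase" --oem 1 --psm "$psm_counter" -l "$lang"
set +x
```

The line printed by the trace is the command exactly as the shell would pass it on, which makes it easy to check each argument.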

@Shreeshrii
Collaborator Author

Thank you.

Using bash -x helped figure out the problem.

I was using $lang as part of the output filename, to compare the output for different languages.

Because of the / in script/Devanagari, tesseract was probably trying to write the output into a subdirectory.

I removed $lang and it worked fine.

Thanks. Closing the issue.
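
One possible way to keep the language in the output name, instead of removing $lang altogether, is to replace the slash before building the filename. This is a minimal sketch, assuming an underscore is an acceptable substitute; safe_lang and outbase are hypothetical names that do not appear in the original script.

```shell
#!/bin/bash
# Sketch: make the language name safe to embed in a filename.
img_file="./hin-eng.png"
tessdir="tessdata_fast"
psm_counter=3
lang="script/Devanagari"

# Replace every "/" with "_" so the name cannot introduce a directory.
# "safe_lang" is a hypothetical variable, not part of the original script.
safe_lang="${lang//\//_}"

# The language passed to -l stays "$lang"; only the filename uses safe_lang.
outbase="${img_file%.*}$tessdir-$psm_counter-${safe_lang}"
echo "$outbase"
```

With this change the output for both language spellings ends up next to the image file, so the results can still be compared.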

@stweil
Contributor

stweil commented Mar 25, 2018

Yes, that's the real problem. tesseract --tessdata-dir traineddata/ testing/devatest.png abc/x -l script/Devanagari fails for me, too, with "Error during processing".

Please reopen the issue and change the title to something like "Insufficient error message when output file cannot be created".
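
Until the message itself is improved, a wrapper script could check the output location before calling tesseract and report the reason. The following is a sketch only: check_outputbase is a hypothetical helper written for this illustration, not part of tesseract, and it covers just the two obvious causes (missing or unwritable directory).

```shell
#!/bin/bash
# Hypothetical helper: explain why an output base would be unusable.
check_outputbase() {
    local outbase="$1"
    # "-" sends the result to stdout (as in the commands above), so skip the check.
    [ "$outbase" = "-" ] && return 0
    local dir
    dir="$(dirname -- "$outbase")"
    if [ ! -d "$dir" ]; then
        echo "Output directory does not exist: $dir" >&2
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "Output directory is not writable: $dir" >&2
        return 1
    fi
    return 0
}

# Usage with the failing output base abc/x from the command above.
# echo stands in for the real tesseract call.
if check_outputbase "abc/x"; then
    echo "would run: tesseract ... abc/x -l script/Devanagari"
fi
```

If the directory abc is missing, the helper prints the reason to stderr and the tesseract call is skipped.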

Shreeshrii reopened this Mar 25, 2018
Shreeshrii changed the title from "Error using script/Devanagari as language" to "Insufficient error message when output file cannot be created" Mar 25, 2018

@godofcheerup

Hi @Shreeshrii, could I get the bash script you mentioned?

@Shreeshrii
Collaborator Author

@stweil Is this something you plan to address for 4.0.0?

@stweil
Contributor

stweil commented Oct 18, 2018

It's addressed by pull request #2002.

@amitdo
Collaborator

amitdo commented Oct 18, 2018

We should release 4.0.0 before #2019 comes ...:-).

@Shreeshrii
Collaborator Author

Thanks.
