tesstrain.sh continues even if text2image crashed. #2005

Closed
zdenop opened this issue Oct 18, 2018 · 7 comments

@zdenop
Contributor

zdenop commented Oct 18, 2018

see comment: #1999 (comment)

The problem seems to be that the "check completion status" test in tesstrain_utils.sh does not work as expected. Here is a short test:

#!/bin/bash

# Logging helper functions.
tlog() {
    echo -e $* 2>&1 1>&2 | tee -a ${LOG_FILE}
}

err_exit() {
    echo -e "ERROR: "$* 2>&1 1>&2 | tee -a ${LOG_FILE}
    exit 1
}

run_command() {
    local cmd=$(which $1)
    if [[ -z ${cmd} ]]; then
      for d in api training; do
        cmd=$(which $d/$1)
        if [[ ! -z ${cmd} ]]; then
          break
        fi
      done
      if [[ -z ${cmd} ]]; then
          err_exit "$1 not found"
      fi
    fi
    shift
    tlog "[$(date)] ${cmd} $@"
    "${cmd}" "$@" 2>&1 1>&2 | tee -a ${LOG_FILE}
    echo "Tested return error is:" $?
    # check completion status
    if [[ $? -gt 0 ]]; then
        err_exit "Program $(basename ${cmd}) failed. Abort."
    fi
}

run_command text2image --?
read -p "Everything is fine. Press any key..."

IMO, according to the design logic, run_command should exit the script (because text2image --? returns exit code 1), but it continues and the user sees "Everything is fine. Press any key..."

@zdenop zdenop added the bug label Oct 18, 2018
@zdenop zdenop added this to the 4.0.0 milestone Oct 18, 2018
@Shreeshrii
Collaborator

@zdenop There are other issues with text2image and tesstrain.sh as well. It often fails to create the box file for a certain font but continues anyway, and then fails at the next step when it cannot find the box file.
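
One way to surface this kind of failure earlier is to check for the expected output right after each step. A minimal sketch, assuming a hypothetical helper named `require_nonempty_file` (not taken from tesstrain_utils.sh) and a made-up box file path:

```bash
#!/bin/bash
# Hypothetical helper, not part of tesstrain_utils.sh: report an error when
# an expected output file is missing or empty.
require_nonempty_file() {
    local path=$1
    if [[ ! -s ${path} ]]; then
        echo "ERROR: expected output ${path} is missing or empty" >&2
        return 1
    fi
}

# Usage with a made-up path: the caller decides whether to abort.
if ! require_nonempty_file "/tmp/hypothetical.font.exp0.box" 2>/dev/null; then
    echo "box file missing, stopping before the next step"
fi
```

Returning a status instead of calling `exit` inside the helper keeps the decision to abort with the caller.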

@zdenop
Contributor Author

zdenop commented Oct 18, 2018

@Shreeshrii : did you understand the description above and the test I made?

@Shreeshrii
Collaborator

@zdenop I needed to test it out to understand it.

The exit status in `$?` is that of the last command run.

    "${cmd}" "$@" 2>&1 1>&2 | tee -a ${LOG_FILE}
    echo "Tested return error is:" $?
    # check completion status

So the completion check sees the exit status of the echo command and, if that line is removed, the exit status of tee (the last command in the pipeline), never that of text2image.

https://unix.stackexchange.com/questions/14270/get-exit-status-of-process-thats-piped-to-another
suggests ways by which we can get the status of text2image completion.
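
The effect described above can be reproduced without text2image. In this minimal sketch, `false` stands in for a failing text2image call and `/dev/null` for `${LOG_FILE}` (both are stand-ins, not the real setup). The last case uses bash's `PIPESTATUS` array, one of the alternatives to pipefail mentioned in that thread:

```bash
#!/bin/bash
set +o pipefail   # make sure the default pipeline behaviour is in effect

# Case 1: $? after a pipeline is the status of the last command (tee).
false | tee -a /dev/null
status_after_pipe=$?

# Case 2: any command in between (here echo) overwrites $? again.
false | tee -a /dev/null
echo "intermediate echo" > /dev/null
status_after_echo=$?

# Case 3: PIPESTATUS[0] holds the status of the first pipeline command.
false | tee -a /dev/null
status_first_cmd=${PIPESTATUS[0]}

echo "pipe=${status_after_pipe} echo=${status_after_echo} first=${status_first_cmd}"
# prints: pipe=0 echo=0 first=1
```

`PIPESTATUS` must be read immediately after the pipeline, because any further command replaces its contents.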

The following works, using pipefail:

#!/bin/bash

# Logging helper functions.
tlog() {
    echo -e $* 2>&1 1>&2 | tee -a ${LOG_FILE}
}

err_exit() {
    echo -e "ERROR: "$* 2>&1 1>&2 | tee -a ${LOG_FILE}
    exit 1
}

run_command() {
    local cmd=$(which $1)
    if [[ -z ${cmd} ]]; then
      for d in api training; do
        cmd=$(which $d/$1)
        if [[ ! -z ${cmd} ]]; then
          break
        fi
      done
      if [[ -z ${cmd} ]]; then
          err_exit "$1 not found"
      fi
    fi
    shift
    tlog "[$(date)] ${cmd} $@"
    set -o pipefail
    "${cmd}" "$@" 2>&1 1>&2 | tee -a ${LOG_FILE}
## echo "Tested return error is:" $?
    # check completion status
    if [[ $? -gt 0 ]]; then
        err_exit "Program $(basename ${cmd}) failed. Abort."
    fi
}

run_command text2image --?
read -p "Everything is fine. Press any key..."

output:

bash -x ./test.sh
+ run_command text2image '--?'
++ which text2image
+ local cmd=/home/ubuntu/tesseract/src/training/text2image
+ [[ -z /home/ubuntu/tesseract/src/training/text2image ]]
+ shift
++ date
+ tlog '[Fri Oct 19 02:51:13 UTC 2018] /home/ubuntu/tesseract/src/training/text2image --?'
+ echo -e '[Fri' Oct 19 02:51:13 UTC '2018]' /home/ubuntu/tesseract/src/training/text2image '--?'
+ tee -a
[Fri Oct 19 02:51:13 UTC 2018] /home/ubuntu/tesseract/src/training/text2image --?
+ set -o pipefail
+ /home/ubuntu/tesseract/src/training/text2image '--?'
+ tee -a
ERROR: Non-existent flag --?
+ [[ 1 -gt 0 ]]
++ basename /home/ubuntu/tesseract/src/training/text2image
+ err_exit 'Program text2image failed. Abort.'
+ echo -e 'ERROR: Program' text2image failed. Abort.
+ tee -a
ERROR: Program text2image failed. Abort.
+ exit 1

@Shreeshrii
Collaborator

@stweil Is using pipefail the best way of handling this?

@zdenop
Contributor Author

zdenop commented Oct 19, 2018

@Shreeshrii : can you try adding set -euo pipefail at the beginning of tesstrain_utils.sh and run the training?
See this blog for an explanation of what it should do...
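
For reference, a small sketch of what each of the three options changes. Every case runs in a subshell so the options do not leak into the calling script; `DEMO_UNSET_VAR` is a made-up name used only here:

```bash
#!/bin/bash
# -e: the subshell stops at the first failing command, so "after" is never printed.
e_output=$( set -e; false; echo "after" )
e_status=$?

# -u: expanding an unset variable is an error instead of an empty string.
( set -u; unset DEMO_UNSET_VAR; echo "${DEMO_UNSET_VAR}" ) 2>/dev/null
u_status=$?

# Without pipefail, a pipeline reports only the status of its last command.
( set +o pipefail; false | true )
no_pipefail_status=$?

# With pipefail, a failure anywhere in the pipeline makes the pipeline fail.
( set -o pipefail; false | true )
pipefail_status=$?

echo "-e: status=${e_status} output='${e_output}'"
echo "-u: status=${u_status} (non-zero means the expansion was rejected)"
echo "pipefail off=${no_pipefail_status} on=${pipefail_status}"
```

Note that `-e` is ignored in some contexts, for example on the left side of `||` or inside an `if` condition, so an explicit status check in run_command may still be worth keeping.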

@zdenop zdenop closed this as completed in 4869406 Oct 20, 2018
@mgeerdsen
Contributor

@zdenop Actually, setting -u in tesstrain_utils.sh leads to unbound variable errors in tesstrain_utils.sh itself and also in language-specific.sh:

/usr/local/bin/language-specific.sh: line 889: FLAGS_webtext_prefix: unbound variable

This appears to fix the former, at least in my short test:

diff --git a/src/training/tesstrain_utils.sh b/src/training/tesstrain_utils.sh
index 220688dd..b173e21c 100644
--- a/src/training/tesstrain_utils.sh
+++ b/src/training/tesstrain_utils.sh
@@ -213,12 +213,9 @@ parse_flags() {
 
     # Take training text and wordlist from the langdata directory if not
     # specified in the command-line.
-    if [[ -z ${TRAINING_TEXT} ]]; then
-        TRAINING_TEXT=${LANGDATA_ROOT}/${LANG_CODE}/${LANG_CODE}.training_text
-    fi
-    if [[ -z ${WORDLIST_FILE} ]]; then
-        WORDLIST_FILE=${LANGDATA_ROOT}/${LANG_CODE}/${LANG_CODE}.wordlist
-    fi
+    TRAINING_TEXT=${TRAINING_TEXT:-${LANGDATA_ROOT}/${LANG_CODE}/${LANG_CODE}.training_text}
+    WORDLIST_FILE=${WORDLIST_FILE:-${LANGDATA_ROOT}/${LANG_CODE}/${LANG_CODE}.wordlist}
+
     WORD_BIGRAMS_FILE=${LANGDATA_ROOT}/${LANG_CODE}/${LANG_CODE}.word.bigrams
     NUMBERS_FILE=${LANGDATA_ROOT}/${LANG_CODE}/${LANG_CODE}.numbers
     PUNC_FILE=${LANGDATA_ROOT}/${LANG_CODE}/${LANG_CODE}.punc
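
The reason the `${VAR:-default}` form helps under `set -u` can be shown in isolation. This is a minimal sketch with made-up paths and a made-up `DEMO_PREFIX` variable; it only illustrates the expansion, not the actual flag handling in the training scripts:

```bash
#!/bin/bash
# Made-up default path, used only for this demonstration.
default_text="/hypothetical/langdata/eng/eng.training_text"

# Old pattern: [[ -z ${VAR} ]] expands an unset variable, which -u rejects.
( set -u; unset TRAINING_TEXT
  if [[ -z ${TRAINING_TEXT} ]]; then TRAINING_TEXT=${default_text}; fi ) 2>/dev/null
old_status=$?

# New pattern: ${VAR:-default} is allowed under -u and supplies the default.
new_value=$( set -u; unset TRAINING_TEXT
  TRAINING_TEXT=${TRAINING_TEXT:-${default_text}}
  echo "${TRAINING_TEXT}" )

# A value that is already set is left alone.
kept_value=$( set -u; TRAINING_TEXT=/my/own.training_text
  TRAINING_TEXT=${TRAINING_TEXT:-${default_text}}
  echo "${TRAINING_TEXT}" )

# An empty default makes an optional variable safe to expand under -u.
empty_value=$( set -u; unset DEMO_PREFIX; echo "[${DEMO_PREFIX:-}]" )

echo "old_status=${old_status}"
echo "new_value=${new_value}"
echo "kept_value=${kept_value}"
echo "empty_value=${empty_value}"
```

The empty-default form in the last case is one possible way to handle optional variables such as FLAGS_webtext_prefix, assuming an empty value is acceptable where they are used.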

@zdenop
Contributor Author

zdenop commented Oct 21, 2018

thanks!
