Can't run lstm training from tesseract 4.0 #1115

Closed
KevinZhuYF opened this issue Sep 10, 2017 · 4 comments

@KevinZhuYF

I'm trying to run this demo of LSTM training with Tesseract 4.0:

mkdir -p ~/tesstutorial/engoutput
training/lstmtraining --debug_interval 100 \
  --traineddata ~/tesstutorial/engtrain/eng/eng.traineddata \
  --net_spec '[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]' \
  --model_output ~/tesstutorial/engoutput/base --learning_rate 20e-4 \
  --train_listfile ~/tesstutorial/engtrain/eng.training_files.txt \
  --eval_listfile ~/tesstutorial/engeval/eng.training_files.txt \
  --max_iterations 5000 &> ~/tesstutorial/engoutput/basetrain.log

The system responds with this message:
Segmentation fault (core dumped)

The log file generated by this run contains:

mgr_.Init(traineddata_path.c_str()):Error:Assert failed:in file ../lstm/lstmtrainer.h, line 110

What should I do to make it work?


Environment

  • Tesseract Version: <4.0>
  • Commit Number:
  • Platform: <Ubuntu 16.04>


@Shreeshrii
Collaborator

Ray is currently updating the repository with new code. Try a commit from about a week ago and check whether the problem persists.

@Shreeshrii
Collaborator

Shreeshrii commented Sep 15, 2017

The following example shows the command line for training from scratch. Try it with the default training data created with the command-lines above.

Did you first create the required traineddata by running the following command?

training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng --linedata_only \
  --noextract_font_properties --langdata_dir ../langdata \
  --tessdata_dir ./tessdata --output_dir ~/tesstutorial/engtrain

@amitdo
Collaborator

amitdo commented Oct 1, 2017

@KevinZhuYF, please respond to @Shreeshrii's question.

@Shreeshrii
Collaborator

Shreeshrii commented Oct 2, 2017

mgr_.Init(traineddata_path.c_str()):Error:Assert failed:in file ../lstm/lstmtrainer.h, line 110

This error occurs when the traineddata file named in the training command (the --traineddata argument) does not exist. There is a similar issue: #1075.

It would be nice to have a more user-friendly message.

This issue can be closed with a link to the other one, which has more details.
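
Since the assert gives no hint about which file is missing, a small pre-flight check before launching lstmtraining can help. The following is only a sketch: the function name check_training_inputs is made up for illustration, and the paths are the ones from the command in this thread, so adjust them to your own layout.

```shell
#!/usr/bin/env bash
# Pre-flight check: confirm the files passed to lstmtraining exist first.
# check_training_inputs is a hypothetical helper, not part of Tesseract.

# Report each missing or empty file on stderr; return non-zero if any is missing.
check_training_inputs() {
  local missing=0 f
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Paths taken from the training command quoted in this issue.
if check_training_inputs \
    "$HOME/tesstutorial/engtrain/eng/eng.traineddata" \
    "$HOME/tesstutorial/engtrain/eng.training_files.txt" \
    "$HOME/tesstutorial/engeval/eng.training_files.txt"; then
  echo "all inputs found, lstmtraining can be started"
else
  echo "create the missing files (for example with tesstrain.sh) first" >&2
fi
```

If eng.traineddata is reported missing, the tesstrain.sh step quoted earlier in this thread is the likely place to look.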

@zdenop zdenop closed this as completed Sep 29, 2018