mgr_.Init(traineddata_path.c_str()):Error:Assert failed: #1075

Closed
iuriigalaida opened this issue Aug 11, 2017 · 19 comments

Comments

@iuriigalaida

iuriigalaida commented Aug 11, 2017

I got the error 'mgr_.Init(traineddata_path.c_str()):Error:Assert failed:in file ../../../../lstm/lstmtrainer.h, line 110' when trying to run the command below. Can somebody help me with this?
All paths are correct.
Tesseract version: 4.0

c:\Temp\tesseract-master\tesseract-master\api>lstmtraining --model_output C:/Temp/engbest/model --continue_from C:/Temp/engbest/eng.lstm --traineddata C:/Temp/engbest/eng.traineddata --old_traineddata C:/Temp/tesseract-master/tesseract-master/tessdata/eng.traineddata --train_listfile C:/Temp/engbest/eng.training_files.txt --max_iterations 3600

@stephenyong2005

Did you put the traineddata in the right directory?
--traineddata C:/Temp/engbest/eng.traineddata
Please check whether eng.traineddata exists in C:/Temp/engbest/. If not, copy eng.traineddata to C:/Temp/engbest/.

I think that will resolve it. I ran into the same problem, and this is how I fixed it.

@iuriigalaida
Author

It helped, thanks. But then I got:

Loaded file C:/Temp/engbest/eng.lstm, unpacking...
Warning: LSTMTrainer deserialized an LSTMRecognizer!
Code range changed from 105 to 105!!
Failed to continue from: C:/Temp/engbest/eng.lstm

@stephenyong2005 Did you see an error like this in your case?

@stephenyong2005

stephenyong2005 commented Aug 13, 2017 via email

@Shreeshrii
Collaborator

Are you using the latest source from github for building tesseract? What is your version info, git log info?

@Shreeshrii
Collaborator

Also see #1069 regarding the 'Failed to continue from' error.

@stephenyong2005

stephenyong2005 commented Aug 14, 2017 via email

@stephenyong2005

stephenyong2005 commented Aug 14, 2017 via email

@iuriigalaida
Author

After I got the latest v4 sources, lstmtraining started working, but I almost always get errors like
Encoding of string failed!
Can't encode transcription:
even when I set the path to the unicharset file directly.

@Shreeshrii
Collaborator

Shreeshrii commented Aug 14, 2017 via email

@iuriigalaida
Author

Yes. I have 44 new characters and I want to extend the English traineddata. I have attached my unicharset file. How can I avoid these encoding issues?

@Shreeshrii
Collaborator

The unicharset was not attached.

see https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#training-just-a-few-layers

Have you tried Latin.traineddata? It will have more characters in it than eng.

@iuriigalaida
Author

eng.zip

@Shreeshrii
Collaborator

Shreeshrii commented Aug 14, 2017 via email

@Shreeshrii
Collaborator

Re: 'mgr_.Init(traineddata_path.c_str()):Error:Assert failed:in file ../../../../lstm/lstmtrainer.h, line 110'

This error occurs when the path to the traineddata file is incorrect. Check the file locations and correct the training command.
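
One way to check file locations before starting a long training run is a small pre-flight check over every path given on the command line. The following is only a sketch, assuming a C++17 compiler; `MissingFiles` is a hypothetical helper and is not part of Tesseract.

```cpp
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical helper (not part of Tesseract): returns the subset of `paths`
// that do not refer to an existing regular file.
static std::vector<std::string> MissingFiles(const std::vector<std::string> &paths) {
  std::vector<std::string> missing;
  for (const std::string &p : paths) {
    std::error_code ec;  // non-throwing overload: errors count as "missing"
    if (!std::filesystem::is_regular_file(p, ec)) {
      missing.push_back(p);
    }
  }
  return missing;
}
```

Passing the values of --traineddata, --old_traineddata, --continue_from and --train_listfile to such a helper would show which file is absent before lstmtraining reaches the assertion.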

@amitdo
Collaborator

amitdo commented Sep 16, 2017

The command should use tprintf() with a meaningful message and exit(), not assert.
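
A minimal sketch of that idea, under stated assumptions: `CanOpenFile` and `CheckTraineddataOrReport` are hypothetical names, and `std::fprintf(stderr, ...)` stands in for `tprintf()` so the example compiles on its own. It is not the code that was eventually committed.

```cpp
#include <cstdio>

// Hypothetical stand-in for the check that currently sits inside the
// assertion: true if the file can be opened for reading.
static bool CanOpenFile(const char *filename) {
  std::FILE *fp = std::fopen(filename, "rb");
  if (fp == nullptr) {
    return false;
  }
  std::fclose(fp);
  return true;
}

// Reports a readable error naming the file instead of asserting.
// Returns false on failure so the caller can decide how to exit.
static bool CheckTraineddataOrReport(const char *traineddata_path) {
  if (!CanOpenFile(traineddata_path)) {
    std::fprintf(stderr, "Error: cannot open traineddata file '%s'\n",
                 traineddata_path);
    return false;
  }
  return true;
}
```

A command-line tool could call this early in main() and return EXIT_FAILURE when it yields false, so the user sees the offending path rather than a source file and line number.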

@Shreeshrii
Collaborator

@stweil for #1423

@amitdo
Collaborator

amitdo commented Sep 10, 2021

ASSERT_HOST(mgr_.Init(traineddata_path.c_str()));

@stweil stweil added this to the 5.0.0 milestone Sep 10, 2021
@stweil stweil added the bug label Sep 10, 2021
@zdenop
Contributor

zdenop commented Oct 30, 2021

What is the expected solution here? Just a note that the path does not exist?

diff --git a/src/ccmain/tessedit.cpp b/src/ccmain/tessedit.cpp
index 3ad8d921..6b094129 100644
--- a/src/ccmain/tessedit.cpp
+++ b/src/ccmain/tessedit.cpp
@@ -90,7 +90,6 @@ bool Tesseract::init_tesseract_lang_data(const std::string &arg0,
   // Initialize TessdataManager.
   std::string tessdata_path = language_data_path_prefix + kTrainedDataSuffix;
   if (!mgr->is_loaded() && !mgr->Init(tessdata_path.c_str())) {
-    tprintf("Error opening data file %s\n", tessdata_path.c_str());
     tprintf(
         "Please make sure the TESSDATA_PREFIX environment variable is set"
         " to your \"tessdata\" directory.\n");
diff --git a/src/ccutil/serialis.cpp b/src/ccutil/serialis.cpp
index d9c9a8d4..4356d027 100644
--- a/src/ccutil/serialis.cpp
+++ b/src/ccutil/serialis.cpp
@@ -21,6 +21,7 @@
 #include "errcode.h"

 #include "helpers.h" // for ReverseN
+#include "tprintf.h" // for tprintf

 #include <climits> // for INT_MAX
 #include <cstdio>
@@ -44,6 +45,8 @@ bool LoadDataFromFile(const char *filename, std::vector<char> *data) {
       result = static_cast<long>(fread(&(*data)[0], 1, size, fp)) == size;
     }
     fclose(fp);
+  } else {
+    tprintf("Error opening data file '%s'!\n", filename);
   }
   return result;
 }
diff --git a/src/ccutil/tessdatamanager.cpp b/src/ccutil/tessdatamanager.cpp
index 279cf7ac..1582ef92 100644
--- a/src/ccutil/tessdatamanager.cpp
+++ b/src/ccutil/tessdatamanager.cpp
@@ -82,6 +82,8 @@ bool TessdataManager::LoadArchiveFile(const char *filename) {
       result = is_loaded_;
     }
     archive_read_free(a);
+  } else {
+    tprintf("Error opening data file '%s'!\n", filename);
   }
   return result;
 }

stweil added a commit to stweil/tesseract that referenced this issue Oct 30, 2021
stweil added a commit that referenced this issue Oct 30, 2021
@stweil
Contributor

stweil commented Oct 30, 2021

Fixed in commit 68017db.
