Improve training messages (issue #3560) #3644

stweil · 2021-11-15T21:20:41Z

The old messages could wrongly be interpreted as CER / WER values,
but Tesseract training currently uses simple bag of characters /
bag of words error rates (see LSTMTrainer::ComputeCharError,
LSTMTrainer::ComputeWordError).

Signed-off-by: Stefan Weil <sw@weilnetz.de>

stweil · 2021-11-15T21:22:03Z

@bertsky, I used the message texts which you suggested in issue #3560.

bertsky

Yes, those are the relevant places I know of.

bertsky · 2021-11-16T09:16:23Z

src/training/unicharset/lstmtester.cpp

@@ -118,8 +118,8 @@ std::string LSTMTester::RunEvalSync(int iteration, const double *training_errors
  std::string result;


There's one more, on line 109 (Line Char error rate → Line BCER). I cannot use GH suggestions for it.

That's fixed now, too.


bertsky approved these changes Nov 16, 2021


stweil force-pushed the issue3560 branch from 06803c5 to 1650567 November 17, 2021 06:48

stweil force-pushed the issue3560 branch from 1650567 to aac9704 November 17, 2021 06:59

amitdo merged commit c716ebd into tesseract-ocr:main Nov 17, 2021

stweil deleted the issue3560 branch November 17, 2021 08:04

bertsky mentioned this pull request Mar 4, 2024

question: How to Diagnose Overfitting and Underfitting of Tesseract Models? tesseract-ocr/tesstrain#200

Open
