Tesseract OSD: retraining to detect scripts that are available in the script directory but currently undetected #18

omesh-sharma · 2020-10-30T09:39:42Z

Hey

I am using Tesseract OCR to extract text from images.

I would appreciate your suggestions on the points below.

How can I retrain the osd.traineddata file to add Ethiopic and other scripts? The current osd.traineddata file cannot detect some scripts (for example Ethiopic, Gujarati, and Gurmukhi), even though traineddata files for them are available in the script directory.
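To make the detection problem reproducible, here is a minimal sketch of how the script reported by OSD can be read. It assumes the `tesseract` binary is on the PATH with osd.traineddata installed, and that the `--psm 0` report is written to stdout as "Name: value" lines. The function names `parse_osd` and `detect_script` are hypothetical helpers, not part of Tesseract.

```python
import subprocess


def parse_osd(report: str) -> dict:
    """Turn an OSD report made of 'Name: value' lines into a dict of strings."""
    fields = {}
    for line in report.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that do not contain a colon
            fields[key.strip()] = value.strip()
    return fields


def detect_script(image_path: str) -> dict:
    """Run Tesseract in OSD-only mode (assumption: the report goes to stdout)."""
    result = subprocess.run(
        ["tesseract", image_path, "stdout", "--psm", "0"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_osd(result.stdout)
```

Calling `detect_script("page.png")["Script"]` on an Ethiopic, Gujarati, or Gurmukhi page would then show which script name OSD reports instead; `page.png` is a placeholder file name.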

Which is more accurate for text extraction: the language traineddata files or the script traineddata files?

Does using a script traineddata file instead of a language traineddata file make any difference to text extraction accuracy?
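One way to check this on my own data would be to run the same image through both models and score each result against a hand-typed transcription. The sketch below assumes the script models are installed in a `script/` subdirectory of tessdata, so that they can be selected with `-l script/Ethiopic`. The helper names and the similarity measure (a `difflib` ratio, not a proper character error rate) are my own choices for illustration.

```python
import difflib
import subprocess


def ocr_command(image_path: str, model: str) -> list:
    """Build the Tesseract command line for one model, e.g. 'amh' or 'script/Ethiopic'."""
    return ["tesseract", image_path, "stdout", "-l", model]


def run_ocr(image_path: str, model: str) -> str:
    """Run Tesseract and return the recognized text from stdout."""
    result = subprocess.run(
        ocr_command(image_path, model),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def similarity(ocr_text: str, truth: str) -> float:
    """Rough similarity in [0, 1] after collapsing runs of whitespace."""
    a = " ".join(ocr_text.split())
    b = " ".join(truth.split())
    return difflib.SequenceMatcher(None, a, b).ratio()


def compare_models(image_path: str, truth: str, models: list) -> dict:
    """Score each model against the same ground-truth transcription."""
    return {m: similarity(run_ocr(image_path, m), truth) for m in models}
```

For example, `compare_models("page.png", truth_text, ["amh", "script/Ethiopic"])` would return one score per model; `page.png` and `truth_text` are placeholders for a test image and its transcription.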

Please share your comments and suggestions on the points above, based on your experience with Tesseract.
It will be very helpful for my final-year project.

Contact: sharmaomesh0@gmail.com

Thanks and regards,
Omesh Sharma
