Add list of scripts to manpage for tesseract (#1347)
Shreeshrii authored and zdenop committed Feb 24, 2018
1 parent bb89dc3 commit 40f4311
Showing 1 changed file with 49 additions and 0 deletions.
49 changes: 49 additions & 0 deletions doc/tesseract.1.asc
@@ -244,6 +244,55 @@ To use a non-standard language pack named *foo.traineddata*, set the
*TESSDATA_PREFIX*/tessdata/*foo*.traineddata and give Tesseract the
argument '-l foo'.

SCRIPTS
-------
Traineddata files for the following scripts are also available for tesseract 4.00
at https://github.com/tesseract-ocr/tessdata_fast.
In most cases, each file covers all the languages that use that script, plus English.
This makes it possible to recognize a language that has no traineddata of its own
by using the traineddata for the script it is written in.
Arabic,
Armenian,
Bengali,
Canadian Aboriginal,
Cherokee,
Cyrillic,
Devanagari,
Ethiopic,
Fraktur,
Georgian,
Greek,
Gujarati,
Gurmukhi,
Han - Simplified,
Han - Simplified (vertical),
Han - Traditional,
Han - Traditional (vertical),
Hangul,
Hangul (vertical),
Hebrew,
Japanese,
Japanese (vertical),
Kannada,
Khmer,
Lao,
Latin,
Malayalam,
Myanmar,
Oriya (Odia),
Sinhala,
Syriac,
Tamil,
Telugu,
Thaana,
Thai,
Tibetan,
Vietnamese.
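
As an illustration of the point above, here is a minimal sketch of building a command line that selects a script model instead of a language model. It assumes the script file is installed as *TESSDATA_PREFIX*/tessdata/script/Devanagari.traineddata, so that the argument to '-l' becomes 'script/Devanagari'; that subdirectory layout, the helper name `build_tesseract_command`, and the file names `page.png` and `out` are assumptions for the example and are not stated in this manpage.

```python
from typing import List


def build_tesseract_command(image: str, output_base: str, script: str) -> List[str]:
    """Return an argument list that runs tesseract with a script model.

    Assumption: the traineddata file lives in a 'script' subdirectory of
    tessdata, so the value passed to -l is 'script/<name>'. If the file sits
    directly in tessdata, pass the bare name instead.
    """
    lang = "script/" + script
    return ["tesseract", image, output_base, "-l", lang]


if __name__ == "__main__":
    # Hypothetical input image and output base name.
    command = build_tesseract_command("page.png", "out", "Devanagari")
    print(" ".join(command))
```

The resulting list can be passed to `subprocess.run` on a system where tesseract and the chosen traineddata file are installed.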

CONFIG FILES AND AUGMENTING WITH USER DATA
------------------------------------------
