Commit

Update language list based on tessdata_fast; fix #1343
zdenop committed Feb 23, 2018
1 parent 6f80c35 commit 035325d
Showing 1 changed file with 15 additions and 2 deletions.
17 changes: 15 additions & 2 deletions doc/tesseract.1.asc
@@ -115,8 +115,9 @@ SINGLE OPTIONS
LANGUAGES
---------

There are currently language packs available for the following languages
(in https://github.com/tesseract-ocr/tessdata):
The currently available traineddata files for tesseract 4.00
for the following languages are in
(in https://github.com/tesseract-ocr/tessdata_fast):

*afr* (Afrikaans)
*amh* (Amharic)
@@ -176,47 +177,58 @@ There are currently language packs available for the following languages
*khm* (Central Khmer)
*kir* (Kirghiz; Kyrgyz)
*kor* (Korean)
*kor_vert* (Korean (vertical))
*kur* (Kurdish)
*kur_ara* (Kurdish (Arabic))
*lao* (Lao)
*lat* (Latin)
*lav* (Latvian)
*lit* (Lithuanian)
*ltz* (Luxembourgish)
*mal* (Malayalam)
*mar* (Marathi)
*mkd* (Macedonian)
*mlt* (Maltese)
*mon* (Mongolian)
*mri* (Maori)
*msa* (Malay)
*mya* (Burmese)
*nep* (Nepali)
*nld* (Dutch; Flemish)
*nor* (Norwegian)
*oci* (Occitan (post 1500))
*ori* (Oriya)
*osd* (Orientation and script detection module)
*pan* (Panjabi; Punjabi)
*pol* (Polish)
*por* (Portuguese)
*pus* (Pushto; Pashto)
*que* (Quechua)
*ron* (Romanian; Moldavian; Moldovan)
*rus* (Russian)
*san* (Sanskrit)
*sin* (Sinhala; Sinhalese)
*slk* (Slovak)
*slk_frak* (Slovak - Fraktur)
*slv* (Slovenian)
*snd* (Sindhi)
*spa* (Spanish; Castilian)
*spa_old* (Spanish; Castilian - Old)
*sqi* (Albanian)
*srp* (Serbian)
*srp_latn* (Serbian - Latin)
*sun* (Sundanese)
*swa* (Swahili)
*swe* (Swedish)
*syr* (Syriac)
*tam* (Tamil)
*tat* (Tatar)
*tel* (Telugu)
*tgk* (Tajik)
*tgl* (Tagalog)
*tha* (Thai)
*tir* (Tigrinya)
*ton* (Tonga)
*tur* (Turkish)
*uig* (Uighur; Uyghur)
*ukr* (Ukrainian)
@@ -225,6 +237,7 @@ There are currently language packs available for the following languages
*uzb_cyrl* (Uzbek - Cyrilic)
*vie* (Vietnamese)
*yid* (Yiddish)
*yor* (Yoruba)

To use a non-standard language pack named *foo.traineddata*, set the
*TESSDATA_PREFIX* environment variable so the file can be found at

1 comment on commit 035325d

@Shreeshrii (Collaborator)

Please also update the list of scripts - 38014a8