Add support for Shan language (shn) #33
Comments
It seems this repo isn't active or maintained.
That's correct, this repo is for the old Tesseract 3.05 and the legacy OCR recognizer.
Yes please, thanks @stweil.
@ronaldaug, do you want to prepare a pull request which adds
OK, I'll prepare and send a pull request to "tesseract-ocr/langdata_lstm" based on mya and other languages.
@stweil
Yes, the repo is active. I also noticed your pull request, but have not had time to review it yet. Ideally, Shan support and training should be done by someone who knows the language (so not by me).
Thanks for your quick response.
Could someone help me add support for the Shan language to Tesseract?
Shan language = https://en.wikipedia.org/wiki/Shan_language
Language code = shn
Shan Wiki = https://shn.wikipedia.org
All Shan words (including IPA) = jsonfile
Websites that are using Shan scripts = https://shannews.org/ , http://shanunicode.com/
Font = https://saosu-mp.github.io/font/PangLong/PangLong.ttf
Shan syllable break = https://github.com/kwarm/syllable-break
Some Shan characters such as
င သ တ ထ ပ မ ယ ရ လ ဝ ႉ း ွ ု ူ ိ ီ ် ၊ ။
are similar to Myanmar (Burmese) characters.
Thanks in advance.
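To make the overlap with Myanmar (Burmese) concrete, here is a small sketch that prints the Unicode code point and name of each character listed above and checks whether it falls in the Unicode Myanmar block (U+1000 to U+109F), which Shan shares with Burmese. The names `SAMPLE`, `MYANMAR_BLOCK`, and `describe` are illustrative only and are not part of Tesseract or its training tools.

```python
import unicodedata

# Characters copied from the list above (spaces separate the entries).
SAMPLE = "င သ တ ထ ပ မ ယ ရ လ ဝ ႉ း ွ ု ူ ိ ီ ် ၊ ။"

# The Unicode "Myanmar" block spans U+1000..U+109F.
MYANMAR_BLOCK = range(0x1000, 0x10A0)


def describe(text):
    """Return (code point, Unicode name, in-Myanmar-block) per non-space character."""
    rows = []
    for ch in text:
        if ch.isspace():
            continue
        code = f"U+{ord(ch):04X}"
        # Fall back to "UNKNOWN" if this Python build has no name for the character.
        name = unicodedata.name(ch, "UNKNOWN")
        rows.append((code, name, ord(ch) in MYANMAR_BLOCK))
    return rows


for code, name, in_block in describe(SAMPLE):
    print(code, name, "Myanmar block" if in_block else "other block")
```

Because the two scripts share one block, the existing `mya` data is probably a reasonable starting point for a Shan character set, though that is an assumption and not something confirmed in this thread.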