Uyghur Language #36

Closed
Abdusalamstd opened this issue Jul 5, 2020 · 8 comments
Labels: Language Request (Request for new language support)

Comments

@Abdusalamstd
Contributor

I would like to help add support for the Uyghur language. What needs to be done?

@arulrajnet
Contributor

Already explained here: #25 (comment)

@rkcosmos
Contributor

rkcosmos commented Jul 6, 2020

I'm not a language expert, so you'll have to help me understand the Uyghur language a bit. At first glance, it looks like Arabic. Let me ask a few questions.

  1. Does Uyghur use the same script as Arabic, or are there additional characters?
  2. I know that in Arabic there is a specific pattern for joining characters written next to each other to form a word. Does Uyghur follow the same pattern?

@Abdusalamstd
Contributor Author

> I'm not a language expert, so you'll have to help me understand the Uyghur language a bit. At first glance, it looks like Arabic. Let me ask a few questions.
>
>   1. Does Uyghur use the same script as Arabic, or are there additional characters?
>   2. I know that in Arabic there is a specific pattern for joining characters written next to each other to form a word. Does Uyghur follow the same pattern?

Reply:

  1. Uyghur has additional characters, so the script is not exactly the same as Arabic.
  2. Yes, Uyghur uses the same pattern.
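
To make the first answer a bit more concrete, here is a minimal sketch that flags letters falling outside the basic Arabic letter range (U+0621 to U+064A). The range is only a rough stand-in for "the same script as Arabic", and both the hand-picked `SAMPLE_LETTERS` list and the `extra_letters` helper are assumptions made for illustration; they are not taken from "ug_char.txt".

```python
import unicodedata

# Basic Arabic letters occupy U+0621..U+064A. Using this range as a proxy for
# "standard Arabic" is an assumption made for this illustration.
BASIC_ARABIC = range(0x0621, 0x064B)

# A few letters used in the Uyghur Arabic alphabet, picked by hand for this
# sketch (hypothetical sample, NOT the contents of ug_char.txt).
SAMPLE_LETTERS = [
    "\u0628",  # BEH, shared with Arabic
    "\u06D5",  # AE
    "\u06AD",  # NG
    "\u06C7",  # U
    "\u06C6",  # OE
    "\u06C8",  # YU
    "\u06CB",  # VE
    "\u06D0",  # E
]


def extra_letters(letters):
    """Return the letters whose code points fall outside the basic Arabic range."""
    return [ch for ch in letters if ord(ch) not in BASIC_ARABIC]


for ch in extra_letters(SAMPLE_LETTERS):
    # Print the code point and the official Unicode name of each extra letter.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Every letter in the sample except BEH is reported as an extra, which is why a character set built only for Arabic would likely not cover Uyghur text.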

@rkcosmos
Contributor

rkcosmos commented Jul 6, 2020

According to Wikipedia, it seems like you have four alphabets.

  1. Uyghur Arabic alphabet or UEY
  2. Uyghur Cyrillic alphabet or USY
  3. The Uyghur New Script or UYY
  4. Uyghur Latin alphabet or ULY

We currently have a Latin model; Arabic and Cyrillic are on the way. If Uyghur uses all four alphabets above, then it's not going to be easy. You can create a pull request to add all characters and words (see #25), but I cannot promise to do it in the near future, because my priority has to go to popular languages or to sets of languages that share most of their characters.

@Abdusalamstd
Contributor Author

> According to Wikipedia, it seems like you have four alphabets.
>
>   1. Uyghur Arabic alphabet or UEY
>   2. Uyghur Cyrillic alphabet or USY
>   3. The Uyghur New Script or UYY
>   4. Uyghur Latin alphabet or ULY
>
> We currently have a Latin model; Arabic and Cyrillic are on the way. If Uyghur uses all four alphabets above, then it's not going to be easy. You can create a pull request to add all characters and words (see #25), but I cannot promise to do it in the near future, because my priority has to go to popular languages or to sets of languages that share most of their characters.

Thanks for your reply!
Of the four alphabets above, the first (Uyghur Arabic alphabet, or UEY) is the most widely used, so please add just that one. I have finished the alphabet file "ug_char.txt" and am now preparing 'dict/ug.txt', the file of common Uyghur words.

@rkcosmos
Contributor

rkcosmos commented Jul 6, 2020

Please make sure 'dict/ug.txt' has enough words (other languages have ~30,000).
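
One way to sanity-check the two files before opening a pull request is sketched below. It assumes one entry per line in both files (one character per line in "ug_char.txt", one word per line in 'dict/ug.txt'), which is a guess about the layout rather than something stated in this thread. The helper names, the result keys, and the file paths in the `__main__` block are all hypothetical.

```python
from pathlib import Path

MIN_WORDS = 30000  # rough target mentioned in the thread, not a hard rule


def load_lines(path):
    """Read the non-empty, stripped lines of a UTF-8 text file."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def check_word_list(words, chars, min_words=MIN_WORDS):
    """Summarise a word list against a character set.

    Returns the number of unique words, whether that meets min_words,
    and the sorted characters used in words but absent from chars.
    """
    unique = set(words)
    allowed = set("".join(chars))
    missing = sorted({ch for word in unique for ch in word if ch not in allowed})
    return {
        "unique_words": len(unique),
        "enough": len(unique) >= min_words,
        "missing_chars": missing,
    }


if __name__ == "__main__":
    # Placeholder paths; adjust them to wherever the files actually live.
    char_path, dict_path = Path("ug_char.txt"), Path("dict/ug.txt")
    if char_path.exists() and dict_path.exists():
        print(check_word_list(load_lines(dict_path), load_lines(char_path)))
```

A non-empty `missing_chars` list would suggest that either the character file is incomplete or the word list contains stray symbols worth cleaning up.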

rkcosmos added the "Language Request" label on Jul 9, 2020
@hilaloytun

> "ug_char.txt", and now preparing the 'dict/ug.tx

Hello, if it's not too much to ask, could you please share the "ug_char.txt" and "dict/ug.tx" files you prepared for Uyghur, so that I can use them for my language?

@Abdusalamstd
Contributor Author

> Hello, if it's not too much to ask, could you please share the "ug_char.txt" and "dict/ug.tx" files you prepared for Uyghur, so that I can use them for my language?

You can download them from the EasyOCR project repository.


4 participants