
Question: About romanized languages detection #1

Open
loretoparisi opened this issue Sep 12, 2018 · 0 comments


First, thanks a lot for this amazing work! I'm trying to address the task of language detection for Indian languages. Currently my neural network can detect most of the languages I need with good accuracy. The problem is that I also need to detect the romanized versions of these languages, and I'm not sure that training that network (fastText) on a corpus of romanized text would produce anything meaningful. I can see that this tool uses n-grams (trigrams) and a frequency-based model.
My question is whether this approach could be used for the romanized version of a language like "hin" or "urd", or whether it cannot be done at all.
Thanks a lot.
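
To make the question concrete, the trigram frequency approach can be sketched roughly as follows. This is a minimal Python illustration that compares ranked trigram profiles with an "out-of-place" distance; it is not this tool's actual implementation. The function names, the `hin-Latn` label, and the tiny training strings are all assumptions invented for the example. Nothing in the procedure depends on the script, so it could in principle be run on romanized text, provided romanized training data is available.

```python
from collections import Counter


def trigrams(text):
    # Lowercase, collapse whitespace, and pad with spaces so that
    # word boundaries contribute their own trigrams.
    padded = " " + " ".join(text.lower().split()) + " "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def build_profile(text, top_n=300):
    # Map each of the most frequent trigrams to its frequency rank (0 = most frequent).
    counts = Counter(trigrams(text))
    ranked = [gram for gram, _ in counts.most_common(top_n)]
    return {gram: rank for rank, gram in enumerate(ranked)}


def distance(sample_profile, lang_profile, max_penalty=300):
    # Sum of rank differences; trigrams unknown to the language get a fixed penalty.
    total = 0
    for gram, rank in sample_profile.items():
        if gram in lang_profile:
            total += abs(rank - lang_profile[gram])
        else:
            total += max_penalty
    return total


def detect(text, profiles):
    # Return the label whose profile is closest to the sample's profile.
    sample = build_profile(text)
    return min(profiles, key=lambda lang: distance(sample, profiles[lang]))


# Toy training text, made up for this sketch; a real model needs far more data.
TRAINING = {
    "hin-Latn": "mera naam rahul hai aur main dilli mein rehta hoon aap kaise hain",
    "eng": "my name is rahul and i live in delhi how are you today",
}
PROFILES = {lang: build_profile(text) for lang, text in TRAINING.items()}

print(detect("aap ka naam kya hai", PROFILES))
```

Whether such profiles can separate closely related romanized languages (for example "hin" and "urd") is something that would have to be measured on real data.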
