Naive Language Detector

Detect the language of a given text in Python.

This simple algorithm should work well on long texts (news articles, emails, documents, etc.).

Currently supports 41 languages:

['el', 'en', 'zh', 'af', 'ca', 'it', 'cs', 'ar', 'eu', 'et', 'az', 'id', 'es', 'ru', 'nl', 'pt', 'nb', 'tr', 'lv', 'lt', 'th', 'ro', 'is', 'pl', 'be', 'fr', 'bg', 'uk', 'hr', 'bn', 'de', 'da', 'fa', 'hi', 'bs', 'fi', 'hu', 'he', 'kk', 'sq', 'sv', 'mk', 'ur', 'sk', 'si', 'ms', 'sl']

Test Code

  import language_detector
  language_detector.test()
    

Training data was downloaded from [here](http://invokeit.wordpress.com/frequency-word-lists/).
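
To illustrate how frequency word lists can drive language detection, here is a minimal sketch. It is not this package's actual implementation: the `TOP_WORDS` table and the `detect_language` function are hypothetical names, and the three tiny word sets are hand-written stand-ins for the real frequency lists. The idea shown is simply to count how many tokens of the input appear among each language's most common words and pick the language with the highest count.

```python
import re
from collections import Counter

# Hypothetical stand-ins for per-language frequency word lists.
# Real lists would hold thousands of words per language.
TOP_WORDS = {
    "en": {"the", "and", "of", "to", "is", "in", "that", "it"},
    "fr": {"le", "la", "et", "les", "des", "est", "que", "une"},
    "de": {"der", "die", "und", "das", "ist", "nicht", "ein", "zu"},
}


def detect_language(text, top_words=TOP_WORDS):
    """Return the code of the language whose common words match the most
    tokens in `text`, or None if no token matches any language."""
    # Lowercase the text and split it into word tokens.
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return None
    counts = Counter(tokens)
    # Score each language by the number of token occurrences
    # that belong to its common-word set.
    scores = {
        lang: sum(n for word, n in counts.items() if word in words)
        for lang, words in top_words.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(detect_language("The cat is in the garden and it is happy"))
```

A scheme like this depends on the input containing enough common words, which fits the note above that detection works best on long texts.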