ThiBsc/ldetect

ldetect

License: MIT
A basic language detector based on word dictionaries and letter frequency analysis

Features

  • Detection by dictionaries (based on @Alir3z4 stop-words)
  • Detection by letter frequencies
  • Tkinter GUI
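
To illustrate the dictionary-based feature, here is a minimal sketch of stop-word scoring. It is not the project's implementation: the `STOP_WORDS` sets, `tokenize`, and `detect_by_stop_words` are hypothetical names introduced for this example, and the word sets are tiny illustrative samples rather than the real dictionary files.

```python
import re
from collections import Counter

# Tiny hypothetical word sets for illustration only;
# real dictionaries would be far larger.
STOP_WORDS = {
    "fr": {"il", "le", "la", "les", "est", "et", "un", "une", "fait"},
    "en": {"the", "is", "and", "a", "of", "to", "in", "it"},
    "es": {"el", "la", "los", "es", "y", "un", "una", "de"},
}


def tokenize(text):
    """Lowercase the text and split it into words (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def detect_by_stop_words(text, stop_words=STOP_WORDS):
    """Return the language whose word set matches the most tokens, or None."""
    tokens = tokenize(text)
    scores = Counter()
    for lang, words in stop_words.items():
        # Score = number of tokens found in this language's word set.
        scores[lang] = sum(1 for token in tokens if token in words)
    if not scores:
        return None
    best_lang, best_score = scores.most_common(1)[0]
    # No match at all means the text cannot be classified this way.
    return best_lang if best_score > 0 else None
```

With these sample sets, `detect_by_stop_words('Il fait beau')` picks `'fr'` because two of the three tokens appear in the French set and none appear in the others. A text with no known word yields `None`, which is where a second strategy such as letter frequencies can take over.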

Code sample

Detection quality depends on the quality of the dictionaries and on the length of the input text.

from languagedetector import LanguageDetector

ld = LanguageDetector()

# Register one word dictionary per language code
ld.addDict('fr', 'dict/dico_fr.txt')
ld.addDict('en', 'dict/dico_en.txt')
ld.addDict('es', 'dict/dico_es.txt')
ld.addDict('it', 'dict/dico_it.txt')
ld.addDict('ru', 'dict/dico_ru.txt')

# The expected result is shown in the trailing comment
ld.detect('Il fait beau') # 'fr'
ld.detect('The weather is fine') # 'en'
ld.detect('El tiempo es bueno') # 'es'
ld.detect('Il tempo è bello') # 'it'
ld.detect('Это приятно') # 'ru'
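
The sample above only exercises the dictionaries. The letter-frequency feature could look roughly like the sketch below, which compares the letter distribution of the input with one reference profile per language using cosine similarity. The similarity measure, the function names, and the `PROFILES` built from single sentences are all assumptions made for this example; the project may compute and compare frequencies differently.

```python
import math
from collections import Counter


def letter_frequencies(text):
    """Return the relative frequency of each alphabetic character in the text."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    total = len(letters)
    if total == 0:
        return {}
    return {ch: count / total for ch, count in Counter(letters).items()}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(p[ch] * q.get(ch, 0.0) for ch in p)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def detect_by_letters(text, profiles):
    """Return the language whose reference profile is closest to the text."""
    sample = letter_frequencies(text)
    if not sample or not profiles:
        return None
    return max(profiles, key=lambda lang: cosine_similarity(sample, profiles[lang]))


# Hypothetical reference profiles built from one sentence each;
# a usable profile would be built from a much larger corpus.
PROFILES = {
    "en": letter_frequencies(
        "the quick brown fox jumps over the lazy dog and the weather is fine"
    ),
    "ru": letter_frequencies("съешь ещё этих мягких французских булок да выпей чаю"),
}
```

Calling `detect_by_letters('Это приятно', PROFILES)` selects `'ru'` here simply because the input shares no letters with the English profile. Languages that share an alphabet are much harder to separate this way, and short inputs give noisy frequency estimates, which is one reason text length matters.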

GUI

[Screenshot: gui]
