Description: Language Identification with Ruby: probabilistic language identification with ruby1.9
Clone URL: git://github.com/snifty/whatlang.git
Files:
  README
  generate_models.rb
  lid.rb
  models/  (Tue Jul 08 15:30:39 -0700 2008: better format of sample files [snifty])
README
NB: Requires ruby1.9.

A module that identifies which of a number of human languages a given text is written in.

We compare an unknown text to a set of models of known languages, using a simple similarity measure between
bigram frequency counts.
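
The comparison described above could look roughly like the sketch below. It is not taken from lid.rb: it assumes
character bigrams and cosine similarity, and the names bigram_counts, cosine_similarity, identify and MODELS are
hypothetical, chosen only for illustration. The toy models are built from one sentence each rather than from the
full sample files.

```ruby
# Sketch only: character bigrams and cosine similarity are assumptions,
# not necessarily what lid.rb actually does.

# Count character bigrams in a text (hypothetical helper).
def bigram_counts(text)
  counts = Hash.new(0)
  text.downcase.each_char.each_cons(2) { |a, b| counts[a + b] += 1 }
  counts
end

# Cosine similarity between two bigram-count hashes; 0.0 if either is empty.
def cosine_similarity(a, b)
  dot    = a.inject(0.0) { |sum, (gram, n)| sum + n * b.fetch(gram, 0) }
  norm_a = Math.sqrt(a.values.inject(0.0) { |sum, n| sum + n * n })
  norm_b = Math.sqrt(b.values.inject(0.0) { |sum, n| sum + n * n })
  return 0.0 if norm_a.zero? || norm_b.zero?
  dot / (norm_a * norm_b)
end

# Return the label of the model most similar to the unknown text.
def identify(text, models)
  unknown = bigram_counts(text)
  models.max_by { |_label, model| cosine_similarity(unknown, model) }.first
end

# Toy models built from a single sentence each (assumption for the demo).
MODELS = {
  "english" => bigram_counts("all human beings are born free and equal in dignity and rights"),
  "spanish" => bigram_counts("todos los seres humanos nacen libres e iguales en dignidad y derechos")
}

puts identify("the rights of human beings", MODELS)  # prints "english"
```

Cosine similarity is used here because it ignores the difference in length between a short unknown text and a
long model; any other measure over the same counts would slot into identify in the same way.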

The language models are built from samples downloaded from:

  http://www.unicode.org/udhr/downloads.html

These samples are copied into models/.