Crystal port of franc.
It is not a state-of-the-art language identification algorithm, but it reaches 90%+ accuracy on sufficiently long text samples.
It supports 400+ languages.
It identifies a text sample by extracting its trigrams (sequences of three characters) and comparing them to the most frequent trigrams extracted from translations of the Universal Declaration of Human Rights (UDHR) in each supported language.
Language Detector returns the ISO 639-1 two-letter language code of the most probable guess.
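The trigram comparison described above can be sketched roughly as follows. This is a toy illustration, not the shard's actual code: the helper names (`ranked_trigrams`, `distance`, `detect`), the `MAX_PENALTY` value, and the two tiny hand-made profiles are all assumptions made for the example. The real profiles are built from UDHR translations, and this sketch only handles ASCII letters.

```crystal
# Toy sketch of trigram-profile matching (hypothetical helpers, not the shard's API).

# Assumed penalty for a trigram that is missing from a language profile.
MAX_PENALTY = 300

# Returns the trigrams of `text`, ordered from most to least frequent.
# Ties are broken alphabetically so the ranking is deterministic.
def ranked_trigrams(text)
  # Simplification: keep ASCII letters only and pad with spaces.
  padded = " " + text.downcase.gsub(/[^a-z]+/, " ").strip + " "
  counts = (0..padded.size - 3).map { |i| padded[i, 3] }.tally
  counts.keys.sort do |a, b|
    by_count = counts[b] <=> counts[a]
    by_count == 0 ? (a <=> b) : by_count
  end
end

# "Out of place" distance: sum of rank differences between the sample
# and a language profile, with a fixed penalty for unknown trigrams.
def distance(sample, profile)
  total = 0
  sample.each_with_index do |trigram, rank|
    expected = profile.index(trigram)
    total += expected ? (expected - rank).abs : MAX_PENALTY
  end
  total
end

# Returns the code of the profile closest to the text.
def detect(text, profiles)
  sample = ranked_trigrams(text)
  best = profiles.min_by { |_code, profile| distance(sample, profile) }
  best[0]
end

# Hand-made stand-ins for the UDHR-derived profiles (assumption).
PROFILES = {
  "en" => ranked_trigrams("the quick brown fox jumps over the lazy dog and the cat"),
  "fr" => ranked_trigrams("le renard brun saute par-dessus le chien paresseux et le chat"),
}

puts detect("the lazy dog", PROFILES)      # prints en
puts detect("le chat paresseux", PROFILES) # prints fr
```

With profiles this small the result is only meaningful for text that overlaps the toy sentences. Short samples yield few trigrams to compare, which is why accuracy improves with longer text.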
Add the dependency to your `shard.yml`:

    dependencies:
      cadmium_language_detector:
        github: cadmiumcr/language_detector


    require "cadmium_language_detector"

    text = "Alice was published in 1865, three years after Charles Lutwidge Dodgson and the Reverend Robinson Duckworth rowed in a boat, on 4 July 1862 (this popular date of the golden afternoon might be a confusion or even another Alice-tale, for that particular day was cool, cloudy and rainy), up the Isis with the three young daughters of Henry Liddell (the Vice-Chancellor of Oxford University and Dean of Christ Church): Lorina Charlotte Liddell (aged 13, born 1849) (Prima in the book's prefatory verse); Alice Pleasance Liddell (aged 10, born 1852) (Secunda in the prefatory verse); Edith Mary Liddell (aged 8, born 1853) (Tertia in the prefatory verse). The journey began at Folly Bridge near Oxford and ended five miles away in the village of Godstow. During the trip Charles Dodgson told the girls a story that featured a bored little girl named Alice who goes looking for an adventure. The girls loved it, and Alice Liddell asked Dodgson to write it down for her. He began writing the manuscript of the story the next day, although that earliest version no longer exists. The girls and Dodgson took another boat trip a month later when he elaborated the plot to the story of Alice, and in November he began working on the manuscript in earnest."

    pp LanguageDetector.new.detect(text) # => "en"

- Fork it (https://github.com/cadmiumcr/language_detector/fork)
- Create your feature branch (`git checkout -b my-new-feature`)
- Commit your changes (`git commit -am 'Add some feature'`)
- Push to the branch (`git push origin my-new-feature`)
- Create a new Pull Request
- Rémy Marronnier - creator and maintainer