Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
ICU Unicode Transliteration and Collation in Ruby.
C Ruby
Tag: v0.1.0

Fetching latest commit…

Cannot retrieve the latest commit at this time

Failed to load latest commit information.
ext
test
.gitignore
LICENSE
README.rdoc
Rakefile
VERSION.yml
icunicode.gemspec

README.rdoc

ICUnicode

Unicode sorting is complicated (unicode.org/reports/tr10), and Ruby doesn't do it correctly. But there is a widely-used implementation of the Unicode collation algorithm in the ICU (International Components for Unicode) libraries. There is also no way to do Transliteration in Ruby (userguide.icu-project.org/transforms/general). This gem is a simple C wrapper around ucol_getSortKey from the ICU Collation API and utrans_transUChars from the ICU Transliteration API. These are added as simple methods on String.

Usage:

["cafe", "cafes", "caf\303\251"].sort
=> ["cafe", "cafes", "caf\303\251"]

require 'icunicode'

["cafe", "cafes", "caf\303\251"].sort_by {|s| s.unicode_sort_key}
=> ["cafe", "caf\303\251", "cafes"]

"blueberry".transliterate("Katakana").transliterate("Latin")
=> "burueberrui"

"blueberry".transliterate("Greek").transliterate("Latin")
=> "blyeberry"

Install:

sudo gem install icunicode

To do:

Add support for locales other than en-US. Increase buffer size or make it grow dynamically.

License:

Copyright © 2009 Justin Balthrop, Geni.com; Published under The MIT License, see LICENSE

Something went wrong with that request. Please try again.