Text::Hyphen is a Ruby library to hyphenate words in various languages using Ruby-fied versions of TeX hyphenation patterns. It will properly hyphenate various words according to the rules of the language the word is written in. The algorithm is based on that of the TeX typesetting system by Donald E. Knuth.
Text::Hyphen was originally based on the Perl implementation of TeX::Hyphen and its Ruby port. The language hyphenation pattern files are based on the sources available from CTAN as of 2004.12.19 and have been manually translated by Austin Ziegler.
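To give a feel for how TeX-style pattern hyphenation works in general, here is a minimal sketch in plain Ruby. It is not the code used by Text::Hyphen: the names `TOY_PATTERNS`, `parse_pattern`, and `toy_hyphenation_points` are hypothetical, and the pattern list is a tiny hand-picked set for illustration, whereas real pattern files contain thousands of entries. Digits inside a pattern score the gaps between letters; for each gap the highest score from any matching pattern wins, and an odd score marks a permitted break.

```ruby
# Toy pattern set for illustration only (an assumption, not the library's data).
TOY_PATTERNS = %w[hy3ph he2n hena4 hen5at 1na n2at 1tio 2io o2n].freeze

# Split a pattern such as "hy3ph" into its letters ("hyph") and the digit
# found at each gap around those letters ([0, 0, 3, 0, 0]).
def parse_pattern(pattern)
  letters = +""
  values = [0]
  pattern.each_char do |ch|
    if ch =~ /\d/
      values[-1] = ch.to_i
    else
      letters << ch
      values << 0
    end
  end
  [letters, values]
end

# Return the positions (counted in letters from the start of the word)
# where a hyphen may be inserted, honouring left/right minimum widths.
def toy_hyphenation_points(word, patterns = TOY_PATTERNS, left: 2, right: 2)
  dotted = ".#{word.downcase}."          # dots mark the word boundaries
  scores = Array.new(dotted.length + 1, 0)

  patterns.each do |pattern|
    letters, values = parse_pattern(pattern)
    start = 0
    while (index = dotted.index(letters, start))
      values.each_with_index do |value, offset|
        slot = index + offset
        scores[slot] = value if value > scores[slot]   # keep the maximum
      end
      start = index + 1
    end
  end

  # scores[k] is the gap before dotted[k], and dotted[k] is word[k - 1],
  # so the gap after the first n letters of the word is scores[n + 1].
  (left..(word.length - right)).select { |n| scores[n + 1].odd? }
end

p toy_hyphenation_points("hyphenation") #=> [2, 6]
```

With this toy pattern set, the points `[2, 6]` correspond to hy-phen-ation. Words not covered by the handful of patterns above simply get no break points.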
This is a minor release fixing documentation errors. This release provides both Ruby 1.8.7 and Ruby 1.9.2 support. This is the last major release supporting Ruby 1.8 interpreters. Future versions will only work with Ruby 1.9 or later interpreters.
    require 'text/hyphen'

    hh = Text::Hyphen.new(:language => 'en_us', :left => 2, :right => 2)
    # Defaults to the above
    hh = Text::Hyphen.new

    word = "representation"
    points = hh.hyphenate(word) #=> [3, 5, 8, 10]
    puts hh.visualize(word) #=> rep-re-sen-ta-tion
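The relationship between the points returned by `hyphenate` and the string printed by `visualize` can be sketched without the library. The helper below, `insert_hyphens`, is a hypothetical name and only an assumption about how such a step could be written; it is not taken from the Text::Hyphen source. Each point is the number of characters that precede a possible break.

```ruby
# Insert a hyphen string at each break point of a word.
# Points are processed from right to left so that earlier insertions
# do not shift the positions that are still to be handled.
def insert_hyphens(word, points, hyphen = "-")
  result = word.dup
  points.sort.reverse_each { |point| result.insert(point, hyphen) }
  result
end

puts insert_hyphens("representation", [3, 5, 8, 10]) #=> rep-re-sen-ta-tion
```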
Text::Hyphen is truly multilingual, with 29 languages or language variants supported. As an example, consider the difference between the following:
    require 'text/hyphen'

    # Using left and right minimum values of 0 ensures that you will see all
    # possible hyphenation points, not just those that meet the minimum width
    # requirements.
    en = Text::Hyphen.new(:left => 0, :right => 0)
    fr = Text::Hyphen.new(:language => "fr", :left => 0, :right => 0)
    puts en.visualise("organiser") #=> or-gan-iser
    puts fr.visualise("organiser") #=> or-ga-ni-ser
As you can see, the two hyphenators break the same word differently. Additional improvements over TeX::Hyphen include thread safety (except for debug control) and support for UTF-8 under Ruby 1.9.
$ gem install text-hyphen
After checking out the source, run:
$ rake newb
This task will install any missing dependencies, run the tests/specs, and generate the RDoc.