Hebrew - English Transliteration Engine


TranslitKit


TranslitKit is a framework for Hebrew-English transliteration.

Installation

gem install translit_kit
# in your Gemfile
gem 'translit_kit'

Requires Ruby 2.2 or later.

Usage

Basic transliteration

  require 'translit_kit'
  word = HebrewWord.new "אַברָהָם"
  word.transliterate(:single)
  # => ["avrohom"]

  # Shortcut
  word.t(:single)
  # => ["avrohom"]

Transliteration is powered by phoneme maps: files that map Hebrew phonemes (units of sound) to English characters.

Three phoneme maps are provided: :long, :short, and :single. You can also add your own (see "Adding Custom Phoneme maps" below).

word.t(:single)
# => ["avrohom"]
word.t(:short)
# => ["avroom", "avroam", "avroem", "avrohom", "avroham",
# "avrohem", "avraom", "avraam", "avraem", "avrahom",
# "avraham", "avrahem", "avreom", "avream", "avreem",
# "avrehom", "avreham", "avrehem" ]
word.t(:long)
# => ["avroom", "avrooom", "avroohm", ... ] # 5,997 more!

The default is :short:

  word.t == word.t(:short)
  # => true

To get the total permutation count for each built-in map, call HebrewWord#inspect:

word.inspect
# => "אַברָהָם: Permutations: 1 single | 18 short | 6000 long"
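One way to read these counts is as a product of per-phoneme choices: if each phoneme in a word has a list of possible replacements, every combination of one choice per phoneme yields one transliteration. The sketch below is not the gem's implementation. The `CHOICES` list is a hypothetical reconstruction inferred from the :short output shown earlier, and `permutations` is an illustrative helper.

```ruby
# Hypothetical per-phoneme replacement lists, inferred from the :short
# results in this README. They are NOT read from the gem's map files.
CHOICES = [%w[a], %w[v], %w[r], %w[o a e], ['', 'h'], %w[o a e], %w[m]].freeze

# Build every combination of one replacement per phoneme (a Cartesian
# product) and join each combination into a single string.
def permutations(choices)
  first, *rest = choices
  first.product(*rest).map(&:join)
end

results = permutations(CHOICES)
puts results.length        # 1 * 1 * 1 * 3 * 2 * 3 * 1 combinations
puts results.first(4).inspect
```

Under this reading, a map that offers more alternatives per phoneme multiplies the count quickly, which is consistent with :long producing far more results than :short.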

Adding Custom Phoneme maps

Format

Phoneme Maps are simply JSON files, placed in the lib/phoneme_maps directory.

The file should map each String (a phoneme) to an Array of replacement characters.

{
  "ב": ["v"],
  "בּ": ["b", "bb"]
}

A phoneme can be a Hebrew character (א), a nekuda (ָ), or a character with modifiers, such as a dagesh (בּ). Keep in mind that many characters will be normalized (see "Appendix: Pre-Processing" below).
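Before installing a custom map, it can help to check that the file has the shape described above. The following is a minimal sketch using Ruby's standard `json` library. `valid_phoneme_map?` is a hypothetical helper, not part of TranslitKit, and it only checks the String-to-Array-of-Strings structure.

```ruby
require 'json'

# Hypothetical helper (not part of TranslitKit): returns true when every
# key is a String and every value is an Array of Strings.
def valid_phoneme_map?(map)
  return false unless map.is_a?(Hash)

  map.all? do |phoneme, replacements|
    phoneme.is_a?(String) &&
      replacements.is_a?(Array) &&
      replacements.all? { |r| r.is_a?(String) }
  end
end

# The sample map from this section, parsed from a JSON string.
map = JSON.parse('{ "ב": ["v"], "בּ": ["b", "bb"] }')
puts valid_phoneme_map?(map)
```

A map whose value is a bare string, such as `{ "ב": "v" }`, would be rejected by this check.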

Installation

To install your custom map, place the file in lib/resources

Your file will be available as the symbol :<filename>, without the .json extension.

Example: klingon.json becomes :klingon
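The naming rule can be sketched with Ruby's standard `File.basename`. This only illustrates the filename-to-symbol convention. `map_name_for` is a hypothetical helper, and how the gem derives the name internally is an assumption.

```ruby
# Hypothetical helper: derive a map's symbol from its file path by
# dropping the directory and the .json extension.
def map_name_for(path)
  File.basename(path, '.json').to_sym
end

puts map_name_for('lib/resources/klingon.json').inspect
```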

Now you can use it anywhere:

  word.transliterate(:klingon)
  # => (Results)

At present, custom maps do not appear in the output of HebrewWord#inspect.

Contributing

TranslitKit is currently maintained by @AnalyzePlatypus. Contributions welcome!

Appendix: Pre-Processing

When a word is transliterated, it is pre-processed to normalize certain characters. Specifically:

  • Whitespace is stripped
  • The final letters [םןךףץ] are normalized to their standard forms
  • CHATAF nekudos ['ֲ','ֳ','ֱ'] are normalized to their standard forms
  • Full CHIRIK, TZEIREI, and CHOLOM nekudos have their letters removed
  • DAGESH characters are removed from all but the characters [בוכפת]
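Most of the steps above can be approximated in a few lines of plain Ruby. This is a sketch under stated assumptions, not the gem's pre-processor: `preprocess` is a hypothetical name, "stripped" is read as removing leading and trailing whitespace, a dagesh is assumed to directly follow its letter, and the full CHIRIK, TZEIREI, and CHOLOM step is omitted because the list does not spell out its exact rule.

```ruby
# Final letters (kaf, mem, nun, pe, tsadi) and their standard forms.
FINALS    = "\u05DA\u05DD\u05DF\u05E3\u05E5"
STANDARDS = "\u05DB\u05DE\u05E0\u05E4\u05E6"

# Chataf segol, patach, kamatz and the plain nekudos they reduce to.
CHATAFS = "\u05B1\u05B2\u05B3"
PLAINS  = "\u05B6\u05B7\u05B8"

# A dagesh (U+05BC) that does NOT directly follow bet, vav, kaf, pe, or tav.
# Assumption: the dagesh comes immediately after its letter.
DROPPED_DAGESH = /(?<![\u05D1\u05D5\u05DB\u05E4\u05EA])\u05BC/

# Hypothetical helper approximating the normalization steps listed above.
def preprocess(word)
  word.strip
      .tr(FINALS, STANDARDS)
      .tr(CHATAFS, PLAINS)
      .gsub(DROPPED_DAGESH, '')
end

# A word ending in final mem comes back ending in a standard mem.
puts preprocess(" \u05D0\u05D1\u05E8\u05D4\u05DD ").end_with?("\u05DE")
```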