Convert accented characters and transliterate non-latin alphabets to ASCII

Tate ✍️

Tate converts accented characters and transliterates non-latin scripts to their closest ASCII equivalent.

Tate is a productivity tool that behaves like a standard Unix application: it reads from standard input, writes to standard output, and can be chained with other Unix commands. You can use it either as a command-line utility or as a library.

Examples

Let's say you have a French sentence with accented characters and ligatures, and you want to convert it to ASCII in the most representative way. You can use:

echo 'Le cœur de la crémière' | tate  #=> Le coeur de la cremiere

Or some Bulgarian text you can't read:

echo 'Здравей!' | tate --lang=bg  #=> Zdravey!

Set the language with the --lang option to apply custom filters, e.g. German:

echo 'Von großen Blöcken haut man große Stücke.' | tate --lang=de

The letters ö, ü, and ß are transliterated according to German transliteration rules:

Von grossen Bloecken haut man grosse Stuecke.

Language-specific punctuation is converted to its closest ASCII equivalent.

For example, in Catalan, notice how the quotes (cometes franceses) and the interpunct (punt volat) are transliterated:

«Dóna amor que seràs feliç!». Això, il·lús company geniüt, ja és un lluït rètol blavís d’onze kWh.
"Dona amor que seras felic!". Aixo, il-lus company geniut, ja es un lluit retol blavis d'onze kWh.
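The German and Catalan examples above suggest a two-step approach: apply language-specific replacements first, then fall back to a generic rule for whatever is left. The following is a minimal sketch of that idea, not Tate's actual implementation. The names `sketch_rules` and `sketch_transliterate` and the rule tables are hypothetical, and the generic fallback (Unicode decomposition followed by dropping combining marks) is an assumption about one way to handle plain accents.

```ruby
# Hypothetical sketch; this is NOT Tate's source code.
# Per-language replacement tables (illustrative, not exhaustive).
def sketch_rules(lang)
  {
    'de' => { 'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss',
              'Ä' => 'Ae', 'Ö' => 'Oe', 'Ü' => 'Ue' },
    'ca' => { '«' => '"', '»' => '"', '·' => '-', '’' => "'" }
  }.fetch(lang, {})
end

def sketch_transliterate(text, lang = nil)
  rules = sketch_rules(lang)
  # Compose first so that precomposed keys such as 'ö' always match.
  result = text.unicode_normalize(:nfc)
  # Step 1: language-specific replacements take priority.
  result = result.gsub(Regexp.union(rules.keys), rules) unless rules.empty?
  # Step 2: generic fallback, decompose accented letters (NFD)
  # and drop the combining marks that carry the accents.
  result.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
end

puts sketch_transliterate('Von großen Blöcken haut man große Stücke.', 'de')
# prints: Von grossen Bloecken haut man grosse Stuecke.
puts sketch_transliterate('«Dóna amor que seràs feliç!»', 'ca')
# prints: "Dona amor que seras felic!"
```

Applying the language table before the generic fallback matters: without it, ö would lose its diaeresis and become o instead of the German oe.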

Installation

Add this line to your application's Gemfile:

gem 'tate'

And then execute:

$ bundle

Or install it yourself as:

$ gem install tate

Usage

Ruby Library

require 'tate'
Tate.transliterate('Zəfər', 'az')  #=> Zefer

Command-line Utility

Usage: tate [options]
-l, --lang=[LANGUAGE]            Set language for custom filters
-h, --help                       Show this message
-v, --version                    Show version

Interactive Mode

If you call tate without any arguments, it reads its input from standard input (the keyboard). When you are done typing, press Ctrl + D to send EOF (end of file), and the result will be printed on the next line.

Standard Streams

You can pipe the output of another command into tate.

curl gov.bg/bg | tate --lang=bg > index.html
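The pipe-friendly behavior described above can be sketched as a small Ruby filter that reads lines from one stream and writes transformed lines to another. This is an illustration of the pattern only, not Tate's code: `run_filter` is a hypothetical name, and the accent-stripping transform is a stand-in for real transliteration.

```ruby
require 'stringio'

# Hypothetical sketch of a Unix-style filter; NOT Tate's source code.
# Defaults to the process's standard streams so it can sit in a pipeline.
def run_filter(input = $stdin, output = $stdout)
  input.each_line do |line|
    # Stand-in transform: decompose accents and drop the combining marks.
    output.write(line.unicode_normalize(:nfd).gsub(/\p{Mn}/, ''))
  end
end

# Demonstration with in-memory streams instead of a real pipe.
out = StringIO.new
run_filter(StringIO.new("crème brûlée\n"), out)
puts out.string
# prints: creme brulee
```

Processing the input line by line keeps memory use flat, so the filter can handle large inputs coming from a pipe.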

Language Support

There are custom filters for:

Azeri, Bulgarian, Catalan, French, German, Hungarian, Polish, Romanian, Spanish, and Vietnamese.

The following languages are known to work without custom filters:

Croatian, Czech, Danish, Esperanto, Estonian, Finnish, Icelandic, Latvian, Lithuanian, Norwegian, Portuguese, Scottish, Slovak, Slovenian, Swedish, Turkish, and Welsh.

What's next?

Russian, Irish, Arabic, and Yoruba.

Is it any good?

Yes.

Contributing

  1. Fork it (https://github.com/krmbzds/tate/fork)
  2. Create your feature branch (git checkout -b add-irish-support)
  3. Commit your changes (git commit -am 'Add Irish language support')
  4. Push to the branch (git push origin add-irish-support)
  5. Create a new Pull Request

Custom Filters

You can add custom language filters under the lib/rules directory.

Donations

You can donate to me at Liberapay. Thanks! ☕️

Trivia

tate is short for transliterate.

Nobody has time to type transliterate in the terminal.

License

Copyright © 2018 Kerem Bozdas

This project is available under the terms of the MIT License.