Tate converts accented characters and transliterates non-Latin scripts to their closest ASCII equivalents.
Tate is a productivity tool that behaves like a standard Unix application and can be chained with other Unix commands. It reads from standard input and writes to standard output. You can use it either as a command-line utility or as a library.
Let's say you have a French sentence with a lot of weird characters and you want to convert it into ASCII in the most representative way. You can use:
echo 'Le cœur de la crémiére' | tate #=> Le coeur de la cremiere
Or some Bulgarian text you can't read:
echo 'Здравей!' | tate --lang=bg #=> Zdravey!
Set the language using the --lang option for custom filters, e.g. German:
echo 'Von großen Blöcken haut man große Stücke.' | tate --lang=de
Letters ö, ü and ß will be transliterated based on German transliteration rules:
Von grossen Bloecken haut man grosse Stuecke.
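To illustrate what a language-specific filter does, here is a minimal sketch in plain Ruby. It is not Tate's actual implementation: the GERMAN_RULES table and the transliterate_german method are hypothetical names made up for this example, and the rule set only covers the umlauts and ß.

```ruby
# Hypothetical rule table for illustration; not taken from Tate's source.
GERMAN_RULES = {
  'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue',
  'Ä' => 'Ae', 'Ö' => 'Oe', 'Ü' => 'Ue',
  'ß' => 'ss'
}.freeze

# Build one pattern that matches any character listed in the table.
GERMAN_PATTERN = Regexp.union(GERMAN_RULES.keys)

# Replace each matched character with its entry from the rule table.
def transliterate_german(text)
  text.gsub(GERMAN_PATTERN, GERMAN_RULES)
end

puts transliterate_german('Von großen Blöcken haut man große Stücke.')
# prints: Von grossen Bloecken haut man grosse Stuecke.
```

Without such a table, a generic accent-stripping pass would most likely turn ö into o rather than oe, which is why some languages need their own filter.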
Language-specific punctuation will be converted to its closest ASCII equivalent.
For example, in Catalan, notice how the quotes (cometes franceses) and the interpunct (punt volat) are transliterated:
«Dóna amor que seràs feliç!». Això, il·lús company geniüt, ja és un lluït rètol blavís d’onze kWh.
"Dona amor que seras felic!". Aixo, il-lus company geniut, ja es un lluit retol blavis d'onze kWh.
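The Catalan conversion above can be approximated in plain Ruby with a small punctuation table followed by Unicode decomposition. This is only a sketch under stated assumptions: CATALAN_PUNCTUATION and to_ascii_catalan are hypothetical names, and Tate's own rules may work differently.

```ruby
# Hypothetical punctuation table for illustration; not taken from Tate's source.
CATALAN_PUNCTUATION = {
  '«' => '"',  # opening cometes franceses
  '»' => '"',  # closing cometes franceses
  '·' => '-',  # punt volat
  '’' => "'"   # typographic apostrophe
}.freeze

def to_ascii_catalan(text)
  # 1. Swap language-specific punctuation for ASCII look-alikes.
  replaced = text.gsub(Regexp.union(CATALAN_PUNCTUATION.keys), CATALAN_PUNCTUATION)
  # 2. Decompose accented letters (NFD) and drop the combining marks.
  replaced.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
end

puts to_ascii_catalan('«Dóna amor que seràs feliç!». Això, il·lús company geniüt.')
# prints: "Dona amor que seras felic!". Aixo, il-lus company geniut.
```

The punctuation step has to be table-driven because characters like « and · have no Unicode decomposition, so normalization alone leaves them untouched.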
Add this line to your application's Gemfile:
gem 'tate'
And then execute:
$ bundle
Or install it yourself as:
$ gem install tate
require 'tate'
# The second argument is the language code for custom filters.
Tate.transliterate('Zəfər', 'az') #=> Zefer
Usage: tate [options]
-l, --lang=[LANGUAGE] Set language for custom filters
-h, --help Show this message
-v, --version Show version
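For readers curious how an option like --lang=[LANGUAGE] can be declared, here is a sketch using Ruby's standard OptionParser. It is an assumption that a parser of this shape sits behind the usage text above; parse_cli is a hypothetical helper, and the real executable may be organised differently.

```ruby
require 'optparse'

# Hypothetical helper: returns the options parsed from an argument list.
def parse_cli(argv)
  options = { lang: nil }
  parser = OptionParser.new do |opts|
    opts.banner = 'Usage: tate [options]'
    # Square brackets mark the argument as optional.
    opts.on('-l', '--lang=[LANGUAGE]', 'Set language for custom filters') do |lang|
      options[:lang] = lang
    end
  end
  parser.parse(argv) # non-destructive; leaves argv untouched
  options
end

p parse_cli(['--lang=bg'])[:lang]
p parse_cli([])[:lang]
```

The first call yields the language code and the second yields nil, which a caller could treat as "no custom filter".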
If you call tate without providing any arguments, it will expect input from standard input (the keyboard). When you are done typing, press Ctrl + D to signal EOF (end of file), and the result will be printed on the next line.
You can pipe the output of another command into tate.
curl gov.bg/bg | tate --lang=bg > index.html
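The read-from-stdin, write-to-stdout behaviour that makes this kind of piping possible can be sketched in a few lines of Ruby. The filter_stream method below is a hypothetical name for illustration, not part of Tate, and it uses StringIO objects in place of the real streams so the example runs on its own.

```ruby
require 'stringio'

# Hypothetical helper: read lines from one IO, transform each, write to another.
def filter_stream(input, output)
  input.each_line do |line|
    output.write(yield(line))
  end
end

# A real command-line tool would pass $stdin and $stdout here.
source = StringIO.new("crème\nbrûlée\n")
sink = StringIO.new

# Strip accents by decomposing characters and removing combining marks.
filter_stream(source, sink) do |line|
  line.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
end

puts sink.string
```

Processing line by line means the filter can start writing before the upstream command has finished, which is what lets it sit in the middle of a pipeline.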
There are custom filters for:
Azeri, Bulgarian, Catalan, French, German, Hungarian, Polish, Romanian, Portuguese, Spanish, and Vietnamese.
The following languages are known to work (w/o custom filters):
Croatian, Czech, Danish, Esperanto, Estonian, Finnish, Icelandic, Latvian, Lithuanian, Norwegian, Scottish, Slovak, Slovenian, Swedish, Turkish, and Welsh.
What's next?
Russian, Irish, Arabic, and Yoruba.
Yes.
This gem is tested against the following Ruby versions:
- ✅ 3.2.2 (stable)
- ✅ 3.1.4 (stable)
- ⏳ 3.0.6 (security maintenance)
- 🪦 2.7.8 (end of life)
After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
- Fork the repository
- Create your feature branch (git checkout -b add-irish-support)
- Commit your changes (git commit -am 'Add Irish language support')
- Push to the branch (git push origin add-irish-support)
- Create a new Pull Request
You can add custom language filters under the lib/rules directory.
You can donate to me on Liberapay. Thanks! ☕️
tate is short for transliterate.
Nobody has time to type transliterate in the terminal.
Copyright © 2016-2023 Kerem Bozdas
This project is available under the terms of the MIT License.