Turkish deASCIIfier library for Java
Latest commit 52827a7 Apr 12, 2014 @iorixxx iorixxx correct initialization of turkishDowncaseAsciifyTable map.
provide Locale.US to toLowerCase methods.

Turkish Deasciifier for Java

Turkish Deasciifier is a Java library that converts Turkish text written with only ASCII characters into proper Turkish text by restoring the Turkish-specific letters (ş, ı, ö, ç, ğ, ü).

For instance, a Turkish sentence containing only ASCII characters, such as:

Hadi bir masal uyduralim, icinde mutlu, doygun, telassiz durdugumuz.

will be converted to a sentence containing proper Turkish accented characters:

Hadi bir masal uyduralım, içinde mutlu, doygun, telaşsız durduğumuz.
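
To see why the conversion needs more than a lookup table, it helps to look at the opposite direction. The sketch below is not part of this library; `AsciifySketch` and `asciify` are hypothetical names used only for illustration, and it assumes Java 9 or later for `Map.ofEntries`. It strips the Turkish-specific letters from a sentence. Because both `ı` and `i` end up as `i` (and likewise for the other pairs), the original spelling cannot be recovered letter by letter, which is the ambiguity a deasciifier has to resolve from context.

```java
import java.util.Map;

// Illustrative only: this is NOT the library's code, just the reverse mapping.
class AsciifySketch {
    // Each Turkish-specific letter and the ASCII letter it is usually typed as.
    private static final Map<Character, Character> TO_ASCII = Map.ofEntries(
        Map.entry('ç', 'c'), Map.entry('Ç', 'C'),
        Map.entry('ğ', 'g'), Map.entry('Ğ', 'G'),
        Map.entry('ı', 'i'), Map.entry('İ', 'I'),
        Map.entry('ö', 'o'), Map.entry('Ö', 'O'),
        Map.entry('ş', 's'), Map.entry('Ş', 'S'),
        Map.entry('ü', 'u'), Map.entry('Ü', 'U'));

    // Replaces every Turkish-specific letter with its ASCII counterpart
    // and leaves all other characters untouched.
    static String asciify(String turkish) {
        StringBuilder out = new StringBuilder(turkish.length());
        for (int i = 0; i < turkish.length(); i++) {
            char c = turkish.charAt(i);
            char mapped = TO_ASCII.getOrDefault(c, c);
            out.append(mapped);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(asciify(
            "Hadi bir masal uyduralım, içinde mutlu, doygun, telaşsız durduğumuz."));
    }
}
```

Running it on the proper Turkish sentence above gives back the ASCII-only sentence, with the information about which letters were accented lost along the way.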

Credits

It is adapted from Emre Sevinç's Turkish Deasciifier for Python, which was influenced by Deniz Yüret's Emacs Turkish Mode implementation, which in turn was inspired by Gökhan Tür's Turkish Text Deasciifier.

The Zemberek library offers similar functionality; however, this library is more compact, faster (almost 2000 times), and easier to use than Zemberek.

This project was developed in 2010 and is not actively maintained.

Example usage

Deasciifier d = new Deasciifier();
d.setAsciiString("Hadi bir masal uyduralim, icinde mutlu, doygun, telassiz durdugumuz.");
System.out.println(d.convertToTurkish());

It's that simple.

Authors

  • Maintainer: Ahmet Alp Balkan <ahmet at ahmetalpbalkan.com> (feel free to contact for any questions or contributions)