ironhouzi/pytib

pytib

Generate Tibetan Unicode from Latin script

Install

python -m build

Usage

As shell script

ptib skyo

As python function

from pytib import translate


print(f'Latin: `skyo` -> Unicode: {translate("skyo")}')

Output to web browser

ptib skyo --html
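For readers who call `translate` from Python and want a similar browser preview, the following is a minimal sketch using only the standard library. It is not how `ptib --html` is implemented; the helper names `render_html` and `open_in_browser`, the page layout and the default font are all assumptions made for illustration.

```python
import html
import tempfile
import webbrowser
from pathlib import Path


def render_html(tibetan, font="Noto Sans Tibetan"):
    """Wrap a Tibetan Unicode string in a minimal UTF-8 HTML page.

    Hypothetical helper: the markup is an assumption, not ptib's output.
    """
    body = html.escape(tibetan)
    family = html.escape(font, quote=True)
    return (
        "<!DOCTYPE html>\n"
        '<html lang="bo">\n'
        '<head><meta charset="utf-8"><title>ptib output</title></head>\n'
        f'<body style="font-family: {family}; font-size: 3em">{body}</body>\n'
        "</html>\n"
    )


def open_in_browser(page):
    """Write the page to a temporary file and open it in the default browser."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".html", encoding="utf-8", delete=False
    ) as handle:
        handle.write(page)
    path = Path(handle.name)
    webbrowser.open(path.as_uri())
    return path
```

Calling `open_in_browser(render_html(translate("skyo")))` would then display the converted syllable, provided a Tibetan Unicode font is installed.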

Dependencies

  • Python >= 3.6.
  • Tibetan Unicode font. I recommend MS Himalaya for its exceptional functionality: it correctly renders even the most obscure stacks from Tibetan transliterations of Sanskrit. Despite the excellent implementation work in MS Himalaya, the font can be hard to read during long reading sessions. If you do not need complicated stacking, Noto Sans Tibetan comes in regular and bold weights and fits very well with the other fonts in the Noto font package, which cover most of the languages spoken today.

Currently handles conversion of Wylie, polyglotta and IAST to Tibetan Unicode. You are free to redefine the translation tables by creating a JSON file that defines the Tibetan/Sanskrit consonants, the vowels, the character that forces a ga prefix, and the special characters. Pass this JSON file to ptib with the -c or --config parameter.
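To illustrate the idea of a JSON-defined translation table, here is a small sketch. The schema shown (the `consonants` and `vowels` keys) and the functions `load_table` and `transliterate` are hypothetical and do not reflect the format ptib actually expects; consult the bundled tables in the repository for the real layout. The sketch only performs a greedy longest-match lookup and does not handle stacking, prefixes or Sanskrit.

```python
import json

# Hypothetical schema for illustration only; NOT pytib's real config format.
EXAMPLE_CONFIG = r"""
{
  "consonants": {"k": "\u0f40", "kh": "\u0f41", "g": "\u0f42", "ng": "\u0f44"},
  "vowels": {"i": "\u0f72", "u": "\u0f74", "e": "\u0f7a", "o": "\u0f7c"}
}
"""


def load_table(text):
    """Merge the consonant and vowel maps into one Latin -> Unicode table."""
    config = json.loads(text)
    table = {}
    table.update(config["consonants"])
    table.update(config["vowels"])
    return table


def transliterate(latin, table):
    """Greedy longest-match lookup over a single unstacked syllable."""
    longest = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(latin):
        for size in range(min(longest, len(latin) - i), 0, -1):
            chunk = latin[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += size
                break
        else:
            # The inherent vowel "a" is written with no vowel sign.
            if latin[i] != "a":
                raise ValueError(f"no mapping for {latin[i]!r}")
            i += 1
    return "".join(out)
```

Trying the longest key first matters because `kh` and `ng` must win over `k` and `g` at the same position.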

The Tibetan/Sanskrit parsing algorithm is based on software written by the late Edward Henning.

Remaining work

  • Rule: never place a tsheg after a visarga.
  • Parse numerals.
  • Parse special characters.
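The "Parse numerals" item above could start from a plain digit mapping, since the Tibetan digits occupy the contiguous code points U+0F20 (zero) through U+0F29 (nine). The sketch below is one possible approach and is not part of pytib; the function name `to_tibetan_numerals` is an assumption.

```python
# Tibetan digits run from U+0F20 (zero) to U+0F29 (nine).
TIBETAN_DIGITS = str.maketrans(
    "0123456789",
    "".join(chr(0x0F20 + n) for n in range(10)),
)


def to_tibetan_numerals(text):
    """Replace ASCII digits with Tibetan digits; other characters pass through."""
    return text.translate(TIBETAN_DIGITS)
```

A real implementation would still need to decide how digits interact with surrounding syllables and punctuation.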

Uncertainties

  • Previously, the ptib reader never placed a tsheg between a syllable and a shad. Fixed: a tsheg is now placed before a shad only when the syllable ends in nga.
  • There is no unambiguous algorithm for discerning the use of anusvara/chandrabindu (rjes su nga ro/sna ldan). It is currently implemented as corner-case lookups.
  • Apart from a few cases, the algorithm carries out strict translation and does not handle incorrectly spelled Tibetan transliteration of Sanskrit words. E.g. maṅgalaṃ becomes: མངྒལཾ - not: མངྒ་ལ, which is commonly seen in Tibetan texts. This would require the input: maṅga laṃ.
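The tsheg/shad rule from the first item above can be sketched as follows. This is a simplified illustration working on already converted Unicode syllables, not pytib's own code: the function name `close_with_shad` is an assumption, and checking only whether the last character is the letter nga ignores any further orthographic exceptions.

```python
TSHEG = "\u0f0b"  # intersyllabic mark
SHAD = "\u0f0d"   # clause-final mark
NGA = "\u0f44"    # the letter nga


def close_with_shad(syllables):
    """Join syllables with tsheg and end the clause with a shad.

    A tsheg is kept before the shad only when the last syllable
    ends in the letter nga (simplified check on the final character).
    """
    if not syllables:
        raise ValueError("at least one syllable is required")
    text = TSHEG.join(syllables)
    if syllables[-1].endswith(NGA):
        text += TSHEG
    return text + SHAD
```

For instance, a clause ending in bzang keeps its tsheg before the shad, while a clause ending in skyo does not.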

About

Software for Tibetan word processing
