KhmerConverterPHP

These scripts transcode strings from legacy Khmer font encodings to Unicode and vice versa. You can see them in action at http://www.selapa.net/khmerfonts/

How does it work?

Legacy → Unicode

  1. Search and replace from the database
  2. Recompose characters
  3. Transcode other characters
    • Ligatures get separated into characters
    • Ornaments get enclosed between 0x91 and 0x92
    • Khmer characters missing in Unicode get enclosed between 0x86 and 0x87
    • Characters missing in the legacy font get enclosed between 0x96 and 0x97
  4. Reorder characters according to Unicode order
    This code is a PHP translation of the KhmerOS khmerconverter Python software.
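
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in khmerLegacy2Unicode.php: the function name `sketchLegacyToUnicode` and the three-entry mapping table are hypothetical (the real mappings presumably come from conversiondata.db), and the reordering only handles the pre-base vowels U+17C1 to U+17C3.

```php
<?php
// Hypothetical mapping for illustration only; real font tables are far larger.
const LEGACY_TO_UNICODE = [
    'k' => "\u{1780}", // KHMER LETTER KA
    'x' => "\u{1781}", // KHMER LETTER KHA
    'e' => "\u{17C1}", // KHMER VOWEL SIGN E (drawn to the left of its consonant)
];

function sketchLegacyToUnicode(string $legacy): string
{
    // Steps 1 and 3 (simplified): table-driven search and replace.
    // strtr() tries the longest keys first and leaves unmapped bytes untouched.
    $text = strtr($legacy, LEGACY_TO_UNICODE);

    // Step 4 (simplified): legacy text stores a pre-base vowel before the
    // consonant cluster, while Unicode stores it after. Swap the two.
    $cluster = '[\x{1780}-\x{17A2}](?:\x{17D2}[\x{1780}-\x{17A2}])*';
    $reordered = preg_replace(
        '/([\x{17C1}-\x{17C3}])(' . $cluster . ')/u',
        '$2$1',
        $text
    );

    // preg_replace() returns null on failure (e.g. invalid UTF-8 input).
    return $reordered ?? $text;
}

// 'ek' is vowel-first (visual order); the result is KA followed by vowel sign E.
echo sketchLegacyToUnicode('ek'), "\n";
```

Recomposition (step 2) and the marker bytes for ornaments and missing characters are left out here to keep the sketch short.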

Unicode → Legacy

  1. Reorder characters according to visual order
    This code is a PHP translation of the KhmerOS khmerconverter Python software.
  2. Search and replace from the database
  3. Transcode characters
  4. Decompose composite characters if necessary
    • Missing characters get enclosed between 0x96 and 0x97
  5. Apply ligatures if present in the font
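
The reverse direction can be sketched in the same spirit. Again, this is not the code in khmerUnicode2Legacy.php: `sketchUnicodeToLegacy` and its mapping table are hypothetical, ligatures and decomposition are skipped, and the sketch assumes that an unmapped character is kept as-is between the 0x96 and 0x97 marker bytes.

```php
<?php
// Hypothetical mapping for illustration only; real font tables are far larger.
const UNICODE_TO_LEGACY = [
    "\u{1780}" => 'k', // KHMER LETTER KA
    "\u{17C1}" => 'e', // KHMER VOWEL SIGN E
];

function sketchUnicodeToLegacy(string $unicode): string
{
    // Step 1 (simplified): move a pre-base vowel in front of its consonant
    // cluster, which is where a legacy font expects it.
    $cluster = '[\x{1780}-\x{17A2}](?:\x{17D2}[\x{1780}-\x{17A2}])*';
    $visual = preg_replace(
        '/(' . $cluster . ')([\x{17C1}-\x{17C3}])/u',
        '$2$1',
        $unicode
    ) ?? $unicode;

    // Step 3 (simplified): transcode one code point at a time.
    $chars = preg_split('//u', $visual, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $out = '';
    foreach ($chars as $char) {
        if (strlen($char) === 1) {
            $out .= $char; // ASCII passes through unchanged
        } elseif (array_key_exists($char, UNICODE_TO_LEGACY)) {
            $out .= UNICODE_TO_LEGACY[$char];
        } else {
            // Assumption: wrap the untranslated character in the markers.
            $out .= "\x96" . $char . "\x97";
        }
    }
    return $out;
}

// KA followed by vowel sign E becomes vowel-first legacy text.
echo sketchUnicodeToLegacy("\u{1780}\u{17C1}"), "\n";
```

Wrapping unmapped characters instead of dropping them keeps the output lossless, so a later pass (or a human) can find the spans the font could not represent.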

TODO

  • Refine the database (some font mappings aren't yet correct)
  • Word-breaking
  • Transcode documents with multiple fonts