enricobacis/utf9

utf9

Encode and decode text using UTF-9.

Description

On April 1st, 2005, RFC 4042, “UTF-9 and UTF-18 Efficient Transformation Formats of Unicode”, was published:

The current representation formats for Unicode (UTF-7, UTF-8, UTF-16) are not storage and computation efficient on platforms that utilize the 9 bit nonet as a natural storage unit instead of the 8 bit octet.

Since few architectures use 9-bit nonets as natural storage units, and the RFC was released on April Fools’ Day, the beautiful UTF-9 was forgotten and no Python implementation is available.

This Python module is here to fill this gap! ;)
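
For context, the nonet scheme that RFC 4042 describes can be sketched in a few lines: the octets of a code point are copied into successive nonets, starting with the most significant non-zero octet, and every nonet except the last has its ninth (continuation) bit set. The helper names `codepoint_to_nonets` and `text_to_nonets` are hypothetical and chosen for illustration; this is a reading of the RFC, not necessarily how this module is implemented.

```python
CONTINUATION = 0x100  # ninth bit of a nonet: "more octets follow"


def codepoint_to_nonets(cp):
    """Split one code point into RFC 4042-style nonets (hypothetical helper)."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point out of range: %r" % (cp,))
    # Collect the octets of the code point, least significant first.
    octets = [cp & 0xFF]
    cp >>= 8
    while cp:
        octets.append(cp & 0xFF)
        cp >>= 8
    octets.reverse()  # most significant non-zero octet comes first
    # All nonets but the last carry the continuation bit.
    return [CONTINUATION | o for o in octets[:-1]] + [octets[-1]]


def text_to_nonets(text):
    """Concatenate the nonets of every character in `text`."""
    nonets = []
    for ch in text:
        nonets.extend(codepoint_to_nonets(ord(ch)))
    return nonets
```

The results are lists of integers in the range 0 to 0x1FF. For instance, U+0391 becomes the two nonets 403 and 221 in octal, which matches the worked example given in the RFC.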

Usage

There are only two functions:

  • utf9encode(string): takes a string and returns a utf9-encoded version.
  • utf9decode(data): takes utf9-encoded data and returns the corresponding string.
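
To illustrate what decoding involves at the nonet level, here is a minimal sketch that follows the RFC 4042 description: octets are accumulated while the continuation bit is set, and a code point is emitted when a nonet arrives without it. `nonets_to_codepoints` is a hypothetical name, and the sketch works on lists of integers rather than on the byte strings that `utf9decode` accepts.

```python
def nonets_to_codepoints(nonets):
    """Rebuild code points from RFC 4042-style nonets (hypothetical helper)."""
    codepoints = []
    value = 0
    pending = False  # True while we are in the middle of a code point
    for nonet in nonets:
        # The low 8 bits are the next octet of the code point.
        value = (value << 8) | (nonet & 0xFF)
        if nonet & 0x100:
            # Continuation bit set: more octets belong to this code point.
            pending = True
        else:
            codepoints.append(value)
            value = 0
            pending = False
    if pending:
        raise ValueError("truncated input: last nonet has the continuation bit set")
    return codepoints
```

Returning integers instead of characters keeps the sketch independent of the Python version; mapping them to text is a separate step.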

Example

    >>> import utf9
    >>> encoded = utf9.utf9encode(u'ႹЄLᒪo, 🌍ǃ')
    >>> print repr(encoded)
    'p\xe0\xb7-\x0c!1\xc3\x92\xd5\x1b\xc5\x82\x07n\x83x\xed\xdecX\xf80'
    >>> print utf9.utf9decode(encoded)
    ႹЄLᒪo, 🌍ǃ
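
The encoded value in the example is an ordinary byte string, so 9-bit units have to be packed into 8-bit bytes at some point. Below is a minimal sketch of one way to do that, assuming most-significant-bit-first packing with zero bits as padding at the end. `pack_nonets` is a hypothetical helper, and the layout is an assumption made for illustration rather than something taken from the module's source.

```python
def pack_nonets(nonets):
    """Pack 9-bit values into bytes, MSB first, zero-padded (hypothetical helper)."""
    out = bytearray()
    acc = 0    # bit accumulator
    nbits = 0  # number of bits in `acc` not yet written out
    for nonet in nonets:
        acc = (acc << 9) | (nonet & 0x1FF)
        nbits += 9
        # Emit every complete byte currently held in the accumulator.
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1  # keep only the leftover bits
    if nbits:
        # Left-align the remaining bits and pad the final byte with zeros.
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)
```

Two nonets occupy 18 bits, so they pack into three bytes with six padding bits. A decoder using this layout would need a way to tell padding apart from real data, for example by ignoring a trailing run of fewer than nine bits.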
