A library to support the Internationalised Domain Names in Applications (IDNA) protocol as specified in RFC 5891. This version of the protocol is often referred to as “IDNA2008” and can produce different results from the earlier standard from 2003.
The library is also intended to act as a suitable drop-in replacement for the “encodings.idna” module in the Python standard library, which currently supports only the 2003 specification.
The library's basic functions are simple to use:
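To illustrate how the two versions of the protocol can disagree, the following sketch uses only the standard library's IDNA 2003 codec and its punycode codec. The German sharp s (ß) is our own choice of example, not one taken from this README: IDNA 2003 maps it to "ss" before encoding, whereas IDNA2008 treats it as a valid character in its own right, so an IDNA2008 encoder would be expected to keep it and produce the punycode form instead.

```python
# -*- coding: utf-8 -*-

# IDNA 2003 (standard library codec): nameprep folds the sharp s to "ss",
# so the result is plain ASCII with no "xn--" prefix.
legacy = u'faß.de'.encode('idna')

# Raw punycode of the unmapped label, prefixed with "xn--". This is the
# A-label form that preserves the sharp s, as IDNA2008 permits.
preserved = b'xn--' + u'faß'.encode('punycode') + b'.de'

print(legacy)
print(preserved)
```

The two byte strings differ, which means the same user input can resolve to different domain names depending on which version of the protocol an application implements.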
>>> import idna
>>> idna.encode(u'ドメイン.テスト')
'xn--eckwd4c7c.xn--zckzah'
>>> print idna.decode('xn--eckwd4c7c.xn--zckzah')
ドメイン.テスト
The latest tagged release version is published in the PyPI repository.
To install this library, you can use pip:
$ pip install idna
Alternatively, you can install the package using the bundled setup script:
$ python setup.py install
This library should work with Python 2.7, and Python 3.3 or later.
For typical usage, the encode and decode functions will take a domain name argument and perform a conversion to an A-label or U-label respectively.
>>> import idna
>>> idna.encode(u'ドメイン.テスト')
'xn--eckwd4c7c.xn--zckzah'
>>> print idna.decode('xn--eckwd4c7c.xn--zckzah')
ドメイン.テスト
You may also use the standard encode and decode string methods through the encodings.idna compatibility module, idna.compat.
>>> import idna.compat
>>> print u'домена.испытание'.encode('idna')
xn--80ahd1agd.xn--80akhbyknj4f
>>> print 'xn--80ahd1agd.xn--80akhbyknj4f'.decode('idna')
домена.испытание
Conversions can be applied on a per-label basis using the ulabel or alabel functions if necessary:
>>> idna.alabel(u'测试')
'xn--0zwm56d'
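As a rough illustration of what an A-label is, the sketch below builds one by hand with the standard library's punycode codec: the "xn--" prefix followed by the punycode form of the label. The helper names naive_alabel and naive_encode are hypothetical, and this is not how the library implements alabel. In particular, it skips every validity check that IDNA2008 requires, so it should not be used in place of the library.

```python
# -*- coding: utf-8 -*-

def naive_alabel(label):
    """Return 'xn--' + punycode for a non-ASCII label, with no validation."""
    try:
        # Pure-ASCII labels are passed through unchanged.
        return label.encode('ascii')
    except UnicodeEncodeError:
        return b'xn--' + label.encode('punycode')

def naive_encode(domain):
    """Apply naive_alabel to each dot-separated label of a domain name."""
    return b'.'.join(naive_alabel(part) for part in domain.split(u'.'))

print(naive_alabel(u'测试'))
print(naive_encode(u'ドメイン.テスト'))
```

For these inputs the results agree with the A-labels shown in the examples above, but only because the labels happen to be valid. The library's own functions additionally reject labels that the specification does not permit.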
All errors encountered during conversion raise an exception derived from the idna.IDNAError base class.
The more specific exceptions that may be raised are idna.IDNABidiError, when the error reflects an illegal combination of left-to-right and right-to-left characters in a label; idna.InvalidCodepoint, when a specific codepoint is an illegal character in an IDN label (i.e. INVALID); and idna.InvalidCodepointContext, when the codepoint is illegal based on its positional context (i.e. it is CONTEXTO or CONTEXTJ but the contextual requirements are not satisfied).
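A minimal sketch of how these exceptions might be handled follows. The helper name classify and the sample input containing U+2665 (a symbol, which IDNA2008 does not allow in labels) are illustrative assumptions. The more specific handlers are listed before the idna.IDNAError base class so that they get the first chance to match.

```python
# -*- coding: utf-8 -*-
import idna

def classify(domain):
    """Try to encode domain and report which kind of error, if any, occurred."""
    try:
        idna.encode(domain)
    except idna.IDNABidiError:
        return 'bidi error'
    except idna.InvalidCodepoint:
        return 'invalid codepoint'
    except idna.InvalidCodepointContext:
        return 'invalid codepoint context'
    except idna.IDNAError:
        # Any other failure reported by the library.
        return 'other IDNA error'
    return 'ok'

print(classify(u'ドメイン.テスト'))
print(classify(u'a\u2665.example'))
```

Code that does not need to distinguish between the cases can catch idna.IDNAError alone, since the other three classes derive from it.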
The library has a test suite based on each rule of the IDNA specification, as well as a subset of tests that are defined in Unicode Technical Standard 46, Unicode IDNA Compatibility Processing. Note that not all tests defined there are used, as TR46 defines tests for a normalisation approach beyond merely implementing IDNA2008.
The tests are run automatically on Travis CI for each commit to the master branch of the idna git repository.