# ENS Normalize Python

## Install

In [1]:
%pip install ens-normalize



## Normalize an ENS name

In [2]:
from ens_normalize import ens_normalize
# str -> str
# raises DisallowedSequence for disallowed names
# output is namehash ready
ens_normalize('Nick.ETH')

'nick.eth'

note: `ens_normalize` does not enforce constraints that a particular registrar might apply. For example, the registrar for subnames of '.eth' enforces a 3-character minimum, and `ens_normalize` does not check this.
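
A registrar-specific rule like this can be layered on top of normalization. The sketch below is one hypothetical way to check the 3-character minimum described above. The helper name `meets_eth_min_length` and the choice to measure length in Unicode code points are assumptions for illustration and are not part of `ens_normalize`. It takes an already-normalized name, so it runs without the library.

```python
def meets_eth_min_length(normalized_name: str, min_chars: int = 3) -> bool:
    """Check a hypothetical registrar length rule on an already-normalized name.

    Only direct subnames of '.eth' (e.g. 'nick.eth') are checked. Any other
    name returns True because this rule does not apply to it.
    """
    labels = normalized_name.split('.')
    if len(labels) != 2 or labels[1] != 'eth':
        return True
    # len() on a Python str counts Unicode code points (an assumption
    # about how the registrar measures length).
    return len(labels[0]) >= min_chars


print(meets_eth_min_length('nick.eth'))  # True
print(meets_eth_min_length('ab.eth'))    # False
```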

## Inspect issues with disallowed names

In [3]:
from ens_normalize import DisallowedSequence, CurableSequence
try:
    # added a hidden "zero width joiner" character
    ens_normalize('Ni‍ck.ETH')
# Catch the first disallowed sequence (the name we are attempting to normalize could have more than one).
except DisallowedSequence as e:
    # error code
    print(e.code)
    # INVISIBLE

    # a message about why the sequence is disallowed
    print(e.general_info)
    # Contains a disallowed invisible character

    if isinstance(e, CurableSequence):
        # information about the curable sequence
        print(e.sequence_info)
        # 'This invisible character is disallowed'

        # starting index of the disallowed sequence in the input string
        # (counting in Unicode code points)
        print(e.index)
        # 2

        # the disallowed sequence
        # (use repr() to "see" the invisible character)
        print(repr(e.sequence))
        # '\u200d'

        # a normalization suggestion for fixing the disallowed sequence (there might be more disallowed sequences)
        print(repr(e.suggested))
        # ''
        # an empty suggestion means the disallowed sequence should be removed

        # You may be able to fix this disallowed sequence by replacing e.sequence with e.suggested in the input string.
        # Fields index, sequence_info, sequence, and suggested are available only for curable errors.
        # Other disallowed sequences might be found even after applying this suggestion.

INVISIBLE
Contains a disallowed invisible character
This invisible character is disallowed
2
'\u200d'
''
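
Applying a suggestion amounts to a string replacement at `e.index`. The following is a minimal sketch of that step. `apply_suggestion` is a hypothetical helper, not a library function, and it takes `index`, `sequence`, and `suggested` as plain arguments so it can run on its own. Python strings are indexed by code point, which matches how `e.index` is counted.

```python
def apply_suggestion(name: str, index: int, sequence: str, suggested: str) -> str:
    """Replace `sequence` at code point offset `index` with `suggested`."""
    end = index + len(sequence)
    if name[index:end] != sequence:
        raise ValueError('sequence not found at the given index')
    return name[:index] + suggested + name[end:]


# Values taken from the CurableSequence fields printed above.
fixed = apply_suggestion('Ni\u200dck.ETH', index=2, sequence='\u200d', suggested='')
print(fixed)  # Nick.ETH
```

The result may still contain other disallowed sequences, so it should be passed through `ens_normalize` again.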


## Cure names

You can attempt to convert disallowed names into normalized names using `ens_cure`. This algorithm can “cure” many normalization errors that would fail `ens_normalize`. For example, if user input fails `ens_normalize`, the user could be shown a more helpful error message such as: “Did you mean curedname.eth?”.

Some names are not curable, for example when no specific normalization suggestion can be given to replace a disallowed sequence.

Note: This function is *NOT* a part of the ENS Normalization Standard.

In [4]:
from ens_normalize import ens_cure
# input name with disallowed zero width joiner and '?'
# str -> str
ens_cure('Ni‍ck?.ETH')
# ZWJ and '?' are removed, no error is raised

'nick.eth'

In [5]:
# note: ens_cure still raises DisallowedSequence for names that cannot be cured, e.g.
try:
    ens_cure('?')
except DisallowedSequence as e:
    print(repr(e), e)
# reason: '?' would have to be removed which would result in an empty name

DisallowedSequence(code="EMPTY_NAME") No valid characters in name


In [6]:
try:
    ens_cure('0χх0.eth')
except DisallowedSequence as e:
    print(repr(e), e)
# reason: it is not clear which character should be removed ('χ' or 'х')

DisallowedSequence(code="CONF_WHOLE") Contains visually confusing characters from Cyrillic and Latin scripts


## Get a beautiful name that is optimized for display

In [7]:
from ens_normalize import ens_beautify
# works like ens_normalize()
# output ready for display
ens_beautify('1⃣2⃣.eth')

'1️⃣2️⃣.eth'

note: normalization is unchanged:\
`ens_normalize(ens_beautify(x)) == ens_normalize(x)`

note: in addition to replacing emojis with their fully-qualified forms, `ens_beautify` converts the character 'ξ' (Greek lowercase 'Xi') to 'Ξ' (Greek uppercase 'Xi', a.k.a. the Ethereum symbol) in labels that contain no other Greek characters.

## Generate detailed name analysis

In [8]:
from ens_normalize import ens_tokenize
# str -> List[Token]
# always returns a tokenization of the input
ens_tokenize('Nàme‍🧙‍♂.eth')

[TokenMapped(cp=78, cps=[110], type='mapped'),
 TokenNFC(input=[97, 768], cps=[224], type='nfc'),
 TokenValid(cps=[109, 101], type='valid'),
 TokenDisallowed(cp=8205, type='disallowed'),
 TokenEmoji(emoji=[129497, 8205, 9794, 65039], input=[129497, 8205, 9794], cps=[129497, 8205, 9794], type='emoji'),
 TokenStop(cp=46, type='stop'),
 TokenValid(cps=[101, 116, 104], type='valid')]
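
The `cp`, `cps`, and `input` fields hold Unicode code points as integers. A minimal sketch for turning them back into text for inspection follows. `cps_to_str` is a hypothetical helper, not a library function.

```python
from typing import List


def cps_to_str(cps: List[int]) -> str:
    """Convert a list of Unicode code points to a string."""
    return ''.join(chr(cp) for cp in cps)


# Code points copied from the token list above.
print(cps_to_str([110]))             # n
print(cps_to_str([101, 116, 104]))   # eth
# repr() makes the invisible zero width joiner visible
print(repr(cps_to_str([8205])))      # '\u200d'
```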

## Inspect changes

For a normalizable name, you can find out how the input is transformed during normalization:

In [9]:
from ens_normalize import ens_normalizations
# Returns a list of transformations (unnormalized sequence -> normalization suggestion)
# that have been applied to the input during normalization.
# NormalizableSequence has the same fields as CurableSequence:
# - code
# - general_info
# - sequence_info
# - index
# - sequence
# - suggested
ens_normalizations('Nàme🧙‍♂️.eth')

[NormalizableSequence(code="MAPPED", index=0, sequence="N", suggested="n"),
 NormalizableSequence(code="FE0F", index=4, sequence="🧙‍♂️", suggested="🧙‍♂")]

## An example normalization workflow

In [10]:
name = 'Nàme🧙‍♂️.eth'
try:
    normalized = ens_normalize(name)
    print('Normalized:', normalized)
    # Normalized: nàme🧙‍♂.eth
    # Success!

    # was the input transformed by the normalization process?
    if name != normalized:
        # Let's check how the input was changed:
        for t in ens_normalizations(name):
            print(repr(t)) # use repr() to print more information
        # NormalizableSequence(code="MAPPED", index=0, sequence="N", suggested="n")
        # NormalizableSequence(code="FE0F", index=4, sequence="🧙‍♂️", suggested="🧙‍♂")
        #                                     invisible character inside emoji ^
except DisallowedSequence as e:
    # Even if the name is invalid according to the ENS Normalization Standard,
    # we can try to automatically cure disallowed sequences.
    try:
        print('Cured:', ens_cure(name))
    except DisallowedSequence as e:
        # The name cannot be automatically cured.
        print('Disallowed name error:', e)

Normalized: nàme🧙‍♂.eth
NormalizableSequence(code="MAPPED", index=0, sequence="N", suggested="n")
NormalizableSequence(code="FE0F", index=4, sequence="🧙‍♂️", suggested="🧙‍♂")


You can run several of the above functions in a single call with `ens_process`. This is faster than running them one after another.

In [11]:
from ens_normalize import ens_process
# use only the do_* flags you need
ret = ens_process("Nàme🧙‍♂️1⃣.eth",
    do_normalize=True,
    do_beautify=True,
    do_tokenize=True,
    do_normalizations=True,
    do_cure=True,
)

In [12]:
ret.normalized

'nàme🧙\u200d♂1⃣.eth'

In [13]:
ret.beautified

'nàme🧙\u200d♂️1️⃣.eth'

In [14]:
ret.tokens

[TokenMapped(cp=78, cps=[110], type='mapped'),
 TokenValid(cps=[224, 109, 101], type='valid'),
 TokenEmoji(emoji=[129497, 8205, 9794, 65039], input=[129497, 8205, 9794, 65039], cps=[129497, 8205, 9794], type='emoji'),
 TokenEmoji(emoji=[49, 65039, 8419], input=[49, 8419], cps=[49, 8419], type='emoji'),
 TokenStop(cp=46, type='stop'),
 TokenValid(cps=[101, 116, 104], type='valid')]

In [15]:
ret.cured

'nàme🧙\u200d♂1⃣.eth'

In [16]:
# This is the list of cures that were applied to the input (in this case, none).
ret.cures

[]

In [17]:
# This is the exception that ens_normalize() would raise for this input.
# It is a DisallowedSequence, or a CurableSequence if the error is curable.
# Here it is None because the input is normalizable.
ret.error is None

True

In [18]:
ret.normalizations

[NormalizableSequence(code="MAPPED", index=0, sequence="N", suggested="n"),
 NormalizableSequence(code="FE0F", index=4, sequence="🧙‍♂️", suggested="🧙‍♂")]