old versions

“How do I…”

…(de/e)ncode entities or Unicode characters with code points between `U+010000` and `U+10FFFF`?

HTML Tag 1.5.0 (Unicode only)

Note

This version can decode Unicode scalars between U+010000 and U+10FFFF, but literal Unicode characters are still encoded as surrogate pairs (see below).

Decoding works the same as for any other Unicode character; see here and here.

HTML Tag <= 1.4.4

Previous versions of HTML Tag represent Unicode text in UTF-16 encoding. This is the same encoding traditionally used by the Windows operating system, and hence by Notepad++.

A single code unit in a UTF-16 string has a maximum value of 0xFFFF, that is, 16 consecutive 1 bits. Code points above 0xFFFF can still be represented by two code units that, taken together, form a surrogate pair.

As an example:

1. Make sure HTML Tag is at least version 1.4.
2. Paste this emoji into a new buffer: 🍪 (U+1F36A).
3. Select the emoji and run the Encode JS command; the cookie will be broken into the escape sequences \uD83C\uDF6A.
4. Select all the text, run the Decode JS command, and confirm that the cookie appears again.
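
The pair shown in the example can also be checked outside the editor. The following is a minimal Python sketch, not part of HTML Tag; the helper name `utf16_code_units` is hypothetical. It uses Python's built-in `utf-16-be` codec to list the 16-bit code units of a string.

```python
import struct


def utf16_code_units(text: str) -> list:
    """Return the UTF-16 code units of `text` as a list of integers.

    Hypothetical helper for illustration only; it is not part of HTML Tag.
    """
    # Big-endian UTF-16 without a byte order mark: two bytes per code unit.
    data = text.encode("utf-16-be")
    return list(struct.unpack(f">{len(data) // 2}H", data))


# A character at or below U+FFFF fits in one code unit.
print([f"0x{u:04X}" for u in utf16_code_units("A")])

# The cookie emoji (U+1F36A) needs two code units: a surrogate pair.
print([f"0x{u:04X}" for u in utf16_code_units("\U0001F36A")])
```

The second list should contain the same two values that the Encode JS command writes as \uD83C\uDF6A.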

To decode any Unicode character between U+010000 and U+10FFFF, you will need to:

1. Find the “high” and “low” surrogates for the character. You can use an online tool or implement the algorithm in the programming language of your choice.
2. Type or paste the high surrogate, followed by the low surrogate, both in your preferred escape sequence format.
3. Run the Decode JS command after selecting the pair, or after placing the caret beside them.
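
For the first step, the standard UTF-16 surrogate calculation can be sketched in Python as follows. The function names `to_surrogates` and `js_escape` are hypothetical, chosen for illustration, and the `\uXXXX` output assumes the JavaScript-style escape format used in the example above.

```python
def to_surrogates(code_point: int) -> tuple:
    """Split a code point in U+010000..U+10FFFF into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point must be between U+010000 and U+10FFFF")
    # Subtracting 0x10000 leaves a 20-bit value.
    offset = code_point - 0x10000
    # The top 10 bits go into the high surrogate (0xD800..0xDBFF).
    high = 0xD800 + (offset >> 10)
    # The bottom 10 bits go into the low surrogate (0xDC00..0xDFFF).
    low = 0xDC00 + (offset & 0x3FF)
    return high, low


def js_escape(code_point: int) -> str:
    """Format the surrogate pair as two \\uXXXX escapes, high surrogate first."""
    high, low = to_surrogates(code_point)
    return f"\\u{high:04X}\\u{low:04X}"


print(js_escape(0x1F36A))  # \uD83C\uDF6A
```

The printed text can then be pasted into a buffer and passed to the Decode JS command as described in the steps above.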

Return to wiki homepage.
