- Unicode Technical Standard #46: IDNA
- Unicode Technical Standard #51: Emoji
- Unicode Standard Annex #15: Normalization Forms
- Unicode Standard Annex #24: Script Property
- Unicode Standard Annex #29: Text Segmentation
- Unicode Standard Annex #31: Identifier and Pattern Syntax
- Unicode Technical Standard #39: Security Mechanisms
- RFC-3492: Punycode
- RFC-5891: IDNA: Protocol
- RFC-5892: The Unicode Code Points and IDNA
- WHATWG URL: IDNA
- Unicode data files
- Download Latest:
node download.js
- To download older versions:
node download.js 12.1.0
- Already included: Unicode 11-15
- CLDR data files
- Download Latest:
node parse-cldr.js
- Already included: CLDR 42
- Warning: these aren't versioned with Unicode!
- edit unicode-version.js — specify which versions to use
- edit Rules Files
- run node make.js to create /output/ with the following data files:
- chars-valid.js
- chars-ignored.js
- chars-mapped.js
- chars-disallow.js
- chars-fenced.js: characters that may only occur in the middle of a label and cannot be adjacent to one another
- chars-escape.js: characters that should be escaped
- emoji.js: various emoji configurations
- cm.js: combining mark sequence whitelist
- scripts.js: various script configurations
- confuse.js: confusable groups
- group-order.js: how groups should be sorted for matching efficiency (auto-generated)
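
One reading of the chars-fenced.js description above is that a fenced character may not be first, last, or next to another fenced character. The sketch below illustrates that rule only. `checkFenced` and `SAMPLE_FENCED` are hypothetical names, and the sample set is chosen for illustration; the real set would come from the generated data file, whose format is not shown here.

```javascript
// Hypothetical check, not the library's implementation.
// `fenced` is an assumed Set of code points treated as "fenced".
function checkFenced(label, fenced) {
  // Iterate by code point so astral characters are handled correctly.
  const cps = Array.from(label, (ch) => ch.codePointAt(0));
  if (cps.length === 0) return;
  if (fenced.has(cps[0])) {
    throw new Error('leading fenced character');
  }
  if (fenced.has(cps[cps.length - 1])) {
    throw new Error('trailing fenced character');
  }
  for (let i = 1; i < cps.length; i++) {
    if (fenced.has(cps[i - 1]) && fenced.has(cps[i])) {
      throw new Error('adjacent fenced characters');
    }
  }
}

// Illustrative sample only: U+2019 RIGHT SINGLE QUOTATION MARK.
const SAMPLE_FENCED = new Set([0x2019]);

checkFenced('a\u2019b', SAMPLE_FENCED); // passes: fenced character is in the middle
try {
  checkFenced('a\u2019\u2019b', SAMPLE_FENCED);
} catch (err) {
  console.log(err.message); // adjacent fenced characters
}
```

Passing the set as a parameter keeps the check independent of how the data file is loaded.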
- node names.js 61..7A 200D: print Unicode names for hex codepoints
- node names.js script Latn: print Unicode names for the Latin script
- node names.js prop White_Space: print Unicode names with property White_Space
- node names.js find abc: find characters by name
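
The hex codepoint arguments above mix single values (200D) with ranges (61..7A). As a minimal sketch of how such arguments could be expanded into a list of code points, assuming that `A..B` means an inclusive range: `parseCodepoints` is a hypothetical helper for illustration, not a function taken from names.js.

```javascript
// Hypothetical helper: expand arguments such as "61..7A" or "200D"
// into an array of code points. Ranges are assumed to be inclusive.
function parseCodepoints(args) {
  const cps = [];
  for (const arg of args) {
    const m = /^([0-9a-f]{1,6})(?:\.\.([0-9a-f]{1,6}))?$/i.exec(arg);
    if (!m) throw new Error(`not a hex codepoint or range: ${arg}`);
    const lo = parseInt(m[1], 16);
    const hi = m[2] === undefined ? lo : parseInt(m[2], 16);
    if (lo > hi || hi > 0x10FFFF) throw new Error(`invalid range: ${arg}`);
    for (let cp = lo; cp <= hi; cp++) cps.push(cp);
  }
  return cps;
}

const cps = parseCodepoints(['61..7A', '200D']);
console.log(cps.length); // 27 (a-z plus U+200D)
console.log(String.fromCodePoint(...cps.slice(0, 3))); // abc
```

In a command-line script the arguments would typically come from `process.argv.slice(2)`.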