Unicode version 1.4.0

@kipcole9 kipcole9 released this 11 Mar 04:28

Updates Unicode to version 13.0.

Unicode 13.0 was released in March 2020, and its data now forms the basis of ex_unicode version 1.4.0. Version 13 of Unicode adds 5,930 characters, for a total of 143,859 characters. These additions include four new scripts, for a total of 154 scripts, as well as 55 new emoji characters.

Adds derived categories for various quotation marks.

Although the Unicode Character Database has a flag to indicate whether a given codepoint is a quotation mark, the flagged list does not include CJK quotation marks, dingbats or alternative encodings. Some additional derived categories, taken from Wikipedia, are therefore added. The added derived categories are:

  • QuoteMark - all quote marks
  • QuoteMarkLeft - all quote marks used on the left
  • QuoteMarkRight - quote marks used on the right
  • QuoteMarkAmbidextrous - quote marks used either left or right
  • QuoteMarkSingle - single quote marks
  • QuoteMarkDouble - double quote marks

These additional derived categories can be used in Unicode Sets, for example:

iex> Unicode.Set.match? ?', "[[:quote_mark:]]"
true
iex> Unicode.Set.match? ?', "[[:quote_mark_left:]]"
false
iex> Unicode.Set.match? ?', "[[:quote_mark_ambidextrous:]]"
true
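
The same categories can also be combined in ordinary code. The following is a minimal sketch, not part of the library: the module name QuoteMarks and the function side/1 are hypothetical, and it assumes that the package providing Unicode.Set is available as a dependency. It only uses the set names shown in the examples above.

```elixir
defmodule QuoteMarks do
  # `require` is needed if Unicode.Set.match?/2 is implemented as a macro
  # (an assumption here); it is harmless if it is a plain function.
  require Unicode.Set

  # Hypothetical helper: report on which side a codepoint is used as a quote mark.
  # Ambidextrous is checked first so that it takes precedence over left and right.
  def side(codepoint) do
    cond do
      Unicode.Set.match?(codepoint, "[[:quote_mark_ambidextrous:]]") -> :ambidextrous
      Unicode.Set.match?(codepoint, "[[:quote_mark_left:]]") -> :left
      Unicode.Set.match?(codepoint, "[[:quote_mark_right:]]") -> :right
      true -> :none
    end
  end
end
```

Calling QuoteMarks.side(?') should then agree with the iex session above, which reports the apostrophe as ambidextrous and not as a left quote mark.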