2 changes: 1 addition & 1 deletion lib/elixir/lib/string.ex
@@ -18,7 +18,7 @@ defmodule String do
"hello world"

The functions in this module act according to
[The Unicode Standard, Version 15.0.0](http://www.unicode.org/versions/Unicode15.0.0/).
[The Unicode Standard, Version 15.1.0](http://www.unicode.org/versions/Unicode15.1.0/).

## Interpolation

13 changes: 8 additions & 5 deletions lib/elixir/unicode/IdentifierType.txt
@@ -1,11 +1,11 @@
# IdentifierType.txt
# Date: 2022-08-26, 16:49:09 GMT
# © 2022 Unicode®, Inc.
# Date: 2023-08-11, 17:46:40 GMT
# © 2023 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use, see https://www.unicode.org/terms_of_use.html
#
# Unicode Security Mechanisms for UTS #39
# Version: 15.0.0
# Version: 15.1.0
#
# For documentation and usage, see https://www.unicode.org/reports/tr39
#
@@ -576,10 +576,11 @@ FA27..FA29 ; Recommended # 1.1 [3] CJK COMPATIBILITY ID
2B740..2B81D ; Recommended # 6.0 [222] CJK UNIFIED IDEOGRAPH-2B740..CJK UNIFIED IDEOGRAPH-2B81D
2B820..2CEA1 ; Recommended # 8.0 [5762] CJK UNIFIED IDEOGRAPH-2B820..CJK UNIFIED IDEOGRAPH-2CEA1
2CEB0..2EBE0 ; Recommended # 10.0 [7473] CJK UNIFIED IDEOGRAPH-2CEB0..CJK UNIFIED IDEOGRAPH-2EBE0
2EBF0..2EE5D ; Recommended # 15.1 [622] CJK UNIFIED IDEOGRAPH-2EBF0..CJK UNIFIED IDEOGRAPH-2EE5D
30000..3134A ; Recommended # 13.0 [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; Recommended # 15.0 [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF

# Total code points: 112139
# Total code points: 112761

# Identifier_Type: Inclusion

@@ -1892,6 +1893,7 @@ A8F8..A8FA ; Obsolete Not_XID # 5.2 [3] DEVANAGARI SIGN PUSH
2E9B..2E9E ; Not_XID # 3.0 [4] CJK RADICAL CHOKE..CJK RADICAL DEATH
2EA0..2EF2 ; Not_XID # 3.0 [83] CJK RADICAL CIVILIAN..CJK RADICAL J-SIMPLIFIED TURTLE
2FF0..2FFB ; Not_XID # 3.0 [12] IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT..IDEOGRAPHIC DESCRIPTION CHARACTER OVERLAID
2FFC..2FFF ; Not_XID # 15.1 [4] IDEOGRAPHIC DESCRIPTION CHARACTER SURROUND FROM RIGHT..IDEOGRAPHIC DESCRIPTION CHARACTER ROTATION
3001..3004 ; Not_XID # 1.1 [4] IDEOGRAPHIC COMMA..JAPANESE INDUSTRIAL STANDARD SYMBOL
3008..301D ; Not_XID # 1.1 [22] LEFT ANGLE BRACKET..REVERSED DOUBLE PRIME QUOTATION MARK
301F..3020 ; Not_XID # 1.1 [2] LOW DOUBLE PRIME QUOTATION MARK..POSTAL MARK FACE
@@ -1903,6 +1905,7 @@ A8F8..A8FA ; Obsolete Not_XID # 5.2 [3] DEVANAGARI SIGN PUSH
3190..3191 ; Not_XID # 1.1 [2] IDEOGRAPHIC ANNOTATION LINKING MARK..IDEOGRAPHIC ANNOTATION REVERSE MARK
31C0..31CF ; Not_XID # 4.1 [16] CJK STROKE T..CJK STROKE N
31D0..31E3 ; Not_XID # 5.1 [20] CJK STROKE H..CJK STROKE Q
31EF ; Not_XID # 15.1 IDEOGRAPHIC DESCRIPTION CHARACTER SUBTRACTION
3248..324F ; Not_XID # 5.2 [8] CIRCLED NUMBER TEN ON BLACK SQUARE..CIRCLED NUMBER EIGHTY ON BLACK SQUARE
A67E ; Not_XID # 5.1 CYRILLIC KAVYKA
A720..A721 ; Not_XID # 5.0 [2] MODIFIER LETTER STRESS AND HIGH TONE..MODIFIER LETTER STRESS AND LOW TONE
@@ -2136,7 +2139,7 @@ FFFD ; Not_XID # 1.1 REPLACEMENT CHARACTE
1FB00..1FB92 ; Not_XID # 13.0 [147] BLOCK SEXTANT-1..UPPER HALF INVERSE MEDIUM SHADE AND LOWER HALF BLOCK
1FB94..1FBCA ; Not_XID # 13.0 [55] LEFT HALF INVERSE MEDIUM SHADE AND RIGHT HALF BLOCK..WHITE UP-POINTING CHEVRON

# Total code points: 5699
# Total code points: 5704

# Identifier_Type: Not_NFKC
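
The `# Total code points` changes in the IdentifierType.txt hunks above can be cross-checked against the ranges this diff adds. The following is a minimal Python sketch, not part of the PR; the helper `count_code_points` and the regular expression are illustrative assumptions about the `RANGE ; Value # comment` line layout seen in these hunks.

```python
import re

# Matches "XXXX" or "XXXX..YYYY" followed by ";" at the start of a data line.
# Assumption: code points are 4 to 6 uppercase hex digits, as in the hunks above.
RANGE_RE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;")


def count_code_points(line: str) -> int:
    """Return how many code points a data line covers (0 for comments and blanks)."""
    match = RANGE_RE.match(line.strip())
    if match is None:
        return 0
    first = int(match.group(1), 16)
    last = int(match.group(2) or match.group(1), 16)
    return last - first + 1


# Lines added by this diff, copied from the hunks above.
added_recommended = [
    "2EBF0..2EE5D ; Recommended # 15.1 [622] CJK UNIFIED IDEOGRAPH-2EBF0..CJK UNIFIED IDEOGRAPH-2EE5D",
]
added_not_xid = [
    "2FFC..2FFF ; Not_XID # 15.1 [4] IDEOGRAPHIC DESCRIPTION CHARACTER SURROUND FROM RIGHT..IDEOGRAPHIC DESCRIPTION CHARACTER ROTATION",
    "31EF ; Not_XID # 15.1 IDEOGRAPHIC DESCRIPTION CHARACTER SUBTRACTION",
]

recommended_delta = sum(count_code_points(line) for line in added_recommended)
not_xid_delta = sum(count_code_points(line) for line in added_not_xid)

# Compare against the old and new totals stated in the hunks.
print(recommended_delta == 112761 - 112139)
print(not_xid_delta == 5704 - 5699)
```

Both comparisons hold for the numbers shown in this diff, which suggests the totals were regenerated consistently with the added ranges.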

78 changes: 69 additions & 9 deletions lib/elixir/unicode/PropList.txt
@@ -1,6 +1,6 @@
# PropList-15.0.0.txt
# Date: 2022-08-05, 22:17:16 GMT
# © 2022 Unicode®, Inc.
# PropList-15.1.0.txt
# Date: 2023-08-01, 21:56:53 GMT
# © 2023 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use, see https://www.unicode.org/terms_of_use.html
#
@@ -856,11 +856,12 @@ FA70..FAD9 ; Ideographic # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COM
2B740..2B81D ; Ideographic # Lo [222] CJK UNIFIED IDEOGRAPH-2B740..CJK UNIFIED IDEOGRAPH-2B81D
2B820..2CEA1 ; Ideographic # Lo [5762] CJK UNIFIED IDEOGRAPH-2B820..CJK UNIFIED IDEOGRAPH-2CEA1
2CEB0..2EBE0 ; Ideographic # Lo [7473] CJK UNIFIED IDEOGRAPH-2CEB0..CJK UNIFIED IDEOGRAPH-2EBE0
2EBF0..2EE5D ; Ideographic # Lo [622] CJK UNIFIED IDEOGRAPH-2EBF0..CJK UNIFIED IDEOGRAPH-2EE5D
2F800..2FA1D ; Ideographic # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
30000..3134A ; Ideographic # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; Ideographic # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF

# Total code points: 105854
# Total code points: 106476

# ================================================

@@ -1241,9 +1242,10 @@ E0020..E007F ; Other_Grapheme_Extend # Cf [96] TAG SPACE..CANCEL TAG
# ================================================

2FF0..2FF1 ; IDS_Binary_Operator # So [2] IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT..IDEOGRAPHIC DESCRIPTION CHARACTER ABOVE TO BELOW
2FF4..2FFB ; IDS_Binary_Operator # So [8] IDEOGRAPHIC DESCRIPTION CHARACTER FULL SURROUND..IDEOGRAPHIC DESCRIPTION CHARACTER OVERLAID
2FF4..2FFD ; IDS_Binary_Operator # So [10] IDEOGRAPHIC DESCRIPTION CHARACTER FULL SURROUND..IDEOGRAPHIC DESCRIPTION CHARACTER SURROUND FROM LOWER RIGHT
31EF ; IDS_Binary_Operator # So IDEOGRAPHIC DESCRIPTION CHARACTER SUBTRACTION

# Total code points: 10
# Total code points: 13

# ================================================

@@ -1253,6 +1255,12 @@ E0020..E007F ; Other_Grapheme_Extend # Cf [96] TAG SPACE..CANCEL TAG

# ================================================

2FFE..2FFF ; IDS_Unary_Operator # So [2] IDEOGRAPHIC DESCRIPTION CHARACTER HORIZONTAL REFLECTION..IDEOGRAPHIC DESCRIPTION CHARACTER ROTATION

# Total code points: 2

# ================================================

2E80..2E99 ; Radical # So [26] CJK RADICAL REPEAT..CJK RADICAL RAP
2E9B..2EF3 ; Radical # So [89] CJK RADICAL CHOKE..CJK RADICAL C-SIMPLIFIED TURTLE
2F00..2FD5 ; Radical # So [214] KANGXI RADICAL ONE..KANGXI RADICAL FLUTE
@@ -1275,10 +1283,11 @@ FA27..FA29 ; Unified_Ideograph # Lo [3] CJK COMPATIBILITY IDEOGRAPH-FA27..C
2B740..2B81D ; Unified_Ideograph # Lo [222] CJK UNIFIED IDEOGRAPH-2B740..CJK UNIFIED IDEOGRAPH-2B81D
2B820..2CEA1 ; Unified_Ideograph # Lo [5762] CJK UNIFIED IDEOGRAPH-2B820..CJK UNIFIED IDEOGRAPH-2CEA1
2CEB0..2EBE0 ; Unified_Ideograph # Lo [7473] CJK UNIFIED IDEOGRAPH-2CEB0..CJK UNIFIED IDEOGRAPH-2EBE0
2EBF0..2EE5D ; Unified_Ideograph # Lo [622] CJK UNIFIED IDEOGRAPH-2EBF0..CJK UNIFIED IDEOGRAPH-2EE5D
30000..3134A ; Unified_Ideograph # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; Unified_Ideograph # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF

# Total code points: 97058
# Total code points: 97680

# ================================================

@@ -1376,8 +1385,58 @@ AABB..AABC ; Logical_Order_Exception # Lo [2] TAI VIET VOWEL AUE..TAI VIET
0387 ; Other_ID_Continue # Po GREEK ANO TELEIA
1369..1371 ; Other_ID_Continue # No [9] ETHIOPIC DIGIT ONE..ETHIOPIC DIGIT NINE
19DA ; Other_ID_Continue # No NEW TAI LUE THAM DIGIT ONE
200C..200D ; Other_ID_Continue # Cf [2] ZERO WIDTH NON-JOINER..ZERO WIDTH JOINER
30FB ; Other_ID_Continue # Po KATAKANA MIDDLE DOT
FF65 ; Other_ID_Continue # Po HALFWIDTH KATAKANA MIDDLE DOT

# Total code points: 12
# Total code points: 16

# ================================================

00B2..00B3 ; ID_Compat_Math_Continue # No [2] SUPERSCRIPT TWO..SUPERSCRIPT THREE
00B9 ; ID_Compat_Math_Continue # No SUPERSCRIPT ONE
2070 ; ID_Compat_Math_Continue # No SUPERSCRIPT ZERO
2074..2079 ; ID_Compat_Math_Continue # No [6] SUPERSCRIPT FOUR..SUPERSCRIPT NINE
207A..207C ; ID_Compat_Math_Continue # Sm [3] SUPERSCRIPT PLUS SIGN..SUPERSCRIPT EQUALS SIGN
207D ; ID_Compat_Math_Continue # Ps SUPERSCRIPT LEFT PARENTHESIS
207E ; ID_Compat_Math_Continue # Pe SUPERSCRIPT RIGHT PARENTHESIS
2080..2089 ; ID_Compat_Math_Continue # No [10] SUBSCRIPT ZERO..SUBSCRIPT NINE
208A..208C ; ID_Compat_Math_Continue # Sm [3] SUBSCRIPT PLUS SIGN..SUBSCRIPT EQUALS SIGN
208D ; ID_Compat_Math_Continue # Ps SUBSCRIPT LEFT PARENTHESIS
208E ; ID_Compat_Math_Continue # Pe SUBSCRIPT RIGHT PARENTHESIS
2202 ; ID_Compat_Math_Continue # Sm PARTIAL DIFFERENTIAL
2207 ; ID_Compat_Math_Continue # Sm NABLA
221E ; ID_Compat_Math_Continue # Sm INFINITY
1D6C1 ; ID_Compat_Math_Continue # Sm MATHEMATICAL BOLD NABLA
1D6DB ; ID_Compat_Math_Continue # Sm MATHEMATICAL BOLD PARTIAL DIFFERENTIAL
1D6FB ; ID_Compat_Math_Continue # Sm MATHEMATICAL ITALIC NABLA
1D715 ; ID_Compat_Math_Continue # Sm MATHEMATICAL ITALIC PARTIAL DIFFERENTIAL
1D735 ; ID_Compat_Math_Continue # Sm MATHEMATICAL BOLD ITALIC NABLA
1D74F ; ID_Compat_Math_Continue # Sm MATHEMATICAL BOLD ITALIC PARTIAL DIFFERENTIAL
1D76F ; ID_Compat_Math_Continue # Sm MATHEMATICAL SANS-SERIF BOLD NABLA
1D789 ; ID_Compat_Math_Continue # Sm MATHEMATICAL SANS-SERIF BOLD PARTIAL DIFFERENTIAL
1D7A9 ; ID_Compat_Math_Continue # Sm MATHEMATICAL SANS-SERIF BOLD ITALIC NABLA
1D7C3 ; ID_Compat_Math_Continue # Sm MATHEMATICAL SANS-SERIF BOLD ITALIC PARTIAL DIFFERENTIAL

# Total code points: 43

# ================================================

2202 ; ID_Compat_Math_Start # Sm PARTIAL DIFFERENTIAL
2207 ; ID_Compat_Math_Start # Sm NABLA
221E ; ID_Compat_Math_Start # Sm INFINITY
1D6C1 ; ID_Compat_Math_Start # Sm MATHEMATICAL BOLD NABLA
1D6DB ; ID_Compat_Math_Start # Sm MATHEMATICAL BOLD PARTIAL DIFFERENTIAL
1D6FB ; ID_Compat_Math_Start # Sm MATHEMATICAL ITALIC NABLA
1D715 ; ID_Compat_Math_Start # Sm MATHEMATICAL ITALIC PARTIAL DIFFERENTIAL
1D735 ; ID_Compat_Math_Start # Sm MATHEMATICAL BOLD ITALIC NABLA
1D74F ; ID_Compat_Math_Start # Sm MATHEMATICAL BOLD ITALIC PARTIAL DIFFERENTIAL
1D76F ; ID_Compat_Math_Start # Sm MATHEMATICAL SANS-SERIF BOLD NABLA
1D789 ; ID_Compat_Math_Start # Sm MATHEMATICAL SANS-SERIF BOLD PARTIAL DIFFERENTIAL
1D7A9 ; ID_Compat_Math_Start # Sm MATHEMATICAL SANS-SERIF BOLD ITALIC NABLA
1D7C3 ; ID_Compat_Math_Start # Sm MATHEMATICAL SANS-SERIF BOLD ITALIC PARTIAL DIFFERENTIAL

# Total code points: 13

# ================================================
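
The two new sections above list the same mathematical symbols twice, which suggests that every `ID_Compat_Math_Start` code point is also `ID_Compat_Math_Continue`. A minimal Python sketch to check that against the values in this hunk follows; the helper `expand` is a hypothetical name, and the lists are copied by hand from the hunk rather than read from the file.

```python
def expand(entries):
    """Expand 'XXXX' or 'XXXX..YYYY' strings into a set of integer code points."""
    points = set()
    for entry in entries:
        first, _, last = entry.partition("..")
        points.update(range(int(first, 16), int(last or first, 16) + 1))
    return points


# Code point fields copied from the ID_Compat_Math_Continue section above.
continue_fields = [
    "00B2..00B3", "00B9", "2070", "2074..2079", "207A..207C", "207D", "207E",
    "2080..2089", "208A..208C", "208D", "208E", "2202", "2207", "221E",
    "1D6C1", "1D6DB", "1D6FB", "1D715", "1D735", "1D74F", "1D76F", "1D789",
    "1D7A9", "1D7C3",
]

# Code point fields copied from the ID_Compat_Math_Start section above.
start_fields = [
    "2202", "2207", "221E", "1D6C1", "1D6DB", "1D6FB", "1D715", "1D735",
    "1D74F", "1D76F", "1D789", "1D7A9", "1D7C3",
]

math_continue = expand(continue_fields)
math_start = expand(start_fields)

print(len(math_continue), len(math_start))  # sizes to compare with the stated totals
print(math_start <= math_continue)          # True when Start is a subset of Continue
print(sorted(f"{cp:04X}" for cp in math_continue - math_start))  # Continue-only code points
```

For the data in this hunk, the set sizes match the stated totals (43 and 13), and the code points that are Continue-only are the superscript and subscript characters.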

@@ -1398,6 +1457,7 @@ AABB..AABC ; Logical_Order_Exception # Lo [2] TAI VIET VOWEL AUE..TAI VIET
1367..1368 ; Sentence_Terminal # Po [2] ETHIOPIC QUESTION MARK..ETHIOPIC PARAGRAPH SEPARATOR
166E ; Sentence_Terminal # Po CANADIAN SYLLABICS FULL STOP
1735..1736 ; Sentence_Terminal # Po [2] PHILIPPINE SINGLE PUNCTUATION..PHILIPPINE DOUBLE PUNCTUATION
17D4..17D5 ; Sentence_Terminal # Po [2] KHMER SIGN KHAN..KHMER SIGN BARIYOOSAN
1803 ; Sentence_Terminal # Po MONGOLIAN FULL STOP
1809 ; Sentence_Terminal # Po MONGOLIAN MANCHU FULL STOP
1944..1945 ; Sentence_Terminal # Po [2] LIMBU EXCLAMATION MARK..LIMBU QUESTION MARK
@@ -1462,7 +1522,7 @@ FF61 ; Sentence_Terminal # Po HALFWIDTH IDEOGRAPHIC FULL STOP
1BC9F ; Sentence_Terminal # Po DUPLOYAN PUNCTUATION CHINOOK FULL STOP
1DA88 ; Sentence_Terminal # Po SIGNWRITING FULL STOP

# Total code points: 154
# Total code points: 156

# ================================================

38 changes: 35 additions & 3 deletions lib/elixir/unicode/PropertyValueAliases.txt
@@ -1,6 +1,6 @@
# PropertyValueAliases-15.0.0.txt
# Date: 2022-08-05, 23:42:17 GMT
# © 2022 Unicode®, Inc.
# PropertyValueAliases-15.1.0.txt
# Date: 2023-08-07, 15:21:34 GMT
# © 2023 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use, see https://www.unicode.org/terms_of_use.html
#
@@ -91,6 +91,7 @@ age; 12.1 ; V12_1
age; 13.0 ; V13_0
age; 14.0 ; V14_0
age; 15.0 ; V15_0
age; 15.1 ; V15_1
age; NA ; Unassigned

# Alphabetic (Alpha)
@@ -208,6 +209,7 @@ blk; CJK_Ext_E ; CJK_Unified_Ideographs_Extension_E
blk; CJK_Ext_F ; CJK_Unified_Ideographs_Extension_F
blk; CJK_Ext_G ; CJK_Unified_Ideographs_Extension_G
blk; CJK_Ext_H ; CJK_Unified_Ideographs_Extension_H
blk; CJK_Ext_I ; CJK_Unified_Ideographs_Extension_I
blk; CJK_Radicals_Sup ; CJK_Radicals_Supplement
blk; CJK_Strokes ; CJK_Strokes
blk; CJK_Symbols ; CJK_Symbols_And_Punctuation
@@ -817,6 +819,21 @@ IDSB; Y ; Yes ; T
IDST; N ; No ; F ; False
IDST; Y ; Yes ; T ; True

# IDS_Unary_Operator (IDSU)

IDSU; N ; No ; F ; False
IDSU; Y ; Yes ; T ; True

# ID_Compat_Math_Continue (ID_Compat_Math_Continue)

ID_Compat_Math_Continue; N ; No ; F ; False
ID_Compat_Math_Continue; Y ; Yes ; T ; True

# ID_Compat_Math_Start (ID_Compat_Math_Start)

ID_Compat_Math_Start; N ; No ; F ; False
ID_Compat_Math_Start; Y ; Yes ; T ; True

# ID_Continue (IDC)

IDC; N ; No ; F ; False
@@ -836,6 +853,13 @@ IDS; Y ; Yes ; T
Ideo; N ; No ; F ; False
Ideo; Y ; Yes ; T ; True

# Indic_Conjunct_Break (InCB)

InCB; Consonant ; Consonant
InCB; Extend ; Extend
InCB; Linker ; Linker
InCB; None ; None

# Indic_Positional_Category (InPC)

InPC; Bottom ; Bottom
@@ -1074,7 +1098,10 @@ jt ; U ; Non_Joining
# Line_Break (lb)

lb ; AI ; Ambiguous
lb ; AK ; Aksara
lb ; AL ; Alphabetic
lb ; AP ; Aksara_Prebase
lb ; AS ; Aksara_Start
lb ; B2 ; Break_Both
lb ; BA ; Break_After
lb ; BB ; Break_Before
@@ -1112,6 +1139,8 @@ lb ; SA ; Complex_Context
lb ; SG ; Surrogate
lb ; SP ; Space
lb ; SY ; Break_Symbols
lb ; VF ; Virama_Final
lb ; VI ; Virama
lb ; WJ ; Word_Joiner
lb ; XX ; Unknown
lb ; ZW ; ZWSpace
@@ -1156,6 +1185,9 @@ NFKC_QC; M ; Maybe
NFKC_QC; N ; No
NFKC_QC; Y ; Yes

# NFKC_Simple_Casefold (NFKC_SCF)


# NFKD_Quick_Check (NFKD_QC)

NFKD_QC; N ; No
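
The PropertyValueAliases.txt entries above follow a `property ; short ; long` layout, optionally with further aliases. A minimal Python sketch of reading such lines into a lookup table follows. It is illustrative only: `parse_alias_line` is a hypothetical helper, the sample lines are copied from the hunks above, and the assumption that the first two alias fields are the short and long names holds for these lines but may not hold for every property in the full file.

```python
def parse_alias_line(line):
    """Split 'prop ; short ; long [; more]' into (prop, [aliases]).

    Returns None for comment-only and blank lines.
    """
    content = line.split("#", 1)[0].strip()
    if not content:
        return None
    prop, *aliases = [field.strip() for field in content.split(";")]
    return prop, aliases


# Sample lines copied from the hunks above (new in 15.1).
sample_lines = [
    "# Line_Break (lb)",
    "",
    "lb ; AK ; Aksara",
    "lb ; AP ; Aksara_Prebase",
    "lb ; AS ; Aksara_Start",
    "lb ; VF ; Virama_Final",
    "lb ; VI ; Virama",
    "InCB; Linker ; Linker",
    "IDSU; Y ; Yes ; T ; True",
]

# Build {property: {short_name: long_name}}.
alias_table = {}
for line in sample_lines:
    parsed = parse_alias_line(line)
    if parsed is None:
        continue
    prop, names = parsed
    alias_table.setdefault(prop, {})[names[0]] = names[1]

print(alias_table["lb"]["AK"])        # long name for the new AK line break class
print(alias_table["InCB"]["Linker"])  # short and long names coincide here
print(sorted(alias_table["lb"]))      # short names collected for lb
```

A header-only section such as `# NFKC_Simple_Casefold (NFKC_SCF)` contributes nothing to the table, since the sketch skips comment lines.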