zoc

Tested software version 8.07.3 on Darwin. Full results are available in the ucs-detect repository at path data/macos-zoc-8.07.3.yaml.

Wide character support

The best wide Unicode table version for zoc appears to be 15.0.0, based on a summary of the following results:

version   n_errors  n_total  pct_success
'5.1.0'   0         26       100.0%
'5.2.0'   55        269      79.6%
'6.0.0'   10        13       23.1%
'9.0.0'   27        5000     99.5%
'10.0.0'  6         735      99.2%
'11.0.0'  0         62       100.0%
'12.0.0'  12        62       80.6%
'12.1.0'  0         1        100.0%
'13.0.0'  2         541      99.6%
'14.0.0'  2         41       95.1%
'15.0.0'  1         15       93.3%
'15.1.0'  4         5        20.0%
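
The pct_success column appears to be the share of tested codepoints that aligned correctly, that is, (n_total - n_errors) / n_total. The sketch below recomputes it for a few rows of the table above. The function name `pct_success` is only illustrative and is not taken from ucs-detect.

```python
def pct_success(n_errors: int, n_total: int) -> str:
    """Return the success rate as a percentage string with one decimal place."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return f"{(n_total - n_errors) / n_total * 100:.1f}%"


# (version, n_errors, n_total) rows copied from the table above
rows = [("5.2.0", 55, 269), ("6.0.0", 10, 13), ("15.1.0", 4, 5)]
for version, n_errors, n_total in rows:
    print(version, pct_success(n_errors, n_total))
```

For these three rows the recomputed values agree with the reported 79.6%, 23.1% and 20.0%.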

Sequence of a WIDE character from Unicode Version 15.0.0, from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0001F6DC  '\U0001f6dc'  So        2        WIRELESS

Total codepoints: 1

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xf0\x9f\x9b\x9c|\\n12|\\n"
    🛜|
    12|
  • python wcwidth.wcswidth() measures width 2, while zoc measures width 1.
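
The hex escapes in the printf test are simply the UTF-8 encoding of the codepoints listed in the table. As a rough illustration, the sketch below decodes such an escape string back into codepoint labels using only the Python standard library. The helper names `decode_printf_escapes` and `describe` are hypothetical, and the category reported for very recent codepoints depends on the Unicode version bundled with the interpreter.

```python
import re
import unicodedata


def decode_printf_escapes(escaped: str) -> str:
    """Decode a run of two-digit hex escapes, as used in the printf tests, as UTF-8."""
    raw = bytes(int(h, 16) for h in re.findall(r"\\x([0-9a-fA-F]{2})", escaped))
    return raw.decode("utf-8")


def describe(text: str) -> list:
    """Return (codepoint label, general category) for each character."""
    out = []
    for ch in text:
        cp = ord(ch)
        # Match the report's style: 8 hex digits beyond the BMP, 4 otherwise.
        label = f"U+{cp:08X}" if cp > 0xFFFF else f"U+{cp:04X}"
        out.append((label, unicodedata.category(ch)))
    return out


wide = decode_printf_escapes(r"\xf0\x9f\x9b\x9c")
print(describe(wide))
```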

Emoji ZWJ support

The best Emoji ZWJ table version for zoc appears to be None, based on a summary of the following results:

version  n_errors  n_total  pct_success
'2.0'    22        22       0.0%
'4.0'    500       500      0.0%
'5.0'    100       100      0.0%
'11.0'   73        73       0.0%
'12.0'   112       112      0.0%
'12.1'   165       165      0.0%
'13.0'   51        51       0.0%
'13.1'   83        83       0.0%
'14.0'   20        20       0.0%
'15.0'   1         1        0.0%
'15.1'   109       109      0.0%

Sequence of an Emoji ZWJ Sequence from Emoji Version 15.1, from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0001F9D1  '\U0001f9d1'  So        2        ADULT
U+200D      '\u200d'      Cf        0        ZERO WIDTH JOINER
U+0001F9BC  '\U0001f9bc'  So        2        MOTORIZED WHEELCHAIR
U+200D      '\u200d'      Cf        0        ZERO WIDTH JOINER
U+27A1      '\u27a1'      So        1        BLACK RIGHTWARDS ARROW
U+FE0F      '\ufe0f'      Mn        0        VARIATION SELECTOR-16

Total codepoints: 6

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xf0\x9f\xa7\x91\xe2\x80\x8d\xf0\x9f\xa6\xbc\xe2\x80\x8d\xe2\x9e\xa1\xef\xb8\x8f|\\n12|\\n"
    🧑‍🦼‍➡️|
    12|
  • python wcwidth.wcswidth() measures width 2, while zoc measures width 8.
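
The difference between summing per-codepoint widths and measuring the sequence as one joined emoji can be illustrated with a simplified model. The sketch below is not the wcwidth implementation. It assumes only that a codepoint immediately following a ZERO WIDTH JOINER is absorbed into the preceding emoji and adds no width, and it hard-codes the per-codepoint widths from the table above.

```python
ZWJ = "\u200d"

# Per-codepoint widths copied from the wcwidth column of the table above.
TABLE_WIDTHS = {
    "\U0001f9d1": 2,  # ADULT
    "\u200d": 0,      # ZERO WIDTH JOINER
    "\U0001f9bc": 2,  # MOTORIZED WHEELCHAIR
    "\u27a1": 1,      # BLACK RIGHTWARDS ARROW
    "\ufe0f": 0,      # VARIATION SELECTOR-16
}


def naive_width(text: str) -> int:
    """Sum the per-codepoint widths, ignoring any joining."""
    return sum(TABLE_WIDTHS[ch] for ch in text)


def joined_width(text: str) -> int:
    """Sum widths, skipping the codepoint that follows each ZWJ (assumption)."""
    total = 0
    skip_next = False
    for ch in text:
        if skip_next:
            skip_next = False
            continue
        if ch == ZWJ:
            skip_next = True
            continue
        total += TABLE_WIDTHS[ch]
    return total


sequence = "\U0001f9d1\u200d\U0001f9bc\u200d\u27a1\ufe0f"
print(naive_width(sequence), joined_width(sequence))
```

Under this model the joined measurement is 2, which agrees with the wcwidth.wcswidth() figure reported above, while the plain per-codepoint sum is 5. Neither figure accounts for the width of 8 measured by zoc.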

Variation Selector-16 support

Emoji VS-16 results for zoc are 0 errors out of 100 total codepoints tested, a 100.0% success rate. All tested codepoint combinations with Variation Selector-16 were successful.

Language Support

The following 7 languages were tested with 100% success:

Adyghe; Idoma; Kabardian; Tamazight, Central Atlas (Tifinagh); Tamazight, Standard Moroccan; Vai; Yukaghir, Northern.

The following 91 languages are not fully supported:

lang                         n_errors  n_total  pct_success
Javanese (Javanese)          500       500      0.0%
Nuosu                        230       230      0.0%
Cherokee (cased)             500       507      1.4%
Tai Dam                      500       511      2.2%
Maldivian                    500       515      2.9%
Tamil                        500       516      3.1%
Tamil (Sri Lanka)            500       516      3.1%
Burmese                      500       519      3.7%
Mon                          500       522      4.2%
Shan                         500       523      4.4%
Dzongkha                     342       359      4.7%
Gujarati                     500       530      5.7%
Tibetan, Central             263       279      5.7%
Malayalam                    500       533      6.2%
Tamang, Eastern              42        45       6.7%
Kannada                      500       536      6.7%
Khün                         412       442      6.8%
Khmer, Central               492       528      6.8%
Bengali                      500       540      7.4%
Chakma                       500       540      7.4%
Telugu                       500       550      9.1%
Nepali                       500       554      9.7%
Sanskrit                     500       563      11.2%
Sanskrit (Grantha)           500       565      11.5%
Marathi                      500       571      12.4%
Hindi                        500       576      13.2%
Sinhala                      500       577      13.3%
Panjabi, Eastern             500       578      13.5%
Bhojpuri                     500       584      14.4%
Thai (2)                     267       313      14.7%
Maithili                     500       613      18.4%
Thai                         273       341      19.9%
Magahi                       500       643      22.2%
Vietnamese                   500       660      24.2%
Tagalog (Tagalog)            21        31       32.3%
Lao                          270       426      36.6%
Lingala (tones)              500       844      40.8%
Vietnamese (Han nom)         107       199      46.2%
Pular (Adlam)                500       1044     52.1%
Yiddish, Eastern             500       1062     52.9%
Bamun                        500       1138     56.1%
Orok                         490       1245     60.6%
Tem                          500       1290     61.2%
Nanai                        379       1207     68.6%
Evenki                       267       899      70.3%
Yaneshaʼ                     500       1762     71.6%
Ticuna                       500       1767     71.7%
Amarakaeri                   401       1446     72.3%
South Azerbaijani            385       1396     72.4%
Yoruba                       500       2177     77.0%
Chickasaw                    122       554      78.0%
Siona                        273       1492     81.7%
Fur                          228       1838     87.6%
Chinantec, Chiltepec         213       1729     87.7%
Gumuz                        132       1283     89.7%
Bora                         162       1598     89.9%
Mòoré                        226       2447     90.8%
Mongolian, Halh (Mongolian)  3         33       90.9%
Lamnso'                      197       2237     91.2%
Navajo                       138       1600     91.4%
Tamazight, Central Atlas     154       1822     91.5%
Gilyak                       124       1504     91.8%
Ditammari                    139       1882     92.6%
Assyrian Neo-Aramaic         74        1160     93.6%
Farsi, Western               102       1822     94.4%
Otomi, Mezquital             85        1849     95.4%
Veps                         59        1323     95.5%
Waama                        38        1000     96.2%
Dinka, Northeastern          56        1529     96.3%
Dari                         66        1872     96.5%
Éwé                          55        2230     97.5%
Baatonum                     47        1939     97.6%
Urdu (2)                     52        2251     97.7%
Urdu                         50        2237     97.8%
Uduk                         71        3247     97.8%
Mazahua Central              34        1574     97.8%
Secoya                       29        1409     97.9%
Gen                          46        2309     98.0%
Picard                       36        2024     98.2%
Mixtec, Metlatónoc           24        1367     98.2%
Arabic, Standard             20        1348     98.5%
Ga                           26        2039     98.7%
Panjabi, Western             21        2419     99.1%
Dangme                       22        2912     99.2%
Dagaare, Southern            19        2582     99.3%
Serer-Sine                   7         1596     99.6%
Fon                          10        2520     99.6%
Aja                          7         2061     99.7%
Pashto, Northern             4         2242     99.8%
Dendi                        2         1569     99.9%
Seraiki                      2         2242     99.9%

Javanese (Javanese)

Sequence of language Javanese (Javanese) from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+A9CB     '\ua9cb'  Po        1        JAVANESE PADA ADEG ADEG
U+A9B1     '\ua9b1'  Lo        1        JAVANESE LETTER SA
U+A9A7     '\ua9a7'  Lo        1        JAVANESE LETTER BA
U+A9BC     '\ua9bc'  Mn        0        JAVANESE VOWEL SIGN PEPET
U+A9A4     '\ua9a4'  Lo        1        JAVANESE LETTER NA

Total codepoints: 5

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xea\xa7\x8b\xea\xa6\xb1\xea\xa6\xa7\xea\xa6\xbc\xea\xa6\xa4|\\n1234|\\n"
    ꧋ꦱꦧꦼꦤ|
    1234|
  • python wcwidth.wcswidth() measures width 4, while zoc measures width 10.
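
The wcwidth column in these per-language tables follows a recognisable pattern: combining marks and format characters (categories Mn, Mc and Cf) count as 0, East Asian Wide or Fullwidth characters count as 2, and everything else counts as 1. The sketch below approximates that pattern with Python's standard unicodedata module. It is an approximation built on those assumptions, not the wcwidth library itself, and its results depend on the Unicode version bundled with the interpreter.

```python
import unicodedata

# Categories shown with wcwidth 0 in the tables of this report (assumption:
# this set is sufficient for the language samples shown here).
ZERO_WIDTH_CATEGORIES = {"Mn", "Mc", "Cf"}


def approx_width(text: str) -> int:
    """Approximate the expected cell width of text, codepoint by codepoint."""
    total = 0
    for ch in text:
        if unicodedata.category(ch) in ZERO_WIDTH_CATEGORIES:
            continue  # combining marks and format characters take no cell
        total += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return total


javanese = "\ua9cb\ua9b1\ua9a7\ua9bc\ua9a4"
print(approx_width(javanese))
```

For the Javanese sample this gives 4, the same figure reported for wcwidth.wcswidth(). A terminal that gives every codepoint, including the combining vowel sign, at least one cell will measure more than that.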

Nuosu

Sequence of language Nuosu from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+300A     '\u300a'  Ps        2        LEFT DOUBLE ANGLE BRACKET
U+A2E7     '\ua2e7'  Lo        2        YI SYLLABLE ZZYT
U+A0C5     '\ua0c5'  Lo        2        YI SYLLABLE MU
U+A2BD     '\ua2bd'  Lo        2        YI SYLLABLE COT
U+A305     '\ua305'  Lo        2        YI SYLLABLE NZY
U+A14D     '\ua14d'  Lo        2        YI SYLLABLE DDU
U+A11C     '\ua11c'  Lo        2        YI SYLLABLE TI
U+A2CA     '\ua2ca'  Lo        2        YI SYLLABLE CYT
U+A12F     '\ua12f'  Lo        2        YI SYLLABLE TEP
U+A489     '\ua489'  Lo        2        YI SYLLABLE YY
U+300B     '\u300b'  Pe        2        RIGHT DOUBLE ANGLE BRACKET

Total codepoints: 11

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe3\x80\x8a\xea\x8b\xa7\xea\x83\x85\xea\x8a\xbd\xea\x8c\x85\xea\x85\x8d\xea\x84\x9c\xea\x8b\x8a\xea\x84\xaf\xea\x92\x89\xe3\x80\x8b|\\n1234567890123456789012|\\n"
    《ꋧꃅꊽꌅꅍꄜꋊꄯꒉ》|
    1234567890123456789012|
  • python wcwidth.wcswidth() measures width 22, while zoc measures width 13.
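
The second line of each printf test is a ruler of repeating digits whose length equals the width reported by wcwidth.wcswidth(), so the two '|' characters line up only when the terminal agrees with that width. A minimal sketch of how such a test command could be assembled follows. The helper names `ruler` and `printf_test` are hypothetical and are not taken from ucs-detect.

```python
def ruler(width: int) -> str:
    """Digits 1..9,0 repeated to the given length, e.g. 12 -> '123456789012'."""
    return "".join(str(i % 10) for i in range(1, width + 1))


def printf_test(text: str, width: int) -> str:
    """Build a printf(1) command in the style of the shell tests above."""
    # Each UTF-8 byte of the text becomes a two-digit hex escape.
    escaped = "".join(f"\\x{b:02x}" for b in text.encode("utf-8"))
    return f'printf "{escaped}|\\\\n{ruler(width)}|\\\\n"'


# Opening bracket, one Yi syllable and closing bracket: three wide characters.
print(printf_test("\u300a\ua2e7\u300b", 6))
```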

Cherokee (cased)

Sequence of language Cherokee (cased) from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+13C2     '\u13c2'  Lu        1        CHEROKEE LETTER NI
U+AB7C     '\uab7c'  Ll        1        CHEROKEE SMALL LETTER GV
U+AB8E     '\uab8e'  Ll        1        CHEROKEE SMALL LETTER NA
U+ABAB     '\uabab'  Ll        1        CHEROKEE SMALL LETTER DV

Total codepoints: 4

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe1\x8f\x82\xea\xad\xbc\xea\xae\x8e\xea\xae\xab|\\n1234|\\n"
    Ꮒꭼꮎꮫ|
    1234|
  • python wcwidth.wcswidth() measures width 4, while zoc measures width 7.

Tai Dam

Sequence of language Tai Dam from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+AA81     '\uaa81'  Lo        1        TAI VIET LETTER HIGH KO
U+AAAB     '\uaaab'  Lo        1        TAI VIET LETTER HIGH VO
U+AAB1     '\uaab1'  Lo        1        TAI VIET VOWEL AA
U+AAA3     '\uaaa3'  Lo        1        TAI VIET LETTER HIGH MO

Total codepoints: 4

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xea\xaa\x81\xea\xaa\xab\xea\xaa\xb1\xea\xaa\xa3|\\n1234|\\n"
    ꪁꪫꪱꪣ|
    1234|
  • python wcwidth.wcswidth() measures width 4, while zoc measures width 8.

Maldivian

Sequence of language Maldivian from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0791     '\u0791'  Lo        1        THAANA LETTER DAVIYANI
U+07A8     '\u07a8'  Mn        0        THAANA IBIFILI
U+0790     '\u0790'  Lo        1        THAANA LETTER SEENU
U+07AC     '\u07ac'  Mn        0        THAANA EBEFILI
U+0789     '\u0789'  Lo        1        THAANA LETTER MEEMU
U+07B0     '\u07b0'  Mn        0        THAANA SUKUN
U+0784     '\u0784'  Lo        1        THAANA LETTER BAA
U+07A6     '\u07a6'  Mn        0        THAANA ABAFILI
U+0783     '\u0783'  Lo        1        THAANA LETTER RAA

Total codepoints: 9

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xde\x91\xde\xa8\xde\x90\xde\xac\xde\x89\xde\xb0\xde\x84\xde\xa6\xde\x83|\\n12345|\\n"
    ޑިސެމްބަރ|
    12345|
  • python wcwidth.wcswidth() measures width 5, while zoc measures width 9.

Tamil

Sequence of language Tamil from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0BAE     '\u0bae'  Lo        1        TAMIL LETTER MA
U+0BA9     '\u0ba9'  Lo        1        TAMIL LETTER NNNA
U+0BBF     '\u0bbf'  Mc        0        TAMIL VOWEL SIGN I
U+0BA4     '\u0ba4'  Lo        1        TAMIL LETTER TA

Total codepoints: 4

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xae\xae\xe0\xae\xa9\xe0\xae\xbf\xe0\xae\xa4|\\n123|\\n"
    மனித|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Tamil (Sri Lanka)

Sequence of language Tamil (Sri Lanka) from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0BAE     '\u0bae'  Lo        1        TAMIL LETTER MA
U+0BA9     '\u0ba9'  Lo        1        TAMIL LETTER NNNA
U+0BBF     '\u0bbf'  Mc        0        TAMIL VOWEL SIGN I
U+0BA4     '\u0ba4'  Lo        1        TAMIL LETTER TA

Total codepoints: 4

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xae\xae\xe0\xae\xa9\xe0\xae\xbf\xe0\xae\xa4|\\n123|\\n"
    மனித|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Burmese

Sequence of language Burmese from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+1021     '\u1021'  Lo        1        MYANMAR LETTER A
U+1015     '\u1015'  Lo        1        MYANMAR LETTER PA
U+103C     '\u103c'  Mc        0        MYANMAR CONSONANT SIGN MEDIAL RA
U+100A     '\u100a'  Lo        1        MYANMAR LETTER NNYA
U+103A     '\u103a'  Mn        0        MYANMAR SIGN ASAT
U+1015     '\u1015'  Lo        1        MYANMAR LETTER PA
U+103C     '\u103c'  Mc        0        MYANMAR CONSONANT SIGN MEDIAL RA
U+100A     '\u100a'  Lo        1        MYANMAR LETTER NNYA
U+103A     '\u103a'  Mn        0        MYANMAR SIGN ASAT
U+1006     '\u1006'  Lo        1        MYANMAR LETTER CHA
U+102D     '\u102d'  Mn        0        MYANMAR VOWEL SIGN I
U+102F     '\u102f'  Mn        0        MYANMAR VOWEL SIGN U
U+1004     '\u1004'  Lo        1        MYANMAR LETTER NGA
U+103A     '\u103a'  Mn        0        MYANMAR SIGN ASAT
U+101B     '\u101b'  Lo        1        MYANMAR LETTER RA
U+102C     '\u102c'  Mc        0        MYANMAR VOWEL SIGN AA

Total codepoints: 16

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe1\x80\xa1\xe1\x80\x95\xe1\x80\xbc\xe1\x80\x8a\xe1\x80\xba\xe1\x80\x95\xe1\x80\xbc\xe1\x80\x8a\xe1\x80\xba\xe1\x80\x86\xe1\x80\xad\xe1\x80\xaf\xe1\x80\x84\xe1\x80\xba\xe1\x80\x9b\xe1\x80\xac|\\n12345678|\\n"
    အပြည်ပြည်ဆိုင်ရာ|
    12345678|
  • python wcwidth.wcswidth() measures width 8, while zoc measures width 16.

Mon

Sequence of language Mon from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+101C     '\u101c'  Lo        1        MYANMAR LETTER LA
U+102D     '\u102d'  Mn        0        MYANMAR VOWEL SIGN I
U+1000     '\u1000'  Lo        1        MYANMAR LETTER KA
U+103A     '\u103a'  Mn        0        MYANMAR SIGN ASAT
U+101C     '\u101c'  Lo        1        MYANMAR LETTER LA
U+101C     '\u101c'  Lo        1        MYANMAR LETTER LA
U+1031     '\u1031'  Mc        0        MYANMAR VOWEL SIGN E
U+102C     '\u102c'  Mc        0        MYANMAR VOWEL SIGN AA
U+105A     '\u105a'  Lo        1        MYANMAR LETTER MON NGA
U+103A     '\u103a'  Mn        0        MYANMAR SIGN ASAT

Total codepoints: 10

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe1\x80\x9c\xe1\x80\xad\xe1\x80\x80\xe1\x80\xba\xe1\x80\x9c\xe1\x80\x9c\xe1\x80\xb1\xe1\x80\xac\xe1\x81\x9a\xe1\x80\xba|\\n12345|\\n"
    လိက်လလောၚ်|
    12345|
  • python wcwidth.wcswidth() measures width 5, while zoc measures width 10.

Shan

Sequence of language Shan from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+101C     '\u101c'  Lo        1        MYANMAR LETTER LA
U+102D     '\u102d'  Mn        0        MYANMAR VOWEL SIGN I
U+1075     '\u1075'  Lo        1        MYANMAR LETTER SHAN KA
U+103A     '\u103a'  Mn        0        MYANMAR SIGN ASAT
U+1088     '\u1088'  Mc        0        MYANMAR SIGN SHAN TONE-3
U+1015     '\u1015'  Lo        1        MYANMAR LETTER PA
U+102D     '\u102d'  Mn        0        MYANMAR VOWEL SIGN I
U+102F     '\u102f'  Mn        0        MYANMAR VOWEL SIGN U
U+107C     '\u107c'  Lo        1        MYANMAR LETTER SHAN NA
U+103A     '\u103a'  Mn        0        MYANMAR SIGN ASAT
U+107D     '\u107d'  Lo        1        MYANMAR LETTER SHAN PHA
U+1062     '\u1062'  Mc        0        MYANMAR VOWEL SIGN SGAW KAREN EU
U+101D     '\u101d'  Lo        1        MYANMAR LETTER WA
U+103A     '\u103a'  Mn        0        MYANMAR SIGN ASAT
U+1087     '\u1087'  Mc        0        MYANMAR SIGN SHAN TONE-2

Total codepoints: 15

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe1\x80\x9c\xe1\x80\xad\xe1\x81\xb5\xe1\x80\xba\xe1\x82\x88\xe1\x80\x95\xe1\x80\xad\xe1\x80\xaf\xe1\x81\xbc\xe1\x80\xba\xe1\x81\xbd\xe1\x81\xa2\xe1\x80\x9d\xe1\x80\xba\xe1\x82\x87|\\n123456|\\n"
    လိၵ်ႈပိုၼ်ၽၢဝ်ႇ|
    123456|
  • python wcwidth.wcswidth() measures width 6, while zoc measures width 15.

Dzongkha

Sequence of language Dzongkha from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0F60     '\u0f60'  Lo        1        TIBETAN LETTER -A
U+0F42     '\u0f42'  Lo        1        TIBETAN LETTER GA
U+0FB2     '\u0fb2'  Mn        0        TIBETAN SUBJOINED LETTER RA
U+0F7C     '\u0f7c'  Mn        0        TIBETAN VOWEL SIGN O
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F56     '\u0f56'  Lo        1        TIBETAN LETTER BA
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F58     '\u0f58'  Lo        1        TIBETAN LETTER MA
U+0F72     '\u0f72'  Mn        0        TIBETAN VOWEL SIGN I
U+0F60     '\u0f60'  Lo        1        TIBETAN LETTER -A
U+0F72     '\u0f72'  Mn        0        TIBETAN VOWEL SIGN I
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F51     '\u0f51'  Lo        1        TIBETAN LETTER DA
U+0F56     '\u0f56'  Lo        1        TIBETAN LETTER BA
U+0F44     '\u0f44'  Lo        1        TIBETAN LETTER NGA
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F46     '\u0f46'  Lo        1        TIBETAN LETTER CHA
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F42     '\u0f42'  Lo        1        TIBETAN LETTER GA
U+0F72     '\u0f72'  Mn        0        TIBETAN VOWEL SIGN I
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F60     '\u0f60'  Lo        1        TIBETAN LETTER -A
U+0F5B     '\u0f5b'  Lo        1        TIBETAN LETTER DZA
U+0F58     '\u0f58'  Lo        1        TIBETAN LETTER MA
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F42     '\u0f42'  Lo        1        TIBETAN LETTER GA
U+0FB3     '\u0fb3'  Mn        0        TIBETAN SUBJOINED LETTER LA
U+0F72     '\u0f72'  Mn        0        TIBETAN VOWEL SIGN I
U+0F44     '\u0f44'  Lo        1        TIBETAN LETTER NGA
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F42     '\u0f42'  Lo        1        TIBETAN LETTER GA
U+0F66     '\u0f66'  Lo        1        TIBETAN LETTER SA
U+0F63     '\u0f63'  Lo        1        TIBETAN LETTER LA
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F56     '\u0f56'  Lo        1        TIBETAN LETTER BA
U+0F66     '\u0f66'  Lo        1        TIBETAN LETTER SA
U+0F92     '\u0f92'  Mn        0        TIBETAN SUBJOINED LETTER GA
U+0FB2     '\u0fb2'  Mn        0        TIBETAN SUBJOINED LETTER RA
U+0F42     '\u0f42'  Lo        1        TIBETAN LETTER GA
U+0F66     '\u0f66'  Lo        1        TIBETAN LETTER SA
U+0F0D     '\u0f0d'  Po        1        TIBETAN MARK SHAD

Total codepoints: 41

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xbd\xa0\xe0\xbd\x82\xe0\xbe\xb2\xe0\xbd\xbc\xe0\xbc\x8b\xe0\xbd\x96\xe0\xbc\x8b\xe0\xbd\x98\xe0\xbd\xb2\xe0\xbd\xa0\xe0\xbd\xb2\xe0\xbc\x8b\xe0\xbd\x91\xe0\xbd\x96\xe0\xbd\x84\xe0\xbc\x8b\xe0\xbd\x86\xe0\xbc\x8b\xe0\xbd\x82\xe0\xbd\xb2\xe0\xbc\x8b\xe0\xbd\xa0\xe0\xbd\x9b\xe0\xbd\x98\xe0\xbc\x8b\xe0\xbd\x82\xe0\xbe\xb3\xe0\xbd\xb2\xe0\xbd\x84\xe0\xbc\x8b\xe0\xbd\x82\xe0\xbd\xa6\xe0\xbd\xa3\xe0\xbc\x8b\xe0\xbd\x96\xe0\xbd\xa6\xe0\xbe\x92\xe0\xbe\xb2\xe0\xbd\x82\xe0\xbd\xa6\xe0\xbc\x8d|\\n12345678901234567890123456789012|\\n"
    འགྲོ་བ་མིའི་དབང་ཆ་གི་འཛམ་གླིང་གསལ་བསྒྲགས།|
    12345678901234567890123456789012|
  • python wcwidth.wcswidth() measures width 32, while zoc measures width 41.

Gujarati

Sequence of language Gujarati from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0AAE     '\u0aae'  Lo        1        GUJARATI LETTER MA
U+0ABE     '\u0abe'  Mc        0        GUJARATI VOWEL SIGN AA
U+0AA8     '\u0aa8'  Lo        1        GUJARATI LETTER NA
U+0AB5     '\u0ab5'  Lo        1        GUJARATI LETTER VA

Total codepoints: 4

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xaa\xae\xe0\xaa\xbe\xe0\xaa\xa8\xe0\xaa\xb5|\\n123|\\n"
    માનવ|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Tibetan, Central

Sequence of language Tibetan, Central from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0F61     '\u0f61'  Lo        1        TIBETAN LETTER YA
U+0F7C     '\u0f7c'  Mn        0        TIBETAN VOWEL SIGN O
U+0F44     '\u0f44'  Lo        1        TIBETAN LETTER NGA
U+0F66     '\u0f66'  Lo        1        TIBETAN LETTER SA
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F41     '\u0f41'  Lo        1        TIBETAN LETTER KHA
U+0FB1     '\u0fb1'  Mn        0        TIBETAN SUBJOINED LETTER YA
U+0F56     '\u0f56'  Lo        1        TIBETAN LETTER BA
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F42     '\u0f42'  Lo        1        TIBETAN LETTER GA
U+0F66     '\u0f66'  Lo        1        TIBETAN LETTER SA
U+0F63     '\u0f63'  Lo        1        TIBETAN LETTER LA
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F56     '\u0f56'  Lo        1        TIBETAN LETTER BA
U+0F66     '\u0f66'  Lo        1        TIBETAN LETTER SA
U+0F92     '\u0f92'  Mn        0        TIBETAN SUBJOINED LETTER GA
U+0FB2     '\u0fb2'  Mn        0        TIBETAN SUBJOINED LETTER RA
U+0F42     '\u0f42'  Lo        1        TIBETAN LETTER GA
U+0F66     '\u0f66'  Lo        1        TIBETAN LETTER SA
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F60     '\u0f60'  Lo        1        TIBETAN LETTER -A
U+0F42     '\u0f42'  Lo        1        TIBETAN LETTER GA
U+0FB2     '\u0fb2'  Mn        0        TIBETAN SUBJOINED LETTER RA
U+0F7C     '\u0f7c'  Mn        0        TIBETAN VOWEL SIGN O
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F56     '\u0f56'  Lo        1        TIBETAN LETTER BA
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F58     '\u0f58'  Lo        1        TIBETAN LETTER MA
U+0F72     '\u0f72'  Mn        0        TIBETAN VOWEL SIGN I
U+0F60     '\u0f60'  Lo        1        TIBETAN LETTER -A
U+0F72     '\u0f72'  Mn        0        TIBETAN VOWEL SIGN I
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F50     '\u0f50'  Lo        1        TIBETAN LETTER THA
U+0F7C     '\u0f7c'  Mn        0        TIBETAN VOWEL SIGN O
U+0F56     '\u0f56'  Lo        1        TIBETAN LETTER BA
U+0F0B     '\u0f0b'  Po        1        TIBETAN MARK INTERSYLLABIC TSHEG
U+0F50     '\u0f50'  Lo        1        TIBETAN LETTER THA
U+0F44     '\u0f44'  Lo        1        TIBETAN LETTER NGA
U+0F0C     '\u0f0c'  Po        1        TIBETAN MARK DELIMITER TSHEG BSTAR
U+0F0D     '\u0f0d'  Po        1        TIBETAN MARK SHAD

Total codepoints: 40

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xbd\xa1\xe0\xbd\xbc\xe0\xbd\x84\xe0\xbd\xa6\xe0\xbc\x8b\xe0\xbd\x81\xe0\xbe\xb1\xe0\xbd\x96\xe0\xbc\x8b\xe0\xbd\x82\xe0\xbd\xa6\xe0\xbd\xa3\xe0\xbc\x8b\xe0\xbd\x96\xe0\xbd\xa6\xe0\xbe\x92\xe0\xbe\xb2\xe0\xbd\x82\xe0\xbd\xa6\xe0\xbc\x8b\xe0\xbd\xa0\xe0\xbd\x82\xe0\xbe\xb2\xe0\xbd\xbc\xe0\xbc\x8b\xe0\xbd\x96\xe0\xbc\x8b\xe0\xbd\x98\xe0\xbd\xb2\xe0\xbd\xa0\xe0\xbd\xb2\xe0\xbc\x8b\xe0\xbd\x90\xe0\xbd\xbc\xe0\xbd\x96\xe0\xbc\x8b\xe0\xbd\x90\xe0\xbd\x84\xe0\xbc\x8c\xe0\xbc\x8d|\\n1234567890123456789012345678901|\\n"
    ཡོངས་ཁྱབ་གསལ་བསྒྲགས་འགྲོ་བ་མིའི་ཐོབ་ཐང༌།|
    1234567890123456789012345678901|
  • python wcwidth.wcswidth() measures width 31, while zoc measures width 40.

Malayalam

Sequence of language Malayalam from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0D2E     '\u0d2e'  Lo        1        MALAYALAM LETTER MA
U+0D28     '\u0d28'  Lo        1        MALAYALAM LETTER NA
U+0D41     '\u0d41'  Mn        0        MALAYALAM VOWEL SIGN U
U+0D37     '\u0d37'  Lo        1        MALAYALAM LETTER SSA
U+0D4D     '\u0d4d'  Mn        0        MALAYALAM SIGN VIRAMA
U+0D2F     '\u0d2f'  Lo        1        MALAYALAM LETTER YA
U+0D3E     '\u0d3e'  Mc        0        MALAYALAM VOWEL SIGN AA
U+0D35     '\u0d35'  Lo        1        MALAYALAM LETTER VA
U+0D15     '\u0d15'  Lo        1        MALAYALAM LETTER KA
U+0D3E     '\u0d3e'  Mc        0        MALAYALAM VOWEL SIGN AA
U+0D36     '\u0d36'  Lo        1        MALAYALAM LETTER SHA
U+0D19     '\u0d19'  Lo        1        MALAYALAM LETTER NGA
U+0D4D     '\u0d4d'  Mn        0        MALAYALAM SIGN VIRAMA
U+0D19     '\u0d19'  Lo        1        MALAYALAM LETTER NGA
U+0D33     '\u0d33'  Lo        1        MALAYALAM LETTER LLA
U+0D46     '\u0d46'  Mc        0        MALAYALAM VOWEL SIGN E
U+0D15     '\u0d15'  Lo        1        MALAYALAM LETTER KA
U+0D4D     '\u0d4d'  Mn        0        MALAYALAM SIGN VIRAMA
U+0D15     '\u0d15'  Lo        1        MALAYALAM LETTER KA
U+0D41     '\u0d41'  Mn        0        MALAYALAM VOWEL SIGN U
U+0D31     '\u0d31'  Lo        1        MALAYALAM LETTER RRA
U+0D3F     '\u0d3f'  Mc        0        MALAYALAM VOWEL SIGN I
U+0D15     '\u0d15'  Lo        1        MALAYALAM LETTER KA
U+0D4D     '\u0d4d'  Mn        0        MALAYALAM SIGN VIRAMA
U+0D15     '\u0d15'  Lo        1        MALAYALAM LETTER KA
U+0D41     '\u0d41'  Mn        0        MALAYALAM VOWEL SIGN U
U+0D28     '\u0d28'  Lo        1        MALAYALAM LETTER NA
U+0D4D     '\u0d4d'  Mn        0        MALAYALAM SIGN VIRAMA
U+0D28     '\u0d28'  Lo        1        MALAYALAM LETTER NA

Total codepoints: 29

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xb4\xae\xe0\xb4\xa8\xe0\xb5\x81\xe0\xb4\xb7\xe0\xb5\x8d\xe0\xb4\xaf\xe0\xb4\xbe\xe0\xb4\xb5\xe0\xb4\x95\xe0\xb4\xbe\xe0\xb4\xb6\xe0\xb4\x99\xe0\xb5\x8d\xe0\xb4\x99\xe0\xb4\xb3\xe0\xb5\x86\xe0\xb4\x95\xe0\xb5\x8d\xe0\xb4\x95\xe0\xb5\x81\xe0\xb4\xb1\xe0\xb4\xbf\xe0\xb4\x95\xe0\xb5\x8d\xe0\xb4\x95\xe0\xb5\x81\xe0\xb4\xa8\xe0\xb5\x8d\xe0\xb4\xa8|\\n12345678901234567|\\n"
    മനുഷ്യാവകാശങ്ങളെക്കുറിക്കുന്ന|
    12345678901234567|
  • python wcwidth.wcswidth() measures width 17, while zoc measures width 29.

Tamang, Eastern

Sequence of language Tamang, Eastern from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+092E     '\u092e'  Lo        1        DEVANAGARI LETTER MA
U+094D     '\u094d'  Mn        0        DEVANAGARI SIGN VIRAMA
U+0939     '\u0939'  Lo        1        DEVANAGARI LETTER HA
U+0940     '\u0940'  Mc        0        DEVANAGARI VOWEL SIGN II
U+0938     '\u0938'  Lo        1        DEVANAGARI LETTER SA
U+0947     '\u0947'  Mn        0        DEVANAGARI VOWEL SIGN E

Total codepoints: 6

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xa4\xae\xe0\xa5\x8d\xe0\xa4\xb9\xe0\xa5\x80\xe0\xa4\xb8\xe0\xa5\x87|\\n123|\\n"
    म्हीसे|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 6.

Kannada

Sequence of language Kannada from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0CAE     '\u0cae'  Lo        1        KANNADA LETTER MA
U+0CBE     '\u0cbe'  Mc        0        KANNADA VOWEL SIGN AA
U+0CA8     '\u0ca8'  Lo        1        KANNADA LETTER NA
U+0CB5     '\u0cb5'  Lo        1        KANNADA LETTER VA

Total codepoints: 4

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xb2\xae\xe0\xb2\xbe\xe0\xb2\xa8\xe0\xb2\xb5|\\n123|\\n"
    ಮಾನವ|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Khün

Sequence of language Khün from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+1A20     '\u1a20'  Lo        1        TAI THAM LETTER HIGH KA
U+1A32     '\u1a32'  Lo        1        TAI THAM LETTER HIGH TA
U+1A65     '\u1a65'  Mn        0        TAI THAM VOWEL SIGN I
U+1A20     '\u1a20'  Lo        1        TAI THAM LETTER HIGH KA
U+1A63     '\u1a63'  Mc        0        TAI THAM VOWEL SIGN AA
U+1A45     '\u1a45'  Lo        1        TAI THAM LETTER WA
U+1A64     '\u1a64'  Mc        0        TAI THAM VOWEL SIGN TALL AA
U+1A75     '\u1a75'  Mn        0        TAI THAM SIGN TONE-1
U+1A2F     '\u1a2f'  Lo        1        TAI THAM LETTER DA
U+1A60     '\u1a60'  Mn        0        TAI THAM SIGN SAKOT
U+1A45     '\u1a45'  Lo        1        TAI THAM LETTER WA
U+1A60     '\u1a60'  Mn        0        TAI THAM SIGN SAKOT
U+1A3F     '\u1a3f'  Lo        1        TAI THAM LETTER LOW YA
U+1A62     '\u1a62'  Mn        0        TAI THAM VOWEL SIGN MAI SAT
U+1A3E     '\u1a3e'  Lo        1        TAI THAM LETTER MA
U+1A36     '\u1a36'  Lo        1        TAI THAM LETTER NA
U+1A69     '\u1a69'  Mn        0        TAI THAM VOWEL SIGN U
U+1A54     '\u1a54'  Lo        1        TAI THAM LETTER GREAT SA
U+1A29     '\u1a29'  Lo        1        TAI THAM LETTER LOW CA
U+1A63     '\u1a63'  Mc        0        TAI THAM VOWEL SIGN AA
U+1A60     '\u1a60'  Mn        0        TAI THAM SIGN SAKOT
U+1A32     '\u1a32'  Lo        1        TAI THAM LETTER HIGH TA

Total codepoints: 22

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe1\xa8\xa0\xe1\xa8\xb2\xe1\xa9\xa5\xe1\xa8\xa0\xe1\xa9\xa3\xe1\xa9\x85\xe1\xa9\xa4\xe1\xa9\xb5\xe1\xa8\xaf\xe1\xa9\xa0\xe1\xa9\x85\xe1\xa9\xa0\xe1\xa8\xbf\xe1\xa9\xa2\xe1\xa8\xbe\xe1\xa8\xb6\xe1\xa9\xa9\xe1\xa9\x94\xe1\xa8\xa9\xe1\xa9\xa3\xe1\xa9\xa0\xe1\xa8\xb2|\\n123456789012|\\n"
    ᨠᨲᩥᨠᩣᩅᩤ᩵ᨯ᩠ᩅ᩠ᨿᩢᨾᨶᩩᩔᨩᩣ᩠ᨲ|
    123456789012|
  • python wcwidth.wcswidth() measures width 12, while zoc measures width 22.

Khmer, Central

Sequence of language Khmer, Central from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+179F     '\u179f'  Lo        1        KHMER LETTER SA
U+17C1     '\u17c1'  Mc        0        KHMER VOWEL SIGN E
U+1785     '\u1785'  Lo        1        KHMER LETTER CA
U+1780     '\u1780'  Lo        1        KHMER LETTER KA
U+17D2     '\u17d2'  Mn        0        KHMER SIGN COENG
U+178A     '\u178a'  Lo        1        KHMER LETTER DA
U+17B8     '\u17b8'  Mn        0        KHMER VOWEL SIGN II
U+1794     '\u1794'  Lo        1        KHMER LETTER BA
U+17D2     '\u17d2'  Mn        0        KHMER SIGN COENG
U+179A     '\u179a'  Lo        1        KHMER LETTER RO
U+1780     '\u1780'  Lo        1        KHMER LETTER KA
U+17B6     '\u17b6'  Mc        0        KHMER VOWEL SIGN AA
U+179F     '\u179f'  Lo        1        KHMER LETTER SA
U+1787     '\u1787'  Lo        1        KHMER LETTER CO
U+17B6     '\u17b6'  Mc        0        KHMER VOWEL SIGN AA
U+179F     '\u179f'  Lo        1        KHMER LETTER SA
U+1780     '\u1780'  Lo        1        KHMER LETTER KA
U+179B     '\u179b'  Lo        1        KHMER LETTER LO
U+179F     '\u179f'  Lo        1        KHMER LETTER SA
U+17D2     '\u17d2'  Mn        0        KHMER SIGN COENG
U+178A     '\u178a'  Lo        1        KHMER LETTER DA
U+17B8     '\u17b8'  Mn        0        KHMER VOWEL SIGN II
U+1796     '\u1796'  Lo        1        KHMER LETTER PO
U+17B8     '\u17b8'  Mn        0        KHMER VOWEL SIGN II
U+179F     '\u179f'  Lo        1        KHMER LETTER SA
U+17B7     '\u17b7'  Mn        0        KHMER VOWEL SIGN I
U+1791     '\u1791'  Lo        1        KHMER LETTER TO
U+17D2     '\u17d2'  Mn        0        KHMER SIGN COENG
U+1792     '\u1792'  Lo        1        KHMER LETTER THO
U+17B7     '\u17b7'  Mn        0        KHMER VOWEL SIGN I
U+1798     '\u1798'  Lo        1        KHMER LETTER MO
U+1793     '\u1793'  Lo        1        KHMER LETTER NO
U+17BB     '\u17bb'  Mn        0        KHMER VOWEL SIGN U
U+179F     '\u179f'  Lo        1        KHMER LETTER SA
U+17D2     '\u17d2'  Mn        0        KHMER SIGN COENG
U+179F     '\u179f'  Lo        1        KHMER LETTER SA

Total codepoints: 36

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe1\x9e\x9f\xe1\x9f\x81\xe1\x9e\x85\xe1\x9e\x80\xe1\x9f\x92\xe1\x9e\x8a\xe1\x9e\xb8\xe1\x9e\x94\xe1\x9f\x92\xe1\x9e\x9a\xe1\x9e\x80\xe1\x9e\xb6\xe1\x9e\x9f\xe1\x9e\x87\xe1\x9e\xb6\xe1\x9e\x9f\xe1\x9e\x80\xe1\x9e\x9b\xe1\x9e\x9f\xe1\x9f\x92\xe1\x9e\x8a\xe1\x9e\xb8\xe1\x9e\x96\xe1\x9e\xb8\xe1\x9e\x9f\xe1\x9e\xb7\xe1\x9e\x91\xe1\x9f\x92\xe1\x9e\x92\xe1\x9e\xb7\xe1\x9e\x98\xe1\x9e\x93\xe1\x9e\xbb\xe1\x9e\x9f\xe1\x9f\x92\xe1\x9e\x9f|\\n1234567890123456789012|\\n"
    សេចក្ដីប្រកាសជាសកលស្ដីពីសិទ្ធិមនុស្ស|
    1234567890123456789012|
  • python wcwidth.wcswidth() measures width 22, while zoc measures width 36.

Bengali

Sequence of language Bengali from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+09AE     '\u09ae'  Lo        1        BENGALI LETTER MA
U+09BE     '\u09be'  Mc        0        BENGALI VOWEL SIGN AA
U+09A8     '\u09a8'  Lo        1        BENGALI LETTER NA
U+09AC     '\u09ac'  Lo        1        BENGALI LETTER BA
U+09BE     '\u09be'  Mc        0        BENGALI VOWEL SIGN AA
U+09A7     '\u09a7'  Lo        1        BENGALI LETTER DHA
U+09BF     '\u09bf'  Mc        0        BENGALI VOWEL SIGN I
U+0995     '\u0995'  Lo        1        BENGALI LETTER KA
U+09BE     '\u09be'  Mc        0        BENGALI VOWEL SIGN AA
U+09B0     '\u09b0'  Lo        1        BENGALI LETTER RA
U+09C7     '\u09c7'  Mc        0        BENGALI VOWEL SIGN E
U+09B0     '\u09b0'  Lo        1        BENGALI LETTER RA

Total codepoints: 12

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xa6\xae\xe0\xa6\xbe\xe0\xa6\xa8\xe0\xa6\xac\xe0\xa6\xbe\xe0\xa6\xa7\xe0\xa6\xbf\xe0\xa6\x95\xe0\xa6\xbe\xe0\xa6\xb0\xe0\xa7\x87\xe0\xa6\xb0|\\n1234567|\\n"
    মানবাধিকারের|
    1234567|
  • python wcwidth.wcswidth() measures width 7, while zoc measures width 12.

Chakma

Sequence of language Chakma from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0001111F  '\U0001111f'  Lo        1        CHAKMA LETTER MAA
U+0001111A  '\U0001111a'  Lo        1        CHAKMA LETTER NAA
U+0001112C  '\U0001112c'  Mc        0        CHAKMA VOWEL SIGN E
U+0001112D  '\U0001112d'  Mn        0        CHAKMA VOWEL SIGN AI
U+00011103  '\U00011103'  Lo        1        CHAKMA LETTER AA
U+00011107  '\U00011107'  Lo        1        CHAKMA LETTER KAA
U+00011134  '\U00011134'  Mn        0        CHAKMA MAAYYAA
U+00011107  '\U00011107'  Lo        1        CHAKMA LETTER KAA
U+00011125  '\U00011125'  Lo        1        CHAKMA LETTER SAA
U+00011127  '\U00011127'  Mn        0        CHAKMA VOWEL SIGN A
U+00011101  '\U00011101'  Mn        0        CHAKMA SIGN ANUSVARA
U+00011122  '\U00011122'  Lo        1        CHAKMA LETTER RAA
U+00011134  '\U00011134'  Mn        0        CHAKMA MAAYYAA

Total codepoints: 13

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xf0\x91\x84\x9f\xf0\x91\x84\x9a\xf0\x91\x84\xac\xf0\x91\x84\xad\xf0\x91\x84\x83\xf0\x91\x84\x87\xf0\x91\x84\xb4\xf0\x91\x84\x87\xf0\x91\x84\xa5\xf0\x91\x84\xa7\xf0\x91\x84\x81\xf0\x91\x84\xa2\xf0\x91\x84\xb4|\\n1234567|\\n"
    𑄟𑄚𑄬𑄭𑄃𑄇𑄴𑄇𑄥𑄧𑄁𑄢𑄴|
    1234567|
  • python wcwidth.wcswidth() measures width 7, while zoc measures width 13.

Telugu

Sequence of language Telugu from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0C2E     '\u0c2e'  Lo        1        TELUGU LETTER MA
U+0C3E     '\u0c3e'  Mn        0        TELUGU VOWEL SIGN AA
U+0C28     '\u0c28'  Lo        1        TELUGU LETTER NA
U+0C35     '\u0c35'  Lo        1        TELUGU LETTER VA
U+0C38     '\u0c38'  Lo        1        TELUGU LETTER SA
U+0C4D     '\u0c4d'  Mn        0        TELUGU SIGN VIRAMA
U+0C35     '\u0c35'  Lo        1        TELUGU LETTER VA
U+0C24     '\u0c24'  Lo        1        TELUGU LETTER TA
U+0C4D     '\u0c4d'  Mn        0        TELUGU SIGN VIRAMA
U+0C35     '\u0c35'  Lo        1        TELUGU LETTER VA
U+0C2E     '\u0c2e'  Lo        1        TELUGU LETTER MA
U+0C41     '\u0c41'  Mc        0        TELUGU VOWEL SIGN U
U+0C32     '\u0c32'  Lo        1        TELUGU LETTER LA

Total codepoints: 13

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xb0\xae\xe0\xb0\xbe\xe0\xb0\xa8\xe0\xb0\xb5\xe0\xb0\xb8\xe0\xb1\x8d\xe0\xb0\xb5\xe0\xb0\xa4\xe0\xb1\x8d\xe0\xb0\xb5\xe0\xb0\xae\xe0\xb1\x81\xe0\xb0\xb2|\\n123456789|\\n"
    మానవస్వత్వముల|
    123456789|
  • python wcwidth.wcswidth() measures width 9, while zoc measures width 13.

Nepali

Sequence of language Nepali from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+092E     '\u092e'  Lo        1        DEVANAGARI LETTER MA
U+093E     '\u093e'  Mc        0        DEVANAGARI VOWEL SIGN AA
U+0928     '\u0928'  Lo        1        DEVANAGARI LETTER NA
U+0935     '\u0935'  Lo        1        DEVANAGARI LETTER VA

Total codepoints: 4

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xa4\xae\xe0\xa4\xbe\xe0\xa4\xa8\xe0\xa4\xb5|\\n123|\\n"
    मानव|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Sanskrit

Sequence of language Sanskrit from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+092E     '\u092e'  Lo        1        DEVANAGARI LETTER MA
U+093E     '\u093e'  Mc        0        DEVANAGARI VOWEL SIGN AA
U+0928     '\u0928'  Lo        1        DEVANAGARI LETTER NA
U+0935     '\u0935'  Lo        1        DEVANAGARI LETTER VA
U+093E     '\u093e'  Mc        0        DEVANAGARI VOWEL SIGN AA
U+0927     '\u0927'  Lo        1        DEVANAGARI LETTER DHA
U+093F     '\u093f'  Mc        0        DEVANAGARI VOWEL SIGN I
U+0915     '\u0915'  Lo        1        DEVANAGARI LETTER KA
U+093E     '\u093e'  Mc        0        DEVANAGARI VOWEL SIGN AA
U+0930     '\u0930'  Lo        1        DEVANAGARI LETTER RA
U+093E     '\u093e'  Mc        0        DEVANAGARI VOWEL SIGN AA
U+0923     '\u0923'  Lo        1        DEVANAGARI LETTER NNA
U+093E     '\u093e'  Mc        0        DEVANAGARI VOWEL SIGN AA
U+0902     '\u0902'  Mn        0        DEVANAGARI SIGN ANUSVARA

Total codepoints: 14

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xa4\xae\xe0\xa4\xbe\xe0\xa4\xa8\xe0\xa4\xb5\xe0\xa4\xbe\xe0\xa4\xa7\xe0\xa4\xbf\xe0\xa4\x95\xe0\xa4\xbe\xe0\xa4\xb0\xe0\xa4\xbe\xe0\xa4\xa3\xe0\xa4\xbe\xe0\xa4\x82|\\n1234567|\\n"
    मानवाधिकाराणां|
    1234567|
  • python wcwidth.wcswidth() measures width 7, while zoc measures width 14.

Sanskrit (Grantha)

Sequence of language Sanskrit (Grantha) from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0001132E  '\U0001132e'  Lo        1        GRANTHA LETTER MA
U+0001133E  '\U0001133e'  Mc        0        GRANTHA VOWEL SIGN AA
U+00011328  '\U00011328'  Lo        1        GRANTHA LETTER NA
U+00011335  '\U00011335'  Lo        1        GRANTHA LETTER VA
U+0001133E  '\U0001133e'  Mc        0        GRANTHA VOWEL SIGN AA
U+00011327  '\U00011327'  Lo        1        GRANTHA LETTER DHA
U+0001133F  '\U0001133f'  Mc        0        GRANTHA VOWEL SIGN I
U+00011315  '\U00011315'  Lo        1        GRANTHA LETTER KA
U+0001133E  '\U0001133e'  Mc        0        GRANTHA VOWEL SIGN AA
U+00011330  '\U00011330'  Lo        1        GRANTHA LETTER RA
U+0001133E  '\U0001133e'  Mc        0        GRANTHA VOWEL SIGN AA
U+00011323  '\U00011323'  Lo        1        GRANTHA LETTER NNA
U+0001133E  '\U0001133e'  Mc        0        GRANTHA VOWEL SIGN AA
U+00011302  '\U00011302'  Mc        0        GRANTHA SIGN ANUSVARA

Total codepoints: 14

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xf0\x91\x8c\xae\xf0\x91\x8c\xbe\xf0\x91\x8c\xa8\xf0\x91\x8c\xb5\xf0\x91\x8c\xbe\xf0\x91\x8c\xa7\xf0\x91\x8c\xbf\xf0\x91\x8c\x95\xf0\x91\x8c\xbe\xf0\x91\x8c\xb0\xf0\x91\x8c\xbe\xf0\x91\x8c\xa3\xf0\x91\x8c\xbe\xf0\x91\x8c\x82|\\n1234567|\\n"
    𑌮𑌾𑌨𑌵𑌾𑌧𑌿𑌕𑌾𑌰𑌾𑌣𑌾𑌂|
    1234567|
  • python wcwidth.wcswidth() measures width 7, while zoc measures width 14.

Marathi

Sequence of language Marathi from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+092E      '\u092e'      Lo        1        DEVANAGARI LETTER MA
U+093E      '\u093e'      Mc        0        DEVANAGARI VOWEL SIGN AA
U+0928      '\u0928'      Lo        1        DEVANAGARI LETTER NA
U+0935      '\u0935'      Lo        1        DEVANAGARI LETTER VA
U+0940      '\u0940'      Mc        0        DEVANAGARI VOWEL SIGN II

Total codepoints: 5

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xa4\xae\xe0\xa4\xbe\xe0\xa4\xa8\xe0\xa4\xb5\xe0\xa5\x80|\\n123|\\n"
    मानवी|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 5.

Hindi

Sequence of language Hindi from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+092E      '\u092e'      Lo        1        DEVANAGARI LETTER MA
U+093E      '\u093e'      Mc        0        DEVANAGARI VOWEL SIGN AA
U+0928      '\u0928'      Lo        1        DEVANAGARI LETTER NA
U+0935      '\u0935'      Lo        1        DEVANAGARI LETTER VA

Total codepoints: 4

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xa4\xae\xe0\xa4\xbe\xe0\xa4\xa8\xe0\xa4\xb5|\\n123|\\n"
    मानव|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Sinhala

Sequence of language Sinhala from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0DB8      '\u0db8'      Lo        1        SINHALA LETTER MAYANNA
U+0DCF      '\u0dcf'      Mc        0        SINHALA VOWEL SIGN AELA-PILLA
U+0DB1      '\u0db1'      Lo        1        SINHALA LETTER DANTAJA NAYANNA
U+0DC0      '\u0dc0'      Lo        1        SINHALA LETTER VAYANNA

Total codepoints: 4

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xb6\xb8\xe0\xb7\x8f\xe0\xb6\xb1\xe0\xb7\x80|\\n123|\\n"
    මානව|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Panjabi, Eastern

Sequence of language Panjabi, Eastern from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0A2E      '\u0a2e'      Lo        1        GURMUKHI LETTER MA
U+0A28      '\u0a28'      Lo        1        GURMUKHI LETTER NA
U+0A41      '\u0a41'      Mn        0        GURMUKHI VOWEL SIGN U
U+0A71      '\u0a71'      Mn        0        GURMUKHI ADDAK
U+0A16      '\u0a16'      Lo        1        GURMUKHI LETTER KHA
U+0A40      '\u0a40'      Mc        0        GURMUKHI VOWEL SIGN II

Total codepoints: 6

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xa8\xae\xe0\xa8\xa8\xe0\xa9\x81\xe0\xa9\xb1\xe0\xa8\x96\xe0\xa9\x80|\\n123|\\n"
    ਮਨੁੱਖੀ|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 6.

Bhojpuri

Sequence of language Bhojpuri from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+092E      '\u092e'      Lo        1        DEVANAGARI LETTER MA
U+093E      '\u093e'      Mc        0        DEVANAGARI VOWEL SIGN AA
U+0928      '\u0928'      Lo        1        DEVANAGARI LETTER NA
U+0935      '\u0935'      Lo        1        DEVANAGARI LETTER VA
U+093E      '\u093e'      Mc        0        DEVANAGARI VOWEL SIGN AA
U+0927      '\u0927'      Lo        1        DEVANAGARI LETTER DHA
U+093F      '\u093f'      Mc        0        DEVANAGARI VOWEL SIGN I
U+0915      '\u0915'      Lo        1        DEVANAGARI LETTER KA
U+093E      '\u093e'      Mc        0        DEVANAGARI VOWEL SIGN AA
U+0930      '\u0930'      Lo        1        DEVANAGARI LETTER RA

Total codepoints: 10

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xa4\xae\xe0\xa4\xbe\xe0\xa4\xa8\xe0\xa4\xb5\xe0\xa4\xbe\xe0\xa4\xa7\xe0\xa4\xbf\xe0\xa4\x95\xe0\xa4\xbe\xe0\xa4\xb0|\\n123456|\\n"
    मानवाधिकार|
    123456|
  • python wcwidth.wcswidth() measures width 6, while zoc measures width 10.

Thai (2)

Sequence of language Thai (2) from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0E1B      '\u0e1b'      Lo        1        THAI CHARACTER PO PLA
U+0E0F      '\u0e0f'      Lo        1        THAI CHARACTER TO PATAK
U+0E34      '\u0e34'      Mn        0        THAI CHARACTER SARA I
U+0E0D      '\u0e0d'      Lo        1        THAI CHARACTER YO YING
U+0E0D      '\u0e0d'      Lo        1        THAI CHARACTER YO YING
U+0E32      '\u0e32'      Lo        1        THAI CHARACTER SARA AA
U+0E2A      '\u0e2a'      Lo        1        THAI CHARACTER SO SUA
U+0E32      '\u0e32'      Lo        1        THAI CHARACTER SARA AA
U+0E01      '\u0e01'      Lo        1        THAI CHARACTER KO KAI
U+0E25      '\u0e25'      Lo        1        THAI CHARACTER LO LING
U+0E27      '\u0e27'      Lo        1        THAI CHARACTER WO WAEN
U+0E48      '\u0e48'      Mn        0        THAI CHARACTER MAI EK
U+0E32      '\u0e32'      Lo        1        THAI CHARACTER SARA AA
U+0E14      '\u0e14'      Lo        1        THAI CHARACTER DO DEK
U+0E49      '\u0e49'      Mn        0        THAI CHARACTER MAI THO
U+0E27      '\u0e27'      Lo        1        THAI CHARACTER WO WAEN
U+0E22      '\u0e22'      Lo        1        THAI CHARACTER YO YAK
U+0E2A      '\u0e2a'      Lo        1        THAI CHARACTER SO SUA
U+0E34      '\u0e34'      Mn        0        THAI CHARACTER SARA I
U+0E17      '\u0e17'      Lo        1        THAI CHARACTER THO THAHAN
U+0E18      '\u0e18'      Lo        1        THAI CHARACTER THO THONG
U+0E34      '\u0e34'      Mn        0        THAI CHARACTER SARA I
U+0E21      '\u0e21'      Lo        1        THAI CHARACTER MO MA
U+0E19      '\u0e19'      Lo        1        THAI CHARACTER NO NU
U+0E38      '\u0e38'      Mn        0        THAI CHARACTER SARA U
U+0E29      '\u0e29'      Lo        1        THAI CHARACTER SO RUSI
U+0E22      '\u0e22'      Lo        1        THAI CHARACTER YO YAK
U+0E0A      '\u0e0a'      Lo        1        THAI CHARACTER CHO CHANG
U+0E19      '\u0e19'      Lo        1        THAI CHARACTER NO NU

Total codepoints: 29

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xb8\x9b\xe0\xb8\x8f\xe0\xb8\xb4\xe0\xb8\x8d\xe0\xb8\x8d\xe0\xb8\xb2\xe0\xb8\xaa\xe0\xb8\xb2\xe0\xb8\x81\xe0\xb8\xa5\xe0\xb8\xa7\xe0\xb9\x88\xe0\xb8\xb2\xe0\xb8\x94\xe0\xb9\x89\xe0\xb8\xa7\xe0\xb8\xa2\xe0\xb8\xaa\xe0\xb8\xb4\xe0\xb8\x97\xe0\xb8\x98\xe0\xb8\xb4\xe0\xb8\xa1\xe0\xb8\x99\xe0\xb8\xb8\xe0\xb8\xa9\xe0\xb8\xa2\xe0\xb8\x8a\xe0\xb8\x99|\\n12345678901234567890123|\\n"
    ปฏิญญาสากลว่าด้วยสิทธิมนุษยชน|
    12345678901234567890123|
  • python wcwidth.wcswidth() measures width 23, while zoc measures width 29.
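
Each shell test in this report has the same shape: the sequence as escaped UTF-8 bytes, a '|', and then a ruler of digits of the expected width followed by another '|'. The sketch below shows one way such a command line could be assembled. The helpers ruler and make_printf_test are hypothetical names for this illustration and are not taken from ucs-detect; the expected width is passed in rather than computed.

```python
def ruler(width):
    """Digits 1234567890 repeated and cut to the requested width."""
    digits = "1234567890"
    return "".join(digits[i % 10] for i in range(width))


def make_printf_test(text, width):
    """Build a printf(1) command in the style used by this report.

    Printable ASCII is kept literal (except characters that are special
    inside a double-quoted shell string or a printf format); every other
    byte of the UTF-8 encoding is written as a \\xNN escape.
    """
    escaped = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if 0x20 <= byte < 0x7F and ch not in '"\\$`%':
            escaped.append(ch)
        else:
            escaped.append("\\x%02x" % byte)
    return 'printf "%s|\\\\n%s|\\\\n"' % ("".join(escaped), ruler(width))


# The Vietnamese sequence 'toa' + COMBINING GRAVE ACCENT + 'n', width 4.
print(make_printf_test("toa\u0300n", 4))
```

When the terminal agrees with the expected width, the two '|' characters printed by the resulting command land in the same column.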

Maithili

Sequence of language Maithili from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0938      '\u0938'      Lo        1        DEVANAGARI LETTER SA
U+093E      '\u093e'      Mc        0        DEVANAGARI VOWEL SIGN AA
U+0930      '\u0930'      Lo        1        DEVANAGARI LETTER RA
U+094D      '\u094d'      Mn        0        DEVANAGARI SIGN VIRAMA
U+0935      '\u0935'      Lo        1        DEVANAGARI LETTER VA
U+092D      '\u092d'      Lo        1        DEVANAGARI LETTER BHA
U+094C      '\u094c'      Mc        0        DEVANAGARI VOWEL SIGN AU
U+092E      '\u092e'      Lo        1        DEVANAGARI LETTER MA

Total codepoints: 8

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xa4\xb8\xe0\xa4\xbe\xe0\xa4\xb0\xe0\xa5\x8d\xe0\xa4\xb5\xe0\xa4\xad\xe0\xa5\x8c\xe0\xa4\xae|\\n12345|\\n"
    सार्वभौम|
    12345|
  • python wcwidth.wcswidth() measures width 5, while zoc measures width 8.

Thai

Sequence of language Thai from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0E1B      '\u0e1b'      Lo        1        THAI CHARACTER PO PLA
U+0E0F      '\u0e0f'      Lo        1        THAI CHARACTER TO PATAK
U+0E34      '\u0e34'      Mn        0        THAI CHARACTER SARA I
U+0E0D      '\u0e0d'      Lo        1        THAI CHARACTER YO YING
U+0E0D      '\u0e0d'      Lo        1        THAI CHARACTER YO YING
U+0E32      '\u0e32'      Lo        1        THAI CHARACTER SARA AA
U+0E2A      '\u0e2a'      Lo        1        THAI CHARACTER SO SUA
U+0E32      '\u0e32'      Lo        1        THAI CHARACTER SARA AA
U+0E01      '\u0e01'      Lo        1        THAI CHARACTER KO KAI
U+0E25      '\u0e25'      Lo        1        THAI CHARACTER LO LING
U+0E27      '\u0e27'      Lo        1        THAI CHARACTER WO WAEN
U+0E48      '\u0e48'      Mn        0        THAI CHARACTER MAI EK
U+0E32      '\u0e32'      Lo        1        THAI CHARACTER SARA AA
U+0E14      '\u0e14'      Lo        1        THAI CHARACTER DO DEK
U+0E49      '\u0e49'      Mn        0        THAI CHARACTER MAI THO
U+0E27      '\u0e27'      Lo        1        THAI CHARACTER WO WAEN
U+0E22      '\u0e22'      Lo        1        THAI CHARACTER YO YAK
U+0E2A      '\u0e2a'      Lo        1        THAI CHARACTER SO SUA
U+0E34      '\u0e34'      Mn        0        THAI CHARACTER SARA I
U+0E17      '\u0e17'      Lo        1        THAI CHARACTER THO THAHAN
U+0E18      '\u0e18'      Lo        1        THAI CHARACTER THO THONG
U+0E34      '\u0e34'      Mn        0        THAI CHARACTER SARA I
U+0E21      '\u0e21'      Lo        1        THAI CHARACTER MO MA
U+0E19      '\u0e19'      Lo        1        THAI CHARACTER NO NU
U+0E38      '\u0e38'      Mn        0        THAI CHARACTER SARA U
U+0E29      '\u0e29'      Lo        1        THAI CHARACTER SO RUSI
U+0E22      '\u0e22'      Lo        1        THAI CHARACTER YO YAK
U+0E0A      '\u0e0a'      Lo        1        THAI CHARACTER CHO CHANG
U+0E19      '\u0e19'      Lo        1        THAI CHARACTER NO NU

Total codepoints: 29

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xb8\x9b\xe0\xb8\x8f\xe0\xb8\xb4\xe0\xb8\x8d\xe0\xb8\x8d\xe0\xb8\xb2\xe0\xb8\xaa\xe0\xb8\xb2\xe0\xb8\x81\xe0\xb8\xa5\xe0\xb8\xa7\xe0\xb9\x88\xe0\xb8\xb2\xe0\xb8\x94\xe0\xb9\x89\xe0\xb8\xa7\xe0\xb8\xa2\xe0\xb8\xaa\xe0\xb8\xb4\xe0\xb8\x97\xe0\xb8\x98\xe0\xb8\xb4\xe0\xb8\xa1\xe0\xb8\x99\xe0\xb8\xb8\xe0\xb8\xa9\xe0\xb8\xa2\xe0\xb8\x8a\xe0\xb8\x99|\\n12345678901234567890123|\\n"
    ปฏิญญาสากลว่าด้วยสิทธิมนุษยชน|
    12345678901234567890123|
  • python wcwidth.wcswidth() measures width 23, while zoc measures width 29.

Magahi

Sequence of language Magahi from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+092E      '\u092e'      Lo        1        DEVANAGARI LETTER MA
U+093E      '\u093e'      Mc        0        DEVANAGARI VOWEL SIGN AA
U+0928      '\u0928'      Lo        1        DEVANAGARI LETTER NA
U+0935      '\u0935'      Lo        1        DEVANAGARI LETTER VA
U+093E      '\u093e'      Mc        0        DEVANAGARI VOWEL SIGN AA
U+0927      '\u0927'      Lo        1        DEVANAGARI LETTER DHA
U+093F      '\u093f'      Mc        0        DEVANAGARI VOWEL SIGN I
U+0915      '\u0915'      Lo        1        DEVANAGARI LETTER KA
U+093E      '\u093e'      Mc        0        DEVANAGARI VOWEL SIGN AA
U+0930      '\u0930'      Lo        1        DEVANAGARI LETTER RA

Total codepoints: 10

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xa4\xae\xe0\xa4\xbe\xe0\xa4\xa8\xe0\xa4\xb5\xe0\xa4\xbe\xe0\xa4\xa7\xe0\xa4\xbf\xe0\xa4\x95\xe0\xa4\xbe\xe0\xa4\xb0|\\n123456|\\n"
    मानवाधिकार|
    123456|
  • python wcwidth.wcswidth() measures width 6, while zoc measures width 10.

Vietnamese

Sequence of language Vietnamese from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0074      't'           Ll        1        LATIN SMALL LETTER T
U+006F      'o'           Ll        1        LATIN SMALL LETTER O
U+0061      'a'           Ll        1        LATIN SMALL LETTER A
U+0300      '\u0300'      Mn        0        COMBINING GRAVE ACCENT
U+006E      'n'           Ll        1        LATIN SMALL LETTER N

Total codepoints: 5

  • Shell test using printf(1), '|' should align in output:

    $ printf "toa\xcc\x80n|\\n1234|\\n"
    toàn|
    1234|
  • python wcwidth.wcswidth() measures width 4, while zoc measures width 5.
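
To relate the escaped bytes in a shell test back to the codepoint table, the bytes can be decoded as UTF-8 and each resulting character looked up with the standard library's unicodedata module. The sketch below does this for the Vietnamese record; the variable names are illustrative only, and the U+XXXX formatting shown covers the four-digit form used for these characters.

```python
import unicodedata

# Bytes exactly as written in the printf test above (without the '|').
raw = b"toa\xcc\x80n"
text = raw.decode("utf-8")

# One (codepoint, category, name) tuple per decoded character.
rows = [
    ("U+%04X" % ord(ch), unicodedata.category(ch), unicodedata.name(ch))
    for ch in text
]
for row in rows:
    print(*row)
```

The two bytes 0xCC 0x80 decode to the single codepoint U+0300, which is why five codepoints are listed for a six-byte string.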

Tagalog (Tagalog)

Sequence of language Tagalog (Tagalog) from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+170E      '\u170e'      Lo        1        TAGALOG LETTER LA
U+1711      '\u1711'      Lo        1        TAGALOG LETTER HA
U+1706      '\u1706'      Lo        1        TAGALOG LETTER TA
U+1714      '\u1714'      Mn        0        TAGALOG SIGN VIRAMA

Total codepoints: 4

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe1\x9c\x8e\xe1\x9c\x91\xe1\x9c\x86\xe1\x9c\x94|\\n123|\\n"
    ᜎᜑᜆ᜔|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Lao

Sequence of language Lao from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0E9B      '\u0e9b'      Lo        1        LAO LETTER PO
U+0EB0      '\u0eb0'      Lo        1        LAO VOWEL SIGN A
U+0E81      '\u0e81'      Lo        1        LAO LETTER KO
U+0EB2      '\u0eb2'      Lo        1        LAO VOWEL SIGN AA
U+0E94      '\u0e94'      Lo        1        LAO LETTER DO
U+0EAA      '\u0eaa'      Lo        1        LAO LETTER SO SUNG
U+0EB2      '\u0eb2'      Lo        1        LAO VOWEL SIGN AA
U+0E81      '\u0e81'      Lo        1        LAO LETTER KO
U+0EBB      '\u0ebb'      Mn        0        LAO VOWEL SIGN MAI KON
U+0E99      '\u0e99'      Lo        1        LAO LETTER NO

Total codepoints: 10

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe0\xba\x9b\xe0\xba\xb0\xe0\xba\x81\xe0\xba\xb2\xe0\xba\x94\xe0\xba\xaa\xe0\xba\xb2\xe0\xba\x81\xe0\xba\xbb\xe0\xba\x99|\\n123456789|\\n"
    ປະກາດສາກົນ|
    123456789|
  • python wcwidth.wcswidth() measures width 9, while zoc measures width 10.

Lingala (tones)

Sequence of language Lingala (tones) from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+004D      'M'           Lu        1        LATIN CAPITAL LETTER M
U+004F      'O'           Lu        1        LATIN CAPITAL LETTER O
U+004C      'L'           Lu        1        LATIN CAPITAL LETTER L
U+0186      '\u0186'      Lu        1        LATIN CAPITAL LETTER OPEN O
U+0301      '\u0301'      Mn        0        COMBINING ACUTE ACCENT
U+004E      'N'           Lu        1        LATIN CAPITAL LETTER N
U+0047      'G'           Lu        1        LATIN CAPITAL LETTER G
U+0186      '\u0186'      Lu        1        LATIN CAPITAL LETTER OPEN O
U+0301      '\u0301'      Mn        0        COMBINING ACUTE ACCENT

Total codepoints: 9

  • Shell test using printf(1), '|' should align in output:

    $ printf "MOL\xc6\x86\xcc\x81NG\xc6\x86\xcc\x81|\\n1234567|\\n"
    MOLƆ́NGƆ́|
    1234567|
  • python wcwidth.wcswidth() measures width 7, while zoc measures width 9.

Vietnamese (Han nom)

Sequence of language Vietnamese (Han nom) from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0002321C  '\U0002321c'  Lo        2        CJK UNIFIED IDEOGRAPH-2321C
U+0031      '1'           Nd        1        DIGIT ONE
U+0030      '0'           Nd        1        DIGIT ZERO
U+00023383  '\U00023383'  Lo        2        CJK UNIFIED IDEOGRAPH-23383
U+0031      '1'           Nd        1        DIGIT ONE
U+0032      '2'           Nd        1        DIGIT TWO
U+000221A5  '\U000221a5'  Lo        2        CJK UNIFIED IDEOGRAPH-221A5
U+0031      '1'           Nd        1        DIGIT ONE
U+0039      '9'           Nd        1        DIGIT NINE
U+0034      '4'           Nd        1        DIGIT FOUR
U+0038      '8'           Nd        1        DIGIT EIGHT

Total codepoints: 11

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xf0\xa3\x88\x9c10\xf0\xa3\x8e\x8312\xf0\xa2\x86\xa51948|\\n12345678901234|\\n"
    𣈜10𣎃12𢆥1948|
    12345678901234|
  • python wcwidth.wcswidth() measures width 14, while zoc measures width 13.
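
Unlike most records in this section, this one contains no combining marks: the expected width exceeds the codepoint count because the three ideographs are wide. A minimal sketch of that classification, assuming the East Asian Width property from the standard library's unicodedata module as a stand-in for the wcwidth tables (the helper name cell_widths is hypothetical):

```python
import unicodedata


def cell_widths(text):
    """Return 2 for Wide/Fullwidth characters and 1 for everything else.

    Zero-width characters are not handled here; this sequence has none.
    """
    return [
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    ]


# The Han nom date sequence from the record above.
han_nom = "\U0002321c10\U0002338312\U000221a51948"
widths = cell_widths(han_nom)
print(len(han_nom), sum(widths))
```

The sum of the per-character widths is 14 for 11 codepoints. zoc reports 13, one cell short; the record does not identify which character accounts for the difference.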

Pular (Adlam)

Sequence of language Pular (Adlam) from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0001E916  '\U0001e916'  Lu        1        ADLAM CAPITAL LETTER HA
U+0001E90B  '\U0001e90b'  Lu        1        ADLAM CAPITAL LETTER I
U+0001E902  '\U0001e902'  Lu        1        ADLAM CAPITAL LETTER LAAM
U+0001E946  '\U0001e946'  Mn        0        ADLAM GEMINATION MARK
U+0001E900  '\U0001e900'  Lu        1        ADLAM CAPITAL LETTER ALIF
U+0001E912  '\U0001e912'  Lu        1        ADLAM CAPITAL LETTER YA
U+0001E900  '\U0001e900'  Lu        1        ADLAM CAPITAL LETTER ALIF
U+0001E910  '\U0001e910'  Lu        1        ADLAM CAPITAL LETTER NUN
U+0001E911  '\U0001e911'  Lu        1        ADLAM CAPITAL LETTER KAF
U+0001E90C  '\U0001e90c'  Lu        1        ADLAM CAPITAL LETTER O
U+0001E945  '\U0001e945'  Mn        0        ADLAM VOWEL LENGTHENER
U+0001E908  '\U0001e908'  Lu        1        ADLAM CAPITAL LETTER RA
U+0001E909  '\U0001e909'  Lu        1        ADLAM CAPITAL LETTER E

Total codepoints: 13

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xf0\x9e\xa4\x96\xf0\x9e\xa4\x8b\xf0\x9e\xa4\x82\xf0\x9e\xa5\x86\xf0\x9e\xa4\x80\xf0\x9e\xa4\x92\xf0\x9e\xa4\x80\xf0\x9e\xa4\x90\xf0\x9e\xa4\x91\xf0\x9e\xa4\x8c\xf0\x9e\xa5\x85\xf0\x9e\xa4\x88\xf0\x9e\xa4\x89|\\n12345678901|\\n"
    𞤖𞤋𞤂𞥆𞤀𞤒𞤀𞤐𞤑𞤌𞥅𞤈𞤉|
    12345678901|
  • python wcwidth.wcswidth() measures width 11, while zoc measures width 13.

Yiddish, Eastern

Sequence of language Yiddish, Eastern from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+05D0      '\u05d0'      Lo        1        HEBREW LETTER ALEF
U+05B7      '\u05b7'      Mn        0        HEBREW POINT PATAH
U+05DC      '\u05dc'      Lo        1        HEBREW LETTER LAMED
U+05F0      '\u05f0'      Lo        1        HEBREW LIGATURE YIDDISH DOUBLE VAV
U+05E2      '\u05e2'      Lo        1        HEBREW LETTER AYIN
U+05DC      '\u05dc'      Lo        1        HEBREW LETTER LAMED
U+05D8      '\u05d8'      Lo        1        HEBREW LETTER TET
U+05DC      '\u05dc'      Lo        1        HEBREW LETTER LAMED
U+05E2      '\u05e2'      Lo        1        HEBREW LETTER AYIN
U+05DB      '\u05db'      Lo        1        HEBREW LETTER KAF
U+05E2      '\u05e2'      Lo        1        HEBREW LETTER AYIN

Total codepoints: 11

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xd7\x90\xd6\xb7\xd7\x9c\xd7\xb0\xd7\xa2\xd7\x9c\xd7\x98\xd7\x9c\xd7\xa2\xd7\x9b\xd7\xa2|\\n1234567890|\\n"
    אַלװעלטלעכע|
    1234567890|
  • python wcwidth.wcswidth() measures width 10, while zoc measures width 11.

Bamun

Sequence of language Bamun from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+004E      'N'           Lu        1        LATIN CAPITAL LETTER N
U+004A      'J'           Lu        1        LATIN CAPITAL LETTER J
U+0055      'U'           Lu        1        LATIN CAPITAL LETTER U
U+0301      '\u0301'      Mn        0        COMBINING ACUTE ACCENT

Total codepoints: 4

  • Shell test using printf(1), '|' should align in output:

    $ printf "NJU\xcc\x81|\\n123|\\n"
    NJÚ|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 4.
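
Some of the decomposed sequences in this section have a precomposed equivalent, and some do not. As a side illustration, the sketch below applies NFC normalization from the standard library's unicodedata module to two of the records and compares codepoint counts. The helper name nfc_length is hypothetical, and whether zoc aligns the normalized text correctly was not part of these tests.

```python
import unicodedata


def nfc_length(text):
    """Number of codepoints after NFC (canonical composition)."""
    return len(unicodedata.normalize("NFC", text))


# Bamun record: U + COMBINING ACUTE ACCENT composes to a single U WITH ACUTE.
bamun = "NJU\u0301"
# Lingala (tones) record: OPEN O + COMBINING ACUTE ACCENT has no
# precomposed form, so normalization leaves the sequence unchanged.
lingala = "MOL\u0186\u0301NG\u0186\u0301"

print(len(bamun), nfc_length(bamun))
print(len(lingala), nfc_length(lingala))
```

For the Bamun sequence the normalized length (3) matches the expected width, while the Lingala sequence stays at 9 codepoints for an expected width of 7, so normalization cannot replace correct handling of combining marks.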

Orok

Sequence of language Orok from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0427      '\u0427'      Lu        1        CYRILLIC CAPITAL LETTER CHE
U+0438      '\u0438'      Ll        1        CYRILLIC SMALL LETTER I
U+043F      '\u043f'      Ll        1        CYRILLIC SMALL LETTER PE
U+0430      '\u0430'      Ll        1        CYRILLIC SMALL LETTER A
U+0304      '\u0304'      Mn        0        COMBINING MACRON
U+043B      '\u043b'      Ll        1        CYRILLIC SMALL LETTER EL
U+0438      '\u0438'      Ll        1        CYRILLIC SMALL LETTER I
U+043D      '\u043d'      Ll        1        CYRILLIC SMALL LETTER EN
U+043D      '\u043d'      Ll        1        CYRILLIC SMALL LETTER EN
U+0435      '\u0435'      Ll        1        CYRILLIC SMALL LETTER IE
U+0304      '\u0304'      Mn        0        COMBINING MACRON
U+0441      '\u0441'      Ll        1        CYRILLIC SMALL LETTER ES
U+0430      '\u0430'      Ll        1        CYRILLIC SMALL LETTER A
U+043B      '\u043b'      Ll        1        CYRILLIC SMALL LETTER EL

Total codepoints: 14

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xd0\xa7\xd0\xb8\xd0\xbf\xd0\xb0\xcc\x84\xd0\xbb\xd0\xb8\xd0\xbd\xd0\xbd\xd0\xb5\xcc\x84\xd1\x81\xd0\xb0\xd0\xbb|\\n123456789012|\\n"
    Чипа̄линне̄сал|
    123456789012|
  • python wcwidth.wcswidth() measures width 12, while zoc measures width 14.

Tem

Sequence of language Tem from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0196      '\u0196'      Lu        1        LATIN CAPITAL LETTER IOTA
U+0072      'r'           Ll        1        LATIN SMALL LETTER R
U+028A      '\u028a'      Ll        1        LATIN SMALL LETTER UPSILON
U+0301      '\u0301'      Mn        0        COMBINING ACUTE ACCENT
U+002D      '-'           Pd        1        HYPHEN-MINUS
U+0064      'd'           Ll        1        LATIN SMALL LETTER D
U+025B      '\u025b'      Ll        1        LATIN SMALL LETTER OPEN E
U+0301      '\u0301'      Mn        0        COMBINING ACUTE ACCENT
U+025B      '\u025b'      Ll        1        LATIN SMALL LETTER OPEN E

Total codepoints: 9

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xc6\x96r\xca\x8a\xcc\x81-d\xc9\x9b\xcc\x81\xc9\x9b|\\n1234567|\\n"
    Ɩrʊ́-dɛ́ɛ|
    1234567|
  • python wcwidth.wcswidth() measures width 7, while zoc measures width 9.

Nanai

Sequence of language Nanai from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+041D      '\u041d'      Lu        1        CYRILLIC CAPITAL LETTER EN
U+0430      '\u0430'      Ll        1        CYRILLIC SMALL LETTER A
U+0438      '\u0438'      Ll        1        CYRILLIC SMALL LETTER I
U+0306      '\u0306'      Mn        0        COMBINING BREVE

Total codepoints: 4

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xd0\x9d\xd0\xb0\xd0\xb8\xcc\x86|\\n123|\\n"
    Най|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Evenki

Sequence of language Evenki from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0411      '\u0411'      Lu        1        CYRILLIC CAPITAL LETTER BE
U+0443      '\u0443'      Ll        1        CYRILLIC SMALL LETTER U
U+0433      '\u0433'      Ll        1        CYRILLIC SMALL LETTER GHE
U+0430      '\u0430'      Ll        1        CYRILLIC SMALL LETTER A
U+0304      '\u0304'      Mn        0        COMBINING MACRON
U+0434      '\u0434'      Ll        1        CYRILLIC SMALL LETTER DE
U+0443      '\u0443'      Ll        1        CYRILLIC SMALL LETTER U

Total codepoints: 7

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xd0\x91\xd1\x83\xd0\xb3\xd0\xb0\xcc\x84\xd0\xb4\xd1\x83|\\n123456|\\n"
    Буга̄ду|
    123456|
  • python wcwidth.wcswidth() measures width 6, while zoc measures width 7.

Yaneshaʼ

Sequence of language Yaneshaʼ from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0303      '\u0303'      Mn        0        COMBINING TILDE
U+0061      'a'           Ll        1        LATIN SMALL LETTER A
U+006C      'l'           Ll        1        LATIN SMALL LETTER L
U+006C      'l'           Ll        1        LATIN SMALL LETTER L
U+006F      'o'           Ll        1        LATIN SMALL LETTER O
U+0068      'h'           Ll        1        LATIN SMALL LETTER H
U+0075      'u'           Ll        1        LATIN SMALL LETTER U
U+0065      'e'           Ll        1        LATIN SMALL LETTER E
U+006E      'n'           Ll        1        LATIN SMALL LETTER N

Total codepoints: 9

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xcc\x83allohuen|\\n12345678|\\n"
    ̃allohuen|
    12345678|
  • python wcwidth.wcswidth() measures width 8, while zoc measures width 9.

Ticuna

Sequence of language Ticuna from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+004E      'N'           Lu        1        LATIN CAPITAL LETTER N
U+00FC      '\xfc'        Ll        1        LATIN SMALL LETTER U WITH DIAERESIS
U+0078      'x'           Ll        1        LATIN SMALL LETTER X
U+00FC      '\xfc'        Ll        1        LATIN SMALL LETTER U WITH DIAERESIS
U+0303      '\u0303'      Mn        0        COMBINING TILDE

Total codepoints: 5

  • Shell test using printf(1), '|' should align in output:

    $ printf "N\xc3\xbcx\xc3\xbc\xcc\x83|\\n1234|\\n"
    Nüxü̃|
    1234|
  • python wcwidth.wcswidth() measures width 4, while zoc measures width 5.

Amarakaeri

Sequence of language Amarakaeri from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+006F      'o'           Ll        1        LATIN SMALL LETTER O
U+0027      "'"           Po        1        APOSTROPHE
U+006E      'n'           Ll        1        LATIN SMALL LETTER N
U+006F      'o'           Ll        1        LATIN SMALL LETTER O
U+0070      'p'           Ll        1        LATIN SMALL LETTER P
U+006F      'o'           Ll        1        LATIN SMALL LETTER O
U+0065      'e'           Ll        1        LATIN SMALL LETTER E
U+0331      '\u0331'      Mn        0        COMBINING MACRON BELOW
U+0070      'p'           Ll        1        LATIN SMALL LETTER P
U+006F      'o'           Ll        1        LATIN SMALL LETTER O

Total codepoints: 10

  • Shell test using printf(1), '|' should align in output:

    $ printf "o'nopoe\xcc\xb1po|\\n123456789|\\n"
    o'nopoe̱po|
    123456789|
  • python wcwidth.wcswidth() measures width 9, while zoc measures width 10.

South Azerbaijani

Sequence of language South Azerbaijani from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0049      'I'           Lu        1        LATIN CAPITAL LETTER I
U+0307      '\u0307'      Mn        0        COMBINING DOT ABOVE
U+004E      'N'           Lu        1        LATIN CAPITAL LETTER N
U+0053      'S'           Lu        1        LATIN CAPITAL LETTER S
U+0041      'A'           Lu        1        LATIN CAPITAL LETTER A
U+004E      'N'           Lu        1        LATIN CAPITAL LETTER N

Total codepoints: 6

  • Shell test using printf(1), '|' should align in output:

    $ printf "I\xcc\x87NSAN|\\n12345|\\n"
    İNSAN|
    12345|
  • python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Yoruba

Sequence of language Yoruba from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+1EB8      '\u1eb8'      Lu        1        LATIN CAPITAL LETTER E WITH DOT BELOW
U+0300      '\u0300'      Mn        0        COMBINING GRAVE ACCENT
U+0054      'T'           Lu        1        LATIN CAPITAL LETTER T
U+1ECC      '\u1ecc'      Lu        1        LATIN CAPITAL LETTER O WITH DOT BELOW
U+0301      '\u0301'      Mn        0        COMBINING ACUTE ACCENT

Total codepoints: 5

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe1\xba\xb8\xcc\x80T\xe1\xbb\x8c\xcc\x81|\\n123|\\n"
    Ẹ̀TỌ́|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 5.
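
Another way to read the table above is that each combining mark attaches to the letter before it, so the five codepoints form three visible units. The sketch below groups marks with their preceding base character using the general category from the standard library's unicodedata module. It is a simplification for illustration, not the full grapheme cluster segmentation defined by Unicode, and the helper name clusters is hypothetical.

```python
import unicodedata


def clusters(text):
    """Group each mark (category Mn, Mc, Me) with the preceding character.

    A mark with nothing before it starts its own group.
    """
    groups = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and groups:
            groups[-1] += ch
        else:
            groups.append(ch)
    return groups


# The Yoruba sequence from the record above.
yoruba = "\u1eb8\u0300T\u1ecc\u0301"
groups = clusters(yoruba)
print(len(yoruba), len(groups))
```

Here the number of groups (3) equals the expected width, while the codepoint count (5) equals the width zoc reports.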

Chickasaw

Sequence of language Chickasaw from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+004D      'M'           Lu        1        LATIN CAPITAL LETTER M
U+00F3      '\xf3'        Ll        1        LATIN SMALL LETTER O WITH ACUTE
U+0331      '\u0331'      Mn        0        COMBINING MACRON BELOW
U+006D      'm'           Ll        1        LATIN SMALL LETTER M
U+0061      'a'           Ll        1        LATIN SMALL LETTER A

Total codepoints: 5

  • Shell test using printf(1), '|' should align in output:

    $ printf "M\xc3\xb3\xcc\xb1ma|\\n1234|\\n"
    Mó̱ma|
    1234|
  • python wcwidth.wcswidth() measures width 4, while zoc measures width 5.

Siona

Sequence of language Siona from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0067      'g'           Ll        1        LATIN SMALL LETTER G
U+0075      'u'           Ll        1        LATIN SMALL LETTER U
U+00EB      '\xeb'        Ll        1        LATIN SMALL LETTER E WITH DIAERESIS
U+0331      '\u0331'      Mn        0        COMBINING MACRON BELOW
U+006E      'n'           Ll        1        LATIN SMALL LETTER N
U+0061      'a'           Ll        1        LATIN SMALL LETTER A

Total codepoints: 6

  • Shell test using printf(1), '|' should align in output:

    $ printf "gu\xc3\xab\xcc\xb1na|\\n12345|\\n"
    guë̱na|
    12345|
  • python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Fur

Sequence of language Fur from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0044      'D'           Lu        1        LATIN CAPITAL LETTER D
U+00E1      '\xe1'        Ll        1        LATIN SMALL LETTER A WITH ACUTE
U+0331      '\u0331'      Mn        0        COMBINING MACRON BELOW
U+006C      'l'           Ll        1        LATIN SMALL LETTER L
U+0064      'd'           Ll        1        LATIN SMALL LETTER D
U+0268      '\u0268'      Ll        1        LATIN SMALL LETTER I WITH STROKE
U+0301      '\u0301'      Mn        0        COMBINING ACUTE ACCENT
U+014B      '\u014b'      Ll        1        LATIN SMALL LETTER ENG
U+00E1      '\xe1'        Ll        1        LATIN SMALL LETTER A WITH ACUTE
U+A78C      '\ua78c'      Ll        1        LATIN SMALL LETTER SALTILLO
U+014B      '\u014b'      Ll        1        LATIN SMALL LETTER ENG

Total codepoints: 11

  • Shell test using printf(1), '|' should align in output:

    $ printf "D\xc3\xa1\xcc\xb1ld\xc9\xa8\xcc\x81\xc5\x8b\xc3\xa1\xea\x9e\x8c\xc5\x8b|\\n123456789|\\n"
    Dá̱ldɨ́ŋáꞌŋ|
    123456789|
  • python wcwidth.wcswidth() measures width 9, while zoc measures width 11.

Chinantec, Chiltepec

Sequence of language Chinantec, Chiltepec from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+006D      'm'           Ll        1        LATIN SMALL LETTER M
U+0061      'a'           Ll        1        LATIN SMALL LETTER A
U+006B      'k'           Ll        1        LATIN SMALL LETTER K
U+0061      'a'           Ll        1        LATIN SMALL LETTER A
U+006C      'l'           Ll        1        LATIN SMALL LETTER L
U+006F      'o'           Ll        1        LATIN SMALL LETTER O
U+006F      'o'           Ll        1        LATIN SMALL LETTER O
U+0331      '\u0331'      Mn        0        COMBINING MACRON BELOW

Total codepoints: 8

  • Shell test using printf(1), '|' should align in output:

    $ printf "makaloo\xcc\xb1|\\n1234567|\\n"
    makaloo̱|
    1234567|
  • python wcwidth.wcswidth() measures width 7, while zoc measures width 8.

Gumuz

Sequence of language Gumuz from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+006D      'm'           Ll        1        LATIN SMALL LETTER M
U+0061      'a'           Ll        1        LATIN SMALL LETTER A
U+0067      'g'           Ll        1        LATIN SMALL LETTER G
U+0061      'a'           Ll        1        LATIN SMALL LETTER A
U+0063      'c'           Ll        1        LATIN SMALL LETTER C
U+0327      '\u0327'      Mn        0        COMBINING CEDILLA

Total codepoints: 6

  • Shell test using printf(1), '|' should align in output:

    $ printf "magac\xcc\xa7|\\n12345|\\n"
    magaç|
    12345|
  • python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Bora

Sequence of language Bora from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+006D      'm'           Ll        1        LATIN SMALL LETTER M
U+0268      '\u0268'      Ll        1        LATIN SMALL LETTER I WITH STROKE
U+0301      '\u0301'      Mn        0        COMBINING ACUTE ACCENT
U+0061      'a'           Ll        1        LATIN SMALL LETTER A
U+006D      'm'           Ll        1        LATIN SMALL LETTER M
U+00FA      '\xfa'        Ll        1        LATIN SMALL LETTER U WITH ACUTE
U+006E      'n'           Ll        1        LATIN SMALL LETTER N
U+0061      'a'           Ll        1        LATIN SMALL LETTER A
U+0061      'a'           Ll        1        LATIN SMALL LETTER A

Total codepoints: 9

  • Shell test using printf(1), '|' should align in output:

    $ printf "m\xc9\xa8\xcc\x81am\xc3\xbanaa|\\n12345678|\\n"
    mɨ́amúnaa|
    12345678|
  • python wcwidth.wcswidth() measures width 8, while zoc measures width 9.

Mòoré

Sequence of language Mòoré from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0073      's'           Ll        1        LATIN SMALL LETTER S
U+0065      'e'           Ll        1        LATIN SMALL LETTER E
U+0303      '\u0303'      Mn        0        COMBINING TILDE
U+006E      'n'           Ll        1        LATIN SMALL LETTER N

Total codepoints: 4

  • Shell test using printf(1), '|' should align in output:

    $ printf "se\xcc\x83n|\\n123|\\n"
    sẽn|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Mongolian, Halh (Mongolian)

Sequence of language Mongolian, Halh (Mongolian) from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+1828      '\u1828'      Lo        1        MONGOLIAN LETTER NA
U+1821      '\u1821'      Lo        1        MONGOLIAN LETTER E
U+1837      '\u1837'      Lo        1        MONGOLIAN LETTER RA
U+180E      '\u180e'      Cf        0        MONGOLIAN VOWEL SEPARATOR
U+1821      '\u1821'      Lo        1        MONGOLIAN LETTER E

Total codepoints: 5

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xe1\xa0\xa8\xe1\xa0\xa1\xe1\xa0\xb7\xe1\xa0\x8e\xe1\xa0\xa1|\\n1234|\\n"
    ᠨᠡᠷ᠎ᠡ|
    1234|
  • python wcwidth.wcswidth() measures width 4, while zoc measures width 5.

Lamnso'

Sequence of language Lamnso' from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0064      'd'           Ll        1        LATIN SMALL LETTER D
U+007A      'z'           Ll        1        LATIN SMALL LETTER Z
U+0259      '\u0259'      Ll        1        LATIN SMALL LETTER SCHWA
U+0259      '\u0259'      Ll        1        LATIN SMALL LETTER SCHWA
U+0300      '\u0300'      Mn        0        COMBINING GRAVE ACCENT
U+006E      'n'           Ll        1        LATIN SMALL LETTER N

Total codepoints: 6

  • Shell test using printf(1), '|' should align in output:

    $ printf "dz\xc9\x99\xc9\x99\xcc\x80n|\\n12345|\\n"
    dzəə̀n|
    12345|
  • python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Navajo

Sequence of language Navajo from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0042      'B'           Lu        1        LATIN CAPITAL LETTER B
U+0065      'e'           Ll        1        LATIN SMALL LETTER E
U+0065      'e'           Ll        1        LATIN SMALL LETTER E
U+0068      'h'           Ll        1        LATIN SMALL LETTER H
U+0061      'a'           Ll        1        LATIN SMALL LETTER A
U+007A      'z'           Ll        1        LATIN SMALL LETTER Z
U+0105      '\u0105'      Ll        1        LATIN SMALL LETTER A WITH OGONEK
U+0301      '\u0301'      Mn        0        COMBINING ACUTE ACCENT
U+0105      '\u0105'      Ll        1        LATIN SMALL LETTER A WITH OGONEK

Total codepoints: 9

  • Shell test using printf(1), '|' should align in output:

    $ printf "Beehaz\xc4\x85\xcc\x81\xc4\x85|\\n12345678|\\n"
    Beehazą́ą|
    12345678|
  • python wcwidth.wcswidth() measures width 8, while zoc measures width 9.

Tamazight, Central Atlas

Sequence of language Tamazight, Central Atlas from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+0054      'T'           Lu        1        LATIN CAPITAL LETTER T
U+0049      'I'           Lu        1        LATIN CAPITAL LETTER I
U+0053      'S'           Lu        1        LATIN CAPITAL LETTER S
U+0323      '\u0323'      Mn        0        COMBINING DOT BELOW
U+0045      'E'           Lu        1        LATIN CAPITAL LETTER E
U+0052      'R'           Lu        1        LATIN CAPITAL LETTER R
U+0052      'R'           Lu        1        LATIN CAPITAL LETTER R
U+0049      'I'           Lu        1        LATIN CAPITAL LETTER I
U+0048      'H'           Lu        1        LATIN CAPITAL LETTER H
U+0323      '\u0323'      Mn        0        COMBINING DOT BELOW
U+0054      'T'           Lu        1        LATIN CAPITAL LETTER T

Total codepoints: 11

  • Shell test using printf(1), '|' should align in output:

    $ printf "TIS\xcc\xa3ERRIH\xcc\xa3T|\\n123456789|\\n"
    TIṢERRIḤT|
    123456789|
  • python wcwidth.wcswidth() measures width 9, while zoc measures width 11.

Gilyak

Sequence of language Gilyak from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+043D      '\u043d'      Ll        1        CYRILLIC SMALL LETTER EN
U+0430      '\u0430'      Ll        1        CYRILLIC SMALL LETTER A
U+043C      '\u043c'      Ll        1        CYRILLIC SMALL LETTER EM
U+0430      '\u0430'      Ll        1        CYRILLIC SMALL LETTER A
U+0434      '\u0434'      Ll        1        CYRILLIC SMALL LETTER DE
U+0438      '\u0438'      Ll        1        CYRILLIC SMALL LETTER I
U+0432      '\u0432'      Ll        1        CYRILLIC SMALL LETTER VE
U+04CA      '\u04ca'      Ll        1        CYRILLIC SMALL LETTER EN WITH TAIL
U+0447      '\u0447'      Ll        1        CYRILLIC SMALL LETTER CHE
U+043E      '\u043e'      Ll        1        CYRILLIC SMALL LETTER O
U+0493      '\u0493'      Ll        1        CYRILLIC SMALL LETTER GHE WITH STROKE
U+0440      '\u0440'      Ll        1        CYRILLIC SMALL LETTER ER
U+030C      '\u030c'      Mn        0        COMBINING CARON

Total codepoints: 13

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xd0\xbd\xd0\xb0\xd0\xbc\xd0\xb0\xd0\xb4\xd0\xb8\xd0\xb2\xd3\x8a\xd1\x87\xd0\xbe\xd2\x93\xd1\x80\xcc\x8c|\\n123456789012|\\n"
    намадивӊчоғр̌|
    123456789012|
  • python wcwidth.wcswidth() measures width 12, while zoc measures width 13.

Ditammari

Sequence of language Ditammari from midpoint of alignment failure records:

Codepoint   Python        Category  wcwidth  Name
U+006D      'm'           Ll        1        LATIN SMALL LETTER M
U+0075      'u'           Ll        1        LATIN SMALL LETTER U
U+0077      'w'           Ll        1        LATIN SMALL LETTER W
U+025B      '\u025b'      Ll        1        LATIN SMALL LETTER OPEN E
U+0303      '\u0303'      Mn        0        COMBINING TILDE
U+0072      'r'           Ll        1        LATIN SMALL LETTER R
U+0069      'i'           Ll        1        LATIN SMALL LETTER I
U+006D      'm'           Ll        1        LATIN SMALL LETTER M
U+0075      'u'           Ll        1        LATIN SMALL LETTER U

Total codepoints: 9

  • Shell test using printf(1), '|' should align in output:

    $ printf "muw\xc9\x9b\xcc\x83rimu|\\n12345678|\\n"
    muwɛ̃rimu|
    12345678|
  • python wcwidth.wcswidth() measures width 8, while zoc measures width 9.

Assyrian Neo-Aramaic

Sequence of language Assyrian Neo-Aramaic from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+072C     '\u072c'  Lo        1        SYRIAC LETTER TAW
U+071D     '\u071d'  Lo        1        SYRIAC LETTER YUDH
U+0712     '\u0712'  Lo        1        SYRIAC LETTER BETH
U+0742     '\u0742'  Mn        0        SYRIAC RUKKAKHA
U+0720     '\u0720'  Lo        1        SYRIAC LETTER LAMADH
U+071D     '\u071d'  Lo        1        SYRIAC LETTER YUDH
U+0710     '\u0710'  Lo        1        SYRIAC LETTER ALAPH

Total codepoints: 7

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xdc\xac\xdc\x9d\xdc\x92\xdd\x82\xdc\xa0\xdc\x9d\xdc\x90|\\n123456|\\n"
    ܬܝܒ݂ܠܝܐ|
    123456|
  • python wcwidth.wcswidth() measures width 6, while zoc measures width 7.

Farsi, Western

Sequence of language Farsi, Western from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+06A9     '\u06a9'  Lo        1        ARABIC LETTER KEHEH
U+0644     '\u0644'  Lo        1        ARABIC LETTER LAM
U+06CC     '\u06cc'  Lo        1        ARABIC LETTER FARSI YEH
U+0647     '\u0647'  Lo        1        ARABIC LETTER HEH
U+0654     '\u0654'  Mn        0        ARABIC HAMZA ABOVE

Total codepoints: 5

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xda\xa9\xd9\x84\xdb\x8c\xd9\x87\xd9\x94|\\n1234|\\n"
    کلیهٔ|
    1234|
  • python wcwidth.wcswidth() measures width 4, while zoc measures width 5.
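The byte escapes in the printf(1) test map directly onto the codepoints listed for this record. As a minimal sketch, using only the standard `unicodedata` module and a hypothetical helper named `describe`, the bytes can be decoded back into codepoint, category, and name:

```python
import unicodedata

def describe(utf8_bytes: bytes):
    """Yield (codepoint, category, name) for each character of a UTF-8 byte string."""
    for ch in utf8_bytes.decode("utf-8"):
        yield f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch, "<unnamed>")

# Bytes copied from the printf(1) test of this record
sample = b"\xda\xa9\xd9\x84\xdb\x8c\xd9\x87\xd9\x94"

for row in describe(sample):
    print(*row)
```

The final entry has category Mn, the nonspacing mark that accounts for the one-cell difference reported above.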

Otomi, Mezquital

Sequence of language Otomi, Mezquital from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0058     'X'       Lu        1        LATIN CAPITAL LETTER X
U+0049     'I'       Lu        1        LATIN CAPITAL LETTER I
U+004A     'J'       Lu        1        LATIN CAPITAL LETTER J
U+004D     'M'       Lu        1        LATIN CAPITAL LETTER M
U+004F     'O'       Lu        1        LATIN CAPITAL LETTER O
U+004A     'J'       Lu        1        LATIN CAPITAL LETTER J
U+004F     'O'       Lu        1        LATIN CAPITAL LETTER O
U+0331     '\u0331'  Mn        0        COMBINING MACRON BELOW
U+0049     'I'       Lu        1        LATIN CAPITAL LETTER I

Total codepoints: 9

  • Shell test using printf(1), '|' should align in output:

    $ printf "XIJMOJO\xcc\xb1I|\\n12345678|\\n"
    XIJMOJO̱I|
    12345678|
  • python wcwidth.wcswidth() measures width 8, while zoc measures width 9.

Veps

Sequence of language Veps from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0075     'u'       Ll        1        LATIN SMALL LETTER U
U+0308     '\u0308'  Mn        0        COMBINING DIAERESIS
U+0068     'h'       Ll        1        LATIN SMALL LETTER H
U+0074     't'       Ll        1        LATIN SMALL LETTER T
U+0068     'h'       Ll        1        LATIN SMALL LETTER H
U+0069     'i'       Ll        1        LATIN SMALL LETTER I
U+006E     'n'       Ll        1        LATIN SMALL LETTER N
U+0065     'e'       Ll        1        LATIN SMALL LETTER E

Total codepoints: 8

  • Shell test using printf(1), '|' should align in output:

    $ printf "u\xcc\x88hthine|\\n1234567|\\n"
    ühthine|
    1234567|
  • python wcwidth.wcswidth() measures width 7, while zoc measures width 8.

Waama

Sequence of language Waama from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+006E     'n'       Ll        1        LATIN SMALL LETTER N
U+0300     '\u0300'  Mn        0        COMBINING GRAVE ACCENT

Total codepoints: 2

  • Shell test using printf(1), '|' should align in output:

    $ printf "n\xcc\x80|\\n1|\\n"
    ǹ|
    1|
  • python wcwidth.wcswidth() measures width 1, while zoc measures width 2.
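A test line of this shape can be regenerated from any string and its expected width. The sketch below is an illustration only: `printf_test` is a hypothetical helper, not part of ucs-detect, and the rule that printable ASCII stays literal while every other byte becomes a `\xNN` escape is inferred from the commands shown in this report.

```python
# Characters that would need special care inside a double-quoted printf format
UNSAFE = set('"\\$`%')

def printf_test(text: str, expected_width: int) -> str:
    """Build a printf(1) alignment test resembling the ones in this report."""
    parts = []
    for b in text.encode("utf-8"):
        if 0x20 <= b < 0x7F and chr(b) not in UNSAFE:
            parts.append(chr(b))         # printable ASCII is kept as-is
        else:
            parts.append(f"\\x{b:02x}")  # any other byte becomes a \xNN escape
    # Ruler of repeating digits 1..9,0 as wide as the expected width
    ruler = "".join(str(i % 10) for i in range(1, expected_width + 1))
    return 'printf "' + "".join(parts) + "|\\\\n" + ruler + "|\\\\n" + '"'

print(printf_test("n\u0300", 1))
```

When the terminal agrees with the expected width, the two `|` characters line up in the output. When it does not, as with zoc here, the first `|` lands to the right of the second.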

Dinka, Northeastern

Sequence of language Dinka, Northeastern from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0062     'b'       Ll        1        LATIN SMALL LETTER B
U+025B     '\u025b'  Ll        1        LATIN SMALL LETTER OPEN E
U+0308     '\u0308'  Mn        0        COMBINING DIAERESIS
U+0069     'i'       Ll        1        LATIN SMALL LETTER I

Total codepoints: 4

  • Shell test using printf(1), '|' should align in output:

    $ printf "b\xc9\x9b\xcc\x88i|\\n123|\\n"
    bɛ̈i|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Dari

Sequence of language Dari from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+06A9     '\u06a9'  Lo        1        ARABIC LETTER KEHEH
U+0644     '\u0644'  Lo        1        ARABIC LETTER LAM
U+06CC     '\u06cc'  Lo        1        ARABIC LETTER FARSI YEH
U+0647     '\u0647'  Lo        1        ARABIC LETTER HEH
U+0654     '\u0654'  Mn        0        ARABIC HAMZA ABOVE

Total codepoints: 5

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xda\xa9\xd9\x84\xdb\x8c\xd9\x87\xd9\x94|\\n1234|\\n"
    کلیهٔ|
    1234|
  • python wcwidth.wcswidth() measures width 4, while zoc measures width 5.

Éwé

Sequence of language Éwé from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0068     'h'       Ll        1        LATIN SMALL LETTER H
U+006C     'l'       Ll        1        LATIN SMALL LETTER L
U+0254     '\u0254'  Ll        1        LATIN SMALL LETTER OPEN O
U+0303     '\u0303'  Mn        0        COMBINING TILDE
U+006E     'n'       Ll        1        LATIN SMALL LETTER N
U+0075     'u'       Ll        1        LATIN SMALL LETTER U
U+0077     'w'       Ll        1        LATIN SMALL LETTER W
U+0254     '\u0254'  Ll        1        LATIN SMALL LETTER OPEN O
U+0077     'w'       Ll        1        LATIN SMALL LETTER W
U+0254     '\u0254'  Ll        1        LATIN SMALL LETTER OPEN O

Total codepoints: 10

  • Shell test using printf(1), '|' should align in output:

    $ printf "hl\xc9\x94\xcc\x83nuw\xc9\x94w\xc9\x94|\\n123456789|\\n"
    hlɔ̃nuwɔwɔ|
    123456789|
  • python wcwidth.wcswidth() measures width 9, while zoc measures width 10.

Baatonum

Sequence of language Baatonum from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+006D     'm'       Ll        1        LATIN SMALL LETTER M
U+025B     '\u025b'  Ll        1        LATIN SMALL LETTER OPEN E
U+0300     '\u0300'  Mn        0        COMBINING GRAVE ACCENT

Total codepoints: 3

  • Shell test using printf(1), '|' should align in output:

    $ printf "m\xc9\x9b\xcc\x80|\\n12|\\n"
    mɛ̀|
    12|
  • python wcwidth.wcswidth() measures width 2, while zoc measures width 3.

Urdu (2)

Sequence of language Urdu (2) from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0627     '\u0627'  Lo        1        ARABIC LETTER ALEF
U+0642     '\u0642'  Lo        1        ARABIC LETTER QAF
U+0648     '\u0648'  Lo        1        ARABIC LETTER WAW
U+0627     '\u0627'  Lo        1        ARABIC LETTER ALEF
U+0645     '\u0645'  Lo        1        ARABIC LETTER MEEM
U+0650     '\u0650'  Mn        0        ARABIC KASRA

Total codepoints: 6

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xd8\xa7\xd9\x82\xd9\x88\xd8\xa7\xd9\x85\xd9\x90|\\n12345|\\n"
    اقوامِ|
    12345|
  • python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Urdu

Sequence of language Urdu from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0627     '\u0627'  Lo        1        ARABIC LETTER ALEF
U+0642     '\u0642'  Lo        1        ARABIC LETTER QAF
U+0648     '\u0648'  Lo        1        ARABIC LETTER WAW
U+0627     '\u0627'  Lo        1        ARABIC LETTER ALEF
U+0645     '\u0645'  Lo        1        ARABIC LETTER MEEM
U+0650     '\u0650'  Mn        0        ARABIC KASRA

Total codepoints: 6

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xd8\xa7\xd9\x82\xd9\x88\xd8\xa7\xd9\x85\xd9\x90|\\n12345|\\n"
    اقوامِ|
    12345|
  • python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Uduk

Sequence of language Uduk from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0070     'p'       Ll        1        LATIN SMALL LETTER P
U+0331     '\u0331'  Mn        0        COMBINING MACRON BELOW
U+0061     'a'       Ll        1        LATIN SMALL LETTER A
U+0072     'r'       Ll        1        LATIN SMALL LETTER R
U+0061     'a'       Ll        1        LATIN SMALL LETTER A

Total codepoints: 5

  • Shell test using printf(1), '|' should align in output:

    $ printf "p\xcc\xb1ara|\\n1234|\\n"
    p̱ara|
    1234|
  • python wcwidth.wcswidth() measures width 4, while zoc measures width 5.

Mazahua Central

Sequence of language Mazahua Central from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0054     'T'       Lu        1        LATIN CAPITAL LETTER T
U+0045     'E'       Lu        1        LATIN CAPITAL LETTER E
U+0331     '\u0331'  Mn        0        COMBINING MACRON BELOW
U+0027     "'"       Po        1        APOSTROPHE
U+0045     'E'       Lu        1        LATIN CAPITAL LETTER E
U+0331     '\u0331'  Mn        0        COMBINING MACRON BELOW

Total codepoints: 6

  • Shell test using printf(1), '|' should align in output:

    $ printf "TE\xcc\xb1'E\xcc\xb1|\\n1234|\\n"
    TE̱'E̱|
    1234|
  • python wcwidth.wcswidth() measures width 4, while zoc measures width 6.

Secoya

Sequence of language Secoya from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0063     'c'       Ll        1        LATIN SMALL LETTER C
U+0061     'a'       Ll        1        LATIN SMALL LETTER A
U+006E     'n'       Ll        1        LATIN SMALL LETTER N
U+00EB     '\xeb'    Ll        1        LATIN SMALL LETTER E WITH DIAERESIS
U+006F     'o'       Ll        1        LATIN SMALL LETTER O
U+0077     'w'       Ll        1        LATIN SMALL LETTER W
U+00EB     '\xeb'    Ll        1        LATIN SMALL LETTER E WITH DIAERESIS
U+0331     '\u0331'  Mn        0        COMBINING MACRON BELOW

Total codepoints: 8

  • Shell test using printf(1), '|' should align in output:

    $ printf "can\xc3\xabow\xc3\xab\xcc\xb1|\\n1234567|\\n"
    canëowë̱|
    1234567|
  • python wcwidth.wcswidth() measures width 7, while zoc measures width 8.

Gen

Sequence of language Gen from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0064     'd'       Ll        1        LATIN SMALL LETTER D
U+0254     '\u0254'  Ll        1        LATIN SMALL LETTER OPEN O
U+0300     '\u0300'  Mn        0        COMBINING GRAVE ACCENT
U+006E     'n'       Ll        1        LATIN SMALL LETTER N
U+006E     'n'       Ll        1        LATIN SMALL LETTER N
U+0061     'a'       Ll        1        LATIN SMALL LETTER A

Total codepoints: 6

  • Shell test using printf(1), '|' should align in output:

    $ printf "d\xc9\x94\xcc\x80nna|\\n12345|\\n"
    dɔ̀nna|
    12345|
  • python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Picard

Sequence of language Picard from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0076     'v'       Ll        1        LATIN SMALL LETTER V
U+0072     'r'       Ll        1        LATIN SMALL LETTER R
U+0065     'e'       Ll        1        LATIN SMALL LETTER E
U+030A     '\u030a'  Mn        0        COMBINING RING ABOVE
U+0079     'y'       Ll        1        LATIN SMALL LETTER Y
U+006D     'm'       Ll        1        LATIN SMALL LETTER M
U+0069     'i'       Ll        1        LATIN SMALL LETTER I
U+006E     'n'       Ll        1        LATIN SMALL LETTER N
U+0074     't'       Ll        1        LATIN SMALL LETTER T

Total codepoints: 9

  • Shell test using printf(1), '|' should align in output:

    $ printf "vre\xcc\x8aymint|\\n12345678|\\n"
    vre̊ymint|
    12345678|
  • python wcwidth.wcswidth() measures width 8, while zoc measures width 9.

Mixtec, Metlatónoc

Sequence of language Mixtec, Metlatónoc from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+006E     'n'       Ll        1        LATIN SMALL LETTER N
U+0061     'a'       Ll        1        LATIN SMALL LETTER A
U+0027     "'"       Po        1        APOSTROPHE
U+006E     'n'       Ll        1        LATIN SMALL LETTER N
U+0075     'u'       Ll        1        LATIN SMALL LETTER U
U+0331     '\u0331'  Mn        0        COMBINING MACRON BELOW

Total codepoints: 6

  • Shell test using printf(1), '|' should align in output:

    $ printf "na'nu\xcc\xb1|\\n12345|\\n"
    na'nu̱|
    12345|
  • python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Arabic, Standard

Sequence of language Arabic, Standard from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0627     '\u0627'  Lo        1        ARABIC LETTER ALEF
U+0639     '\u0639'  Lo        1        ARABIC LETTER AIN
U+062A     '\u062a'  Lo        1        ARABIC LETTER TEH
U+064F     '\u064f'  Mn        0        ARABIC DAMMA
U+0645     '\u0645'  Lo        1        ARABIC LETTER MEEM
U+062F     '\u062f'  Lo        1        ARABIC LETTER DAL

Total codepoints: 6

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xd8\xa7\xd8\xb9\xd8\xaa\xd9\x8f\xd9\x85\xd8\xaf|\\n12345|\\n"
    اعتُمد|
    12345|
  • python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Ga

Sequence of language Ga from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0061     'a'       Ll        1        LATIN SMALL LETTER A
U+0073     's'       Ll        1        LATIN SMALL LETTER S
U+0068     'h'       Ll        1        LATIN SMALL LETTER H
U+0254     '\u0254'  Ll        1        LATIN SMALL LETTER OPEN O
U+0303     '\u0303'  Mn        0        COMBINING TILDE

Total codepoints: 5

  • Shell test using printf(1), '|' should align in output:

    $ printf "ash\xc9\x94\xcc\x83|\\n1234|\\n"
    ashɔ̃|
    1234|
  • python wcwidth.wcswidth() measures width 4, while zoc measures width 5.

Panjabi, Western

Sequence of language Panjabi, Western from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0627     '\u0627'  Lo        1        ARABIC LETTER ALEF
U+064F     '\u064f'  Mn        0        ARABIC DAMMA
U+0646     '\u0646'  Lo        1        ARABIC LETTER NOON
U+06CC     '\u06cc'  Lo        1        ARABIC LETTER FARSI YEH

Total codepoints: 4

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xd8\xa7\xd9\x8f\xd9\x86\xdb\x8c|\\n123|\\n"
    اُنی|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Dangme

Sequence of language Dangme from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+006E     'n'       Ll        1        LATIN SMALL LETTER N
U+0254     '\u0254'  Ll        1        LATIN SMALL LETTER OPEN O
U+0301     '\u0301'  Mn        0        COMBINING ACUTE ACCENT

Total codepoints: 3

  • Shell test using printf(1), '|' should align in output:

    $ printf "n\xc9\x94\xcc\x81|\\n12|\\n"
    nɔ́|
    12|
  • python wcwidth.wcswidth() measures width 2, while zoc measures width 3.

Dagaare, Southern

Sequence of language Dagaare, Southern from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+006B     'k'       Ll        1        LATIN SMALL LETTER K
U+0075     'u'       Ll        1        LATIN SMALL LETTER U
U+0303     '\u0303'  Mn        0        COMBINING TILDE
U+0075     'u'       Ll        1        LATIN SMALL LETTER U
U+0303     '\u0303'  Mn        0        COMBINING TILDE

Total codepoints: 5

  • Shell test using printf(1), '|' should align in output:

    $ printf "ku\xcc\x83u\xcc\x83|\\n123|\\n"
    kũũ|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 5.

Serer-Sine

Sequence of language Serer-Sine from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0070     'p'       Ll        1        LATIN SMALL LETTER P
U+0301     '\u0301'  Mn        0        COMBINING ACUTE ACCENT
U+0061     'a'       Ll        1        LATIN SMALL LETTER A
U+0073     's'       Ll        1        LATIN SMALL LETTER S
U+0069     'i'       Ll        1        LATIN SMALL LETTER I
U+006C     'l'       Ll        1        LATIN SMALL LETTER L

Total codepoints: 6

  • Shell test using printf(1), '|' should align in output:

    $ printf "p\xcc\x81asil|\\n12345|\\n"
    ṕasil|
    12345|
  • python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Fon

Sequence of language Fon from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0061     'a'       Ll        1        LATIN SMALL LETTER A
U+006B     'k'       Ll        1        LATIN SMALL LETTER K
U+0254     '\u0254'  Ll        1        LATIN SMALL LETTER OPEN O
U+0301     '\u0301'  Mn        0        COMBINING ACUTE ACCENT
U+006E     'n'       Ll        1        LATIN SMALL LETTER N

Total codepoints: 5

  • Shell test using printf(1), '|' should align in output:

    $ printf "ak\xc9\x94\xcc\x81n|\\n1234|\\n"
    akɔ́n|
    1234|
  • python wcwidth.wcswidth() measures width 4, while zoc measures width 5.

Aja

Sequence of language Aja from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+00E8     '\xe8'    Ll        1        LATIN SMALL LETTER E WITH GRAVE
U+0067     'g'       Ll        1        LATIN SMALL LETTER G
U+0062     'b'       Ll        1        LATIN SMALL LETTER B
U+025B     '\u025b'  Ll        1        LATIN SMALL LETTER OPEN E
U+0300     '\u0300'  Mn        0        COMBINING GRAVE ACCENT
U+006D     'm'       Ll        1        LATIN SMALL LETTER M
U+025B     '\u025b'  Ll        1        LATIN SMALL LETTER OPEN E
U+0300     '\u0300'  Mn        0        COMBINING GRAVE ACCENT

Total codepoints: 8

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xc3\xa8gb\xc9\x9b\xcc\x80m\xc9\x9b\xcc\x80|\\n123456|\\n"
    ègbɛ̀mɛ̀|
    123456|
  • python wcwidth.wcswidth() measures width 6, while zoc measures width 8.

Pashto, Northern

Sequence of language Pashto, Northern from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0627     '\u0627'  Lo        1        ARABIC LETTER ALEF
U+0633     '\u0633'  Lo        1        ARABIC LETTER SEEN
U+0627     '\u0627'  Lo        1        ARABIC LETTER ALEF
U+0633     '\u0633'  Lo        1        ARABIC LETTER SEEN
U+0627     '\u0627'  Lo        1        ARABIC LETTER ALEF
U+064B     '\u064b'  Mn        0        ARABIC FATHATAN

Total codepoints: 6

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xd8\xa7\xd8\xb3\xd8\xa7\xd8\xb3\xd8\xa7\xd9\x8b|\\n12345|\\n"
    اساساً|
    12345|
  • python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Dendi

Sequence of language Dendi from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0062     'b'       Ll        1        LATIN SMALL LETTER B
U+0254     '\u0254'  Ll        1        LATIN SMALL LETTER OPEN O
U+0303     '\u0303'  Mn        0        COMBINING TILDE
U+014B     '\u014b'  Ll        1        LATIN SMALL LETTER ENG
U+0254     '\u0254'  Ll        1        LATIN SMALL LETTER OPEN O
U+002E     '.'       Po        1        FULL STOP

Total codepoints: 6

  • Shell test using printf(1), '|' should align in output:

    $ printf "b\xc9\x94\xcc\x83\xc5\x8b\xc9\x94.|\\n12345|\\n"
    bɔ̃ŋɔ.|
    12345|
  • python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Seraiki

Sequence of language Seraiki from midpoint of alignment failure records:

Codepoint  Python    Category  wcwidth  Name
U+0627     '\u0627'  Lo        1        ARABIC LETTER ALEF
U+064F     '\u064f'  Mn        0        ARABIC DAMMA
U+062A     '\u062a'  Lo        1        ARABIC LETTER TEH
U+06D2     '\u06d2'  Lo        1        ARABIC LETTER YEH BARREE

Total codepoints: 4

  • Shell test using printf(1), '|' should align in output:

    $ printf "\xd8\xa7\xd9\x8f\xd8\xaa\xdb\x92|\\n123|\\n"
    اُتے|
    123|
  • python wcwidth.wcswidth() measures width 3, while zoc measures width 4.