zoc

Tested software version 8.07.3 on Darwin. Full results are available in the ucs-detect repository at path data/macos-zoc-8.07.3.yaml.

Wide character support

The best wide Unicode table version for zoc appears to be 15.0.0; this is from a summary of the following results:

version	n_errors	n_total	pct_success
'5.1.0'	0	26	100.0%
'5.2.0'	55	269	79.6%
'6.0.0'	10	13	23.1%
'9.0.0'	27	5000	99.5%
'10.0.0'	6	735	99.2%
'11.0.0'	0	62	100.0%
'12.0.0'	12	62	80.6%
'12.1.0'	0	1	100.0%
'13.0.0'	2	541	99.6%
'14.0.0'	2	41	95.1%
'15.0.0'	1	15	93.3%
'15.1.0'	4	5	20.0%
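
The pct_success column appears to be the share of tested sequences that did not fail, rounded to one decimal place. A minimal sketch of that arithmetic follows; the function name `pct_success` is only illustrative and is not taken from ucs-detect itself.

```python
def pct_success(n_errors: int, n_total: int) -> float:
    """Share of tested sequences without an alignment error, in percent."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return round(100.0 * (n_total - n_errors) / n_total, 1)


# (version, n_errors, n_total) rows copied from the table above.
rows = [("5.2.0", 55, 269), ("6.0.0", 10, 13), ("15.1.0", 4, 5)]
for version, n_errors, n_total in rows:
    print(f"{version}\t{n_errors}\t{n_total}\t{pct_success(n_errors, n_total)}%")
```

Note that several versions have very small n_total values, so a single failure moves the percentage a long way.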

Sequence of a WIDE character from Unicode Version 15.0.0, from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0001F6DC	'\U0001f6dc'	So	2	WIRELESS

Total codepoints: 1

Shell test using printf(1), '|' should align in output:
```
$ printf "\xf0\x9f\x9b\x9c|\\n12|\\n"
🛜|
12|
```
python wcwidth.wcswidth() measures width 2, while zoc measures width 1.
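
The `\xNN` escapes in the printf test are the UTF-8 bytes of the codepoints listed in the table. As a small sketch (the helper name `decode_printf_hex` is hypothetical), they can be decoded back to text with the Python standard library to confirm which character is being tested:

```python
import re


def decode_printf_hex(escaped):
    r"""Decode a run of \xNN escapes (as in the printf test) into text."""
    hex_pairs = re.findall(r"\\x([0-9a-fA-F]{2})", escaped)
    return bytes(int(pair, 16) for pair in hex_pairs).decode("utf-8")


# The four bytes from the shell test above form a single codepoint.
decoded = decode_printf_hex(r"\xf0\x9f\x9b\x9c")
print(len(decoded), f"U+{ord(decoded):08X}")
```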

Emoji ZWJ support

The best Emoji ZWJ table version for zoc appears to be None; this is from a summary of the following results:

version	n_errors	n_total	pct_success
'2.0'	22	22	0.0%
'4.0'	500	500	0.0%
'5.0'	100	100	0.0%
'11.0'	73	73	0.0%
'12.0'	112	112	0.0%
'12.1'	165	165	0.0%
'13.0'	51	51	0.0%
'13.1'	83	83	0.0%
'14.0'	20	20	0.0%
'15.0'	1	1	0.0%
'15.1'	109	109	0.0%

Sequence of an Emoji ZWJ Sequence from Emoji Version 15.1, from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0001F9D1	'\U0001f9d1'	So	2	ADULT
U+200D	'\u200d'	Cf	0	ZERO WIDTH JOINER
U+0001F9BC	'\U0001f9bc'	So	2	MOTORIZED WHEELCHAIR
U+200D	'\u200d'	Cf	0	ZERO WIDTH JOINER
U+27A1	'\u27a1'	So	1	BLACK RIGHTWARDS ARROW
U+FE0F	'\ufe0f'	Mn	0	VARIATION SELECTOR-16

Total codepoints: 6

Shell test using printf(1), '|' should align in output:

$ printf "\xf0\x9f\xa7\x91\xe2\x80\x8d\xf0\x9f\xa6\xbc\xe2\x80\x8d\xe2\x9e\xa1\xef\xb8\x8f|\\n12|\\n"
🧑‍🦼‍➡️|
12|

python wcwidth.wcswidth() measures width 2, while zoc measures width 8.
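
The expected width of 2 can be reproduced with a simplified model in which the codepoint after a ZERO WIDTH JOINER adds no cells. The sketch below is an illustration under that assumption, using the per-codepoint widths from the table above; it is not the actual wcwidth implementation, and the function names are hypothetical.

```python
ZWJ = "\u200d"
VS16 = "\ufe0f"

# Per-codepoint widths copied from the wcwidth column of the table above.
WIDTHS = {
    "\U0001f9d1": 2,  # ADULT
    ZWJ: 0,           # ZERO WIDTH JOINER
    "\U0001f9bc": 2,  # MOTORIZED WHEELCHAIR
    "\u27a1": 1,      # BLACK RIGHTWARDS ARROW
    VS16: 0,          # VARIATION SELECTOR-16
}


def naive_width(text):
    """Sum the per-codepoint widths with no joiner handling."""
    return sum(WIDTHS[ch] for ch in text)


def zwj_aware_width(text):
    """Sum widths, but skip the codepoint that follows each ZWJ."""
    total = 0
    skip_next = False
    for ch in text:
        if skip_next:
            skip_next = False
            continue
        if ch == ZWJ:
            skip_next = True
            continue
        total += WIDTHS[ch]
    return total


sequence = "\U0001f9d1\u200d\U0001f9bc\u200d\u27a1\ufe0f"
print(naive_width(sequence), zwj_aware_width(sequence))
```

The naive sum is 5 and the joiner-aware sum is 2. Neither matches the 8 cells reported for zoc, so this sketch accounts only for the expected value.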

Variation Selector-16 support

Emoji VS-16 results for zoc are 0 errors out of 100 total codepoints tested, 100.0% success. All codepoint combinations with Variation Selector-16 tested were successful.

Language Support

The following 7 languages were tested with 100% success:

Adyghe; Idoma; Kabardian; Tamazight, Central Atlas (Tifinagh); Tamazight, Standard Moroccan; Vai; Yukaghir, Northern.

The following 91 languages are not fully supported:

lang	n_errors	n_total	pct_success
Javanese (Javanese)	500	500	0.0%
Nuosu	230	230	0.0%
Cherokee (cased)	500	507	1.4%
Tai Dam	500	511	2.2%
Maldivian	500	515	2.9%
Tamil	500	516	3.1%
Tamil (Sri Lanka)	500	516	3.1%
Burmese	500	519	3.7%
Mon	500	522	4.2%
Shan	500	523	4.4%
Dzongkha	342	359	4.7%
Gujarati	500	530	5.7%
Tibetan, Central	263	279	5.7%
Malayalam	500	533	6.2%
Tamang, Eastern	42	45	6.7%
Kannada	500	536	6.7%
Khün	412	442	6.8%
Khmer, Central	492	528	6.8%
Bengali	500	540	7.4%
Chakma	500	540	7.4%
Telugu	500	550	9.1%
Nepali	500	554	9.7%
Sanskrit	500	563	11.2%
Sanskrit (Grantha)	500	565	11.5%
Marathi	500	571	12.4%
Hindi	500	576	13.2%
Sinhala	500	577	13.3%
Panjabi, Eastern	500	578	13.5%
Bhojpuri	500	584	14.4%
Thai (2)	267	313	14.7%
Maithili	500	613	18.4%
Thai	273	341	19.9%
Magahi	500	643	22.2%
Vietnamese	500	660	24.2%
Tagalog (Tagalog)	21	31	32.3%
Lao	270	426	36.6%
Lingala (tones)	500	844	40.8%
Vietnamese (Han nom)	107	199	46.2%
Pular (Adlam)	500	1044	52.1%
Yiddish, Eastern	500	1062	52.9%
Bamun	500	1138	56.1%
Orok	490	1245	60.6%
Tem	500	1290	61.2%
Nanai	379	1207	68.6%
Evenki	267	899	70.3%
Yaneshaʼ	500	1762	71.6%
Ticuna	500	1767	71.7%
Amarakaeri	401	1446	72.3%
South Azerbaijani	385	1396	72.4%
Yoruba	500	2177	77.0%
Chickasaw	122	554	78.0%
Siona	273	1492	81.7%
Fur	228	1838	87.6%
Chinantec, Chiltepec	213	1729	87.7%
Gumuz	132	1283	89.7%
Bora	162	1598	89.9%
Mòoré	226	2447	90.8%
Mongolian, Halh (Mongolian)	3	33	90.9%
Lamnso'	197	2237	91.2%
Navajo	138	1600	91.4%
Tamazight, Central Atlas	154	1822	91.5%
Gilyak	124	1504	91.8%
Ditammari	139	1882	92.6%
Assyrian Neo-Aramaic	74	1160	93.6%
Farsi, Western	102	1822	94.4%
Otomi, Mezquital	85	1849	95.4%
Veps	59	1323	95.5%
Waama	38	1000	96.2%
Dinka, Northeastern	56	1529	96.3%
Dari	66	1872	96.5%
Éwé	55	2230	97.5%
Baatonum	47	1939	97.6%
Urdu (2)	52	2251	97.7%
Urdu	50	2237	97.8%
Uduk	71	3247	97.8%
Mazahua Central	34	1574	97.8%
Secoya	29	1409	97.9%
Gen	46	2309	98.0%
Picard	36	2024	98.2%
Mixtec, Metlatónoc	24	1367	98.2%
Arabic, Standard	20	1348	98.5%
Ga	26	2039	98.7%
Panjabi, Western	21	2419	99.1%
Dangme	22	2912	99.2%
Dagaare, Southern	19	2582	99.3%
Serer-Sine	7	1596	99.6%
Fon	10	2520	99.6%
Aja	7	2061	99.7%
Pashto, Northern	4	2242	99.8%
Dendi	2	1569	99.9%
Seraiki	2	2242	99.9%

Javanese (Javanese)

Sequence of language Javanese (Javanese) from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+A9CB	'\ua9cb'	Po	1	JAVANESE PADA ADEG ADEG
U+A9B1	'\ua9b1'	Lo	1	JAVANESE LETTER SA
U+A9A7	'\ua9a7'	Lo	1	JAVANESE LETTER BA
U+A9BC	'\ua9bc'	Mn	0	JAVANESE VOWEL SIGN PEPET
U+A9A4	'\ua9a4'	Lo	1	JAVANESE LETTER NA

Total codepoints: 5

Shell test using printf(1), '|' should align in output:

$ printf "\xea\xa7\x8b\xea\xa6\xb1\xea\xa6\xa7\xea\xa6\xbc\xea\xa6\xa4|\\n1234|\\n"
꧋ꦱꦧꦼꦤ|
1234|

python wcwidth.wcswidth() measures width 4, while zoc measures width 10.
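
Most columns of the codepoint tables in this report can be regenerated from Python's standard `unicodedata` module. The sketch below does so for the Javanese sequence. It leaves out the wcwidth column, which comes from the third-party wcwidth package, and the helper name `codepoint_rows` is hypothetical.

```python
import unicodedata


def codepoint_rows(text):
    """Return (codepoint, Python literal, category, name) for each character."""
    rows = []
    for ch in text:
        digits = 4 if ord(ch) <= 0xFFFF else 8  # the report pads astral codepoints to 8 digits
        rows.append((
            f"U+{ord(ch):0{digits}X}",
            ascii(ch),
            unicodedata.category(ch),
            unicodedata.name(ch, "(unnamed)"),
        ))
    return rows


rows = codepoint_rows("\ua9cb\ua9b1\ua9a7\ua9bc\ua9a4")
for row in rows:
    print("\t".join(row))
```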

Nuosu

Sequence of language Nuosu from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+300A	'\u300a'	Ps	2	LEFT DOUBLE ANGLE BRACKET
U+A2E7	'\ua2e7'	Lo	2	YI SYLLABLE ZZYT
U+A0C5	'\ua0c5'	Lo	2	YI SYLLABLE MU
U+A2BD	'\ua2bd'	Lo	2	YI SYLLABLE COT
U+A305	'\ua305'	Lo	2	YI SYLLABLE NZY
U+A14D	'\ua14d'	Lo	2	YI SYLLABLE DDU
U+A11C	'\ua11c'	Lo	2	YI SYLLABLE TI
U+A2CA	'\ua2ca'	Lo	2	YI SYLLABLE CYT
U+A12F	'\ua12f'	Lo	2	YI SYLLABLE TEP
U+A489	'\ua489'	Lo	2	YI SYLLABLE YY
U+300B	'\u300b'	Pe	2	RIGHT DOUBLE ANGLE BRACKET

Total codepoints: 11

Shell test using printf(1), '|' should align in output:

$ printf "\xe3\x80\x8a\xea\x8b\xa7\xea\x83\x85\xea\x8a\xbd\xea\x8c\x85\xea\x85\x8d\xea\x84\x9c\xea\x8b\x8a\xea\x84\xaf\xea\x92\x89\xe3\x80\x8b|\\n1234567890123456789012|\\n"
《ꋧꃅꊽꌅꅍꄜꋊꄯꒉ》|
1234567890123456789012|

python wcwidth.wcswidth() measures width 22, while zoc measures width 13.
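
The shell tests in this report share one shape: the UTF-8 bytes of the sequence, a `|`, and then a ruler of digits as long as the expected width. A rough sketch of how such a command line could be assembled is shown below; `ruler` and `printf_test` are hypothetical helper names, not functions from ucs-detect.

```python
def ruler(width):
    """Digits 1234567890... repeated up to the given width."""
    return "".join(str(i % 10) for i in range(1, width + 1))


def printf_test(text, expected_width):
    """Build a printf(1) command whose two '|' marks should line up."""
    escaped = "".join(f"\\x{byte:02x}" for byte in text.encode("utf-8"))
    return f'printf "{escaped}|\\\\n{ruler(expected_width)}|\\\\n"'


# LEFT DOUBLE ANGLE BRACKET alone, with the width 2 listed in the table.
print(printf_test("\u300a", 2))
print(ruler(22))
```

If the terminal advances the cursor by the expected number of cells, the `|` on the first output line sits directly above the `|` on the ruler line.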

Cherokee (cased)

Sequence of language Cherokee (cased) from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+13C2	'\u13c2'	Lu	1	CHEROKEE LETTER NI
U+AB7C	'\uab7c'	Ll	1	CHEROKEE SMALL LETTER GV
U+AB8E	'\uab8e'	Ll	1	CHEROKEE SMALL LETTER NA
U+ABAB	'\uabab'	Ll	1	CHEROKEE SMALL LETTER DV

Total codepoints: 4

Shell test using printf(1), '|' should align in output:

$ printf "\xe1\x8f\x82\xea\xad\xbc\xea\xae\x8e\xea\xae\xab|\\n1234|\\n"
Ꮒꭼꮎꮫ|
1234|

python wcwidth.wcswidth() measures width 4, while zoc measures width 7.

Tai Dam

Sequence of language Tai Dam from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+AA81	'\uaa81'	Lo	1	TAI VIET LETTER HIGH KO
U+AAAB	'\uaaab'	Lo	1	TAI VIET LETTER HIGH VO
U+AAB1	'\uaab1'	Lo	1	TAI VIET VOWEL AA
U+AAA3	'\uaaa3'	Lo	1	TAI VIET LETTER HIGH MO

Total codepoints: 4

Shell test using printf(1), '|' should align in output:

$ printf "\xea\xaa\x81\xea\xaa\xab\xea\xaa\xb1\xea\xaa\xa3|\\n1234|\\n"
ꪁꪫꪱꪣ|
1234|

python wcwidth.wcswidth() measures width 4, while zoc measures width 8.

Maldivian

Sequence of language Maldivian from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0791	'\u0791'	Lo	1	THAANA LETTER DAVIYANI
U+07A8	'\u07a8'	Mn	0	THAANA IBIFILI
U+0790	'\u0790'	Lo	1	THAANA LETTER SEENU
U+07AC	'\u07ac'	Mn	0	THAANA EBEFILI
U+0789	'\u0789'	Lo	1	THAANA LETTER MEEMU
U+07B0	'\u07b0'	Mn	0	THAANA SUKUN
U+0784	'\u0784'	Lo	1	THAANA LETTER BAA
U+07A6	'\u07a6'	Mn	0	THAANA ABAFILI
U+0783	'\u0783'	Lo	1	THAANA LETTER RAA

Total codepoints: 9

Shell test using printf(1), '|' should align in output:

$ printf "\xde\x91\xde\xa8\xde\x90\xde\xac\xde\x89\xde\xb0\xde\x84\xde\xa6\xde\x83|\\n12345|\\n"
ޑިސެމްބަރ|
12345|

python wcwidth.wcswidth() measures width 5, while zoc measures width 9.

Tamil

Sequence of language Tamil from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0BAE	'\u0bae'	Lo	1	TAMIL LETTER MA
U+0BA9	'\u0ba9'	Lo	1	TAMIL LETTER NNNA
U+0BBF	'\u0bbf'	Mc	0	TAMIL VOWEL SIGN I
U+0BA4	'\u0ba4'	Lo	1	TAMIL LETTER TA

Total codepoints: 4

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xae\xae\xe0\xae\xa9\xe0\xae\xbf\xe0\xae\xa4|\\n123|\\n"
மனித|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 4.
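
In this sample the expected width of 3 is the codepoint count minus the one combining vowel sign, while the width of 4 reported for zoc equals the codepoint count. A rough approximation of the expected value, assuming every Mn, Mc, Me, and Cf character takes zero cells and everything else takes one, can be written with the standard library. This is only a sketch: the real wcwidth tables are more detailed and also handle wide characters, which this version ignores.

```python
import unicodedata

# Assumption for this sketch: these general categories occupy no cells.
ZERO_WIDTH_CATEGORIES = {"Mn", "Mc", "Me", "Cf"}


def estimate_width(text):
    """Approximate cell width: one cell per non-combining codepoint."""
    return sum(
        0 if unicodedata.category(ch) in ZERO_WIDTH_CATEGORIES else 1
        for ch in text
    )


tamil = "\u0bae\u0ba9\u0bbf\u0ba4"
print(len(tamil), estimate_width(tamil))
```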

Tamil (Sri Lanka)

Sequence of language Tamil (Sri Lanka) from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0BAE	'\u0bae'	Lo	1	TAMIL LETTER MA
U+0BA9	'\u0ba9'	Lo	1	TAMIL LETTER NNNA
U+0BBF	'\u0bbf'	Mc	0	TAMIL VOWEL SIGN I
U+0BA4	'\u0ba4'	Lo	1	TAMIL LETTER TA

Total codepoints: 4

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xae\xae\xe0\xae\xa9\xe0\xae\xbf\xe0\xae\xa4|\\n123|\\n"
மனித|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Burmese

Sequence of language Burmese from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+1021	'\u1021'	Lo	1	MYANMAR LETTER A
U+1015	'\u1015'	Lo	1	MYANMAR LETTER PA
U+103C	'\u103c'	Mc	0	MYANMAR CONSONANT SIGN MEDIAL RA
U+100A	'\u100a'	Lo	1	MYANMAR LETTER NNYA
U+103A	'\u103a'	Mn	0	MYANMAR SIGN ASAT
U+1015	'\u1015'	Lo	1	MYANMAR LETTER PA
U+103C	'\u103c'	Mc	0	MYANMAR CONSONANT SIGN MEDIAL RA
U+100A	'\u100a'	Lo	1	MYANMAR LETTER NNYA
U+103A	'\u103a'	Mn	0	MYANMAR SIGN ASAT
U+1006	'\u1006'	Lo	1	MYANMAR LETTER CHA
U+102D	'\u102d'	Mn	0	MYANMAR VOWEL SIGN I
U+102F	'\u102f'	Mn	0	MYANMAR VOWEL SIGN U
U+1004	'\u1004'	Lo	1	MYANMAR LETTER NGA
U+103A	'\u103a'	Mn	0	MYANMAR SIGN ASAT
U+101B	'\u101b'	Lo	1	MYANMAR LETTER RA
U+102C	'\u102c'	Mc	0	MYANMAR VOWEL SIGN AA

Total codepoints: 16

Shell test using printf(1), '|' should align in output:

$ printf "\xe1\x80\xa1\xe1\x80\x95\xe1\x80\xbc\xe1\x80\x8a\xe1\x80\xba\xe1\x80\x95\xe1\x80\xbc\xe1\x80\x8a\xe1\x80\xba\xe1\x80\x86\xe1\x80\xad\xe1\x80\xaf\xe1\x80\x84\xe1\x80\xba\xe1\x80\x9b\xe1\x80\xac|\\n12345678|\\n"
အပြည်ပြည်ဆိုင်ရာ|
12345678|

python wcwidth.wcswidth() measures width 8, while zoc measures width 16.

Mon

Sequence of language Mon from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+101C	'\u101c'	Lo	1	MYANMAR LETTER LA
U+102D	'\u102d'	Mn	0	MYANMAR VOWEL SIGN I
U+1000	'\u1000'	Lo	1	MYANMAR LETTER KA
U+103A	'\u103a'	Mn	0	MYANMAR SIGN ASAT
U+101C	'\u101c'	Lo	1	MYANMAR LETTER LA
U+101C	'\u101c'	Lo	1	MYANMAR LETTER LA
U+1031	'\u1031'	Mc	0	MYANMAR VOWEL SIGN E
U+102C	'\u102c'	Mc	0	MYANMAR VOWEL SIGN AA
U+105A	'\u105a'	Lo	1	MYANMAR LETTER MON NGA
U+103A	'\u103a'	Mn	0	MYANMAR SIGN ASAT

Total codepoints: 10

Shell test using printf(1), '|' should align in output:

$ printf "\xe1\x80\x9c\xe1\x80\xad\xe1\x80\x80\xe1\x80\xba\xe1\x80\x9c\xe1\x80\x9c\xe1\x80\xb1\xe1\x80\xac\xe1\x81\x9a\xe1\x80\xba|\\n12345|\\n"
လိက်လလောၚ်|
12345|

python wcwidth.wcswidth() measures width 5, while zoc measures width 10.

Shan

Sequence of language Shan from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+101C	'\u101c'	Lo	1	MYANMAR LETTER LA
U+102D	'\u102d'	Mn	0	MYANMAR VOWEL SIGN I
U+1075	'\u1075'	Lo	1	MYANMAR LETTER SHAN KA
U+103A	'\u103a'	Mn	0	MYANMAR SIGN ASAT
U+1088	'\u1088'	Mc	0	MYANMAR SIGN SHAN TONE-3
U+1015	'\u1015'	Lo	1	MYANMAR LETTER PA
U+102D	'\u102d'	Mn	0	MYANMAR VOWEL SIGN I
U+102F	'\u102f'	Mn	0	MYANMAR VOWEL SIGN U
U+107C	'\u107c'	Lo	1	MYANMAR LETTER SHAN NA
U+103A	'\u103a'	Mn	0	MYANMAR SIGN ASAT
U+107D	'\u107d'	Lo	1	MYANMAR LETTER SHAN PHA
U+1062	'\u1062'	Mc	0	MYANMAR VOWEL SIGN SGAW KAREN EU
U+101D	'\u101d'	Lo	1	MYANMAR LETTER WA
U+103A	'\u103a'	Mn	0	MYANMAR SIGN ASAT
U+1087	'\u1087'	Mc	0	MYANMAR SIGN SHAN TONE-2

Total codepoints: 15

Shell test using printf(1), '|' should align in output:

$ printf "\xe1\x80\x9c\xe1\x80\xad\xe1\x81\xb5\xe1\x80\xba\xe1\x82\x88\xe1\x80\x95\xe1\x80\xad\xe1\x80\xaf\xe1\x81\xbc\xe1\x80\xba\xe1\x81\xbd\xe1\x81\xa2\xe1\x80\x9d\xe1\x80\xba\xe1\x82\x87|\\n123456|\\n"
လိၵ်ႈပိုၼ်ၽၢဝ်ႇ|
123456|

python wcwidth.wcswidth() measures width 6, while zoc measures width 15.

Dzongkha

Sequence of language Dzongkha from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0F60	'\u0f60'	Lo	1	TIBETAN LETTER -A
U+0F42	'\u0f42'	Lo	1	TIBETAN LETTER GA
U+0FB2	'\u0fb2'	Mn	0	TIBETAN SUBJOINED LETTER RA
U+0F7C	'\u0f7c'	Mn	0	TIBETAN VOWEL SIGN O
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F56	'\u0f56'	Lo	1	TIBETAN LETTER BA
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F58	'\u0f58'	Lo	1	TIBETAN LETTER MA
U+0F72	'\u0f72'	Mn	0	TIBETAN VOWEL SIGN I
U+0F60	'\u0f60'	Lo	1	TIBETAN LETTER -A
U+0F72	'\u0f72'	Mn	0	TIBETAN VOWEL SIGN I
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F51	'\u0f51'	Lo	1	TIBETAN LETTER DA
U+0F56	'\u0f56'	Lo	1	TIBETAN LETTER BA
U+0F44	'\u0f44'	Lo	1	TIBETAN LETTER NGA
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F46	'\u0f46'	Lo	1	TIBETAN LETTER CHA
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F42	'\u0f42'	Lo	1	TIBETAN LETTER GA
U+0F72	'\u0f72'	Mn	0	TIBETAN VOWEL SIGN I
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F60	'\u0f60'	Lo	1	TIBETAN LETTER -A
U+0F5B	'\u0f5b'	Lo	1	TIBETAN LETTER DZA
U+0F58	'\u0f58'	Lo	1	TIBETAN LETTER MA
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F42	'\u0f42'	Lo	1	TIBETAN LETTER GA
U+0FB3	'\u0fb3'	Mn	0	TIBETAN SUBJOINED LETTER LA
U+0F72	'\u0f72'	Mn	0	TIBETAN VOWEL SIGN I
U+0F44	'\u0f44'	Lo	1	TIBETAN LETTER NGA
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F42	'\u0f42'	Lo	1	TIBETAN LETTER GA
U+0F66	'\u0f66'	Lo	1	TIBETAN LETTER SA
U+0F63	'\u0f63'	Lo	1	TIBETAN LETTER LA
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F56	'\u0f56'	Lo	1	TIBETAN LETTER BA
U+0F66	'\u0f66'	Lo	1	TIBETAN LETTER SA
U+0F92	'\u0f92'	Mn	0	TIBETAN SUBJOINED LETTER GA
U+0FB2	'\u0fb2'	Mn	0	TIBETAN SUBJOINED LETTER RA
U+0F42	'\u0f42'	Lo	1	TIBETAN LETTER GA
U+0F66	'\u0f66'	Lo	1	TIBETAN LETTER SA
U+0F0D	'\u0f0d'	Po	1	TIBETAN MARK SHAD

Total codepoints: 41

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xbd\xa0\xe0\xbd\x82\xe0\xbe\xb2\xe0\xbd\xbc\xe0\xbc\x8b\xe0\xbd\x96\xe0\xbc\x8b\xe0\xbd\x98\xe0\xbd\xb2\xe0\xbd\xa0\xe0\xbd\xb2\xe0\xbc\x8b\xe0\xbd\x91\xe0\xbd\x96\xe0\xbd\x84\xe0\xbc\x8b\xe0\xbd\x86\xe0\xbc\x8b\xe0\xbd\x82\xe0\xbd\xb2\xe0\xbc\x8b\xe0\xbd\xa0\xe0\xbd\x9b\xe0\xbd\x98\xe0\xbc\x8b\xe0\xbd\x82\xe0\xbe\xb3\xe0\xbd\xb2\xe0\xbd\x84\xe0\xbc\x8b\xe0\xbd\x82\xe0\xbd\xa6\xe0\xbd\xa3\xe0\xbc\x8b\xe0\xbd\x96\xe0\xbd\xa6\xe0\xbe\x92\xe0\xbe\xb2\xe0\xbd\x82\xe0\xbd\xa6\xe0\xbc\x8d|\\n12345678901234567890123456789012|\\n"
འགྲོ་བ་མིའི་དབང་ཆ་གི་འཛམ་གླིང་གསལ་བསྒྲགས།|
12345678901234567890123456789012|

python wcwidth.wcswidth() measures width 32, while zoc measures width 41.

Gujarati

Sequence of language Gujarati from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0AAE	'\u0aae'	Lo	1	GUJARATI LETTER MA
U+0ABE	'\u0abe'	Mc	0	GUJARATI VOWEL SIGN AA
U+0AA8	'\u0aa8'	Lo	1	GUJARATI LETTER NA
U+0AB5	'\u0ab5'	Lo	1	GUJARATI LETTER VA

Total codepoints: 4

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xaa\xae\xe0\xaa\xbe\xe0\xaa\xa8\xe0\xaa\xb5|\\n123|\\n"
માનવ|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Tibetan, Central

Sequence of language Tibetan, Central from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0F61	'\u0f61'	Lo	1	TIBETAN LETTER YA
U+0F7C	'\u0f7c'	Mn	0	TIBETAN VOWEL SIGN O
U+0F44	'\u0f44'	Lo	1	TIBETAN LETTER NGA
U+0F66	'\u0f66'	Lo	1	TIBETAN LETTER SA
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F41	'\u0f41'	Lo	1	TIBETAN LETTER KHA
U+0FB1	'\u0fb1'	Mn	0	TIBETAN SUBJOINED LETTER YA
U+0F56	'\u0f56'	Lo	1	TIBETAN LETTER BA
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F42	'\u0f42'	Lo	1	TIBETAN LETTER GA
U+0F66	'\u0f66'	Lo	1	TIBETAN LETTER SA
U+0F63	'\u0f63'	Lo	1	TIBETAN LETTER LA
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F56	'\u0f56'	Lo	1	TIBETAN LETTER BA
U+0F66	'\u0f66'	Lo	1	TIBETAN LETTER SA
U+0F92	'\u0f92'	Mn	0	TIBETAN SUBJOINED LETTER GA
U+0FB2	'\u0fb2'	Mn	0	TIBETAN SUBJOINED LETTER RA
U+0F42	'\u0f42'	Lo	1	TIBETAN LETTER GA
U+0F66	'\u0f66'	Lo	1	TIBETAN LETTER SA
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F60	'\u0f60'	Lo	1	TIBETAN LETTER -A
U+0F42	'\u0f42'	Lo	1	TIBETAN LETTER GA
U+0FB2	'\u0fb2'	Mn	0	TIBETAN SUBJOINED LETTER RA
U+0F7C	'\u0f7c'	Mn	0	TIBETAN VOWEL SIGN O
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F56	'\u0f56'	Lo	1	TIBETAN LETTER BA
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F58	'\u0f58'	Lo	1	TIBETAN LETTER MA
U+0F72	'\u0f72'	Mn	0	TIBETAN VOWEL SIGN I
U+0F60	'\u0f60'	Lo	1	TIBETAN LETTER -A
U+0F72	'\u0f72'	Mn	0	TIBETAN VOWEL SIGN I
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F50	'\u0f50'	Lo	1	TIBETAN LETTER THA
U+0F7C	'\u0f7c'	Mn	0	TIBETAN VOWEL SIGN O
U+0F56	'\u0f56'	Lo	1	TIBETAN LETTER BA
U+0F0B	'\u0f0b'	Po	1	TIBETAN MARK INTERSYLLABIC TSHEG
U+0F50	'\u0f50'	Lo	1	TIBETAN LETTER THA
U+0F44	'\u0f44'	Lo	1	TIBETAN LETTER NGA
U+0F0C	'\u0f0c'	Po	1	TIBETAN MARK DELIMITER TSHEG BSTAR
U+0F0D	'\u0f0d'	Po	1	TIBETAN MARK SHAD

Total codepoints: 40

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xbd\xa1\xe0\xbd\xbc\xe0\xbd\x84\xe0\xbd\xa6\xe0\xbc\x8b\xe0\xbd\x81\xe0\xbe\xb1\xe0\xbd\x96\xe0\xbc\x8b\xe0\xbd\x82\xe0\xbd\xa6\xe0\xbd\xa3\xe0\xbc\x8b\xe0\xbd\x96\xe0\xbd\xa6\xe0\xbe\x92\xe0\xbe\xb2\xe0\xbd\x82\xe0\xbd\xa6\xe0\xbc\x8b\xe0\xbd\xa0\xe0\xbd\x82\xe0\xbe\xb2\xe0\xbd\xbc\xe0\xbc\x8b\xe0\xbd\x96\xe0\xbc\x8b\xe0\xbd\x98\xe0\xbd\xb2\xe0\xbd\xa0\xe0\xbd\xb2\xe0\xbc\x8b\xe0\xbd\x90\xe0\xbd\xbc\xe0\xbd\x96\xe0\xbc\x8b\xe0\xbd\x90\xe0\xbd\x84\xe0\xbc\x8c\xe0\xbc\x8d|\\n1234567890123456789012345678901|\\n"
ཡོངས་ཁྱབ་གསལ་བསྒྲགས་འགྲོ་བ་མིའི་ཐོབ་ཐང༌།|
1234567890123456789012345678901|

python wcwidth.wcswidth() measures width 31, while zoc measures width 40.

Malayalam

Sequence of language Malayalam from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0D2E	'\u0d2e'	Lo	1	MALAYALAM LETTER MA
U+0D28	'\u0d28'	Lo	1	MALAYALAM LETTER NA
U+0D41	'\u0d41'	Mn	0	MALAYALAM VOWEL SIGN U
U+0D37	'\u0d37'	Lo	1	MALAYALAM LETTER SSA
U+0D4D	'\u0d4d'	Mn	0	MALAYALAM SIGN VIRAMA
U+0D2F	'\u0d2f'	Lo	1	MALAYALAM LETTER YA
U+0D3E	'\u0d3e'	Mc	0	MALAYALAM VOWEL SIGN AA
U+0D35	'\u0d35'	Lo	1	MALAYALAM LETTER VA
U+0D15	'\u0d15'	Lo	1	MALAYALAM LETTER KA
U+0D3E	'\u0d3e'	Mc	0	MALAYALAM VOWEL SIGN AA
U+0D36	'\u0d36'	Lo	1	MALAYALAM LETTER SHA
U+0D19	'\u0d19'	Lo	1	MALAYALAM LETTER NGA
U+0D4D	'\u0d4d'	Mn	0	MALAYALAM SIGN VIRAMA
U+0D19	'\u0d19'	Lo	1	MALAYALAM LETTER NGA
U+0D33	'\u0d33'	Lo	1	MALAYALAM LETTER LLA
U+0D46	'\u0d46'	Mc	0	MALAYALAM VOWEL SIGN E
U+0D15	'\u0d15'	Lo	1	MALAYALAM LETTER KA
U+0D4D	'\u0d4d'	Mn	0	MALAYALAM SIGN VIRAMA
U+0D15	'\u0d15'	Lo	1	MALAYALAM LETTER KA
U+0D41	'\u0d41'	Mn	0	MALAYALAM VOWEL SIGN U
U+0D31	'\u0d31'	Lo	1	MALAYALAM LETTER RRA
U+0D3F	'\u0d3f'	Mc	0	MALAYALAM VOWEL SIGN I
U+0D15	'\u0d15'	Lo	1	MALAYALAM LETTER KA
U+0D4D	'\u0d4d'	Mn	0	MALAYALAM SIGN VIRAMA
U+0D15	'\u0d15'	Lo	1	MALAYALAM LETTER KA
U+0D41	'\u0d41'	Mn	0	MALAYALAM VOWEL SIGN U
U+0D28	'\u0d28'	Lo	1	MALAYALAM LETTER NA
U+0D4D	'\u0d4d'	Mn	0	MALAYALAM SIGN VIRAMA
U+0D28	'\u0d28'	Lo	1	MALAYALAM LETTER NA

Total codepoints: 29

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xb4\xae\xe0\xb4\xa8\xe0\xb5\x81\xe0\xb4\xb7\xe0\xb5\x8d\xe0\xb4\xaf\xe0\xb4\xbe\xe0\xb4\xb5\xe0\xb4\x95\xe0\xb4\xbe\xe0\xb4\xb6\xe0\xb4\x99\xe0\xb5\x8d\xe0\xb4\x99\xe0\xb4\xb3\xe0\xb5\x86\xe0\xb4\x95\xe0\xb5\x8d\xe0\xb4\x95\xe0\xb5\x81\xe0\xb4\xb1\xe0\xb4\xbf\xe0\xb4\x95\xe0\xb5\x8d\xe0\xb4\x95\xe0\xb5\x81\xe0\xb4\xa8\xe0\xb5\x8d\xe0\xb4\xa8|\\n12345678901234567|\\n"
മനുഷ്യാവകാശങ്ങളെക്കുറിക്കുന്ന|
12345678901234567|

python wcwidth.wcswidth() measures width 17, while zoc measures width 29.

Tamang, Eastern

Sequence of language Tamang, Eastern from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+092E	'\u092e'	Lo	1	DEVANAGARI LETTER MA
U+094D	'\u094d'	Mn	0	DEVANAGARI SIGN VIRAMA
U+0939	'\u0939'	Lo	1	DEVANAGARI LETTER HA
U+0940	'\u0940'	Mc	0	DEVANAGARI VOWEL SIGN II
U+0938	'\u0938'	Lo	1	DEVANAGARI LETTER SA
U+0947	'\u0947'	Mn	0	DEVANAGARI VOWEL SIGN E

Total codepoints: 6

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xa4\xae\xe0\xa5\x8d\xe0\xa4\xb9\xe0\xa5\x80\xe0\xa4\xb8\xe0\xa5\x87|\\n123|\\n"
म्हीसे|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 6.

Kannada

Sequence of language Kannada from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0CAE	'\u0cae'	Lo	1	KANNADA LETTER MA
U+0CBE	'\u0cbe'	Mc	0	KANNADA VOWEL SIGN AA
U+0CA8	'\u0ca8'	Lo	1	KANNADA LETTER NA
U+0CB5	'\u0cb5'	Lo	1	KANNADA LETTER VA

Total codepoints: 4

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xb2\xae\xe0\xb2\xbe\xe0\xb2\xa8\xe0\xb2\xb5|\\n123|\\n"
ಮಾನವ|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Khün

Sequence of language Khün from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+1A20	'\u1a20'	Lo	1	TAI THAM LETTER HIGH KA
U+1A32	'\u1a32'	Lo	1	TAI THAM LETTER HIGH TA
U+1A65	'\u1a65'	Mn	0	TAI THAM VOWEL SIGN I
U+1A20	'\u1a20'	Lo	1	TAI THAM LETTER HIGH KA
U+1A63	'\u1a63'	Mc	0	TAI THAM VOWEL SIGN AA
U+1A45	'\u1a45'	Lo	1	TAI THAM LETTER WA
U+1A64	'\u1a64'	Mc	0	TAI THAM VOWEL SIGN TALL AA
U+1A75	'\u1a75'	Mn	0	TAI THAM SIGN TONE-1
U+1A2F	'\u1a2f'	Lo	1	TAI THAM LETTER DA
U+1A60	'\u1a60'	Mn	0	TAI THAM SIGN SAKOT
U+1A45	'\u1a45'	Lo	1	TAI THAM LETTER WA
U+1A60	'\u1a60'	Mn	0	TAI THAM SIGN SAKOT
U+1A3F	'\u1a3f'	Lo	1	TAI THAM LETTER LOW YA
U+1A62	'\u1a62'	Mn	0	TAI THAM VOWEL SIGN MAI SAT
U+1A3E	'\u1a3e'	Lo	1	TAI THAM LETTER MA
U+1A36	'\u1a36'	Lo	1	TAI THAM LETTER NA
U+1A69	'\u1a69'	Mn	0	TAI THAM VOWEL SIGN U
U+1A54	'\u1a54'	Lo	1	TAI THAM LETTER GREAT SA
U+1A29	'\u1a29'	Lo	1	TAI THAM LETTER LOW CA
U+1A63	'\u1a63'	Mc	0	TAI THAM VOWEL SIGN AA
U+1A60	'\u1a60'	Mn	0	TAI THAM SIGN SAKOT
U+1A32	'\u1a32'	Lo	1	TAI THAM LETTER HIGH TA

Total codepoints: 22

Shell test using printf(1), '|' should align in output:

$ printf "\xe1\xa8\xa0\xe1\xa8\xb2\xe1\xa9\xa5\xe1\xa8\xa0\xe1\xa9\xa3\xe1\xa9\x85\xe1\xa9\xa4\xe1\xa9\xb5\xe1\xa8\xaf\xe1\xa9\xa0\xe1\xa9\x85\xe1\xa9\xa0\xe1\xa8\xbf\xe1\xa9\xa2\xe1\xa8\xbe\xe1\xa8\xb6\xe1\xa9\xa9\xe1\xa9\x94\xe1\xa8\xa9\xe1\xa9\xa3\xe1\xa9\xa0\xe1\xa8\xb2|\\n123456789012|\\n"
ᨠᨲᩥᨠᩣᩅᩤ᩵ᨯ᩠ᩅ᩠ᨿᩢᨾᨶᩩᩔᨩᩣ᩠ᨲ|
123456789012|

python wcwidth.wcswidth() measures width 12, while zoc measures width 22.

Khmer, Central

Sequence of language Khmer, Central from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+179F	'\u179f'	Lo	1	KHMER LETTER SA
U+17C1	'\u17c1'	Mc	0	KHMER VOWEL SIGN E
U+1785	'\u1785'	Lo	1	KHMER LETTER CA
U+1780	'\u1780'	Lo	1	KHMER LETTER KA
U+17D2	'\u17d2'	Mn	0	KHMER SIGN COENG
U+178A	'\u178a'	Lo	1	KHMER LETTER DA
U+17B8	'\u17b8'	Mn	0	KHMER VOWEL SIGN II
U+1794	'\u1794'	Lo	1	KHMER LETTER BA
U+17D2	'\u17d2'	Mn	0	KHMER SIGN COENG
U+179A	'\u179a'	Lo	1	KHMER LETTER RO
U+1780	'\u1780'	Lo	1	KHMER LETTER KA
U+17B6	'\u17b6'	Mc	0	KHMER VOWEL SIGN AA
U+179F	'\u179f'	Lo	1	KHMER LETTER SA
U+1787	'\u1787'	Lo	1	KHMER LETTER CO
U+17B6	'\u17b6'	Mc	0	KHMER VOWEL SIGN AA
U+179F	'\u179f'	Lo	1	KHMER LETTER SA
U+1780	'\u1780'	Lo	1	KHMER LETTER KA
U+179B	'\u179b'	Lo	1	KHMER LETTER LO
U+179F	'\u179f'	Lo	1	KHMER LETTER SA
U+17D2	'\u17d2'	Mn	0	KHMER SIGN COENG
U+178A	'\u178a'	Lo	1	KHMER LETTER DA
U+17B8	'\u17b8'	Mn	0	KHMER VOWEL SIGN II
U+1796	'\u1796'	Lo	1	KHMER LETTER PO
U+17B8	'\u17b8'	Mn	0	KHMER VOWEL SIGN II
U+179F	'\u179f'	Lo	1	KHMER LETTER SA
U+17B7	'\u17b7'	Mn	0	KHMER VOWEL SIGN I
U+1791	'\u1791'	Lo	1	KHMER LETTER TO
U+17D2	'\u17d2'	Mn	0	KHMER SIGN COENG
U+1792	'\u1792'	Lo	1	KHMER LETTER THO
U+17B7	'\u17b7'	Mn	0	KHMER VOWEL SIGN I
U+1798	'\u1798'	Lo	1	KHMER LETTER MO
U+1793	'\u1793'	Lo	1	KHMER LETTER NO
U+17BB	'\u17bb'	Mn	0	KHMER VOWEL SIGN U
U+179F	'\u179f'	Lo	1	KHMER LETTER SA
U+17D2	'\u17d2'	Mn	0	KHMER SIGN COENG
U+179F	'\u179f'	Lo	1	KHMER LETTER SA

Total codepoints: 36

Shell test using printf(1), '|' should align in output:

$ printf "\xe1\x9e\x9f\xe1\x9f\x81\xe1\x9e\x85\xe1\x9e\x80\xe1\x9f\x92\xe1\x9e\x8a\xe1\x9e\xb8\xe1\x9e\x94\xe1\x9f\x92\xe1\x9e\x9a\xe1\x9e\x80\xe1\x9e\xb6\xe1\x9e\x9f\xe1\x9e\x87\xe1\x9e\xb6\xe1\x9e\x9f\xe1\x9e\x80\xe1\x9e\x9b\xe1\x9e\x9f\xe1\x9f\x92\xe1\x9e\x8a\xe1\x9e\xb8\xe1\x9e\x96\xe1\x9e\xb8\xe1\x9e\x9f\xe1\x9e\xb7\xe1\x9e\x91\xe1\x9f\x92\xe1\x9e\x92\xe1\x9e\xb7\xe1\x9e\x98\xe1\x9e\x93\xe1\x9e\xbb\xe1\x9e\x9f\xe1\x9f\x92\xe1\x9e\x9f|\\n1234567890123456789012|\\n"
សេចក្ដីប្រកាសជាសកលស្ដីពីសិទ្ធិមនុស្ស|
1234567890123456789012|

python wcwidth.wcswidth() measures width 22, while zoc measures width 36.

Bengali

Sequence of language Bengali from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+09AE	'\u09ae'	Lo	1	BENGALI LETTER MA
U+09BE	'\u09be'	Mc	0	BENGALI VOWEL SIGN AA
U+09A8	'\u09a8'	Lo	1	BENGALI LETTER NA
U+09AC	'\u09ac'	Lo	1	BENGALI LETTER BA
U+09BE	'\u09be'	Mc	0	BENGALI VOWEL SIGN AA
U+09A7	'\u09a7'	Lo	1	BENGALI LETTER DHA
U+09BF	'\u09bf'	Mc	0	BENGALI VOWEL SIGN I
U+0995	'\u0995'	Lo	1	BENGALI LETTER KA
U+09BE	'\u09be'	Mc	0	BENGALI VOWEL SIGN AA
U+09B0	'\u09b0'	Lo	1	BENGALI LETTER RA
U+09C7	'\u09c7'	Mc	0	BENGALI VOWEL SIGN E
U+09B0	'\u09b0'	Lo	1	BENGALI LETTER RA

Total codepoints: 12

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xa6\xae\xe0\xa6\xbe\xe0\xa6\xa8\xe0\xa6\xac\xe0\xa6\xbe\xe0\xa6\xa7\xe0\xa6\xbf\xe0\xa6\x95\xe0\xa6\xbe\xe0\xa6\xb0\xe0\xa7\x87\xe0\xa6\xb0|\\n1234567|\\n"
মানবাধিকারের|
1234567|

python wcwidth.wcswidth() measures width 7, while zoc measures width 12.

Chakma

Sequence of language Chakma from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0001111F	'\U0001111f'	Lo	1	CHAKMA LETTER MAA
U+0001111A	'\U0001111a'	Lo	1	CHAKMA LETTER NAA
U+0001112C	'\U0001112c'	Mc	0	CHAKMA VOWEL SIGN E
U+0001112D	'\U0001112d'	Mn	0	CHAKMA VOWEL SIGN AI
U+00011103	'\U00011103'	Lo	1	CHAKMA LETTER AA
U+00011107	'\U00011107'	Lo	1	CHAKMA LETTER KAA
U+00011134	'\U00011134'	Mn	0	CHAKMA MAAYYAA
U+00011107	'\U00011107'	Lo	1	CHAKMA LETTER KAA
U+00011125	'\U00011125'	Lo	1	CHAKMA LETTER SAA
U+00011127	'\U00011127'	Mn	0	CHAKMA VOWEL SIGN A
U+00011101	'\U00011101'	Mn	0	CHAKMA SIGN ANUSVARA
U+00011122	'\U00011122'	Lo	1	CHAKMA LETTER RAA
U+00011134	'\U00011134'	Mn	0	CHAKMA MAAYYAA

Total codepoints: 13

Shell test using printf(1), '|' should align in output:

$ printf "\xf0\x91\x84\x9f\xf0\x91\x84\x9a\xf0\x91\x84\xac\xf0\x91\x84\xad\xf0\x91\x84\x83\xf0\x91\x84\x87\xf0\x91\x84\xb4\xf0\x91\x84\x87\xf0\x91\x84\xa5\xf0\x91\x84\xa7\xf0\x91\x84\x81\xf0\x91\x84\xa2\xf0\x91\x84\xb4|\\n1234567|\\n"
𑄟𑄚𑄬𑄭𑄃𑄇𑄴𑄇𑄥𑄧𑄁𑄢𑄴|
1234567|

python wcwidth.wcswidth() measures width 7, while zoc measures width 13.

Telugu

Sequence of language Telugu from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0C2E	'\u0c2e'	Lo	1	TELUGU LETTER MA
U+0C3E	'\u0c3e'	Mn	0	TELUGU VOWEL SIGN AA
U+0C28	'\u0c28'	Lo	1	TELUGU LETTER NA
U+0C35	'\u0c35'	Lo	1	TELUGU LETTER VA
U+0C38	'\u0c38'	Lo	1	TELUGU LETTER SA
U+0C4D	'\u0c4d'	Mn	0	TELUGU SIGN VIRAMA
U+0C35	'\u0c35'	Lo	1	TELUGU LETTER VA
U+0C24	'\u0c24'	Lo	1	TELUGU LETTER TA
U+0C4D	'\u0c4d'	Mn	0	TELUGU SIGN VIRAMA
U+0C35	'\u0c35'	Lo	1	TELUGU LETTER VA
U+0C2E	'\u0c2e'	Lo	1	TELUGU LETTER MA
U+0C41	'\u0c41'	Mc	0	TELUGU VOWEL SIGN U
U+0C32	'\u0c32'	Lo	1	TELUGU LETTER LA

Total codepoints: 13

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xb0\xae\xe0\xb0\xbe\xe0\xb0\xa8\xe0\xb0\xb5\xe0\xb0\xb8\xe0\xb1\x8d\xe0\xb0\xb5\xe0\xb0\xa4\xe0\xb1\x8d\xe0\xb0\xb5\xe0\xb0\xae\xe0\xb1\x81\xe0\xb0\xb2|\\n123456789|\\n"
మానవస్వత్వముల|
123456789|

python wcwidth.wcswidth() measures width 9, while zoc measures width 13.

Nepali

Sequence of language Nepali from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+092E	'\u092e'	Lo	1	DEVANAGARI LETTER MA
U+093E	'\u093e'	Mc	0	DEVANAGARI VOWEL SIGN AA
U+0928	'\u0928'	Lo	1	DEVANAGARI LETTER NA
U+0935	'\u0935'	Lo	1	DEVANAGARI LETTER VA

Total codepoints: 4

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xa4\xae\xe0\xa4\xbe\xe0\xa4\xa8\xe0\xa4\xb5|\\n123|\\n"
मानव|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Sanskrit

Sequence of language Sanskrit from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+092E	'\u092e'	Lo	1	DEVANAGARI LETTER MA
U+093E	'\u093e'	Mc	0	DEVANAGARI VOWEL SIGN AA
U+0928	'\u0928'	Lo	1	DEVANAGARI LETTER NA
U+0935	'\u0935'	Lo	1	DEVANAGARI LETTER VA
U+093E	'\u093e'	Mc	0	DEVANAGARI VOWEL SIGN AA
U+0927	'\u0927'	Lo	1	DEVANAGARI LETTER DHA
U+093F	'\u093f'	Mc	0	DEVANAGARI VOWEL SIGN I
U+0915	'\u0915'	Lo	1	DEVANAGARI LETTER KA
U+093E	'\u093e'	Mc	0	DEVANAGARI VOWEL SIGN AA
U+0930	'\u0930'	Lo	1	DEVANAGARI LETTER RA
U+093E	'\u093e'	Mc	0	DEVANAGARI VOWEL SIGN AA
U+0923	'\u0923'	Lo	1	DEVANAGARI LETTER NNA
U+093E	'\u093e'	Mc	0	DEVANAGARI VOWEL SIGN AA
U+0902	'\u0902'	Mn	0	DEVANAGARI SIGN ANUSVARA

Total codepoints: 14

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xa4\xae\xe0\xa4\xbe\xe0\xa4\xa8\xe0\xa4\xb5\xe0\xa4\xbe\xe0\xa4\xa7\xe0\xa4\xbf\xe0\xa4\x95\xe0\xa4\xbe\xe0\xa4\xb0\xe0\xa4\xbe\xe0\xa4\xa3\xe0\xa4\xbe\xe0\xa4\x82|\\n1234567|\\n"
मानवाधिकाराणां|
1234567|

python wcwidth.wcswidth() measures width 7, while zoc measures width 14.

Sanskrit (Grantha)

Sequence of language Sanskrit (Grantha) from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0001132E	'\U0001132e'	Lo	1	GRANTHA LETTER MA
U+0001133E	'\U0001133e'	Mc	0	GRANTHA VOWEL SIGN AA
U+00011328	'\U00011328'	Lo	1	GRANTHA LETTER NA
U+00011335	'\U00011335'	Lo	1	GRANTHA LETTER VA
U+0001133E	'\U0001133e'	Mc	0	GRANTHA VOWEL SIGN AA
U+00011327	'\U00011327'	Lo	1	GRANTHA LETTER DHA
U+0001133F	'\U0001133f'	Mc	0	GRANTHA VOWEL SIGN I
U+00011315	'\U00011315'	Lo	1	GRANTHA LETTER KA
U+0001133E	'\U0001133e'	Mc	0	GRANTHA VOWEL SIGN AA
U+00011330	'\U00011330'	Lo	1	GRANTHA LETTER RA
U+0001133E	'\U0001133e'	Mc	0	GRANTHA VOWEL SIGN AA
U+00011323	'\U00011323'	Lo	1	GRANTHA LETTER NNA
U+0001133E	'\U0001133e'	Mc	0	GRANTHA VOWEL SIGN AA
U+00011302	'\U00011302'	Mc	0	GRANTHA SIGN ANUSVARA

Total codepoints: 14

Shell test using printf(1), '|' should align in output:

$ printf "\xf0\x91\x8c\xae\xf0\x91\x8c\xbe\xf0\x91\x8c\xa8\xf0\x91\x8c\xb5\xf0\x91\x8c\xbe\xf0\x91\x8c\xa7\xf0\x91\x8c\xbf\xf0\x91\x8c\x95\xf0\x91\x8c\xbe\xf0\x91\x8c\xb0\xf0\x91\x8c\xbe\xf0\x91\x8c\xa3\xf0\x91\x8c\xbe\xf0\x91\x8c\x82|\\n1234567|\\n"
𑌮𑌾𑌨𑌵𑌾𑌧𑌿𑌕𑌾𑌰𑌾𑌣𑌾𑌂|
1234567|

python wcwidth.wcswidth() measures width 7, while zoc measures width 14.

Marathi

Sequence of language Marathi from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+092E	'\u092e'	Lo	1	DEVANAGARI LETTER MA
U+093E	'\u093e'	Mc	0	DEVANAGARI VOWEL SIGN AA
U+0928	'\u0928'	Lo	1	DEVANAGARI LETTER NA
U+0935	'\u0935'	Lo	1	DEVANAGARI LETTER VA
U+0940	'\u0940'	Mc	0	DEVANAGARI VOWEL SIGN II

Total codepoints: 5

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xa4\xae\xe0\xa4\xbe\xe0\xa4\xa8\xe0\xa4\xb5\xe0\xa5\x80|\\n123|\\n"
मानवी|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 5.

Hindi

Sequence of language Hindi from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+092E	'\u092e'	Lo	1	DEVANAGARI LETTER MA
U+093E	'\u093e'	Mc	0	DEVANAGARI VOWEL SIGN AA
U+0928	'\u0928'	Lo	1	DEVANAGARI LETTER NA
U+0935	'\u0935'	Lo	1	DEVANAGARI LETTER VA

Total codepoints: 4

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xa4\xae\xe0\xa4\xbe\xe0\xa4\xa8\xe0\xa4\xb5|\\n123|\\n"
मानव|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Sinhala

Sequence of language Sinhala from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0DB8	'\u0db8'	Lo	1	SINHALA LETTER MAYANNA
U+0DCF	'\u0dcf'	Mc	0	SINHALA VOWEL SIGN AELA-PILLA
U+0DB1	'\u0db1'	Lo	1	SINHALA LETTER DANTAJA NAYANNA
U+0DC0	'\u0dc0'	Lo	1	SINHALA LETTER VAYANNA

Total codepoints: 4

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xb6\xb8\xe0\xb7\x8f\xe0\xb6\xb1\xe0\xb7\x80|\\n123|\\n"
මානව|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Panjabi, Eastern

Sequence of language Panjabi, Eastern from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0A2E	'\u0a2e'	Lo	1	GURMUKHI LETTER MA
U+0A28	'\u0a28'	Lo	1	GURMUKHI LETTER NA
U+0A41	'\u0a41'	Mn	0	GURMUKHI VOWEL SIGN U
U+0A71	'\u0a71'	Mn	0	GURMUKHI ADDAK
U+0A16	'\u0a16'	Lo	1	GURMUKHI LETTER KHA
U+0A40	'\u0a40'	Mc	0	GURMUKHI VOWEL SIGN II

Total codepoints: 6

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xa8\xae\xe0\xa8\xa8\xe0\xa9\x81\xe0\xa9\xb1\xe0\xa8\x96\xe0\xa9\x80|\\n123|\\n"
ਮਨੁੱਖੀ|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 6.

Bhojpuri

Sequence of language Bhojpuri from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+092E	'\u092e'	Lo	1	DEVANAGARI LETTER MA
U+093E	'\u093e'	Mc	0	DEVANAGARI VOWEL SIGN AA
U+0928	'\u0928'	Lo	1	DEVANAGARI LETTER NA
U+0935	'\u0935'	Lo	1	DEVANAGARI LETTER VA
U+093E	'\u093e'	Mc	0	DEVANAGARI VOWEL SIGN AA
U+0927	'\u0927'	Lo	1	DEVANAGARI LETTER DHA
U+093F	'\u093f'	Mc	0	DEVANAGARI VOWEL SIGN I
U+0915	'\u0915'	Lo	1	DEVANAGARI LETTER KA
U+093E	'\u093e'	Mc	0	DEVANAGARI VOWEL SIGN AA
U+0930	'\u0930'	Lo	1	DEVANAGARI LETTER RA

Total codepoints: 10

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xa4\xae\xe0\xa4\xbe\xe0\xa4\xa8\xe0\xa4\xb5\xe0\xa4\xbe\xe0\xa4\xa7\xe0\xa4\xbf\xe0\xa4\x95\xe0\xa4\xbe\xe0\xa4\xb0|\\n123456|\\n"
मानवाधिकार|
123456|

python wcwidth.wcswidth() measures width 6, while zoc measures width 10.

Thai (2)

Sequence of language Thai (2) from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0E1B	'\u0e1b'	Lo	1	THAI CHARACTER PO PLA
U+0E0F	'\u0e0f'	Lo	1	THAI CHARACTER TO PATAK
U+0E34	'\u0e34'	Mn	0	THAI CHARACTER SARA I
U+0E0D	'\u0e0d'	Lo	1	THAI CHARACTER YO YING
U+0E0D	'\u0e0d'	Lo	1	THAI CHARACTER YO YING
U+0E32	'\u0e32'	Lo	1	THAI CHARACTER SARA AA
U+0E2A	'\u0e2a'	Lo	1	THAI CHARACTER SO SUA
U+0E32	'\u0e32'	Lo	1	THAI CHARACTER SARA AA
U+0E01	'\u0e01'	Lo	1	THAI CHARACTER KO KAI
U+0E25	'\u0e25'	Lo	1	THAI CHARACTER LO LING
U+0E27	'\u0e27'	Lo	1	THAI CHARACTER WO WAEN
U+0E48	'\u0e48'	Mn	0	THAI CHARACTER MAI EK
U+0E32	'\u0e32'	Lo	1	THAI CHARACTER SARA AA
U+0E14	'\u0e14'	Lo	1	THAI CHARACTER DO DEK
U+0E49	'\u0e49'	Mn	0	THAI CHARACTER MAI THO
U+0E27	'\u0e27'	Lo	1	THAI CHARACTER WO WAEN
U+0E22	'\u0e22'	Lo	1	THAI CHARACTER YO YAK
U+0E2A	'\u0e2a'	Lo	1	THAI CHARACTER SO SUA
U+0E34	'\u0e34'	Mn	0	THAI CHARACTER SARA I
U+0E17	'\u0e17'	Lo	1	THAI CHARACTER THO THAHAN
U+0E18	'\u0e18'	Lo	1	THAI CHARACTER THO THONG
U+0E34	'\u0e34'	Mn	0	THAI CHARACTER SARA I
U+0E21	'\u0e21'	Lo	1	THAI CHARACTER MO MA
U+0E19	'\u0e19'	Lo	1	THAI CHARACTER NO NU
U+0E38	'\u0e38'	Mn	0	THAI CHARACTER SARA U
U+0E29	'\u0e29'	Lo	1	THAI CHARACTER SO RUSI
U+0E22	'\u0e22'	Lo	1	THAI CHARACTER YO YAK
U+0E0A	'\u0e0a'	Lo	1	THAI CHARACTER CHO CHANG
U+0E19	'\u0e19'	Lo	1	THAI CHARACTER NO NU

Total codepoints: 29

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xb8\x9b\xe0\xb8\x8f\xe0\xb8\xb4\xe0\xb8\x8d\xe0\xb8\x8d\xe0\xb8\xb2\xe0\xb8\xaa\xe0\xb8\xb2\xe0\xb8\x81\xe0\xb8\xa5\xe0\xb8\xa7\xe0\xb9\x88\xe0\xb8\xb2\xe0\xb8\x94\xe0\xb9\x89\xe0\xb8\xa7\xe0\xb8\xa2\xe0\xb8\xaa\xe0\xb8\xb4\xe0\xb8\x97\xe0\xb8\x98\xe0\xb8\xb4\xe0\xb8\xa1\xe0\xb8\x99\xe0\xb8\xb8\xe0\xb8\xa9\xe0\xb8\xa2\xe0\xb8\x8a\xe0\xb8\x99|\\n12345678901234567890123|\\n"
ปฏิญญาสากลว่าด้วยสิทธิมนุษยชน|
12345678901234567890123|

python wcwidth.wcswidth() measures width 23, while zoc measures width 29.
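
The shell tests in this report all follow one pattern: the UTF-8 bytes of the sequence, a `|`, then a digit ruler of the expected width followed by another `|`. The following is a minimal sketch of how such a line could be assembled. The helper names `ruler` and `printf_test` are hypothetical, and this is not taken from the ucs-detect source.

```python
def ruler(width: int) -> str:
    """Digit ruler of the given length: 1234567890123..."""
    return "".join(str(i % 10) for i in range(1, width + 1))


def printf_test(text: str, width: int) -> str:
    """Build a printf(1) alignment test for text with the expected cell width."""
    escaped = "".join(
        # keep printable ASCII literal, except characters special inside
        # double quotes in a POSIX shell; hex-escape every other byte
        chr(b) if 0x20 <= b < 0x7F and chr(b) not in '"\\$`' else f"\\x{b:02x}"
        for b in text.encode("utf-8")
    )
    return f'printf "{escaped}|\\\\n{ruler(width)}|\\\\n"'


print(ruler(23))                      # 12345678901234567890123
print(printf_test("toa\u0300n", 4))   # printf "toa\xcc\x80n|\\n1234|\\n"
```

If both `|` characters land in the same column when the command is run, the terminal agrees with the expected width.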

Maithili

Sequence of language Maithili from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0938	'\u0938'	Lo	1	DEVANAGARI LETTER SA
U+093E	'\u093e'	Mc	0	DEVANAGARI VOWEL SIGN AA
U+0930	'\u0930'	Lo	1	DEVANAGARI LETTER RA
U+094D	'\u094d'	Mn	0	DEVANAGARI SIGN VIRAMA
U+0935	'\u0935'	Lo	1	DEVANAGARI LETTER VA
U+092D	'\u092d'	Lo	1	DEVANAGARI LETTER BHA
U+094C	'\u094c'	Mc	0	DEVANAGARI VOWEL SIGN AU
U+092E	'\u092e'	Lo	1	DEVANAGARI LETTER MA

Total codepoints: 8

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xa4\xb8\xe0\xa4\xbe\xe0\xa4\xb0\xe0\xa5\x8d\xe0\xa4\xb5\xe0\xa4\xad\xe0\xa5\x8c\xe0\xa4\xae|\\n12345|\\n"
सार्वभौम|
12345|

python wcwidth.wcswidth() measures width 5, while zoc measures width 8.

Thai

Sequence of language Thai from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0E1B	'\u0e1b'	Lo	1	THAI CHARACTER PO PLA
U+0E0F	'\u0e0f'	Lo	1	THAI CHARACTER TO PATAK
U+0E34	'\u0e34'	Mn	0	THAI CHARACTER SARA I
U+0E0D	'\u0e0d'	Lo	1	THAI CHARACTER YO YING
U+0E0D	'\u0e0d'	Lo	1	THAI CHARACTER YO YING
U+0E32	'\u0e32'	Lo	1	THAI CHARACTER SARA AA
U+0E2A	'\u0e2a'	Lo	1	THAI CHARACTER SO SUA
U+0E32	'\u0e32'	Lo	1	THAI CHARACTER SARA AA
U+0E01	'\u0e01'	Lo	1	THAI CHARACTER KO KAI
U+0E25	'\u0e25'	Lo	1	THAI CHARACTER LO LING
U+0E27	'\u0e27'	Lo	1	THAI CHARACTER WO WAEN
U+0E48	'\u0e48'	Mn	0	THAI CHARACTER MAI EK
U+0E32	'\u0e32'	Lo	1	THAI CHARACTER SARA AA
U+0E14	'\u0e14'	Lo	1	THAI CHARACTER DO DEK
U+0E49	'\u0e49'	Mn	0	THAI CHARACTER MAI THO
U+0E27	'\u0e27'	Lo	1	THAI CHARACTER WO WAEN
U+0E22	'\u0e22'	Lo	1	THAI CHARACTER YO YAK
U+0E2A	'\u0e2a'	Lo	1	THAI CHARACTER SO SUA
U+0E34	'\u0e34'	Mn	0	THAI CHARACTER SARA I
U+0E17	'\u0e17'	Lo	1	THAI CHARACTER THO THAHAN
U+0E18	'\u0e18'	Lo	1	THAI CHARACTER THO THONG
U+0E34	'\u0e34'	Mn	0	THAI CHARACTER SARA I
U+0E21	'\u0e21'	Lo	1	THAI CHARACTER MO MA
U+0E19	'\u0e19'	Lo	1	THAI CHARACTER NO NU
U+0E38	'\u0e38'	Mn	0	THAI CHARACTER SARA U
U+0E29	'\u0e29'	Lo	1	THAI CHARACTER SO RUSI
U+0E22	'\u0e22'	Lo	1	THAI CHARACTER YO YAK
U+0E0A	'\u0e0a'	Lo	1	THAI CHARACTER CHO CHANG
U+0E19	'\u0e19'	Lo	1	THAI CHARACTER NO NU

Total codepoints: 29

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xb8\x9b\xe0\xb8\x8f\xe0\xb8\xb4\xe0\xb8\x8d\xe0\xb8\x8d\xe0\xb8\xb2\xe0\xb8\xaa\xe0\xb8\xb2\xe0\xb8\x81\xe0\xb8\xa5\xe0\xb8\xa7\xe0\xb9\x88\xe0\xb8\xb2\xe0\xb8\x94\xe0\xb9\x89\xe0\xb8\xa7\xe0\xb8\xa2\xe0\xb8\xaa\xe0\xb8\xb4\xe0\xb8\x97\xe0\xb8\x98\xe0\xb8\xb4\xe0\xb8\xa1\xe0\xb8\x99\xe0\xb8\xb8\xe0\xb8\xa9\xe0\xb8\xa2\xe0\xb8\x8a\xe0\xb8\x99|\\n12345678901234567890123|\\n"
ปฏิญญาสากลว่าด้วยสิทธิมนุษยชน|
12345678901234567890123|

python wcwidth.wcswidth() measures width 23, while zoc measures width 29.

Magahi

Sequence of language Magahi from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+092E	'\u092e'	Lo	1	DEVANAGARI LETTER MA
U+093E	'\u093e'	Mc	0	DEVANAGARI VOWEL SIGN AA
U+0928	'\u0928'	Lo	1	DEVANAGARI LETTER NA
U+0935	'\u0935'	Lo	1	DEVANAGARI LETTER VA
U+093E	'\u093e'	Mc	0	DEVANAGARI VOWEL SIGN AA
U+0927	'\u0927'	Lo	1	DEVANAGARI LETTER DHA
U+093F	'\u093f'	Mc	0	DEVANAGARI VOWEL SIGN I
U+0915	'\u0915'	Lo	1	DEVANAGARI LETTER KA
U+093E	'\u093e'	Mc	0	DEVANAGARI VOWEL SIGN AA
U+0930	'\u0930'	Lo	1	DEVANAGARI LETTER RA

Total codepoints: 10

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xa4\xae\xe0\xa4\xbe\xe0\xa4\xa8\xe0\xa4\xb5\xe0\xa4\xbe\xe0\xa4\xa7\xe0\xa4\xbf\xe0\xa4\x95\xe0\xa4\xbe\xe0\xa4\xb0|\\n123456|\\n"
मानवाधिकार|
123456|

python wcwidth.wcswidth() measures width 6, while zoc measures width 10.

Vietnamese

Sequence of language Vietnamese from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0074	't'	Ll	1	LATIN SMALL LETTER T
U+006F	'o'	Ll	1	LATIN SMALL LETTER O
U+0061	'a'	Ll	1	LATIN SMALL LETTER A
U+0300	'\u0300'	Mn	0	COMBINING GRAVE ACCENT
U+006E	'n'	Ll	1	LATIN SMALL LETTER N

Total codepoints: 5

Shell test using printf(1), '|' should align in output:
```
$ printf "toa\xcc\x80n|\\n1234|\\n"
toàn|
1234|
```
python wcwidth.wcswidth() measures width 4, while zoc measures width 5.
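
To check which codepoints a printf byte string contains, the bytes can be decoded and looked up with the standard `unicodedata` module. This sketch reproduces the Codepoint, Category and Name columns only. The function name `describe` is hypothetical, and the codepoint is padded to four hex digits rather than the eight used for astral characters elsewhere in this report.

```python
import unicodedata


def describe(raw: bytes):
    """Return (codepoint, category, name) for each character in UTF-8 bytes."""
    rows = []
    for ch in raw.decode("utf-8"):
        rows.append((
            f"U+{ord(ch):04X}",
            unicodedata.category(ch),
            unicodedata.name(ch, "<unnamed>"),  # fallback for nameless codepoints
        ))
    return rows


for row in describe(b"toa\xcc\x80n"):
    print("\t".join(row))
```

For the Vietnamese sequence above, the fourth row is U+0300 with category Mn, the combining mark that accounts for the one-cell difference.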

Tagalog (Tagalog)

Sequence of language Tagalog (Tagalog) from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+170E	'\u170e'	Lo	1	TAGALOG LETTER LA
U+1711	'\u1711'	Lo	1	TAGALOG LETTER HA
U+1706	'\u1706'	Lo	1	TAGALOG LETTER TA
U+1714	'\u1714'	Mn	0	TAGALOG SIGN VIRAMA

Total codepoints: 4

Shell test using printf(1), '|' should align in output:

$ printf "\xe1\x9c\x8e\xe1\x9c\x91\xe1\x9c\x86\xe1\x9c\x94|\\n123|\\n"
ᜎᜑᜆ᜔|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Lao

Sequence of language Lao from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0E9B	'\u0e9b'	Lo	1	LAO LETTER PO
U+0EB0	'\u0eb0'	Lo	1	LAO VOWEL SIGN A
U+0E81	'\u0e81'	Lo	1	LAO LETTER KO
U+0EB2	'\u0eb2'	Lo	1	LAO VOWEL SIGN AA
U+0E94	'\u0e94'	Lo	1	LAO LETTER DO
U+0EAA	'\u0eaa'	Lo	1	LAO LETTER SO SUNG
U+0EB2	'\u0eb2'	Lo	1	LAO VOWEL SIGN AA
U+0E81	'\u0e81'	Lo	1	LAO LETTER KO
U+0EBB	'\u0ebb'	Mn	0	LAO VOWEL SIGN MAI KON
U+0E99	'\u0e99'	Lo	1	LAO LETTER NO

Total codepoints: 10

Shell test using printf(1), '|' should align in output:

$ printf "\xe0\xba\x9b\xe0\xba\xb0\xe0\xba\x81\xe0\xba\xb2\xe0\xba\x94\xe0\xba\xaa\xe0\xba\xb2\xe0\xba\x81\xe0\xba\xbb\xe0\xba\x99|\\n123456789|\\n"
ປະກາດສາກົນ|
123456789|

python wcwidth.wcswidth() measures width 9, while zoc measures width 10.

Lingala (tones)

Sequence of language Lingala (tones) from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+004D	'M'	Lu	1	LATIN CAPITAL LETTER M
U+004F	'O'	Lu	1	LATIN CAPITAL LETTER O
U+004C	'L'	Lu	1	LATIN CAPITAL LETTER L
U+0186	'\u0186'	Lu	1	LATIN CAPITAL LETTER OPEN O
U+0301	'\u0301'	Mn	0	COMBINING ACUTE ACCENT
U+004E	'N'	Lu	1	LATIN CAPITAL LETTER N
U+0047	'G'	Lu	1	LATIN CAPITAL LETTER G
U+0186	'\u0186'	Lu	1	LATIN CAPITAL LETTER OPEN O
U+0301	'\u0301'	Mn	0	COMBINING ACUTE ACCENT

Total codepoints: 9

Shell test using printf(1), '|' should align in output:

$ printf "MOL\xc6\x86\xcc\x81NG\xc6\x86\xcc\x81|\\n1234567|\\n"
MOLƆ́NGƆ́|
1234567|

python wcwidth.wcswidth() measures width 7, while zoc measures width 9.

Vietnamese (Han nom)

Sequence of language Vietnamese (Han nom) from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0002321C	'\U0002321c'	Lo	2	CJK UNIFIED IDEOGRAPH-2321C
U+0031	'1'	Nd	1	DIGIT ONE
U+0030	'0'	Nd	1	DIGIT ZERO
U+00023383	'\U00023383'	Lo	2	CJK UNIFIED IDEOGRAPH-23383
U+0031	'1'	Nd	1	DIGIT ONE
U+0032	'2'	Nd	1	DIGIT TWO
U+000221A5	'\U000221a5'	Lo	2	CJK UNIFIED IDEOGRAPH-221A5
U+0031	'1'	Nd	1	DIGIT ONE
U+0039	'9'	Nd	1	DIGIT NINE
U+0034	'4'	Nd	1	DIGIT FOUR
U+0038	'8'	Nd	1	DIGIT EIGHT

Total codepoints: 11

Shell test using printf(1), '|' should align in output:

$ printf "\xf0\xa3\x88\x9c10\xf0\xa3\x8e\x8312\xf0\xa2\x86\xa51948|\\n12345678901234|\\n"
𣈜10𣎃12𢆥1948|
12345678901234|

python wcwidth.wcswidth() measures width 14, while zoc measures width 13.
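
Unlike the combining-mark cases, this sequence is measured narrower by zoc than expected. The expected figure can be cross-checked with the East Asian Width property from the standard library. The sketch assumes the string contains no zero-width characters, which holds for the table above.

```python
import unicodedata

text = "\U0002321c10\U0002338312\U000221a51948"

# Characters whose East_Asian_Width is Wide or Fullwidth take two cells
wide = [ch for ch in text if unicodedata.east_asian_width(ch) in ("W", "F")]

# One cell per codepoint, plus one extra cell per wide codepoint
expected = len(text) + len(wide)
print(len(text), len(wide), expected)  # 11 3 14
```

zoc reports 13, one cell short of this figure. The report does not say which of the three ideographs was drawn narrow.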

Pular (Adlam)

Sequence of language Pular (Adlam) from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0001E916	'\U0001e916'	Lu	1	ADLAM CAPITAL LETTER HA
U+0001E90B	'\U0001e90b'	Lu	1	ADLAM CAPITAL LETTER I
U+0001E902	'\U0001e902'	Lu	1	ADLAM CAPITAL LETTER LAAM
U+0001E946	'\U0001e946'	Mn	0	ADLAM GEMINATION MARK
U+0001E900	'\U0001e900'	Lu	1	ADLAM CAPITAL LETTER ALIF
U+0001E912	'\U0001e912'	Lu	1	ADLAM CAPITAL LETTER YA
U+0001E900	'\U0001e900'	Lu	1	ADLAM CAPITAL LETTER ALIF
U+0001E910	'\U0001e910'	Lu	1	ADLAM CAPITAL LETTER NUN
U+0001E911	'\U0001e911'	Lu	1	ADLAM CAPITAL LETTER KAF
U+0001E90C	'\U0001e90c'	Lu	1	ADLAM CAPITAL LETTER O
U+0001E945	'\U0001e945'	Mn	0	ADLAM VOWEL LENGTHENER
U+0001E908	'\U0001e908'	Lu	1	ADLAM CAPITAL LETTER RA
U+0001E909	'\U0001e909'	Lu	1	ADLAM CAPITAL LETTER E

Total codepoints: 13

Shell test using printf(1), '|' should align in output:

$ printf "\xf0\x9e\xa4\x96\xf0\x9e\xa4\x8b\xf0\x9e\xa4\x82\xf0\x9e\xa5\x86\xf0\x9e\xa4\x80\xf0\x9e\xa4\x92\xf0\x9e\xa4\x80\xf0\x9e\xa4\x90\xf0\x9e\xa4\x91\xf0\x9e\xa4\x8c\xf0\x9e\xa5\x85\xf0\x9e\xa4\x88\xf0\x9e\xa4\x89|\\n12345678901|\\n"
𞤖𞤋𞤂𞥆𞤀𞤒𞤀𞤐𞤑𞤌𞥅𞤈𞤉|
12345678901|

python wcwidth.wcswidth() measures width 11, while zoc measures width 13.

Yiddish, Eastern

Sequence of language Yiddish, Eastern from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+05D0	'\u05d0'	Lo	1	HEBREW LETTER ALEF
U+05B7	'\u05b7'	Mn	0	HEBREW POINT PATAH
U+05DC	'\u05dc'	Lo	1	HEBREW LETTER LAMED
U+05F0	'\u05f0'	Lo	1	HEBREW LIGATURE YIDDISH DOUBLE VAV
U+05E2	'\u05e2'	Lo	1	HEBREW LETTER AYIN
U+05DC	'\u05dc'	Lo	1	HEBREW LETTER LAMED
U+05D8	'\u05d8'	Lo	1	HEBREW LETTER TET
U+05DC	'\u05dc'	Lo	1	HEBREW LETTER LAMED
U+05E2	'\u05e2'	Lo	1	HEBREW LETTER AYIN
U+05DB	'\u05db'	Lo	1	HEBREW LETTER KAF
U+05E2	'\u05e2'	Lo	1	HEBREW LETTER AYIN

Total codepoints: 11

Shell test using printf(1), '|' should align in output:

$ printf "\xd7\x90\xd6\xb7\xd7\x9c\xd7\xb0\xd7\xa2\xd7\x9c\xd7\x98\xd7\x9c\xd7\xa2\xd7\x9b\xd7\xa2|\\n1234567890|\\n"
אַלװעלטלעכע|
1234567890|

python wcwidth.wcswidth() measures width 10, while zoc measures width 11.

Bamun

Sequence of language Bamun from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+004E	'N'	Lu	1	LATIN CAPITAL LETTER N
U+004A	'J'	Lu	1	LATIN CAPITAL LETTER J
U+0055	'U'	Lu	1	LATIN CAPITAL LETTER U
U+0301	'\u0301'	Mn	0	COMBINING ACUTE ACCENT

Total codepoints: 4

Shell test using printf(1), '|' should align in output:
```
$ printf "NJU\xcc\x81|\\n123|\\n"
NJÚ|
123|
```
python wcwidth.wcswidth() measures width 3, while zoc measures width 4.
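
Some of the failing Latin sequences have a precomposed equivalent, so NFC normalization removes the separate combining mark. The sketch below only shows the normalization step. Whether zoc aligns the composed form correctly is an assumption that this report does not test, and many sequences listed here (for example a base letter that already carries a diacritic) have no precomposed form.

```python
import unicodedata

decomposed = "NJU\u0301"  # U + COMBINING ACUTE ACCENT, as in the table above
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))  # 4 3
print(composed == "NJ\u00da")          # True
```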

Orok

Sequence of language Orok from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0427	'\u0427'	Lu	1	CYRILLIC CAPITAL LETTER CHE
U+0438	'\u0438'	Ll	1	CYRILLIC SMALL LETTER I
U+043F	'\u043f'	Ll	1	CYRILLIC SMALL LETTER PE
U+0430	'\u0430'	Ll	1	CYRILLIC SMALL LETTER A
U+0304	'\u0304'	Mn	0	COMBINING MACRON
U+043B	'\u043b'	Ll	1	CYRILLIC SMALL LETTER EL
U+0438	'\u0438'	Ll	1	CYRILLIC SMALL LETTER I
U+043D	'\u043d'	Ll	1	CYRILLIC SMALL LETTER EN
U+043D	'\u043d'	Ll	1	CYRILLIC SMALL LETTER EN
U+0435	'\u0435'	Ll	1	CYRILLIC SMALL LETTER IE
U+0304	'\u0304'	Mn	0	COMBINING MACRON
U+0441	'\u0441'	Ll	1	CYRILLIC SMALL LETTER ES
U+0430	'\u0430'	Ll	1	CYRILLIC SMALL LETTER A
U+043B	'\u043b'	Ll	1	CYRILLIC SMALL LETTER EL

Total codepoints: 14

Shell test using printf(1), '|' should align in output:

$ printf "\xd0\xa7\xd0\xb8\xd0\xbf\xd0\xb0\xcc\x84\xd0\xbb\xd0\xb8\xd0\xbd\xd0\xbd\xd0\xb5\xcc\x84\xd1\x81\xd0\xb0\xd0\xbb|\\n123456789012|\\n"
Чипа̄линне̄сал|
123456789012|

python wcwidth.wcswidth() measures width 12, while zoc measures width 14.

Tem

Sequence of language Tem from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0196	'\u0196'	Lu	1	LATIN CAPITAL LETTER IOTA
U+0072	'r'	Ll	1	LATIN SMALL LETTER R
U+028A	'\u028a'	Ll	1	LATIN SMALL LETTER UPSILON
U+0301	'\u0301'	Mn	0	COMBINING ACUTE ACCENT
U+002D	'-'	Pd	1	HYPHEN-MINUS
U+0064	'd'	Ll	1	LATIN SMALL LETTER D
U+025B	'\u025b'	Ll	1	LATIN SMALL LETTER OPEN E
U+0301	'\u0301'	Mn	0	COMBINING ACUTE ACCENT
U+025B	'\u025b'	Ll	1	LATIN SMALL LETTER OPEN E

Total codepoints: 9

Shell test using printf(1), '|' should align in output:

$ printf "\xc6\x96r\xca\x8a\xcc\x81-d\xc9\x9b\xcc\x81\xc9\x9b|\\n1234567|\\n"
Ɩrʊ́-dɛ́ɛ|
1234567|

python wcwidth.wcswidth() measures width 7, while zoc measures width 9.

Nanai

Sequence of language Nanai from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+041D	'\u041d'	Lu	1	CYRILLIC CAPITAL LETTER EN
U+0430	'\u0430'	Ll	1	CYRILLIC SMALL LETTER A
U+0438	'\u0438'	Ll	1	CYRILLIC SMALL LETTER I
U+0306	'\u0306'	Mn	0	COMBINING BREVE

Total codepoints: 4

Shell test using printf(1), '|' should align in output:

$ printf "\xd0\x9d\xd0\xb0\xd0\xb8\xcc\x86|\\n123|\\n"
Най|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Evenki

Sequence of language Evenki from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0411	'\u0411'	Lu	1	CYRILLIC CAPITAL LETTER BE
U+0443	'\u0443'	Ll	1	CYRILLIC SMALL LETTER U
U+0433	'\u0433'	Ll	1	CYRILLIC SMALL LETTER GHE
U+0430	'\u0430'	Ll	1	CYRILLIC SMALL LETTER A
U+0304	'\u0304'	Mn	0	COMBINING MACRON
U+0434	'\u0434'	Ll	1	CYRILLIC SMALL LETTER DE
U+0443	'\u0443'	Ll	1	CYRILLIC SMALL LETTER U

Total codepoints: 7

Shell test using printf(1), '|' should align in output:

$ printf "\xd0\x91\xd1\x83\xd0\xb3\xd0\xb0\xcc\x84\xd0\xb4\xd1\x83|\\n123456|\\n"
Буга̄ду|
123456|

python wcwidth.wcswidth() measures width 6, while zoc measures width 7.

Yaneshaʼ

Sequence of language Yaneshaʼ from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0303	'\u0303'	Mn	0	COMBINING TILDE
U+0061	'a'	Ll	1	LATIN SMALL LETTER A
U+006C	'l'	Ll	1	LATIN SMALL LETTER L
U+006C	'l'	Ll	1	LATIN SMALL LETTER L
U+006F	'o'	Ll	1	LATIN SMALL LETTER O
U+0068	'h'	Ll	1	LATIN SMALL LETTER H
U+0075	'u'	Ll	1	LATIN SMALL LETTER U
U+0065	'e'	Ll	1	LATIN SMALL LETTER E
U+006E	'n'	Ll	1	LATIN SMALL LETTER N

Total codepoints: 9

Shell test using printf(1), '|' should align in output:

$ printf "\xcc\x83allohuen|\\n12345678|\\n"
̃allohuen|
12345678|

python wcwidth.wcswidth() measures width 8, while zoc measures width 9.

Ticuna

Sequence of language Ticuna from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+004E	'N'	Lu	1	LATIN CAPITAL LETTER N
U+00FC	'\xfc'	Ll	1	LATIN SMALL LETTER U WITH DIAERESIS
U+0078	'x'	Ll	1	LATIN SMALL LETTER X
U+00FC	'\xfc'	Ll	1	LATIN SMALL LETTER U WITH DIAERESIS
U+0303	'\u0303'	Mn	0	COMBINING TILDE

Total codepoints: 5

Shell test using printf(1), '|' should align in output:

$ printf "N\xc3\xbcx\xc3\xbc\xcc\x83|\\n1234|\\n"
Nüxü̃|
1234|

python wcwidth.wcswidth() measures width 4, while zoc measures width 5.

Amarakaeri

Sequence of language Amarakaeri from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+006F	'o'	Ll	1	LATIN SMALL LETTER O
U+0027	"'"	Po	1	APOSTROPHE
U+006E	'n'	Ll	1	LATIN SMALL LETTER N
U+006F	'o'	Ll	1	LATIN SMALL LETTER O
U+0070	'p'	Ll	1	LATIN SMALL LETTER P
U+006F	'o'	Ll	1	LATIN SMALL LETTER O
U+0065	'e'	Ll	1	LATIN SMALL LETTER E
U+0331	'\u0331'	Mn	0	COMBINING MACRON BELOW
U+0070	'p'	Ll	1	LATIN SMALL LETTER P
U+006F	'o'	Ll	1	LATIN SMALL LETTER O

Total codepoints: 10

Shell test using printf(1), '|' should align in output:

$ printf "o'nopoe\xcc\xb1po|\\n123456789|\\n"
o'nopoe̱po|
123456789|

python wcwidth.wcswidth() measures width 9, while zoc measures width 10.

South Azerbaijani

Sequence of language South Azerbaijani from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0049	'I'	Lu	1	LATIN CAPITAL LETTER I
U+0307	'\u0307'	Mn	0	COMBINING DOT ABOVE
U+004E	'N'	Lu	1	LATIN CAPITAL LETTER N
U+0053	'S'	Lu	1	LATIN CAPITAL LETTER S
U+0041	'A'	Lu	1	LATIN CAPITAL LETTER A
U+004E	'N'	Lu	1	LATIN CAPITAL LETTER N

Total codepoints: 6

Shell test using printf(1), '|' should align in output:

$ printf "I\xcc\x87NSAN|\\n12345|\\n"
İNSAN|
12345|

python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Yoruba

Sequence of language Yoruba from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+1EB8	'\u1eb8'	Lu	1	LATIN CAPITAL LETTER E WITH DOT BELOW
U+0300	'\u0300'	Mn	0	COMBINING GRAVE ACCENT
U+0054	'T'	Lu	1	LATIN CAPITAL LETTER T
U+1ECC	'\u1ecc'	Lu	1	LATIN CAPITAL LETTER O WITH DOT BELOW
U+0301	'\u0301'	Mn	0	COMBINING ACUTE ACCENT

Total codepoints: 5

Shell test using printf(1), '|' should align in output:

$ printf "\xe1\xba\xb8\xcc\x80T\xe1\xbb\x8c\xcc\x81|\\n123|\\n"
Ẹ̀TỌ́|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 5.

Chickasaw

Sequence of language Chickasaw from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+004D	'M'	Lu	1	LATIN CAPITAL LETTER M
U+00F3	'\xf3'	Ll	1	LATIN SMALL LETTER O WITH ACUTE
U+0331	'\u0331'	Mn	0	COMBINING MACRON BELOW
U+006D	'm'	Ll	1	LATIN SMALL LETTER M
U+0061	'a'	Ll	1	LATIN SMALL LETTER A

Total codepoints: 5

Shell test using printf(1), '|' should align in output:

$ printf "M\xc3\xb3\xcc\xb1ma|\\n1234|\\n"
Mó̱ma|
1234|

python wcwidth.wcswidth() measures width 4, while zoc measures width 5.

Siona

Sequence of language Siona from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0067	'g'	Ll	1	LATIN SMALL LETTER G
U+0075	'u'	Ll	1	LATIN SMALL LETTER U
U+00EB	'\xeb'	Ll	1	LATIN SMALL LETTER E WITH DIAERESIS
U+0331	'\u0331'	Mn	0	COMBINING MACRON BELOW
U+006E	'n'	Ll	1	LATIN SMALL LETTER N
U+0061	'a'	Ll	1	LATIN SMALL LETTER A

Total codepoints: 6

Shell test using printf(1), '|' should align in output:

$ printf "gu\xc3\xab\xcc\xb1na|\\n12345|\\n"
guë̱na|
12345|

python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Fur

Sequence of language Fur from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0044	'D'	Lu	1	LATIN CAPITAL LETTER D
U+00E1	'\xe1'	Ll	1	LATIN SMALL LETTER A WITH ACUTE
U+0331	'\u0331'	Mn	0	COMBINING MACRON BELOW
U+006C	'l'	Ll	1	LATIN SMALL LETTER L
U+0064	'd'	Ll	1	LATIN SMALL LETTER D
U+0268	'\u0268'	Ll	1	LATIN SMALL LETTER I WITH STROKE
U+0301	'\u0301'	Mn	0	COMBINING ACUTE ACCENT
U+014B	'\u014b'	Ll	1	LATIN SMALL LETTER ENG
U+00E1	'\xe1'	Ll	1	LATIN SMALL LETTER A WITH ACUTE
U+A78C	'\ua78c'	Ll	1	LATIN SMALL LETTER SALTILLO
U+014B	'\u014b'	Ll	1	LATIN SMALL LETTER ENG

Total codepoints: 11

Shell test using printf(1), '|' should align in output:

$ printf "D\xc3\xa1\xcc\xb1ld\xc9\xa8\xcc\x81\xc5\x8b\xc3\xa1\xea\x9e\x8c\xc5\x8b|\\n123456789|\\n"
Dá̱ldɨ́ŋáꞌŋ|
123456789|

python wcwidth.wcswidth() measures width 9, while zoc measures width 11.

Chinantec, Chiltepec

Sequence of language Chinantec, Chiltepec from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+006D	'm'	Ll	1	LATIN SMALL LETTER M
U+0061	'a'	Ll	1	LATIN SMALL LETTER A
U+006B	'k'	Ll	1	LATIN SMALL LETTER K
U+0061	'a'	Ll	1	LATIN SMALL LETTER A
U+006C	'l'	Ll	1	LATIN SMALL LETTER L
U+006F	'o'	Ll	1	LATIN SMALL LETTER O
U+006F	'o'	Ll	1	LATIN SMALL LETTER O
U+0331	'\u0331'	Mn	0	COMBINING MACRON BELOW

Total codepoints: 8

Shell test using printf(1), '|' should align in output:

$ printf "makaloo\xcc\xb1|\\n1234567|\\n"
makaloo̱|
1234567|

python wcwidth.wcswidth() measures width 7, while zoc measures width 8.

Gumuz

Sequence of language Gumuz from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+006D	'm'	Ll	1	LATIN SMALL LETTER M
U+0061	'a'	Ll	1	LATIN SMALL LETTER A
U+0067	'g'	Ll	1	LATIN SMALL LETTER G
U+0061	'a'	Ll	1	LATIN SMALL LETTER A
U+0063	'c'	Ll	1	LATIN SMALL LETTER C
U+0327	'\u0327'	Mn	0	COMBINING CEDILLA

Total codepoints: 6

Shell test using printf(1), '|' should align in output:

$ printf "magac\xcc\xa7|\\n12345|\\n"
magaç|
12345|

python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Bora

Sequence of language Bora from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+006D	'm'	Ll	1	LATIN SMALL LETTER M
U+0268	'\u0268'	Ll	1	LATIN SMALL LETTER I WITH STROKE
U+0301	'\u0301'	Mn	0	COMBINING ACUTE ACCENT
U+0061	'a'	Ll	1	LATIN SMALL LETTER A
U+006D	'm'	Ll	1	LATIN SMALL LETTER M
U+00FA	'\xfa'	Ll	1	LATIN SMALL LETTER U WITH ACUTE
U+006E	'n'	Ll	1	LATIN SMALL LETTER N
U+0061	'a'	Ll	1	LATIN SMALL LETTER A
U+0061	'a'	Ll	1	LATIN SMALL LETTER A

Total codepoints: 9

Shell test using printf(1), '|' should align in output:

$ printf "m\xc9\xa8\xcc\x81am\xc3\xbanaa|\\n12345678|\\n"
mɨ́amúnaa|
12345678|

python wcwidth.wcswidth() measures width 8, while zoc measures width 9.

Mòoré

Sequence of language Mòoré from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0073	's'	Ll	1	LATIN SMALL LETTER S
U+0065	'e'	Ll	1	LATIN SMALL LETTER E
U+0303	'\u0303'	Mn	0	COMBINING TILDE
U+006E	'n'	Ll	1	LATIN SMALL LETTER N

Total codepoints: 4

Shell test using printf(1), '|' should align in output:
```
$ printf "se\xcc\x83n|\\n123|\\n"
sẽn|
123|
```
python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Mongolian, Halh (Mongolian)

Sequence of language Mongolian, Halh (Mongolian) from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+1828	'\u1828'	Lo	1	MONGOLIAN LETTER NA
U+1821	'\u1821'	Lo	1	MONGOLIAN LETTER E
U+1837	'\u1837'	Lo	1	MONGOLIAN LETTER RA
U+180E	'\u180e'	Cf	0	MONGOLIAN VOWEL SEPARATOR
U+1821	'\u1821'	Lo	1	MONGOLIAN LETTER E

Total codepoints: 5

Shell test using printf(1), '|' should align in output:

$ printf "\xe1\xa0\xa8\xe1\xa0\xa1\xe1\xa0\xb7\xe1\xa0\x8e\xe1\xa0\xa1|\\n1234|\\n"
ᠨᠡᠷ᠎ᠡ|
1234|

python wcwidth.wcswidth() measures width 4, while zoc measures width 5.

Lamnso'

Sequence of language Lamnso' from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0064	'd'	Ll	1	LATIN SMALL LETTER D
U+007A	'z'	Ll	1	LATIN SMALL LETTER Z
U+0259	'\u0259'	Ll	1	LATIN SMALL LETTER SCHWA
U+0259	'\u0259'	Ll	1	LATIN SMALL LETTER SCHWA
U+0300	'\u0300'	Mn	0	COMBINING GRAVE ACCENT
U+006E	'n'	Ll	1	LATIN SMALL LETTER N

Total codepoints: 6

Shell test using printf(1), '|' should align in output:

$ printf "dz\xc9\x99\xc9\x99\xcc\x80n|\\n12345|\\n"
dzəə̀n|
12345|

python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Navajo

Sequence of language Navajo from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0042	'B'	Lu	1	LATIN CAPITAL LETTER B
U+0065	'e'	Ll	1	LATIN SMALL LETTER E
U+0065	'e'	Ll	1	LATIN SMALL LETTER E
U+0068	'h'	Ll	1	LATIN SMALL LETTER H
U+0061	'a'	Ll	1	LATIN SMALL LETTER A
U+007A	'z'	Ll	1	LATIN SMALL LETTER Z
U+0105	'\u0105'	Ll	1	LATIN SMALL LETTER A WITH OGONEK
U+0301	'\u0301'	Mn	0	COMBINING ACUTE ACCENT
U+0105	'\u0105'	Ll	1	LATIN SMALL LETTER A WITH OGONEK

Total codepoints: 9

Shell test using printf(1), '|' should align in output:

$ printf "Beehaz\xc4\x85\xcc\x81\xc4\x85|\\n12345678|\\n"
Beehazą́ą|
12345678|

python wcwidth.wcswidth() measures width 8, while zoc measures width 9.

Tamazight, Central Atlas

Sequence of language Tamazight, Central Atlas from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0054	'T'	Lu	1	LATIN CAPITAL LETTER T
U+0049	'I'	Lu	1	LATIN CAPITAL LETTER I
U+0053	'S'	Lu	1	LATIN CAPITAL LETTER S
U+0323	'\u0323'	Mn	0	COMBINING DOT BELOW
U+0045	'E'	Lu	1	LATIN CAPITAL LETTER E
U+0052	'R'	Lu	1	LATIN CAPITAL LETTER R
U+0052	'R'	Lu	1	LATIN CAPITAL LETTER R
U+0049	'I'	Lu	1	LATIN CAPITAL LETTER I
U+0048	'H'	Lu	1	LATIN CAPITAL LETTER H
U+0323	'\u0323'	Mn	0	COMBINING DOT BELOW
U+0054	'T'	Lu	1	LATIN CAPITAL LETTER T

Total codepoints: 11

Shell test using printf(1), '|' should align in output:

$ printf "TIS\xcc\xa3ERRIH\xcc\xa3T|\\n123456789|\\n"
TIṢERRIḤT|
123456789|

python wcwidth.wcswidth() measures width 9, while zoc measures width 11.

Gilyak

Sequence of language Gilyak from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+043D	'\u043d'	Ll	1	CYRILLIC SMALL LETTER EN
U+0430	'\u0430'	Ll	1	CYRILLIC SMALL LETTER A
U+043C	'\u043c'	Ll	1	CYRILLIC SMALL LETTER EM
U+0430	'\u0430'	Ll	1	CYRILLIC SMALL LETTER A
U+0434	'\u0434'	Ll	1	CYRILLIC SMALL LETTER DE
U+0438	'\u0438'	Ll	1	CYRILLIC SMALL LETTER I
U+0432	'\u0432'	Ll	1	CYRILLIC SMALL LETTER VE
U+04CA	'\u04ca'	Ll	1	CYRILLIC SMALL LETTER EN WITH TAIL
U+0447	'\u0447'	Ll	1	CYRILLIC SMALL LETTER CHE
U+043E	'\u043e'	Ll	1	CYRILLIC SMALL LETTER O
U+0493	'\u0493'	Ll	1	CYRILLIC SMALL LETTER GHE WITH STROKE
U+0440	'\u0440'	Ll	1	CYRILLIC SMALL LETTER ER
U+030C	'\u030c'	Mn	0	COMBINING CARON

Total codepoints: 13

Shell test using printf(1), '|' should align in output:

$ printf "\xd0\xbd\xd0\xb0\xd0\xbc\xd0\xb0\xd0\xb4\xd0\xb8\xd0\xb2\xd3\x8a\xd1\x87\xd0\xbe\xd2\x93\xd1\x80\xcc\x8c|\\n123456789012|\\n"
намадивӊчоғр̌|
123456789012|

python wcwidth.wcswidth() measures width 12, while zoc measures width 13.

Ditammari

Sequence of language Ditammari from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+006D	'm'	Ll	1	LATIN SMALL LETTER M
U+0075	'u'	Ll	1	LATIN SMALL LETTER U
U+0077	'w'	Ll	1	LATIN SMALL LETTER W
U+025B	'\u025b'	Ll	1	LATIN SMALL LETTER OPEN E
U+0303	'\u0303'	Mn	0	COMBINING TILDE
U+0072	'r'	Ll	1	LATIN SMALL LETTER R
U+0069	'i'	Ll	1	LATIN SMALL LETTER I
U+006D	'm'	Ll	1	LATIN SMALL LETTER M
U+0075	'u'	Ll	1	LATIN SMALL LETTER U

Total codepoints: 9

Shell test using printf(1), '|' should align in output:

$ printf "muw\xc9\x9b\xcc\x83rimu|\\n12345678|\\n"
muwɛ̃rimu|
12345678|

python wcwidth.wcswidth() measures width 8, while zoc measures width 9.

Assyrian Neo-Aramaic

Sequence of language Assyrian Neo-Aramaic from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+072C	'\u072c'	Lo	1	SYRIAC LETTER TAW
U+071D	'\u071d'	Lo	1	SYRIAC LETTER YUDH
U+0712	'\u0712'	Lo	1	SYRIAC LETTER BETH
U+0742	'\u0742'	Mn	0	SYRIAC RUKKAKHA
U+0720	'\u0720'	Lo	1	SYRIAC LETTER LAMADH
U+071D	'\u071d'	Lo	1	SYRIAC LETTER YUDH
U+0710	'\u0710'	Lo	1	SYRIAC LETTER ALAPH

Total codepoints: 7

Shell test using printf(1), '|' should align in output:

$ printf "\xdc\xac\xdc\x9d\xdc\x92\xdd\x82\xdc\xa0\xdc\x9d\xdc\x90|\\n123456|\\n"
ܬܝܒ݂ܠܝܐ|
123456|

python wcwidth.wcswidth() measures width 6, while zoc measures width 7.

Farsi, Western

Sequence of language Farsi, Western from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+06A9	'\u06a9'	Lo	1	ARABIC LETTER KEHEH
U+0644	'\u0644'	Lo	1	ARABIC LETTER LAM
U+06CC	'\u06cc'	Lo	1	ARABIC LETTER FARSI YEH
U+0647	'\u0647'	Lo	1	ARABIC LETTER HEH
U+0654	'\u0654'	Mn	0	ARABIC HAMZA ABOVE

Total codepoints: 5

Shell test using printf(1), '|' should align in output:

$ printf "\xda\xa9\xd9\x84\xdb\x8c\xd9\x87\xd9\x94|\\n1234|\\n"
کلیهٔ|
1234|

python wcwidth.wcswidth() measures width 4, while zoc measures width 5.

Otomi, Mezquital

Sequence of language Otomi, Mezquital from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0058	'X'	Lu	1	LATIN CAPITAL LETTER X
U+0049	'I'	Lu	1	LATIN CAPITAL LETTER I
U+004A	'J'	Lu	1	LATIN CAPITAL LETTER J
U+004D	'M'	Lu	1	LATIN CAPITAL LETTER M
U+004F	'O'	Lu	1	LATIN CAPITAL LETTER O
U+004A	'J'	Lu	1	LATIN CAPITAL LETTER J
U+004F	'O'	Lu	1	LATIN CAPITAL LETTER O
U+0331	'\u0331'	Mn	0	COMBINING MACRON BELOW
U+0049	'I'	Lu	1	LATIN CAPITAL LETTER I

Total codepoints: 9

Shell test using printf(1), '|' should align in output:

$ printf "XIJMOJO\xcc\xb1I|\\n12345678|\\n"
XIJMOJO̱I|
12345678|

python wcwidth.wcswidth() measures width 8, while zoc measures width 9.

Veps

Sequence of language Veps from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0075	'u'	Ll	1	LATIN SMALL LETTER U
U+0308	'\u0308'	Mn	0	COMBINING DIAERESIS
U+0068	'h'	Ll	1	LATIN SMALL LETTER H
U+0074	't'	Ll	1	LATIN SMALL LETTER T
U+0068	'h'	Ll	1	LATIN SMALL LETTER H
U+0069	'i'	Ll	1	LATIN SMALL LETTER I
U+006E	'n'	Ll	1	LATIN SMALL LETTER N
U+0065	'e'	Ll	1	LATIN SMALL LETTER E

Total codepoints: 8

Shell test using printf(1), '|' should align in output:

$ printf "u\xcc\x88hthine|\\n1234567|\\n"
ühthine|
1234567|

python wcwidth.wcswidth() measures width 7, while zoc measures width 8.

Waama

Sequence of language Waama from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+006E	'n'	Ll	1	LATIN SMALL LETTER N
U+0300	'\u0300'	Mn	0	COMBINING GRAVE ACCENT

Total codepoints: 2

Shell test using printf(1), '|' should align in output:
```
$ printf "n\xcc\x80|\\n1|\\n"
ǹ|
1|
```
python wcwidth.wcswidth() measures width 1, while zoc measures width 2.

Dinka, Northeastern

Sequence of language Dinka, Northeastern from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0062	'b'	Ll	1	LATIN SMALL LETTER B
U+025B	'\u025b'	Ll	1	LATIN SMALL LETTER OPEN E
U+0308	'\u0308'	Mn	0	COMBINING DIAERESIS
U+0069	'i'	Ll	1	LATIN SMALL LETTER I

Total codepoints: 4

Shell test using printf(1), '|' should align in output:

$ printf "b\xc9\x9b\xcc\x88i|\\n123|\\n"
bɛ̈i|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Dari

Sequence of language Dari from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+06A9	'\u06a9'	Lo	1	ARABIC LETTER KEHEH
U+0644	'\u0644'	Lo	1	ARABIC LETTER LAM
U+06CC	'\u06cc'	Lo	1	ARABIC LETTER FARSI YEH
U+0647	'\u0647'	Lo	1	ARABIC LETTER HEH
U+0654	'\u0654'	Mn	0	ARABIC HAMZA ABOVE

Total codepoints: 5

Shell test using printf(1), '|' should align in output:

$ printf "\xda\xa9\xd9\x84\xdb\x8c\xd9\x87\xd9\x94|\\n1234|\\n"
کلیهٔ|
1234|

python wcwidth.wcswidth() measures width 4, while zoc measures width 5.

Éwé

Sequence of language Éwé from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0068	'h'	Ll	1	LATIN SMALL LETTER H
U+006C	'l'	Ll	1	LATIN SMALL LETTER L
U+0254	'\u0254'	Ll	1	LATIN SMALL LETTER OPEN O
U+0303	'\u0303'	Mn	0	COMBINING TILDE
U+006E	'n'	Ll	1	LATIN SMALL LETTER N
U+0075	'u'	Ll	1	LATIN SMALL LETTER U
U+0077	'w'	Ll	1	LATIN SMALL LETTER W
U+0254	'\u0254'	Ll	1	LATIN SMALL LETTER OPEN O
U+0077	'w'	Ll	1	LATIN SMALL LETTER W
U+0254	'\u0254'	Ll	1	LATIN SMALL LETTER OPEN O

Total codepoints: 10

Shell test using printf(1), '|' should align in output:

$ printf "hl\xc9\x94\xcc\x83nuw\xc9\x94w\xc9\x94|\\n123456789|\\n"
hlɔ̃nuwɔwɔ|
123456789|

python wcwidth.wcswidth() measures width 9, while zoc measures width 10.

Baatonum

Sequence of language Baatonum from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+006D	'm'	Ll	1	LATIN SMALL LETTER M
U+025B	'\u025b'	Ll	1	LATIN SMALL LETTER OPEN E
U+0300	'\u0300'	Mn	0	COMBINING GRAVE ACCENT

Total codepoints: 3

Shell test using printf(1), '|' should align in output:
```
$ printf "m\xc9\x9b\xcc\x80|\\n12|\\n"
mɛ̀|
12|
```
python wcwidth.wcswidth() measures width 2, while zoc measures width 3.
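
Most columns of the codepoint table above can be reproduced with Python's standard library alone. In this sketch the function name `codepoint_rows` is invented for illustration, and the wcwidth column is left out because it comes from the third-party `wcwidth` package rather than from `unicodedata`.

```python
import unicodedata


def codepoint_rows(text: str) -> list[str]:
    """Return tab-separated Codepoint, Python, Category, Name rows."""
    rows = []
    for ch in text:
        # Four hex digits inside the BMP, eight beyond it.
        digits = 4 if ord(ch) <= 0xFFFF else 8
        rows.append("\t".join([
            f"U+{ord(ch):0{digits}X}",
            ascii(ch),  # quoted literal with non-ASCII characters escaped
            unicodedata.category(ch),
            unicodedata.name(ch, "(unnamed)"),
        ]))
    return rows


# The Baatonum sequence above.
for row in codepoint_rows("m\u025b\u0300"):
    print(row)
```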

Urdu (2)

Sequence of language Urdu (2) from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0627	'\u0627'	Lo	1	ARABIC LETTER ALEF
U+0642	'\u0642'	Lo	1	ARABIC LETTER QAF
U+0648	'\u0648'	Lo	1	ARABIC LETTER WAW
U+0627	'\u0627'	Lo	1	ARABIC LETTER ALEF
U+0645	'\u0645'	Lo	1	ARABIC LETTER MEEM
U+0650	'\u0650'	Mn	0	ARABIC KASRA

Total codepoints: 6

Shell test using printf(1), '|' should align in output:

$ printf "\xd8\xa7\xd9\x82\xd9\x88\xd8\xa7\xd9\x85\xd9\x90|\\n12345|\\n"
اقوامِ|
12345|

python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Urdu

Sequence of language Urdu from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0627	'\u0627'	Lo	1	ARABIC LETTER ALEF
U+0642	'\u0642'	Lo	1	ARABIC LETTER QAF
U+0648	'\u0648'	Lo	1	ARABIC LETTER WAW
U+0627	'\u0627'	Lo	1	ARABIC LETTER ALEF
U+0645	'\u0645'	Lo	1	ARABIC LETTER MEEM
U+0650	'\u0650'	Mn	0	ARABIC KASRA

Total codepoints: 6

Shell test using printf(1), '|' should align in output:

$ printf "\xd8\xa7\xd9\x82\xd9\x88\xd8\xa7\xd9\x85\xd9\x90|\\n12345|\\n"
اقوامِ|
12345|

python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Uduk

Sequence of language Uduk from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0070	'p'	Ll	1	LATIN SMALL LETTER P
U+0331	'\u0331'	Mn	0	COMBINING MACRON BELOW
U+0061	'a'	Ll	1	LATIN SMALL LETTER A
U+0072	'r'	Ll	1	LATIN SMALL LETTER R
U+0061	'a'	Ll	1	LATIN SMALL LETTER A

Total codepoints: 5

Shell test using printf(1), '|' should align in output:
```
$ printf "p\xcc\xb1ara|\\n1234|\\n"
p̱ara|
1234|
```
python wcwidth.wcswidth() measures width 4, while zoc measures width 5.

Mazahua Central

Sequence of language Mazahua Central from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0054	'T'	Lu	1	LATIN CAPITAL LETTER T
U+0045	'E'	Lu	1	LATIN CAPITAL LETTER E
U+0331	'\u0331'	Mn	0	COMBINING MACRON BELOW
U+0027	"'"	Po	1	APOSTROPHE
U+0045	'E'	Lu	1	LATIN CAPITAL LETTER E
U+0331	'\u0331'	Mn	0	COMBINING MACRON BELOW

Total codepoints: 6

Shell test using printf(1), '|' should align in output:

$ printf "TE\xcc\xb1'E\xcc\xb1|\\n1234|\\n"
TE̱'E̱|
1234|

python wcwidth.wcswidth() measures width 4, while zoc measures width 6.
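
The two measurements above differ by exactly the number of combining marks in the sequence. The following is a minimal sketch that approximates the expected width using only the standard library's `unicodedata`, not the `wcwidth` package. The names `approx_width` and `ZERO_WIDTH_CATEGORIES` are illustrative. Treating every Mn, Me, and Cf codepoint as zero width and everything else as width 1 is a simplifying assumption: it holds for this sequence but ignores wide characters.

```python
import unicodedata

# Assumption: codepoints in these general categories occupy no cell and
# everything else occupies one. Wide characters are deliberately ignored.
ZERO_WIDTH_CATEGORIES = {"Mn", "Me", "Cf"}


def approx_width(text: str) -> int:
    """Approximate the printable width of text in terminal cells."""
    return sum(
        0 if unicodedata.category(ch) in ZERO_WIDTH_CATEGORIES else 1
        for ch in text
    )


sequence = "TE\u0331'E\u0331"  # the Mazahua Central sequence above
print(approx_width(sequence))  # 4, matching wcwidth.wcswidth()
print(len(sequence))           # 6, matching the width zoc reports
```

Here the width zoc reports equals the codepoint count, which is consistent with the terminal advancing one cell for each combining mark, though the report itself does not state the cause.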

Secoya

Sequence of language Secoya from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0063	'c'	Ll	1	LATIN SMALL LETTER C
U+0061	'a'	Ll	1	LATIN SMALL LETTER A
U+006E	'n'	Ll	1	LATIN SMALL LETTER N
U+00EB	'\xeb'	Ll	1	LATIN SMALL LETTER E WITH DIAERESIS
U+006F	'o'	Ll	1	LATIN SMALL LETTER O
U+0077	'w'	Ll	1	LATIN SMALL LETTER W
U+00EB	'\xeb'	Ll	1	LATIN SMALL LETTER E WITH DIAERESIS
U+0331	'\u0331'	Mn	0	COMBINING MACRON BELOW

Total codepoints: 8

Shell test using printf(1), '|' should align in output:

$ printf "can\xc3\xabow\xc3\xab\xcc\xb1|\\n1234567|\\n"
canëowë̱|
1234567|

python wcwidth.wcswidth() measures width 7, while zoc measures width 8.

Gen

Sequence of language Gen from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0064	'd'	Ll	1	LATIN SMALL LETTER D
U+0254	'\u0254'	Ll	1	LATIN SMALL LETTER OPEN O
U+0300	'\u0300'	Mn	0	COMBINING GRAVE ACCENT
U+006E	'n'	Ll	1	LATIN SMALL LETTER N
U+006E	'n'	Ll	1	LATIN SMALL LETTER N
U+0061	'a'	Ll	1	LATIN SMALL LETTER A

Total codepoints: 6

Shell test using printf(1), '|' should align in output:

$ printf "d\xc9\x94\xcc\x80nna|\\n12345|\\n"
dɔ̀nna|
12345|

python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Picard

Sequence of language Picard from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0076	'v'	Ll	1	LATIN SMALL LETTER V
U+0072	'r'	Ll	1	LATIN SMALL LETTER R
U+0065	'e'	Ll	1	LATIN SMALL LETTER E
U+030A	'\u030a'	Mn	0	COMBINING RING ABOVE
U+0079	'y'	Ll	1	LATIN SMALL LETTER Y
U+006D	'm'	Ll	1	LATIN SMALL LETTER M
U+0069	'i'	Ll	1	LATIN SMALL LETTER I
U+006E	'n'	Ll	1	LATIN SMALL LETTER N
U+0074	't'	Ll	1	LATIN SMALL LETTER T

Total codepoints: 9

Shell test using printf(1), '|' should align in output:

$ printf "vre\xcc\x8aymint|\\n12345678|\\n"
vre̊ymint|
12345678|

python wcwidth.wcswidth() measures width 8, while zoc measures width 9.

Mixtec, Metlatónoc

Sequence of language Mixtec, Metlatónoc from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+006E	'n'	Ll	1	LATIN SMALL LETTER N
U+0061	'a'	Ll	1	LATIN SMALL LETTER A
U+0027	"'"	Po	1	APOSTROPHE
U+006E	'n'	Ll	1	LATIN SMALL LETTER N
U+0075	'u'	Ll	1	LATIN SMALL LETTER U
U+0331	'\u0331'	Mn	0	COMBINING MACRON BELOW

Total codepoints: 6

Shell test using printf(1), '|' should align in output:

$ printf "na'nu\xcc\xb1|\\n12345|\\n"
na'nu̱|
12345|

python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Arabic, Standard

Sequence of language Arabic, Standard from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0627	'\u0627'	Lo	1	ARABIC LETTER ALEF
U+0639	'\u0639'	Lo	1	ARABIC LETTER AIN
U+062A	'\u062a'	Lo	1	ARABIC LETTER TEH
U+064F	'\u064f'	Mn	0	ARABIC DAMMA
U+0645	'\u0645'	Lo	1	ARABIC LETTER MEEM
U+062F	'\u062f'	Lo	1	ARABIC LETTER DAL

Total codepoints: 6

Shell test using printf(1), '|' should align in output:

$ printf "\xd8\xa7\xd8\xb9\xd8\xaa\xd9\x8f\xd9\x85\xd8\xaf|\\n12345|\\n"
اعتُمد|
12345|

python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Ga

Sequence of language Ga from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0061	'a'	Ll	1	LATIN SMALL LETTER A
U+0073	's'	Ll	1	LATIN SMALL LETTER S
U+0068	'h'	Ll	1	LATIN SMALL LETTER H
U+0254	'\u0254'	Ll	1	LATIN SMALL LETTER OPEN O
U+0303	'\u0303'	Mn	0	COMBINING TILDE

Total codepoints: 5

Shell test using printf(1), '|' should align in output:

$ printf "ash\xc9\x94\xcc\x83|\\n1234|\\n"
ashɔ̃|
1234|

python wcwidth.wcswidth() measures width 4, while zoc measures width 5.

Panjabi, Western

Sequence of language Panjabi, Western from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0627	'\u0627'	Lo	1	ARABIC LETTER ALEF
U+064F	'\u064f'	Mn	0	ARABIC DAMMA
U+0646	'\u0646'	Lo	1	ARABIC LETTER NOON
U+06CC	'\u06cc'	Lo	1	ARABIC LETTER FARSI YEH

Total codepoints: 4

Shell test using printf(1), '|' should align in output:

$ printf "\xd8\xa7\xd9\x8f\xd9\x86\xdb\x8c|\\n123|\\n"
اُنی|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 4.

Dangme

Sequence of language Dangme from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+006E	'n'	Ll	1	LATIN SMALL LETTER N
U+0254	'\u0254'	Ll	1	LATIN SMALL LETTER OPEN O
U+0301	'\u0301'	Mn	0	COMBINING ACUTE ACCENT

Total codepoints: 3

Shell test using printf(1), '|' should align in output:
```
$ printf "n\xc9\x94\xcc\x81|\\n12|\\n"
nɔ́|
12|
```
python wcwidth.wcswidth() measures width 2, while zoc measures width 3.

Dagaare, Southern

Sequence of language Dagaare, Southern from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+006B	'k'	Ll	1	LATIN SMALL LETTER K
U+0075	'u'	Ll	1	LATIN SMALL LETTER U
U+0303	'\u0303'	Mn	0	COMBINING TILDE
U+0075	'u'	Ll	1	LATIN SMALL LETTER U
U+0303	'\u0303'	Mn	0	COMBINING TILDE

Total codepoints: 5

Shell test using printf(1), '|' should align in output:

$ printf "ku\xcc\x83u\xcc\x83|\\n123|\\n"
kũũ|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 5.

Serer-Sine

Sequence of language Serer-Sine from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0070	'p'	Ll	1	LATIN SMALL LETTER P
U+0301	'\u0301'	Mn	0	COMBINING ACUTE ACCENT
U+0061	'a'	Ll	1	LATIN SMALL LETTER A
U+0073	's'	Ll	1	LATIN SMALL LETTER S
U+0069	'i'	Ll	1	LATIN SMALL LETTER I
U+006C	'l'	Ll	1	LATIN SMALL LETTER L

Total codepoints: 6

Shell test using printf(1), '|' should align in output:

$ printf "p\xcc\x81asil|\\n12345|\\n"
ṕasil|
12345|

python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Fon

Sequence of language Fon from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0061	'a'	Ll	1	LATIN SMALL LETTER A
U+006B	'k'	Ll	1	LATIN SMALL LETTER K
U+0254	'\u0254'	Ll	1	LATIN SMALL LETTER OPEN O
U+0301	'\u0301'	Mn	0	COMBINING ACUTE ACCENT
U+006E	'n'	Ll	1	LATIN SMALL LETTER N

Total codepoints: 5

Shell test using printf(1), '|' should align in output:

$ printf "ak\xc9\x94\xcc\x81n|\\n1234|\\n"
akɔ́n|
1234|

python wcwidth.wcswidth() measures width 4, while zoc measures width 5.

Aja

Sequence of language Aja from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+00E8	'\xe8'	Ll	1	LATIN SMALL LETTER E WITH GRAVE
U+0067	'g'	Ll	1	LATIN SMALL LETTER G
U+0062	'b'	Ll	1	LATIN SMALL LETTER B
U+025B	'\u025b'	Ll	1	LATIN SMALL LETTER OPEN E
U+0300	'\u0300'	Mn	0	COMBINING GRAVE ACCENT
U+006D	'm'	Ll	1	LATIN SMALL LETTER M
U+025B	'\u025b'	Ll	1	LATIN SMALL LETTER OPEN E
U+0300	'\u0300'	Mn	0	COMBINING GRAVE ACCENT

Total codepoints: 8

Shell test using printf(1), '|' should align in output:

$ printf "\xc3\xa8gb\xc9\x9b\xcc\x80m\xc9\x9b\xcc\x80|\\n123456|\\n"
ègbɛ̀mɛ̀|
123456|

python wcwidth.wcswidth() measures width 6, while zoc measures width 8.

Pashto, Northern

Sequence of language Pashto, Northern from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0627	'\u0627'	Lo	1	ARABIC LETTER ALEF
U+0633	'\u0633'	Lo	1	ARABIC LETTER SEEN
U+0627	'\u0627'	Lo	1	ARABIC LETTER ALEF
U+0633	'\u0633'	Lo	1	ARABIC LETTER SEEN
U+0627	'\u0627'	Lo	1	ARABIC LETTER ALEF
U+064B	'\u064b'	Mn	0	ARABIC FATHATAN

Total codepoints: 6

Shell test using printf(1), '|' should align in output:

$ printf "\xd8\xa7\xd8\xb3\xd8\xa7\xd8\xb3\xd8\xa7\xd9\x8b|\\n12345|\\n"
اساساً|
12345|

python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Dendi

Sequence of language Dendi from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0062	'b'	Ll	1	LATIN SMALL LETTER B
U+0254	'\u0254'	Ll	1	LATIN SMALL LETTER OPEN O
U+0303	'\u0303'	Mn	0	COMBINING TILDE
U+014B	'\u014b'	Ll	1	LATIN SMALL LETTER ENG
U+0254	'\u0254'	Ll	1	LATIN SMALL LETTER OPEN O
U+002E	'.'	Po	1	FULL STOP

Total codepoints: 6

Shell test using printf(1), '|' should align in output:

$ printf "b\xc9\x94\xcc\x83\xc5\x8b\xc9\x94.|\\n12345|\\n"
bɔ̃ŋɔ.|
12345|

python wcwidth.wcswidth() measures width 5, while zoc measures width 6.

Seraiki

Sequence of language Seraiki from midpoint of alignment failure records:

Codepoint	Python	Category	wcwidth	Name
U+0627	'\u0627'	Lo	1	ARABIC LETTER ALEF
U+064F	'\u064f'	Mn	0	ARABIC DAMMA
U+062A	'\u062a'	Lo	1	ARABIC LETTER TEH
U+06D2	'\u06d2'	Lo	1	ARABIC LETTER YEH BARREE

Total codepoints: 4

Shell test using printf(1), '|' should align in output:

$ printf "\xd8\xa7\xd9\x8f\xd8\xaa\xdb\x92|\\n123|\\n"
اُتے|
123|

python wcwidth.wcswidth() measures width 3, while zoc measures width 4.
