failing grapheme-break test #181

@stevengj

The grapheme-break test is currently failing:

$ test/graphemetest data/GraphemeBreakTest.txt
checking line 100...
checking line 200...
checking line 300...
checking line 400...
checking line 500...
checking line 600...
line 621: grapheme mismatch: "/a🏿/👶‍/🛑" instead of "/a🏿/👶‍🛑"

At first I thought this was due to the dependence on Prepended_Concatenation_Mark, which was added in Unicode 13 (https://www.unicode.org/reports/tr29/tr29-37.html#Grapheme_Cluster_Break_Property_Values), but that should already be covered by UTF8PROC_BOUNDCLASS_CONTROL, which we generate from GraphemeBreakProperty.txt … so there must be some other change in the grapheme-break algorithm that we are not handling.

The failing test is from the following line in GraphemeBreakTest.txt:

/ 0061 + 1F3FF / 1F476 + 200D + 1F6D1 /	#  / [0.2] LATIN SMALL LETTER A (Other) + [9.0] EMOJI MODIFIER FITZPATRICK TYPE-6 (Extend) / [999.0] BABY (ExtPict) + [9.0] ZERO WIDTH JOINER (ZWJ_ExtCccZwj) + [11.0] OCTAGONAL SIGN (ExtPict) / [0.3]
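To make the expected segmentation explicit, here is a minimal sketch that decodes the test line above into its expected grapheme clusters. It assumes the notation as it appears in this issue, where `/` marks a break and `+` marks no break (the upstream file uses `÷` and `×` for these). The helper name `parse_break_test_line` is hypothetical and is not part of utf8proc or its test suite.

```python
def parse_break_test_line(line: str) -> list[str]:
    """Return the expected grapheme clusters encoded by one test line.

    Assumes '/' separates clusters and '+' joins code points within a
    cluster, as the line is rendered in this issue.
    """
    # Everything after '#' is a human-readable comment.
    spec = line.split("#", 1)[0].strip()
    clusters = []
    for chunk in spec.split("/"):
        chunk = chunk.strip()
        if not chunk:
            continue  # leading/trailing '/' produce empty chunks
        code_points = [int(tok.strip(), 16) for tok in chunk.split("+")]
        clusters.append("".join(chr(cp) for cp in code_points))
    return clusters


line = "/ 0061 + 1F3FF / 1F476 + 200D + 1F6D1 /\t#  / [0.2] LATIN SMALL LETTER A (Other) ..."
expected = parse_break_test_line(line)

# Show each cluster as a list of hex code points.
for cluster in expected:
    print([f"{ord(ch):04X}" for ch in cluster])
```

The line therefore expects two clusters, U+0061 U+1F3FF and U+1F476 U+200D U+1F6D1, whereas the failing output splits the second one after the zero-width joiner. The `[11.0]` annotation before OCTAGONAL SIGN refers to rule GB11 of UAX #29, which forbids a break between ZWJ and a following Extended_Pictographic character in an emoji ZWJ sequence.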
