Implementation divergence for Script_Extensions values #3590

Open
@gibson042

(based on a comment from @Jack-Works in #3587 (comment))

Regular expression pattern \p{Script_Extensions=…} and its alias \p{scx=…} match code points by the Script_Extensions property, which is set-valued (cf. UTS #18 and UAX #24). UTS #18 makes clear that the value of that property is always a non-empty set containing either the single value Unknown or one or more valid Script (sc) values. UAX #24 ties the validity of a regular expression pattern to the property name, which in this case means that the valid values are the valid members of a Script_Extensions set.

Implementations generally accept any script value or alias thereof, but there are some special cases:

  • \p{scx=Unknown} and alias \p{scx=Zzzz}: accepted by GraalJS, LibJS, SpiderMonkey, and V8, but rejected as invalid by other implementations
  • \p{scx=Katakana_Or_Hiragana} and alias \p{scx=Hrkt}: rejected as invalid by all implementations
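The divergence listed above can be checked in any given engine with a small probe. This is only a minimal sketch: `acceptsPropertyEscape` and `probes` are hypothetical names introduced for illustration, and the result depends entirely on the engine and version running it.

```javascript
// Hypothetical helper (not from the spec or any library): reports whether
// the current engine accepts \p{<expression>} in a Unicode-mode pattern.
function acceptsPropertyEscape(expression) {
  try {
    new RegExp(`\\p{${expression}}`, "u");
    return true;
  } catch (error) {
    // An unsupported property name or value is an early SyntaxError.
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// The special cases discussed in this issue.
const probes = [
  "scx=Unknown",
  "scx=Zzzz",
  "scx=Katakana_Or_Hiragana",
  "scx=Hrkt",
];

const results = Object.fromEntries(
  probes.map((probe) => [probe, acceptsPropertyEscape(probe)])
);
console.log(results);
```

Running the same script in several engines gives a quick side-by-side view of which of these values each one accepts.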

Unknown and its alias Zzzz should obviously be supported; Unknown is specifically mentioned in UAX #24 as covering unassigned, private-use, and noncharacter code points.

Katakana_Or_Hiragana and its alias Hrkt, on the other hand, should be supported if and only if they are supported for Script. No implementation seems to support them for either property, which makes sense: those values are essentially a degenerate attempt to fulfill the purpose better served by Script_Extensions, and no code point has that value in either Script or Script_Extensions.
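To illustrate how Script_Extensions covers the case Katakana_Or_Hiragana was meant for, here is a small sketch using U+30FC (KATAKANA-HIRAGANA PROLONGED SOUND MARK). It assumes the Unicode data I believe to be current, where that code point has Script=Common and Script_Extensions={Hira, Kana}; the variable names are illustrative only.

```javascript
// U+30FC is shared by Hiragana and Katakana text.
const mark = "\u30FC";

const checks = {
  // Script is single-valued: the mark is assumed to be Common.
  "sc=Common": /^\p{sc=Common}$/u.test(mark),
  "sc=Hiragana": /^\p{sc=Hiragana}$/u.test(mark),
  // Script_Extensions is set-valued: the mark is assumed to belong to both.
  "scx=Hiragana": /^\p{scx=Hiragana}$/u.test(mark),
  "scx=Katakana": /^\p{scx=Katakana}$/u.test(mark),
};

console.log(checks);
```

If the assumption about the Unicode data holds, the two `scx` checks should succeed while `sc=Hiragana` does not, which is the behavior a combined Katakana_Or_Hiragana value would otherwise have to approximate.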

So I think we're missing test coverage for the first case (to be fixed by tc39/test262#4473), and possibly also a note in the algorithm for UnicodePropertyValueExpression :: UnicodePropertyName = UnicodePropertyValue.
