Description
(based on a comment from @Jack-Works in #3587 (comment))
The regular expression pattern \p{Script_Extensions=…} and its alias \p{scx=…} are used to match code points by the Script_Extensions property, which is set-valued (cf. UTS #18 and UAX #24). UTS #18 makes clear that the value for that property is always a non-empty set containing either the single value Unknown or one or more valid Script (sc) values, and UAX #24 constrains the regular expression pattern to validity on the basis of the property name (which in this case means valid members of a Script_Extensions set).
Implementations generally accept any script value or alias thereof, but there are some special cases:
- \p{scx=Unknown} and alias \p{scx=Zzzz}: accepted by GraalJS, LibJS, SpiderMonkey, and V8, but rejected as invalid by other implementations
- \p{scx=Katakana_Or_Hiragana} and alias \p{scx=Hrkt}: rejected as invalid by all implementations
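For anyone who wants to reproduce the list above in a particular engine, here is a minimal sketch of a probe. The helper name acceptsPropertyEscape is hypothetical (it is not part of any spec or test suite), and the result depends entirely on the engine running it, so no expected output is shown.

```javascript
// Hypothetical helper (not from the spec or test262): reports whether the
// current engine accepts \p{<body>} in a Unicode-mode regular expression.
function acceptsPropertyEscape(body) {
  try {
    new RegExp(`\\p{${body}}`, "u");
    return true;
  } catch (err) {
    // An unsupported property name or value is an early SyntaxError.
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// The special cases discussed above.
const specialCases = [
  "scx=Unknown",
  "scx=Zzzz",
  "scx=Katakana_Or_Hiragana",
  "scx=Hrkt",
];

// Maps each pattern body to true (accepted) or false (rejected).
const acceptanceReport = Object.fromEntries(
  specialCases.map((body) => [body, acceptsPropertyEscape(body)])
);

console.log(acceptanceReport);
```

Running the same snippet in each engine's shell should show where the behaviors diverge.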
Unknown and its alias Zzzz should obviously be supported; the value is specifically mentioned in UAX #24 as covering unassigned, private use, and non-character code points.
Katakana_Or_Hiragana and its alias Hrkt, on the other hand, should be supported if and only if they are supported for Script. No implementation seems to support them, which makes sense: those values are essentially a degenerate attempt to fulfill the purpose better served by Script_Extensions, and no code point has that value in either Script or Script_Extensions.
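The "if and only if" expectation can be checked mechanically. The following is only a sketch: isAccepted and compareScriptSupport are hypothetical names, and the sc/scx results are whatever the host engine reports, so treat the output as an observation rather than a conformance verdict.

```javascript
// Hypothetical helper: true if the engine compiles the pattern in Unicode mode.
function isAccepted(source) {
  try {
    new RegExp(source, "u");
    return true;
  } catch {
    return false;
  }
}

// Hypothetical helper: compares acceptance of one value under Script (sc)
// and under Script_Extensions (scx).
function compareScriptSupport(value) {
  return {
    value,
    sc: isAccepted(`\\p{sc=${value}}`),
    scx: isAccepted(`\\p{scx=${value}}`),
  };
}

// Latin is included as an ordinary script value for comparison.
for (const value of ["Latin", "Unknown", "Zzzz", "Katakana_Or_Hiragana", "Hrkt"]) {
  const { sc, scx } = compareScriptSupport(value);
  // "consistent" means the value is accepted for both properties or for neither.
  console.log(value, { sc, scx, consistent: sc === scx });
}
```

A row with consistent: false would point at an engine that treats a value differently for Script and Script_Extensions.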
So I think we're missing test coverage for the first case (to be fixed by tc39/test262#4473), but also possibly a note in the algorithm for UnicodePropertyValueExpression :: UnicodePropertyName = UnicodePropertyValue.