Incorrect transform result for /(?i:\u1C89)/ #106

@JLHwung


Current output:

/(?i:\u1C89)/

Expected output:

/(?:[\u1C89\u1C8A])/
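As a rough illustration of the expected transform, the sketch below expands a single code point into a character class using a hand-written table of case pairs. The `casePairs` map and the `expandCaseInsensitive` and `toEscape` helpers are hypothetical names invented for this example; this is not the project's actual implementation, and the table lists only the pair discussed in this issue.

```javascript
// Hypothetical table: code point -> other code points it should match under /i.
// Only the pair from this issue is listed.
const casePairs = new Map([
  [0x1C89, [0x1C8A]],
  [0x1C8A, [0x1C89]],
]);

// Format a BMP code point as a \uXXXX escape.
const toEscape = (codePoint) =>
  '\\u' + codePoint.toString(16).toUpperCase().padStart(4, '0');

// Expand one code point into a non-capturing group holding a character class.
// Code points without a known case pair are returned as a plain escape.
function expandCaseInsensitive(codePoint) {
  const all = [codePoint, ...(casePairs.get(codePoint) ?? [])];
  all.sort((a, b) => a - b);
  if (all.length === 1) return toEscape(codePoint);
  return `(?:[${all.map(toEscape).join('')}])`;
}

console.log(expandCaseInsensitive(0x1C89)); // (?:[\u1C89\u1C8A])
```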

Context: U+1C89 (CYRILLIC CAPITAL LETTER TJE) and U+1C8A (CYRILLIC SMALL LETTER TJE) were introduced in Unicode 16.0, released this year. According to UnicodeData.txt and SpecialCasing.txt, both should be matched by /\u1C89/i. However, because Node.js 22 does not support them in toUpperCase, they are incorrectly categorized as Unicode mappings:

if (
    // TODO: Make this not depend on the engine in which this build script
    // runs. (If V8 has a bug, then the generated data has the same bug.)
    !RegExp(String.fromCodePoint(from), 'i').test(String.fromCodePoint(to))
) {
    extend(filteredMappings, from, to);

To address this, we can add UnicodeData.txt and SpecialCasing.txt to node-unicode-data and replace the engine test with a result derived solely from the latest UCD.
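A minimal sketch of what an engine-independent check might look like, assuming the data comes from UnicodeData.txt, where field 0 is the code point and field 12 is the simple uppercase mapping. The two sample records are written in that file's format, and `parseSimpleUppercase` and `matchesIgnoringCase` are hypothetical helpers, not existing node-unicode-data APIs. The sketch ignores SpecialCasing.txt and the additional rules that the ECMAScript Canonicalize operation applies, so it is a simplification rather than a drop-in replacement.

```javascript
// Two sample records in UnicodeData.txt format (semicolon-separated fields).
const sample = [
  '1C89;CYRILLIC CAPITAL LETTER TJE;Lu;0;L;;;;;N;;;;1C8A;',
  '1C8A;CYRILLIC SMALL LETTER TJE;Ll;0;L;;;;;N;;;1C89;;1C89',
].join('\n');

// Build a map from code point to its simple uppercase mapping (field 12).
function parseSimpleUppercase(text) {
  const upper = new Map();
  for (const line of text.split('\n')) {
    if (line === '') continue;
    const fields = line.split(';');
    const uppercase = fields[12];
    if (uppercase) {
      upper.set(parseInt(fields[0], 16), parseInt(uppercase, 16));
    }
  }
  return upper;
}

// Hypothetical stand-in for the engine test: two code points match under /i
// if they canonicalize to the same uppercase code point.
function matchesIgnoringCase(upper, from, to) {
  const canonicalize = (codePoint) => upper.get(codePoint) ?? codePoint;
  return canonicalize(from) === canonicalize(to);
}

const upper = parseSimpleUppercase(sample);
console.log(matchesIgnoringCase(upper, 0x1C89, 0x1C8A)); // true
```

Because the answer comes only from the parsed data, it does not change with the version of V8 that runs the build script.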
