UDL Operator can't recognize unicode characters #12161

@byzod

Description of the Issue

The UDL operator list accepts Unicode characters as valid input, but they are not actually highlighted.

Steps to Reproduce the Issue

  1. Create a new language template, add = * × ; ； to Operators 1, and give it a color so the result is easy to see
  2. Open a file with this content:
    123*456=789
    123×456=789
    123;456=789
    123；456=789
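
For anyone reproducing this, the test file can be generated so that the two non-ASCII characters are guaranteed to be the intended code points. This is only a sketch: the file name is hypothetical, and UTF-8 is an assumption, since the report does not say which encoding the original file used.

```python
from pathlib import Path

# The four sample lines from the steps above. The non-ASCII characters are
# written as escapes so they cannot be silently replaced by look-alikes.
SAMPLE_LINES = [
    "123*456=789",
    "123\u00d7456=789",   # U+00D7 MULTIPLICATION SIGN
    "123;456=789",
    "123\uff1b456=789",   # U+FF1B FULLWIDTH SEMICOLON
]


def write_sample(path: str = "udl_unicode_test.txt") -> Path:
    """Write the sample lines as UTF-8 (assumed encoding, hypothetical name)."""
    target = Path(path)
    target.write_text("\n".join(SAMPLE_LINES) + "\n", encoding="utf-8")
    return target


if __name__ == "__main__":
    print(write_sample())
```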

Expected Behavior

All characters except the digits are highlighted with the style set for Operators 1.

Actual Behavior

Only * = ; are highlighted; × and ； are not.
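
One observation that may help with diagnosis: the operators that do get highlighted are exactly the ASCII ones. The report does not establish a cause, so the sketch below only lists each operator's code point and its UTF-8 bytes (UTF-8 being an assumption about the file encoding).

```python
# Operators entered in Operators 1, with the non-ASCII ones as escapes.
OPERATORS = ["=", "*", "\u00d7", ";", "\uff1b"]


def describe(ch: str) -> dict:
    """Return the code point, ASCII flag, and UTF-8 bytes of one character."""
    return {
        "char": ch,
        "codepoint": f"U+{ord(ch):04X}",
        "ascii": ord(ch) < 128,
        "utf8": ch.encode("utf-8").hex(" "),
    }


if __name__ == "__main__":
    for op in OPERATORS:
        print(describe(op))
```

The two characters that fail are the only ones that need more than one byte in UTF-8, which is consistent with, but does not prove, a byte-oriented comparison somewhere in the operator matching.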

Debug Information

Notepad++ v8.4.5 (64-bit)
Build time : Sep 3 2022 - 04:05:32
Path : C:\App\something\Notepad++\Notepad++.exe
Command Line : "K:\Down\1.tst"
Admin mode : OFF
Local Conf mode : ON
Cloud Config : OFF
OS Name : Windows 10 Home China (64-bit)
OS Version : 21H2
OS Build : 19044.1889
Current ANSI codepage : 936
Plugins :
BetterMultiSelection (1.5)
ComparePlus (1)
CSharpRegexTools4Npp (1.1.2)
EmmetNPP (1.0.2)
HexEditor (0.9.12)
JSMinNPP (1.2006)
mimeTools (2.8)
nppAutoDetectIndent (2.3)
NppConverter (4.4)
NppExport (0.4)
NppTextViz (0.4.2)
PythonScript (2)

More info

The × (0x00D7) is a common math sign, while ； (0xFF1B) is simply the Chinese (fullwidth) form of ;. The UDL .xml file records both correctly, but they don't work. The .xml file reads:

<Keywords name="Operators1">= * &#x00D7; ; &#xFF1B;</Keywords>
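
To confirm that the stored entry itself is fine, the fragment can be parsed with a standard XML parser, which resolves the numeric character references to the intended characters. This is a minimal sketch; splitting on whitespace is an assumption for illustration, not a statement about how Notepad++ tokenizes the list.

```python
import xml.etree.ElementTree as ET

# The Keywords element exactly as quoted in the report.
KEYWORDS_XML = '<Keywords name="Operators1">= * &#x00D7; ; &#xFF1B;</Keywords>'


def operator_tokens(fragment: str) -> list:
    """Parse the fragment and split its text on whitespace (assumed delimiter)."""
    element = ET.fromstring(fragment)
    return (element.text or "").split()


if __name__ == "__main__":
    for token in operator_tokens(KEYWORDS_XML):
        print(token, [f"U+{ord(c):04X}" for c in token])
```

All five operators come back as single characters, so the saved definition matches what was typed in the dialog.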

Labels: udl (Everything related to User Defined Language)