Allow Unicode characters in Selectors
#510
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage diff, main vs. #510:

| Metric   | main | #510 | +/-  |
|----------|------|------|------|
| Coverage | 97%  | 98%  | +1%  |
| Files    | 99   | 100  | +1   |
| Lines    | 3431 | 3558 | +127 |
| Hits     | 3339 | 3484 | +145 |
| Misses   | 92   | 74   | -18  |
Refactored internal `ParserSettings` to convert instance-level properties and methods to static or const members.
Force-pushed from a9ef0c5 to 6b1dbec.
karljj1
left a comment
Looks great :)
Thanks for the review. The PR in its present form represents a major breaking change because it fundamentally alters how format strings are parsed.
Implement proposals from review:
* `SelectorFilterType.Alphanumeric`: alphanumeric characters (upper and lower case), plus '_' and '-'
* `SelectorFilterType.VisualUnicodeChars`: all Unicode characters are allowed in a selector, except 68 non-visual characters:
  * Control Characters (U+0000–U+001F, U+007F)
  * Format Characters (Category: Cf)
  * Directional Formatting (Category: Cf)
  * Invisible Separator
  * Common Combining Marks (Category: Mn)
  * Whitespace Characters (non-glyph spacing)
imprima
left a comment
Excellent



Changes
`Selector`s in `Placeholder`s may now contain most Unicode characters when `ParserSettings.SelectorCharFilter = SelectorFilterType.VisualUnicodeChars`. Disallowed characters are: `{}[]()\.?,`

Merged from "feat: Filter `Selector` chars by allowlist or blocklist" #511:
* class `CharSet`: It represents a set of characters that supports efficient storage and lookup for both ASCII and non-ASCII characters. It is used in the `Parser` as allow list or block list. The speed for parsing `Placeholder`s decreases by ~25% compared to v3.2.0 to v3.6.1.
* `Parser`: Use `CharSet` and handle the defined `FilterType`.
* `ParserSettings`: Re-order members, update internal properties to better align with class `CharSet`.

Example:
`ParserSettings.SelectorCharFilter = SelectorFilterType.Alphanumeric` is the default and allows alphanumeric characters plus `_` and `-`.

Benchmark
After implementing class `CharSet` in `Parser`:
`Parser.ParseFormat("{SomePlaceholder1}{SomePlaceholder2}{SomePlaceholder3}{SomePlaceholder4}{SomePlaceholder5}");`
27% faster
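
The `CharSet` idea described under Changes (cheap lookup for ASCII, a general set for everything else, usable as either an allow list or a block list) can be sketched roughly as follows. This is an illustration only: `SketchCharSet` and `SelectorCheck` are hypothetical names, and the code makes no claim to match the actual `CharSet` implementation in this PR.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch, NOT the library's CharSet class.
// ASCII characters are stored in a flat lookup table,
// all other characters in a hash set.
public sealed class SketchCharSet
{
    private readonly bool[] _ascii = new bool[128];
    private readonly HashSet<char> _nonAscii = new HashSet<char>();

    public SketchCharSet(IEnumerable<char> chars)
    {
        foreach (var c in chars) Add(c);
    }

    public void Add(char c)
    {
        if (c < 128) _ascii[c] = true;
        else _nonAscii.Add(c);
    }

    public bool Contains(char c)
    {
        return c < 128 ? _ascii[c] : _nonAscii.Contains(c);
    }
}

// Hypothetical helpers showing the two ways such a set can be used.
public static class SelectorCheck
{
    // Allow list: every character of the selector must be in the set.
    public static bool PassesAllowList(SketchCharSet allowList, string selector)
    {
        foreach (var c in selector)
            if (!allowList.Contains(c)) return false;
        return true;
    }

    // Block list: no character of the selector may be in the set.
    public static bool PassesBlockList(SketchCharSet blockList, string selector)
    {
        foreach (var c in selector)
            if (blockList.Contains(c)) return false;
        return true;
    }
}
```

With an allow list built from ASCII letters, digits, `_` and `-`, a selector such as `Größe` would be rejected, while a block list built from `{}[]()\.?,` would accept it. That mirrors the difference between the two filter types at a very coarse level; the real `VisualUnicodeChars` filter also excludes the non-visual characters listed in the review commit.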
Resolves #454