Analyzing Unicode properties is critical to text-processing algorithms and Unicode-compatible parsing. At present, Racket supports testing the general category of a character using `char-general-category` and related predicates such as `char-alphabetic?`. However, there are many other Unicode properties (see UAX 44), and Racket has no procedures for inspecting them.
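For concreteness, here is a minimal sketch of what the existing procedures offer. It uses only `char-general-category`, `char-alphabetic?`, and `char-numeric?` from `racket/base`; the helper name `describe-char` is made up for illustration and is not part of any library.

```racket
#lang racket/base

;; describe-char is a hypothetical helper, defined here only to collect
;; what the core library can report about a character today.
(define (describe-char c)
  (list c
        (char-general-category c) ; a symbol such as 'lu, 'nd, or 'zs
        (char-alphabetic? c)      ; Unicode "Alphabetic" property
        (char-numeric? c)))       ; Unicode "Numeric_Type" is not None

;; Print one line per character of a small sample string.
(for ([c (in-string "A5 ")])
  (printf "~s\n" (describe-char c)))
```

This is as far as the built-in procedures go for this kind of analysis. Other UAX 44 properties, such as Script, have no counterpart here, which is the gap this issue is about.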
Currently, analysis of other properties can be added to programs using the Unicode Chars package, but because Racket has Unicode chars and strings as primitive data types, such fundamental analysis tools belong in the core language, not in an external dependency. Additionally, relying on an external library to extend the set of text operations risks a combined set of operations that is out of sync with respect to the version of the Unicode Standard each part follows, which can introduce subtle bugs and incompatibilities.
If it would be helpful, I’d be happy to develop a draft of the extended char/string API, for discussion, and help investigate how best to implement the API.