[Feature Request] Extend set of Unicode property tests in racket/base #3634

Open
tail-reversion opened this issue Jan 18, 2021 · 0 comments

Analyzing Unicode properties is critical to text-processing algorithms and Unicode-compatible parsing. Currently, Racket supports testing the general category of a character with char-general-category and related predicates such as char-alphabetic?. However, Unicode defines many other properties (see UAX 44), and Racket has no procedures for inspecting them.
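For reference, here is a minimal sketch of what racket/base offers today, using only procedures that already exist (char-general-category, char-alphabetic?, char-numeric?, char-whitespace?). The helper name describe-char is hypothetical and only there for illustration; nothing here shows the proposed API, which is still to be drafted.

```racket
#lang racket/base

;; describe-char is a hypothetical helper for illustration only.
;; It collects the results of the property tests that racket/base
;; already provides for a single character.
(define (describe-char c)
  (list c
        (char-general-category c) ; a symbol such as 'lu or 'nd
        (char-alphabetic? c)
        (char-numeric? c)
        (char-whitespace? c)))

;; Print one description per character of a small sample string.
(for ([c (in-string "A9 ")])
  (printf "~s\n" (describe-char c)))
```

Properties beyond these, such as the others listed in UAX 44, are the ones that would need new procedures.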

Other properties can currently be analyzed using the Unicode Chars package, but given that Racket has Unicode chars and strings as primitive data types, such fundamental analysis tools belong in the core language, not in an external dependency. Additionally, relying on an external library to extend the set of text operations risks the core and library operations falling out of sync with respect to the version of the Unicode Standard they follow, which can introduce subtle bugs and incompatibilities.

If it would be helpful, I’d be happy to develop a draft of the extended char/string API, for discussion, and help investigate how best to implement the API.
