Please add APIs to determine the Unicode category of a UTF32 code point #26719

Closed
gafter opened this issue Jul 6, 2018 · 2 comments

@gafter
Member

gafter commented Jul 6, 2018

Please add an API for determining the Unicode category of a code point resulting from a surrogate pair. Presumably I would compute that by starting with

        public static int System.Char.ConvertToUtf32(char highSurrogate, char lowSurrogate)

But there is no way to get the Unicode category of the resulting code point.

The closest thing we have right now is this:

        public static UnicodeCategory System.Char.GetUnicodeCategory(String s, int index)

But that requires that we store the surrogates into a newly created string in order to get the Unicode category. That requires an unacceptably high allocation overhead.

The original problem I am trying to solve is dotnet/roslyn#9731, dotnet/roslyn#13474, and dotnet/roslyn#13560.

@stephentoub
Member

stephentoub commented Jul 6, 2018

dotnet/coreclr#15911 added:

public static UnicodeCategory GetUnicodeCategory(int codePoint)

to CharUnicodeInfo, available in .NET Core 2.1. This was previously an internal method named InternalGetUnicodeCategory.
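
For illustration, here is a minimal sketch of how the two calls might be combined to classify a surrogate pair without allocating a string. It assumes .NET Core 2.1 or later; the class `SurrogateCategoryExample` and the helper `GetCategory` are hypothetical names invented for this example and are not part of any library.

```csharp
using System;
using System.Globalization;

public static class SurrogateCategoryExample
{
    // Hypothetical helper: combines the surrogate pair into a UTF-32 code point,
    // then looks up its category via the int overload. No string is allocated.
    public static UnicodeCategory GetCategory(char highSurrogate, char lowSurrogate)
    {
        // Throws ArgumentOutOfRangeException if the pair is not a valid surrogate pair.
        int codePoint = char.ConvertToUtf32(highSurrogate, lowSurrogate);
        return CharUnicodeInfo.GetUnicodeCategory(codePoint);
    }

    public static void Main()
    {
        // U+1D400 MATHEMATICAL BOLD CAPITAL A is encoded in UTF-16 as D835 DC00.
        Console.WriteLine(GetCategory('\uD835', '\uDC00'));
    }
}
```

The result should match what the string-based overload `CharUnicodeInfo.GetUnicodeCategory(string, int)` returns for the same pair, only without the temporary string.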

@gafter
Member Author

gafter commented Jul 6, 2018

Perfect! Thanks.

@gafter gafter closed this as completed Jul 6, 2018
@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 3.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 16, 2020