Error in ranges for isalpharune generation from UnicodeData.txt - CJK Ideographs #13

Open
mcrouse-chrome opened this issue Mar 4, 2021 · 0 comments


tl;dr - isalpharune treats CJK Ideographs as non-alphabetic.

CJK ideographs are specified in UnicodeData.txt (9.0.0) as a First/Last pair:

12018 4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;
12019 9FD5;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;

However, the generated table (produced via awk) records these two entries as singles instead of as the endpoints of one range. This means that isalpharune() returns false for every rune in that range other than 4E00 and 9FD5 themselves.

My best guess is that the awk script only forms a range when the hex codes of the two entries have something in common, and these two do not (4E00 vs 9FD5).
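To illustrate the expected behavior, here is a minimal sketch in Python (not the project's awk script) of how a generator could treat `<..., First>` / `<..., Last>` name pairs in UnicodeData.txt as one range. The names `alpha_ranges`, `is_alpha`, and `ALPHA_CATEGORIES` are hypothetical, and the choice of the L* general categories as "alpha" is an assumption that may not match what isalpharune actually uses.

```python
# Assumption: "alpha" means the Unicode letter categories; the real
# generator may use a different set.
ALPHA_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo"}


def alpha_ranges(lines):
    """Yield (first, last) code point pairs for alphabetic entries.

    A "<Name, First>" line followed by a "<Name, Last>" line is
    reported as one range; every other line is reported as a single.
    """
    pending = None  # start of an open "<..., First>" range, if any
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        code, name, category = int(fields[0], 16), fields[1], fields[2]
        if name.endswith(", First>"):
            pending = code  # remember the start, wait for the Last line
            continue
        if name.endswith(", Last>") and pending is not None:
            first, pending = pending, None
        else:
            first = code
        if category in ALPHA_CATEGORIES:
            yield first, code


def is_alpha(ranges, rune):
    """Return True if rune falls inside any (first, last) pair."""
    return any(lo <= rune <= hi for lo, hi in ranges)


sample = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;",
    "9FD5;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;",
]
ranges = list(alpha_ranges(sample))
```

With this handling, a rune in the middle of the block such as U+4E2D falls inside the single range 4E00..9FD5, which is the result I would expect from isalpharune().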
