uniprop needs tests for all unicode properties #195
Comments
Can this be generated somehow instead of done manually?
To generate it, I would probably have to write a script to read the UNIDATA files. It may make more sense to add generation after we already have some tests in place; otherwise, how do you test the generator? More importantly, a large number of these are derived properties, which are set depending on how a bunch of other properties are set, so generating tests for them would be quite hard. Once we have tests for all of them, it may be a good idea to add generation for the properties that are easy to generate for (those that don't rely on a large number of other properties).
I'd like to help out with this. Can I just start adding tests for properties which haven't yet been checked off?
Yep! See S15-unicode-information/uniprop.t for the properties.
Great, thanks! Already looking at it. :-)
Okay, I've now made PR #222. It only changes five properties, so you can make sure I'm on the right track.
Good, thank you :)
Emoji:
See http://unicode.org/reports/tr51/#Data_Files for how they are determined (these are Boolean properties).
The values are stored here: http://unicode.org/Public/emoji/latest/emoji-data.txt
If you haven't seen it before, this is a really great resource for checking which symbols have a property:
http://unicode.org/cldr/utility/properties.html
Emoji_Zwj_Sequences is a property of sequences of codepoints rather than of a single codepoint, so we may need another routine to handle it.
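For the single-codepoint Emoji properties, the data file linked above uses the usual UCD line layout (`codepoint-or-range ; Property # comment`), so test cases could in principle be derived from it. The following is only a rough sketch in Python under that assumption: the helper names `parse_ucd_lines` and `endpoint_cases` are hypothetical, and the sample lines are hand-written in the same layout rather than copied from the real file.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_ucd_lines(lines: Iterable[str]) -> Iterator[Tuple[int, int, str]]:
    """Yield (first, last, property) for each data line.

    Assumes the layout 'XXXX ; Prop # comment' or 'XXXX..YYYY ; Prop # comment'.
    """
    for raw in lines:
        data = raw.split("#", 1)[0].strip()  # drop trailing comment
        if not data:
            continue  # blank line or comment-only line
        fields = [f.strip() for f in data.split(";")]
        cp_range, prop = fields[0], fields[1]
        if ".." in cp_range:
            first, last = cp_range.split("..")
        else:
            first = last = cp_range
        yield int(first, 16), int(last, 16), prop


def endpoint_cases(lines: Iterable[str]) -> List[Tuple[int, str, bool]]:
    """Build (codepoint, property, expected) triples for the endpoints of each range."""
    cases = []
    for first, last, prop in parse_ucd_lines(lines):
        for cp in sorted({first, last}):
            cases.append((cp, prop, True))
    return cases


# Hand-written sample in the emoji-data.txt layout (hypothetical excerpt).
SAMPLE = """\
# sample lines
0023          ; Emoji                # hash sign
0030..0039    ; Emoji                # digit zero..digit nine
1F600         ; Emoji_Presentation   # grinning face
""".splitlines()

for cp, prop, expected in endpoint_cases(SAMPLE):
    print(f"U+{cp:04X} {prop} {expected}")
```

Checking only the endpoints of each range keeps the generated test list short. Sequence properties such as Emoji_Zwj_Sequences would not fit this shape, since each entry there covers several codepoints.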
Numeric Properties
String Properties
Miscellaneous Properties
Name_Alias and Script_Extensions can hold multiple values. It has not yet been determined how we will access them once they are added to some backend.
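As an illustration of what "multiple values" means for testing, here is one possible shape for such a lookup, sketched in Python: each codepoint maps to a set of values, and a test checks membership rather than equality with a single string. This is not how any backend does it. The function name `parse_multi_valued` is hypothetical, and the sample lines are hand-written with illustrative values, assuming a `range ; space-separated values # comment` layout.

```python
from typing import Dict, FrozenSet, Iterable


def parse_multi_valued(lines: Iterable[str]) -> Dict[int, FrozenSet[str]]:
    """Map each codepoint to the set of values listed for its range."""
    table: Dict[int, FrozenSet[str]] = {}
    for raw in lines:
        data = raw.split("#", 1)[0].strip()  # drop trailing comment
        if not data:
            continue
        cp_range, values = (f.strip() for f in data.split(";", 1))
        first, _, last = cp_range.partition("..")
        value_set = frozenset(values.split())
        # A single codepoint has no '..', so 'last' is empty and falls back to 'first'.
        for cp in range(int(first, 16), int(last or first, 16) + 1):
            table[cp] = value_set
    return table


# Hand-written sample; the values are illustrative, not taken from the real data.
SAMPLE = [
    "# sample lines",
    "1CD0..1CD2 ; Beng Deva Gran Knda # illustrative values",
    "0951 ; Deva Latn # illustrative values",
]

scx = parse_multi_valued(SAMPLE)
print(sorted(scx[0x1CD1]))
```

A test written against this shape would assert something like "the result contains Deva" instead of comparing against one expected string.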
Catalog Properties
Enumerated Properties
Binary Properties
Total: 118 + 6 Emoji
Implementation specific properties
These are not official Unicode properties and should not have tests written for them. They are listed here for completeness.