Smaller tables #68
…tf8proc_totitle, as title_mapping cannot be used to get the title codepoint anymore. Rename xxx_mapping to xxx_seqindex, so programs assuming a value with the old meaning fail at compile time
Seems like Travis is failing because of an expired cert.
cc: @staticfloat
The Travis failure will probably be fixed by #69.
It might need a rebase and force push though. Changes to |
Needs a version bump. Since this is technically a backwards-incompatible change, we'll need to change the major version? Maybe we should take the opportunity to export other accessor functions, like We could unexport the |
Is this ready to merge?
Unless you want to increase the version number first for people using the property struct directly. And Unicode 9 was just released. Might be worth checking if it still fits in the tables. Some are almost full. |
Needs a rebase and update now that Unicode 9 support has been merged. |
 * title-case character, if any; otherwise (if there is no title-case
 * variant, or if `c` is not a valid codepoint) return `c`.
 */
UTF8PROC_DLLEXPORT utf8proc_int32_t utf8proc_totitle(utf8proc_int32_t c);
Should this be equivalent to toupper if there is no titlecase?
Awesome, thanks for updating this. Did you have any problems with the table size? |
What's the overall reduction in (compiled) table size from this patch? |
200 k (42%). The sequence array now has 7975 elements (up from 7834). At 8192 it gets troublesome.
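To make the headroom concrete, here is a rough sketch of the arithmetic behind the 8192 figure. It assumes the limit comes from storing a sequence's start index in 13 bits (as the `idx > 0x1FFF` check in the generator suggests); `index_headroom` is a hypothetical helper, not part of the generator script.

```ruby
# Hypothetical helper: how many more slots fit before a start index
# no longer fits in `index_bits` bits. The 13-bit width is an assumption
# based on the 0x1FFF bound used in the generator's range check.
def index_headroom(elements, index_bits)
  capacity = 1 << index_bits   # number of distinct indices, e.g. 8192 for 13 bits
  capacity - elements
end

# Element count quoted in the comment above.
puts index_headroom(7975, 13)
```

With only a couple of hundred slots left, each new Unicode release that adds mappings eats noticeably into the remaining space.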
    lencode = 7
  end
  idx = pushary(array)
  raise "Array index out of bound" if idx > 0x1FFF
This exception triggered in Unicode 14. @benibela, what do you suggest?
@stevengj one bit needs to be taken from lencode (max 7 -> 3) and given to idx (max 0x1FFF -> 0x3FFF, i.e. 13 -> 14 bits), and the same in seqindex_write_char_decomposed.
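The bit shuffle described here can be sketched as follows. This is only an illustration under stated assumptions: that the length code and the index share a single 16-bit value, with the length code in the high bits and the index in the low bits. `pack_seqindex` and `unpack_seqindex` are hypothetical names, not functions from the generator or the C library.

```ruby
# Assumed layout (not confirmed by the thread): one 16-bit value,
#   [ lencode : high (16 - idx_bits) bits | idx : low idx_bits bits ]
# Old split: idx_bits = 13 (idx <= 0x1FFF, lencode <= 7)
# New split: idx_bits = 14 (idx <= 0x3FFF, lencode <= 3)

# Hypothetical helper: combine an array index and a length code.
def pack_seqindex(idx, lencode, idx_bits)
  idx_max = (1 << idx_bits) - 1
  len_max = (1 << (16 - idx_bits)) - 1
  raise ArgumentError, "Array index out of bound" if idx > idx_max
  raise ArgumentError, "Length code out of bound" if lencode > len_max
  idx | (lencode << idx_bits)
end

# Hypothetical helper: split a packed value back into [idx, lencode].
def unpack_seqindex(value, idx_bits)
  [value & ((1 << idx_bits) - 1), value >> idx_bits]
end

# Index 8192 overflows the 13-bit split but fits the 14-bit one.
packed = pack_seqindex(8192, 3, 14)
p unpack_seqindex(packed, 14)
```

The trade-off is that the length code loses a bit, so the writer and the reader have to agree on the same split, which is why the matching change in seqindex_write_char_decomposed is needed as well.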
Could you make a PR?
See #67