Upgrade to Unicode 9.0.0 #171

Closed
slevithan opened this Issue Apr 11, 2017 · 2 comments

Owner

slevithan commented Apr 11, 2017

From http://www.unicode.org/versions/Unicode9.0.0/:

Unicode 9.0 adds exactly 7,500 characters, for a total of 128,172 characters. These additions include six new scripts and 72 new emoji characters.

The categories, scripts, blocks, and properties addons will all need to be updated. Steps:

  • Delete files in tools/data/ for the old version (8.0.0).
  • Update the Unicode version number in tools/generate-unicode.sh.
  • Manually update the code points and ranges for the handful of properties in tools/scripts/property-regex.py, using data that can be generated with UnicodeSet (see comments here).
  • Run tools/generate-unicode.sh.
  • Use the generated data in tools/output/ to update the Unicode addon JS files in src/addons/.

Also update the "Uses Unicode 8.0.0" comments in README.md, unicode-blocks.js, unicode-categories.js, unicode-properties.js, and unicode-scripts.js.
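
The comment update in the last step could be scripted rather than done by hand. This is only a rough sketch: `bumpUnicodeComment` and `updateFiles` are hypothetical helpers that are not part of the repository, and the file paths in the usage note are assumed from the directories named above.

```javascript
const fs = require('fs');

// Hypothetical helper (not in the repo): rewrite "Uses Unicode <old>" to
// "Uses Unicode <new>" wherever it appears in the given text.
function bumpUnicodeComment(text, oldVersion, newVersion) {
  // Escape regex metacharacters so the dots in "8.0.0" match literally.
  const escaped = oldVersion.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const pattern = new RegExp('Uses Unicode ' + escaped, 'g');
  // A replacer function avoids any special handling of "$" in the new text.
  return text.replace(pattern, () => 'Uses Unicode ' + newVersion);
}

// Hypothetical helper: apply the rewrite to each file, writing only on change.
function updateFiles(paths, oldVersion, newVersion) {
  for (const path of paths) {
    const before = fs.readFileSync(path, 'utf8');
    const after = bumpUnicodeComment(before, oldVersion, newVersion);
    if (after !== before) {
      fs.writeFileSync(path, after);
    }
  }
}
```

It might then be called as `updateFiles(['README.md', 'src/addons/unicode-blocks.js'], '8.0.0', '9.0.0')`, with the remaining addon files added to the list; the exact paths are an assumption.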

Collaborator

mathiasbynens commented Apr 11, 2017

It might be worth considering rewriting the build scripts to consume the node-unicode-data packages, e.g. https://github.com/mathiasbynens/unicode-9.0.0, directly. This would greatly simplify things and could reduce the manual work required for such updates in the future.

Owner

slevithan commented Apr 12, 2017

If nothing else, it might be easier to get the data for the handful of properties listed in tools/scripts/property-regex.py from https://github.com/mathiasbynens/unicode-9.0.0/tree/master/Binary_Property than from UnicodeSet.
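
As a rough illustration of what consuming that data might look like, the sketch below collapses a flat list of code points into contiguous ranges, which is roughly the manual step for the properties in tools/scripts/property-regex.py. `toRanges` and `formatRange` are hypothetical names, the hex output format is illustrative only and not the format the addon files use, and the `require` path in the comment is an assumption about the package layout.

```javascript
// Assumed usage (package layout not verified here):
// const codePoints = require('unicode-9.0.0/Binary_Property/Alphabetic/code-points');

// Hypothetical helper: collapse code points into sorted [start, end] ranges.
function toRanges(codePoints) {
  // De-duplicate and sort numerically before merging neighbours.
  const sorted = Array.from(new Set(codePoints)).sort((a, b) => a - b);
  const ranges = [];
  for (const cp of sorted) {
    const last = ranges[ranges.length - 1];
    if (last && cp === last[1] + 1) {
      last[1] = cp; // extends the current range by one code point
    } else {
      ranges.push([cp, cp]); // starts a new single-code-point range
    }
  }
  return ranges;
}

// Hypothetical helper: format a range as uppercase hex, padded to 4+ digits.
function formatRange([start, end]) {
  const hex = (n) => n.toString(16).toUpperCase().padStart(4, '0');
  return start === end ? hex(start) : hex(start) + '-' + hex(end);
}

// Small hand-written sample standing in for real property data.
const sample = [0x41, 0x42, 0x43, 0x61, 0x1F600];
console.log(toRanges(sample).map(formatRange).join(' '));
```

Keeping the range merging separate from the formatting would let the same data feed whatever representation the build scripts end up emitting.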
