Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ICU package improvements #336

Closed
6 of 7 tasks
springmeyer opened this issue Feb 1, 2017 · 10 comments
Closed
6 of 7 tasks

ICU package improvements #336

springmeyer opened this issue Feb 1, 2017 · 10 comments

Comments

@springmeyer
Copy link
Contributor

springmeyer commented Feb 1, 2017

Shortlist of things needed for the ICU package to work well for both mbgl-native and mapnik usage.

Current status is that the ICU 55.1 is tested and works with Mapnik while the ICU 58.0 package is tested and works for MBGL. Ideally we'd test and confirm a single package that works for both important downstream apps. And then rebuild our boost_libregex_icu package against it.

TODO:

  • Trim the .dat size (using https://ssl.icu-project.org/datacustom/) to only include:
  • Change -DUCONFIG_NO_BREAK_ITERATION=1 to -DUCONFIG_NO_BREAK_ITERATION=0 to allow Mapnik line breaking to work
    • Confirm this does not impact binary size of mbgl once libicuuc is linked
  • Investigate if we can disable more parts of the icu code (and if disabling saves on meaningful binary size) that are unneeded by both Mapnik and MBGL:
    • -DUCONFIG_NO_FORMATTING=1?
    • -DUCONFIG_NO_TRANSLITERATION=1?
    • -DUCONFIG_NO_REGULAR_EXPRESSIONS=1?

/cc @ChrisLoer @artemp

@ChrisLoer
Copy link
Contributor

👍

When I started working on arabic shaping, I was using a convenience function introduced in ICU 58, ubidi_transform, but as I fleshed out the ICU functionality I stopped depending on that function. So in theory, if there were any other compatibility concern, I think MBGL could go back to 55.1. On the other hand, newer is better, right?

@springmeyer
Copy link
Contributor Author

springmeyer commented Feb 6, 2017

I'm working on this today and so far have found:

  • We were not passing -O3. Just passing this triggers more compiler optimizations and smaller binaries, at least on OS X. On OS X this dropped the binary of libicuuc.a from 3.4 -> 2.4
  • The CPPFLAGS may not be being picked up. On OS X at least I found I needed to export them in order for them to be respected by the ICU build. After exporting CPPFLAGS I found that setting -DUCONFIG_NO_BREAK_ITERATION=1 dropped the binary size from 2.4 MB to 2.2 MB.

Next steps:

  • Test a build in mbgl with -DUCONFIG_NO_BREAK_ITERATION=1 removed and ensure libmgbl-core does not increase in size by .2 MB (since we hope the unused break iteration code will be stripped).

@springmeyer
Copy link
Contributor Author

The CPPFLAGS may not be being picked up. On OS X at least I found I needed to export them in order for them to be respected by the ICU build.

Also not seeing CPPFLAGS being picked up on linux: https://travis-ci.org/mapbox/mason/jobs/199016774#L2373

@springmeyer
Copy link
Contributor Author

Also noticing -DU_CHAR_TYPE is set in the CPPFLAGS but I think the define is -DUCHAR_TYPE in ICU per https://ssl.icu-project.org/apiref/icu4c/umachine_8h_source.html.

@springmeyer
Copy link
Contributor Author

Test a build in mbgl with -DUCONFIG_NO_BREAK_ITERATION=1 removed and ensure libmgbl-core does not increase in size by .2 MB (since we hope the unused break iteration code will be stripped).

Confirmed. While re-enabling break iteration increases libicuuc.a by .2 MB it does not impact the final linked size of mbgl. I determined this by building mbgl node bindings like BUILDTYPE=Release make node normally with existing ICU unchanged. I found that there were 87 symbols from the ICU C API:

$ nm build/macos/Release/mbgl-node.node | grep _u_ | wc -l
87

And only 2 from the C++ API:

nm build/macos/Release/mbgl-node.node | grep icu | wc -l
2

These symbols resulted in a binary size of 4.7 MB:

ls -lh build/macos/Release/mbgl-node.node
-rwxr-xr-x 1 dane staff 4.7M Feb  6 15:11 build/macos/Release/mbgl-node.node

After changing out the mason_packages/osx-x86_64/icu/58.1/ for my own build with -DUCONFIG_NO_BREAK_ITERATION=1 and running make clean and rebuilding I found the results were the same: no increase in visible symbols and no increase in binary size.

@springmeyer
Copy link
Contributor Author

-DUCONFIG_NO_FORMATTING=1?
-DUCONFIG_NO_TRANSLITERATION=1?
-DUCONFIG_NO_REGULAR_EXPRESSIONS=1?

Found I could also disable these and save another .1 MB. The main ones needed for Mapnik are -DUCONFIG_NO_COLLATION=0 and -DUCONFIG_NO_BREAK_ITERATION=0 because boost_regex needs them.

Next step now is to shrink the data library size, but am currently blocked on this because the customizer is not available for boost 1.58 (only 1.57 and earlier) - refs https://ssl.icu-project.org/trac/ticket/12835

@ChrisLoer
Copy link
Contributor

@boundsj Fixing the mistakes @springmeyer found could potentially slim down the iOS 3.5 SDK size, although I think it'll be a pretty small difference.

@springmeyer
Copy link
Contributor Author

springmeyer commented Feb 10, 2017

Update on my progress: still blocked by the inability to shrink the .dat size. I tried using the custom tools in https://github.com/nodejs/node/tree/master/tools/icu. I got them working over in icu-fixes...shrink-icu-data. However they strip out the break iterator data and many needed locales, and it is not obvious to me how to configure those tools to keep these. So, https://ssl.icu-project.org/datacustom/ still is the only way to shrink the data file, and its still not available for 58.1. I can't ship Mapnik binaries without it because the .dat would swell from 8 MB to 25 MB. So the only remaining option I can think of is to create an icu 57 package. So, when I get time I'll copy the 58 package and modify to make it work with 57.

@ChrisLoer
Copy link
Contributor

@brunoabinader Made a great change to MBGL so that it's no longer dependent on having UCHAR_TYPE defined to char16_t (mapbox/mapbox-gl-native#8252). So at least on the MBGL side, we should be able to remove all the configuration code that specifies char type. Not sure how that will interact with other users of the library.

@springmeyer
Copy link
Contributor Author

Thanks @ChrisLoer - not having to customize sounds ideal. Overall update on this ticket (and general icu packages) is:

  • icu 55.1 and 58.1 packages both had major problems (lack of optimization flags, massive .dat file bundled in the download)
  • Mapnik works with the existing 55 and 58 packages, but cannot be deployed to production with them because of the data library size.
  • ICU 58 was shipped without any ability to reduce the data library size due to https://ssl.icu-project.org/trac/ticket/12835
  • So icu 57 is the latest viable target for Mapnik at this time, until the icu project offers some way to reduce the data library size. Therefore, I've added an icu 57 package in Mapnik dep updates #390.
  • @jfirebaugh wanted -Os for mapbox-gl so we now also have an -Os variant of icu 58 after Add icu built with -Os [skip ci] #383

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants