Only pre-initialize ICU when using Windows Kit ICU #5011
chakrabot merged 1 commit into chakra-core:master from jackhorton:icu/init-windows-kit
|
@jefgen the docs say u_init can be called multiple times with no effect, however @srl295 mentions in nodejs/node#13053 that this is not the case. If u_init can't be guaranteed to no-op on the nth call for n > 1, we may need to look into cases where Intl can get initialized twice, such as across different iframes or node-land |
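If repeated initialization did turn out to be a problem, one way to guard against it would be a once-only wrapper around whatever call performs the initialization. The following is a minimal sketch, not ChakraCore code: `OneTimeInit` and its `ensure` method are hypothetical names, and the callback stands in for the real initialization call.

```cpp
#include <cassert>
#include <functional>
#include <mutex>

// Hypothetical helper, not part of ChakraCore or ICU: runs `init` at most
// once per OneTimeInit instance, however many callers ask for it. A
// process-wide guard would hold a single static instance.
class OneTimeInit {
public:
    // Returns true only for the call that actually ran `init`.
    bool ensure(const std::function<void()>& init) {
        assert(init && "init callback must be callable");
        bool ranNow = false;
        std::call_once(flag_, [&] {
            init();         // e.g. the ICU initialization call
            ranNow = true;  // only set on the one call that ran init
        });
        return ranNow;
    }

private:
    std::once_flag flag_;
};
```

Each context (an iframe, or an embedder such as node) would call `ensure` on the shared instance; only the first call does any work.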
|
As you noted in the comments in the code change, we don't expose the API Edit: If node doesn't ship with any data file (and only provides it via the udata_setCommonData function), then this error would make more sense. |
|
That is correct: we were getting back U_FILE_ACCESS_ERROR in node when the small data was used and set with udata_setCommonData. There is no particularly good way to plumb through whether node has decided to use small data when compiling and running ChakraCore, so I am happy to keep this code specific to Windows Kit ICU.
|
@jackhorton I said:
The question was whether you could call
Hope this helps. |
|
Ah, I understand -- sorry for the confusion. That makes things quite a bit easier on our end. Thanks! |
|
Does the "small-icu" data for Node include any converters (code-pages)? If not, then U_FILE_ACCESS_ERROR returned from u_init might make sense, as it is essentially trying to load the list of converters.
|
Thanks for the reply @srl295 ! 👍 |
// to Windows 10 ICU because we know udata_setCommonData will not be called before then (since it is not allowed). This
// change primarily exists to guard against OOMs in ICU being treated as unique bugs by Crawler, so it really only *needs*
// to be enabled when we are using Windows 10 ICU, anyways.
// TODO(jahorto): investigate why u_init can't be called after udata_setCommonData
Not sure I understand this. u_init should be called after udata_setCommonData (that's the design and what Node.js does). The U_FILE_ACCESS_ERROR may be something wrong with the data/configuration. Try compiling with -DUDATA_DEBUG=1 and make sure your terminal's scrollback buffer is cranked way up…
Unless I am missing something, I didn't see where node called u_init in the same path as calling udata_setCommonData, so I assumed it was intentional.
|
Note: I still think that true "out-of-memory" errors when loading data should be reported as such, rather than being converted into other errors like
FYI: I filed an ICU ticket for this: http://bugs.icu-project.org/trac/ticket/13712
|
Actually, the u_init docs say:
Yes. That's a good point though. Could you try your test but with a full ICU dataset? Definitely if you have a 'stubdata dataset' (the 0-length data) you might get a file access error. |
|
So Node-ChakraCore with full-icu works properly without this change. A step further, this issue is definitely with calling u_init on the data file in small-icu -- this is with the released version of node-v8 8.11.1.
~/dev/node-cc2 [v8.x] $ node --version
v8.11.1
~/dev/node-cc2 [v8.x] $ node -pe process.versions
{ http_parser: '2.8.0',
node: '8.11.1',
v8: '6.2.414.50',
uv: '1.19.1',
zlib: '1.2.11',
ares: '1.10.1-DEV',
modules: '57',
nghttp2: '1.25.0',
openssl: '1.0.2o',
icu: '60.1',
unicode: '10.0',
cldr: '32.0',
tz: '2017c' }
~/dev/node-cc2 [v8.x] $ node --icu-data-dir=./deps/icu-small/source/data/in
node: could not initialize ICU (check NODE_ICU_DATA or --icu-data-dir parameters)
~/dev/node-cc2 [v8.x] $ node --icu-data-dir=./deps/icu/source/data/in
> new Intl.DateTimeFormat().format()
'4/18/2018'
Then, I think the error result from u_init in the small-icu case is misleading for your purposes. (the u_init API docs kind of cover this, but as I said they could be more explicit for the 'customized data' case.) A better test might be ensuring that the error is something like |
|
We explicitly fail for U_FILE_ACCESS_ERROR because that is what u_init reports for Windows Kit ICU in a low-memory scenario, when it can't mmap the data file. So we don't want to ignore that error in this specific use case, except possibly when we are in node; but as I mentioned above, we don't currently plumb through any information about whether ChakraCore is building or running in the context of node. I'd rather ignore all u_init errors for non-Windows Kit ICU, because they will be caught by later status code checks anyway, so ignoring them doesn't put the engine in a bad state. With that being said, does ulocdata_getCLDRVersion actually load the data file? In other words, could we use it instead of u_init for this specific use case of ensuring that the data is in memory before continuing?
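The policy described in this comment can be sketched as a small decision function. This is one plausible reading of the comment, not the code in the PR: `IcuFlavor`, `InitOutcome`, and `classifyInitStatus` are hypothetical names, and the integer constants are local stand-ins that mirror ICU's `UErrorCode` values so the sketch compiles without ICU headers.

```cpp
#include <cassert>

// Stand-ins mirroring UErrorCode values from <unicode/utypes.h>; real code
// would include that header and use the ICU names directly.
constexpr int kZeroError = 0;              // U_ZERO_ERROR
constexpr int kFileAccessError = 4;        // U_FILE_ACCESS_ERROR
constexpr int kMemoryAllocationError = 7;  // U_MEMORY_ALLOCATION_ERROR

// Hypothetical types for illustration only.
enum class IcuFlavor { WindowsKit, Other };
enum class InitOutcome { Continue, ReportOutOfMemory, FailFast };

// Decide what to do with the status returned by the pre-initialization call.
InitOutcome classifyInitStatus(IcuFlavor flavor, int status) {
    // Non-Windows Kit ICU: ignore the result; later status checks catch errors.
    if (flavor != IcuFlavor::WindowsKit) {
        return InitOutcome::Continue;
    }
    // Same test as U_SUCCESS: zero and (negative) warnings count as success.
    if (status <= kZeroError) {
        return InitOutcome::Continue;
    }
    // On Windows Kit ICU a failed mmap of the data file surfaces as a file
    // access error, so it is treated like a true allocation failure here.
    if (status == kMemoryAllocationError || status == kFileAccessError) {
        return InitOutcome::ReportOutOfMemory;
    }
    // Any other failure is unexpected.
    return InitOutcome::FailFast;
}
```

Keeping the classification separate from the ICU call makes the Windows Kit special case easy to see and to test in isolation.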
|
Yes, ulocdata_getCLDRVersion actually loads the data file. So if it returns a CLDR version, you're good (should work in small-icu).
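A rough sketch of using the CLDR version lookup as a "is the data loaded?" probe follows. To keep it buildable without ICU, the probe is passed in as a function pointer with the same shape as `ulocdata_getCLDRVersion(UVersionInfo, UErrorCode*)`; `SketchVersionInfo`, `SketchErrorCode`, `VersionProbe`, and `probeCldrVersion` are hypothetical stand-ins, and real code would simply call the ICU function from `<unicode/ulocdata.h>`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Stand-ins so the sketch builds without ICU. In ICU, UVersionInfo is an
// array of four uint8_t, and UErrorCode is an enum with U_ZERO_ERROR == 0.
using SketchVersionInfo = std::uint8_t[4];
using SketchErrorCode = int;

// Same parameter shape as ulocdata_getCLDRVersion.
using VersionProbe = void (*)(SketchVersionInfo, SketchErrorCode*);

// Runs the probe. Returns the dotted version string on success and an
// empty string on failure; `*status` carries the probe's status code.
std::string probeCldrVersion(VersionProbe probe, SketchErrorCode* status) {
    assert(probe != nullptr && status != nullptr);
    SketchVersionInfo version = {0, 0, 0, 0};
    *status = 0;  // ICU functions expect the status to start as success
    probe(version, status);
    if (*status > 0) {  // positive values are failures, as with U_FAILURE
        return std::string();
    }
    char buffer[16];  // "255.255.255.255" plus the terminating NUL
    std::snprintf(buffer, sizeof(buffer), "%d.%d.%d.%d",
                  version[0], version[1], version[2], version[3]);
    return std::string(buffer);
}
```

A non-empty result means the data file was readable, which is all the pre-initialization step needs to know.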
|
Awesome. I will test that out and update this PR if it works for all of these scenarios. Thanks! |
|
Thank you for the idea @srl295! The CLDR version should hopefully always be there, even if there are no converters at all, so this should ideally work even with a super-small data file. |
jefgen left a comment
Using ulocdata_getCLDRVersion should work everywhere. :)
INTL_TRACE("Using CLDR version %d.%d.%d.%d", cldrVersion[0], cldrVersion[1], cldrVersion[2], cldrVersion[3]);
}

AssertOrFailFastMsg(U_SUCCESS(status), "u_init returned non-OOM failure");
nit: This still says u_init. :)
kfarnung left a comment
Verified in node-chakracore, LGTM.
9e99307 to 8193048
8193048 to f99c276
…Kit ICU
Merge pull request #5011 from jackhorton:icu/init-windows-kit
Third attempt at this after #4984 and #5001, but I have actually confirmed that this makes Node-ChakraCore's chakracore-master branch build, launch, and run Intl code successfully once more. This definitely isn't the best solution; however, if we refer to https://github.com/nodejs/node-chakracore/blob/d94b22785fb6ab7dde77cb13b7e95e958e581375/src/node_i18n.cc#L568-L581, we can see that Node somewhat explicitly also does not call u_init if it calls udata_setCommonData.