ICU-22285 omit the gb2312 & big5han collation tailorings by default #2365
./main/tests/collate/src/com/ibm/icu/dev/test/collator/CollationServiceTest.java needs to be changed to remove the tests against big5han and gb2312han:
https://github.com/unicode-org/icu/actions/runs/4410702565/jobs/7728445138#step:7:1
aaaa0ce to 8830a4a
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot
Looks like you have a merge conflict and need to rebase or merge.
8830a4a to c3af7ed
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot
Amended with Java changes, and rebased.
LGTM
(this was merged with me as assignee in less than 24 hours before I had a chance to open and review it)
The docs do not explain how to reset the rules back to include big5han and gb2312han. If the idea is to add a rule such as "+/collations", please document that, and also test it: add that rule to icu/.ci-builds/data-filter.json, then add a test run here to ensure that the collations work:
https://github.com/unicode-org/icu/blob/main/.ci-builds/.azure-pipelines.yml#L62
Ok --> PR #2371
I tested it manually and verified that coll/zh.res is back to its old size, but I don't know how to write a test that works with and without these tailorings in the data, and tells us something useful in CI. Ideas?
Pass a config to the test somehow? There are lots of ways to do that, such as an environment variable at runtime. I think it's worth testing. Otherwise no code anywhere verifies that these collations even work or run to completion; we may as well delete them entirely.
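A minimal sketch of the environment-variable idea, for illustration only. The variable name `ICU_EXPECT_LEGACY_HAN_COLLATIONS` and the class `LegacyHanCollationCheck` are hypothetical and do not exist in ICU. The available collation types are passed in as a plain array so the sketch does not depend on ICU4J; in a real test they would presumably come from the collation keyword values that ICU reports for `zh`.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: checks the legacy Han tailorings against a runtime switch. */
final class LegacyHanCollationCheck {
    // Hypothetical variable name; nothing in ICU defines or reads it.
    static final String ENV_NAME = "ICU_EXPECT_LEGACY_HAN_COLLATIONS";
    static final List<String> LEGACY_TYPES = List.of("big5han", "gb2312han");

    /** Only "1" or "true" (any case) enables the expectation; null or anything else disables it. */
    static boolean expectLegacy(String envValue) {
        return "1".equals(envValue) || "true".equalsIgnoreCase(envValue);
    }

    /** Returns null when the available types match the expectation, otherwise a message. */
    static String verify(String[] availableTypes, boolean expectLegacy) {
        List<String> available = Arrays.asList(availableTypes);
        for (String type : LEGACY_TYPES) {
            boolean present = available.contains(type);
            if (present != expectLegacy) {
                return type + (expectLegacy
                        ? " is missing but was expected"
                        : " is present but was not expected");
            }
        }
        return null;
    }

    public static void main(String[] args) {
        boolean expect = expectLegacy(System.getenv(ENV_NAME));
        // Illustrative stand-in for the types a default (filtered) build might report for zh.
        String[] reportedTypes = {"pinyin", "stroke", "zhuyin", "unihan"};
        String problem = verify(reportedTypes, expect);
        System.out.println(problem == null ? "ok" : problem);
    }
}
```

With this shape, the default CI job would run with the variable unset, and a job built with the extra data-filter rule would set it, so the same test is meaningful in both configurations.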
I might create a follow-up ticket for that.
There is a lot that we don't test. In particular, we don't test each collation tailoring for whether it's there and does something interesting. Doing that would require more data-driven testing, because hardcoding such detailed behaviors would be too brittle.
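To make "data-driven" concrete, here is a rough sketch of what such a check could look like. The expectations live in text lines rather than in code. The line format (`left op right`, with `op` one of `<`, `=`, `>`) and the class `CollationCaseRunner` are assumptions made for this example, not an existing ICU test format. Any collator that can be used as a `Comparator` could be plugged in.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical runner for ordering expectations that are stored as data. */
final class CollationCaseRunner {
    /** One expectation: left sorts before (-1), equal to (0), or after (1) right. */
    static final class OrderCase {
        final String left;
        final String right;
        final int expectedSign;

        OrderCase(String left, String right, int expectedSign) {
            this.left = left;
            this.right = right;
            this.expectedSign = expectedSign;
        }
    }

    /** Parses a line such as "a < b"; the format is an assumption of this sketch. */
    static OrderCase parse(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected 'left op right': " + line);
        }
        int sign;
        switch (parts[1]) {
            case "<": sign = -1; break;
            case "=": sign = 0; break;
            case ">": sign = 1; break;
            default: throw new IllegalArgumentException("Unknown operator: " + parts[1]);
        }
        return new OrderCase(parts[0], parts[2], sign);
    }

    /** Runs every case and returns one message per mismatch (empty list means all passed). */
    static List<String> run(Comparator<? super String> collator, List<OrderCase> cases) {
        List<String> failures = new ArrayList<>();
        for (OrderCase c : cases) {
            int actual = Integer.signum(collator.compare(c.left, c.right));
            if (actual != c.expectedSign) {
                failures.add(c.left + " vs " + c.right
                        + ": expected " + c.expectedSign + ", got " + actual);
            }
        }
        return failures;
    }
}
```

Keeping the expectations in a data file per tailoring would let the data change along with the collation rules without touching test code, which is the brittleness concern raised above.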
… expectations
https://bugs.webkit.org/show_bug.cgi?id=268879
<rdar://problem/122437088>

Reviewed by Yusuke Suzuki and Mark Lam.

These collation types were deprecated for a long time and not available in Chrome / Firefox [1], and finally removed in ICU 74 [2].

[1]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Intl/Locale/getCollations#supported_collation_types
[2]: unicode-org/icu#2365

* JSTests/stress/intl-enumeration.js:
* JSTests/stress/intl-locale-info.js:

Canonical link: https://commits.webkit.org/274236@main
This reduces the size of coll/zh.res from 918660 bytes to 682904 bytes.