Define sort order of "pluralCategories" array #578

Open
anba opened this issue Jun 4, 2021 · 14 comments
Labels
c: numbers Component: numbers, currency, units s: help wanted Status: help wanted; needs proposal champion Small Smaller change solvable in a Pull Request

@anba
Contributor

anba commented Jun 4, 2021

Intl.PluralRules.prototype.resolvedOptions ( ) should define the sort order of the "pluralCategories" array, e.g. ["one", "zero"] or ["zero", "one"]. ICU (and therefore all browsers) returns the array in alphabetical order, i.e. ["few", "many", "one", "other", "two", "zero"].

@ryzokuken ryzokuken added c: spec Component: spec editorial issues editorial Involves an editorial fix labels Jun 4, 2021
@sffc sffc added c: numbers Component: numbers, currency, units s: discuss Status: TG2 must discuss to move forward Small Smaller change solvable in a Pull Request and removed editorial Involves an editorial fix c: spec Component: spec editorial issues labels Jun 4, 2021
@sffc
Contributor

sffc commented Jun 4, 2021

This should be a small normative PR, since it reflects web reality.

@zbraniecki
Member

My preference if I were to ignore web reality would be to order them the way UTS 35 does:

<!ATTLIST pluralRule count (zero | one | two | few | many | other) #REQUIRED >

@ryzokuken
Member

Can we do that? I really like that idea, if we can choose to ignore web reality...

@sffc
Contributor

sffc commented Jun 7, 2021

The set of six plural forms could grow in the future (unlikely but possible).

I think ICU prefers sorting alphabetically because it's predictable and algorithmic. If you have a custom sort order, then you need to have a special code path.

My preference is to reflect web reality instead of introducing a special case sort order.

@zbraniecki
Member

zbraniecki commented Jun 7, 2021

Alphabetical order is a developer paper cut: it makes the API return "one" before "zero", while semantically the order implied by the numerical values is the reverse.

It's an artifact of how the Latin alphabet is sorted and how the categories are named in the Latin alphabet, and it should not impact the API itself.

The ICU4C behavior seems optimal for the API implementation author, not the API user, which I believe is the wrong API design model.

I also don't see it as a risk that, if we change the categories, we'll return a different set. I see it as 0/10 importance. The order being semantically valid is 3/10 for me.

@FrankYFTang
Contributor

  1. I am worried that changing the order from the current web reality may break pre-existing code.
  2. If a fixed order is not useful for developers to depend on, then mandating a specific order will force implementations to spend CPU time on something useless and slow them down unnecessarily.

@ljharb
Member

ljharb commented Jul 1, 2021

Why wouldn't it be useful to depend on? Deterministic ordering is always useful.

@jswalden
Collaborator

jswalden commented Jul 1, 2021

The main reason to order deterministically is to make it impossible for sites to inadvertently depend on implementation-specific behavior. No more and no less.
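
To illustrate the kind of inadvertent dependence described here, the following is a minimal sketch. The helper name `sameCategories` and the expected category list are assumptions made for the example, not part of any spec or library.

```javascript
// Hypothetical helper: compares two category lists as sets, ignoring order.
function sameCategories(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size !== setB.size) return false;
  for (const category of setA) {
    if (!setB.has(category)) return false;
  }
  return true;
}

const { pluralCategories } = new Intl.PluralRules("ru").resolvedOptions();

// Fragile: the result depends on the order an engine happens to return.
const fragile = pluralCategories.join(",") === "one,few,many,other";

// Order-independent: the same check without relying on unspecified ordering.
const robust = sameCategories(pluralCategories, ["one", "few", "many", "other"]);

console.log({ fragile, robust });
```

Code written like the `fragile` check is what a specified order would protect; code written like the `robust` check works under any ordering.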

I doubt any existing code depends on the ordering, myself. At worst it's not more than extremely few (one or two) users.

Aesthetically I like the zero/one/two/etc. ordering. But this is for programmatic consumption. It's not unreasonable to say that users (e.g. UIs) who need the aesthetic ordering can postprocess accordingly. And implementations (save the polyfill) ship alphabetical already.
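
As a rough sketch of the postprocessing mentioned above, a consumer could reorder whatever the engine returns into the UTS 35 order quoted earlier in this thread. The names `UTS35_ORDER` and `toUts35Order` are hypothetical, and placing unknown categories last is an assumption of this example.

```javascript
// Order taken from the UTS 35 DTD line quoted earlier in this thread.
const UTS35_ORDER = ["zero", "one", "two", "few", "many", "other"];

// Hypothetical helper: returns a copy of the list sorted into UTS 35 order.
// Categories not in UTS35_ORDER (should the set ever grow) are placed last.
function toUts35Order(categories) {
  const rank = (category) => {
    const index = UTS35_ORDER.indexOf(category);
    return index === -1 ? UTS35_ORDER.length : index;
  };
  return [...categories].sort((a, b) => rank(a) - rank(b));
}

const arabic = new Intl.PluralRules("ar").resolvedOptions().pluralCategories;
console.log(toUts35Order(arabic));
```

Because the helper copies the array before sorting, the value returned by `resolvedOptions()` is left untouched.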

So it's not a strong preference, but if this is just a set, and using a literal Set is out because of already shipping (not to mention I can't think of anything else that uses Set for stuff like this just yet), the arbitrary ordering of alphabetical/lexicographically sorted seems preferable to me.

@sffc sffc moved this from Priority Issues to Previously Discussed in ECMA-402 Meeting Topics Jul 2, 2021
@sffc
Contributor

sffc commented Jul 2, 2021

2021-07-01 discussion: https://github.com/tc39/ecma402/blob/master/meetings/notes-2021-07-01.md#define-sort-order-of-pluralcategories-array-578

We spent time discussing this issue, but did not reach a conclusion. I would characterize the conversation as generally in favor of specifying an order, but split about 50/50 on whether to do lexicographical order (web reality) or UTS-35 order.

I personally am in the lexicographic camp because:

  1. We should specify an ordering: Whether we like it or not, developers can and do write code that depends on behaviors that are not written down. It is better to formalize all behaviors, list ordering included, than to leave it up to Test262-compliant browsers to break web sites.
  2. Lexicographic ordering is better for computers: This is a set, not a list. If we return an array, the most efficient ordering is lexicographical ordering, where you can perform binary search. Although the list of plural categories is short, the precedent this sets could be useful in future cases where we return longer lists of items (e.g., Intl Enumeration API).
  3. There is no such thing as a human-friendly ordering of plural categories: The one thing we all agreed on is that the plural categories are arbitrary and not tied to any particular numerical value. Although plural category "few" generally refers to a larger number than "one", this is certainly not always the case in every language, since plural categories are complicated. I therefore see arguments involving UTS 35 being a better "human ordering" to be uncompelling.
  4. Lexicographic ordering is web reality: This is of course by far the most compelling reason. In order to change web reality, there needs to be a very compelling case, which I have not seen.
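
The binary search argument in point 2 can be sketched as follows. The helper name `includesSorted` is hypothetical, and the sample array assumes code-unit (default sort) ordering; for a list of six items a plain `includes` call is likely just as practical, so this mainly illustrates the precedent for longer lists.

```javascript
// Hypothetical helper: binary search over an array of strings that is
// sorted in code-unit order (the order produced by sort() with no comparator).
function includesSorted(sorted, target) {
  let lo = 0;
  let hi = sorted.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1;
    if (sorted[mid] === target) return true;
    // The < operator on strings also compares by code units.
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return false;
}

const sortedCategories = ["few", "many", "one", "other", "two", "zero"];
console.log(includesSorted(sortedCategories, "two"));   // true
console.log(includesSorted(sortedCategories, "three")); // false
```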

The next step for whoever takes this issue is to work with the relevant parties offline to find a middle ground that everyone can agree on.

@sffc sffc added s: help wanted Status: help wanted; needs proposal champion and removed s: discuss Status: TG2 must discuss to move forward labels Jul 2, 2021
@sffc sffc added this to the ES 2022 milestone Jul 2, 2021
@FrankYFTang
Contributor

> Lexicographic ordering is web reality: This is of course by far the most compelling reason. In order to change web reality, there needs to be a very compelling case, which I have not seen.

This is NOT true. Lexicographic ordering has NOT been web reality for the return value of this API over the last 5 to 7 years. Chrome 91, Firefox 89, and Safari Version 14.1.1 (16611.2.7.1.4) on my Mac ALL return the results below, which are NOT in lexicographic order:

(new Intl.PluralRules("ar")).resolvedOptions().pluralCategories
["few", "many", "one", "two", "zero", "other"]

(new Intl.PluralRules("he")).resolvedOptions().pluralCategories
["many", "one", "two", "other"]

(new Intl.PluralRules("sl")).resolvedOptions().pluralCategories
["few", "one", "two", "other"]

(new Intl.PluralRules("ru")).resolvedOptions().pluralCategories
["few", "many", "one", "other"]

(new Intl.PluralRules("se")).resolvedOptions().pluralCategories
["one", "two", "other"]

(new Intl.PluralRules("cy")).resolvedOptions().pluralCategories
["few", "many", "one", "two", "zero", "other"]

(new Intl.PluralRules("br")).resolvedOptions().pluralCategories
["few", "many", "one", "two", "other"]

If we need to define an order, I propose we define it as the current web reality, in the following order:
["few", "many", "one", "two", "zero", "other"]
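
One way to describe the order reported above is "code-unit order, except that "other" always comes last". The comparator below is a sketch of that reading; the name `observedOrder` is hypothetical, and whether a given engine still matches it is something to check rather than assume.

```javascript
// Hypothetical comparator for the order reported in this comment:
// plain code-unit order, except that "other" always sorts last.
function observedOrder(a, b) {
  if (a === b) return 0;
  if (a === "other") return 1;
  if (b === "other") return -1;
  return a < b ? -1 : 1;
}

const allCategories = ["zero", "one", "two", "few", "many", "other"];
console.log([...allCategories].sort(observedOrder));
// → ["few", "many", "one", "two", "zero", "other"]

// Check whether the current engine's result for a locale follows this order.
const actual = new Intl.PluralRules("ar").resolvedOptions().pluralCategories;
const matchesObserved =
  actual.join(",") === [...actual].sort(observedOrder).join(",");
console.log(matchesObserved);
```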

@zbraniecki
Member

zbraniecki commented Jul 5, 2021

Response:

  • Lexicographic ordering is better for computers - I disagree with Shane on the importance of that. I think Shane is making an implicit assumption that the operation anyone would want to perform on that set is binary search. I know of no use case that would benefit from that, and building an API for non-existent use cases is suboptimal.
  • There is no such thing as a human-friendly ordering of plural categories - I disagree with Shane on that assumption. I recognize that there are exceptions, but there is a general, semantic order of the categories corresponding to ordered values (numbers). Generally category zero is the "first", and in the only use case we identified (GUI CAT tool) that is where it should be presented, followed by category one and so on.
  • Lexicographic ordering is web reality - as Frank mentioned, I also disagree.

Zibi's position

  1. Use Case Driven - My main point is that there is a single identified use case of this data that is in any way "common". I have no doubt that we can come up with some esoteric theoretical situation in which a particular ordering would allow for a particular microoptimization (which is what I think Shane did in his binary search argument), but a) I have not seen such a use case in any production system and b) we would have to conjure a scenario in which such a system would actually benefit from such a microoptimization. I think (a) is hard, and adding (b) is even more unrealistic.

That single identified use case (GUI CAT tool) benefits from the ordering zero, one, two, few, many, other.

  2. Papercut Driven - Plural categories are hard to understand, unintuitive concepts. In the absence of any strong reasons to optimize for hot paths in runtime software, or highly elegant code for common paths, the value of that API becomes in part exploratory. People will inspect the resolvedOptions and learn what categories a given language works with.
    And in that, I claim that [ "one", "few", "many", "other" ] (proposed) is more intuitive than [ "few", "many", "one", "other" ] (current) for Polish and [ "zero", "one", "two", "few", "many", "other" ] (proposed) is more intuitive than [ "few", "many", "one", "two", "zero", "other" ] (current) for Arabic.

@sffc
Contributor

sffc commented Oct 26, 2021

I will be raising this question tomorrow morning at TC39. Slides:

https://docs.google.com/presentation/d/1tDvpl99axNaZQWm1VItYhztMMj3avV8jc8uvvXQLRI4/edit#slide=id.p

@sffc sffc modified the milestones: ES 2022, ES 2023 Jun 1, 2022
@sffc
Contributor

sffc commented Aug 23, 2023

TG1 notes: https://github.com/tc39/notes/blob/main/meetings/2021-10/oct-26.md#specifying-order-of-lists-returned-from-intl-apis

We should do the following:

  1. Document in the style guide that we should indeed specify the order
  2. Document that the default order should be lexicographic (sorted as if with Array.prototype.sort with a sort function of undefined), but different orders can be proposed on a case-by-case basis when there is a compelling reason
  3. We can consider this particular case of plural operands to be such a case and can move forward with the normative PR to change the sort order.

Note that this should comprise 2 PRs, one editorial in the style guide, and the other normative for PluralRules.
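
For reference, the default order described in item 2 (Array.prototype.sort with no comparator) compares strings by UTF-16 code units and is not locale-aware. A small sketch of what that yields for the six category names, with variable names chosen only for this example:

```javascript
const categories = ["zero", "one", "two", "few", "many", "other"];

// sort() with no comparator converts elements to strings and compares them
// by UTF-16 code units. A copy is sorted so the input is left untouched.
const lexicographic = [...categories].sort();
console.log(lexicographic);
// → ["few", "many", "one", "other", "two", "zero"]

// Code-unit order is case-sensitive: uppercase letters sort before lowercase.
console.log(["b", "a", "B"].sort());
// → ["B", "a", "b"]
```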

@sffc
Contributor

sffc commented Sep 18, 2023

@ben-allen to make the second PR for the PluralRules.

Projects
ECMA-402 Meeting Topics
Previously Discussed
Development

No branches or pull requests

8 participants