Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the Ord impl for LanguageIdentifier / Locale #1215

Closed
3 tasks done
sffc opened this issue Oct 26, 2021 · 8 comments · Fixed by #2142
Closed
3 tasks done

Remove the Ord impl for LanguageIdentifier / Locale #1215

sffc opened this issue Oct 26, 2021 · 8 comments · Fixed by #2142
Assignees
Labels
A-design Area: Architecture or design C-locale Component: Locale identifiers, BCP47 good first issue Good for newcomers help wanted Issue needs an assignee S-small Size: One afternoon (small bug fix or enhancement)

Comments

@sffc
Copy link
Member

sffc commented Oct 26, 2021

Currently, the auto-derived Ord impl of LanguageIdentifier sorts them by struct values, which is close to but not exactly the same as lexical order of the strings. For example, "und" is sorted at the beginning, rather than in the 'u's.

What behavior do we want? Should we keep the default Ord, or should we write a custom Ord that results in the same ordering as sorting by the string values?

Needs input from:

@sffc sffc added question Unresolved questions; type unclear C-locale Component: Locale identifiers, BCP47 A-design Area: Architecture or design needs-approval One or more stakeholders need to approve proposal labels Oct 26, 2021
@Manishearth
Copy link
Member

I don't really have strong opinions, I kinda find that "und" being sorted specially is fine: it is special so it isn't surprising that it is sorted differently. If the status quo were that the derive handled und as a string then I wouldn't argue for an explicit impl that handled it specially, but given that that's already the case I don't feel the need to go to a stringy handling

@zbraniecki
Copy link
Member

Semantically Arabic and German are not quantifiable and comparable. Their stringified serializations are. So then we can argue that und is in the same box and should be alphabetically between t and w.

@dminor
Copy link
Contributor

dminor commented Nov 5, 2021

I don't have a strong opinion, but I think it would be less surprising if und were sorted alphabetically. I was surprised when I first noticed that und was sorted separately.

@sffc sffc self-assigned this Nov 5, 2021
@sffc sffc added this to the ICU4X 0.5 milestone Nov 5, 2021
@sffc sffc removed the needs-approval One or more stakeholders need to approve proposal label Nov 5, 2021
@sffc
Copy link
Member Author

sffc commented Nov 6, 2021

I think we should treat "und" as a language code to be sorted alphabetically, like "mul" and other special codes. Ultimately this will just be less surprising for people.

Marking this as a good first issue!

@sffc sffc added good first issue Good for newcomers help wanted Issue needs an assignee S-small Size: One afternoon (small bug fix or enhancement) and removed question Unresolved questions; type unclear labels Nov 6, 2021
@sffc sffc removed their assignment Nov 6, 2021
@sffc sffc added the v1 label Nov 6, 2021
@sffc sffc removed this from the ICU4X 0.5 milestone Nov 6, 2021
@sffc sffc added the backlog label Nov 6, 2021
@sffc sffc changed the title Decide on Ord behavior of LanguageIdentifier / Locale Change Ord behavior of LanguageIdentifier / Locale to be alphabetical Nov 6, 2021
@sffc sffc added the needs-approval One or more stakeholders need to approve proposal label Nov 6, 2021
@Manishearth
Copy link
Member

Yeah it's fine!

@sffc sffc removed the needs-approval One or more stakeholders need to approve proposal label Nov 6, 2021
@sffc
Copy link
Member Author

sffc commented Mar 31, 2022

OK, so the Ord should be alphabetical for Language and LanguageIdentifier now that #1695 is landed.

However, it still may not be perfect for Locale. The Ord impl for Locale relies on the derived Ord for its entire tree, which works only insofar as everything in the tree is alphabetical. The problem is that it's not:

  1. The Extensions map does not reflect the fact that Other extensions may be not in alphabetical order
  2. There's no handling of cases when a locale with extensions sorts above a locale without extensions; for example, "und-fonipa" should sort before "und-u-foo", even though a nonempty variants array sorts after an empty one.

@sffc sffc added the discuss Discuss at a future ICU4X-SC meeting label Apr 1, 2022
@sffc sffc added this to the ICU4X 1.0 milestone Apr 1, 2022
@sffc sffc removed backlog labels Apr 1, 2022
@sffc
Copy link
Member Author

sffc commented Apr 1, 2022

Discuss: Is it better to ship an Ord which may change, or ship no Ord at all?

@sffc
Copy link
Member Author

sffc commented Jun 3, 2022

Discussion:

  • @robertbastian - We have normalizing_eq and strict_cmp, which are both more clear than an Ord implementation.
  • @zbraniecki - I'm okay not shipping Ord.
  • @Manishearth - The main downside of not having Ord is that people can't put these in maps and things. But you can write your own wrapper.
  • @zbraniecki - We could document what you need to do.
  • @robertbastian - We could have wrapper structs that implement the particular behavior.
  • @sffc - I think it's useful to have Ord for compatibility with BTreeMap, etc. It should use the string-like sorting.
  • @zbraniecki - The fact that "und" sorts in the 'U's is an artifact of BCP-47. Ideally it should sort at the top of the list.
  • @sffc - I think we have three options: (1) keep the Ord impls on both and make them strict; (2) keep it on LanguageIdentifier only; or (3) don't implement Ord and keep this issue open.
  • @zbraniecki - I prefer (3) because there is a philosophical debate and we should not squat Ord if we're not sure what it should do.

Conclusion: Remove both Ord impls and hopefully everything will work. If it doesn't, consider adding wrapper structs as @robertbastian suggested.

@sffc sffc removed the discuss Discuss at a future ICU4X-SC meeting label Jun 3, 2022
@sffc sffc changed the title Change Ord behavior of LanguageIdentifier / Locale to be alphabetical Remove the Ord impl for LanguageIdentifier / Locale Jun 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-design Area: Architecture or design C-locale Component: Locale identifiers, BCP47 good first issue Good for newcomers help wanted Issue needs an assignee S-small Size: One afternoon (small bug fix or enhancement)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants