Locale of constructed object & language negotiation in data loading #173

Closed
sffc opened this issue Jul 10, 2020 · 6 comments · Fixed by #1237
Labels
A-design Area: Architecture or design C-meta Component: Relating to ICU4X as a whole S-epic Size: Major project (create smaller child issues) T-docs-tests Type: Code change outside core library

@sffc
Member

sffc commented Jul 10, 2020

ICU has three types of locales for constructed objects (return value of getLocale() of NumberFormat, for example):

  • Requested locale
  • Valid locale
  • Actual locale

It's a very confusing model, so much so that ICU isn't adding new getLocale() methods.

What model do we want to adopt in ICU4X?

@sffc sffc added T-docs-tests Type: Code change outside core library C-meta Component: Relating to ICU4X as a whole A-design Area: Architecture or design discuss Discuss at a future ICU4X-SC meeting labels Jul 10, 2020
@zbraniecki
Member

In Fluent and Mozilla we use the following lists:

  • Requested Locales (e.g. user requested ["es-CL", "es", "fr"])
  • Available Locales (e.g. we have ["es", "de", "it", "en-GB", "en-US"])
  • Last Fallback Locale (e.g. "en-US")
  • Resolved Locales (e.g. we negotiated it down to ["es", "en-US"])
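To make the four lists concrete, here is a minimal sketch of how a resolved list could be derived from the other three. The function name `negotiate_locales` and the truncation-based matching are assumptions for illustration only; this is not the Fluent or ICU4X API, and real negotiation would use proper locale parsing and likely-subtags data rather than string slicing.

```rust
/// Hypothetical helper (not a real Fluent/ICU4X function): resolve `requested`
/// against `available`, always ending with `last_fallback`.
fn negotiate_locales(requested: &[&str], available: &[&str], last_fallback: &str) -> Vec<String> {
    let mut resolved: Vec<String> = Vec::new();
    for &req in requested {
        // Try the full tag first, then drop trailing subtags: "es-CL" -> "es".
        let mut candidate = req;
        loop {
            if available.contains(&candidate) {
                // Keep each resolved locale only once, in request order.
                if !resolved.iter().any(|r| r.as_str() == candidate) {
                    resolved.push(candidate.to_string());
                }
                break;
            }
            match candidate.rfind('-') {
                Some(idx) => candidate = &candidate[..idx],
                None => break, // nothing left to truncate; no match for this request
            }
        }
    }
    // The last fallback locale always closes the chain.
    if !resolved.iter().any(|r| r.as_str() == last_fallback) {
        resolved.push(last_fallback.to_string());
    }
    resolved
}
```

With the example values from the list above, requesting ["es-CL", "es", "fr"] against the available locales with "en-US" as the last fallback resolves to ["es", "en-US"]. Note that this sketch only truncates; it would not match a requested "en" to an available "en-US".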

For ICU4X the interesting question is about available locales in the scenario where we don't have the same number of locales available per component. For example we may have PluralRules for 110 locales, but NumberFormat data for 94 locales.

So, the negotiation needs to know which components you are asking for:

let resolved = DataProvider::negotiateLocales(requested, &[Component::PluralRules, Component::NumberFormat]);

Then we need to know whether the user wants only locales supported by all requested components (or data chunks), or the best available locale per component. Say we have NumberFormat for es-CL but PluralRules only for es: do we want the returned list to contain ["es-CL", "es"], so that NumberFormat uses es-CL and PluralRules uses es, or do we want the resolved list to contain only es?

In this example it may feel natural to go for more accuracy, since plural rules are per language and not per region, but what if we had:

  • NumberFormat in ["es-CL", "es", "fr", "it", "de"]
  • PluralRules in ["fr", "de"]

Do we then want the resolved list to contain es at all, or just ["fr"]? I think this time we'd want ["fr", "en-US"].
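The two options can be compared side by side with a small sketch. Everything here is hypothetical (the names `negotiate`, `negotiate_common`, `negotiate_per_component` and the `Availability` map are made up for this illustration, and matching is plain subtag truncation); it only shows how the two strategies diverge on the example above.

```rust
use std::collections::BTreeMap;

/// Walk `requested`, truncating subtags ("es-CL" -> "es"), and keep the first
/// candidate accepted by `is_supported`; always end with `last_fallback`.
fn negotiate(requested: &[&str], last_fallback: &str, is_supported: impl Fn(&str) -> bool) -> Vec<String> {
    let mut resolved: Vec<String> = Vec::new();
    for &req in requested {
        let mut candidate = Some(req);
        while let Some(loc) = candidate {
            if is_supported(loc) {
                if !resolved.iter().any(|r| r.as_str() == loc) {
                    resolved.push(loc.to_string());
                }
                break;
            }
            candidate = loc.rfind('-').map(|i| &loc[..i]);
        }
    }
    if !resolved.iter().any(|r| r.as_str() == last_fallback) {
        resolved.push(last_fallback.to_string());
    }
    resolved
}

/// Hypothetical map: component name -> locales it has data for.
type Availability<'a> = BTreeMap<&'a str, Vec<&'a str>>;

/// Strategy A: one shared list, containing only locales every component has.
fn negotiate_common(requested: &[&str], components: &Availability<'_>, last_fallback: &str) -> Vec<String> {
    negotiate(requested, last_fallback, |loc| {
        components.values().all(|avail| avail.contains(&loc))
    })
}

/// Strategy B: a separate list per component, each as specific as possible.
fn negotiate_per_component(
    requested: &[&str],
    components: &Availability<'_>,
    last_fallback: &str,
) -> BTreeMap<String, Vec<String>> {
    components
        .iter()
        .map(|(name, avail)| {
            let list = negotiate(requested, last_fallback, |loc| avail.contains(&loc));
            (name.to_string(), list)
        })
        .collect()
}
```

For a request of ["es-CL", "es", "fr"] with the NumberFormat and PluralRules lists above and "en-US" as the last fallback, strategy A yields ["fr", "en-US"], while strategy B yields ["es-CL", "es", "fr", "en-US"] for NumberFormat and ["fr", "en-US"] for PluralRules. Strategy B is more precise per component but lets two components end up in different languages.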

Finally, once we get to the single instance, it may be tempting to say that an instance of a component is in a single locale, so we don't need a fallback chain, but that only works if the component doesn't depend on other components.

If NumberFormat depends on PluralRules, we want to make sure that they support the same locale, as in the example above - we want to use fr for both.
But MessageFormat may depend on many components. Having a single locale on it means that we either hard fail if one of the dependencies doesn't have that locale, or we hard fall back to the last fallback locale. It may be useful to store the fallback chain so that we have a more graceful error path if some component used by MessageFormat (say, RelativeTimeFormat, DisplayNames, etc.) cannot return a value in es and we want to fall back to the next locale.
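As a rough sketch of what storing the fallback chain could look like: the `Component` and `Formatter` types below are hypothetical stand-ins, not ICU4X types, and "supports" is reduced to a lookup in a static list. The point is only that an object holding the whole chain can step to the next locale instead of failing or jumping straight to the last fallback.

```rust
/// Hypothetical stand-in for a component that has data for some locales only.
struct Component {
    name: &'static str,
    locales: &'static [&'static str],
}

impl Component {
    fn supports(&self, locale: &str) -> bool {
        self.locales.contains(&locale)
    }
}

/// Hypothetical MessageFormat-like object that keeps the whole resolved chain
/// rather than a single locale.
struct Formatter {
    fallback_chain: Vec<String>,
    dependencies: Vec<Component>,
}

impl Formatter {
    /// First locale in the chain that every dependency can serve, if any.
    fn common_locale(&self) -> Option<&str> {
        self.fallback_chain
            .iter()
            .map(String::as_str)
            .find(|loc| self.dependencies.iter().all(|dep| dep.supports(loc)))
    }

    /// Names of the dependencies that cannot serve `locale`, for error reporting.
    fn missing_for(&self, locale: &str) -> Vec<&'static str> {
        self.dependencies
            .iter()
            .filter(|dep| !dep.supports(locale))
            .map(|dep| dep.name)
            .collect()
    }
}
```

With a chain of ["es", "fr", "en-US"], a RelativeTimeFormat that has only fr and en-US, and a DisplayNames that has all three, `common_locale()` would step past es and settle on fr, and `missing_for("es")` would report RelativeTimeFormat as the reason.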

@sffc
Member Author

sffc commented Jul 16, 2020

All great points. Answering the question "what locales do you support?" is really hard.

Maybe we should ask ourselves, what is the primary use case for asking that question? Based on my experience, I think the most common reason people ask that question is because they have custom fallback logic in the case when a locale is not supported. However, with pluggable data providers, we give users a more official way to add that fallback logic. What other use cases are we looking at?

If we don't have any clear use cases, then we could choose to keep locale negotiation exclusively in the data provider object, and not have any "GetLocale" methods on the constructed objects at all.

@sffc sffc changed the title Locale of constructed object Locale of constructed object & language negotiation in data loading Jul 17, 2020
@sffc
Member Author

sffc commented Jul 17, 2020

I want to extend the scope of this issue to include a very closely related topic, which we discussed in yesterday's ICU4X meeting. How do we handle language negotiation when performing data loading, particularly when different categories of data support different locales?

@sffc sffc removed the discuss Discuss at a future ICU4X-SC meeting label Jul 23, 2020
@sffc sffc mentioned this issue Jul 29, 2020
@sffc sffc added this to the 2020 Q4 milestone Oct 30, 2020
@sffc sffc modified the milestones: 2020 Q4, 2021-Q1-m1 Jan 7, 2021
@sffc sffc modified the milestones: 2021-Q1-m1, 2021-Q1-m2 Feb 4, 2021
@sffc sffc modified the milestones: 2021-Q1-m2, ICU4X 0.2 Mar 12, 2021
@sffc sffc modified the milestones: ICU4X 0.2, 2021-Q2-m1 Apr 1, 2021
@mihnita mihnita added the S-large Size: A few weeks (larger feature, major refactoring) label Apr 19, 2021
@sffc sffc modified the milestones: 2021-Q2-m1, ICU4X 0.3 Apr 29, 2021
@sffc sffc modified the milestones: ICU4X 0.3, 2021 Q2-m3 May 13, 2021
@sffc sffc added S-epic Size: Major project (create smaller child issues) and removed S-large Size: A few weeks (larger feature, major refactoring) labels Jun 28, 2021
@sffc
Member Author

sffc commented Jun 28, 2021

Changing this to S-epic because once we agree on the design, we will need to implement this in the data providers.

@mihnita
Contributor

mihnita commented Jun 28, 2021

Posted document in the icu4x-sc / Drop Box

@sffc
Member Author

sffc commented Aug 12, 2021

Action items:

  • Add API that gives a list of suggested locales
  • Share the doc with Markus and Mark Davis
  • Add the design doc to the repository
