Locale Extensions #844

Closed
ben-allen opened this issue Jul 11, 2023 · 6 comments · Fixed by #985
Labels
position: negative; ready to add (Appears ready to add to the table of positions.)

Comments

@ben-allen

Request for Mozilla Position on an Emerging Web Specification

Other information

On the Web platform, content is localized based only on a user's language or region. However, this behavior can result in annoyance, frustration, offense, or even unintelligibility for some users.

Some example situations:

  • Someone traveling overseas sees temperatures in Fahrenheit even though they are more familiar with Celsius.
  • Someone is more familiar with 12-hour time, but Intl.DateTimeFormat is rendering 24-hour time.
  • Someone sets their language dialect to one they can understand, but they prefer dates, times, and numbers to be rendered according to local standards.
  • Someone sees digits in an unfamiliar writing system.

In the native environment these problems do not occur, since users can specify these desired customizations in their system settings. However, the full amount of flexibility allowed for in the native environment is not possible in the potentially hostile web environment. This proposal defines a mechanism for making a limited subset of the Unicode Extensions for BCP 47 available for content negotiation, providing options that address some of the worst problems with incomplete localization while only exposing coarse-grained data about the users who take advantage of these improvements.
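To make the hour-cycle example concrete, here is a minimal sketch of how a Unicode extension keyword in a BCP 47 tag changes `Intl.DateTimeFormat` output today. It only illustrates the existing `Intl` behavior for the `hc` key; it does not show the content-negotiation mechanism the proposal defines, and the date and variable names are chosen purely for illustration.

```javascript
// A fixed instant, formatted in UTC, so the output does not depend on
// when or where this runs.
const when = new Date(Date.UTC(2023, 6, 11, 15, 30));
const options = { hour: 'numeric', minute: '2-digit', timeZone: 'UTC' };

// Plain en-US resolves to a 12-hour clock.
const localeDefault = new Intl.DateTimeFormat('en-US', options);

// The -u-hc-h23 extension keeps the language and region but requests
// a 24-hour clock.
const withExtension = new Intl.DateTimeFormat('en-US-u-hc-h23', options);

console.log(localeDefault.resolvedOptions().hourCycle); // "h12"
console.log(withExtension.resolvedOptions().hourCycle); // "h23"
console.log(withExtension.format(when));                // "15:30"

// Intl.Locale exposes the parsed keyword separately from the base tag.
const parsed = new Intl.Locale('en-US-u-hc-h23');
console.log(parsed.baseName);  // "en-US"
console.log(parsed.hourCycle); // "h23"
```

The site still has to learn the user's preferred tag from somewhere; that missing input is what the proposal is about.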

Read the complete Explainer
Slide deck about Locale Extensions

Feedback

I welcome feedback in this thread, but encourage you to file bugs against the Explainer.

@ben-allen
Author

ben-allen commented Jul 31, 2023

@dminor @hsivonen Would love to hear if this seems like a reasonable strategy to you!

@zcorpan changed the title from "Request for Mozilla Position on Locale Extensions" to "Locale Extensions" on Aug 17, 2023
@ben-allen
Author

I've made substantial revisions to this proposal, which are reflected in the new explainer. I'd love to hear your feedback! @dminor @hsivonen

@dminor

dminor commented Sep 5, 2023

Hi Ben, we're discussing this internally and we hope to get you feedback soon.

@ben-allen
Author

ben-allen commented Sep 8, 2023

Here's the slide set from a talk at TG2 on the proposal as it stands.

@hsivonen
Member

hsivonen commented Feb 2, 2024

Sorry about the delay. Here's my review:

Proposed position: negative.

The use cases have legitimacy, but it's not clear that the importance of the use cases overrides other concerns: primarily fingerprintability and, secondarily, reconciling in implementation the relative role of browser and operating system given that the browser language may not be coupled with the OS language and that operating systems do not consistently provide UI surface for these settings. Third, if we were to expose this information, we should consider if an HTML+CSS-based declarative solution makes more sense, particularly for numbering systems, hour cycle, dates rendered according to a calendar, and amounts with units.

Additional Notes:

  • It's a welcome improvement that the README now documents estimated population for default combinations of fw, hc, and mu. We see that the population for some of the combinations is very small. However, we still don't see anonymity set estimates for the cases of concern: where the fw/hc/mu combination overrides the locale default.

  • The relative importance and implementability of the various aspects of the proposal would be easier to assess if the proposal documented how operating systems currently deal with these settings. The README says: "In the native environment these problems are easily solved, since users can specify their preferences in their system settings.", but this does not appear to apply across all mainstream operating systems. As far as I can tell, h12 vs. h23 hour cycle setting is the only one of these that's consistently available across different mainstream operating systems (Windows 10/11, macOS, GNOME (Ubuntu), Android, iOS/iPadOS; I don't have sufficiently recent Chrome OS at hand). Furthermore, it's non-trivial to check what's supported where: whether UI for choosing a numbering system is shown depends on the system language on Apple platforms, and calendar choices seem to depend on the system language on Windows 10. On Apple platforms, the system-wide calendar setting doesn't support all CLDR calendars. The system-wide language-independent setting on Apple platforms supports only three calendars that only differ by year number and era designation (i.e. they share Gregorian/ISO month and day): Gregorian, Japanese, which is not the CLDR-primary calendar for Japan, and Buddhist, which is the CLDR-primary calendar for Thailand. Do some language/region choices unlock more calendars system-wide (as opposed to, reportedly, within the Calendar app)?

  • Both Microsoft and Apple have redesigned their system preference UIs for this area in recent years (post-Windows 7). Is it known what decisions Microsoft and Apple took based on experience from previous designs? Have they shared characterizations of how and how much users change these preferences (if there is telemetry)?

  • The idea of bundling fw, hc, and mu makes a lot of sense from the fingerprinting perspective. However, combinations other than applying European settings to en-US don’t seem to work nicely in a fingerprinting-resisting but “Just Do What I Mean” way. Some settings are more confusing than others if set to unexpected values. In particular, fw set to an unexpected value can cause bad mistakes with e.g. travel booking. European users have encountered the issue of English-language sites showing Sunday-starting weeks, but U.S. users may not be on guard for the opposite failure mode. If a U.S. user simply wants to opt into 24-hour clock, if this choice comes bundled with making fw Monday, this might hurt more than the 24-hour preference helps. (Ideally, the travel booking problem would be avoided by sites using browser-supplied date pickers. Unfortunately, sites really like to make their own.)

  • If fw is dropped as too complicated, it’s relevant to ask if it’s really necessary to broadcast the temperature unit. People can have a per-site cookie-persisted setting for the weather site they routinely use, and removing the annoyance to have to use a site-supplied unit switcher while traveling doesn’t seem like enough of a problem to justify broadcasting a fingerprinting bit to the Web. If both fw and mu are dropped, we get the one setting that’s consistently available across operating systems: h12 vs. h23, but while people do have preferences, people who read out-of-locale content tend to be able to read the non-preferred option.

  • The "Motivation" section claims that in locales with multiple numbering systems in use (in practice Western Arabic aka. latn and script-native in either order of priority) the other numbering system would not be "immediately intelligible". This claim could use data/references to substantiate the severity level of the issue. (Users may have a preference, but to what extent does the issue rise to the level of not immediately intelligible? Previously, the primary example given has been opting into script-native Devanagari digits for Hindi, which by default uses latn digits. Without first-hand experience, the “not immediately intelligible” level of seriousness looks odd in the light of licence plates on cars in India using latn digits.)

  • In the light of CLDR data about primary calendars by region as well as calendar comprehension presumably correlating with language comprehension, the calendar system aspect could use some data/references to characterize the usability importance of sites (not calendar apps but sites displaying dates) dynamically adapting to a user preference.

  • That the numbering system should bind to an advertised language seems like the right conclusion.

  • Likewise for calendar systems (though bound to the region or likely-subtags-implied region of the language tag).

  • If we were to go forward with the general idea of exposing non-CLDR-default numbering system preference with the additional observation that it should go together with a language, and the assumption that consumers of existing things like Accept-Language might not deal with extensions, how can the problem of sites applying the numbering system preference to a non-primary language for which it doesn’t make sense be avoided? (That is, avoiding the failure mode suggests saying hi-u-nu-deva somewhere instead of having hi and nu-deva at a distance from each other.)

  • There seem to be complications with script-sharing languages that plausibly appear in preference order having different numbering system defaults in CLDR. For example, per CLDR, Marathi not only defaults to Devanagari digits but does not even offer latn digits as an alternative. If one specifies Marathi first, Hindi second as a language preference order, should it imply anything about digits for Hindi (considering that even existing preference UIs that allow for numbering system don’t seem to allow specifying them for languages other than the one highest on the user’s priority list)?

  • If we were to expose the numbering system preference, should there be a CSS property to transform digits according to the user preference (assuming appropriate surrounding context)? That is, allowing sites to say digit-transform: auto; instead of using the Locale Extensions mechanism proposed? (This wouldn't mitigate fingerprinting, as the distinction could be measured from layout box metrics.)

  • If we were to accommodate the temperature unit use case in the Web platform, would it make more sense to do so via an HTML element that marks up the default (from site point of view) temperature so that the browser can convert the temperature rendering in place in layout if the user’s preference disagrees or, to avoid fingerprinting, if the user interacts with the amount to reveal a conversion (hover, click/tap, context menu, or similar)? (We already have the time element in HTML for datetimes.)
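The concern above about binding the numbering system to a language can be illustrated with existing `Intl` behavior. The following is a minimal sketch, not part of the proposal: `detachedNu` and `naiveTag` are hypothetical names standing in for a site that receives a numbering-system preference separately from the language list and applies it to whichever language it ends up serving.

```javascript
// Report which numbering system Intl resolves for a given tag.
function resolvedDigits(tag) {
  return new Intl.NumberFormat(tag).resolvedOptions().numberingSystem;
}

// Hindi defaults to latn digits; the preference only makes sense when
// it travels inside the same tag as the language it belongs to.
console.log(resolvedDigits('hi'));                           // "latn"
console.log(resolvedDigits('hi-u-nu-deva'));                 // "deva"
console.log(new Intl.NumberFormat('hi-u-nu-deva').format(42)); // "४२"

// Marathi is described above as defaulting to Devanagari digits, so no
// extension should be needed there (printed without a claimed result).
console.log(resolvedDigits('mr'));

// Failure mode: the preference arrives "at a distance" from the
// language, and the site has fallen back to English content.
const detachedNu = 'deva';      // hypothetical separately delivered value
const servedLanguage = 'en';    // language the site actually serves
const naiveTag = `${servedLanguage}-u-nu-${detachedNu}`;
console.log(new Intl.NumberFormat(naiveTag).format(42)); // "४२"
```

English text with Devanagari digits is almost certainly not what the user asked for, which is the argument for advertising `hi-u-nu-deva` as a single unit.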

@zcorpan added the "ready to add" (Appears ready to add to the table of positions.) and "position: negative" labels on Feb 12, 2024
hsivonen added a commit to hsivonen/standards-positions that referenced this issue Feb 15, 2024
tantek pushed a commit that referenced this issue Feb 15, 2024
5 participants
@zcorpan @dminor @hsivonen @ben-allen and others