No "Localizable" type #1025

Open
aphillips opened this issue Sep 20, 2021 · 33 comments
Labels
i18n-needs-resolution Issue the Internationalization Group has raised and looks for a response on.

Comments

@aphillips

The I18N WG has issues open against a sizeable number of specifications regarding the provisioning of language and direction metadata for natural language string values. It would promote interoperability and standardization if there were a consistent, well-known, well-described structure for this. Hence, we want to propose the addition of a Localizable or LocalizableString type that includes language and base direction metadata.

Reference: w3c/string-meta#54
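
As a rough sketch of what such a type might look like when it reaches script: the member names `value`, `lang`, and `dir`, and the helper `normalizeLocalizable`, are assumptions made for illustration. Nothing here is defined by Web IDL.

```typescript
// Hypothetical shape; none of these names are defined by Web IDL.
type TextDir = "ltr" | "rtl" | "auto";

interface Localizable {
  value: string;  // the natural language text itself
  lang?: string;  // BCP 47 language tag, e.g. "ar" or "en-US"
  dir?: TextDir;  // base direction of the text
}

// Fill in defaults so that consumers always see all three members.
function normalizeLocalizable(input: Localizable): Required<Localizable> {
  return {
    value: input.value,
    lang: input.lang ?? "",   // assumed convention: empty string means "unknown"
    dir: input.dir ?? "auto", // assumed convention: "auto" means "detect from content"
  };
}
```

The point of a shared shape is that every spec carrying a natural language string would serialize the same three pieces of information in the same way.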

@domenic
Member

domenic commented Sep 20, 2021

Are there any concrete specs that are interested in using this?

@aphillips
Author

@domenic Yes, we have quite a laundry list of them at this point. I don't have time right now to provide a complete list, but top of mind are the Web Payments specs (payment-request, secure-payment-confirmation, etc.), AppManifest, DCAT (vocab-dcat), and Media Capture. @marcoscaceres says that this has been raised before as a PR?

@domenic
Member

domenic commented Sep 20, 2021

Note that JSON manifests must not use Web IDL, so any cases involving them aren't good support for this proposal. Let us know when you do have time to provide such a list so we can evaluate in more depth.

@annevk
Member

annevk commented Sep 21, 2021

Notifications API could have used this for its body and title members. Instead it has dedicated dir and lang members that end up applying to both, with appropriate defaulting.

It's not entirely clear to me that a type is the way to go. Although it encapsulates things nicely, it doesn't necessarily result in the best API for developers. Personally I have slight preference for the flatter approach from the Notifications API. It does seem good if there's shared language regarding how to render the combination of these members. (Which seems more like an Infra than a Web IDL thing to me.)
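
To make the flatter approach concrete, here is a minimal sketch. The member names `body`, `dir`, and `lang` and their defaults (`"auto"` and the empty string) follow the Notifications API options; the helper `buildNotificationArgs` is hypothetical.

```typescript
type NotificationDir = "auto" | "ltr" | "rtl";

interface FlatOptions {
  body: string;
  dir: NotificationDir; // applies to both title and body
  lang: string;         // applies to both title and body
}

// Hypothetical helper: one dir/lang pair covers every string in the notification.
function buildNotificationArgs(
  title: string,
  body: string,
  lang: string = "",
  dir: NotificationDir = "auto",
): { title: string; options: FlatOptions } {
  return { title, options: { body, dir, lang } };
}
```

In a browser, the result would be passed along as `new Notification(args.title, args.options)`. The trade-off is visible in the signature: the title and body cannot carry different metadata.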

@aphillips
Author

@annevk I18N generally recommends that each natural language text field have its own metadata. This allows different fields to have separate values, which is often appropriate.

A spec like Notifications might choose to describe language and direction for a given notification message, on the theory that the message is meant to be consistent. This is probably okay if we're only talking about a couple of fields. The more fields that are included "under the metadata blanket", the more and more likely it is that one or another need a separate base direction (or less commonly, a separate language tag). (Note that when fields have their own metadata, assigning @dir and @lang in e.g. HTML becomes straightforward).
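
A small sketch of that last point, assuming a hypothetical `LabeledText` shape in which each field carries its own metadata. The helper names are illustrative only.

```typescript
// Hypothetical per-field shape: each string carries its own metadata.
interface LabeledText {
  value: string;
  lang?: string;
  dir?: "ltr" | "rtl" | "auto";
}

// Escape characters that are significant in HTML text and attribute values.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Map the field's metadata directly onto the lang and dir attributes.
function toSpan(text: LabeledText): string {
  const attrs: string[] = [];
  if (text.lang) attrs.push(`lang="${escapeHtml(text.lang)}"`);
  if (text.dir) attrs.push(`dir="${escapeHtml(text.dir)}"`);
  const open = attrs.length > 0 ? `<span ${attrs.join(" ")}>` : "<span>";
  return `${open}${escapeHtml(text.value)}</span>`;
}
```

With shared metadata, the same code would first have to decide which fields the blanket values really apply to.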

The problem I have is that there are many many specifications that need a convenient way to specify a natural language text field that includes the necessary metadata and provides for consistent serialization/deserialization/interchange. This is why this issue exists.

(In case you're unaware of it, we have a document about this whole topic: String-Meta)

@annevk
Member

annevk commented Oct 1, 2021

Understood, but I think it often depends on context what the right solution is. E.g., HTML doesn't have altdir, altlang, titledir, and titlelang. I can see the need for some shared infrastructure, but I'm not entirely convinced we want a type. At least, I'd like to see more APIs for which that would be the right solution first.

@domenic
Member

domenic commented Oct 1, 2021

In particular, it's actually pretty rare to use a JS API to represent user-visible text in web platform APIs. Usually we instead present users with content through HTML that is displayed in the content area.

The only time you'd need a JS API to present user-visible text, would be when that user-visible text is presented outside the content area. This is generally frowned upon for security reasons, with notifications and web app manifest being the only notable exceptions I'm aware of. (Smaller exceptions I can think of include title="", alt="", and <title>, but those use HTML-based APIs instead of JS APIs.)

So indeed I'd encourage us to find examples here before doing any work to encourage more APIs of this sort. Especially since for the two main notable examples so far, the Notifications API and web app manifest, the proposed solution is not suitable.

@aphillips
Author

LOL. I18N has never been happy about natural language in attributes (because the host element's metadata doesn't always apply)....... 😀

We need some form or forms of shared infrastructure. On the one hand I have a number of specifications and APIs that are defining natural language text fields. The requirements for these fields are effectively identical and interop is best served when different specs understand that this or that field "is a" localizable type.

We got some traction with JSON-LD, which helps in that space. This issue is for WebIDL, but I'm open to creating the necessary reference-ready values in a different spec instead of or in addition to here.

@domenic
Member

domenic commented Oct 1, 2021

Again, please provide the examples. Last time we asked you brushed us off saying you didn't have time, and now you're laughing at us. This isn't really helpful to moving the conversation forward.

@aphillips
Author

@domenic noted:

In particular, it's actually pretty rare to use a JS API to represent user-visible text in web platform APIs. Usually we instead present users with content through HTML that is displayed in the content area.

Don't JS APIs contain natural language text strings? For example, one of the reasons I'm here is the Payment Request spec. See for example here. Is this field (label) a bad example?

@aphillips
Author

@domenic I was not intending to laugh at anyone, but I was struck by (from my perspective) the return of an old wound. That was intended more as an aside and I apologize for the off-topic remark.

I see that our comments crossed over. Hopefully my example above is helpful?

@annevk
Member

annevk commented Oct 4, 2021

For that example it seems that adding dir and lang members would suffice.

(And I know that in theory it would be ideal if title and alt had such attributes, or they were elements of sorts, but in practice I don't think it has been much of a problem.)

@alvestrand

The device label in [MEDIACAPTURE-MAIN] is another one of those outstanding issues raised by aphillips. We closed it in 2015 saying that we'd await some shared infrastructure so that we don't invent new things, and it got reopened this year (2021). At the moment we have a large installed base depending on this attribute being a string, so changing the attribute to Localizable is not feasible.

@marcoscaceres
Member

Given the majority of cases involve a single label, it makes sense to just suggest Editors add @dir and @lang to their dictionaries. Infrastructure-wise, Infra or some other spec could provide the algorithms. For @lang at least, we need to decide whether to check it with IsStructurallyValid() or not validate it at all (i.e., what the Notifications API does). For @dir, it could provide presentation guidance. I think the Notifications spec's Direction section basically gives us what we need:
https://notifications.spec.whatwg.org/#direction
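
For a sense of what the structural check could look like from script, here is a sketch that relies on `Intl.getCanonicalLocales`, which throws a `RangeError` for tags that are not structurally valid. The wrapper name `isStructurallyValidTag` is an assumption, and a spec would define the check in prose rather than this way.

```typescript
// Returns true when the tag is structurally valid according to the
// engine's Intl implementation, false when it is rejected.
function isStructurallyValidTag(tag: string): boolean {
  try {
    Intl.getCanonicalLocales(tag);
    return true;
  } catch (e) {
    // Structurally invalid tags surface as RangeError; rethrow anything else.
    if (e instanceof RangeError) return false;
    throw e;
  }
}
```

Note that this only checks structure. A tag such as "en-US" passes, and "en_US" (underscore instead of hyphen) does not, but a well-formed tag that names no real language would still be accepted.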

As @domenic pointed out, we are short of API examples where we would need the more complicated Localizable structure. Payment Request at one point had a dictionary where something like that would have made sense (a bunch of caller provided error strings that were displayed in a native UI) - but it's no longer in the spec. In other words, it might be ok to wait on Localizable until we actually need it.

@marcoscaceres
Member

I'll just add that a recurring problem with @dir and @lang is that often OS-level components don't do anything with them. An example where this situation has also come up is Web Share:
w3c/web-share#6

This was also a problem with Payment Request, where data is passed to an OS level component and @dir and @lang wouldn't have any effect.

@annevk, when showing a notification, do you know if any OS actually supports doing something with @dir and @lang?

Generally speaking, over the last few years we've largely moved away from adding things to specifications that are not implemented (or implementable). I can absolutely see merit in adding @dir and @lang as "the right thing to do", but I'm concerned about adding features only to find it can't be implemented because OS level APIs don't have any means of supporting them. Naturally, this would be disappointing to developers and users.

I know... this is a bit of a chicken and egg problem, but I don't know what the right solution is? Do we add aspirational things? or stick to "must be implemented!" to be in spec?

As the Web becomes more integrated at the OS-level, I guess we need more OS engineers (in addition to browser folks) involved in the process too.🤔

@annevk
Member

annevk commented Oct 5, 2021

@marcoscaceres I would expect that the situation is the same as with sharing. The information is reflected on the Notification object so some aspects are testable and if the OS ever gets support it would be easy to hook up. Now there might be a problem at that point if people have been supplying incorrect values here, copy-pasted from elsewhere. This is a frequent problem with metadata, especially if it has no effect in practice. From that perspective it might be harmful to allow people to supply this metadata until we know it will be consumed end-to-end (and thus be somewhat visible if incorrect).

@alvestrand

The classical example where the Web platform has all the pieces is when one wants to display the label for which @lang and @dir apply in a Web page - knowing @dir reduces the chances of messing up how it's formatted (you can put LRE/RLE and PDF markers around the string).
For utility of @lang in that context, add screen readers.
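
A minimal sketch of that wrapping, assuming the direction is already known to be "ltr" or "rtl". The function name `wrapWithDirection` is hypothetical.

```typescript
const LRE = "\u202A"; // LEFT-TO-RIGHT EMBEDDING
const RLE = "\u202B"; // RIGHT-TO-LEFT EMBEDDING
const PDF = "\u202C"; // POP DIRECTIONAL FORMATTING

// Wrap a string in embedding controls so that its base direction
// survives insertion into text with a different direction.
function wrapWithDirection(text: string, dir: "ltr" | "rtl"): string {
  const opener = dir === "rtl" ? RLE : LRE;
  return opener + text + PDF;
}
```

The Unicode Bidirectional Algorithm (UAX #9) now recommends the isolate controls (LRI, RLI, FSI, closed by PDI) over embeddings for new content, so a production version would probably use those instead. In HTML, setting `dir` on a wrapping element achieves the same effect without control characters.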

@marcoscaceres
Member

@annevk wrote:

This is a frequent problem with metadata, especially if it has no effect in practice. From that perspective it might be harmful to allow people to supply this metadata until we know it will be consumed end-to-end (and thus be somewhat visible if incorrect).

Yes, this is the core of the problem. Given the above, it really comes down to two questions:

  • should specs aspirationally add dir and lang, even when they know they are not implemented? y/n
  • If yes, then should we have Infra (or some other doc) specify the semantics, presentation, and processing requirements of dir and lang?

@r12a

r12a commented Oct 6, 2021

Here's my point of view:
This is nothing to do with aspiration. For these particular features, it has to do with rubber-meets-the-road, practical requirements for producing technology that works for people in certain regions around the world.

Sure, we can't be certain that the features will be implemented, but if we don't spec them it's probably a fair bet that they won't, and we are actively contributing to that lack of implementation. But again, it's not the implementation game we need to be considering here, we need to look beyond that and consider whether multinational users will be able to use the technology effectively.

If you like, this is a diversity issue, a sort of Me-too for international users of the 'World Wide' Web. Ok, let's say that implementers don't implement this feature after we spec it. That's their choice, but i don't think we should be taking the decision for them. If multinational users find that the technology raises problems, they should be able to go back to the implementers to raise their diversity issues, rather than blame the WhatWG or W3C for the problem.

@annevk
Member

annevk commented Oct 6, 2021

I think there's a couple things here:

  1. Are any of the current features actively incompatible with adding lang/dir down the road? From the examples provided, this does not appear to be the case.
  2. Where we have added lang/dir already, has that seen uptake? Unfortunately, that doesn't seem to be the case.
  3. Does adding lang/dir before there is end-to-end support (i.e., both in the respective OS API and in the browser) help end users? This is very much unclear, as per my earlier comment.

And I do think implementations are an important consideration here as without implementations you don't have a standard. And it's also fair to require some amount of implementation commitment before writing things down as not everyone enjoys writing fiction.

@alvestrand

In most cases, there will be one default lang and dir used by the OS, and implementations will reflect that when generating things like labels.
Unfortunately that is not the same as the window.navigator.language, which is a value configurable by the user for the browser (only).
I think.

@annevk
Member

annevk commented Oct 6, 2021

That would only help in the media capture case where the OS supplies some information (as I understand it; and even then you lack dir). Most other cases are about the application supplying the OS with information to display somewhere. But yeah, the core of the problem is that an OS is designed around displaying a single language.

@xfq
Contributor

xfq commented Oct 7, 2021

In addition to the examples mentioned above (Notifications API, Payment Request API, Web Share API, Media Capture and Streams, HTML, Web App Manifest), some other (recent?) examples include Web Authentication (using in-field metadata, not ideal) and Geolocation API.

@annevk added the i18n-needs-resolution label on Oct 7, 2021
@annevk
Member

annevk commented Oct 7, 2021

The Geolocation example appears to be an instance of #1024.

The Web Authentication example is indeed wild, but I'm not sure how a type would help there as that attempts to address device-to-device communication so you need to define some kind of serialization, I'd think.

@alvestrand

The Geolocation example is using the Unicode mechanisms for in-text localization markup, but the way they are using it is positively weird, in that they suffix the code points with their localization, while all other models for in-text localization would prefix it (localization applies to the text following the markup).

I would want a signoff from the Unicode folks that this is a reasonable approach before emulating that example.

@annevk
Member

annevk commented Oct 7, 2021

You mean Web Authentication? Note that I'm not endorsing it, I'm just saying that a type wouldn't help them.

@r12a

r12a commented Oct 7, 2021

Note that the Geolocation issue in #1024 is about something different from the topic of this issue. It's about localisation, rather than internationalisation.

#1024 is about mechanisms that allow the developer to provide alternate sets of translated strings/messages, and of the user to switch between those languages. That's about localisation of message sets.

This issue is about allowing strings/messages to be labelled with language and direction metadata, so that they can support correct text display, where needed, ie. internationalisation.

@annevk
Member

annevk commented Oct 7, 2021

@r12a right, but what @xfq pointed to is an instance of that, right?

@r12a

r12a commented Oct 7, 2021

The Web Authn spec was trying to address the internationalization need, yes. Note that they added this to the spec without discussion with the i18n WG, and that we have numerous issues with what they have proposed, which we are still working through.

@annevk
Member

annevk commented Oct 7, 2021

That's WebAuthn, but above you mentioned Geolocation.

@marcoscaceres
Member

(I believe the "Geolocation issue" is #996, which relates to localizing the error messages, which requires careful consideration for privacy/UA-detection reasons discussed there)


8 participants