Trailing position of metadata #1398

aphillips · 2021-07-06T17:41:17Z

This is a tracker issue. Only discuss things here if they are i18n WG internal meta-discussions about the issue. Contribute to the actual discussion at the following link:

§ w3c/webauthn#1646

r12a · 2021-07-07T13:08:07Z

I'm not persuaded that suffixing is a good idea at all. You already mentioned that including RLM/LRM at the start of the string is more efficient, and we don't want them to use language tag code points, which are the things that take up lots of initial bytes. So really we're talking about one extra character in the string, and then only where the bidi algorithm needs help.

So I don't think the argument about preserving data rather than metadata is convincing. And anyway, if strings are going to be truncated, (a) it's likely to be less problematic to lose one code point at the end (since we are truncating already) than to lose directional information, (b) the format they describe isn't JSON-LD, so I don't think it's comparable to @lang, and (c) if they intend for metadata to be post-pended, they should require the consumer to capture and apply the metadata before truncating.

And, as Martin mentioned, using paired controls, such as the language tag characters or the RLI...PDI code points, where some of the metadata is effectively post-pended, is dangerous in scenarios where truncation becomes a possibility, since a missing end code point can cause problems when the text is inserted into a location. (I think we may need to make that point in string-meta, btw.)
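To make the truncation risk concrete, here is a minimal sketch in Python. The helper names (`truncate`, `unclosed_isolates`) and the sample text are illustrative assumptions, not anything taken from the WebAuthn spec. It compares a string wrapped in RLI...PDI with one that only has a leading RLM, after both are cut to the same number of code points.

```python
# Bidi control characters (code points as defined by Unicode).
LRI, RLI, FSI, PDI = "\u2066", "\u2067", "\u2068", "\u2069"
RLM = "\u200F"

# Hebrew sample text ("shalom olam"), written with escapes so the
# source stays readable in any editor: 4 letters, a space, 4 letters.
SAMPLE = "\u05E9\u05DC\u05D5\u05DD \u05E2\u05D5\u05DC\u05DD"


def truncate(text: str, max_code_points: int) -> str:
    """Naive truncation by code point count (hypothetical helper)."""
    return text[:max_code_points]


def unclosed_isolates(text: str) -> int:
    """Count isolate initiators that have no matching PDI."""
    depth = 0
    for ch in text:
        if ch in (LRI, RLI, FSI):
            depth += 1
        elif ch == PDI and depth > 0:
            depth -= 1
    return depth


paired = RLI + SAMPLE + PDI   # direction conveyed by a paired control
prefixed = RLM + SAMPLE       # direction hinted by a single leading mark

cut_paired = truncate(paired, 6)
cut_prefixed = truncate(prefixed, 6)

# The paired form loses its PDI, so the isolate stays open and can affect
# whatever text follows it once the string is inserted somewhere.
print(unclosed_isolates(cut_paired))
# The prefixed form loses only trailing letters; the mark survives.
print(cut_prefixed.startswith(RLM), unclosed_isolates(cut_prefixed))
```

A consumer that has to truncate could use a check like `unclosed_isolates` to decide whether to append the missing PDI, though that is again a display fix-up rather than real metadata.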

Btw, although it does say it, I think we could improve the first para to more clearly indicate that what was put in the spec drew on Addison's personal thoughts before they could be discussed by the i18n WG.

aphillips · 2021-07-07T14:56:51Z

I think it is reasonable to separate language and directional metadata here.

Language tags can be quite long, and even the shortest language tag, when encoded using the Unicode language tag characters, would require 16 bytes to encode in UTF-8 (start tag, alpha2 primary language, cancel tag: four code points at four bytes each). Since the tag characters probably should be removed before displaying or processing the string, a trailing position might be cleaner (it's easier to truncate a string than to take a substring from the front).
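The byte count can be checked with a short Python sketch. The function names are illustrative assumptions; the code points themselves (U+E0001 LANGUAGE TAG, U+E007F CANCEL TAG, and the tag characters that mirror ASCII at an offset of U+E0000) come from Unicode.

```python
LANGUAGE_TAG = "\U000E0001"  # starts a language tag sequence
CANCEL_TAG = "\U000E007F"    # ends it
TAG_OFFSET = 0xE0000         # tag characters mirror ASCII at this offset


def encode_language_tag(tag: str) -> str:
    """Spell an ASCII language tag using Unicode tag characters."""
    body = "".join(chr(TAG_OFFSET + ord(c)) for c in tag.lower())
    return LANGUAGE_TAG + body + CANCEL_TAG


def utf8_cost(tag: str) -> int:
    """Number of UTF-8 bytes the in-string tag sequence occupies."""
    return len(encode_language_tag(tag).encode("utf-8"))


def strip_tag_characters(text: str) -> str:
    """Remove every character in the Tags block (U+E0000..U+E007F)."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)


print(utf8_cost("en"))     # 4 code points x 4 bytes each = 16
print(utf8_cost("ar-EG"))  # 7 code points x 4 bytes each = 28
print(strip_tag_characters("hello" + encode_language_tag("en")))  # hello
```

Every tag character sits outside the Basic Multilingual Plane, which is why each one costs four bytes in UTF-8 regardless of which ASCII letter it mirrors.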

Either way, adding tag characters produces problems for string concatenation and other string operations. And naive implementations that don't process the field can display tofu or garbage as if it were part of the data. Overall, using in-string metadata is a bad idea.

When talking about direction, I think it is helpful to separate bidi controls from metadata. Including LRM/RLM or an LRI/RLI/FSI + PDI enclosing sequence is, to my mind, "altering the contents" of the string to help it display correctly. Processes such as truncation (particularly with the paired controls!) or additional attempts to produce a display-ready sequence alter the meaning and display of the content. These arguments are not new: we talk exhaustively about this in String-Meta as reasons why not to use this as a way of communicating direction.

To me, bidi metadata should instead be explicit, which includes not using invisible controls to convey the value. A field like direction with values such as ltr and rtl is a better choice by far.

Overall, it would have been better if, given that WebAuthn could not/would not introduce additional fields, they had adopted a serialization scheme using ASCII characters that was unambiguous and machine readable. I note that the RDF solution found in JSON-LD does this pretty well. Amusingly, the example given there uses 16 bytes to encode an average-sized language tag and the direction:

"HTML و CSS: تصميم و إنشاء مواقع الويب"^^i18n:ar-eg_rtl

... but I'd still tend to say that failing to address our comment at all and coming back in v3 to introduce true metadata would have been the better option.
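As a rough illustration of why an ASCII suffix like the one in the JSON-LD example is easy to consume, here is a minimal Python sketch. The `parse_i18n_literal` function and the `LocalizedString` type are hypothetical names for this example only; this is not a JSON-LD or RDF parser, and it assumes the literal has exactly the quoted-string-plus-suffix shape shown above.

```python
import re
from typing import NamedTuple, Optional


class LocalizedString(NamedTuple):
    value: str
    language: Optional[str]
    direction: Optional[str]


# Quoted value, then ^^i18n:, then an optional language tag and an
# optional _ltr/_rtl direction. The greedy value group means the LAST
# '"^^i18n:' in the input is treated as the start of the suffix.
_LITERAL = re.compile(
    r'^"(?P<value>.*)"\^\^i18n:'
    r'(?P<lang>[A-Za-z0-9-]*)(?:_(?P<dir>ltr|rtl))?$',
    re.DOTALL,
)


def parse_i18n_literal(literal: str) -> LocalizedString:
    """Split a literal of the assumed form into value, language, direction."""
    match = _LITERAL.match(literal)
    if match is None:
        raise ValueError("not an i18n-suffixed literal")
    return LocalizedString(match["value"], match["lang"] or None, match["dir"])


example = '"HTML و CSS: تصميم و إنشاء مواقع الويب"^^i18n:ar-eg_rtl'
parsed = parse_i18n_literal(example)
print(parsed.language, parsed.direction)               # ar-eg rtl
print(len("^^i18n:ar-eg_rtl".encode("utf-8")))         # 16
```

Because the suffix is plain ASCII, a consumer can capture the language and direction before truncating the value, and nothing invisible is left behind in the display string.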

aphillips added pending Issue not yet sent to WG, or raised by tracker tool & needing labels. s:webauthn https://w3c.github.io/webauthn/ labels Jul 6, 2021

aphillips removed the pending Issue not yet sent to WG, or raised by tracker tool & needing labels. label Jul 9, 2021

xfq added the needs-resolution i18n expects this item to be resolved to their satisfaction. label Jul 10, 2021

r12a added t:bidi_strings 3.5 Handling base direction for strings t:lang_strings 2.4 Identifying the language of strings labels Jul 14, 2022

w3cbot added the wg:webauthn https://www.w3.org/groups/wg/webauthn label Feb 9, 2024
