Skip to content

Latest commit

 

History

History
450 lines (233 loc) · 38.3 KB

notes-2024-04-25.md

File metadata and controls

450 lines (233 loc) · 38.3 KB

2024-04-25 ECMA-402 Meeting

Logistics

Attendees

  • Shane Carr - Google i18n (SFC), Co-Moderator
  • Ben Allen - Igalia (BAN)
  • Eemeli Aro - Mozilla (EAO)
  • Frank Yung-Fong Tang - Google i18n, V8 (FYT)
  • Jesse Alama - Igalia (JMN)
  • Philip Chimento - Igalia (PFC)
  • Richard Gibson - OpenJS Foundation (RGN)
  • Sean Burke - Mozilla (SBE)
  • Ujjwal Sharma - Igalia (USA), Co-Moderator
  • Yusuke Suzuki - Apple (YSZ)

Standing items

Status Updates

Updates from the Editors

USA: The spec was frozen until this week for the cutoff, so now we're merging editorial PRs again. There's a new branch for es2024. Reviews are welcome. The opt-out period will run until the ECMA GA in June.

https://tc39.es/ecma402/2024/

FYT: Is there a way to see a preview of the PDF? I'd appreciate the opportunity to review the PDF.

USA: After we produce the PDF, I'd be happy to share it. While the PDF is the official archival version, it is not the canonically consumed version of the spec. The ECMA staff is pondering what to do here.

FYT: In the past, tables were cut off.

USA: That's a priority for the PDF generation. We’ve managed to do this through manual tricks / injecting CSS, ECMA is searching for different solution. Regardless, plan is to have a version of the spec without clipped parts, etc.

Updates from the MessageFormat Working Group

EAO: The tech preview is live as UTS 35 section 9. Implementation experience, feedback, and comments would be highly appreciated. The WG is working on small things, like improving the test suite and refactoring the errors that may be omitted during MF processing.

SFC: Now that we’re in the tech preview stage, it might be appropriate to approach TC39-TG5 about MessageFormat.

FYT: My understanding is that the MessageFormatter, the syntax had to do with UTS 35. Something to consider is to register something to callback / access information for particular object, as in Java and C++. Should we, regardless of the syntax, start to explore how to do that in JavaScript? When the syntax thing is resolved/proven, we’ll have something that we’re ready to incorporate? Or is it highly dependable?

USA: IIUC, you're talking about the function registry. I would say that it should work consistently across implementations. However, in the JavaScript spec, it’s quite well-defined how to do it, and it is done in a JavaScripty way (higher-order functions and such)

FYT: Sorry, where is it defined?

USA: It's defined as an optional argument to the formatter.

FYT: Is that already in the spec?

USA: Yes

FYT: Including that part?

USA: Yes

EAO: The Intl.MessageFormat proposal repo contains pretty much complete spec text of what that proposal will require, and that will include the specification for how to define your own functions, and how they work together with the format. It does include a couple of built-in function definitions, I think :time is the only one that we’re still missing. USA was most recently working on :date. I was not able to convince the MessageFormat working group to accept a formatToParts representation, so that is specific to this ECMA-402 proposal. I mention this because making sure the custom functions are able to do custom formatting to parts is a large part of the complexity that ends up needing to be implemented. That’s done, though, and I invite a review of these parts of the proposal

Updates from Implementers

https://github.com/tc39/ecma402/wiki/Proposal-and-PR-Progress-Tracking

FYT: V8, for DurationFormat in progress, I have time to synchronize the V8 implementation for DurationFormat. Still have two tests in test262 that have issues, some might be issues in the test. BAN has addressed some of the minutes and hours things. Not sure how that impacts the part of the PR. PartitionDurationFormatPattern is too long!

YSZ: [missed what he said]

SFC: I’m noticing that SpiderMonkey is missing a lot of updates. JSC has gaps.

YSZ: I’m still looking to each item’s progress. Some are already fixed, but still catching up to update table.

SFC: I would not be surprised if half the things here are already fixed, because they were fixed in ICU or something.

[Sean Burke]: No update – I don’t have insight into SpiderMonkey stuff

SFC: Anything more than a year old should definitely not be missing tests or MDN. More recent stuff should have the tests and MDN in progress.

FYT: Do we want to purge the table of merged PRs that already have implementations?

SFC: Not at this moment; I see no harm in keeping it around. At this point I still feel that everything in the table is recent enough.

SFC: At one point I experimented with switching to Google Sheets for this, and we decided that even with the difficulty in updating that we should stick with the wiki page.

FYT: Maybe every year when we ship the version of ECMA-402, we take a snapshot of this page and archive it.

SFC: If necessary we can already recover from source control

YSZ: We can use source control and markdown locally, I think I’m fine with the current format, but I do agree that this is hard to modify

Updates from the W3C i18n Group

EAO: No real updates, also I have a recurrent conflict with that meeting and have not been able to attend for two months.

SFC: It would be nice if someone else from this call should be a liaison.

EAO: The meeting is on Thursdays at 4:00 CET / 7:00 AM California time

BAN: I can do it

SFC: You can be a bidirectional liason, delivering updates in both directions.

Proposals and Discussion Topics

https://github.com/tc39/ecma402/projects/2

DurationFormat Update

BAN: The big update is the refactoring of PartitionDurationFormatPattern. I have a PR open on the DurationFormat repo which very much addresses this. PR #188.

tc39/proposal-intl-duration-format#188

BAN: There are two separate methods of formatting: numeric and non-numeric. They require different code paths, but in the old spec, they were interleaved. In the refactor, there is a FormatNumericUnits AO, which then defers to FormatNumericHours, FormatNumericMinutes, and FormatNumericSeconds.

FYT: Is there a way you can put the rendered result of the PR somewhere on the web to look? Because it's hard to read the source and spot issues.

USA: For context, this AO is what FYT was talking about earlier when he said that DurationFormat was difficult to implement. I want to really thank BAN to put in all this work. Before this PR, I was feeling bad about supporting digital format due to the amount of spec complexity. To a point where I became convinced that it was a mistake. Now with this refactor I'm much more comfortable with the scope of the spec. Only big source of errors is that me and BAN (and Anba and FYT) are old eyes on it, so a fresh set of eyes could spot some things we haven’t. Apart from that, I hope this PR, while purely editorial, will make this much easier to read and implement.

SFC: I’m hoping that over the next few months we’ll have an ICU4X implementation, since we’ll have a Google Summer of Code working on it. It will be good to have this landed by the time they start working on it in mid-May.

FYT: One of the issues for me is that the complexities come from we put everything into a table with 4 different row, but some of those things only apply to some of them, but since you’re in a loop each time through you have to check every thing, and you need to keep state around. That makes the algorithm more complex than it needs to be, because we’ve fit into the table framework. And if you don’t follow the table framework, a lot of the issue you don’t need to think through and calculate it – there’s no up. But because currently it’s in this loop thing, even for the early rows, you have to go through that if to see if you hit that or not. Separate table for larger units, hours/minutes/seconds, and subseconds. We have tons of local variables just to keep track of state.

USA: I’d like to respond to both SFC and FYT. It’s exciting to hear that we’re going to have a student working on DurationFormat/ICU4X. I had no idea, because of how it was implemented in ICU4C, it can be implemented without having the building blocks. It’s not fully there yet, but there is a work-in-progress implementation in Boa that I’m working on, that could serve as a reference for your student. If there is indeed a fast, working implementation in ICU4X, Boa wouldn’t need its own implementation. With that, FYT, I would like to point out that the table approach is kind of weird, and it didn’t come so naturally, however it was the concise way to write the spec and also do so in a way that doesn’t significantly harm readability. From my experience implementing it, that is not actually necessary, since the table is essentially a mapping. Every unit has a corresponding value for the default, and that is just by correlation, it doesn’t need to be expressed in table form. If you know the name of the unit, the fact of it being a table is that you know what the corresponding other values are. It is a way to shorten, or to more beautifully write the spec, while I feel that implementations would probably be better off using something else that makes more sense codewise.

FYT: It just makes the review process difficult – it’s extremely difficult to synchronize.

USA: I do agree that it is impossible to do verbatim. You cannot implement it line by line, you have to break apart from the spec. With test262 eventually having full coverage, that might not be a concern.

FYT: Code reviewers can’t depend on the tests to do code review – have to look at code line by line.

USA: That could have been avoided if we had more powerful building blocks in ecmarkup, since the data structures you have available in code are not available. The best we have is a table.

SFC: I’ll respond a little about ICU4X. We have clients who’ve been asking for DurationFormat for many years, outside the ECMAScript space (Dart, C++, other surfaces). This implementation we’re building is because this is something that is in scope for ICU4X, and ICU in general. This is a core i18n component. Sure, it’s composed of list and number formatting, but it’s composed in a way that’s difficult to do correctly in user land, so it is in scope for ICU4X. Just because it’s in Boa doesn’t mean it wouldn't need to be in ICU4X.

USA: I wasn’t sure that it was planned, but either way that sounds great. ICU4X is a much larger set of stakeholders than Boa

SFC: Once we have the Summer of Code student online, BAN and USA should meet with them at least once on how it works.

BAN: With the tables, is it the length of the table that is the problem?

FYT: When you have a table, you go to an iteration, and you treat all table rows the same, and you have to keep track of the state for each table. Even for the first few rows. For example, you don't need "if fractional" for years, months, weeks, …, so in the implementation we need to diverge, but that makes the review process more difficult. If you inline the table into the AO, you can remove some of those branches.

SFC: One way to resolve this would be to create three AOs with three tables: calendar units, hour/minute/second, and subsecond. This is already the direction we're going with BAN's refactoring.

Conclusion

BAN to merge PR 188 promptly. Soliciting at least one more review of the change.

BAN to investigate whether the tables can be split better to reduce unreachable branches of code, or inline the tables into the AOs.

Grouping separators in DurationFormat

tc39/proposal-intl-duration-format#192

FYT: If, in digital style, if hours, minutes, seconds are very large, should we introduce separators?

I.e. 20:03:120

If DurationFormat isn’t doing any calculations to fold it into an additional upper unit, we can format as-is. But we do this calculation in the subsecond thing, so this is a question we need to make a decision. Do we keep it as is? It’s an issue, but it’s not a huge issue.

The bigger issue is that if the number is > 6 digits, comma separators are introduced, which can be very weird.

df.format({hours: 20, minutes: 3, seconds: 1234567})

"20:03:1,234,567" ?
or
 "20:03:1234567" ?

SFC: First issue is weird, but fine. We could throw an exception, but I don’t like data-driven exceptions. GIGO is a thing here, but it’s not even garbage: it’s weird, but you can figure out what it is. It could also be hard to put specific bounds – say, for example, a leap second. Maybe not a relevant concern with Durations, though. But for example, 0:90 seconds is not completely bad or wrong. Not convinced that any solution to the first problem is better. The better solution is to use Temporal.Duration, so I’m inclined to leave it as-is. The second one we should change, but it’s not a big change – it’s just setting an option on the NumberFormat. It’s a normative change, but I don’t think anyone is going to object to it.

BAN: I agree. The separators on the large numbers is ugly and confusing. I was coming to this with the understanding that anything that changes the values to be within bounds should be done by Temporal.

USA: I agree. It was a conscious decision to not do math, and useGrouping is a bug by some definition.

FYT: I agree with SFC, but I want to note that we do do math, with the subseconds unit. It’s not true that we don’t do math. I’m okay not to do math on those three, but we can’t say we didn’t do math.

USA: FYT, you’re right. We did the minimum amount of math required to cater to the digital case, and that’s it. To better qualify what I said earlier, what I meant when I said “we don’t do math”, when it came to doing balancing and all of that, we had considered it at first and then scrapped it because of the lessons we had learned – we had to basically conclude to do as little math as possible and then focus on educating users on how it works and what their expectations should be.

SFC: I think the milliseconds math is not really – I don’t see it as math, it’s not balancing units, it’s not doing unit conversions, it’s not taking data from one field and adding it to another. It’s just for the purpose of displaying it as a fractional value. The math is limited in scope, it just changes how numbers are displayed on screen. The one difference is that if we have overlapping subseconds, we do add them. We could throw a data-driven exception or just live with it. They’re consequences of decisions we have made that are narrow enough that it’s fine.

FYT: This may not be related, but when I implement that math, I have used BigInt for it, and use expensive BigInt operations and then take the last 9 digits and chop it. We could have seconds be a huge number, and we still have to format that. It’s not very easy, and quite expensive.

SFC: I recall we had come up with a solution for that problem.

FYT: Nope. Because it’s formatting by NumberFormat, it is international mathematical value of the result, so the precision has to be unlimited.

SFC: What I mean is that it’s already the case that if you’re calculating nanoseconds, you can calculate those without having to overflow. You can do it as a two-phase calculation (nanoseconds, then seconds) should fit in a u64

FYT: If the second part is a max-value number, and you have to convert that to whatever you can fit with – maybe you’re right.

SFC: Say you have NUMBER_MAXVALUE (2^53), there should be enough bits such that when you add from the fractional digits it can still fit in a u64

SFC: YSZ, is this change potentially acceptable to you?

YSZ: I think the change looks fine to me.

Conclusion

FYT to create a pull request, which we will bring to the next TG1 meeting for approval.

Normative: Canonicalize value of extension in ApplyUnicodeExtensionToTag #881

#881

USA: You might remember this one from a few meetings ago, there was a PR by FYT that was accidentally merged and then reverted, unfortunately the initial merging closed the issue, so we lost track of it. A libJS contributor pointed it out and reapplied the PR. This PR has TG1 approval and has been discussed among us, but I wanted to confirm with the group that I should merge it.

SFC: Yes. This should go in 2024. I’m glad that the contributor caught this, but it’s not good that we have this problem. Firmly in support of merging it.

FYT: Wasn’t there one I already landed?

SFC: It landed before it got approved at TG1, reverted, then never un-reverted.

RGN: This is not currently in the 2024 branch?

USA: No

RGN: Then it cannot be in 2024 – it’s a normative change and we’re in the opt-out period.

SFC: It was already approved and merged.

RGN: It was intended to go in, we wanted it to go in, but we missed it.

USA: Does it matter if it ends up in 2024 or not?

SFC: Probably not. This is a process problem on our side. What should we have done to make sure it doesn’t happen? We could have waited on the trigger so that we don’t have the problem in the first place, but how can we fix this?

USA: If we already revert a PR, we must re-open issues that were closed.

SFC: The correct process would be to – merging it too fast was human error, the correct thing to do is when you force-push a revert onto main, immediately open a pull request – if there were a pull request, we would have noticed it.

USA: +1

RGN: +1

USA: Also we could put protections on the main branch.

SFC: I don’t know if that would have fixed anything. I commit to main once a month when I add meeting notes. I was looking forward to this in 2024, since it impacts some of the ICU4X because we normalize everything internally, but if we have to wait it’s not that big a deal.

RGN: I feel really bad that we missed this, but the opt-out period is intended to be a time in which there are no normative changes. Running up against that would be worse.

Conclusion

Merge into 2025

Normative: turn useGrouping false for hours & smaller units when style is 2-digits or numeric #198

tc39/proposal-intl-duration-format#198

SFC: LGTM

BAN: +1

Conclusion

Bring this to TG1 for approval in June.

Normative: specify time zone ID requirements to reduce divergence between engines #877

#877

JGT: This is a normative PR that tries in a more deterministic way to determine which time zone ids are primary and which are non-primary. Let me give some background on how various engines today, and CLDR and ICU, make these decisions. This is a followup on previous discussions on #825.

In every year there’s a few, around ten at most, updates to the IANA databases. We tend to see two or three new identifiers added each year. Another change, that is more rare, is renaming an identifier (europe/kiev to europe/kyiv). Also there was a long-term process of demoting some identifiers that always had the same time zone data from 1970 always. America/Montreal and America/Toronto are the same. That process is done. [missed a sentence]

There are intra-country merges (america/toronto and america/montreal). There are others, though: for example, the name is assigned to the largest city (so Reykjavik’s code is africa/cote d’ivore.

Say you’re in Iceland. Iceland decides to adopt daylight savings time. That will break all the summer timestamps that had previously been stored if the time zone for Iceland points to the Ivory Coast. So ICU doesn’t follow the intra-country updates of IANA [check notes later].

This is a fairly complicated system. My goal with this PR is to maintain compatibility with as many things in ECMAScript as possible, to avoid churn with users. Also, we want guidance in the future on how it will work – our users have guidance, but also CLDR and ICU have guidance, and we can ask them to follow the spec and not do things that won’t work.

The four rules I’ve come up with are:

  1. The starting point is IANA, take the rules in IANA, keep the special cases that are already in ECMAScript. In IANA GMT and UTC are different, we’ve decided to merge them in ECMAScript
  2. The next is that if you have a primary and a non-primary ID, you need to share the same country code. We won’t allow ECMAScript time zone IDs to split country boundaries – this can be complicated because of country boundaries changing – but we can outsource that problem to ISO.
  3. Next is following what IANA is doing when it comes to intra-country mergers. I see no reason ECMAScript should buck this trend.
  4. There are legacy POSIX identifiers, about 50. These are mostly non-primary, but there are 9 deprecated weird primary ids, and we don’t really want to list these. I think it’s a good idea to deprecate these in ECMAScript. I’ve followed up to make that request over there – I’m not making good headway – but since these are 9 special cases it seems reasonable to do.

These are the rules, and this PR is spec text that tries to implement them.

FYT: Is this PST7PST or PST8PST?

JGT: The spec text should have the actual value.

SFC: I tagged you on the ICU4X PR intended to imlemetn this. CLDR has its own timezones which it identifies as BCP47 ids. CLDR specifies the data saying all these BCP 47 ids map to IANA ids. [missed something]. I want to make sure that that data in CLDR, which is what ICU4X and ICU4C uses, is reflective of what’s being used in this PR.

JGT: 1 is an exception, 2 and 3 are intended to match what CLDR is doing today, 4 is an exception on top of CLDR.

SFC: And you say you’re hoping to get 4 into CLDR and just haven’t gotten headway?

JGT: I made the request, there’s hemming and hawing.

SFC: Is it pushback, or they just haven’t had time to review it.

JGT: A bit of both. They’re not in agreement about doing it, and they don’t think it’s a high priority. There is, though, universal agreement that we don’t want end users to see these weird POSIX time zones, and there’s not much benefit to users to have these be primary. If we don’t want these deprecated old POSIX things to show up, we need to make them non-primary.

FYT: I see some mapping in CLDR already doing that. Which one are they requesting?

JGT: They do it in some cases but not others.

SFC: So the proposal you also made to CLDR is that PST8PST should be made an alias for America/Los Angeles?

JGT: In some cases they aren’t in CLDR, they’re in IANA but not CLDR, and there are some that are in CLDR and are primary.

SFC: I want to make sure there’s a method in CLDR – there are deprecated – for example, Europe/Zaporozhye used to be its own zone, and now they’ve merged it with Europe/Kyiv.

JGT: Because they’re in IANA and are required to be recognized, there’s special-cases in all the engines on how to handle these codes. This just formalizes that.

RGN: A final point on that last topic: it looks to me like CLDR are reflecting the current state of the IANA database. The blahST blahDT names appear to be zones in North America. I agree that it makes sense – it’s a finite list, it’s an old list, and making them aliases in the context of ECMA-402 is good regardless of what happens in upstream data and libraries. I’ve just submitted my review of this PR, mostly positive, but something to talk about in the next discussion is that the waiting period should not be normative. Everything else just seemed like it needed straightforward editorial tweaks. Largely in favor, with exception of how strong that particular recommendation should be.

JGT: For context on the waiting period: when Kiev was renamed to Kyiv, some systems were updated. Because ECMAScript canonicalizes what it gets from the system, say, if your system says Europe/Kiev, your browser says you have to use the [...]

If that ECMAScript system sends that timezone somewhere else, that somewhere else might say “hey, what’s Europe/Kyiv”? We’ve had cases of two sides, one saying “hey, why are you sending me this updated name” and the other saying “why are you sending me this outdated name”. So an idea is to leave the new one non-primary for a couple of years, and then swap it.

RGN: Thanks, JGT, overall it’s a big PR but it’s well structured and it takes everything in a good direction.

JGT: One of the underlying goals here is to ensure that, most of the ecosystem other than SpiderMonkey is using ICU and CLDR, and I want to get everyone on the path to use ICU and CLDR. SpiderMonkey notes that the spec requires behavior that ICU and CLDR doesn’t do. One of the goals of this PR is to enable SpiderMonkey to use ICU and CLDR as their base, which would simplify things for everyone and ensure that engines have the same consistent behavior. For V8 and JSC, there’d be very few changes to implement this PR. The main change would be to switch to use the getIANA API, the other change is the legacy POSIX identifiers. For SpiderMonkey it would be a bigger change, since they’ll have to stop doing their “direct to IANA” approach, and could rely on ICU and CLDR for this data.

FYT: I have a detailed question. My understanding is for V8 the changes we need to canonicalize those 9 id to a different value? How do you decide the mapping table? I can tell the one you’re talking about – PST8PST to America/Los_Angeles – but others map to other ones. They also have something mapping to a different value for EST. EST they’re mapped to America/Indianapolis. This is nitpicking about detail, but the table you’re putting there, how do you get it out?

JGT: To be honest, because they are so legacy I don’t care a lot what the sources are. IIRC why I picked Etc/ and not America/Indianapolis. They used Indianapolis because Indiana had for a long time had not adopted DST, so it could be used for eastern time but not DST. [...] Honestly, if we believe it’s better to stick with the Indianapolis case and the Phoenix case, we can stick with that.

FYT: I don’t know how CLDR generates this, I don’t know if it’s correct or not, but we have spec here saying it should be this and CLDR generates a different mapping, we would have difficulty figuring out what’s the right one.

JGT: If you consult CLDR 17111.. When I filed this issue, these three weren’t in CLDR. If CLDR has developed an opinion of what those should be…

FYT: somewhere in the source there’s a mapping table, but I don’t know what it is. If you search the CLDR tool you’ll see that, but I’m not sure that’s part of the spec. I just wonder how you generate those things – there are multiple places with those mappings, but I don’t know how important those are.

JGT: I don’t see EST anywhere in CLDR’s data – it’s not there. If you could find out where that mapping table comes from, if there’s already a way we’re doing it, I want to stick with the current webcompat way even if it’s not perfect. These are legacy, they don’t matter much, keep them the same. Do you know where it is?

FYT: Trying to find it, the thing I find might not be important. I can put in what I’ve found.

https://github.com/unicode-org/cldr/blob/a6fbd0ffa1db0c9f60dc0f678ace589366ae7f3a/tools/cldr-code/src/main/resources/org/unicode/cldr/icu/timezone_aliases.txt#L100 https://github.com/unicode-org/cldr/blob/a6fbd0ffa1db0c9f60dc0f678ace589366ae7f3a/tools/cldr-code/src/main/resources/org/unicode/cldr/util/data/TimeZoneAliases.txt#L86 I am not sure those are part of CLDR but could be just part of some tool.

https://github.com/unicode-org/cldr/blob/a6fbd0ffa1db0c9f60dc0f678ace589366ae7f3a/common/supplemental/supplementalMetadata.xml#L1800

SFC: Regarding this topic, I have a question about whether this is basically a fill-in so that CLDR has all the zone IDs that are in IANA and make aliases. I just added it to the CLDR design agenda. My position is that if browser implementations don’t need to diverge from CLDR and ICU, I just want to have them in CLDR. If they don’t need to be in CLDR, why do we need them in ECMAScript? But I would like it if CLDR implemented it so that we don’t have divergence. All in all, though, I support this direction for this proposal. I’ll take a closer look at the spec text, but if what this does is the way JGT described, I’m for it.

JGT: Yes, that’s what I’m looking for – “does the spec text match the goals in the summary?”

FYT: Pasting link to your repo, I don’t know what these tables are so I don’t know if it’s relevant.

JGT: Interesting. I wonder if ICU updates… this is really interesting, I have no idea where this comes from. It’s not in CLDR’s data.

FYT: It’s just in their repo, it may not be part of CLDR data, I don’t know how important it is. I think what we need to do we make sure we don’t conflict with anything else.

JGT: What I’ll do, now that I have this link, I’ll go through this PR and commit to aligning the info that’s in here with the table in the PR, so that we can retain compatibility with whatever’s being used here.

SFC: I would like to close the CLDR issue one way or the other – we add them or not, we find out what’s going on with some of the other data files, basically close this at a point where ICU and CLDR and ECMAScript agree on something. That’s what I want to see.

FYT: Adding another link in CLDR data, a better one. Let me paste it.I put it into PR 877 – this one is in the supplemental data, with the MST mapping to Phoenix.

JGT: What is supplement data vs?

SFC: It’s just different data. If this is just a bug - I had in another tab, you say that PST is missing from here? It’s missing from here – that’s just a bug. If it’s in supplemental data and it’s not reflected in BCP 47, that’s just a bug. This strengthens the case for 17111.

JGT: To be clear, I don’t have an opinion of where these aliases point to, I just want them to be aliases, so that an ECMAScript program has a dropdown to choose your time zone, PST8PST isn’t an option.

SFC: A correct fix would to just add consistency. The place to make the change if you feel so motivated,would be to open up cldr/common/bcp47 and fix it, and then add tests to make it consistent. If a PR were to be made against this ticket – waiting for them to make a PR might not be a good strategy, but if you make one and ask them to review it, that would be a good strategy.

FYT: One thing: if you look at that particular data file – the supplemental data – there’s something like systemV, UNIX SystemV, they have SystemV PST8PST to Los Angeles…

JGT: I have no idea where this data comes from or where it’s going. I just think that everything in IANA should be recognized by ECMAScript, and no POSIX timezone should be in a dropdown in an ECMAScript program.

FYT: Yeah, a strategy might be in CLDR, here.

JGT: What I wanted to have in the ECMA-402 spec, what we agreed in #825, our previous discussion, we wanted to say IANA + our rules should be the source of truth, and we pointed to CLDR and ICU as a source of that truth.

FYT: I agree, I just don’t want to see conflict.

JGT: Consensus here is that I will go through this file, find the mappings that exist, update the PR in ECMA-402, update the table that matches those mappings, and then make a PR against CLDR to true up the BCP data with the mappings that are shown here. All of those things should exist, but also I believe the ECMA-402 spec should include this table, since there’s no other way to define how to treat these legacy IDs.

SFC: Do we agree with the spirit of this PR, and can we bring it to the next TG1 meeting?

FYT: Could we have a rendered version where we can look at it easily?

JGT: Should I ask for consensus? Can I have consensus on this normative PR?

SFC: We don’t take consensus, but we do give a recommendation to TG1 that it should be approved. One thing that worries me is that we haven’t heard anything from Mozilla or Apple regarding this change.

EAO: I have not looked into this deeply, but it sounds fine, and do understand that Mozilla in particular needs to do more changes, but I’d be surprised if that’s a significant amount of work. Getting all of this aligned would be great, but I didn’t look in depth on what changes would be required by Firefox

YSZ: Looks fine to me

JGT: I’ve talked in the past, not about this PR in particular, and it seems like the main thing would be ripping stuff out of the SpiderMonkey build process to make a call to ICU and CLDR rather than doing some fairly complex transformations to come up with the mapping table. It’s a subtractive change.

FYT: Is it possible to do something in parallel, a proposed PR in test262 to show how to validate?

JGT: Yes. I’ll be working with PCO to build tests for it. The tests will essentially be: “here’s a mapping table we expect to work, run it through your engine and make sure it does.” There’s already similar tests like this.

FYT: So you create a DateTimeFormat with that id, and

JGT: Yes. Both V8 and JSC are out of compliance already, so the tests that currently fail will continue to fail

FYT: At least we’ll know the right one – we’ll see the impact.

JGT: From an end-user perspective the short-term thing is – Firefox is close to a full Temporal implementation thanks to Anba’s heroic efforts – and we’ll see the first implementation of this PR in SpiderMonkey. That will be a good way to show if your output matches.SFC

Conclusion

TG2 approves of the change as JGT described it, modulo alignment with CLDR legacy IDs and a more thorough review of the spec text by the editors.

I18N objections to reducing accept-language #10

SFC: I want to use this as a platform for conversation. This proposal is a proposal from Victor at Google, who works on Chrome, I have no relationship with him – he’s on a different team and essentially from a different company. He put up a proposal to reduce fingerprinting in the Accept-Language header, we already understand the problem space, but his proposal is to take the Accept-Language header and make it so that the Accept-Language that gets broadcast is only the user’s most preferred language, and make language negotiation a back-and-forth thing. I believe that this is problematic from an i18n POV, since it takes what we’ve had for a long time, a language fallback list. Given the slow introduction of Client-Hints – it’s taking something that used to be in the header and makes it a client-server negotiation. I don’t see it being adopted – in my opinion it’s harmful to i18n. Addison agrees. Safari says they only broadcast one language – my understanding from what Myles said previously is that Apple broadcasts a subset of available language pairs but could result in [..]] I hope we can have a discussion on a position on this.

RGN: What does this mean in the context of ECMA-402, since sending a particular value in an Accept-Languages header is independent of us.

SFC: Yes, it’s not 402. I’m bringing it up because we have experts in the i18n webspace. Get open questions answered – what we could is make a TG2 position on it, but I’m mainly using this platform to raise awareness.

RGN: So to understand this correctly, there is no way this turns out to have a direct impact on ECMA-402

SFC: It will have an impact on what we’re doing in this space, but not directly on ECMA-402

RGN: Given the tension between a fine-grained approach and what it means for fingerprinting, I would like to see a general sense of guidance that here’s how you would write this, if you could pick from a small number of well-defined list you’re reducing fingerprinting. But it’s possible to go too far. Reducing to a single unit is an example of going too far.

YSZ: We’d like to discuss about the current implementation in the next meeting. Right now I don’t have much concrete detailed ideas.

SFC: That was the purpose, to raise this for a more detailed discussion in the next meeting.

SFC: Someone on the thread said that WebKit is doing something in this space, which I think is not exactly what’s happening. Victor claimed that Safari only has one language, but that’s not my understanding from talking with Myles on this a couple of years ago. If you could look into it, YSZ, that would be great.

FYT: This was part of [missed]. I added that support in Netscape 2.1 in 1995. Before me Mosaic already had that in 1995. They’re asking the browser to not send anything else. What does that really mean? You can just, any browser can just not send it at all, but the receiving party should already parse that header if that’s sent null. I don’t understand what they’re proposing. Are they planning on changing the HTTP protocol, or are they trying to have browsers have a more restrictive behavior, never send out anything but main languages? Who are they proposing to? Change an RFC, or?

SFC: I had that question too. I don’t think this is a proposal to a specific body, just a proposal of what this behavior could be. It’s not an actual proposal, it’s looking for feedback. It’s like our Locale Extensions work. I’ve had this idea for a while that the ideal outcome is that we figure out a way to make Accept-Language more private while retaining good i18n on the web and find the parts of Locale Extensions to implement in that space.

FYT: In the meantime is that really the important thing, to have multiple languages there? Is it really widely used? I have mixed feelings, because back in 1995 I added it, and wonder whether it was useful. I have a hard time managing the impact. I think some web server can say “well I don’t support German but I support French so I return the French to you?”

SFC: Negotiation with the Accept-Language header is something that can be used and is used. We can quantify the impact, but it’s definitely something that’s useful and also used. Addison Phillips gave a presentation about Accept-Language about how it can and should be used, and how it should be used more widely than it currently is. That being said, there is an opportunity to focus Accept-Language on things that are useful. The whole weighting with q scores, is that important? Do you need five languages in your fallback list? Maybe, maybe not. There is certainly room, but we should maybe look at it from this angle.