Initial implementation of all ICU calendars#1245
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1245 +/- ##
==========================================
+ Coverage 95.06% 95.26% +0.19%
==========================================
Files 19 19
Lines 9695 11121 +1426
Branches 1526 1824 +298
==========================================
+ Hits 9217 10594 +1377
- Misses 470 522 +52
+ Partials 8 5 -3
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
Calendars 101While building this PR I had to learn a lot about various calendars. Here's a summary of what I've learned and some links and notes in case you'd like to learn more. I may update this comment with more information as needed. We may want to adapt this comment into the Temporal docs at some point. Calendar Types
Eras All ICU calendars except hebrew, chinese, and dangi (the Korean calendar that's essentially identical to Chinese) use eras. In this PR, each era is identified by a string "era code". An "era" is defined by the following metadata:
BTW, determining this metadata was a useful exercise in this PR. Even though the implementation may be thrown away, the structure of this metadata may be helpful long-term. The most common pattern in ICU calendars is for calendars to either have a single era that accepts negative years, or one era for positive years and another era for negative years. The only ICU exceptions are This PR proposes two ways to specify a year: a signed value relative to a per-calendar "anchor era", and a pair of values It's an open issue whether we should accept ICU Supported Calendars
Hindu Calendar One notable calendar that's not currently supported by ICU is the Hindu Calendar. This is a lunisolar calendar that is by far the most challenging calendar that I researched. http://cldr.unicode.org/development/development-process/design-proposals/chinese-calendar-support#TOC-5.-Hindu-luni-solar-calendar-old-or-new-with-several-variants-: has a good overview of the complexity. The TL;DR is that leap months are not simply added like Chinese or Hebrew. They can also be removed, or in rare cases combined! So a particular month can either be:
This calendar is a good stress test for any month-handling scheme that we may use. If it works for Hindu, it should work anywhere! This calendar is one of the reasons I'd suggest using free-form strings for month codes instead of being more proscriptive about the string format. |
General PrinciplesHere's some general principles underlying the design of this PR:
|
|
A short but important note about the
In my opinion, the
With Temporal, thanks to the custom calendars, it is possible to define any calendar historically used in Western Europe, with a switching date between 1582-10-15 (Rome and Spain) and 1752-09-14 (U.K.) or even later. |
I strongly disagree with this, sorry. It's not within scope for Temporal to specify how any of the calendars other than > new Date(Date.parse('1582-10-15T00:00Z') - 1).toLocaleString('en', {timeZone: 'UTC', calendar: 'gregory'})
'10/14/1582, 11:59:59 PM'It would be very bad if the results differed by 10 days depending on whether you used Date.toLocaleString or one of the Temporal toLocaleString methods. |
|
Justin, thanks for the awesome extra background and notes around this. Very helpful for diving in. Re: new fields,
edit: @ptomato linked me to #1235, which had slipped my mind. Will comment on I think the anchor era should be an observable property, otherwise it will be a magic number/string/other in documentation. |
@Louis-Aime - To clarify, @ptomato isn't saying your feedback is bad, only that there's a division of responsibility. Temporal is responsible for the Calendar API and for defining requirements and invariants that all calendars should follow. But others are responsible for the actual calculations of each built-in calendar, including decisions like whether That said, if any of your feedback may have impact on the Temporal API, then it's definitely welcome to file here. For example, your feedback above got me wondering if developers would be able to successfully pass calendar-specific options (e.g. a Gregorian->Julian transition date) from @sffc, if @Louis-Aime (or others) has feedback about the |
|
The |
ptomato
left a comment
There was a problem hiding this comment.
Thanks, this is a lot of work, I think it meets the goal of validating that the current design of the calendar API is flexible enough to accommodate all the different calendars, with a few minor modifications. That is an important milestone.
I would like to make sure that we make these modifications with as few API additions as possible.
Separately, I would like to start merging parts of this so that we don't have the whole thing blocked on the resolution of the open discussions. It seems that #1227 and #1229 would be good places to start on that. Would you be able to start by splitting those out or is it something that @cjtenny or I should do?
polyfill/lib/calendar.mjs
Outdated
| fields = ES.ToRecord(fields, [ | ||
| ['day'], | ||
| ['era', undefined], | ||
| ['eraYear', undefined], |
There was a problem hiding this comment.
I think the ISO calendar should not observably access era, eraYear, or monthCode, since those fields have no meaning in ISO calendar space (and would not be included in implementations without Intl.) Maybe this part should move to impl['gregory']?
There was a problem hiding this comment.
Hmm, let's discuss this in context of your comment below about whether to offer these properties on the ISO calendar.
polyfill/lib/calendar.mjs
Outdated
| }, | ||
| eraYear(date) { | ||
| if (!HasSlot(date, ISO_YEAR)) date = ES.ToTemporalDate(date, GetIntrinsic('%Temporal.PlainDate%')); | ||
| return GetSlot(date, ISO_YEAR); |
There was a problem hiding this comment.
I think all of these should be undefined in the ISO calendar. (assuming we have fields month and monthCode in non-ISO calendars; if not, then things might need to be different)
There was a problem hiding this comment.
I think that it'd be better to have valid values for the ISO calendar, because it'd make it easier to write cross-calendar code. Otherwise, writing cross-calendar libraries would require a lot more special casing and would introduce more bugs. This is especially true for monthCode which is required for PlainMonthDay in lunisolar calendars.
Here's what I had in mind:
monthCode- always${month}monthType- always'regular'eraYear- alwaysyearera-undefined(just like era-less calendars like Hebrew and Chinese)
What do you think?
polyfill/lib/plainmonthday.mjs
Outdated
| const val1 = GetSlot(this, slot); | ||
| const val2 = GetSlot(other, slot); | ||
| if (val1 !== val2) return false; | ||
| // optimization: avoid calling getters if slots are equal |
There was a problem hiding this comment.
As discussed in #1239 I don't think this should change.
Apologies if I came across badly in my comment. |
Agreed. It seems like the Calendar API is on the right track with a few mods.
Me too!
After spending so much time on this PR, I'm backed up on other things so would love help! Feel free to pull any subset from this PR that you think would be useful to split out. #1227 is simple and self-contained so is a great candidate. #1229 and #1235 may be somewhat inter-related so it may make sense to tackle both in the same PR. |
#1227 is separated out and merged now. If you don't mind, I will rebase this PR every time we merge something so that we don't accumulate merge conflicts. |
Cool! Rebase after every PR sounds good, but I'm working on some bug fixes now (the current code doesn't handle Also, #1253 might be another candidate for splitting out. It's similar to #1229 and #1235, except for options instead of fields. |
eb158bf to
7dab912
Compare
|
I just pushed a commit that should fix behavior of the
|
|
I think this may not address #1229 after all - though it adds new coercion types for monthCode and eraYear and changes default values to enable other calendars to work. I can't figure out how it would allow a custom calendar to receive those fields pre-coercion or to do the validation on its own. I've got a commit for #1229 that does do that, which is almost done and I'll push it out in the morning (PST). It doesn't overlap with these changes much at all so it should be a simple rebase, which @ptomato or I can tackle (I don't yet have push bits on this repo, so I can't rebase here directly). Edit: I believe it will also address #1253 , will take a closer look in the morning. |
AFAIK, the data flow in this PR works like this:
Which step above is problematic? Here's an example of that workflow (from const fieldNames = ES.CalendarFields(calendar, ['day', 'month', 'year']);
const fields = ES.ToTemporalDateFields(item, fieldNames);
return ES.DateFromFields(calendar, fields, constructor, overflow); |
cjtenny
left a comment
There was a problem hiding this comment.
Comments part 2; stepping out for lunch and a pickup trip to the library then I'll be back later this afternoon for part 3.
| } | ||
| } | ||
| if (monthCode === undefined) monthCode = this.getMonthCode(year, month); | ||
| return { ...calendarDate, day, month, monthCode, year, eraYear }; |
There was a problem hiding this comment.
Will this lead to inconsistent era output re previous comment? As far as I can tell it doesn't matter, but trying to understand why an undefined era is inserted above with an eraYear. Ideally both or neither should be present if possible, even though I think this is all impl-internal.
Also, should the hebrew calendar have constant era 'am' or 'ah'?
There was a problem hiding this comment.
It shouldn't cause any harm. For internal-only processing, I generally followed the rule of always filling in eraYear === year for non-era-using calendars in order to avoid special-case code in cases where eraYear is used. As you noted, I I don't think this is observable behavior.
There was a problem hiding this comment.
Also, should the hebrew calendar have constant era 'am' or 'ah'?
I assumed that hasEra should be true for calendars whose default .toLocaleDateString('en-US-u-ca-...') output contains an era, and false for calendars where it doesn't. The latter group is only ISO, Hebrew, and Chinese/Dangi. So that's why there's no era in Hebrew. But this is also really easy to change. @sffc, @Manishearth - do you guys have any opinions about which calendars should and should not use eras?
| const matchingEra = | ||
| era === undefined ? undefined : this.eras.find((e) => e.name === era || e.genericName === era); | ||
| if (!matchingEra) throw new RangeError(`Era ${era} (ISO year ${eraYear}) was not matched by any era`); | ||
| if (eraYear < 1 && matchingEra.reverseOf) { |
There was a problem hiding this comment.
does this properly reject era: 'ce', eraYear: 0?
There was a problem hiding this comment.
The convention I used (at least I think I did!) was the same as what legacy Date does with era-less input: out of range values are allowed and evaluated relative to the epoch of the era. So era: 'ce', eraYear: 0 is treated as ISO year 0 and will be stored as 1 BCE. An exception is reverse eras, which I require to be positive because -1 BCE is clearly a bug.
We could also choose to reject 0 CE inputs. I don't feel strongly either way.
There was a problem hiding this comment.
Hm, I think if we're in a 'constrain' context that makes sense, but a calendar should reject an eraYear that's not valid for the context of a given era in a 'restrict' context. It seems inconsistent to coerce eraYear 0 to CE 1 otherwise.
There was a problem hiding this comment.
Sorry, I wasn't clear. The current behavior for all eras is not to constrain eraYear, but rather to evaluate eraYear as a signed value in context of the era's epoch. So 0 CE => 1 BCE, and Meiji 100 => Showa 42.
That's why I'm a little skeptical about constrain behavior because it's a little ambiguous about what it should do: constrain to first/last day or era? Or current behavior?
There was a problem hiding this comment.
BTW, one reason why the current behavior is as-is is to make it cleaner to deal with dates in years where eras end mid-year. For example, Jan 1 Reiwa 1 is technically in the previous era because Reiwa started May 1. By always treating eraYear as a signed value without constraints, it includes handling the Jan 1 Reiwa 1 case too. I think there was discussion of this in another issue but forget which.
There was a problem hiding this comment.
Hm, I think this is tricky. There's no clear temporal spec for this, but it does seem in the spirit of constrain vs reject that Meiji 100 would be a RangeError when treated strictly, and Showa 42 when treated loosely. Same for 0 CE vs 0 BCE, for example. My preference would be not to evaluate eraYear as a signed value in the context of an epoch; that's what year is for. This could lead to developer mistakes, too, though it's admittedly a little obscure. Let me see if I can find that other discussion after reviewing the tests, before I suggest any changes.
There was a problem hiding this comment.
If OK with you, I'd prefer to handle this question as a follow-up PR, because this case seems ambiguous enough that we may want to discuss it with a wider group before committing to a course of action. To summarize, there are a few questions to answer:
- How to handle out-of-bounds eraYear with
reject? - How to handle out-of-bounds eraYear with
constrain? - How to handle out-of-bounds months/days (but in-bounds eraYear) with
reject? - How to handle out-of-bounds months/days (but in-bounds eraYear) with
constrain?
AFAIK, we already decided (3) and (4) in #1300, which was to accept those values without exceptions. So unless there's new information I'd prefer to leave those closed decisions closed.
Changes in response to review feedback by @cjtenny: * Indian calendar estmates are now exact * Exact estimates are more efficient now * Fixed some wrong comments
cjtenny
left a comment
There was a problem hiding this comment.
Part 3, now onto the tests.
polyfill/lib/calendar.mjs
Outdated
|
|
||
| const helperJapanese = ObjectAssign( | ||
| {}, | ||
| // NOTE: For convenience, this hacky implementation only supports the most |
There was a problem hiding this comment.
I'll file a (non-blocking, separate) issue to mirror Intl's supported eras in the polyfill tomorrow, after someone decides on the THIS-IS-NOT-A-SPEC-CHANGE issue tag
There was a problem hiding this comment.
This is actually an old comment that I inherited from @ptomato but the reason for only showing 5 eras is out of date. In our meeting with the ICU4X folks last week, we decided that @sffc and @Manishearth would do more research and choose between two options:
- only supporting Meiji and newer eras, and use ce/bce for older dates
- supporting all CLDR eras
The reason why neither of these is an obviously best winner is that older eras are apparently both infrequently used and are also frequently subject to revision by historians... and in some cases are not precisely defined down to the month/day/year. This makes it hard to use older eras for Temporal features that require eras to start and end on precise dates. Let's use #526 for more discussion. If @sffc and @Manishearth decide that "use all the CLDR eras" is the way we should go, then there's definitely a work item needed to populate those into the polyfill, but it's not a guarantee that this is needed.
In the meantime I'll clarify this comment to match the latest state of things.
There was a problem hiding this comment.
Here's more context about Japanese era identifier choice: https://docs.google.com/document/d/1vMVhMHgCYRyx2gmwEfKRyXWDg_lrQadd8iMVU9uPK1o/edit?ts=601316e5#heading=h.11oia1rbgkns
4ea9823 to
ba854d2
Compare
* Fix Indian calendar tests in Node 12/14 * Add new tests for Indian leap year and Indian ICU bugs * Refactor ICU bug checking to be more generic * Update outdated comments * Remove redundant validation
ba854d2 to
c372905
Compare
No worries, this is a great review. I'll add these tests tomorrow. Also, sorry for all the commit spam tonight. It's painful when bugs only repro in Node 12 or 14 and the only place I can repro them is in CI! |
* Ensure that invalid `month` is constrained or rejected per options * Ensure that invalid `monthCode` is always rejected (fixes tc39#1349) * Add tests for PlainMonthDay to verify the above and to ensure that leap days and leap months are constrained accurately
116a88c to
dfe9e7e
Compare
Done! At this point, all tests are passing and I believe that all review feedback has either been addressed or I've suggested to punt the feedback to a follow-up. So should be ready to merge if you like what you see! BTW, adding those PlainMonthDay tests was a really good suggestion, because adding these tests exposed a few bugs, including that I wasn't constraining/rejecting |
|
Ah, crap I spoke too soon. One of the Test262 tests are failing. I'll push a fix later this afternoon-- my laptop is about to run out of battery. |
|
I fixed the last failing 262 test, which was a weird one: Anyway, LMK if any more changes are required. |
|
Awesome Justin, thanks for all the work here. I'll be on tomorrow morning and creating follow up issues, hopefully merging! |
The previous commit failed the`PlainMonthDay({month: Infinity, day})`
test because 'MInfinity' is obviously invalid. This commit changes the
"calendar omitted" path of ToTemporalMonthDay (which previously
added a `monthCode`) to add a `year` (1972) instead which won't
cause the MInfinity problem.
While we're in the neighborhood, this commit also adds one more test
for round-tripping PlainMonthDay values to/from ISO strings.
dd913bd to
57b83f7
Compare
I realized this morning that there's a much simpler solution (only a 2-line change in ecmascript.mjs) to this problem, so I just replaced last night's commit with this simpler fix. |
cjtenny
left a comment
There was a problem hiding this comment.
Thanks again for all your work here, @justingrant ! I'm sorry for the late-in-the-weekend review, my brain was feeling like molasses all weekend so I wanted to take a few extra passes. I've gone over the changes enough times now that I'm confident they're correct :) Congratulations on this beastly change.
|
Thanks, glad to help! @cjtenny I assume you're going tp open follow-up issues? Just wanted to make sure that I'm not on the hook for those. ;-)
Cool, thanks. FWIW, this was actually an easy problem to find because it was just pure luck that obviously-bogus I assume there's already a spec change needed to add the M prefix to |
|
Sounds good on both counts. Yeah, I'll take care of them :) I've been irrationally waiting to turn my emacs buffer of issues into GH issues in the hopes that someone else will come up with a tag for really-not-changing-for-stage3 issues, but I need to bite the bullet 😅 |
This PR adds initial implementations for all supported ICU non-ISO calendars, and proposes a few possible changes to the Temporal API to address issues found while building those calendars. The goal of this PR is to validate the Temporal calendar API and to identify potential issues we may need to discuss and/or resolve. To set expectations, the calendar implementations themselves pass basic tests and match the output of
Intl.DateTimeFormat, but are not intended to have production-ready accuracy nor performance. Also, this PR has some gaps and unfinished parts, but I wanted to get it into GitHub so we can start discussion and resolve open issues. On a personal note, getting this PR into GitHub was helpful to keep my mind off the chaos in Washington today! Fixes #1210. FIxes #1349.At a high level, here's what's in this PR:
DateTimeFormat.formatToParts. But calendar->ISO is even hackier: the code tries to roughly estimate an ISO date from the calendar date, then roundtrips back to calendar and then bisects the difference between estimate and actual values until we end up with the right values. This is inefficient and slow, but it guarantees that this PR's results will always match DateTimeFormat's results and it's probably good enough for an interim implementation before a real production polyfill is built later this year.monthCodethis is a year-independent, string-valued property that describes the month. For ICU calendars, it follows the iCalendar convention of${month}for non-leap months, and is${previousMonth}Lfor leap months (e.g.'M5L'for Hebrew Adar I)eraYear- the era-relative year, as a numbermonthfield as the 1-based numeric index of the month in the current year. For, example, in a Chinese leap year the leap month with monthCodeM7Lwill havemonth === 8. UnlikemonthCode,monthis not guaranteed to refer to the same month name in every year for lunisolar calendars.yearfield as a signed value relative to an "anchor era" for each calendar, e.g. "AD" for Gregorian. This avoids potential confusion and security issues that could result from years counting backwards for years in eras like BC or the ROC calendar's Before R.O.C. era.yearor aneraYear/erapair to specify the year infromorwith, and it enables developers to use eithermonthormonthCodeto specify the month.Here's a summary of issues related to this PR. I marked those that already contain possible fixes in this PR. There's quite a few issues here, so even if the code here can fix some of these issues, it may make sense to split those fixes out into separate PRs. For questions about specific issues, please discuss in those GH issues to keep our discussions in one spot!
Here's smaller issues that I expect to be are relatively non-controversial:
These issues I expect to require more discussion to resolve.
from()orwith()that are not emitted bygetFields()#1235 Calendars can't accept fields infrom()orwith()that are not emitted bygetFields()overflowoption behave with leap days and months inPlainMonthDay.from()? #1237 How should theoverflowoption behave with leap days and months inPlainMonthDay.from()?The items below were handled in separate PRs: