Unique identifiers for Japanese calendar eras #526

Open
ptomato opened this issue Apr 23, 2020 · 22 comments
Labels
calendar Part of the effort for Temporal Calendar API ecma402 Behavior specific to implementations supporting ecma402 non-prod-polyfill THIS POLYFILL IS NOT FOR PRODUCTION USE!
@ptomato
Collaborator

ptomato commented Apr 23, 2020

In all our pseudocode examples supposing a Japanese calendar thus far, we've been passing in era names as lowercased ASCII string identifiers made from English transliterations of the names, e.g. { year: 2, era: 'reiwa' } for the current year. That requires that these identifiers are unique. However, that seems not to be the case as there are several eras written with different kanji but with the same English transliteration.

The ICU names (which I guess must be for output only, not input) disambiguate this by adding a range of years to all of the era names in Latin locales except the most recent five. (For example in the C locale: https://github.com/unicode-org/icu/blob/master/icu4c/source/data/locales/root.txt#L1582-L1818)

What to do for input? Requiring the user to specify the era as a numerical index is extremely unergonomic, (for example, { year: 2, era: 181 }) because they're not otherwise referred to numerically as far as I can tell. This could be mitigated with numeric constants?

There must be some sort of prior art for software that does these calculations.

For background and a list of eras: https://en.wikipedia.org/wiki/Japanese_era_name — disclaimer, all I pretend to know about this calendar comes from that Wikipedia page.

This may be a problem in other calendars as well if they use non-numerical identifiers for things. I don't know if that's the case in any other calendars.


These are the duplicate names in the ICU data:

  • Jōgen - ISO years 976–978 (貞元) and 1207–1211 (承元)
  • Shōryaku - ISO years 990–995 (正暦) and 1077–1081 (承暦)
  • Eishō - ISO years 1046–1053 (永承) and 1504–1521 (永正)
  • Shōhō - ISO years 1074–1077 (承保) and 1644–1648 (正保)
  • Kōwa - ISO years 1099–1104 (康和) and 1381–1384 (弘和)
  • Tenshō - ISO years 1131–1132 (天承) and 1573–1592 (天正)
  • Kōji - ISO years 1142–1144 (康治) and 1555–1558 (弘治)
  • Shōan - ISO years 1171–1175 (承安) and 1299–1302 (正安)
  • Jōō - ISO years 1222–1224 (貞応) and 1652–1655 (承応)
  • Enkyō - ISO years 1308–1311 (延慶) and 1744–1748 (延享)
  • Shōwa - ISO years 1312–1317 (正和) and 1926–1989 (昭和)
  • Genkō - ISO years 1321–1324 (元亨) and 1331–1334 (元弘)

Note: Meitoku appears twice as well, from ISO years 1384–1387 and 1390–1394, but Wikipedia lists just one Meitoku era, and new Date('1385-01-01').toLocaleString('ja-JP-u-ca-japanese', {era: 'short'}) shows the Genchū era instead.
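
To make the collision concrete, here is a minimal sketch of how lowercased ASCII identifiers end up duplicated. The helper names `toAsciiId` and `findCollisions` are hypothetical, and the sample data is only a handful of entries copied from the list above, not the full ICU data.

```javascript
// Hypothetical helper: derive a lowercased ASCII identifier from a romanized era name.
function toAsciiId(romanized) {
  return romanized
    .normalize("NFD")                 // split "ō" into "o" + combining macron
    .replace(/[\u0300-\u036f]/g, "")  // drop the combining diacritics
    .toLowerCase();
}

// A few entries from the list above: [romanized name, kanji, first ISO year].
const eras = [
  ["Shōwa", "正和", 1312],
  ["Shōwa", "昭和", 1926],
  ["Genkō", "元亨", 1321],
  ["Genkō", "元弘", 1331],
  ["Reiwa", "令和", 2019],
];

// Group eras by ASCII identifier and keep only identifiers shared by several eras.
function findCollisions(list) {
  const byId = new Map();
  for (const [name, kanji, startYear] of list) {
    const id = toAsciiId(name);
    if (!byId.has(id)) byId.set(id, []);
    byId.get(id).push({ kanji, startYear });
  }
  return new Map([...byId].filter(([, entries]) => entries.length > 1));
}

const collisions = findCollisions(eras);
console.log([...collisions.keys()]); // [ 'showa', 'genko' ]
```

The kanji stay distinct in every pair; only the transliterated identifier collides.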

@ptomato ptomato added the calendar Part of the effort for Temporal Calendar API label Apr 30, 2020
@ryzokuken ryzokuken added this to the Stage 4 milestone May 20, 2020
@littledan
Member

In general, this comes up on the ECMA-402 side of Temporal, where the non-ISO calendars are specified. This comes up for Intl more broadly, e.g., for some future extension of Intl.DisplayNames. For Stage 3, I think it's fine for us to have vague text about the names. This is an action item for the ECMA-402 WG to come to a conclusion on, and should be part of the Stage 4 final integrated spec text.

@Louis-Aime

How about designing the era optional field of Temporal.Date, rather than a text ? The name of the era would be resolved with the same mechanism as for months and weekdays.

@sffc
Collaborator

sffc commented Jan 13, 2021

Meiji is the first era in Modern Japan. One idea therefore would be to adopt a scheme such as:

  • era10001 = Meiji
  • era10002 = Taishō
  • era10003 = Shōwa
  • era10004 = Heisei
  • era10005 = Reiwa

Eras prior to Meiji can be inserted in descending order, like:

  • era09990 = Keiō
  • era09980 = Genji
  • era09970 = Bunkyū

I intentionally left one extra digit at the end in case eras need to be added in the future. I'm told that as we get far back into Japanese history, eras are sometimes still being discovered.
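
A rough sketch of how the codes above could be generated, assuming modern eras count up from Meiji in steps of 1 and pre-Meiji eras count down in steps of 10. The function names and the index parameters are hypothetical; only the resulting strings come from the lists above.

```javascript
// Hypothetical formatter for the "eraNNNNN" scheme sketched above.
const MEIJI_CODE = 10001;

// Modern eras count up from Meiji in steps of 1 (0 = Meiji, 1 = Taishō, ...).
function modernEraCode(indexFromMeiji) {
  return "era" + String(MEIJI_CODE + indexFromMeiji).padStart(5, "0");
}

// Pre-Meiji eras count down in steps of 10 (1 = Keiō, 2 = Genji, ...),
// which leaves nine unused codes between neighbours for later insertions.
function preMeijiEraCode(stepsBeforeMeiji) {
  return "era" + String(10000 - 10 * stepsBeforeMeiji).padStart(5, "0");
}

console.log(modernEraCode(0));   // era10001 (Meiji)
console.log(modernEraCode(4));   // era10005 (Reiwa)
console.log(preMeijiEraCode(1)); // era09990 (Keiō)
console.log(preMeijiEraCode(3)); // era09970 (Bunkyū)
```

Because the codes are zero-padded to a fixed width, sorting them as strings gives chronological order.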

@Louis-Aime

Is there any official Japanese institution that handles this issue ? The Temporal team could then take into account the way they "code" the eras, if they do so. And how they do if they discover new eras in the past.

@Manishearth

So I and @sffc have been discussing some options and tradeoffs, and we want to eventually present them, but first I wanted a quick temperature check: How do people feel about just using Kanji for the era names? It's guaranteed to be unique, and ultimately these are identifiers so people can copy-paste them.

The downsides are:

  • There's a potential for encoding issues
  • Web APIs typically use ASCII identifiers, this is a departure
  • This is bound to be somewhat controversial, though to some extent I think it's good to push developers out of the ascii bubble.

Mostly looking to see if people think this is a good idea, if they think it's a good idea but have some reservations (which we can weigh when we look at the tradeoffs), or if they think it's a really bad idea and should be excluded from consideration.

@macchiati

When we are talking about programmatic identifiers (not user strings), as limiting as it is, ASCII alphanums have the advantage of :

  • being recognizable by essentially all programmers
  • useful in protocols that limit the character set (eg BCP47)

@ptomato
Collaborator Author

ptomato commented Jan 14, 2021

I don't think it's a bad idea but I also don't think pushing developers out of an ASCII / C-locale / Gregorian bubble is a design goal of Temporal. Rather, it's a goal to make doing the right thing the easiest course of action. I don't see this significantly contributing in that way.

@Manishearth

@ptomato yeah the primary goal of that proposal is that it avoids a bunch of other tradeoffs due to the ascii requiring disambiguators. It does seem like it's worth talking about, though perhaps may not be the winner when we weigh all the other tradeoffs against each other

@justingrant
Collaborator

If names are problematic, I don't see a problem with a scheme like @sffc described above in #526 (comment).

A hybrid approach could be pairing a numeric code with a transliterated string, e.g. 10001_meiji, 10002_taisho, etc. Even if the word part is duplicated, the number could disambiguate. Keep in mind that these strings are never supposed to be shown to end users, so IMHO it really doesn't matter what they are as long as they can be read and copy/pasted easily.
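
As a sketch of the hybrid idea, the numeric part can carry identity while the word part stays as a readability aid. The parser below is hypothetical, and the code `08000` for the medieval Shōwa era is a made-up number used only for illustration.

```javascript
// Hypothetical parser for hybrid identifiers such as "10001_meiji".
function parseHybridEraId(id) {
  const match = /^(\d{5})_([a-z]+)$/.exec(id);
  if (!match) throw new RangeError(`not a hybrid era identifier: ${id}`);
  return { code: Number(match[1]), name: match[2] };
}

// Two identifiers name the same era only if their numeric parts match;
// the transliterated part is ignored for comparison.
function sameEra(a, b) {
  return parseHybridEraId(a).code === parseHybridEraId(b).code;
}

const modernShowa = "10003_showa";
const medievalShowa = "08000_showa"; // made-up code, for illustration only
console.log(sameEra(modernShowa, medievalShowa)); // false
console.log(sameEra(modernShowa, "10003_showa")); // true
```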

I'd strongly suggest sticking to ASCII for the reasons noted above. In particular, using Kanji seems fraught with potential problems because a lot of localized software development and debugging happens by developers who aren't fluent in the language being used. Using characters that are not readable by the developers writing the code seems like a bad idea.

@Manishearth

Yes, there are other options with different tradeoffs, i mostly wanted to get a temperature check on the kanji option in isolation

@sffc
Collaborator

sffc commented Jan 21, 2021

Based on #1311, we now have separate "algebraic year" and calendar-specific "era year". This means we need to choose an anchor year for the Japanese calendar. I think there are two options on the table right now.

  1. Anchor in Meiji, the first era in Modern Japan (1868)
  2. Anchor to CE 1, such that the algebraic year is equal to the ISO year
// Option 1
{
    era: "reiwa",
    eraYear: 3,
    year: 154, // 2021 - 1868 + 1, counting Meiji 1 (1868) as year 1
}

// Option 2
{
    era: "reiwa",
    eraYear: 3,
    year: 2021
}
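
The arithmetic behind the two options can be sketched as follows. This assumes Meiji 1 (1868) counts as year 1 under Option 1 and uses the commonly cited first ISO year of each modern era; the table and function names are hypothetical, and era changes that fall mid-year are ignored.

```javascript
// First ISO year of each modern era (year granularity only).
const ERA_START_ISO_YEAR = {
  meiji: 1868,
  taisho: 1912,
  showa: 1926,
  heisei: 1989,
  reiwa: 2019,
};

// Year 1 of an era is the ISO year in which the era began.
function isoYearOf(era, eraYear) {
  return ERA_START_ISO_YEAR[era] + eraYear - 1;
}

// Option 1: algebraic year counted from Meiji 1 = year 1.
function yearAnchoredInMeiji(era, eraYear) {
  return isoYearOf(era, eraYear) - ERA_START_ISO_YEAR.meiji + 1;
}

// Option 2: algebraic year equal to the ISO year.
function yearAnchoredInCE(era, eraYear) {
  return isoYearOf(era, eraYear);
}

console.log(yearAnchoredInMeiji("reiwa", 3)); // 154
console.log(yearAnchoredInCE("reiwa", 3));    // 2021
```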

@justingrant
Collaborator

I like the option of anchoring to CE 1, given that in Japanese (unlike AFAIK all other ICU calendars) there's no obvious default era to use. This would also sidestep a potentially politically controversial decision, per Wikipedia:

The imperial year system (kōki) was used from 1872 to the Second World War. Imperial year 1 (Kōki 1) was the year when the legendary Emperor Jimmu founded Japan – 660 BC according to the Gregorian Calendar. Usage of kōki dating can be a nationalist signal, pointing out that the history of Japan's imperial family is longer than that of Christianity, the basis of the Anno Domini (AD) system. Kōki 2600 (1940) was a special year. The 1940 Summer Olympics and Tokyo Expo were planned as anniversary events, but were canceled due to the Second Sino-Japanese War. The Japanese naval Zero Fighter was named after this year. After the Second World War, the United States occupied Japan, and stopped the use of kōki by officials. Today, kōki is rarely used, except in some judicial contexts.

@cjtenny
Collaborator

cjtenny commented Jan 22, 2021

Instead of sidestepping that political decision, would anchoring to CE 1 possibly be wading into that controversial area - selecting an imperialist / pro-Christian reinforcement of the Gregorian calendar and/or US occupational history?

@justingrant
Collaborator

Instead of sidestepping that political decision, would anchoring to CE 1 possibly be wading into that controversial area - selecting an imperialist / pro-Christian reinforcement of the Gregorian calendar and/or US occupational history?

Great point. I thought about this more and still think that making year be the ISO year is best. Here's why: the modern Japanese calendar is solely concerned with turning era/eraYear into an ISO year. It has no other non-ISO features if eras are not involved.

So instead of inventing a default era, which at best will be novel (thereby requiring developer education) and at worst may invite cultural complaints, seems like we should just accept that this is really the ISO calendar with an alternate way to describe years. If developers want to use the "Japanese calendar-ness" they can use era/eraYear. If they don't care about the Japanese-specific part of the calendar, then I don't see how we'd be adding value by using a non-ISO year.

I'm open to being overruled by folks who understand the cultural context, but for an initial default I feel safer in picking ISO than choosing an arbitrary era.

@justingrant
Collaborator

Here's some context from a Japanese colleague of @sffc.

As you may know, Japan historically used a lunar calendar imported from China, with some adjustments from time to time. Japan officially retired the lunar calendar system and switched to the Gregorian calendar system, keeping the Japanese era/year, on January 1, 1873. Starting in 1873, month and day are fully synchronized with the Gregorian calendar; only the year uses the Japanese era/year system. That date falls in the 5th year of the Meiji (明治) era.

  • 1872-12-31 = 明治4年12月2日 (Meiji 4 Dec 2)
  • 1873-01-01 = 明治5年1月1日 (Meiji 5 Jan 1)

Most people are familiar with the 5 eras since Meiji:

  • 明治 / Meiji (1868-10-23 - 1912-07-29)
  • 大正 / Taisho (1912-07-30 - 1926-12-24)
  • 昭和 / Showa (1926-12-25 - 1989-01-07)
  • 平成 / Heisei (1989-01-08 - 2019-04-30)
  • 令和 / Reiwa (2019-05-01 -)

However, ordinary people in Japan usually have no idea about era names before Meiji. This is mainly because:

  • There were many very short eras before Meiji; most of them lasted less than 10 years.
  • They are simply too old (150+ years). People born before Meiji are no longer alive. In 1976, the last person who was born before Meiji passed away. (BTW, it looks like over one thousand people born in Meiji are still alive.)
  • Modern books, including textbooks, usually use Western years before Meiji.
  • The lunar calendar was used for all eras before Meiji. Most people have almost no knowledge of the lunar calendar system that was historically used. It's not easy for people to imagine the correlation between a lunar date and a solar date.

From the aspect of cultural support on computer systems, I would expect Japanese calendar behaves as below:

  • 5 eras since Meiji are used.
  • Use Western year before Meiji.

I would suggest:

  • Use the Western calendar for year/month/date calculation internally. The Western calendar might be the proleptic Gregorian calendar, or Julian/Gregorian (Julian up to 1582, then Gregorian after that).
  • Western calendar eras + Japanese eras starting with Meiji: BC/AD/Meiji/Taisho/Showa/Heisei/Reiwa. BC and AD are not really Japanese eras, though.
  • The first day of the Meiji era in this implementation would be Meiji 5 Jan 1. Meiji 4 Dec 2 should be AD 1872 Dec 31 instead.

For your questions:

Which would be easier for Japanese developers to learn and use?
a) Era-less dates are always relative to Meiji 1
b) Era-less dates are always relative to 1 CE

(a) is a bad option. No one can easily work out a year counted from Meiji 1. Meiji 1 is still a lunar-calendar date (Meiji 1 Jan 1 = 1868-01-25), and the date jumps between Meiji 4 and 5, as I mentioned above. We could apply the Gregorian system to dates before Meiji 5, but then all dates before Meiji 5 would differ from the actual dates.

I think (b) is the right choice.

Note that we can’t use “era-less dates are always relative to the current era” because the goal of year is to be relative to a fixed epoch that will never change.

That's fine. I think Japanese era/year should be "computed fields" based on Gregorian date.

Can you share your perspective on potential political or cultural controversy with choosing either of these approaches?

In general, Japanese era/year expressions are getting less popular these days. Most people pivot through Gregorian dates. I don't see any potential political or cultural controversy with (b).

@sffc
Collaborator

sffc commented Jan 30, 2021

2021-01-29: We had a lengthy discussion about this today and decided to use the transliteration as the era identifier for the five modern eras (starting with meiji). For pre-meiji eras, we will use either a century syntax (like heisei-13xx) or just fall back to the Gregorian CE/BCE eras.
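
A minimal sketch of what that decision could look like, assuming the era start dates quoted earlier in this thread, an algebraic year equal to the ISO year, and the "fall back to CE/BCE" variant for pre-Meiji dates. The identifiers `ce` and `bce`, the table, and the function names are assumptions for illustration, not what ECMA-402 or the polyfill specifies.

```javascript
// Modern eras, newest first, with their ISO start dates as [year, month, day].
const MODERN_ERAS = [
  { era: "reiwa",  start: [2019, 5, 1] },
  { era: "heisei", start: [1989, 1, 8] },
  { era: "showa",  start: [1926, 12, 25] },
  { era: "taisho", start: [1912, 7, 30] },
  { era: "meiji",  start: [1868, 10, 23] },
];

// Negative if a is before b, zero if equal, positive if a is after b.
function compareYMD(a, b) {
  return a[0] - b[0] || a[1] - b[1] || a[2] - b[2];
}

// Hypothetical resolver: ISO date in, { era, eraYear, year } out.
function eraFieldsFor(isoYear, isoMonth, isoDay) {
  const date = [isoYear, isoMonth, isoDay];
  for (const { era, start } of MODERN_ERAS) {
    if (compareYMD(date, start) >= 0) {
      return { era, eraYear: isoYear - start[0] + 1, year: isoYear };
    }
  }
  // Before Meiji: fall back to Gregorian-style eras (assumed identifiers).
  return isoYear >= 1
    ? { era: "ce", eraYear: isoYear, year: isoYear }
    : { era: "bce", eraYear: 1 - isoYear, year: isoYear };
}

console.log(eraFieldsFor(2021, 1, 29)); // { era: 'reiwa', eraYear: 3, year: 2021 }
console.log(eraFieldsFor(1989, 1, 7));  // { era: 'showa', eraYear: 64, year: 1989 }
console.log(eraFieldsFor(1867, 1, 1));  // { era: 'ce', eraYear: 1867, year: 1867 }
```

Only five identifiers are needed for the Japanese eras here, so the transliteration collisions with pre-Meiji eras never arise.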

@Louis-Aime

Louis-Aime commented Jan 30, 2021

I understand from @justingrant's comment the Japanese people would just like to stick to the ISO calendar for all dates before the Gregorian solar calendar was enforced. This is less controversial than choosing year ISO 1940 or year ISO -660 as a starting year. The Japanese people now stick to the Gregorian calendar and the western 7-days week, that come from western Europe. This seems not controversial. Even Saudi Arabia uses ISO8601 now.
I tried to convince the Japanese people not to stick to the Gregorian calendar when choosing the day of Emperor Akihito's abdication (see - in French: https://blog.calendriermilesien.org/2017/06/quel-jour-lempereur-du-japon-choisira-t.html), but eventually they switched at a "simple" Gregorian date, May 1st.
As dating events with ancient luni-solar calendars is a very difficult task, I would warmly suggest they use ISO8601 even for dates before 1582, which is easier to handle than the Julian calendar.
In the end, this imperial calendar should be specified by a Japanese authority (like the Académie des Sciences/Bureau des Longitudes in France). But meanwhile, I find the decision mentioned by @sffc as the most reasonable.

@justingrant
Collaborator

BTW, whichever solution is decided here should be aligned across legacy Date and Temporal. If 1867-01-01[u-ca-japanese] is going to be ad or ce in Temporal.PlainDate.prototype.era, then it should be the same era in Date.prototype.toLocaleDateString() too.

@cjtenny
Collaborator

cjtenny commented Feb 5, 2021

I understand from @justingrant's comment the Japanese people would just like to stick to the ISO calendar for all dates before the Gregorian solar calendar was enforced. This is less controversial than choosing year ISO 1940 or year ISO -660 as a starting year. The Japanese people now stick to the Gregorian calendar and the western 7-days week, that come from western Europe. This seems not controversial. Even Saudi Arabia uses ISO8601 now.
I tried to convince the Japanese people not to stick to the Gregorian calendar when choosing the day of Emperor Akihito's abdication (see - in French: https://blog.calendriermilesien.org/2017/06/quel-jour-lempereur-du-japon-choisira-t.html), but eventually they switched at a "simple" Gregorian date, May 1st.
As dating events with ancient luni-solar calendars is a very difficult task, I would warmly suggest they use ISO8601 even for dates before 1582, which is easier to handle than the Julian calendar.
In the end, this imperial calendar should be specified by a Japanese authority (like the Académie des Sciences/Bureau des Longitudes in France). But meanwhile, I find the decision mentioned by @sffc as the most reasonable.

Let's not tokenize Shane's coworker by suggesting that one Japanese-identifying person speaks for the entire Japanese populace :) Also, while many nations use the ISO8601 or similar calendars for many purposes, other calendars still have important cultural, religious, and civic uses in those countries and part of the point of calendar support in Temporal must be to enable those calendars that do not just look like ISO.

@cjtenny
Collaborator

cjtenny commented Feb 10, 2021

Carrying over discussion from #1245 (comment) :

Should the polyfill support all of the CLDR eras? Currently, there is a discrepancy between the eras supported by Intl and the eras supported in the polyfill.

@cjtenny cjtenny added the ecma402 Behavior specific to implementations supporting ecma402 label Feb 10, 2021
@ptomato
Collaborator Author

ptomato commented Feb 18, 2021

ECMA-402 issue: tc39/ecma402#541

The remaining action for Temporal is to implement whatever ECMA-402 decides on this topic in the polyfill.

@ptomato ptomato added the non-prod-polyfill THIS POLYFILL IS NOT FOR PRODUCTION USE! label Feb 18, 2021
@ptomato ptomato removed this from the Stage 4 milestone Feb 18, 2021
@ptomato ptomato added this to the Stage "3.5" milestone Dec 8, 2022
@ptomato
Collaborator Author

ptomato commented Dec 8, 2022

The Intl Era and Month Codes proposal exists for this purpose. This issue can remain open for the necessary integration between that proposal and this one, since engines will need to know the form of the codes in order to ship.
