Why doesn't Intl.Locale include the Ethiopic numbering system for Ethiopic Locales #132

Open
Eazash opened this issue Aug 8, 2023 · 9 comments

@Eazash

Eazash commented Aug 8, 2023

When I use the ECMAScript Intl.Locale API to resolve the numbering systems for Ethiopic locales, it only returns the Latin numbering system. In my testing, most Ethiopic locales (e.g. Amharic, Afar, Ge'ez) do not include the Ethiopic numbering system. For example:

const am = new Intl.Locale("am-ET")
console.log(am.numberingSystems)
// ['latn']

The above code should have output ['latn', 'ethi']. For reference, the Unicode Common Locale Data Repository (CLDR) lists both the Latin and Ethiopic numbering systems for Amharic (the full JSON permalink can be found here).

{
  "main": {
    "am": {
      "identity": {
        "version": {
          "_cldrVersion": "43"
        },
        "language": "am"
      },
      "numbers": {
        "defaultNumberingSystem": "latn",
        "otherNumberingSystems": {
          "native": "latn",
          "traditional": "ethi"
        }
      }
    }
  }
}
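
To make the expected result concrete, here is a minimal sketch of how the list ['latn', 'ethi'] falls out of the CLDR excerpt above. The helper name `listNumberingSystems` is hypothetical and is not part of Intl or of any CLDR tooling; it only assumes the "numbers" object has the shape shown in the excerpt.

```javascript
// Hypothetical helper: collect the distinct numbering systems named in a
// CLDR "numbers" block, default first. Not an Intl or CLDR API.
function listNumberingSystems(numbers) {
  const systems = [numbers.defaultNumberingSystem];
  for (const name of Object.values(numbers.otherNumberingSystems ?? {})) {
    if (!systems.includes(name)) systems.push(name);
  }
  return systems;
}

// The "numbers" block copied from the am.json excerpt above.
const amNumbers = {
  defaultNumberingSystem: "latn",
  otherNumberingSystems: { native: "latn", traditional: "ethi" },
};

console.log(listNumberingSystems(amNumbers)); // ['latn', 'ethi']
```

Whether an engine's Intl.Locale is expected to report every system listed in CLDR is a separate question, which the replies below address.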

As a follow-up question, why is the default numbering system for Amharic Latin? Shouldn't that be Ethiopic?

@Eazash Eazash added the question label Aug 8, 2023
@andjc

andjc commented Aug 8, 2023

Taking the second question first: that is best asked on a CLDR-related list. That said, I assume a decimal-based system allows easier computation, and a significant number of users use Arabic digits by default.

For the first question, I assume that ICU is being used under the hood, and going by the API docs, currently, ICU only supports numbering systems whose radix is 10.

Also the relevant language tag would be am-ET-u-nu-ethi. By default locales tend to use Arabic digits, unless an alternative is specified when the locale instance is created.

@dyacob
Member

dyacob commented Aug 9, 2023

Hi Ezira, it's great to read that someone is actually trying to create ordered lists with the Ethiopic numbers 😊. I don't know the ECMA APIs, but in CSS you can create an Ethiopic numeral list with:

<ul style="list-style-type: ethiopic-numeric;">
  <li> ... </li>
   ...
</ul>

In answer to your follow-up question: as you've seen, Western numerals have for some time been the most widely used in Ethiopia for general math, commerce, phone numbers, page numbers, and lists, so they became the default for Amharic locales. A Ge'ez language locale would be the only one that should use the Ethiopic (Ge'ez) numerals by default. If a Ge'ez language ("gez") locale is not using the Ethiopic numerals, I would consider it broken. It may be that a developer team did not have time to implement a numeral conversion for the locale (ICU has an API for conversion, but the locale APIs might not be using it).

@andjc
Copy link

andjc commented Aug 10, 2023

@Eazash, ICU4C and ICU4J support the Ethiopic numeral system (and other algorithmic number systems) using the RuleBasedNumberFormat class, via built-in RBNF rule sets. RBNF is not supported in ICU4X. ECMAScript doesn't support RBNF or the RuleBasedNumberFormat class.

There is an open issue (from 2016) for supporting algorithmic number systems in ECMA-402, but that would require support for RBNF. A number of concerns have been raised about the impact of including RBNF support. It probably needs a champion from within the ECMA-402 developers or editors.

It is possible to handle the Ethiopic number system on the server side if you are using Node.js or a Python web framework and integrate PyICU. Unfortunately, PHP Intl does not seem to support RuleBasedNumberFormat, although I guess it could be patched to do so.

Currently, client side solutions would require writing your own functions for formatting and parsing Ethiopic numbers independently of Intl.
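
As a rough illustration of what such a hand-written formatter might look like, here is a minimal sketch. It follows the commonly described grouping approach: split the decimal digits into pairs, write each pair with the Ethiopic tens and ones signs, and mark alternating pair positions with ፻ (100) and ፼ (10,000). The function name `toEthiopicNumeral` is hypothetical, the code is independent of Intl, and its output has not been compared against ICU's RBNF rules for every input, so treat it as a starting point.

```javascript
// Hypothetical helper, not part of Intl or ICU.
const ETHIOPIC_ONES = ["", "፩", "፪", "፫", "፬", "፭", "፮", "፯", "፰", "፱"];
const ETHIOPIC_TENS = ["", "፲", "፳", "፴", "፵", "፶", "፷", "፸", "፹", "፺"];
const ETHIOPIC_HUNDRED = "፻";
const ETHIOPIC_TEN_THOUSAND = "፼";

function toEthiopicNumeral(n) {
  if (!Number.isSafeInteger(n) || n < 1) {
    throw new RangeError("expected a positive safe integer");
  }
  // Pad to an even number of digits so they split cleanly into pairs.
  let digits = String(n);
  if (digits.length % 2 === 1) digits = "0" + digits;
  const groupCount = digits.length / 2;

  let out = "";
  for (let i = 0; i < groupCount; i++) {
    const ten = Number(digits[2 * i]);
    const one = Number(digits[2 * i + 1]);
    const pos = groupCount - 1 - i; // 0 is the least significant pair

    // Odd positions carry the hundred sign, even non-zero positions the
    // ten-thousand sign, and the last pair carries no sign.
    let sep = "";
    if (pos > 0) sep = pos % 2 === 1 ? ETHIOPIC_HUNDRED : ETHIOPIC_TEN_THOUSAND;

    let oneSign = ETHIOPIC_ONES[one];
    // A lone "1" is implied before ፻, and before ፼ when it leads the number.
    const isLoneOne = ten === 0 && one === 1;
    if (isLoneOne && (sep === ETHIOPIC_HUNDRED || (i === 0 && sep !== ""))) {
      oneSign = "";
    }
    // An empty pair does not emit a hundred sign.
    if (ten === 0 && one === 0 && sep === ETHIOPIC_HUNDRED) sep = "";

    out += ETHIOPIC_TENS[ten] + oneSign + sep;
  }
  return out;
}

console.log(toEthiopicNumeral(1234)); // ፲፪፻፴፬
console.log(toEthiopicNumeral(2023)); // ፳፻፳፫
```

Zero and negative numbers are rejected because the Ethiopic numerals have no sign for zero.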

@Eazash
Author

Eazash commented Aug 11, 2023

Thanks @andjc. I'm just learning about what ECMA-402 and ICU are. My original intent is to localize numbers to the Ethiopic System.

> Also the relevant language tag would be am-ET-u-nu-ethi. By default locales tend to use Arabic digits, unless an alternative is specified when the locale instance is created.

Even when I specify the numbering system as above, the formatter still defaults to the Latin numbering system.

const locale = new Intl.Locale("am-ET-u-nu-ethi")
const formatter = new Intl.NumberFormat(locale)
console.log(locale.numberingSystems)
// ['ethi']
console.log(formatter.resolvedOptions().numberingSystem)
// 'latn'
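
The same comparison can be wrapped into a small feature check: request a numbering system through the `-u-nu-` extension and see whether `resolvedOptions()` reports it back. This is only a sketch; `honorsNumberingSystem` is a hypothetical helper name, not an Intl API, and the result depends on the engine and the locale data it ships.

```javascript
// Hypothetical helper, not part of Intl: reports whether this runtime's
// Intl.NumberFormat honors a requested numbering system or falls back.
function honorsNumberingSystem(nu, baseLocale = "en") {
  const requested = new Intl.NumberFormat(`${baseLocale}-u-nu-${nu}`);
  return requested.resolvedOptions().numberingSystem === nu;
}

// Digit-based systems are resolved as requested.
console.log(honorsNumberingSystem("latn"));

// Per the output above, engines in this thread fall back to 'latn' here.
console.log(honorsNumberingSystem("ethi", "am-ET"));
```

A check like this lets calling code decide at runtime whether to use Intl.NumberFormat directly or switch to a hand-written formatter.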

> There is an open issue (from 2016) for supporting algorithmic number systems in ECMA-402, but that would require support for RBNF

The issue seems like it's going to take a while to be fully resolved as a rewrite is being discussed. For my current use-case however, manually implementing the algorithm seems to be the optimal choice.
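
A manual implementation usually needs the reverse direction as well, turning an Ethiopic numeral back into an integer. The sketch below is one way that could look. The name `parseEthiopicNumeral` is hypothetical, and the parser is deliberately lenient: it assumes a well-formed numeral and does not reject odd orderings of the signs.

```javascript
// Hypothetical helper, not part of Intl or ICU.
const ONE_SIGNS = "፩፪፫፬፭፮፯፰፱"; // values 1 to 9
const TEN_SIGNS = "፲፳፴፵፶፷፸፹፺"; // values 10 to 90
const SIGN_VALUES = new Map();
[...ONE_SIGNS].forEach((ch, i) => SIGN_VALUES.set(ch, i + 1));
[...TEN_SIGNS].forEach((ch, i) => SIGN_VALUES.set(ch, (i + 1) * 10));

function parseEthiopicNumeral(text) {
  if (text.length === 0) throw new SyntaxError("empty numeral");
  let total = 0; // everything already scaled by a ፼ sign
  let sub = 0;   // hundreds collected since the last ፼
  let cur = 0;   // tens and ones collected since the last sign
  for (const ch of text) {
    if (SIGN_VALUES.has(ch)) {
      cur += SIGN_VALUES.get(ch);
    } else if (ch === "፻") {
      // A bare ፻ stands for one hundred.
      sub += (cur || 1) * 100;
      cur = 0;
    } else if (ch === "፼") {
      // ፼ scales everything read so far; a bare leading ፼ is 10,000.
      total = ((total + sub + cur) || 1) * 10000;
      sub = 0;
      cur = 0;
    } else {
      throw new SyntaxError(`unexpected character: ${ch}`);
    }
  }
  return total + sub + cur;
}

console.log(parseEthiopicNumeral("፲፪፻፴፬")); // 1234
```

Results above Number.MAX_SAFE_INTEGER would lose precision; switching the accumulators to BigInt is one possible way to lift that limit.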

@Eazash
Author

Eazash commented Aug 11, 2023

@dyacob CLDR lists the default numbering system for both gez and gez-ER as latn. Oddly enough, it doesn't list ethi as an option for those locales like it does for am. I'd appreciate some help or insight into researching this and, hopefully, reporting it to CLDR.
The list-style-type method can be a nice hack to get what I want though, thanks for that.

@andjc

andjc commented Aug 12, 2023

@Eazash it's possible that some of the other locales should be updated to include ethi as a traditional number system. That said, some algorithmic number systems (including Ethiopic) are defined in the root locale's RBNF file, and are thus accessible to other locales through inheritance.

In ICU4C and ICU4J, decimal numbers use the NumberFormatter class to handle formatting and parsing, but algorithmic number systems, ordinal numbers and spelled-out numbers (numbers rendered as words) are all handled by the RuleBasedNumberFormat class instead.

Currently, ECMA-402 does not support any of the operations that the RuleBasedNumberFormat class is used for. And although ICU is a common solution used by Firefox and Chromium-based browsers, not all ECMAScript engines use ICU, so there are concerns about the overhead this would create for non-ICU-based implementations.

ICU4X does not seem to have implemented RBNF yet either. The reality is that client-side locale solutions are much more limited than server-side solutions.

@andjc

andjc commented Aug 12, 2023

@Eazash, CLDR locales each have different levels of completeness and support. Both the am and ti locales include Ethiopic as a traditional number system; gez does not.

am has spellout rules, but ti and gez don't. Locale coverage isn't uniform.

@srl295

srl295 commented Oct 5, 2023

> @dyacob CLDR lists the default numbering system for both gez and gez-ER to be latn. Oddly enough, it doesn't list ethi as an option for the locales like it does with am. I'd appreciate some help/insight into researching this and hopefully reporting it to CLDR.
> The list-style-type method can be a nice hack to get what I want though, thanks for that.

You can report it to https://unicode-org.atlassian.net  — better yet, if you'd like to contribute to the gez locale you can sign up at https://cldr.unicode.org/index/survey-tool/survey-tool-accounts

@srl295

srl295 commented Oct 5, 2023

> @Eazash, CLDR locales , each have different levels of completeness and support. Both am and ti locales include ethiopic as a traditional number system. gez does not.
>
> am has spellout rules, but ti and gez don't. Locale coverage isn't uniform.

It's not uniform, but it is documented. There's a chart here https://www.unicode.org/cldr/charts/dev/supplemental/locale_coverage.html that shows that am is at modern level, ti at basic, and gez is not even at basic level.

Contributions are welcome; feel free to contact me directly by any method if interested.
