-
Notifications
You must be signed in to change notification settings - Fork 443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support of local numbering systems for number symbols #1036
Add support of local numbering systems for number symbols #1036
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kiitos Thank you for the PR! 💪
Some comments, requests, and discussion within. Re the defaults, if you happen to know speakers of the affected locales (see comment), please tag them in too :)
scripts/import_cldr.py
Outdated
if default_number_system_node is not None: | ||
numbering_systems['default'] = str(default_number_system_node.text) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be better to just have default_numbering_system
in the data
by itself, because otherwise numbering_systems
's type will be a silly dict[str, str | list[str]]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's good point and it made me notice there's a bug. I followed #470 and the this comment (#470 (comment)) without thinking too much. There's no list so the correct type is actually just dict[str, str]
. So we can keep it as it is but just removed the list. Or would you like to still have default_numbering_system separately?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be worth to have a separate default_numbering_system
since I foresee that to be a pretty hot path, and saving one dict lookup might compound :)
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #1036 +/- ##
==========================================
+ Coverage 91.34% 91.43% +0.08%
==========================================
Files 26 26
Lines 4415 4436 +21
==========================================
+ Hits 4033 4056 +23
+ Misses 382 380 -2 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick response and good comments!
How do you see the end goal how function like format_decimal
should work? Should it also convert the digits like this numbers.format_decimal(1.2, locale="ar", numbering_system='default) == '٢٫٣'
?
scripts/import_cldr.py
Outdated
if default_number_system_node is not None: | ||
numbering_systems['default'] = str(default_number_system_node.text) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's good point and it made me notice there's a bug. I followed #470 and the this comment (#470 (comment)) without thinking too much. There's no list so the correct type is actually just dict[str, str]
. So we can keep it as it is but just removed the list. Or would you like to still have default_numbering_system separately?
9216b12
to
c9f7ac8
Compare
I mean, if that's a sensible formatting, I think so, yeah. I do wonder if we should do one version where the high-level formatting functions still use
? I'm just kind of worried about putting people through the trouble of their formatting suddenly changing with a "routine" Babel upgrade. |
f0375f3
to
b0d8b60
Compare
Stats about numbering systems regarding symbols
Conclusion: There's no need to have traditional alias. Also I feel Next steps on this PR Could we preliminarily agree on sensible signature for functions so I can't start implementing it so there's less risk I make a lot of code we don't agree on? One option could be to add something like It would be nicer if it don't have to call What do you think? Do you have more suitable suggestions?
Other thoughts I think it would be quite easy to validate digit formatting against eg. the java/kotlin number formatter from the standard library to get more confidence (or to findout there's more nuances) that converting digits is as simple as 1-to-1 mapping for all number systems having 10 digits. Here's code that I used to get stats on the beginning of post. Basically calculating number of failures or successes depending on the case. Just for the reference.
|
Sorry for the delay, had a bunch of other things to attend to...
Sounds good, but let's call the argument
That is the case, yep. That's somewhat analogous to how you can pass in a locale identifier to almost anything, and we call The API surface looks good to me at first glance aside from that naming change :) |
@kajte I'm planning on trying to get a new Babel out maybe this week or early next week, and I think we could at least include the numbering-system loading code from this PR, even if there'd be no surface changes that'd allow to use them just yet, if you have the time to crop this PR to only that? IOW, pretty much just change the |
I wrote some code last week that I didn't have time to complete yet. I can check today or tomorrow whether it's close to done. If not, your suggestions sounds good. |
b0d8b60
to
0bfc336
Compare
Okay, I got something done now.
How does it look? If needed I can also change it what you suggested on Wednesday so that we only want to include that part to the next release. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking pretty good!
babel/core.py
Outdated
""" | ||
return self._data['number_symbols'] | ||
|
||
@property | ||
def other_numbering_systems(self) -> localedata.LocaleDataDict: | ||
"""Mapping of othern numbering systems. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Mapping of othern numbering systems. | |
"""Mapping of other numbering systems. |
What does "other" mean here? Could we elucidate that some?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cldr spec calls them Other numbering systems: https://www.unicode.org/reports/tr35/tr35-numbers.html#otherNumberingSystems But I agree it's not very clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it now to:
Mapping of other numbering systems available for the locale.
See: https://www.unicode.org/reports/tr35/tr35-numbers.html#otherNumberingSystems
babel/numbers.py
Outdated
def __init__(self, numbering_system: str, locale: Locale | str | None) -> None: | ||
super().__init__(f"Unknown numbering system {numbering_system} for Locale {locale}.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just format this error message at the call site since you're not putting the numbering_system
and locale
into fields in the exception anyway.
Alternately, put them in there :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I formatted it at the call
babel/numbers.py
Outdated
class UnsupportedNumberingSystemError(Exception): | ||
"""Exception thrown when an unsupported numbering system is requested for the given Locale.""" | ||
|
||
def __init__(self, numbering_system: str, locale: Locale | str | None) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does need to be (and currently is) constructed with an actual Locale.
def __init__(self, numbering_system: str, locale: Locale | str | None) -> None: | |
def __init__(self, numbering_system: str, locale: Locale) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whole __init__
function based on the comment above
babel/numbers.py
Outdated
:param numbering_system: The numbering system used for fetching the symbol. Defaults to "latn", special | ||
value "default" will use the default numbering system of the locale. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:param numbering_system: The numbering system used for fetching the symbol. Defaults to "latn", special | |
value "default" will use the default numbering system of the locale. | |
:param numbering_system: The numbering system used for fetching the symbol. Defaults to "latn". | |
The special value "default" will use the default numbering system of the locale. |
(Consider this repeated for all param numbering_system
s :) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
572a890
to
5113888
Compare
- Import number symbols for available numbering systems from cldr data - Add default_numbering_system and other_numbering_systems properties for Locale - Add numbering_system argument to relevant number formatting fuctions and use number symbols based on the given numbering system Fixes partially issue python-babel#446
5113888
to
0509ae4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Just applying some minor formatting fixes via review.)
assert numbers.format_percent(134.5, locale='ar_EG', numbering_system="default") == '13٬450%' | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert numbers.format_percent(134.5, locale='ar_EG', numbering_system="default") == '13٬450%' | |
assert numbers.format_percent(134.5, locale='ar_EG', numbering_system="default") == '13٬450%' |
9032583
to
abe6bda
Compare
@kajte Thanks! 🎉 |
How should I run linters locally to catch these? I installed the pre-commit hook and it passed for my commit. |
Thanks for your support! I have couple busy weeks ahead but I'll try continue with formatting actual numerals after that. |
I think you hadn't rebased on the
Thank you for the contribution! Same here TBH, and then it'll be kinkunpaistoaika and – – 😁 |
Changes:
Example:
Before:
Now:
Fixes partially issue:#446
I took some inspiration from the closed PR: #470
Couple questions:
latn
to the default numbering system for locales which default system isn'tlatn
? Eg. for before the changefa_IR
usedlatn
but now it starts usingarabext
.I would eventually like to fix other
# TODO: Support other number systems
TODOs from the code and take default numbering system to use in percentage, currency etc. formatting and add a function for formatting digits to the local numbering system but I decided to create a pull request at this point to check the direction I'm taking is ok for maintainers. I feel even features added in this PR are an enhancement as such.