bpo-33731: Implement support for locale specific format (WIP) #8612

james-emerton · 2018-08-02T03:34:54Z

Adds support for 'l' and 'L' which will format a string as per 'f'
except that they will use locale specific grouping, separators, and
decimal point. In the case of 'L' the LC_MONETARY values will be used.
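For illustration only (this is not the patch itself), the intended behaviour can be sketched in pure Python: format as 'f', then apply the grouping, separator, and decimal point taken from a `locale.localeconv()`-style dictionary. The helper names `group_digits` and `format_fixed` and the `SAMPLE_CONV` data are hypothetical; the dictionary is hand-written so the output does not depend on installed locales and does not correspond to any particular real locale.

```python
import locale
from decimal import Decimal


def group_digits(digits, grouping, sep):
    """Insert `sep` into a run of integer digits, following localeconv() rules."""
    if not sep or not grouping:
        return digits
    parts, rest, last = [], digits, None
    for size in grouping:
        if size == locale.CHAR_MAX:      # CHAR_MAX: stop grouping here
            break
        if size == 0:                    # 0: repeat the previous size indefinitely
            while last and len(rest) > last:
                parts.append(rest[-last:])
                rest = rest[:-last]
            break
        if len(rest) <= size:            # not enough digits left for another group
            break
        parts.append(rest[-size:])
        rest = rest[:-size]
        last = size
    parts.append(rest)
    return sep.join(reversed(parts))


def format_fixed(value, precision, conv, monetary=False):
    """Format `value` like 'f', then apply separators from a localeconv()-style dict."""
    prefix = "mon_" if monetary else ""  # 'L' in the proposal: use the LC_MONETARY keys
    text = format(Decimal(value), f".{precision}f")
    sign = "-" if text.startswith("-") else ""
    integer, _, fraction = text.lstrip("-").partition(".")
    grouped = group_digits(
        integer, conv[prefix + "grouping"], conv[prefix + "thousands_sep"]
    )
    if fraction:
        return sign + grouped + conv[prefix + "decimal_point"] + fraction
    return sign + grouped


# Hypothetical locale data, written by hand for this example.
SAMPLE_CONV = {
    "decimal_point": ",", "thousands_sep": ".", "grouping": [3, 3, 0],
    "mon_decimal_point": ",", "mon_thousands_sep": " ", "mon_grouping": [3, 3, 0],
}

print(format_fixed("1234567.891", 2, SAMPLE_CONV))                 # 1.234.567,89
print(format_fixed("1234567.891", 2, SAMPLE_CONV, monetary=True))  # 1 234 567,89
```

The `monetary` flag only switches which `localeconv()` keys are consulted, which is the distinction the description draws between 'l' and 'L'.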

https://bugs.python.org/issue33731

the-knights-who-say-ni · 2018-08-02T03:34:57Z

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

When your account is ready, please add a comment in this pull request
and a Python core developer will remove the CLA not signed label
to make the bot check again.

You can check yourself
to see if the CLA has been received.

Thanks again for your contribution, we look forward to reviewing it!

Compare the output of the 'l' and 'L' formats against the output of `locale.format_string()`. The locale data seems to differ between platforms, so testing against literals is challenging. (Other format tests are explicitly providing locale data, but that approach would fail to properly test my modifications.) Also added a test against literals for the en_US locale in the hopes that it's consistent across platforms.
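A minimal sketch of that testing strategy, assuming only the standard library: rather than asserting against a literal, the result is compared with what the `locale` module produces under the same locale. Since the 'l' and 'L' types do not exist in released Python, the existing 'n' type on an `int` stands in for them here. The helper name `matches_locale_module` is hypothetical, and a locale that is not installed is reported as `None` rather than as a failure.

```python
import locale


def matches_locale_module(value, locale_name):
    """Return True/False for the comparison, or None if the locale is unavailable."""
    saved = locale.setlocale(locale.LC_ALL)  # query the current setting
    try:
        try:
            locale.setlocale(locale.LC_ALL, locale_name)
        except locale.Error:
            return None  # locale not installed on this platform
        expected = locale.format_string("%d", value, grouping=True)
        return format(value, "n") == expected
    finally:
        locale.setlocale(locale.LC_ALL, saved)  # always restore the old locale


for name in ("C", "en_US.UTF-8"):
    print(name, matches_locale_module(1234567, name))
```

This keeps the test independent of platform-specific locale data, at the cost of only checking that two code paths agree rather than that either is correct.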

james-emerton · 2018-08-04T21:41:56Z

I think I need some guidance on testing this correctly. I see that the existing format tests for the Decimal type are passing locale data explicitly via an undocumented parameter. Unfortunately, using this approach doesn't really test my changes, which were actually to libmpdec.

I've run the tests locally on both macOS and Windows, where they pass, but the Windows CI build is failing. I haven't yet tested a Linux build locally; I'll need to spin up another VM for that.

Finally, since this requires changes to libmpdec, is it acceptable to commit those changes here or do they first need to be incorporated into the upstream repository?

skrah · 2018-08-04T21:56:03Z

_decimal has the undocumented parameter, see test_n_format.

If I approve the changes, the patch can include libmpdec. It seems that there's still some discussion in the issue though. An immediate observation is that I'd prefer 'm' for monetary -- There is at least one open issue for the 'm' parameter, too.

james-emerton · 2018-08-04T22:07:36Z

I can certainly use the aforementioned parameter, but I feel that doing so doesn't really exercise the changes. This changes mpd_parse_format_str to treat the format as 'f' with the addition of locale data. Using the parameter to provide the locale data would test that we switch the format to 'f' but not that the correct locale data is being loaded. Maybe that's okay/the best we can do in this case?

The use of 'l' and 'L' was the suggestion of Eric Smith in his comment on bpo-34311. The intention being 'l' for locale and 'L' for monetary locale, as this format would not be exclusively for the monetary context. At any rate, I'm certainly open to changing the letters being used.

ericvsmith · 2018-08-04T22:26:28Z

I don't feel strongly about l and L. m certainly works for me, too. Has any other language broken ground on this? Can we follow their example?

james-emerton · 2018-08-04T23:25:59Z

I did a bit more looking around, and the Single UNIX Specification for printf provides a modifier that performs grouping. From http://man7.org/linux/man-pages/man3/printf.3.html

  '      For decimal conversion (i, d, u, f, F, g, G) the output is to
         be grouped with thousands' grouping characters if the locale
         information indicates any.  (See setlocale(3).)  Note that
         many versions of gcc(1) cannot parse this option and will
         issue a warning.  (SUSv2 did not include %'F, but SUSv3 added
         it.)

Thus, n should be equivalent to 'g and we should also support the modifier for f, o, x, d, and their uppercase equivalents. I think this is a better approach than introducing another type.

It appears that the C99 implementation provides no mechanism to use the values from LC_MONETARY in place of LC_NUMERIC. This distinction seems exceedingly rare in practice, and I'd personally be okay with leaving it out.

skrah · 2018-08-04T23:31:44Z

Currently we're using uppercase to mean "print an uppercase exponent". So:

'n' => regular_locale + 'g'
'N' => regular_locale + 'G'

'l' => regular_locale + 'f'
'L' => monetary_locale + 'f'?

Here I'd expect 'F', even though 'F' doesn't really do anything:

'L' => regular_locale + 'F'

libmpdec actually makes use of the regular convention:

if (isupper((uchar)type)) {
    type = tolower((uchar)type);
    flags |= MPD_FMT_UPPER;
}

Perhaps we can use a modifier, like $n or $l, to mean "use the monetary locale"?
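To make the clash concrete, here is a rough Python rendering of the libmpdec snippet above. It is an illustration, not code from libmpdec, and the function name `split_presentation_type` is hypothetical. Under this convention an uppercase type letter is read as the lowercase type plus an "uppercase output" flag, so 'L' would be read as 'l' with uppercase output rather than as a monetary variant.

```python
def split_presentation_type(type_char):
    """Lowercase the type and report whether the 'uppercase' flag would be set."""
    if type_char.isupper():
        return type_char.lower(), True
    return type_char, False


print(split_presentation_type("G"))  # ('g', True)
print(split_presentation_type("L"))  # ('l', True)
print(split_presentation_type("f"))  # ('f', False)
```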

james-emerton · 2018-08-05T01:04:16Z

I've now implemented this (for just Decimal so far) by accepting ' in place of the , or _ characters. (I noticed while I was in here that Decimal doesn't currently support _.)
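As a rough sketch of the syntax being described, assuming the apostrophe simply becomes a third alternative in the grouping slot of the format specification: the regular expression below is a simplified, hypothetical parser written for this example. It is neither CPython's nor libmpdec's actual parsing code, and `SPEC_RE` and `parse_spec` are made-up names.

```python
import re

# [[fill]align][sign][#][0][width][grouping][.precision][type]
SPEC_RE = re.compile(
    r"""
    (?:(?P<fill>.)?(?P<align>[<>=^]))?
    (?P<sign>[-+ ])?
    (?P<alt>\#)?
    (?P<zero>0)?
    (?P<width>\d+)?
    (?P<grouping>[,_'])?          # "'" accepted alongside "," and "_"
    (?:\.(?P<precision>\d+))?
    (?P<type>[a-zA-Z%])?
    \Z
    """,
    re.VERBOSE | re.DOTALL,
)


def parse_spec(spec):
    """Split a format spec into its named fields, or raise ValueError."""
    match = SPEC_RE.match(spec)
    if match is None:
        raise ValueError(f"invalid format specifier: {spec!r}")
    return match.groupdict()


fields = parse_spec("10'.2f")
print(fields["width"], fields["grouping"], fields["precision"], fields["type"])
```

A formatter built on this would treat `grouping == "'"` as "take the separator, grouping, and decimal point from the current locale", while `,` and `_` keep their fixed meanings from PEP 378 and PEP 515.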

I also see that those other two modifiers are documented in PEP 378 and PEP 515. Should I be writing this up as a PEP as well?

steelman · 2019-01-03T16:04:04Z

Interesting. Without knowing about this PR, I whipped up something similar (#11405). Clearly there is a demand for this feature.

skrah · 2019-01-04T11:06:03Z

@james-emerton Perhaps a mini-PEP that briefly lists the motivation and syntax alternatives (I still like $n) is a good way forward. Discussion is currently scattered among two GitHub issues and two bugs.python.org issues.

This is just my opinion, @ericvsmith is the format-language expert.

skrah · 2019-01-04T11:06:39Z

Also @steelman of course.

steelman · 2019-01-04T14:24:43Z

I've sent an e-mail to python-ideas, however, I can't see it in the archive yet.

james-emerton · 2019-01-27T22:24:49Z

Last I was working on this (months ago!) I started but never finished drafting a PEP. I've finished that off and added a bit about the alternative suggestions I've seen.

python/peps#886

skrah · 2019-01-30T10:37:40Z

@james-emerton Thanks, in the meantime Eric Smith has posted his preferred version here:

https://mail.python.org/pipermail/python-ideas/2019-January/054837.html

I now support that version. PEPs should probably be announced and discussed on python-ideas.

skrah · 2019-01-30T14:33:26Z

I suggest continuing the discussion at https://bugs.python.org/issue35638 and on python-ideas, so I'm closing this one now (we can reopen it later as appropriate).

Implement support for locale specific decimal format

cbb7e9e

Adds support for 'l' and 'L' which will format a string as per 'f' except that they will use locale specific grouping, separators, and decimal point. In the case of 'L' the LC_MONETARY values will be used.

james-emerton requested review from rhettinger and skrah as code owners August 2, 2018 03:34

the-knights-who-say-ni added the CLA not signed label Aug 2, 2018

bedevere-bot added the awaiting review label Aug 2, 2018

the-knights-who-say-ni added CLA signed and removed CLA not signed labels Aug 4, 2018

james-emerton added 2 commits August 4, 2018 16:51

Reverting Decimal format changes

79b9b83

Upon further discussion and research it appears that we should be supporting grouping via the `'` modifier as per C99

bpo-33731: Add locale modifier to Decimal.__format__

6cb6064

This implements the `'` modifier in place of the thousands separator to enable locale specific grouping and decimal point.

james-emerton mentioned this pull request Jan 27, 2019

PEP 9999: Locale-aware fixed-point numeric formatting python/peps#886

Closed

skrah closed this Jan 30, 2019

JakubSzewczyk mannequin mentioned this pull request Apr 28, 2022

string formatting that produces floats with preset precision while respecting locale #77912

Closed
