Introduce fixed point locale aware format type for floating point numbers #79819

Open
steelman mannequin opened this issue Jan 2, 2019 · 18 comments
Labels
3.9 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@steelman
Mannequin

steelman mannequin commented Jan 2, 2019

BPO 35638
Nosy @rhettinger, @mdickinson, @vstinner, @ericvsmith, @skrah, @serhiy-storchaka, @steelman, @huftis
PRs
  • bpo-35638: Introduce fixed point locale aware format type #11405
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2019-01-02.11:25:01.876>
    labels = ['interpreter-core', 'type-feature', '3.9']
    title = 'Introduce fixed point locale aware format type for floating point numbers'
    updated_at = <Date 2019-09-16.08:15:57.874>
    user = 'https://github.com/steelman'

    bugs.python.org fields:

    activity = <Date 2019-09-16.08:15:57.874>
    actor = 'vstinner'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2019-01-02.11:25:01.876>
    creator = 'steelman'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 35638
    keywords = ['patch', 'patch', 'patch', 'patch']
    message_count = 18.0
    messages = ['332863', '332876', '332877', '332878', '332879', '332880', '332890', '332928', '332930', '332931', '332932', '333000', '333052', '333270', '333296', '333297', '333315', '352492']
    nosy_count = 8.0
    nosy_names = ['rhettinger', 'mark.dickinson', 'vstinner', 'eric.smith', 'skrah', 'serhiy.storchaka', 'steelman', 'huftis']
    pr_nums = ['11405']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue35638'
    versions = ['Python 3.9']

    @steelman
    Mannequin Author

    steelman mannequin commented Jan 2, 2019

    It is currently impossible to format floating point numbers with an arbitrary number of decimal digits AND a decimal point that matches the locale settings. For example, no current locale-aware format type can display numbers ranging from 1 to 1000 with exactly two decimal digits.
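A minimal sketch of the gap described above, assuming the default "C" numeric locale that Python starts with. The 'n' type follows 'g' semantics, so its precision counts significant digits rather than digits after the decimal point; the sample values are illustrative only.

```python
# 'f' gives a fixed number of decimals but ignores the locale;
# 'n' honours the locale but treats the precision like 'g' does.
values = [1.0, 12.75, 1000.0]

fixed = [format(v, ".2f") for v in values]    # fixed-point, not locale-aware
general = [format(v, ".2n") for v in values]  # locale-aware, 'g'-style precision

for v, f, n in zip(values, fixed, general):
    print(f"{v!r:>8}  .2f -> {f:>8}  .2n -> {n:>8}")
```

With a precision of 2, 'n' switches to exponent notation once the value needs more than two significant digits, which is why it cannot stand in for a locale-aware 'f'.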

    @steelman steelman mannequin added stdlib Python modules in the Lib dir 3.7 (EOL) end of life 3.8 only security fixes type-feature A feature request or enhancement labels Jan 2, 2019
    @ericvsmith
    Member

    Since this is a new feature, it can only be added to 3.8. Adjusting versions accordingly.

    I suggest that if we add this at all, it only be added to __format__, not to %-formatting.

    Any suggestions on a specification for this?

    @ericvsmith ericvsmith added interpreter-core (Objects, Python, Grammar, and Parser dirs) and removed stdlib Python modules in the Lib dir 3.7 (EOL) end of life labels Jan 2, 2019
    @steelman
    Mannequin Author

    steelman mannequin commented Jan 2, 2019

    I've got the patch. I will push it to GitHub as soon as I can (I am having some technical issues).

    @ericvsmith
    Member

    Before a patch is created, we should discuss the behavior that will be implemented and agree on it. What is your suggestion?

    @ericvsmith
    Member

    Of course, feel free to create a PR. But the correct place to discuss any new behavior is on the issue tracker, or maybe on python-ideas, not in a PR.

    @steelman
    Mannequin Author

    steelman mannequin commented Jan 2, 2019

    I have created a new format type "m" that is to "n" what "f" is to "g". The patch for string.rst says:

    +---------+----------------------------------------------------------+
    | 'm'     | Number. This is the same as 'f', except that it uses     |
    |         | the current locale setting to insert the appropriate     |
    |         | number separator characters.                             |
    +---------+----------------------------------------------------------+

    My patch only applies to floats, not integers.

    @ericvsmith
    Member

    I haven't looked at this closely yet, but you'll need to at least:

    • add tests that the locale-aware formatting is happening
    • support decimal
    • make sure it works with complex (which it probably does, but needs a test)

    And, I think we'll need to run this through python-ideas first. One thing I expect to come up there: why f and not g?

    Again, I haven't looked through the code yet, or really even given any thought to determining if this is a sound idea.

    @ericvsmith ericvsmith changed the title Introduce fixed point locale awear format type for floating point numbers Introduce fixed point locale aware format type for floating point numbers Jan 2, 2019
    @steelman
    Mannequin Author

    steelman mannequin commented Jan 3, 2019

    > I haven't looked at this closely yet, but you'll need to at least:
    >
    > • add tests that the locale-aware formatting is happening

    Done.

    > • support decimal
    > • make sure it works with complex

    Good points. Done. Please note that there is an inconsistency between float/complex/int/_pydecimal(!) and decimal. The former provide only the 'n' format type, while the latter provides 'n' and 'N'. So I implemented 'm' and 'M' for decimal and 'm' for _pydecimal.

    > (which it probably does, but needs a test)

    There are no tests for 'n'. Should I create tests for both 'm' and 'n'?

    > And, I think we'll need to run this through python-ideas first. One thing I expect to come up there: why f and not g?

    Because 'g' is already covered by 'n'.

    @skrah
    Mannequin

    skrah mannequin commented Jan 3, 2019

    I think there's another open GitHub issue for this, and yes, probably
    it should be discussed on python-ideas, too.

    My main concern with 'm' for libmpdec is that I'd like to reserve it
    for LC_MONETARY. There was one OS X issue that would have been solved
    by adding LC_MONETARY support.

    On the other hand perhaps '$' would also be possible for monetary.

    So it appears that there might be some bikeshedding about the names
    or whether the feature is needed at all.

    @steelman
    Mannequin Author

    steelman mannequin commented Jan 3, 2019

    As much as I am open to any suggestions for naming and such (although I think 'm' together with 'n' is a good complement to 'f' and 'g'), I really would like to introduce a way to format numbers with a fixed number of decimal digits (it looks good in tables) and with separators taken from the locale.

    @skrah
    Mannequin

    skrah mannequin commented Jan 3, 2019

    For reference, the (one of the?) other GitHub issue(s) is here:

    #8612

    It actually proposes to use LC_MONETARY.

    @serhiy-storchaka
    Member

    You can use locale.format_string() for locale aware formatting.
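A minimal sketch of the workaround suggested above. `locale.format_string()` is the real standard-library call; the wrapper name `fixed_locale` and its defaults are assumptions made for illustration.

```python
import locale


def fixed_locale(value, ndigits=2, grouping=False):
    """Format `value` with exactly `ndigits` decimals using LC_NUMERIC.

    Hypothetical helper: it only builds a '%.<n>f' pattern and hands it
    to locale.format_string(), which substitutes the locale's decimal
    point and, when grouping=True, its thousands separator.
    """
    return locale.format_string(f"%.{ndigits}f", value, grouping=grouping)


if __name__ == "__main__":
    # Uses whatever LC_NUMERIC is currently active (the "C" locale unless
    # the program has called locale.setlocale()).
    print(fixed_locale(1234.5))
```

Under a locale whose decimal point is a comma (de_DE, for example), the same call would be expected to produce a comma in place of the dot, provided that locale is installed and has been selected with `locale.setlocale()`.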

    @steelman
    Mannequin Author

    steelman mannequin commented Jan 5, 2019

    Indeed. Thank you. I was sure I had tried this. However, this is still only a workaround and not the solution I need. I am now working on a project that uses pint (https://pint.readthedocs.io/en/latest/), which relies on format() and its relatives.

    Even with the "n" format type present, Python still lacks a locale-aware counterpart of "f".
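One way to stay within format() without a new format type is to wrap the value in a float subclass. This is only a sketch: the class name `LocaleFloat` is hypothetical, it handles just the simple `[width][.precision]f` specs, and whether a library such as pint would preserve the subclass is not something the thread establishes.

```python
import locale
import re

# Only plain "[width][.precision]f" specs are routed through the locale.
_SIMPLE_FIXED = re.compile(r"\d*(\.\d+)?f")


class LocaleFloat(float):
    """Hypothetical float subclass whose 'f' formatting honours LC_NUMERIC."""

    def __format__(self, spec):
        if _SIMPLE_FIXED.fullmatch(spec):
            # The simple spec is also a valid %-style conversion.
            return locale.format_string("%" + spec, float(self))
        # Anything else falls back to the built-in float behaviour.
        return float.__format__(self, spec)


if __name__ == "__main__":
    print(format(LocaleFloat(3.14159), ".2f"))
    print(f"{LocaleFloat(2.5):8.1f}")
```

The trade-off is that every value has to be wrapped explicitly, which is part of why the thread treats this kind of approach as a workaround rather than a solution.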

    @vstinner
    Member

    vstinner commented Jan 9, 2019

    Łukasz Stelmach:

    > It is currently impossible to format floating point numbers with an arbitrary number of decimal digits AND the decimal point matching locale settings.

    I would like to warn you that handling locales properly can be very tricky. I just wrote an article about that:
    https://vstinner.github.io/locale-bugfixes-python3.html

    Stefan Krah:

    > My main concern with 'm' for libmpdec is that I'd like to reserve it
    > for LC_MONETARY.

    Since it seems like we are still at the "idea" stage, would it make sense to add a function which accepts options to choose how to format a number?

    • decimal point
    • thousands separator
    • grouping

    Because there are more and more format variants. See for example Python/formatter_unicode.c. It has 5 "locale types":

    • LT_NO_LOCALE
    • LT_DEFAULT_LOCALE
    • LT_UNDERSCORE_LOCALE
    • LT_UNDER_FOUR_LOCALE
    • LT_CURRENT_LOCALE

    and it uses this structure:

    /* Locale info needed for formatting integers and the part of floats
       before and including the decimal. Note that locales only support
       8-bit chars, not unicode. */
    typedef struct {
        PyObject *decimal_point;
        PyObject *thousands_sep;
        const char *grouping;
        char *grouping_buffer;
    } LocaleInfo;
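For comparison, the same three pieces of information held by the C structure above are visible from Python through `locale.localeconv()`. This is a small sketch; the helper name `numeric_locale_info` is an assumption, and the dictionary keys are the ones documented for `localeconv()`.

```python
import locale


def numeric_locale_info():
    """Return the LC_NUMERIC fields that mirror the LocaleInfo members."""
    conv = locale.localeconv()
    return {
        "decimal_point": conv["decimal_point"],
        "thousands_sep": conv["thousands_sep"],
        "grouping": conv["grouping"],  # list of group sizes
    }


if __name__ == "__main__":
    # Reports the currently active numeric locale.
    print(numeric_locale_info())
```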

    There is the locale but also "underscore" separator for thousands: see PEP-515.
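As a quick illustration of the locale-independent separators mentioned here, the existing format specification already allows an underscore (PEP 515) or a comma as the grouping option together with 'f'. The sample value is arbitrary.

```python
value = 1234567.891

underscored = format(value, "_.2f")  # PEP 515 underscore grouping
comma = format(value, ",.2f")        # comma grouping, not locale-aware

print(underscored)
print(comma)
```

Neither option consults the locale, so the decimal point stays a dot in both cases.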

    I'm not talking about adding something to format(), but rather about adding a method to float, maybe, or a function somewhere else.
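A rough sketch of what such an options-based function could look like in pure Python. The name `format_fixed` and all of its parameters are hypothetical and do not correspond to any existing or proposed CPython API; it supports only a single, uniform group size.

```python
def format_fixed(value, ndigits=2, decimal_point=".", thousands_sep="",
                 group_size=3):
    """Fixed-point formatting with caller-chosen separators (illustrative)."""
    text = format(abs(value), f".{ndigits}f")
    int_part, _, frac_part = text.partition(".")

    if thousands_sep and group_size > 0:
        # Split the integer digits into groups, starting from the right.
        groups = []
        while len(int_part) > group_size:
            groups.insert(0, int_part[-group_size:])
            int_part = int_part[:-group_size]
        groups.insert(0, int_part)
        int_part = thousands_sep.join(groups)

    result = ("-" if value < 0 else "") + int_part
    if frac_part:
        result += decimal_point + frac_part
    return result


if __name__ == "__main__":
    print(format_fixed(1234567.891, 2, decimal_point=",", thousands_sep="."))
```

Passing the separators explicitly avoids touching the process-wide locale, at the cost of the caller having to know which characters a given locale uses.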

    --

    By the way, the decimal module doesn't properly support the following corner case: LC_NUMERIC using an encoding different from the LC_CTYPE encoding. I wrote #5191 but I abandoned my change.

    @skrah
    Mannequin

    skrah mannequin commented Jan 9, 2019

    > Since it seems like we are still at the "idea" stage, would it make sense to add a function which accepts options to choose how to format a number?

    Maybe, but I think for format() Eric's latest proposal on python-ideas is great ("*f" for "f + LC_NUMERIC", "$f" for "f + LC_MONETARY").

    For me that's sufficient. Does locale.format_string() handle the other cases?

    > By the way, the decimal module doesn't properly support the following corner case: LC_NUMERIC using an encoding different from the LC_CTYPE encoding. I wrote #5191 but I abandoned my change.

    Well, I *discovered and opened* bpo-7442 several years ago, and you said:

    "I see that various people contributed to the issue, but it looks like the only user asking for the request is Stefan Krah. I prefer to close the issue and wait until more users ask for it before considering again the patch, or find a different way to implement the feature (support LC_NUMERIC and LC_CTYPE locales using a different encoding)."

    So why would you think that I'm not aware of that issue? It has low priority for me and I hesitate to depend on the official locale functions in decimal because I don't want to be involved in additional issue reports in that area.

    @steelman
    Mannequin Author

    steelman mannequin commented Jan 9, 2019

    I'd appreciate it if we continued the discussion on python-ideas, where I posted the idea[1]. There have already been several valuable comments.

    [1] https://mail.python.org/pipermail/python-ideas/2019-January/054793.html

    @vstinner
    Member

    vstinner commented Jan 9, 2019

    > By the way, the decimal module doesn't properly support the following corner case: LC_NUMERIC using an encoding different from the LC_CTYPE encoding. I wrote #5191 but I abandoned my change.

    FYI I opened bpo-35697 to discuss the decimal module case.

    @rhettinger
    Contributor

    I had thought that the use of locales was deemed an anti-pattern (not easy to use, not thread-safe, etc.).

    @rhettinger rhettinger added 3.9 only security fixes and removed 3.8 only security fixes labels Sep 15, 2019
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022