Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

locale.format() input regression #54588

Closed
warsaw opened this issue Nov 9, 2010 · 19 comments
Closed

locale.format() input regression #54588

warsaw opened this issue Nov 9, 2010 · 19 comments
Labels
3.7 docs easy type-bug

Comments

@warsaw
Copy link
Member

@warsaw warsaw commented Nov 9, 2010

BPO 10379
Nosy @malemburg, @warsaw, @amauryfa, @ericvsmith, @bitdancer, @ambv, @berkerpeksag, @wm75
PRs
  • #259
  • #1145
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-04-20.04:39:00.354>
    created_at = <Date 2010-11-09.23:33:25.736>
    labels = ['easy', 'type-bug', '3.7', 'docs']
    title = 'locale.format() input regression'
    updated_at = <Date 2017-04-20.04:39:00.353>
    user = 'https://github.com/warsaw'

    bugs.python.org fields:

    activity = <Date 2017-04-20.04:39:00.353>
    actor = 'berker.peksag'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-04-20.04:39:00.354>
    closer = 'berker.peksag'
    components = ['Documentation']
    creation = <Date 2010-11-09.23:33:25.736>
    creator = 'barry'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 10379
    keywords = ['easy']
    message_count = 19.0
    messages = ['120905', '120907', '120908', '120909', '120912', '120921', '120922', '120978', '120989', '120994', '185792', '185828', '185838', '185840', '185841', '290734', '290735', '291686', '291945']
    nosy_count = 9.0
    nosy_names = ['lemburg', 'barry', 'amaury.forgeotdarc', 'eric.smith', 'r.david.murray', 'docs@python', 'lukasz.langa', 'berker.peksag', 'wolma']
    pr_nums = ['259', '1145']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue10379'
    versions = ['Python 3.7']

    @warsaw
    Copy link
    Member Author

    @warsaw warsaw commented Nov 9, 2010

    @mission[~:1001]% python2.7 -c "import locale; print locale.format('%.0f KB', 100)"
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/usr/lib/python2.7/locale.py", line 189, in format
        "format specifier, %s not valid") % repr(percent))
    ValueError: format() must be given exactly one %char format specifier, '%.0f KB' not valid
    @mission[~:1002]% python2.6 -c "import locale; print locale.format('%.0f KB', 100)"
    100 KB

    @amauryfa
    Copy link
    Member

    @amauryfa amauryfa commented Nov 9, 2010

    This was changed by bpo-2522 on purpose; no suffix is allowed in locale.format().

    @warsaw
    Copy link
    Member Author

    @warsaw warsaw commented Nov 9, 2010

    Okay, so line 187 of locale.py has this test:

        if not match or len(match.group())!= len(percent):

    the problematic part is the len test. When format string is '%.0f KB' match.group() is '%.0f' but of course percent is the full string.

    This seems like a bogus test, since clearly the given input is a valid format string. I'm not sure what the intent of this test is. The Python 2.6 test is:

        if percent[0] != '%':

    which is perhaps too naive.

    I guess I don't understand why this test is here. Wouldn't it make more sense to either just let any TypeError from _format() to percolate up, or to catch that TypeError and transform it into the ValueError? Why try to replicate the logic of str.__mod__()?

    @warsaw
    Copy link
    Member Author

    @warsaw warsaw commented Nov 10, 2010

    Hmm. So I guess the answer is to use locale.format_string() instead. But the documentation for locale.format() is not entirely clear about the prohibition on trailing text.

    @ericvsmith
    Copy link
    Member

    @ericvsmith ericvsmith commented Nov 10, 2010

    I agree the documentation isn't terribly clear on what a "%char specifier" or "whole format string" is.

    FWIW, this is also a 3.1 and greater issue.

    @ericvsmith ericvsmith added the docs label Nov 10, 2010
    @bitdancer
    Copy link
    Member

    @bitdancer bitdancer commented Nov 10, 2010

    Yeah, obviously that language can be improved. 'exactly' was meant to imply 'nothing but', but clearly it doesn't.

    If we want to restore more stringent backward compatibility and allow trailing text, it would be possible to make format an alias for format_string. I'm not sure this is a good idea, but it is the most sensible way to restore backward compatibility while still fixing the original bug that I can think of.

    Or...perhaps there is little need of both 'format' and 'format_string' as public APIs, and we could deprecate (without removing) one of them.

    On the other hand, I believe the original bug affects the Ubuntu code that triggered this report...in other words, absent this fix chances are there would eventually have been a bug report against that code that would have necessitated that it change to use format_string anyway in order to get the correct locale-specific number formatting.

    @ambv
    Copy link
    Contributor

    @ambv ambv commented Nov 10, 2010

    Please use the deprecation process when possible. That would mean creating an alias for the function you want to remove somewhat like this (taken from configparser):

            def readfp(self, fp, filename=None):
                """Deprecated, use read_file instead."""
                warnings.warn(
                    "This method will be removed in future versions.  "
                    "Use 'parser.read_file()' instead.",
                    PendingDeprecationWarning, stacklevel=2
                )
                self.read_file(fp, source=filename)

    @warsaw
    Copy link
    Member Author

    @warsaw warsaw commented Nov 11, 2010

    The bug has been fixed upstream by replacing .format() with .format_string(). I'm not sure I understand why there are two different methods - .format() seems kind of pointless to me, but then I don't use the locale module enough to say what's useful.

    For Python 2.7 I think the only thing we can do is to update the docs so that the distinction and restrictions are clear.

    @bitdancer
    Copy link
    Member

    @bitdancer bitdancer commented Nov 12, 2010

    Well, the distinction is that, before the bug fix that caused your issue, the 'format_string' method would use a regex to extract the % specifiers from the input string, and call 'format' to replace that % specifier with a properly localized result string. That is, 'format' was designed to handle a single % specifier with no extra text, basically as a helper method for format_string. The fact that it didn't reject extra text was, according to an internal comment, a defect of the implementation. (Passing any extra text would cause the implementation to fail to do the internationalization that was the entire reason for calling it.)

    When I fixed the bug I extracted the 'replace a single % specifier' code into an internal method, and made the format method live up to what I perceived to be its documented interface by rejecting extra input characters so that it could safely call the new internal substitution routine.

    Now, from the perspective of a *user* of the locale module, I fail to see the point in having both 'format' and 'format_string' exposed. If you want to format a single % specifier, just pass it to format_string. Thus my suggestion to make them both do the same thing (to cater to other code that may be calling format incorrectly) and then deprecate one of them (presumably format).

    To bad I didn't think of that when I fixed the original bug.

    @warsaw
    Copy link
    Member Author

    @warsaw warsaw commented Nov 12, 2010

    On Nov 12, 2010, at 12:15 AM, R. David Murray wrote:

    Now, from the perspective of a *user* of the locale module, I fail to see the
    point in having both 'format' and 'format_string' exposed. If you want to
    format a single % specifier, just pass it to format_string. Thus my
    suggestion to make them both do the same thing (to cater to other code that
    may be calling format incorrectly) and then deprecate one of them (presumably
    format).

    +1

    To bad I didn't think of that when I fixed the original bug.

    Dang.

    @BreamoreBoy
    Copy link
    Mannequin

    @BreamoreBoy BreamoreBoy mannequin commented Apr 2, 2013

    msg120978 "The bug has been fixed upstream...". Have I missed something as on Windows Vista...?

    c:\Users\Mark\MyPython>python
    Python 3.3.1rc1 (v3.3.1rc1:92c2cfb92405, Mar 25 2013, 22:39:19) [MSC v.1600 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import locale; print(locale.format('%.0f KB', 100))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "c:\python33\lib\locale.py", line 193, in format
        "format specifier, %s not valid") % repr(percent))
    ValueError: format() must be given exactly one %char format specifier, '%.0f KB' not valid

    @ericvsmith
    Copy link
    Member

    @ericvsmith ericvsmith commented Apr 2, 2013

    Barry meant that the upstream program that triggered this error has been changed to call format_string() instead of format(). The bug still exists in format().

    My suggestion is to have format() be an alias for format_string(). Deprecating format() is an optional step, but may not be worth the hassle.

    @warsaw
    Copy link
    Member Author

    @warsaw warsaw commented Apr 2, 2013

    On Apr 02, 2013, at 11:32 AM, Eric V. Smith wrote:

    My suggestion is to have format() be an alias for
    format_string(). Deprecating format() is an optional step, but may not be
    worth the hassle.

    Agreed on both counts.

    @ericvsmith
    Copy link
    Member

    @ericvsmith ericvsmith commented Apr 2, 2013

    So I guess the question is: would this be a bug fix and applied to 2.7 and 3.3, or just an enhancement for 3.4?

    I think it would be a bug fix and thus should be backported. It's not like we'd be breaking any working code, unless it was expecting the exception.

    @warsaw
    Copy link
    Member Author

    @warsaw warsaw commented Apr 2, 2013

    On Apr 02, 2013, at 03:04 PM, Eric V. Smith wrote:

    I think it would be a bug fix and thus should be backported. It's not like
    we'd be breaking any working code, unless it was expecting the exception.

    That would be my preference.

    @ericvsmith ericvsmith added the type-bug label Nov 24, 2016
    @bitdancer
    Copy link
    Member

    @bitdancer bitdancer commented Mar 28, 2017

    New changeset 1cf93a7 by R. David Murray (Garvit Khatri) in branch 'master':
    bpo-10379: add 'monetary' to format_string, deprecate format
    1cf93a7

    @bitdancer
    Copy link
    Member

    @bitdancer bitdancer commented Mar 28, 2017

    Oops. I merged the patch without coming back here first :(. Still getting used to the new workflow.

    It turns out that format has a parameter, monetary, that isn't supported by format_string. So what we did was add that parameter to format_string and deprecate format.

    If there is objection to this solution I will revert the merge.

    @berkerpeksag
    Copy link
    Member

    @berkerpeksag berkerpeksag commented Apr 15, 2017

    I think the solution in PR 259 is great, but I still find the documentation a little bit vague. I've just opened PR 1145 to add some examples specifiers.

    @berkerpeksag
    Copy link
    Member

    @berkerpeksag berkerpeksag commented Apr 20, 2017

    New changeset 6dbdedb by Berker Peksag in branch 'master':
    bpo-10379: Add %char examples to locale.format() docs (GH-1145)
    6dbdedb

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 docs easy type-bug
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants