locale.format() input regression #54588
Comments
@mission[~:1001]% python2.7 -c "import locale; print locale.format('%.0f KB', 100)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.7/locale.py", line 189, in format
    "format specifier, %s not valid") % repr(percent))
ValueError: format() must be given exactly one %char format specifier, '%.0f KB' not valid
@mission[~:1002]% python2.6 -c "import locale; print locale.format('%.0f KB', 100)"
100 KB |
This was changed by bpo-2522 on purpose; no suffix is allowed in locale.format(). |
Okay, so line 187 of locale.py has this test: if not match or len(match.group()) != len(percent): The problematic part is the len test. When the format string is '%.0f KB', match.group() is '%.0f', but of course percent is the full string. This seems like a bogus test, since the given input is clearly a valid format string. I'm not sure what the intent of this test is. The Python 2.6 test is: if percent[0] != '%': which is perhaps too naive. I guess I don't understand why this test is here. Wouldn't it make more sense either to let any TypeError from _format() percolate up, or to catch that TypeError and transform it into the ValueError? Why try to replicate the logic of str.__mod__()? |
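For readers following along, here is a minimal sketch of the length test described in the comment above. The pattern is an assumption modelled on the `_percent_re` in locale.py and may not match any given release exactly; `is_single_specifier` is a hypothetical name used only for illustration.

```python
import re

# Assumed to approximate locale.py's _percent_re; illustrative only.
PERCENT_RE = re.compile(
    r'%(?:\((?P<key>.*?)\))?(?P<modifiers>[-#0-9 +*.hlL]*?)[eEfFgGdiouxXcrs%]'
)

def is_single_specifier(percent):
    """Return True only if the whole string is exactly one % specifier."""
    match = PERCENT_RE.match(percent)
    # The match covers just the specifier, so trailing text makes the
    # matched length shorter than the full string.
    return bool(match) and len(match.group()) == len(percent)

match = PERCENT_RE.match('%.0f KB')
print(match.group())                   # the specifier prefix that matched
print(is_single_specifier('%.0f KB'))  # trailing ' KB' fails the length test
print(is_single_specifier('%.0f'))     # a bare specifier passes
```

The sketch shows why the input is rejected: the specifier itself is valid, but the matched prefix is shorter than the string that was passed in.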
Hmm. So I guess the answer is to use locale.format_string() instead. But the documentation for locale.format() is not entirely clear about the prohibition on trailing text. |
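A small example of the workaround mentioned above, written for Python 3. Pinning the numeric locale to "C" is an assumption made here so that the output does not depend on the machine it runs on.

```python
import locale

# Pin the numeric locale to "C" so the result is machine-independent.
locale.setlocale(locale.LC_NUMERIC, 'C')

# format_string() accepts text around the % specifier, unlike format().
result = locale.format_string('%.0f KB', 100)
print(result)
```

Under a locale with grouping or a different decimal point, only the number is localized; the surrounding text is passed through unchanged.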
I agree the documentation isn't terribly clear on what a "%char specifier" or "whole format string" is. FWIW, this is also a 3.1 and greater issue. |
Yeah, obviously that language can be improved. 'exactly' was meant to imply 'nothing but', but clearly it doesn't. If we want to restore more stringent backward compatibility and allow trailing text, it would be possible to make format an alias for format_string. I'm not sure this is a good idea, but it is the most sensible way to restore backward compatibility while still fixing the original bug that I can think of. Or...perhaps there is little need of both 'format' and 'format_string' as public APIs, and we could deprecate (without removing) one of them. On the other hand, I believe the original bug affects the Ubuntu code that triggered this report...in other words, absent this fix chances are there would eventually have been a bug report against that code that would have necessitated that it change to use format_string anyway in order to get the correct locale-specific number formatting. |
Please use the deprecation process when possible. That would mean creating an alias for the function you want to remove somewhat like this (taken from configparser):
    def readfp(self, fp, filename=None):
        """Deprecated, use read_file instead."""
        warnings.warn(
            "This method will be removed in future versions.  "
            "Use 'parser.read_file()' instead.",
            PendingDeprecationWarning, stacklevel=2
        )
        self.read_file(fp, source=filename) |
The bug has been fixed upstream by replacing .format() with .format_string(). I'm not sure I understand why there are two different methods - .format() seems kind of pointless to me, but then I don't use the locale module enough to say what's useful. For Python 2.7 I think the only thing we can do is to update the docs so that the distinction and restrictions are clear. |
Well, the distinction is that, before the bug fix that caused your issue, the 'format_string' method would use a regex to extract the % specifiers from the input string, and call 'format' to replace each % specifier with a properly localized result string. That is, 'format' was designed to handle a single % specifier with no extra text, basically as a helper method for format_string. The fact that it didn't reject extra text was, according to an internal comment, a defect of the implementation. (Passing any extra text would cause the implementation to fail to do the internationalization that was the entire reason for calling it.) When I fixed the bug I extracted the 'replace a single % specifier' code into an internal method, and made the format method live up to what I perceived to be its documented interface by rejecting extra input characters so that it could safely call the new internal substitution routine. Now, from the perspective of a *user* of the locale module, I fail to see the point in having both 'format' and 'format_string' exposed. If you want to format a single % specifier, just pass it to format_string. Thus my suggestion to make them both do the same thing (to cater to other code that may be calling format incorrectly) and then deprecate one of them (presumably format). Too bad I didn't think of that when I fixed the original bug. |
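The division of labour described above can be sketched roughly as follows. This is not the code in locale.py: `SPEC_RE` is a deliberately simplified pattern, and `format_one` and `format_many` are hypothetical names standing in for the single-specifier helper and the whole-string formatter.

```python
import locale
import re

# Simplified pattern for illustration; not the one used by locale.py.
SPEC_RE = re.compile(r'%[-#0-9 +.]*[eEfFgGdiouxXcrs%]')

def format_one(spec, value, grouping=False):
    """Hypothetical helper: localize exactly one % specifier."""
    match = SPEC_RE.match(spec)
    if not match or match.group() != spec:
        raise ValueError("expected exactly one specifier, got %r" % spec)
    return locale.format_string(spec, value, grouping)

def format_many(template, values, grouping=False):
    """Hypothetical helper: localize every specifier in a template."""
    remaining = iter(values)

    def replace(match):
        spec = match.group()
        if spec == '%%':
            return '%'  # literal percent sign consumes no value
        return format_one(spec, next(remaining), grouping)

    # Text between specifiers is left untouched by the substitution.
    return SPEC_RE.sub(replace, template)

locale.setlocale(locale.LC_NUMERIC, 'C')
combined = format_many('%.0f KB of %d', (100, 5))
print(combined)
```

The strict helper only ever sees a bare specifier, which is why rejecting extra text there is harmless as long as callers with surrounding text go through the whole-string function.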
On Nov 12, 2010, at 12:15 AM, R. David Murray wrote:
+1
Dang. |
msg120978 "The bug has been fixed upstream...". Have I missed something as on Windows Vista...?
c:\Users\Mark\MyPython>python
Python 3.3.1rc1 (v3.3.1rc1:92c2cfb92405, Mar 25 2013, 22:39:19) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale; print(locale.format('%.0f KB', 100))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\python33\lib\locale.py", line 193, in format
    "format specifier, %s not valid") % repr(percent))
ValueError: format() must be given exactly one %char format specifier, '%.0f KB' not valid |
Barry meant that the upstream program that triggered this error has been changed to call format_string() instead of format(). The bug still exists in format(). My suggestion is to have format() be an alias for format_string(). Deprecating format() is an optional step, but may not be worth the hassle. |
On Apr 02, 2013, at 11:32 AM, Eric V. Smith wrote:
Agreed on both counts. |
So I guess the question is: would this be a bug fix and applied to 2.7 and 3.3, or just an enhancement for 3.4? I think it would be a bug fix and thus should be backported. It's not like we'd be breaking any working code, unless it was expecting the exception. |
On Apr 02, 2013, at 03:04 PM, Eric V. Smith wrote:
That would be my preference. |
Oops. I merged the patch without coming back here first :(. Still getting used to the new workflow. It turns out that format has a parameter, monetary, that isn't supported by format_string. So what we did was add that parameter to format_string and deprecate format. If there is objection to this solution I will revert the merge. |
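A rough sketch of what the resolution above amounts to from a caller's point of view. `format_compat` is a hypothetical wrapper, not the code that was merged, and the example assumes Python 3.7 or later, where `locale.format_string()` accepts a `monetary` argument.

```python
import locale
import warnings

def format_compat(percent, value, grouping=False, monetary=False):
    """Hypothetical stand-in for a deprecated format() that delegates."""
    warnings.warn(
        "format_compat() is deprecated; use locale.format_string() instead.",
        DeprecationWarning, stacklevel=2
    )
    # Assumes format_string() takes `monetary` (Python 3.7+).
    return locale.format_string(percent, value, grouping, monetary)

locale.setlocale(locale.LC_ALL, 'C')
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')  # make sure the warning is recorded
    result = format_compat('%.0f KB', 100)
print(result, [w.category.__name__ for w in caught])
```

Delegating keeps old call sites working, including those that pass trailing text, while the warning points callers at the function that remains.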