PyUnicode_FromFormat integer format handling different from printf about zeropad #72601

zhangyangyu · 2016-10-11T08:56:20Z

BPO	28415
Nosy	@terryjreedy, @vstinner, @ericvsmith, @serhiy-storchaka, @ztane, @zhangyangyu, @mlouielu
PRs	bpo-28415: Note 0 conversion different between Python and C #885

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = <Date 2017-04-27.03:38:33.019>
created_at = <Date 2016-10-11.08:56:20.071>
labels = ['interpreter-core', 'easy', 'type-bug', '3.7', 'docs']
title = 'PyUnicode_FromFormat integer format handling different from printf about zeropad'
updated_at = <Date 2017-04-27.03:38:33.017>
user = 'https://github.com/zhangyangyu'

bugs.python.org fields:

activity = <Date 2017-04-27.03:38:33.017>
actor = 'xiang.zhang'
assignee = 'docs@python'
closed = True
closed_date = <Date 2017-04-27.03:38:33.019>
closer = 'xiang.zhang'
components = ['Documentation', 'Interpreter Core']
creation = <Date 2016-10-11.08:56:20.071>
creator = 'xiang.zhang'
dependencies = []
files = []
hgrepos = []
issue_num = 28415
keywords = ['easy']
message_count = 7.0
messages = ['278467', '278528', '278666', '290776', '290881', '291175', '292392']
nosy_count = 8.0
nosy_names = ['terry.reedy', 'vstinner', 'eric.smith', 'docs@python', 'serhiy.storchaka', 'ztane', 'xiang.zhang', 'louielu']
pr_nums = ['885']
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue28415'
versions = ['Python 3.7']

zhangyangyu · 2016-10-11T08:56:20Z

Although the documentation declares it *exactly equivalent* to printf, PyUnicode_FromFormat can produce a different result from printf for the same format.

For example:

>>> from ctypes import pythonapi, py_object, c_int
>>> f = getattr(pythonapi, 'PyUnicode_FromFormat')
>>> f.restype = py_object
>>> f(b'%010.5d', c_int(100))
'0000000100'

while printf outputs:

printf("%010.5d\n", 100);
     00100

I compiled this with both gcc and clang and got the same result. gcc also gives me a warning:

warning: '0' flag ignored with precision and ‘%d’ gnu_printf format

I am not sure whether this should be fixed. Changing the behavior could break backwards compatibility.

ztane · 2016-10-12T12:36:08Z

To be more precise, C90, C99, C11 all say that ~"For d, i, o, u, x and X conversions, if a precision is specified, the 0 flag will be ignored."
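
To make the quoted rule concrete, here is a minimal Python sketch of how C treats the 0 flag for the d conversion only. The helper name `c_like_decimal` and its parameters are hypothetical, invented for illustration; this is not how any libc or CPython implements it, and it ignores the other flags.

```python
def c_like_decimal(value, width=0, precision=None, zero_flag=False):
    """Hypothetical helper: format an int roughly as C's %d would."""
    sign = '-' if value < 0 else ''
    digits = str(abs(value))
    if precision is not None:
        # Precision is the minimum number of digits. C prints no digits
        # at all for the value 0 with an explicit precision of 0.
        digits = '' if (precision == 0 and value == 0) else digits.rjust(precision, '0')
        # The 0 flag is ignored here, so the field is padded with spaces.
        return (sign + digits).rjust(width)
    if zero_flag:
        # No precision: zeros fill the field between the sign and the digits.
        return sign + digits.rjust(max(width - len(sign), 0), '0')
    return (sign + digits).rjust(width)


print(repr(c_like_decimal(100, width=10, precision=5, zero_flag=True)))
print(repr(c_like_decimal(100, width=10, zero_flag=True)))
```

The first call corresponds to printf("%010.5d", 100) and the second to printf("%010d", 100).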

terryjreedy · 2016-10-14T21:21:57Z

I presume that PyUnicode_FromFormat is responsible for the first of the following:
>>> '%010.5d' % 100
'0000000100'
>>> b'%010.5d' % 100
b'0000000100'

I am strongly of the opinion that the behavior should be left alone and the C-API doc changed by either 1) replacing 'exactly' with 'nearly' or 2) adding the following: "except that a 0 conversion flag is not ignored when a precision is given for d, i, o, u, x and X conversion types" (and other exceptions as discovered).

I took the terms 'conversion flag' and 'conversion type' from
https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting
https://docs.python.org/3/library/stdtypes.html#printf-style-bytes-formatting

I consider the Python behavior to be superior.  The '0' conversion flag, the '.' precision indicator, and the int conversion types are literal characters.  If one does not want the '0' conversion, one should omit it and not write it to be ignored.
>>> '%10.5d' % 100
'     00100'

And I consider the abolition of int 'precision' in {} formatting even better.
>>> '{:010.5d}'.format(100)
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    '{:010.5d}'.format(100)
ValueError: Precision not allowed in integer format specifier

It has always been a source of confusion, and there is hardly any real-world use case for a partial 0 fill.
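
For the rare case where a partial 0 fill is wanted with {} formatting, one possible approach is to nest two format calls. This is only a sketch: the helper name `partial_zero_fill` is hypothetical, and for negative numbers the sign counts toward the zero-padded width, which differs from a C precision.

```python
def partial_zero_fill(value, width, digits):
    """Hypothetical helper: zero-pad to `digits`, then space-pad to `width`."""
    # The inner format zero-pads the number (the sign, if any, counts toward
    # `digits`); the outer format right-aligns the result with spaces.
    zero_padded = '{:0{d}d}'.format(value, d=digits)
    return '{:>{w}}'.format(zero_padded, w=width)


print(repr(partial_zero_fill(100, 10, 5)))
```

For non-negative values this gives the same text as '%10.5d' % value.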

mlouielu · 2017-03-29T10:20:21Z

I added a note block under Py*_FromFormat in unicode.rst and bytes.rst.

Could Xiang or Terry help to review? Thanks.

terryjreedy · 2017-03-30T20:45:00Z

*Way* too wordy. In msg278666, I suggested minimal and terse alternatives.
a. /exactly/nearly/
b. add "except that a 0 conversion flag is not ignored when a precision is given for d, i, o, u, x and X conversion types"

terryjreedy · 2017-04-05T11:00:54Z

(Response to what I believe is latest patch.) In msg278666, my two suggestions were 'either...or', not both. The list came from Antti's msg278528, but the correct list for Python appears to be different, and different for bytes and unicode. When I made the suggestion, I did not realize that 'exactly' was repeated for each conversion type in a table. As a note, I think the following might work. "For <list of> conversion types, the 0-conversion flag has effect even when a precision is given."

I also think that 'exactly' could be dropped where it is not exactly true.

zhangyangyu · 2017-04-27T03:36:38Z

New changeset 88c38b3 by Xiang Zhang (Louie Lu) in branch 'master':
bpo-28415: Note 0 conversion different between Python and C (#885)

zhangyangyu added the 3.7, interpreter-core, and type-bug labels Oct 11, 2016

terryjreedy added the docs label Oct 14, 2016

terryjreedy assigned docspython Oct 14, 2016

MojoVampire mannequin changed the title ~~PyUnicode_FromFromat interger format handling different from printf about zeropad~~ PyUnicode_FromFormat integer format handling different from printf about zeropad Oct 19, 2016

zhangyangyu added the easy label Mar 17, 2017

zhangyangyu closed this as completed Apr 27, 2017

ezio-melotti transferred this issue from another repository Apr 10, 2022
