ndarray should offer __format__ that can adjust precision #5543

Open
brandon-rhodes opened this Issue Feb 8, 2015 · 16 comments

@brandon-rhodes

brandon-rhodes commented Feb 8, 2015

In many wonderful cases an ndarray can be used in place of a Python float and Just Work.

But not in one case:

import numpy as np

n = 1.23
print('{0:.6} AU'.format(n))

n = np.array([1.23, 4.56])
print('{0:.6} AU'.format(n))

The output of the above code, at least under Python 3.4, is:

1.23 AU
Traceback (most recent call last):
  File "tmp9.py", line 7, in <module>
    print('{0:.6} AU'.format(n))
TypeError: non-empty format string passed to object.__format__

It would be a great convenience if the ndarray grew a __format__() method that understood the tiny mini-language of float formatting, and used the number of digits of precision specified there to make its own call to the standard NumPy vector array formatting. Users could control array appearance on the screen using a Python standard that many programmers already understand.

brandon-rhodes added a commit to skyfielders/python-skyfield that referenced this issue Feb 8, 2015

Make it possible to str() a vector Distance
Without this tweak, the attempt to print was dying with an error,
because a NumPy array does not know what to do with a '.6' format
string:

Traceback (most recent call last):
  File "tmp9.py", line 9, in <module>
    print(mars(tt=2457061.5).position)
  File "/home/brandon/skyfield/skyfield/units.py", line 50, in __str__
    return '{0:.6} AU'.format(self.AU)
TypeError: non-empty format string passed to object.__format__

In response I have opened: numpy/numpy#5543
@njsmith

Member

njsmith commented Feb 8, 2015

That would be lovely, yes. Any interest in putting together a patch?

@brandon-rhodes

Author

brandon-rhodes commented Feb 8, 2015

Yes, I will try my hand at it and let you know if it works! Thanks for letting me know that the feature will be welcomed, before I got started.

@jaimefrio

Member

jaimefrio commented Feb 8, 2015

What would the expected output be? Numpy seems to be doing what all Python sequences do, and I don't think that breaking that commonality is a good idea:

>>> a = [1.23, 4.56]
>>> aa = np.array(a)

# Python 3.4
>>> print('{0:.6} AU'.format(a))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: non-empty format string passed to object.__format__
>>> print('{0:.6} AU'.format(aa))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: non-empty format string passed to object.__format__

# Python 2.7
>>> print('{0:.6} AU'.format(a))
[1.23, AU
>>> print('{0:.6} AU'.format(aa))
[ 1.23 AU

@njsmith

Member

njsmith commented Feb 8, 2015

Ah, I thought we were talking about array scalars (where we obviously
should respect the format string).

It does seem like supporting .format() in a useful way would be an
obviously good thing. If it's not obvious what the most useful way is then
someone should make a proposal on numpy-discussion and go from there.
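For reference, the array scalar case mentioned above can be checked with a short sketch. It assumes the default float64 dtype, whose scalar type subclasses Python's float and therefore accepts the same format spec; the variable names are only illustrative.

```python
import numpy as np

# Indexing a float64 array yields an array scalar, not a plain Python float.
scalar = np.array([1.23456789, 4.56])[0]

# np.float64 subclasses float, so the float format mini-language applies.
print(type(scalar).__name__)
print('{0:.3} AU'.format(scalar))
```

Only whole arrays, not their individual elements, run into the TypeError discussed in this issue.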


@brandon-rhodes

Author

brandon-rhodes commented Feb 8, 2015

@jaimefrio — my understanding is that the entire design of NumPy arrays is precisely to do things that all Python sequences do not. Normal lists cannot accept division; NumPy arrays can. Normal lists cannot be taken to a power; NumPy arrays can. The whole point was to break commonality very nearly everywhere, from what I can see of its design, so that lists would act like numbers.

Accepting a format string is, in my view, symmetrical with division and being taken to a power.

The behavior I am anticipating is roughly that of running str() on a NumPy array, but with the format adjusted as though numpy.set_printoptions had been called to set whatever precision is specified by the format string.
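A minimal sketch of that behavior, assuming a hypothetical helper named format_array (not part of NumPy) that only understands specs of the form '.N'. It passes the precision to np.array2string per call rather than changing global print options. One caveat: NumPy's precision counts digits after the decimal point, whereas '.6' on a Python float counts significant digits, so a real proposal would have to pick one meaning.

```python
import numpy as np


def format_array(arr, spec):
    """Hypothetical helper: render arr like str(arr), at the precision in spec.

    Only specs of the form '.N' are handled; this is not NumPy API.
    """
    if not spec:
        return str(arr)
    if not (spec.startswith('.') and spec[1:].isdigit()):
        raise ValueError('unsupported format spec: %r' % spec)
    # array2string takes a per-call precision, so the global print
    # options set by numpy.set_printoptions are left untouched.
    return np.array2string(arr, precision=int(spec[1:]))


n = np.array([1.23456789, 4.56])
text = format_array(n, '.3')
print('{0} AU'.format(text))
```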

@mhvk

Contributor

mhvk commented Feb 9, 2015

I like the suggestion. The array scalar case would certainly just be to have expected behaviour, and it would seem the route of least surprise to apply the format to the individual elements (and otherwise behave as if no format string was given).
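Applying the format to the individual elements could look roughly like the sketch below. The helper name format_elements is hypothetical; it relies on the formatter argument of np.array2string, and it only affects floating-point elements.

```python
import numpy as np


def format_elements(arr, spec):
    """Hypothetical helper: apply a float format spec to each element."""
    if not spec:
        # No format string given: behave exactly like str().
        return str(arr)
    return np.array2string(
        arr,
        # 'float_kind' routes every floating-point element through format().
        formatter={'float_kind': lambda x: format(x, spec)},
    )


n = np.array([1.23456789, 4.56])
result = format_elements(n, '.3f')
print(result)
```

Because each element goes through format() itself, the spec keeps the meaning it has for a plain Python float.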

@jaimefrio

Member

jaimefrio commented Feb 9, 2015

It certainly seems convenient, but there may be a good reason why Python lists, or tuples, do not do the same. The change in behavior from Python 2.x to 3.x probably means that raising an error is a conscious decision. I don't follow any of the Python forums, perhaps someone will know if this has been brought up before somewhere else.

Either way, it is probably a good idea to give this a run through the mailing list, where it will get looked at by more people than here.

@brandon-rhodes

Author

brandon-rhodes commented Feb 10, 2015

Python lists and tuples do not support __format__() because (a) they are non-uniform — they can have any sorts of data inside, so it is not clear whether a format string would be interpreted numerically or for strings or what, and (b) because they really do no formatting of their own: they just wrap their own parens and commas around the repr()'s of whatever strings, ints, floats, and other objects that they happen to contain.
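Both points can be checked in plain Python 3 with a short sketch. The exact wording of the TypeError message varies between Python versions, so only the exception type is relied on here.

```python
values = [1.23, 4.56]

# An empty format spec is accepted and falls back to str().
print('{0} AU'.format(values))

# A non-empty spec is rejected instead of being silently ignored.
try:
    '{0:.6} AU'.format(values)
except TypeError as exc:
    print('TypeError:', exc)

# The list does no formatting of its own: it wraps brackets and commas
# around the repr() of each element.
wrapped = '[' + ', '.join(repr(v) for v in values) + ']'
print(str(values) == wrapped)
```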

A NumPy array is in a quite different situation, unless its members are of dtype object_, in which case I entirely agree with you that the array should follow the excellent lead of tuple() and list() by refusing any format string. If the array is numeric, then it does not contain smaller objects, and therefore cannot delegate to them with repr(). Instead it takes charge of formatting the floats or ints inside for the screen, making all kinds of format-y decisions like indentation, breaking its output into lines, and even omitting sections of data if there are too many floats.

The change in behavior was simply to avoid disappointing users who provide a format string: if the string is accepted by a type, and raises no error, then people reasonably expect to see some change in format. Under the old Python 2 behavior they saw none: their format string was accepted quietly and then just ignored. Here is the issue in which it was negotiated that an error was better for users than an ignored formatting string:

http://bugs.python.org/issue7994

I propose that NumPy arrays adhere quite strictly to this standard behavior: in situations where the array itself is formatting its contents, accepting and using the string will allow the user to decide how many decimal places are shown, in the standard way that users already set the precision of data they are printing. In situations where the NumPy array contains other objects, and is doing no numeric formatting of its own, then I see no problem with its ignoring the format string (unless someone wants to make the argument that a NumPy array ought to broadcast it to its members!) and raising the “you gave me a format string I can't use!” exception that is now standard in Python to warn users away from trying to format things in situations where the format will be ignored.
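The rule proposed here can be sketched as an ndarray subclass. Everything below is an illustration under assumptions rather than NumPy's implementation: the class name FormattableArray is made up, only floating-point elements honor the spec, and the error message text is invented.

```python
import numpy as np


class FormattableArray(np.ndarray):
    """Hypothetical ndarray subclass illustrating the proposed rule."""

    def __format__(self, spec):
        if not spec:
            return str(self)
        if self.dtype == object:
            # The array does no numeric formatting of its own here, so it
            # refuses the spec rather than silently ignoring it.
            raise TypeError(
                'non-empty format string passed to an object array')
        # Numeric case: the array formats its own contents, so use the spec.
        return np.array2string(
            self, formatter={'float_kind': lambda x: format(x, spec)})


floats = np.array([1.23456789, 4.56]).view(FormattableArray)
objects = np.array(['a', None], dtype=object).view(FormattableArray)

print('{0:.3f} AU'.format(floats))
try:
    '{0:.3f}'.format(objects)
except TypeError as exc:
    print('rejected:', exc)
```

Whether the spec should instead be broadcast to the members of an object array is left open, as in the comment above.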

@jaimefrio

Member

jaimefrio commented Feb 10, 2015

You have me convinced... Thanks for the detailed explanation!

@brandon-rhodes

Author

brandon-rhodes commented Feb 12, 2015

Thank you for pushing for an improved proposal — the idea that a NumPy array should raise an exception when given a format string if the NumPy array's dtype is object_ had simply not occurred to me, and it's an edge case that users will be very happy that NumPy gets right!

@njsmith (or whomever would like to), feel free to assign this to me so that it shows up on my GitHub to-do list when I sit down this weekend. Thanks for the chance to add this!

@gustavla


gustavla commented Feb 2, 2017

I'm really interested in this feature (and willing to contribute) and wanted to check what progress has been made in the last two years. I can't seem to access numpy-discussion right now to see if the discussion has continued there.

@brandon-rhodes

Author

brandon-rhodes commented Feb 3, 2017

Apparently, no one assigned it to me. By that first weekend I had already forgotten about this issue given the press of other responsibilities — so, no progress from me yet.

@shoyer

Member

shoyer commented Feb 3, 2017

@gustavla see here for the numpy-discussion mailing list. But I don't think this has been discussed before.

Our rule is that API changes need to reach consensus on the mailing list. This feature feels like a pretty clear win to me (especially with the arrival of f-strings), so I don't anticipate any objections. Still, it would be good to come up with a concrete proposal on how it should work and run that by the mailing list before starting work.

@anntzer

Contributor

anntzer commented Feb 15, 2017

@gustavla Tangentially related issue: #6136 (just scratching my own itch...)

eric-wieser added a commit to eric-wieser/numpy that referenced this issue Oct 18, 2017

ENH: Implement ndarray.__format__ for 0d arrays
This fixes numpygh-7978

The behavior for other sized arrays is left unchanged, pending discussion in numpygh-5543

theodoregoetz added a commit to theodoregoetz/numpy that referenced this issue Oct 23, 2017

ENH: Implement ndarray.__format__ for 0d arrays
This fixes numpygh-7978

The behavior for other sized arrays is left unchanged, pending discussion in numpygh-5543

@anntzer anntzer referenced this issue Oct 27, 2017

Merged

Improve reprs of transforms. #9421

@akshaybabloo


akshaybabloo commented Oct 29, 2018

In many wonderful cases an ndarray can be used in place of a Python float and Just Work.

But not in one case:

import numpy as np

n = 1.23
print('{0:.6} AU'.format(n))

n = np.array([1.23, 4.56])
print('{0:.6} AU'.format(n))

The output of the above code, at least under Python 3.4, is:

1.23 AU
Traceback (most recent call last):
  File "tmp9.py", line 7, in <module>
    print('{0:.6} AU'.format(n))
TypeError: non-empty format string passed to object.__format__

It would be a great convenience if the ndarray grew a __format__() method that understood the tiny mini-language of float formatting, and used the number of digits of precision specified there to make its own call to the standard NumPy vector array formatting. Users could control array appearance on the screen using a Python standard that many programmers already understand.

In Python 3.7, the .format() seems to be working.

@asottile

Contributor

asottile commented Dec 8, 2018

throwing another hat into the ring from #12491

import numpy as np
x, y = np.array([-969100.0]), np.array([-4457000.0])

# This works
"%.4g, %.4g" % (x, y)

# This errors
"{:.4g}, {:.4g}".format(x, y)

This happens because 0-d and size-1 arrays are treated as their scalar in many places, but not in __format__.
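Until ndarray handles this itself, one possible workaround (a sketch, not an official recommendation) is to pull the scalar out explicitly with ndarray.item(), which returns a plain Python float for a one-element float array:

```python
import numpy as np

x, y = np.array([-969100.0]), np.array([-4457000.0])

# .item() converts each one-element array to a Python float, which
# understands the '.4g' format spec.
text = "{:.4g}, {:.4g}".format(x.item(), y.item())
print(text)
```

Calling .item() on an array with more than one element raises a ValueError, so this only suits the size-1 case described above.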
