ENH: fix 0d array printing using `str` or `formatter`. #9332
numpy/core/arrayprint.py
Outdated
@@ -521,7 +515,9 @@ def array2string(a, max_line_width=None, precision=None,
    if formatter is None:
        formatter = _formatter

    if a.size == 0:
    if style is not None and a.shape == () and a.dtype.names is None:
        return style(a[()])
Worth adding a comment here that the `names is None` check is to avoid co-recursion with `voidtype_str`, which upcasts structured scalars to arrays again.
Added comment, also reorganized `voidtype_str`.
        return ret;
    }
    else {
        PyObject *item, *item_str;
Too many spaces here
        assert_equal(repr(np.datetime64('2005-02-25')[...]),
                     "array('2005-02-25', dtype='datetime64[D]')")

        # repr of 0d arrays is affected by printoptions
        x = np.array(1)
        np.set_printoptions(formatter={'all':lambda x: "test"})
        assert_equal(repr(x), "array(test)")
Your comment below got lost on GitHub, but I see it in the email.
This test class implements `tearDown`, which resets the printoptions, so we don't need to worry about pollution. Will add the `str` test.
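
A minimal sketch of the isolation pattern described in this comment, assuming a plain `unittest.TestCase`. The class and method names are hypothetical and are not the ones in NumPy's test suite. The options are saved in `setUp` and restored in `tearDown`, so a `formatter` set inside one test should not leak into the next:

```python
import unittest
import numpy as np


class ZeroDimPrintSketch(unittest.TestCase):
    """Hypothetical test class; only illustrates the setUp/tearDown pattern."""

    def setUp(self):
        # Remember the global print options before each test
        self.oldopts = np.get_printoptions()

    def tearDown(self):
        # Restore them afterwards, whatever the test changed
        np.set_printoptions(**self.oldopts)

    def test_repr_uses_formatter(self):
        x = np.array(1)
        np.set_printoptions(formatter={'all': lambda v: "test"})
        # repr of a 0d array goes through the formatter ...
        self.assertEqual(repr(x), "array(test)")
        # ... while str behaves like the scalar's str
        self.assertEqual(str(x), "1")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ZeroDimPrintSketch)
result = unittest.TextTestRunner().run(suite)
```

After the run, `np.get_printoptions()['formatter']` is back to whatever it was before the test started.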
LGTM. There's some related stuff in #9201, but we can deal with that later.
        ret = PyObject_Str((PyObject *)arr);
        Py_DECREF(arr);
    }
    return ret;
Can we get rid of this, and make a call to something in `core.multiarray._internal`? This is the recursive point I'd rather not have. The function would look like:

    def _void_repr(x):
        fields = ', '.join(repr(xi) for xi in x)
        if len(x) == 1:
            return "(" + fields + ",)"
        else:
            return "(" + fields + ")"

Which is similar to `StructureFormat.__call__`, but doesn't invoke formatters - which is what we want - scalar `repr` shouldn't be customizable, should it?
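
To make the proposal concrete, here is a runnable sketch of the same idea, exercised on plain Python tuples rather than structured scalars. The name `void_repr_sketch` is hypothetical. The point is only that the result follows Python's tuple conventions, including the trailing comma for a single field, and never consults any formatter:

```python
def void_repr_sketch(x):
    """Format a sequence of fields the way a Python tuple would print."""
    fields = ', '.join(repr(xi) for xi in x)
    # A single-field record needs a trailing comma, like a 1-tuple
    if len(x) == 1:
        return "(" + fields + ",)"
    return "(" + fields + ")"


print(void_repr_sketch((1, 2.5, 'a')))  # (1, 2.5, 'a')
print(void_repr_sketch((3,)))           # (3,)
```

For plain tuples the output matches `repr(tuple(x))`, which is a convenient sanity check for the one-field case.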
numpy/core/arrayprint.py
Outdated
    if a.size == 0:
    # the ``dtype.names is None`` check is to avoid co-recursion with
    # voidtype_str, which upcasts structured scalars to 0d arrays
    if style is not None and a.shape == () and a.dtype.names is None:
Then we can ditch this check too
5f045dd
to
efa131b
Compare
Yeah, eliminating the recursion that way is a good idea. Done in the new commit. There was a small catch, though, that we have to treat subarray-fields carefully. Previously they were specially printed using the |
I just pushed an extra commit with an alternate implementation of |
5b8a483
to
e8c322d
Compare
numpy/core/arrayprint.py
Outdated
def _void_scalar_repr(x):
    format_functions = []
    for field_name in x.dtype.names:
        format_function = _get_format_function(ravel(x[field_name]),
How about `format_function = repr`? I might be contradicting myself here, but I think that `repr(void)` should not be reconfigurable - isn't that now the case for other scalar `repr`s?
Sorry, I missed your above comments - you already did this, and wanted me to compare.
The implementation of the second commit looks a little tidier in iterating over the dtype rather than the value, but I think we should be using I think we should aim to keep the output constant no matter what the print settings are.
numpy/core/arrayprint.py
Outdated
@@ -521,7 +515,9 @@ def array2string(a, max_line_width=None, precision=None,
    if formatter is None:
        formatter = _formatter

    if a.size == 0:
    if style is not None and a.shape == ():
        return style(a[()])
I think we can kill this `if` entirely now, and leave the `style` argument as obsolete?
How do we want I can see it both ways. The But I admit the |
Also, my original motivation for using |
Is that true? I would have thought using |
Ultimately, I think this should be the goal of the scalar
This is already untrue of floating point and datetime types, so I don't think it needs to be a goal.
That's a reasonable argument, that I thought of too. But I think the important thing is that unlike arrays, |
Either way, no point touching this till we pull the trigger on #9139
e8c322d
to
8fae6fc
Compare
I've gone ahead and rebased this, and then added a single commit on the end with the |
@@ -634,6 +608,34 @@ static PyObject *
/**end repeat**/

static PyObject *
voidtype_str(PyObject *self)
{
    if (PyDataType_HASFIELDS(((PyVoidScalarObject*)self)->descr)) {
Extra parens here
No, I think we need them all... compiler errors for if (PyDataType_HASFIELDS((PyVoidScalarObject*)self->descr))
My bad, I was incorrectly reading `if ((PyDataType_HASFIELDS(...)))` here
        if sys.version_info[0] >= 3:
            assert_equal(str(np.array('café', np.unicode_)), 'café')
            assert_equal(repr(np.array('café', np.unicode_)),
                         "array('café',\n dtype='<U4')")
This has an issue number, right?
I don't think so... I think we came up with this one in #9143.
I actually somewhat dislike truncating floats if they appear on their own. Inside a So I still favor that |
OK, that is a good argument as well. And since with your current implementation I can override this if I wish by using |
be1e90a
to
e3ac30b
Compare
The str of 0d arrays now returns `str(a[()])` like scalars, and the repr returns `'array(' + formatter(a[()]) + ')'` like ndarrays. The default implementation of str and repr for user-defined types is removed. Fixes numpy#1415
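
As a rough illustration of the behaviour this commit message describes (a sketch meant to be run against a NumPy release that includes the change, not code from the PR), the `str` of a 0d array can be compared directly with the `str` of the scalar it wraps:

```python
import numpy as np

# str() of a 0d array should match str() of the scalar it wraps
for value in (1, 2.5, 'café'):
    a = np.array(value)
    assert a.shape == ()
    assert str(a) == str(a[()])

# repr() keeps the ndarray-style wrapper around the formatted element
print(repr(np.array(1)))  # array(1)
```

Comparing against `str(a[()])` instead of a hard-coded string keeps the check independent of how a particular NumPy version formats each scalar type.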
136fd06
to
e2e14ae
Compare
numpy/core/arrayprint.py
Outdated
    if a.shape == () and not a.dtype.names:
        if _format_options['legacy']:
            return str(a.item())
        return str(a[()])
What effect is guarding this under `a.dtype.names` in the non-legacy case having?
I'd perhaps be inclined to write this as the slightly longer

    if _format_options['legacy'] and a.shape == () and not a.dtype.names:
        return str(a.item())
    # actually justify this behavior in a comment here, where it's a feature, not deprecated behaviour
    if a.shape == () and not a.dtype.names:
        return str(a[()])

just to separate the legacy conditions from the rest
Yeah good idea for clarity.
I had the same thought about `a.dtype.names` when writing this. I did confirm for myself that in 1.13 the `a.dtype.names` case was special-cased to use the `formatter` to represent the value, i.e. it should go through the `array2string` code, which uses `formatter`.
Hmm but actually, here is the implication of excluding `a.dtype.names` in the non-legacy case.
If we do `if a.shape == () and not a.dtype.names:`, then the eventual output is the output of `StructureFormat`, but with the `precision` specified as argument to `array_str`, but all other settings from `_format_options`. I am not sure the `precision` argument is ever used.
If we only do `if a.shape == ()`, then the eventual output is still the output of `StructureFormat`, but with all settings, including `precision`, from `_format_options`. We go through a different path through `_void_scalar_repr` to get there but the result is almost identical.
I'm going to remove the `a.dtype.names` test in the non-legacy case, just to make the code simpler. I don't think it has any real effect. I'll leave it in legacy mode since it makes a difference there.
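
The legacy and non-legacy branches discussed in this thread differ in whether the value is taken with `a.item()` or `a[()]`. A small sketch of that split, using a hypothetical helper `zero_dim_str_sketch` that mirrors the two conditions (it is not the actual `array_str` code and ignores print options entirely):

```python
import numpy as np


def zero_dim_str_sketch(a, legacy=False):
    # Legacy condition kept separate: unstructured 0d arrays print via
    # the plain Python object returned by item()
    if legacy and a.shape == () and not a.dtype.names:
        return str(a.item())
    # Non-legacy: any 0d array prints like the NumPy scalar it wraps
    if a.shape == ():
        return str(a[()])
    # Everything else falls back to regular array printing
    return np.array2string(a)


a = np.array(1.5)
print(type(a.item()).__name__)  # float   (a Python object)
print(type(a[()]).__name__)     # float64 (a NumPy scalar)
print(zero_dim_str_sketch(a), zero_dim_str_sketch(a, legacy=True))
```

For a simple float both branches print the same text; the distinction matters for types whose Python object and NumPy scalar format differently.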
numpy/core/arrayprint.py
Outdated
    if options['legacy']:
        if a.shape == () and not a.dtype.names:
            return style(a.item())
    elif not (style is np._NoValue):
nit: style is not np._NoValue
120f0b8
to
8df86b6
Compare
doc/release/1.14.0-notes.rst
Outdated
character for positive values. The new default is '-'.

This new default changes the float output relative to numpy 1.13. The old
behavior can be obtained in "legacy" printing mode, see compatibility notes.
nit: add the word "above" here, and in the other place you refer to it.
numpy/core/arrayprint.py
Outdated
        if a.shape == () and not a.dtype.names:
            return style(a.item())
    elif style is not np._NoValue:
        warnings.warn("'style' argument is deprecated and no longer functional"
You lost the deprecation date comment here
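
For readers unfamiliar with the sentinel pattern in this hunk, here is a self-contained sketch. `_NoValue` below is a hypothetical stand-in for `np._NoValue`, `array2string_sketch` is not the real function, and defaulting `style` to `repr` in legacy mode is an assumption made for the example. The sentinel lets the function tell "argument not passed" apart from any real value, so the warning fires only when a caller explicitly supplies `style` outside legacy mode:

```python
import warnings


class _NoValueType:
    """Hypothetical sentinel type standing in for np._NoValue."""

    def __repr__(self):
        return "<no value>"


_NoValue = _NoValueType()


def array2string_sketch(value, style=_NoValue, legacy=False):
    if legacy:
        # Assumption for this sketch: legacy mode still honours `style`
        if style is _NoValue:
            style = repr
        return style(value)
    elif style is not _NoValue:
        # Deprecation 2017-11-09, v1.14
        warnings.warn("'style' argument is deprecated and no longer functional",
                      DeprecationWarning, stacklevel=2)
    return repr(value)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    array2string_sketch(1, style=str)               # warns
    array2string_sketch(1)                          # silent
    array2string_sketch(1, style=str, legacy=True)  # silent
print(len(caught))  # 1
```

Keeping the deprecation date in a comment next to the `warnings.warn` call makes it easy to find when the argument is eventually removed.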
@@ -302,11 +317,13 @@ def test_sign_spacing(self):
        assert_equal(repr(np.array(1.)), 'array(+1.)')
        assert_equal(repr(b), 'array([+1.234e+09])')

        np.set_printoptions(sign='legacy')
        np.set_printoptions(legacy=True)
Would be nice to have a test that `style` is not deprecated in legacy mode.
Looks great! Nits about release note and tests, but happy with the iteration we've landed on.
8df86b6
to
bbf5438
Compare
Marking this for 1.14 release. If there are more unmerged PRs involving printing/formatting, it would be helpful to also set their milestones.
@charris, I just went through and added the last 2 to 1.14
Great, thanks Allan. I'm going to start pushing off most of the other PRs to 1.15, but I want to make sure all the printing/formatting changes get in.
I think everything is done. I'd like to merge this one before rebasing #8981 because of the dependencies.
        if a.shape == () and not a.dtype.names:
            return style(a.item())
    elif style is not np._NoValue:
        # Deprecation 11-9-2017 v1.14
Can we make this `2017-11-09` please (like we do elsewhere)? As a European, this format is silly ;)
(yes, I realized that I missed it the first round)
All looks good to me! I've rebased #9201 with just the unicode changes.
One tiny nit about date formats in deprecation warning.
Sounds like folks are happy with this, so in it goes. Thanks Allan, and thanks to the rest of the print formatting crew.
This is split off from #9139, plus has some extra editing of the release notes.