Better introspection for comparing numpy arrays #5347

timhoffm · 2019-05-30T12:38:48Z

With plain python lists I can do:

a = [1, 2, 3]
b = [1, 2, 4]
assert a == b

That does not work for numpy arrays

        a = np.array([1, 2, 3])
        b = np.array([1, 2, 4])
>       assert a == b
E       ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

There are two ways out:

Using numpy.testing
```
np.testing.assert_equal(a, b)
```
This has the advantage of giving good context information on the error.
But it is far from the simple assert assert a == b paradigm in pytest.

Alternatively we can aggregate the the check to a single bool

assert np.all(a == b)

This looks much nicer, but pytest cannot tell anymore where the error is:

>       assert np.all(a == b)
E       assert False
E        +  where False = <function all at 0x7f57cd65eea0>(array([1, 2, 3]) == array([1, 2, 4])
E        +    where <function all at 0x7f57cd65eea0> = np.all

Question: Would it in principle be possible with pytest's introspection capabilities to resolve the contents of np.all() and give a better error message? In a frist step maybe something like

assert np.all(a == b)
  At index 2: False

This of course, would require an assumption on the semantics of np.all.

Alternative idea: ~~Could pytest be made to rewrite assert a == b to np.testing.assert_equal(a, b) if either a or b is a numpy array?~~ Possibly not a good idea because it introduces a different semantic compared to a plain a == b.

The text was updated successfully, but these errors were encountered:

asottile · 2019-06-02T01:45:46Z

== is not an equality check for numpy arrays

>>> x = np.array([1, 2, 3])
>>> y = np.array([1, 2, 3])
>>> x == y
array([ True,  True,  True])

Zac-HD · 2019-06-03T02:33:38Z

IMO the idiomatic solution is to use np.testing, and if you want more detailed feedback the right place to generate more detailed messages would be upstream in Numpy.

(we've run into similar issues in Hypothesis; numpy arrays are just complex enough to need special handling and papering over the differences only makes it worse in the long run 😭)

twmr · 2019-06-03T08:45:58Z

Does pytest allow to write e.g. a numpy testing plugin that extends the assert statement bytecode rewriter? This plugin would also need to provide a new version of the pytest.approx function, which exposes the functionality of the numpy.testing.assert_* functions.

nicoddemus · 2019-06-06T22:38:10Z

numpy testing plugin that extends the assert statement bytecode rewriter?

That's an interesting proposal. We would need to figure out an API and hooks in order to do that though. This is interesting as it would allow all kinds of extensions, like the one proposed here and #3479.

asottile · 2019-06-06T23:14:38Z

even with that, I don't think the one proposed here is a good proposal, numpy overrides == to implement a projection -- if that code were suddenly in a test and didn't act as a projection that would be pretty surprising (maybe not as surprising as numpy doing that in the first place, but still a deviation from how the code actually executes)

timhoffm · 2019-06-08T10:33:12Z

I agree that rewriting a == b is a bit dangerous. OTOH the specific expression assert a == b does currently not have a defined meaning. But nevertheless it's probably not good to make a == b context dependent. And it still does not solve the approx/allclose cases. This was just an idea i wanted to add to the discussion to have the space of possible solutions covered.

Independently of this is the question if pytest could introspect assert np.all(a == b) or assert np.allclose(a, b). Essentially, that could replace the need for explicit numpy.testing.* functions. Instead of

assert a.dtype == b.dtype
nptest.assert_allclose(a, b)

one could then write

assert a.dtype == b.dtype
assert np.allclose(a, b)

which restores the pattern of starting all test conditions with assert.

samueljsb · 2019-08-03T12:15:54Z

The error is a numpy error, which you'd get when asserting the truthiness of an array regardless of whether you're using pytest or not (this falls out of the numpy implementation of __eq__ for arrays). Is pytest core really the right place to change that behaviour? The error message suggests a solution: use assert all(a==b).

timhoffm · 2019-08-05T21:04:58Z

@samueljsb As I've written in the original message assert np.all(a == b) is one way of writing this. But it has the disadvantage of a not quite instructive error message.

Additional problem: If the dimensions are not the same, numpy currently just returns False for a == b. Thus, tests can fail with an obscure error message:

>       assert all(a == b)
E       TypeError: 'bool' object is not iterable

Fundamental conclusion: Numpy arrays are too complex for attempting any simple assert * with introspection from the pytest side. The comparison logic and message must come from numpy.

These are implemented in numpy.testing. At present

assert a.dtype == b.dtype
nptest.assert_allclose(a, b)

is the right way to go.

Possible enhancement: A plugin system for assert statement bytecode rewriting as proposed in #5347 (comment) could allow to write

assert a.dtype == b.dtype
assert np.allclose(a, b)

The rewrite rule assert np.allclose --> nptest.assert_allclose would have to be provided by numpy. This way users would not have to learn about np.testing explicitly. They would use normal numpy comparison functions but would still receive more instructive error messages.

MachinesJesus · 2020-06-17T06:21:16Z

Informative dev issue. Thanks!

I ended up just doing:

assert np.testing.assert_allclose(a, expected_a) is None

seems to work for now, don't know if you guys have fixed it or have a better patch. Will be using this a lot.

timhoffm · 2020-06-17T08:29:22Z

@MachinesJesus that‘s rather obscuring what is happening. assert_allclose always returns None. So the assert never fails. It‘s the assert_allclose that raises the exception and formats the error message. When using that you don‘t need the outer assert.

MachinesJesus · 2020-06-17T22:55:54Z

Hi @timhoffm, Thanks yes I tested and found unnecessary. Will keep watching this for a better solution. It would be great to see what value of the array it failed on. Chris

pckroon · 2020-08-24T12:08:20Z

Hi all,

I'm having a similar issue, except that my numpy arrays are buried deep in some custom classes (which have dicts, which have dicts, which may contain arrays). Calling e.g. np.testing methods doesn't help me.

When trying to create a custom reporter (pytest_assertrepr_compare) I can't figure out how to compare/report the difference between the array containing dicts without having to reimplement the existing reporting mechanism for dicts.

AdityaSavara · 2020-09-06T22:04:14Z

Hello, this issue of using nested objects is at least partly covered by the customCompare and compareNested functions inside https://github.com/AdityaSavara/UnitTesterSG

It even intentionally allows for comparing lists to arrays etc. because in scientific applications we may not be trying to compare apples to apples, and just want to know that the inside is the same.

I think it would be great if the custom functions from UnitTesterSG made it into pytest in some form. Right now, UnitTesterSG is made to be compatible with pytest. A typical unit test file is shown here: https://github.com/AdityaSavara/UnitTesterSG/blob/master/test12/test_12.py

nicoddemus · 2021-07-21T23:48:47Z

I think we can close this now that pytest.approx supports numpy arrays, so one should use:

a = np.array([1, 2, 3])
b = np.array([1, 2, 4])
assert a == pytest.approx(b)

Liquidmasl · 2023-02-20T12:38:06Z

I think we can close this now that pytest.approx supports numpy arrays, so one should use:
a = np.array([1, 2, 3])
b = np.array([1, 2, 4])
assert a == pytest.approx(b)

While this is a working workaround for comparing arrays while still seeing the difference, this opens a whole can of worms of other issues, the approx really takes it self seriously with the "approximate":

np.array([1,2,3]) == approx(np.array([1,2,3]))
Out[22]: True
[] == approx(np.array([]))
Out[23]: True
0 == approx(np.array([]))
Out[24]: True
"" == approx(np.array([]))
Out[25]: True
(0,) == approx(np.array([]))
Out[26]: False
(0) == approx(np.array([]))
Out[27]: True
(0) == approx(np.zeros([2,3]))
Out[29]: True

I am struggling with numpys overwritten == everyday it feels like!
especially because numpy.testing assertion methods dont use assert but instead raise an assertion error, this circumvents pytests better difference display.
Started writing my own assertion method that can take "any" object, but numpy makes it very difficult.

>>> assert np.array([]) ==  np.array([])
raises AssertionError
>>> assert np.array([0,1]) == np.array([0,1])
raises ValueError

I can catch the value error but I obviously cant catch the assertion error.

Anyway, I digress,
I would still be very happy about some kind of pytest addon that supports numpy assertions. Or at least a possibility to write one.
Also open for implementation suggestions for a general assertion method that actually uses pytests assertion (or its output)

timhoffm mentioned this issue May 30, 2019

Rewrite assert np.* tests to use numpy.testing matplotlib/matplotlib#14377

Merged

Zac-HD added topic: reporting related to terminal output and user-facing messages and errors type: proposal proposal for a new feature, often to gather opinions or design the API around the new feature labels Jun 5, 2019

timhoffm mentioned this issue Nov 28, 2019

Fix stepfilled histogram polygon bottom perimeter matplotlib/matplotlib#15783

Merged

6 tasks

eric-czech mentioned this issue May 27, 2020

Fix #33. Handle missing values in PackGeneticBits related-sciences/gwas-analysis#34

Merged

Zac-HD added the topic: rewrite related to the assertion rewrite mechanism label Jun 15, 2020

pckroon mentioned this issue Aug 24, 2020

Integration tests marrink-lab/vermouth-martinize#281

Merged

nicoddemus closed this as completed Jul 21, 2021

eivindjahren mentioned this issue Dec 1, 2021

Avoid specialized assert formatting when we detect that __eq__ is overridden #9362

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Better introspection for comparing numpy arrays #5347

Better introspection for comparing numpy arrays #5347

timhoffm commented May 30, 2019 •

edited

asottile commented Jun 2, 2019

Zac-HD commented Jun 3, 2019

twmr commented Jun 3, 2019

nicoddemus commented Jun 6, 2019 •

edited

asottile commented Jun 6, 2019

timhoffm commented Jun 8, 2019

samueljsb commented Aug 3, 2019

timhoffm commented Aug 5, 2019

MachinesJesus commented Jun 17, 2020 •

edited

timhoffm commented Jun 17, 2020 •

edited

MachinesJesus commented Jun 17, 2020

pckroon commented Aug 24, 2020 •

edited

AdityaSavara commented Sep 6, 2020 •

edited

nicoddemus commented Jul 21, 2021

Liquidmasl commented Feb 20, 2023

Better introspection for comparing numpy arrays #5347

Better introspection for comparing numpy arrays #5347

Comments

timhoffm commented May 30, 2019 • edited

asottile commented Jun 2, 2019

Zac-HD commented Jun 3, 2019

twmr commented Jun 3, 2019

nicoddemus commented Jun 6, 2019 • edited

asottile commented Jun 6, 2019

timhoffm commented Jun 8, 2019

samueljsb commented Aug 3, 2019

timhoffm commented Aug 5, 2019

MachinesJesus commented Jun 17, 2020 • edited

timhoffm commented Jun 17, 2020 • edited

MachinesJesus commented Jun 17, 2020

pckroon commented Aug 24, 2020 • edited

AdityaSavara commented Sep 6, 2020 • edited

nicoddemus commented Jul 21, 2021

Liquidmasl commented Feb 20, 2023

timhoffm commented May 30, 2019 •

edited

nicoddemus commented Jun 6, 2019 •

edited

MachinesJesus commented Jun 17, 2020 •

edited

timhoffm commented Jun 17, 2020 •

edited

pckroon commented Aug 24, 2020 •

edited

AdityaSavara commented Sep 6, 2020 •

edited