Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify difference between np.testing.assert_equal and np.testing.assert_array_equal #8457

Closed
anntzer opened this issue Jan 9, 2017 · 13 comments · Fixed by #24875
Closed

Comments

@anntzer
Copy link
Contributor

anntzer commented Jan 9, 2017

The docs of assert_equal and assert_array_equal point to each other but do not mention the rather subtle differences between the two functions. So far, the only difference I've seen is that assert_equal will consider a scalar equal to a singleton array containing that scalar (which matches a precise reading of the docs) but I may or may not be missing something.

In any case a clarification in the docs would be appreciated.

@rgommers
Copy link
Member

rgommers commented Jan 9, 2017

I think assert_equal simply handles a wider range of input types well. For ndarrays the behavior is identical.

@sjpeterson
Copy link
Contributor

To me it appears that it is assert_array_equal that handles a wider range well. I tested the two with a number of (x, y) pairs

  • neither assert_equal nor assert_array_equal raise an AssertionError:

    • (1, 1)
    • ([1], [1])
    • ((1,), (1,))
    • ({1}, {1})
    • ({'a': 1}, {'a': 1})
    • (1, np.array([1]))
    • (np.array([1]), 1)
    • (1, np.array(1))
    • (np.array(1), 1)
    • ([1], np.array([1]))
    • (np.array([1]), [1])
    • ([1], np.array(1))
    • (np.array(1), [1])
    • ((1,), np.array([1]))
    • (np.array([1]), (1,))
    • ((1,), np.array(1))
    • (np.array(1), (1,))
    • ((1,), [1])
    • ([1], (1,))
    • ({1}, np.array({1}))
    • (np.array({1}), {1})
  • assert_equal raises an AssertionError, assert_array_equal does not:

    • (1, [1])
    • ([1], 1)
    • (1, (1,))
    • ((1,), 1)
  • Both raise an AssertionError

    • ({1}, 1)
    • (1, {1})
    • ({1}, [1])
    • ([1], {1})
    • ({1}, {'a': 1})
    • ({'a': 1}, {1})
    • ({1}, np.array([1]))
    • (np.array([1]), {1})
    • ...

It's also worth mentioning that for the case ({1}, {'a': 1}) (but not ({'a': 1}, {1})), the error message from assert_equal is a somewhat vague and uncharacteristic "<class 'set'>"

@sjpeterson
Copy link
Contributor

And whenever someone makes changes there, that person might as well add a few spaces in
def assert_equal(actual,desired,err_msg='',verbose=True)

@mdhaber
Copy link
Contributor

mdhaber commented Sep 11, 2023

It's been a few years since this discussion started, so the code may have changed. This is the current state of things.

Whenever either of the objects is an array, assert_equal dispatches to assert_array_equal. assert_equal has a lot more code for handling other cases, so #8457 (comment) appears to be a correct summary.

To me it appears that it is assert_array_equal that handles a wider range well.

The AssertionError raised for cases like assert_equal([1], 1) is intentional, so it seems to be handling these cases well, too. It's just a different convention from the case in which one of the elements is an array, and it conflicts with the documentation:

When one of actual and desired is a scalar and the other is array_like, the function checks that each element of the array_like object is equal to the scalar.

It looks like this part would be correct if "array_like" were "array".


To resolve this issue, I'd be willing to

  • correct the documentation of assert_equal as mentioned above
  • add a strict kwarg to numpy.testing.assert_equal, which would simply be passed in the call to assert_array_equal when either input is an array
  • move assert_array_equal to the "Asserts (not recommended)" section of the numpy.testing documentation
  • add a note to the documentation of numpy.testing.assert_array_equal similar to the one in assert_array_almost_equal:
    image

This would read:

It is recommended to use assert_equal instead of this function. The behavior of assert_equal is identical to this function when either of the inputs is an array, and assert_equal has special handling for comparing other types correctly.

@mdhaber
Copy link
Contributor

mdhaber commented Oct 6, 2023

@ngoldbaum Can I get your thoughts on this suggested resolution to this issue (above)?

@ngoldbaum
Copy link
Member

ngoldbaum commented Oct 6, 2023

When one of actual and desired is a scalar and the other is array_like, the function checks that each element of the array_like object is equal to the scalar.

It looks like this part would be correct if "array_like" were "array".

What about this example:

In [1]: import numpy as np

In [2]: np.testing.assert_array_equal([1, 1, 1], 1)

In [3]: np.testing.assert_array_equal(np.array([1, 1, 1]), 1)

Shouldn't the second assert error if your replaced suggestion were true?

That said, this is an error:

In [5]: np.testing.assert_array_equal(np.array([1, 1, 1]), [1])
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)

add a strict kwarg to numpy.testing.assert_equal, which would simply be passed in the call to assert_array_equal when either input is an array

Looks like this is done already. Thanks, great improvement!

move assert_array_equal to the "Asserts (not recommended)" section of the numpy.testing documentation
add a note to the documentation of numpy.testing.assert_array_equal similar to the one in assert_array_almost_equal:

I'm not sure we should go this far. assert_array_almost_equal does a test that isn't all that useful for generic floating point comparisons, so we don't want to lead people to it in new code. For this issue, the difference is in how strict the assert is in checking for scalars and comparing with other types. In some contexts it's expected that the assert should behave this way, for example, if I'm testing an API that accepts an array-like from users, it's very natural for array-likes to be silently converted to an array if that's needed in a library and it would be natural to use assert_array_equal without strict checking to test that. In other cases it would be bad to allow a lax check and assert_equal should probably be preferred.

Rather than outright recommending against ever using assert_array_equal I would instead discuss the tradeoffs and note and recommend assert_equal with strict=True in cases where users would prefer more strict checking.

@mdhaber
Copy link
Contributor

mdhaber commented Oct 6, 2023

Shouldn't the second assert error if your replaced suggestion were true?

The recommendation about "array_like" was for assert_equal, not assert_array_equal.

"Array-like" is incorrect in the statement

When one of actual and desired is a scalar and the other is array_like, the function checks that each element of the array_like object is equal to the scalar.

>>> np.testing.assert_equal([1], 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\matth\anaconda3\envs\scipydev\lib\site-packages\numpy\testing\_private\utils.py", line 377, in assert_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal:
 ACTUAL: [1]
 DESIRED: 1
>>> np.testing.assert_equal(np.array([1]), 1)
>>>

array_almost_equal does a test that isn't all that useful for generic floating point comparisons, so we don't want to lead people to it in new code.

I'm confused. array_almost_equal looks like it was meant to be assert_array_almost_equal or assert_array_equal. Either way, the proposal is not to point people toward one of these functions. (In fact, it would turn users away from assert_array_equal toward assert_equal.)

Rather than outright recommending against ever using assert_array_almost_equal

assert_array_almost_equal is already "not recommended" according to the documentation, so that isn't what I meant to propose. I was suggesting a similar recommendation against assert_array_equal in favor of assert_equal.

I think the points of confusion here lend support to the idea of trimming down the number of recommended functions with similar names!

@ngoldbaum
Copy link
Member

The recommendation about "array_like" was for assert_equal

Ah, I thought it was for the similar recommendation in assert_array_equal. In that case I agree, please change it.

I think the points of confusion here lend support to the idea of trimming down the number of recommended functions with similar names!

Sorry for the imprecision, I just edited my comment, it should be what I originally intended now.

I totally agree this is confusing! I just don't want to rule out a valid use-case or insist that strict shape typing like you want is always better. Maybe it's even better most of the time, but the reason assert_array_almost_equal is recommended against is because the check it's doing never really makes sense, at least as I understand it.

@mdhaber
Copy link
Contributor

mdhaber commented Oct 6, 2023

insist that strict shape typing like you want

While I think strict checks are an important feature to offer, my motivation here is not so much about the strictness of checks but for simplicity of the testing subpackage. assert_equal(np.array(x), y) is equivalent to assert_array_equal(x, y), so I thought it would be preferable to promote one instead of both. I was confused about the difference between these two functions before I started digging into this, so I thought we might avoid further confusion by discouraging one and showing how the same need can be satisfied with the other.

(Actually, I would have preferred that these functions follow normal broadcasting rules rather than making a special case for scalars. That would be even less strict.)

I do see that this is different from the assert_array_almost_equal case though.


Ok, well, I'd be interested in a third opinion, but otherwise , I won't push it further. I will submit a PR attempting to clarify the distinction between the two functions, though. Does that sound good?

@ngoldbaum
Copy link
Member

I thought it would be preferable to promote one instead of both.

Making it clear the checks that assert_equal does are a superset of the checks that assert_array_equal and making assert_array_equal less prominent is fine. My concern is if we explicitly discourage it that might generate a bunch of unnecessary downstream code churn. There are way too many uses of assert_array_equal to reasonably deprecate it, so we'll need to keep it and document it

. Ok, well, I'd be interested in a third opinion,

If you have time on Wednesdays, showing up to either the numpy community meeting or triage meeting on zoom is a great way to get quick feedback from several people.

I will submit a PR attempting to clarify the distinction between the two functions, though. Does that sound good?

Please!

@mdhaber
Copy link
Contributor

mdhaber commented Oct 6, 2023

Great, thanks for your thoughts!

@mdhaber
Copy link
Contributor

mdhaber commented Oct 7, 2023

To avoid multiple CI runs, perhaps we can sort out the wording here.

After the introduction but before the Parameters

def assert_array_equal(x, y, err_msg='', verbose=True, *, strict=False):
"""
Raises an AssertionError if two array_like objects are not equal.
Given two array_like objects, check that the shape is equal and all
elements of these objects are equal (but see the Notes for the special
handling of a scalar). An exception is raised at shape mismatch or
conflicting values. In contrast to the standard usage in numpy, NaNs
are compared like numbers, no assertion is raised if both objects have
NaNs in the same positions.
The usual caution for verifying equality with floating point numbers is
advised.
Parameters

how about:

    .. note:: When either `x` or `y` is already an instance of `numpy.ndarray`
        and `y` is not a `dict`, the behavior of ``assert_equal(x, y)`` is
        identical to the behavior of this function. Otherwise, this function
        performs `np.asanyarray` on the inputs before comparison, whereas
        `assert_equal` defines special comparison rules for common Python
        types. For example, only `assert_equal` can be used to compare nested
        Python lists. In new code, consider using only `assert_equal`,
        explicitly converting either `x` or `y` to arrays if the behavior of
        `assert_array_equal` is desired.

If you'd like to tone down the last sentence even further, LMK what you'd prefer.

@ngoldbaum
Copy link
Member

That language sounds fine to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants