BUG: fix an unintended exception being raised when attempting to compare two unequal Table instances.
#15845
Changes from all commits
966b373
d016355
367d63c
5bc36ec
7bf55f4
5c7e471
f4ff2f7
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,6 +15,7 @@ | |
from astropy.io.registry import UnifiedReadWriteMethod | ||
from astropy.units import Quantity, QuantityInfo | ||
from astropy.utils import ShapedLikeNDArray, isiterable | ||
from astropy.utils.compat import NUMPY_LT_1_25 | ||
from astropy.utils.console import color_print | ||
from astropy.utils.data_info import BaseColumnInfo, DataInfo, MixinInfo | ||
from astropy.utils.decorators import format_doc | ||
|
@@ -3683,15 +3684,26 @@ | |
return self._rows_equal(other) | ||
|
||
def __ne__(self, other): | ||
return ~self.__eq__(other) | ||
eq = self.__eq__(other) | ||
if isinstance(eq, bool): | ||
# bitwise operators on bool values not reliable (e.g. `bool(~True) == True`) | ||
# and are deprecated in Python 3.12 | ||
# see https://github.com/python/cpython/pull/103487 | ||
return not eq | ||
else: | ||
return ~eq | ||
Comment on lines +3687 to +3694:
This fixes a secondary buglet that I discovered with the test I added. I made an attempt at separating it into its own PR, but I couldn't find a way to test it without fixing the first bug too. I could however move it to a follow-up PR if requested.
Good catch.
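To illustrate the pitfall named in the code comment above, here is a minimal standalone sketch. The helper name `negate_eq` is hypothetical and not part of astropy; it only mirrors the branching added to `__ne__`, assuming the equality result is either a plain `bool` or a boolean numpy array.

```python
import warnings

import numpy as np


def negate_eq(eq):
    """Negate an equality result (hypothetical helper, not astropy API)."""
    if isinstance(eq, bool):
        # Logical negation for plain Python bools.
        return not eq
    # Elementwise negation for boolean numpy arrays.
    return ~eq


# `~True` is the integer -2, which is truthy. Python 3.12+ also emits a
# DeprecationWarning for it, so the warning is silenced for this demo.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    inverted = ~True

# Arrays still take the bitwise path and are negated row by row.
row_result = negate_eq(np.array([True, False]))
```

With the plain `~` approach, a `False` returned by `__eq__` would become `-1` and a `True` would become `-2`, both of which are truthy, so `!=` could never report equality for the bool case.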
||
|
||
def _rows_equal(self, other): | ||
""" | ||
Row-wise comparison of table with any other object. | ||
|
||
This is actual implementation for __eq__. | ||
|
||
Returns a 1-D boolean numpy array showing result of row-wise comparison. | ||
Returns a 1-D boolean numpy array showing result of row-wise comparison, | ||
or a bool (False) in cases where comparison isn't possible (uncomparable dtypes | ||
or unbroadcastable shapes). Intended to follow legacy numpy's elementwise | ||
comparison rules. | ||
|
||
This is the same as the ``==`` comparison for tables. | ||
|
||
Parameters | ||
|
@@ -3712,22 +3724,35 @@ | |
if isinstance(other, Table): | ||
other = other.as_array() | ||
|
||
if self.has_masked_columns: | ||
if isinstance(other, np.ma.MaskedArray): | ||
result = self.as_array() == other | ||
else: | ||
# If mask is True, then by definition the row doesn't match | ||
# because the other array is not masked. | ||
false_mask = np.zeros(1, dtype=[(n, bool) for n in self.dtype.names]) | ||
result = (self.as_array().data == other) & (self.mask == false_mask) | ||
self_is_masked = self.has_masked_columns | ||
other_is_masked = isinstance(other, np.ma.MaskedArray) | ||
|
||
allowed_numpy_exceptions = ( | ||
TypeError, | ||
ValueError if not NUMPY_LT_1_25 else DeprecationWarning, | ||
) | ||
# One table is masked and the other is not | ||
if self_is_masked ^ other_is_masked: | ||
# remap variables to a and b where a is masked and b isn't | ||
a, b = ( | ||
(self.as_array(), other) if self_is_masked else (other, self.as_array()) | ||
) | ||
|
||
# If mask is True, then by definition the row doesn't match | ||
# because the other array is not masked. | ||
false_mask = np.zeros(1, dtype=[(n, bool) for n in a.dtype.names]) | ||
try: | ||
result = (a.data == b) & (a.mask == false_mask) | ||
except allowed_numpy_exceptions: | ||
# numpy may complain that structured arrays are not comparable (TypeError) | ||
# or that operands are not broadcastable (ValueError) | ||
# see https://github.com/astropy/astropy/issues/13421 | ||
result = False | ||
else: | ||
if isinstance(other, np.ma.MaskedArray): | ||
# If mask is True, then by definition the row doesn't match | ||
# because the other array is not masked. | ||
false_mask = np.zeros(1, dtype=[(n, bool) for n in other.dtype.names]) | ||
result = (self.as_array() == other.data) & (other.mask == false_mask) | ||
else: | ||
try: | ||
result = self.as_array() == other | ||
except allowed_numpy_exceptions: | ||
result = False | ||
|
||
return result | ||
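The masked-versus-unmasked branch above can be tried out with plain numpy. This is a minimal sketch under the assumption of a single integer column named `a`; the arrays are made up for illustration and stand in for what `Table.as_array()` would return.

```python
import numpy as np

dtype = [("a", "i8")]

# Masked structured array: the second row is masked.
a = np.ma.MaskedArray(
    np.array([(1,), (2,)], dtype=dtype),
    mask=[(False,), (True,)],
)
# Plain (unmasked) structured array holding the same values.
b = np.array([(1,), (2,)], dtype=dtype)

# A single all-False mask record, broadcast against every row of a.mask.
false_mask = np.zeros(1, dtype=[(n, bool) for n in a.dtype.names])

# A row matches only if the data agree AND the row is not masked.
result = (a.data == b) & (a.mask == false_mask)
```

Here the underlying data are identical in both rows, but the masked second row is treated as a mismatch, following the rule stated in the diff's comment.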
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2424,3 +2424,63 @@ def test_mixin_join_regression(): | |
t12 = table.join(t1, t2, keys=("index", "flux1", "flux2"), join_type="outer") | ||
|
||
assert len(t12) == 6 | ||
|
||
|
||
@pytest.mark.parametrize( | ||
I wonder if the test would be clearer if it gave the result in the parametrization rather than do a check to see what would be expected. I.e., "t1, t2, eq"
I asked myself the same question as I wrote the test, but I found that the
Also reasonable. As said, I am absolutely fine with just merging the PR as is! (in another PR, @taldcroft pointed out the old adage of perfect being the enemy of good enough)
When I looked at this I had the same idea of "t1, t2, eq" being a little more clear, but then I reminded myself of perfect and good. I think this is quite good. 😄
Having said that @neutrinoceros - I generally agree with @mhvk's sentiment so that is something to keep in mind going forward.
I'll try!
||
"t1, t2", | ||
[ | ||
# different names | ||
( | ||
Table([np.array([1])], names=["a"]), | ||
Table([np.array([1])], names=["b"]), | ||
), | ||
# different data (broadcastable) | ||
( | ||
Table([np.array([])], names=["a"]), | ||
Table([np.array([1])], names=["a"]), | ||
), | ||
# different data (not broadcastable) | ||
( | ||
Table([np.array([1, 2])], names=["a"]), | ||
Table([np.array([1, 2, 3])], names=["a"]), | ||
), | ||
# different names and data (broadcastable) | ||
( | ||
Table([np.array([])], names=["a"]), | ||
Table([np.array([1])], names=["b"]), | ||
), | ||
# different names and data (not broadcastable) | ||
( | ||
Table([np.array([1, 2])], names=["a"]), | ||
Table([np.array([1, 2, 3])], names=["b"]), | ||
), | ||
# different data and array type (broadcastable) | ||
( | ||
Table([np.array([])], names=["a"]), | ||
Table([np.ma.MaskedArray([1])], names=["a"]), | ||
), | ||
# different data and array type (not broadcastable) | ||
( | ||
Table([np.array([1, 2])], names=["a"]), | ||
Table([np.ma.MaskedArray([1, 2, 3])], names=["a"]), | ||
), | ||
], | ||
) | ||
def test_table_comp(t1, t2): | ||
# see https://github.com/astropy/astropy/issues/13421 | ||
try: | ||
np.result_type(t1.dtype, t2.dtype) | ||
np.broadcast_shapes((len(t1),), (len(t2),)) | ||
except (TypeError, ValueError): | ||
# dtypes are not comparable or arrays can't be broadcasted: | ||
# a simple bool should be returned | ||
assert not t1 == t2 | ||
While we are here what about explicitly checking the return type in all 8 asserts. I think this can be done compactly with this but you should check.
technically true, but it is not needed: the expression
||
assert not t2 == t1 | ||
assert t1 != t2 | ||
assert t2 != t1 | ||
else: | ||
# otherwise, the general case is to return a 1D array with dtype=bool | ||
assert not any(t1 == t2) | ||
assert not any(t2 == t1) | ||
assert all(t1 != t2) | ||
assert all(t2 != t1) |
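The test above decides which branch to assert on by probing numpy directly. A minimal sketch of that probe, without astropy, might look like the following; the function name `elementwise_comparison_possible` is hypothetical, and the structured dtypes are assumed to match what single-column integer tables would produce.

```python
import numpy as np


def elementwise_comparison_possible(dtype1, dtype2, len1, len2):
    """Return True if numpy can promote the dtypes and broadcast the lengths.

    Hypothetical helper mirroring the try/except used in the test.
    """
    try:
        np.result_type(dtype1, dtype2)
        np.broadcast_shapes((len1,), (len2,))
    except (TypeError, ValueError):
        # Incompatible dtypes (TypeError) or unbroadcastable shapes (ValueError).
        return False
    return True


dt_a = np.dtype([("a", "i8")])
dt_b = np.dtype([("b", "i8")])

different_names = elementwise_comparison_possible(dt_a, dt_b, 1, 1)
broadcastable = elementwise_comparison_possible(dt_a, dt_a, 0, 1)
not_broadcastable = elementwise_comparison_possible(dt_a, dt_a, 2, 3)
```

When the probe fails, the test expects `==` to return a plain `False`; otherwise it expects a 1-D boolean array.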
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Fix an unintended exception being raised when attempting to compare two unequal ``Table`` instances. |
Does this count as behavior change that should not be backported? If so, please update milestone and remove backport label. Thanks!
@taldcroft should answer here but my understanding is that we're just making the intended behaviour actually work and align out-of-date documentation, so I think it's okay to backport.
@pllim - I agree with what @neutrinoceros said, with the conclusion that this should be backported.
Could do `return self.__eq__(other) == False`, but arguably trying to be too clever...
Actually this would probably be more performant (`isinstance` is infamously slow). If this function is considered performance-critical in any way I think it's worth considering, otherwise I think it just hurts readability (I know it's hard to resist the call of golfing sometimes!)
Indeed, readability is why I felt I was trying to be too clever. Let's stick with what you have!
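For reference, the two alternatives discussed in this thread can be compared side by side. This is a minimal sketch with hypothetical function names; it assumes the equality result is either a plain `bool` or a boolean numpy array, and makes no claim about relative performance.

```python
import numpy as np


def negate_with_isinstance(eq):
    # The approach kept in the PR: branch on the result type.
    return (not eq) if isinstance(eq, bool) else ~eq


def negate_with_comparison(eq):
    # The suggested alternative: comparing against False handles both a
    # plain bool and a boolean array without any type check.
    return eq == False  # noqa: E712


array_eq = np.array([True, False, True])
same_for_arrays = np.array_equal(
    negate_with_isinstance(array_eq), negate_with_comparison(array_eq)
)
same_for_bools = all(
    negate_with_isinstance(value) == negate_with_comparison(value)
    for value in (True, False)
)
```

Both variants agree on these inputs; the choice made in the thread rests on readability.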