BUG: testing.assert_series_equal: inferred check_exact should not be passed down to index check #57067

Closed
3 tasks done
crusaderky opened this issue Jan 25, 2024 · 2 comments · Fixed by #57341
Labels: Bug; Needs Triage (Issue that has not been reviewed by a pandas team member); Regression (Functionality that used to work in a prior pandas version); Testing (pandas testing functions or related to the test suite)

crusaderky commented Jan 25, 2024

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

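>>> import numpy as np
>>> import pandas as pd
>>> # Same integer values; the indexes differ only by float rounding at position 3
>>> a = pd.Series(np.zeros(6, dtype=int), [0, 0.2, 0.4, 0.6, 0.8, 1])
>>> b = pd.Series(np.zeros(6, dtype=int), np.linspace(0, 1, 6))  # 0.6000000000000001
>>> pd.testing.assert_series_equal(a, b)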

Issue Description

In pandas 2.2.0, the above passes if the series' values dtype is float and fails if the series' values dtype is int. In pandas 2.1.4, it passes in both cases.
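To see which case applies to an installed version, a small script along these lines runs the example for both values dtypes and reports the result. The `outcome` helper is a hypothetical name introduced here for illustration, not part of pandas. The result of the default call depends on the pandas version, so no expected output is shown.

```python
import numpy as np
import pandas as pd


def outcome(left, right, **kwargs):
    """Hypothetical helper: report whether assert_series_equal passes or fails."""
    try:
        pd.testing.assert_series_equal(left, right, **kwargs)
    except AssertionError:
        return "fails"
    return "passes"


literal_index = [0, 0.2, 0.4, 0.6, 0.8, 1]
linspace_index = np.linspace(0, 1, 6)  # contains 0.6000000000000001

results = {}
for dtype in (int, float):
    # Identical all-zero values; only the index differs, and only by rounding
    a = pd.Series(np.zeros(6, dtype=dtype), literal_index)
    b = pd.Series(np.zeros(6, dtype=dtype), linspace_index)
    results[dtype.__name__] = outcome(a, b)

# The default-argument result varies between pandas versions
print(pd.__version__, results)
```

Passing `check_exact` explicitly, for example `outcome(a, b, check_exact=False)`, removes the dependence on the inferred default.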

The difference seems to be caused by a change in the default for check_exact in assert_series_equal:

Changed in version 2.2.0: Defaults to True for integer dtypes if none of check_exact, rtol and atol are specified.

What I think is happening is that, in pandas 2.1.4, assert_series_equal was passing its own default check_exact=False down to assert_index_equal, even though the default for the latter is check_exact=True. In 2.2.0, assert_series_equal infers check_exact=True from the values dtype, and then passes that down to assert_index_equal.
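The index comparison can be isolated from the values to illustrate this. The sketch below compares the two indexes directly with an explicit `check_exact`, so it does not rely on any inferred default. `index_check_passes` is a hypothetical helper name used only for this illustration.

```python
import numpy as np
import pandas as pd

left = pd.Index([0, 0.2, 0.4, 0.6, 0.8, 1])
right = pd.Index(np.linspace(0, 1, 6))

# Element-wise comparison of the underlying float arrays
mismatch = left.to_numpy() != right.to_numpy()
print(right[mismatch])  # the linspace entries that differ from the literals


def index_check_passes(check_exact):
    """Hypothetical helper: run assert_index_equal with an explicit check_exact."""
    try:
        pd.testing.assert_index_equal(left, right, check_exact=check_exact)
    except AssertionError:
        return False
    return True


print(index_check_passes(check_exact=True))
print(index_check_passes(check_exact=False))
```

Whichever `check_exact` value assert_series_equal ends up forwarding to the index check therefore decides the result of the example, independent of the values.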

Expected Behavior

In my opinion, both pandas 2.2.0 and 2.1.4 are wrong here. If the user doesn't explicitly pass check_exact, the values and the index should each get their own default: False for float values, True for int values, and True for the index regardless of dtype (as is already the case for assert_index_equal).
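A rough sketch of that proposal, written as a user-side wrapper rather than as a change to pandas itself: `assert_series_equal_split` and its `values_exact` / `index_exact` parameters are hypothetical names made up for this illustration, and the dtype inference is a simplification.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_integer_dtype


def assert_series_equal_split(left, right, values_exact=None, index_exact=True):
    """Hypothetical wrapper: separate exactness settings for values and index."""
    if values_exact is None:
        # Assumed default: exact for integer values, approximate otherwise
        values_exact = is_integer_dtype(left.dtype) and is_integer_dtype(right.dtype)
    # Index check uses its own setting (exact unless the caller says otherwise)
    pd.testing.assert_index_equal(left.index, right.index, check_exact=index_exact)
    # Values check on positionally aligned copies, so the index plays no part here
    pd.testing.assert_series_equal(
        left.reset_index(drop=True),
        right.reset_index(drop=True),
        check_exact=values_exact,
    )


a = pd.Series(np.zeros(6, dtype=int), [0, 0.2, 0.4, 0.6, 0.8, 1])
b = pd.Series(np.zeros(6, dtype=int), np.linspace(0, 1, 6))

# Opting in to an approximate index check lets the example through
assert_series_equal_split(a, b, index_exact=False)
```

With the proposed defaults (`index_exact=True`), the example from this report raises for both int and float values, because the two indexes are not bit-for-bit equal.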

Installed Versions

commit : f538741
python : 3.12.1.final.0
python-bits : 64
OS : Linux
OS-release : 6.5.0-14-generic
Version : #14~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Nov 20 18:15:30 UTC 2
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 2.2.0
numpy : 1.26.3

crusaderky added the Bug and Needs Triage labels Jan 25, 2024

rhshadrach added the Testing and Regression labels Jan 26, 2024
@rhshadrach
Member

Thanks for the report. I haven't checked, but seems likely related to #55882. cc @MarcoGorelli

Agreed the default for integers should be to check for equality in the index, approximate for floats.
