Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Allow absolute precision in assert_almost_equal (#13357) #30562

Merged
merged 1 commit into from Jun 24, 2020

Conversation

joaoleveiga
Copy link
Contributor

@joaoleveiga joaoleveiga commented Dec 30, 2019

closes #9457

Greetings! This is my first pull-request on an open source project, so I hope I did not miss anything at least too obvious... Thank you, in advance 😄

I do have a few questions:

  • Should I add the versionadded directive in all docstrings that were changed?
  • Which "whatsnew" release notes should I add this change to? v1.0.0?
  • Should assert_almost_equal / equals should allow access to np.allclose #9457 also be closed, as from my understanding this was also the underlying issue?
  • Should I be the one ticking the checkboxes below?

This fixes the issue where:

# Fails because it's doing (1 - .1 / .1001)
assert_almost_equal(0.1, 0.1001, check_less_precise=True)

# Works as intuitively expected
assert_almost_equal(
    0.1, 0.1001,
    atol=0.01,
)

Commit message below:

This commit makes assert_almost_equal accept both relative and
absolute precision when comparing numbers, through two new keyword
arguments: rtol, and atol, respectively.

Under the hood, _libs.testing.assert_almost_equal is now calling
math.isclose, instead of an adaptaion of
numpy.testing.assert_almost_equal.

  • closes assert_almost_equal #13357
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@joaoleveiga
Copy link
Contributor Author

Just noticed that I added the issue number on the commit title, which I guess I was not supposed to. 🤔 Can I push-force to my branch and update it?

@jbrockmendel
Copy link
Member

Just noticed that I added the issue number on the commit title, which I guess I was not supposed to.

Not an issue at all, don't worry about it.

@jbrockmendel
Copy link
Member

Greetings! This is my first pull-request on an open source project,

Welcome!

Which "whatsnew" release notes should I add this change to? v1.0.0?

v1.0.0, correct

Should I be the one ticking the checkboxes below?

Yes

Should #9457 also be closed, as from my understanding this was also the underlying issue?

I haven't looked at that yet, but assuming you're right, then yes. also add another checkbox with a - [x] closes #9457

@pep8speaks
Copy link

pep8speaks commented Dec 31, 2019

Hello @joaoleveiga! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-06-21 09:44:27 UTC

@joaoleveiga
Copy link
Contributor Author

Should #9457 also be closed, as from my understanding this was also the underlying issue?

I haven't looked at that yet, but assuming you're right, then yes. also add another checkbox with a - [x] closes #9457

Looking back at this, it seems like the OP wanted a different issue solved, so I won't add the tag for that issue here.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather:

  • deprecate check_less_precise
  • add rtol and atol

similar to https://docs.scipy.org/doc/numpy/reference/generated/numpy.allclose.html with the same defaults

this will be a bit more involved, but ultimately is better than adding unfamiliar keywords

@jreback jreback added the Testing pandas testing functions or related to the test suite label Jan 1, 2020
@joaoleveiga
Copy link
Contributor Author

Thanks for the suggestion @jreback ! I was afraid of touching in too many parts of the code, but I definitely agree to having a similar API to numpy.allclose. Let's see how deprecation works 😉

@joaoleveiga
Copy link
Contributor Author

I would rather:

  • deprecate check_less_precise
  • add rtol and atol

similar to https://docs.scipy.org/doc/numpy/reference/generated/numpy.allclose.html with the same defaults

this will be a bit more involved, but ultimately is better than adding unfamiliar keywords

Any suggestions to make these changes without breaking current functionality?

If the rtol and atol parameters are added with the same defaults, then it will be tricky to check if the user is requesting the "old behavior" with check_less_precise=True/False.

Should I worry about such a breaking change or is it OK?

Example of what I was doing:

def assert_almost_equal(
    left,
    right,
    check_dtype: Union[bool, str] = "equiv",
    check_less_precise: Union[bool, int] = False,
    rtol: float = 1e-5,
    atol: float = 1e-8,
    **kwargs,
):
    """(...)
    check_less_precise : bool or int, default False
        (...)
        .. deprecated:: 1.0.0
           Use `rtol` and `atol` instead to define relative, and absolute
           tolerance, respectively. Similar to :func:`numpy.allclose`.
    rtol : float, default 1e-5
        Relative tolerance.
    atol : float, default 1e-8
        Absolute tolerance.
    """
    ...

🤔

A few options I thought of:

  • With rtol/atol defined as Optional[float] one could check if the new args are defined first, and still keep the check_less_precise flag working as it is now.
  • Ditch the check_less_precise functionality while keeping the argument so that the deprecation warning is more explicit.

Any ideas?
Thanks!

@jreback
Copy link
Contributor

jreback commented Jan 2, 2020

you can just set the default for check_less_precise = lib.no_default (grep for it)

then if check_less_precise is passed at all you warn (and then set rtol if needed)

change all refs in the pandas codebase and should be good

@joaoleveiga
Copy link
Contributor Author

joaoleveiga commented Jan 2, 2020

After looking better at how numpy handles these sort of assertions with numpy.isclose, I realized that perhaps it would be cleaner to just use it, or call math.isclose instead.

WDYT? Does it make sense to replicate math.isclose in _libs.testing.pyx?

This would imply deprecating the _libs.testing.decimal_almost_equal function.

I'm going to commit the changes I made anyway, so that you can review it.

EDIT: BTW, looking at numpy/numpy#10161 has convinced me that going with the math.isclose algorithm seems like the best option. I'm just unsure on how to deal with the deprecation, to avoid being a nuisance to people that might be using the testing.assert_*_equal functions.

@jreback
Copy link
Contributor

jreback commented Jan 2, 2020

After looking better at how numpy handles these sort of assertions with numpy.isclose, I realized that perhaps it would be cleaner to just use it, or call math.isclose instead.

WDYT? Does it make sense to replicate math.isclose in _libs.testing.pyx?

This would imply deprecating the _libs.testing.decimal_almost_equal function.

I'm going to commit the changes I made anyway, so that you can review it.

EDIT: BTW, looking at numpy/numpy#10161 has convinced me that going with the math.isclose algorithm seems like the best option. I'm just unsure on how to deal with the deprecation, to avoid being a nuisance to people that might be using the testing.assert_*_equal functions.

yes not averse to using math.isclose here. I think we handle all of the NaN / different dtype stuff before then (that is of course why this was written original; numpy didn't handle anything except float/int properly), but that's where these tol's matter anyhow

@joaoleveiga joaoleveiga force-pushed the ft-abs-almost-equal-13357 branch 2 times, most recently from 3ad3d7e to fd478b8 Compare January 2, 2020 15:58
@joaoleveiga
Copy link
Contributor Author

Should I move the changes in the whatsnew file from the Enhancements section to the Deprecation section?

@joaoleveiga
Copy link
Contributor Author

Meanwhile, I found some tests that use assert_almost_equal are breaking (and were already breaking prior to this change...). Should I try fixing them in this issue, or open a new one?

@joaoleveiga
Copy link
Contributor Author

@jreback what do you think?

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joaoleveiga can you fix up merge conflicts? Someone can take a look after that

doc/source/whatsnew/v1.0.0.rst Outdated Show resolved Hide resolved
@joaoleveiga joaoleveiga force-pushed the ft-abs-almost-equal-13357 branch 2 times, most recently from 3e1d71a to 84f7b86 Compare January 18, 2020 11:24
@joaoleveiga
Copy link
Contributor Author

@joaoleveiga can you fix up merge conflicts? Someone can take a look after that

Conflicts solved.

@joaoleveiga
Copy link
Contributor Author

Looking at the checks that failed, I was left with a few questions:

  1. Should I add a new test under tests.util.test_assert_almost_equal to test that the FutureWarning is raised?
  2. Should the warning issue a FutureWarning or a DeprecationWarning?
  3. Should I add the @pytest.mark.filterwarnings("ignore:The 'check_less_precise':FutureWarning") decoration to all tests that make use of check_less_precise=? 😬

Thanks

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Should I add a new test under tests.util.test_assert_almost_equal to test that the FutureWarning is raised?

yes

  • Should the warning issue a FutureWarning or a DeprecationWarning?

FutureWarning as you have

  • Should I add the @pytest.mark.filterwarnings("ignore:The 'check_less_precise':FutureWarning") decoration to all tests that make use of check_less_precise=?

NO, just change all of the usages that need changing (likely not very many)

pandas/_testing.py Outdated Show resolved Hide resolved
pandas/_testing.py Outdated Show resolved Hide resolved
@@ -1217,13 +1347,15 @@ def assert_frame_equal(
check_index_type="equiv",
check_column_type="equiv",
check_frame_type=True,
check_less_precise=False,
check_less_precise=no_default,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can just use None here, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I was just following your suggestion:

you can just set the default for check_less_precise = lib.no_default (grep for it)

Which seemed a better fit when deprecating a parameter. Should I change it to None or keep with the no_default? 😄

pandas/tests/plotting/test_converter.py Outdated Show resolved Hide resolved
@jbrockmendel
Copy link
Member

Do we have internal use cases for this, or are we just/mostly providing it for downstream authors/users?

@WillAyd
Copy link
Member

WillAyd commented Mar 11, 2020

@joaoleveiga looks like a few merge conflicts have built up - can you resolve them?

@joaoleveiga
Copy link
Contributor Author

@joaoleveiga looks like a few merge conflicts have built up - can you resolve them?

Done! Sorry for the delay, but these past weeks were a bit intense here (Portugal).

pandas/_libs/testing.pyx Show resolved Hide resolved
pandas/_libs/testing.pyx Show resolved Hide resolved
pandas/_testing.py Show resolved Hide resolved
@jreback
Copy link
Contributor

jreback commented Jun 14, 2020

this was pretty close last time i looked. can you merge master

@joaoleveiga
Copy link
Contributor Author

this was pretty close last time i looked. can you merge master

@jreback , just merged master. Locally I'm getting some errors on the datetime tests which seem to be somewhat related to locale definitions. Other than that, other tests passed 👌

@TomAugspurger
Copy link
Contributor

@joaoleveiga possible typing issue: https://github.com/pandas-dev/pandas/pull/30562/checks?check_run_id=772917451#step:11:10, can you check?

Performing static analysis using mypy
pandas/_testing.py:1183: error: "ExtensionArray" has no attribute "asi8"

@joaoleveiga
Copy link
Contributor Author

@joaoleveiga possible typing issue: https://github.com/pandas-dev/pandas/pull/30562/checks?check_run_id=772917451#step:11:10, can you check?

Performing static analysis using mypy
pandas/_testing.py:1183: error: "ExtensionArray" has no attribute "asi8"

I have looked at that and I still don't understand why it's triggering. I didn't change that piece of code.
I can add an additional hasattr(right, "asi8") if it's OK for you, even though it's not really related to the original issue/PR.

WDYT? @TomAugspurger , @jreback

    if hasattr(left, "asi8") and hasattr(right, "asi8") and type(right) == type(left):

@jreback
Copy link
Contributor

jreback commented Jun 20, 2020

@joaoleveiga possible typing issue: https://github.com/pandas-dev/pandas/pull/30562/checks?check_run_id=772917451#step:11:10, can you check?

Performing static analysis using mypy
pandas/_testing.py:1183: error: "ExtensionArray" has no attribute "asi8"

I have looked at that and I still don't understand why it's triggering. I didn't change that piece of code.
I can add an additional hasattr(right, "asi8") if it's OK for you, even though it's not really related to the original issue/PR.

WDYT? @TomAugspurger , @jreback

    if hasattr(left, "asi8") and hasattr(right, "asi8") and type(right) == type(left):

can you repro locally? I think something like
assert isinstance(left, DatetimeLikeArrayMixin) (and same for right) will work here

This commit makes `assert_almost_equal` accept both relative and
absolute precision when comparing numbers, through two new keyword
arguments: `rtol`, and `atol`, respectively.

Under the hood, `_libs.testing.assert_almost_equal` is now calling
`math.isclose`, instead of an adaptaion of
[numpy.testing.assert_almost_equal](https://docs.scipy.org/doc/numpy-1.17.0/reference/generated/numpy.testing.assert_almost_equal.html).
Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All green @jreback. I think the docs could use some expanding, but that can be done as a followup.

Use `rtol` and `atol` instead to define relative/absolute
tolerance, respectively. Similar to :func:`math.isclose`.
rtol : float, default 1e-5
Relative tolerance.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is user-facing documentation, so these descriptions could be improved.

Could you copy the description from numpy.isclose, or link to it with a See Also?

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. minor doc comment & I think @TomAugspurger has some, but as a followon is ok. Thanks for doing this PR, very nice!

@@ -781,6 +781,9 @@ Deprecations
- The ``squeeze`` keyword in the ``groupby`` function is deprecated and will be removed in a future version (:issue:`32380`)
- The ``tz`` keyword in :meth:`Period.to_timestamp` is deprecated and will be removed in a future version; use `per.to_timestamp(...).tz_localize(tz)`` instead (:issue:`34522`)
- :meth:`DatetimeIndex.to_perioddelta` is deprecated and will be removed in a future version. Use ``index - index.to_period(freq).to_timestamp()`` instead (:issue:`34853`)
- :meth:`util.testing.assert_almost_equal` now accepts both relative and absolute
precision through the ``rtol``, and ``atol`` parameters, thus deprecating the
``check_less_precise`` parameter. (:issue:`13357`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also indicate #9457 here (for a followon PR is fine)

@jbrockmendel
Copy link
Member

@joaoleveiga want to make a PR to enforce this deprecation?

@joaoleveiga
Copy link
Contributor Author

@joaoleveiga want to make a PR to enforce this deprecation?

@jbrockmendel , I only had time to tackle this today but I see it was already taken care of :) thanks for the heads up. Cheers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

assert_almost_equal assert_almost_equal / equals should allow access to np.allclose
7 participants